
โš™๏ธ Spark Runtime Breaking Changes Matrix

Navigate Fabric Spark Runtime Upgrades with Confidence

Last Updated: 2026-04-27 | Version: 1.0.0


🎯 Overview

Microsoft Fabric bundles Apache Spark, Delta Lake, Python, and dozens of pre-installed libraries into a managed runtime. A runtime upgrade can therefore introduce breaking changes at several levels at once: Spark APIs, Python language features, library versions, Delta Lake protocol requirements, and configuration defaults. This guide collects the known breaking changes across Fabric Spark Runtime versions 1.1 through 2.0 into a single matrix, with migration checklists and rollback procedures.

Runtime Lifecycle

timeline
    title Fabric Spark Runtime Lifecycle
    section Runtime 1.1 (GA)
        2023-11 : GA release
        2024-06 : Maintenance mode
        2025-03 : End of support
    section Runtime 1.2 (GA)
        2024-03 : GA release
        2025-01 : Default for new workspaces
        2025-09 : Maintenance mode
    section Runtime 1.3 (GA)
        2025-06 : GA release
        2026-01 : Default for new workspaces
        2026-06 : Expected maintenance
    section Runtime 2.0 (GA)
        2026-03 : GA release
        2026-06 : Default for new workspaces

📊 Runtime Version Matrix

Core Component Versions

| Component | Runtime 1.1 | Runtime 1.2 | Runtime 1.3 | Runtime 2.0 |
|---|---|---|---|---|
| Apache Spark | 3.4.1 | 3.5.0 | 3.5.1 | 4.0.0 |
| Delta Lake | 2.4.0 | 3.1.0 | 3.2.0 | 4.0.0 |
| Python | 3.10 | 3.11 | 3.11 | 3.12 |
| Java | 11 | 11 | 17 | 17 |
| Scala | 2.12.17 | 2.12.18 | 2.12.18 | 2.13.12 |
| R | 4.2 | 4.3 | 4.3 | 4.4 |
| Pandas | 1.5.3 | 2.1.4 | 2.2.1 | 2.2.2 |
| NumPy | 1.24.3 | 1.26.4 | 1.26.4 | 2.0.1 |
| mssparkutils | 1.0 | 1.1 | 1.2 | 2.0 |
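
Before relying on the matrix, confirm what a given workspace actually runs. A quick sanity check from any notebook cell (this assumes the standard Fabric notebook session, where `spark` is predefined):

import sys
import pandas as pd
import numpy as np
import pyspark

# Print the component versions the current session actually runs
print(f"Spark:   {spark.version}")
print(f"Python:  {sys.version.split()[0]}")
print(f"Pandas:  {pd.__version__}")
print(f"NumPy:   {np.__version__}")
print(f"PySpark: {pyspark.__version__}")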

Breaking Change Severity Legend

| Symbol | Severity | Action Required |
|---|---|---|
| 🟢 | None | No changes needed |
| 🟡 | Low | Minor code updates, deprecation warnings |
| 🟠 | Medium | Code changes required, may affect behavior |
| 🔴 | High | Significant refactoring, potential data issues |
| ⛔ | Critical | Must fix before upgrade, data corruption risk |

๐Ÿ Python Version Changes

Python 3.10 → 3.11 (Runtime 1.1 → 1.2)

| Change | Severity | Impact | Migration |
|---|---|---|---|
| Fine-grained error locations in tracebacks (PEP 657) | 🟢 | Clearer debugging output | None |
| Exception groups (`ExceptionGroup`, `except*`) | 🟢 | New feature | Optional adoption |
| `tomllib` in standard library | 🟢 | Can replace the third-party `toml` package | Optional |
| `asyncio.TaskGroup` added | 🟢 | New feature | Optional adoption |
| Faster CPython (10-60%) | 🟢 | Performance improvement | None |

Note: `match`/`case` structural pattern matching arrived in Python 3.10, so it is already available on Runtime 1.1; it is not part of this upgrade.
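
Of these, `tomllib` is the most actionable: code that previously needed the third-party `toml` package can switch to the standard library. A minimal sketch (the config.toml path is a placeholder for illustration):

import tomllib  # standard library as of Python 3.11 / Runtime 1.2

# tomllib requires binary mode; toml.load() accepted text mode
with open("config.toml", "rb") as f:
    config = tomllib.load(f)
print(config)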

Python 3.11 → 3.12 (Runtime 1.3 → 2.0)

| Change | Severity | Impact | Migration |
|---|---|---|---|
| `distutils` removed | 🔴 | `from distutils import ...` fails | Use `setuptools` or `packaging` |
| `imp` module removed | 🟠 | `import imp` fails | Use `importlib` |
| `asynchat`, `asyncore` removed | 🟠 | Legacy async code breaks | Use `asyncio` |
| f-string nesting allowed (PEP 701) | 🟢 | New feature | Optional |
| Type parameter syntax (PEP 695) | 🟢 | New feature | Optional |
| Comprehension inlining (PEP 709) | 🟢 | Performance improvement | None |
| `unittest.mock` changes | 🟡 | Some mock behaviors differ | Review test assertions |
# โŒ Runtime 1.3 (Python 3.11): Works
from distutils.version import LooseVersion
if LooseVersion("1.2.3") > LooseVersion("1.2.0"):
    pass

# โœ… Runtime 2.0 (Python 3.12): Required replacement
from packaging.version import Version
if Version("1.2.3") > Version("1.2.0"):
    pass
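
The `imp` removal follows the same pattern. A sketch of the common load-module-from-path idiom and its `importlib` replacement; the helpers.py path is hypothetical:

# ❌ Runtime 1.3 (Python 3.11): deprecated but importable
# import imp
# helpers = imp.load_source("helpers", "/lakehouse/default/Files/helpers.py")

# ✅ Runtime 2.0 (Python 3.12): importlib equivalent
import importlib.util

spec = importlib.util.spec_from_file_location("helpers", "/lakehouse/default/Files/helpers.py")
helpers = importlib.util.module_from_spec(spec)
spec.loader.exec_module(helpers)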

📦 Library Breaking Changes

Pandas 1.5 → 2.x

| Change | Severity | 1.5 Behavior | 2.x Behavior | Migration |
|---|---|---|---|---|
| `append()` removed | 🔴 | `df.append(row)` | Removed | Use `pd.concat([df, row])` |
| Copy-on-Write | 🟠 | Mutable views | Opt-in via `pd.options.mode.copy_on_write`; becomes default in 3.0 | Enable CoW and review in-place modifications |
| `datetime64` resolution | 🟠 | Nanosecond only | Variable (ns, us, ms, s) | Check timestamp arithmetic |
| `inplace` deprecation trend | 🟡 | Widely supported | Some methods deprecated | Chain operations instead |
| `GroupBy.apply` group keys | 🟠 | Operates on group keys | Operating on keys deprecated (2.2) | Pass `include_groups=False` explicitly |
| `DataFrame.swaplevel` | 🟡 | Positional args | Keyword-only | Use `axis=` keyword |
# โŒ Pandas 1.5 (Runtime 1.1)
df = df.append({"col": "value"}, ignore_index=True)

# โœ… Pandas 2.x (Runtime 1.2+)
df = pd.concat([df, pd.DataFrame([{"col": "value"}])], ignore_index=True)
# โŒ Pandas 1.5: Mutable view behavior
df2 = df[["col_a", "col_b"]]
df2["col_a"] = 0  # Mutates df in some cases (SettingWithCopyWarning)

# โœ… Pandas 2.x: Copy-on-Write โ€” df is never mutated
df2 = df[["col_a", "col_b"]]
df2["col_a"] = 0  # Only df2 changes, never df

NumPy 1.26 → 2.0

| Change | Severity | 1.26 Behavior | 2.0 Behavior | Migration |
|---|---|---|---|---|
| `np.float_` removed | 🔴 | Alias for `np.float64` | `AttributeError` | Use `np.float64` |
| `np.NaN`, `np.Inf`, `np.infty` removed | 🔴 | Aliases for `np.nan` / `np.inf` | `AttributeError` | Use `np.nan` / `np.inf` |
| `np.unicode_` removed | 🔴 | Alias for `np.str_` | `AttributeError` | Use `np.str_` |
| `np.row_stack` removed | 🟠 | Alias for `np.vstack` | `AttributeError` | Use `np.vstack` |
| `np.core` made private | 🟠 | Importable | Renamed to `np._core` | Use the public `np` namespace |
| C ABI break | ⛔ | Extensions built for 1.x load | Must be rebuilt against 2.0 | Upgrade compiled packages in lockstep |
| Scalar representation changes | 🟡 | `repr` prints `1.0` | Prints `np.float64(1.0)` | Update test assertions |
| `np.in1d` deprecated | 🟡 | Works | Deprecation warning | Use `np.isin` |

Note: the old `np.bool` / `np.int` / `np.float` / `np.object` aliases were already removed in NumPy 1.24, so any remaining uses fail on every current runtime; they are not part of this upgrade.
# โŒ NumPy 1.24 (Runtime 1.1/1.2)
arr = np.array([1, 2, 3], dtype=np.int)
mask = np.array([True, False, True], dtype=np.bool)

# โœ… NumPy 2.0 (Runtime 2.0)
arr = np.array([1, 2, 3], dtype=np.int_)
mask = np.array([True, False, True], dtype=np.bool_)
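
The `np.in1d` deprecation is a one-line substitution; `np.isin` has long been the documented replacement and also accepts multidimensional input. The arrays here are illustrative:

import numpy as np

machine_ids = np.array([101, 102, 103, 104])
active = np.array([102, 104])

# ❌ Deprecated in NumPy 2.0
# mask = np.in1d(machine_ids, active)

# ✅ Replacement with identical semantics for 1-D input
mask = np.isin(machine_ids, active)
print(mask)  # [False  True False  True]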

Delta Lake / delta-spark

| Change | Severity | Old Version | New Version | Migration |
|---|---|---|---|---|
| `DeltaTable.convertToDelta` API change | 🟠 | Positional args | Named args | Use keyword arguments |
| Protocol version requirements | ⛔ | Reader V1 / Writer V2 | Reader V3 / Writer V7 | See Delta Protocol section |
| Deletion vectors default ON | 🟠 | Off by default | On by default | May increase read latency initially |
| Liquid clustering GA | 🟢 | Preview | GA | Can replace ZORDER |
| OPTIMIZE output schema | 🟡 | Fewer columns | Additional metrics | Update assertions on OPTIMIZE output |
| Default Parquet writer version | 🟡 | V1 | V2 | Forward-compatible |
| Coordinated commits | 🟠 | Not available | Available | New concurrency model |
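
If the deletion-vectors default would strand older readers (see the protocol matrix below), the feature can be kept off per table at creation time. A hedged sketch with a hypothetical table name; note that turning the property off later does not downgrade a protocol that has already been upgraded:

# Keep deletion vectors off for a table that Runtime 1.1 consumers still read
spark.sql("""
    CREATE TABLE IF NOT EXISTS silver.legacy_feed (id BIGINT, payload STRING)
    USING DELTA
    TBLPROPERTIES ('delta.enableDeletionVectors' = 'false')
""")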

mssparkutils Changes

| Change | Severity | Old API | New API | Migration |
|---|---|---|---|---|
| `fs.mount` removed | 🔴 | `mssparkutils.fs.mount()` | Not available | Use OneLake paths directly |
| Credential API changes | 🟠 | `credentials.getToken()` | `credentials.getAccessToken()` | Update method name |
| Notebook reference API | 🟡 | `notebook.run("name")` | `notebook.run("name", timeout)` | Add timeout parameter |
| `fs.head` output type | 🟡 | Returns string | Returns bytes in some cases | Handle both types |
# โŒ Runtime 1.1: Mount-based access
mssparkutils.fs.mount("abfss://container@storage.dfs.core.windows.net", "/mnt/data")
df = spark.read.parquet("/mnt/data/file.parquet")

# โœ… Runtime 2.0: Direct OneLake path
df = spark.read.parquet("abfss://workspace@onelake.dfs.fabric.microsoft.com/lakehouse/Files/data/file.parquet")
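
The credential rename is mechanical. A sketch using the method names as this matrix documents them; "storage" is the audience key for storage tokens:

# ❌ Runtime 1.x
# token = mssparkutils.credentials.getToken("storage")

# ✅ Runtime 2.0, per the rename in the table above
token = mssparkutils.credentials.getAccessToken("storage")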

🔺 Delta Lake Protocol Versions

Protocol Compatibility Matrix

| Feature | Min Reader | Min Writer | Runtime 1.1 | Runtime 1.2 | Runtime 1.3 | Runtime 2.0 |
|---|---|---|---|---|---|---|
| Basic Delta | 1 | 1 | ✅ | ✅ | ✅ | ✅ |
| Column mapping | 2 | 5 | ✅ | ✅ | ✅ | ✅ |
| Deletion vectors | 3 | 7 | ❌ | ✅ | ✅ | ✅ |
| Liquid clustering | 3 | 7 | ❌ | ❌ | ✅ | ✅ |
| Row tracking | 3 | 7 | ❌ | ❌ | ❌ | ✅ |
| V2 checkpoints | 3 | 7 | ❌ | ❌ | ❌ | ✅ |

Critical Warning: Once a table's protocol version is upgraded, older runtimes cannot read or write to it. This is a one-way operation. Never upgrade table protocol without ensuring all consumers run a compatible runtime.

Checking Table Protocol

from delta.tables import DeltaTable

dt = DeltaTable.forName(spark, "silver.player_transactions")
detail = dt.detail().select("minReaderVersion", "minWriterVersion").collect()[0]
print(f"Reader: V{detail.minReaderVersion}, Writer: V{detail.minWriterVersion}")

# Check all tables for protocol compatibility
tables = spark.sql("SHOW TABLES IN silver").collect()
for t in tables:
    try:
        dt = DeltaTable.forName(spark, f"silver.{t.tableName}")
        detail = dt.detail().select("minReaderVersion", "minWriterVersion", "name").collect()[0]
        print(f"{detail.name}: Reader V{detail.minReaderVersion}, Writer V{detail.minWriterVersion}")
    except Exception as e:
        print(f"silver.{t.tableName}: Error - {e}")

⚡ Spark Configuration Changes

Default Value Changes

| Configuration | Runtime 1.1 | Runtime 1.2 | Runtime 1.3 | Runtime 2.0 | Impact |
|---|---|---|---|---|---|
| `spark.sql.adaptive.enabled` | true | true | true | true | 🟢 No change |
| `spark.sql.adaptive.coalescePartitions.enabled` | true | true | true | true | 🟢 No change |
| `spark.sql.shuffle.partitions` | 200 | 200 | auto | auto | 🟡 AQE auto-tuning |
| `spark.sql.sources.partitionOverwriteMode` | static | static | dynamic | dynamic | 🟠 Partition overwrite behavior |
| `spark.sql.parquet.datetimeRebaseModeInRead` | EXCEPTION | CORRECTED | CORRECTED | CORRECTED | 🟡 Historic dates |
| `spark.sql.ansi.enabled` | false | false | true | true | 🔴 Type errors on overflow |
| `spark.sql.legacy.timeParserPolicy` | LEGACY | LEGACY | CORRECTED | EXCEPTION | 🟠 Date parsing strictness |

ANSI Mode Impact (Runtime 1.3+)

# โŒ Pre-ANSI (Runtime 1.1/1.2): Silent overflow
spark.sql("SELECT CAST(2147483648 AS INT)")  # Returns -2147483648 (overflow wraps)

# โœ… ANSI mode (Runtime 1.3+): Throws ArithmeticException
# spark.sql("SELECT CAST(2147483648 AS INT)")  # ERROR: overflow

# To handle safely:
spark.sql("SELECT TRY_CAST(2147483648 AS INT)")  # Returns NULL
# Or explicitly:
spark.sql("SELECT CAST(2147483648 AS BIGINT)")  # Returns correct value

Partition Overwrite Mode Change

# โŒ Static mode (Runtime 1.1/1.2): Overwrites ALL partitions
df.write.format("delta") \
    .mode("overwrite") \
    .partitionBy("gaming_date") \
    .saveAsTable("bronze.slot_telemetry")
# Deletes data for ALL dates, writes only today's data

# โœ… Dynamic mode (Runtime 1.3+): Overwrites only affected partitions
# Same code now only overwrites partitions present in df
# Explicitly set if you need old behavior:
spark.conf.set("spark.sql.sources.partitionOverwriteMode", "static")
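
Time Parser Policy Change

The time parser policy shift in the table above has a similar silent-vs-strict split. A brief sketch of the practical difference:

# ❌ LEGACY (Runtime 1.1/1.2): lenient parsing silently rolls invalid dates
# to_date('2026-02-30', 'yyyy-MM-dd') -> 2026-03-02

# CORRECTED (Runtime 1.3): strict parser returns NULL for invalid dates
# EXCEPTION (Runtime 2.0): throws when legacy and strict parsers disagree

# Pin the old behavior while migrating date-parsing code:
spark.conf.set("spark.sql.legacy.timeParserPolicy", "LEGACY")
spark.sql("SELECT to_date('2026-02-30', 'yyyy-MM-dd') AS d").show()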

🔧 Migration Checklist

Pre-Upgrade (1 Week Before)

- [ ] Inventory all notebooks and their library imports
- [ ] Run `grep -r "np\.bool\b\|np\.int\b\|np\.float\b\|np\.object\b" notebooks/`
- [ ] Run `grep -r "\.append(" notebooks/` to find pandas append calls
- [ ] Run `grep -r "distutils" notebooks/` to find distutils usage
- [ ] Check Delta table protocol versions (see script above)
- [ ] Document current Spark config overrides
- [ ] Review custom library versions in environment.yml
- [ ] Run full test suite on current runtime (baseline)
- [ ] Back up critical Delta tables with `CREATE TABLE ... DEEP CLONE` (scripted below)
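
The deep-clone step is worth scripting so every critical table gets a consistently named snapshot. A minimal sketch; the table list is hypothetical:

# Hypothetical list of critical tables to snapshot before the upgrade
critical_tables = ["silver.player_transactions", "silver.slot_telemetry"]

for table in critical_tables:
    backup = f"{table}_backup_pre_upgrade"
    # DEEP CLONE copies the data files, so the backup survives protocol upgrades
    spark.sql(f"CREATE TABLE IF NOT EXISTS {backup} DEEP CLONE {table}")
    print(f"Backed up {table} -> {backup}")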

During Upgrade

- [ ] Set new runtime version in Fabric workspace settings
- [ ] Run smoke test notebook (core operations)
- [ ] Run schema validation on all Bronze/Silver/Gold tables
- [ ] Verify mssparkutils API calls work
- [ ] Check Spark UI for new warnings or errors
- [ ] Validate Delta table read/write operations
- [ ] Test Direct Lake connectivity

Post-Upgrade (1 Week After)

- [ ] Run full test suite on new runtime
- [ ] Compare query performance metrics (before/after)
- [ ] Review Spark event logs for deprecated API warnings
- [ ] Update environment.yml with new compatible library versions
- [ ] Update CLAUDE.md with new runtime version
- [ ] Document any behavioral changes observed
- [ ] Remove deep clone backups after 30-day stabilization

🧪 Testing Strategy

Automated Compatibility Scanner

from pathlib import Path

BREAKING_PATTERNS = {
    "runtime_2.0": {
        "np.float_": "Removed in NumPy 2.0; use np.float64",
        "np.NaN": "Removed in NumPy 2.0; use np.nan",
        "np.unicode_": "Removed in NumPy 2.0; use np.str_",
        "np.row_stack": "Removed in NumPy 2.0; use np.vstack",
        ".append(": "pandas append removed; use pd.concat",
        "from distutils": "distutils removed in Python 3.12",
        "import imp": "imp removed in Python 3.12",
        "mssparkutils.fs.mount": "fs.mount removed in Runtime 2.0",
    }
}

def scan_notebook(file_path: str, target_runtime: str = "runtime_2.0") -> list:
    """Scan a notebook for breaking changes."""
    issues = []
    patterns = BREAKING_PATTERNS.get(target_runtime, {})

    with open(file_path, "r", encoding="utf-8") as f:
        for line_num, line in enumerate(f, 1):
            for pattern, fix in patterns.items():
                if pattern in line:
                    issues.append({
                        "file": file_path,
                        "line": line_num,
                        "pattern": pattern,
                        "fix": fix,
                        "code": line.strip()
                    })
    return issues

# Scan all notebooks
notebook_dir = Path("notebooks")
all_issues = []
for nb in notebook_dir.rglob("*.py"):
    all_issues.extend(scan_notebook(str(nb)))

print(f"Found {len(all_issues)} potential breaking changes:")
for issue in all_issues:
    print(f"  {issue['file']}:{issue['line']} โ€” {issue['pattern']} โ†’ {issue['fix']}")

Runtime Comparison Test

def runtime_compatibility_test():
    """Smoke test for core operations across runtime versions."""
    results = {}

    # Test 1: Delta read/write
    try:
        df = spark.range(1000).toDF("id")
        df.write.format("delta").mode("overwrite").saveAsTable("test.runtime_check")
        spark.table("test.runtime_check").count()
        results["delta_rw"] = "PASS"
    except Exception as e:
        results["delta_rw"] = f"FAIL: {e}"

    # Test 2: Pandas conversion
    try:
        pdf = spark.range(100).toPandas()
        assert len(pdf) == 100
        results["pandas_convert"] = "PASS"
    except Exception as e:
        results["pandas_convert"] = f"FAIL: {e}"

    # Test 3: mssparkutils
    try:
        files = mssparkutils.fs.ls("Files/")
        results["mssparkutils"] = "PASS"
    except Exception as e:
        results["mssparkutils"] = f"FAIL: {e}"

    # Test 4: ANSI mode behavior
    try:
        result = spark.sql("SELECT TRY_CAST('not_a_number' AS INT)").collect()
        results["ansi_mode"] = "PASS" if result[0][0] is None else "UNEXPECTED"
    except Exception as e:
        results["ansi_mode"] = f"FAIL: {e}"

    # Test 5: NumPy types
    try:
        import numpy as np
        arr = np.array([1, 2, 3], dtype=np.int_)
        mask = np.array([True, False], dtype=np.bool_)
        results["numpy_types"] = "PASS"
    except Exception as e:
        results["numpy_types"] = f"FAIL: {e}"

    return results
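
A typical pattern is to run the smoke test right after switching runtimes and fail loudly on any regression:

results = runtime_compatibility_test()
for check, status in results.items():
    print(f"{check}: {status}")

failed = [check for check, status in results.items() if status != "PASS"]
assert not failed, f"Compatibility checks failed: {failed}"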

🔙 Rollback Procedures

Workspace-Level Rollback

1. Navigate to Fabric Workspace → Settings → Spark Settings
2. Change Runtime Version back to previous version
3. Wait for runtime pool to restart (~5 minutes)
4. Run smoke test notebook to verify rollback
5. Check Delta table accessibility (an upgraded protocol may block reads; see the sweep below)
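
Step 5 is the one that bites: any table whose protocol was raised while on the new runtime is now unreadable from the old one. A quick accessibility sweep, reusing the hypothetical silver schema from the examples above:

# Verify every table in the schema is still readable after rollback
for t in spark.sql("SHOW TABLES IN silver").collect():
    try:
        spark.table(f"silver.{t.tableName}").limit(1).collect()
        print(f"silver.{t.tableName}: readable")
    except Exception as e:
        # Protocol errors surface here if a table now requires a newer runtime
        print(f"silver.{t.tableName}: BLOCKED - {e}")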

Table-Level Rollback

# If a Delta table protocol was upgraded and you need to roll back:
# WARNING: Protocol downgrades are NOT supported by Delta Lake

# Option 1: Restore from deep clone backup
spark.sql("DROP TABLE silver.player_transactions")
spark.sql("""
    CREATE TABLE silver.player_transactions
    DEEP CLONE silver.player_transactions_backup_pre_upgrade
""")

# Option 2: Export and re-import from a file-based backup
spark.sql("DROP TABLE IF EXISTS silver.player_transactions")
df = spark.read.format("delta").load("path/to/backup/")
df.write.format("delta").saveAsTable("silver.player_transactions")

🎰 Casino Workload Impact

| Notebook | Runtime 2.0 Risk | Key Changes |
|---|---|---|
| 01_bronze_slot_telemetry.py | 🟡 Low | mssparkutils path updates |
| 01_silver_slot_cleansed.py | 🟡 Low | ANSI mode on type casts |
| 01_gold_slot_performance.py | 🟡 Low | Pandas 2.x for toPandas() |
| Streaming notebooks | 🟠 Medium | Trigger API changes, checkpoint format |
| ML notebooks | 🔴 High | NumPy 2.0 removals and ABI break |

๐Ÿ›๏ธ Federal Workload Impact

| Notebook | Runtime 2.0 Risk | Key Changes |
|---|---|---|
| USDA Bronze/Silver/Gold | 🟡 Low | Standard Delta operations |
| SBA loan analysis | 🟡 Low | Pandas concat migration |
| NOAA weather forecasting | 🟠 Medium | NumPy 2.0, scikit-learn compat |
| EPA sensor streaming | 🟠 Medium | Streaming API changes |
| DOI geospatial | 🟠 Medium | GeoPandas + NumPy 2.0 |

🚫 Anti-Patterns

Anti-Pattern 1: Upgrading Without Testing

# โŒ WRONG: "Just switch the runtime and see what happens"
# โœ… CORRECT: Run the full compatibility scanner and test suite first

Anti-Pattern 2: Upgrading Table Protocols Blindly

# โŒ WRONG: Enabling all new Delta features on production tables
spark.sql("ALTER TABLE silver.transactions SET TBLPROPERTIES ('delta.enableDeletionVectors' = 'true')")
# Now Runtime 1.1 workspaces can't read this table!

# โœ… CORRECT: Only upgrade after confirming all consumers are on compatible runtime
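
One way to make the correct path concrete is a small guard that refuses to raise a protocol without an explicit acknowledgment. A sketch built on the protocol-checking script earlier; the helper and its flag are illustrative, not a Fabric API:

from delta.tables import DeltaTable

def enable_deletion_vectors(table: str, all_consumers_ready: bool = False):
    """Enable deletion vectors only after an explicit, audited acknowledgment."""
    if not all_consumers_ready:
        raise RuntimeError(f"Refusing to upgrade {table}: confirm consumer runtimes first")
    detail = DeltaTable.forName(spark, table).detail().collect()[0]
    print(f"{table} currently Reader V{detail.minReaderVersion}/Writer V{detail.minWriterVersion}")
    spark.sql(f"ALTER TABLE {table} SET TBLPROPERTIES ('delta.enableDeletionVectors' = 'true')")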

Anti-Pattern 3: Ignoring Deprecation Warnings

# โŒ WRONG: Suppressing warnings and moving on
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)

# โœ… CORRECT: Fix deprecated usage before it becomes an error

Next: V-Order Tuning Deep Dive | Partition Strategy Decision Tree