⚙️ Spark Runtime Breaking Changes Matrix¶

Navigate Fabric Spark Runtime Upgrades with Confidence

Last Updated: 2026-04-27 | Version: 1.0.0

📑 Table of Contents¶

🎯 Overview
📊 Runtime Version Matrix
🐍 Python Version Changes
📦 Library Breaking Changes
🔺 Delta Lake Protocol Versions
⚡ Spark Configuration Changes
🔧 Migration Checklist
🧪 Testing Strategy
🔙 Rollback Procedures
🎰 Casino Workload Impact
🏛️ Federal Workload Impact
🚫 Anti-Patterns
📚 References

🎯 Overview¶

Microsoft Fabric bundles Apache Spark, Delta Lake, Python, and dozens of pre-installed libraries into a managed runtime. A runtime upgrade can introduce breaking changes at several levels: Spark APIs, Python language features, library versions, Delta Lake protocol requirements, and configuration defaults. This guide collects the known breaking changes across Fabric Spark Runtime versions 1.1 through 2.0 into one matrix, together with migration checklists and rollback procedures.

Runtime Lifecycle¶

timeline
    title Fabric Spark Runtime Lifecycle
    section Runtime 1.1 (GA)
        2023-11 : GA release
        2024-06 : Maintenance mode
        2025-03 : End of support
    section Runtime 1.2 (GA)
        2024-03 : GA release
        2025-01 : Default for new workspaces
        2025-09 : Maintenance mode
    section Runtime 1.3 (GA)
        2025-06 : GA release
        2026-01 : Default for new workspaces
        2026-06 : Expected maintenance
    section Runtime 2.0 (GA)
        2026-03 : GA release
        2026-06 : Default for new workspaces

📊 Runtime Version Matrix¶

Core Component Versions¶

Component	Runtime 1.1	Runtime 1.2	Runtime 1.3	Runtime 2.0
Apache Spark	3.4.1	3.5.0	3.5.1	4.0.0
Delta Lake	2.4.0	3.1.0	3.2.0	4.0.0
Python	3.10	3.11	3.11	3.12
Java	11	11	17	17
Scala	2.12.17	2.12.18	2.12.18	2.13.12
R	4.2	4.3	4.3	4.4
Pandas	1.5.3	2.1.4	2.2.1	2.2.2
NumPy	1.24.3	1.26.4	1.26.4	2.0.1
mssparkutils	1.0	1.1	1.2	2.0

Breaking Change Severity Legend¶

Symbol	Severity	Action Required
🟢	None	No changes needed
🟡	Low	Minor code updates, deprecation warnings
🟠	Medium	Code changes required, may affect behavior
🔴	High	Significant refactoring, potential data issues
⛔	Critical	Must fix before upgrade, data corruption risk

🐍 Python Version Changes¶

Python 3.10 → 3.11 (Runtime 1.1 → 1.2)¶

Change	Severity	Impact	Migration
`match`/`case` syntax	🟢	Already available in Python 3.10 (Runtime 1.1); unchanged	None
Exception groups (`ExceptionGroup`)	🟢	New feature	Optional adoption
`tomllib` in standard library	🟢	Can replace `toml` package	Optional
f-string parsing improvements	🟢	Existing f-strings work	None
Faster CPython (10-60%)	🟢	Performance improvement	None

Python 3.11 → 3.12 (Runtime 1.3 → 2.0)¶

Change	Severity	Impact	Migration
`distutils` removed	🔴	`from distutils import ...` fails	Use `setuptools` or `packaging`
`imp` module removed	🟠	`import imp` fails	Use `importlib`
`asynchat`, `asyncore` removed	🟠	Legacy async code breaks	Use `asyncio`
f-string nesting allowed	🟢	New feature	Optional
Type parameter syntax (PEP 695)	🟢	New feature (`TypeVar` defaults only arrive in Python 3.13)	Optional
Comprehension inlining	🟢	Performance improvement	None
`unittest.mock` changes	🟡	Some mock behaviors differ	Review test assertions

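# ❌ Works on Runtime 1.3 (Python 3.11); raises ModuleNotFoundError on Runtime 2.0 (Python 3.12)
from distutils.version import LooseVersion
if LooseVersion("1.2.3") > LooseVersion("1.2.0"):
    pass

# ✅ Runtime 2.0 (Python 3.12): Required replacement
# Note: Version accepts only PEP 440 version strings and raises InvalidVersion otherwise,
# whereas LooseVersion accepted almost any string.
from packaging.version import Version
if Version("1.2.3") > Version("1.2.0"):
    pass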
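
The table above lists `importlib` as the replacement for the removed `imp` module. As a minimal sketch of what that migration can look like for code that used `imp.load_source`, the helper below loads a source file as a module with `importlib.util`. The helper name `load_module_from_path` and the throwaway `etl_helpers` module are hypothetical, chosen only for illustration.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def load_module_from_path(name: str, path: str):
    """Load a Python source file as a module (one way to replace imp.load_source)."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load {name!r} from {path}")
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module  # register before executing, as the import system does
    spec.loader.exec_module(module)
    return module


# Usage with a throwaway module file (hypothetical helper module)
with tempfile.TemporaryDirectory() as tmp:
    helper_path = Path(tmp) / "etl_helpers.py"
    helper_path.write_text("def double(x):\n    return x * 2\n", encoding="utf-8")
    helpers = load_module_from_path("etl_helpers", str(helper_path))
    doubled = helpers.double(21)
```

Code that only needed `imp.reload` can usually switch to `importlib.reload` with no other change.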

📦 Library Breaking Changes¶

Pandas 1.5 → 2.x¶

Change	Severity	1.5 Behavior	2.x Behavior	Migration
`append()` removed	🔴	`df.append(row)`	Removed	Use `pd.concat([df, row])`
Copy-on-Write	🟠	Mutable views	Opt-in in 2.x (`pd.options.mode.copy_on_write = True`); default from 3.0	Review in-place modifications and chained assignment
`datetime64` resolution	🟠	Nanosecond only	Variable (ns, us, ms, s)	Check timestamp arithmetic
`inplace` deprecation trend	🟡	Widely supported	Some deprecated	Chain operations instead
`GroupBy.apply` behavior	🟠	Includes group keys	2.2 adds `include_groups` (default `True`, with a deprecation warning when the function touches grouping columns)	Pass `include_groups=False` explicitly
`DataFrame.swaplevel`	🟡	Positional args	Keyword-only	Use `axis=` keyword

# ❌ Pandas 1.5 (Runtime 1.1)
df = df.append({"col": "value"}, ignore_index=True)

# ✅ Pandas 2.x (Runtime 1.2+)
df = pd.concat([df, pd.DataFrame([{"col": "value"}])], ignore_index=True)

# ❌ Pandas 1.5: Ambiguous view/copy semantics
df2 = df[["col_a", "col_b"]]
df2["col_a"] = 0  # May raise SettingWithCopyWarning; whether df changes depends on view vs. copy

# ✅ Pandas 2.x with Copy-on-Write enabled (opt-in in 2.x, default from 3.0)
pd.options.mode.copy_on_write = True
df2 = df[["col_a", "col_b"]]
df2["col_a"] = 0  # Only df2 changes, never df

NumPy 1.24 → 2.0¶

Change	Severity	1.24 Behavior	2.0 Behavior	Migration
`np.bool` alias	🔴	`AttributeError` (removed in 1.24, deprecated since 1.20)	Available again as the boolean scalar type	Use `np.bool_` or `bool` for code that must run on every runtime
`np.int` alias removed	🔴	`AttributeError` (removed in 1.24)	`AttributeError`	Use `np.int_` or `int`
`np.float` alias removed	🔴	`AttributeError` (removed in 1.24)	`AttributeError`	Use `np.float64` or `float`
`np.complex` alias removed	🔴	`AttributeError` (removed in 1.24)	`AttributeError`	Use `np.complex128`
`np.object` alias removed	🔴	`AttributeError` (removed in 1.24)	`AttributeError`	Use `np.object_` or `object`
`np.str` alias removed	🔴	`AttributeError` (removed in 1.24)	`AttributeError`	Use `np.str_` or `str`
Scalar representation changes	🟡	`repr(np.float64(3.0))` is `'3.0'`	`repr(np.float64(3.0))` is `'np.float64(3.0)'`	Update test assertions and doctests
`np.in1d` deprecated	🟡	Working	Deprecation warning	Use `np.isin`

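# ❌ Fails on NumPy 1.24+ (every runtime in this matrix): np.int was removed in 1.24
arr = np.array([1, 2, 3], dtype=np.int)
mask = np.array([True, False, True], dtype=np.bool)  # np.bool is missing in 1.24-1.26 and returns in 2.0

# ✅ Works on NumPy 1.24 through 2.0 (all runtimes)
arr = np.array([1, 2, 3], dtype=np.int_)
mask = np.array([True, False, True], dtype=np.bool_)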

Delta Lake / delta-spark¶

Change	Severity	Old Version	New Version	Migration
`DeltaTable.convertToDelta` API change	🟠	Positional args	Named args	Use keyword arguments
Protocol version requirements	⛔	Reader V1/Writer V2	Reader V3/Writer V7	See Delta Protocol section
Deletion vectors default ON	🟠	Off by default	On by default	May increase read latency initially
Liquid clustering GA	🟢	Preview	GA	Can replace ZORDER
`OPTIMIZE` output schema	🟡	Fewer columns	Additional metrics	Update assertions on OPTIMIZE output
Default Parquet writer version	🟡	V1	V2	Forward-compatible
Coordinated commits	🟠	Not available	Available	New concurrency model

mssparkutils Changes¶

Change	Severity	Old API	New API	Migration
`fs.mount` removed	🔴	`mssparkutils.fs.mount()`	Not available	Use OneLake paths directly
Credential API changes	🟠	`credentials.getToken()`	`credentials.getAccessToken()`	Update method name
Notebook reference API	🟡	`notebook.run("name")`	`notebook.run("name", timeout)`	Add timeout parameter
`fs.head` output type	🟡	Returns string	Returns bytes in some cases	Handle both types

# ❌ Runtime 1.1: Mount-based access
mssparkutils.fs.mount("abfss://container@storage.dfs.core.windows.net", "/mnt/data")
df = spark.read.parquet("/mnt/data/file.parquet")

# ✅ Runtime 2.0: Direct OneLake path
df = spark.read.parquet("abfss://workspace@onelake.dfs.fabric.microsoft.com/lakehouse/Files/data/file.parquet")
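
Two rows in the table above (the credential method rename and the `fs.head` return type) lend themselves to small compatibility shims while notebooks are shared across runtimes. The sketch below is one possible shape, not the library's own API: the helper names `to_text` and `get_token_compat`, the single `audience` argument, and the two stand-in credential classes are assumptions made for illustration. Only the method names `getToken` and `getAccessToken` come from the table.

```python
from typing import Any, Union


def to_text(value: Union[str, bytes], encoding: str = "utf-8") -> str:
    """Normalize an fs.head-style result to str, whether it arrives as str or bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding, errors="replace")
    return value


def get_token_compat(credentials: Any, audience: str) -> str:
    """Call whichever token method the credentials object exposes.

    Tries the newer method name first, then falls back to the older one.
    """
    for method_name in ("getAccessToken", "getToken"):
        method = getattr(credentials, method_name, None)
        if callable(method):
            return method(audience)
    raise AttributeError("credentials object exposes neither getAccessToken nor getToken")


# Stand-ins for the two API generations (hypothetical, for illustration only)
class OldCredentials:
    def getToken(self, audience: str) -> str:
        return f"old-token-for-{audience}"


class NewCredentials:
    def getAccessToken(self, audience: str) -> str:
        return f"new-token-for-{audience}"


old_token = get_token_compat(OldCredentials(), "storage")
new_token = get_token_compat(NewCredentials(), "storage")
head_text = to_text(b"id,amount\n1,10\n")
```

In a notebook the call would presumably look like `get_token_compat(mssparkutils.credentials, "storage")`; check the real method signature on the target runtime before relying on it.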

🔺 Delta Lake Protocol Versions¶

Protocol Compatibility Matrix¶

Feature	Min Reader	Min Writer	Runtime 1.1	Runtime 1.2	Runtime 1.3	Runtime 2.0
Basic Delta	1	1	✅	✅	✅	✅
Column mapping	2	5	✅	✅	✅	✅
Deletion vectors	3	7	❌	✅	✅	✅
Liquid clustering	3	7	❌	❌	✅	✅
Row tracking	1	7	❌	❌	❌	✅
V2 checkpoints	3	7	❌	❌	❌	✅

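Critical Warning: Once a table's protocol version is upgraded, any runtime that does not support the new reader version can no longer read the table, and any runtime that does not support the new writer version can no longer write to it. Treat the upgrade as a one-way operation. Never upgrade a table protocol without first confirming that every consumer runs a compatible runtime.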

Checking Table Protocol¶

from delta.tables import DeltaTable

dt = DeltaTable.forName(spark, "silver.player_transactions")
detail = dt.detail().select("minReaderVersion", "minWriterVersion").collect()[0]
print(f"Reader: V{detail.minReaderVersion}, Writer: V{detail.minWriterVersion}")

# Check all tables for protocol compatibility
tables = spark.sql("SHOW TABLES IN silver").collect()
for t in tables:
    try:
        dt = DeltaTable.forName(spark, f"silver.{t.tableName}")
        detail = dt.detail().select("minReaderVersion", "minWriterVersion", "name").collect()[0]
        print(f"{detail.name}: Reader V{detail.minReaderVersion}, Writer V{detail.minWriterVersion}")
    except Exception as e:
        print(f"silver.{t.tableName}: Error - {e}")
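
The reader and writer versions printed by the script above can be compared against what each runtime supports. The sketch below encodes the compatibility matrix as the highest `(reader, writer)` pair per runtime and classifies access for a given table. The names `MAX_PROTOCOL`, `access_level`, and `blocked_runtimes` are hypothetical, and the mapping assumes that version numbers alone decide compatibility. With writer V7 table features, the specific enabled features matter as well, so treat this as a first-pass filter.

```python
from typing import Dict, List, Tuple

# Highest (reader, writer) protocol each runtime handles, read off the matrix above.
# Assumption: Runtime 1.1 tops out at column mapping (reader 2, writer 5).
MAX_PROTOCOL: Dict[str, Tuple[int, int]] = {
    "1.1": (2, 5),
    "1.2": (3, 7),
    "1.3": (3, 7),
    "2.0": (3, 7),
}


def access_level(runtime: str, min_reader: int, min_writer: int) -> str:
    """Return 'none', 'read-only', or 'read-write' for a table on a given runtime."""
    max_reader, max_writer = MAX_PROTOCOL[runtime]
    if min_reader > max_reader:
        return "none"
    if min_writer > max_writer:
        return "read-only"
    return "read-write"


def blocked_runtimes(min_reader: int, min_writer: int) -> List[str]:
    """List runtimes that would lose read or write access to the table."""
    return [
        runtime
        for runtime in MAX_PROTOCOL
        if access_level(runtime, min_reader, min_writer) != "read-write"
    ]


# A table with deletion vectors enabled (reader 3, writer 7)
dv_blocked = blocked_runtimes(3, 7)
# A table on the older protocol (reader 1, writer 2)
legacy_blocked = blocked_runtimes(1, 2)
```

Running `blocked_runtimes` over every table before enabling a new feature gives a list of consumers to upgrade first.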

⚡ Spark Configuration Changes¶

Default Value Changes¶

Configuration	Runtime 1.1	Runtime 1.2	Runtime 1.3	Runtime 2.0	Impact
`spark.sql.adaptive.enabled`	true	true	true	true	🟢 No change
`spark.sql.adaptive.coalescePartitions.enabled`	true	true	true	true	🟢 No change
`spark.sql.shuffle.partitions`	200	200	auto	auto	🟡 AQE auto-tuning
`spark.sql.sources.partitionOverwriteMode`	static	static	dynamic	dynamic	🟠 Partition overwrite behavior
`spark.sql.parquet.datetimeRebaseModeInRead`	EXCEPTION	CORRECTED	CORRECTED	CORRECTED	🟡 Historic dates
`spark.sql.ansi.enabled`	false	false	true	true	🔴 Type errors on overflow
`spark.sql.legacy.timeParserPolicy`	LEGACY	LEGACY	CORRECTED	EXCEPTION	🟠 Date parsing strictness

ANSI Mode Impact (Runtime 1.3+)¶

# ❌ Pre-ANSI (Runtime 1.1/1.2): Silent overflow
spark.sql("SELECT CAST(2147483648 AS INT)")  # Returns -2147483648 (overflow wraps)

# ✅ ANSI mode (Runtime 1.3+): Throws ArithmeticException
# spark.sql("SELECT CAST(2147483648 AS INT)")  # ERROR: overflow

# To handle safely:
spark.sql("SELECT TRY_CAST(2147483648 AS INT)")  # Returns NULL
# Or explicitly:
spark.sql("SELECT CAST(2147483648 AS BIGINT)")  # Returns correct value
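
The wrapped value in the pre-ANSI example is ordinary two's-complement truncation: only the low 32 bits of the value survive the cast. A short plain-Python sketch of that arithmetic follows. It illustrates the overflow itself, not Spark's implementation, and the function name `wrap_to_int32` is made up for this example.

```python
def wrap_to_int32(value: int) -> int:
    """Keep the low 32 bits and reinterpret them as a signed two's-complement integer."""
    low_bits = value & 0xFFFFFFFF
    if low_bits >= 0x80000000:
        return low_bits - 0x100000000
    return low_bits


INT32_MAX = 2**31 - 1  # 2147483647, the largest value an INT can hold

# One past the maximum wraps around to the most negative INT
wrapped = wrap_to_int32(INT32_MAX + 1)
unchanged = wrap_to_int32(42)
```

Any pipeline that relied on this silent wrap, intentionally or not, will start failing under ANSI mode, which is why overflow-prone casts are worth auditing before the upgrade.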

Partition Overwrite Mode Change¶

# ❌ Static mode (Runtime 1.1/1.2): Overwrites ALL partitions
df.write.format("delta") \
    .mode("overwrite") \
    .partitionBy("gaming_date") \
    .saveAsTable("bronze.slot_telemetry")
# Deletes data for ALL dates, writes only today's data

# ✅ Dynamic mode (Runtime 1.3+): Overwrites only affected partitions
# Same code now only overwrites partitions present in df
# Explicitly set if you need old behavior:
spark.conf.set("spark.sql.sources.partitionOverwriteMode", "static")
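
To make the difference between the two modes concrete, here is a toy model in plain Python in which a partitioned table is a dict from partition value to rows. It is a simplification for reasoning about the semantics described above, not Spark or Delta code, and the `overwrite` function and sample dates are invented for the illustration.

```python
from typing import Dict, List

Table = Dict[str, List[int]]  # partition value -> rows in that partition


def overwrite(table: Table, incoming: Table, mode: str) -> Table:
    """Model an overwrite of a partitioned table in static or dynamic mode."""
    if mode == "static":
        # Everything is replaced: partitions absent from incoming are dropped
        return {part: list(rows) for part, rows in incoming.items()}
    if mode == "dynamic":
        # Only partitions present in incoming are replaced; the rest are kept
        merged = {part: list(rows) for part, rows in table.items()}
        merged.update({part: list(rows) for part, rows in incoming.items()})
        return merged
    raise ValueError(f"Unknown mode: {mode!r}")


existing = {"2026-04-25": [1, 2], "2026-04-26": [3]}
todays_batch = {"2026-04-26": [9]}

static_result = overwrite(existing, todays_batch, "static")
dynamic_result = overwrite(existing, todays_batch, "dynamic")
```

A job written for static mode that intended a full refresh will leave stale partitions behind under dynamic mode, so both directions of the change deserve a test.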

🔧 Migration Checklist¶

Pre-Upgrade (1 Week Before)¶

- [ ] Inventory all notebooks and their library imports
- [ ] Run `grep -r "np\.bool\b\|np\.int\b\|np\.float\b\|np\.object\b" notebooks/`
- [ ] Run `grep -r "\.append(" notebooks/` to find candidate pandas append calls (the pattern also matches `list.append`, so review each hit)
- [ ] Run `grep -r "distutils" notebooks/` to find distutils usage
- [ ] Check Delta table protocol versions (see script above)
- [ ] Document current Spark config overrides
- [ ] Review custom library versions in environment.yml
- [ ] Run full test suite on current runtime (baseline)
- [ ] Back up critical Delta tables: `CREATE TABLE ... DEEP CLONE`

During Upgrade¶

- [ ] Set new runtime version in Fabric workspace settings
- [ ] Run smoke test notebook (core operations)
- [ ] Run schema validation on all Bronze/Silver/Gold tables
- [ ] Verify mssparkutils API calls work
- [ ] Check Spark UI for new warnings or errors
- [ ] Validate Delta table read/write operations
- [ ] Test Direct Lake connectivity

Post-Upgrade (1 Week After)¶

- [ ] Run full test suite on new runtime
- [ ] Compare query performance metrics (before/after)
- [ ] Review Spark event logs for deprecated API warnings
- [ ] Update environment.yml with new compatible library versions
- [ ] Update CLAUDE.md with new runtime version
- [ ] Document any behavioral changes observed
- [ ] Remove deep clone backups after 30-day stabilization
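
For the "compare query performance metrics" item, a simple approach is to record wall-clock durations per notebook before and after the upgrade and flag anything that slowed down beyond a tolerance. The sketch below assumes durations are already collected into dicts. The function name `find_regressions`, the 20% tolerance, and the sample timings are illustrative assumptions, not measured values.

```python
from typing import Dict


def find_regressions(
    baseline: Dict[str, float],
    current: Dict[str, float],
    tolerance: float = 0.20,
) -> Dict[str, float]:
    """Return notebooks whose duration grew by more than `tolerance` (as a fraction)."""
    regressions = {}
    for name, before in baseline.items():
        after = current.get(name)
        if after is None or before <= 0:
            continue  # nothing to compare against
        change = (after - before) / before
        if change > tolerance:
            regressions[name] = round(change, 3)
    return regressions


# Illustrative durations in seconds (made-up numbers)
baseline_seconds = {"01_gold_slot_performance": 40.0, "01_silver_slot_cleansed": 100.0}
current_seconds = {"01_gold_slot_performance": 42.0, "01_silver_slot_cleansed": 130.0}

regressions = find_regressions(baseline_seconds, current_seconds)
```

A single run is noisy, so averaging several runs per notebook before comparing is usually worthwhile.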

🧪 Testing Strategy¶

Automated Compatibility Scanner¶

import re
from pathlib import Path

# Patterns are regular expressions. Word boundaries (\b) keep np.int from flagging
# np.int_ or np.int64, and keep "import imp" from flagging "import importlib".
BREAKING_PATTERNS = {
    "runtime_2.0": {
        r"\bnp\.bool\b": "Replace with np.bool_ or bool",
        r"\bnp\.int\b": "Replace with np.int_ or int",
        r"\bnp\.float\b": "Replace with np.float64 or float",
        r"\bnp\.complex\b": "Replace with np.complex128 or complex",
        r"\bnp\.object\b": "Replace with np.object_ or object",
        r"\bnp\.str\b": "Replace with np.str_ or str",
        r"\.append\(": "Possible pandas append (removed in 2.0), use pd.concat; list.append is fine",
        r"^\s*(?:from|import)\s+distutils\b": "distutils removed in Python 3.12",
        r"^\s*(?:from|import)\s+imp\b": "imp removed in Python 3.12",
        r"\bmssparkutils\.fs\.mount\b": "fs.mount removed in Runtime 2.0",
    }
}

def scan_notebook(file_path: str, target_runtime: str = "runtime_2.0") -> list:
    """Scan a notebook source file for patterns that break on the target runtime."""
    issues = []
    # Compile once per file instead of once per line
    patterns = {
        re.compile(regex): fix
        for regex, fix in BREAKING_PATTERNS.get(target_runtime, {}).items()
    }

    with open(file_path, "r", encoding="utf-8") as f:
        for line_num, line in enumerate(f, 1):
            for regex, fix in patterns.items():
                if regex.search(line):
                    issues.append({
                        "file": file_path,
                        "line": line_num,
                        "pattern": regex.pattern,
                        "fix": fix,
                        "code": line.strip()
                    })
    return issues

# Scan all notebooks
notebook_dir = Path("notebooks")
all_issues = []
for nb in notebook_dir.rglob("*.py"):
    all_issues.extend(scan_notebook(str(nb)))

print(f"Found {len(all_issues)} potential breaking changes:")
for issue in all_issues:
    print(f"  {issue['file']}:{issue['line']} | {issue['pattern']} → {issue['fix']}")
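
The scanner above reads files line by line, which suits notebooks stored as `.py` source. If some notebooks are stored as `.ipynb` JSON instead (an assumption, since only `.py` files appear in this guide), the code cells need to be pulled out first so that Markdown cells and JSON escaping do not produce false hits. A minimal sketch using only the standard library follows; `iter_ipynb_code_lines` and the sample notebook are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Iterator, Tuple


def iter_ipynb_code_lines(path: Path) -> Iterator[Tuple[int, int, str]]:
    """Yield (cell_index, line_number_in_cell, line) for every line of every code cell."""
    notebook = json.loads(path.read_text(encoding="utf-8"))
    for cell_index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", [])
        # The notebook format allows source as one string or as a list of lines
        text = source if isinstance(source, str) else "".join(source)
        for line_num, line in enumerate(text.splitlines(), 1):
            yield cell_index, line_num, line


# Usage with a tiny sample notebook written to a temporary directory
sample_notebook = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Notes on distutils"]},
        {
            "cell_type": "code",
            "source": ["from distutils.version import LooseVersion\n", "print('ok')\n"],
        },
    ]
}

with tempfile.TemporaryDirectory() as tmp:
    nb_path = Path(tmp) / "sample.ipynb"
    nb_path.write_text(json.dumps(sample_notebook), encoding="utf-8")
    hits = [row for row in iter_ipynb_code_lines(nb_path) if "distutils" in row[2]]
```

Each yielded line can then be checked against the same patterns the scanner uses, with the cell index reported in place of a file line number.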

Runtime Comparison Test¶

def runtime_compatibility_test():
    """Smoke test for core operations across runtime versions."""
    results = {}

    # Test 1: Delta read/write
    try:
        df = spark.range(1000).toDF("id")
        df.write.format("delta").mode("overwrite").saveAsTable("test.runtime_check")
        spark.table("test.runtime_check").count()
        results["delta_rw"] = "PASS"
    except Exception as e:
        results["delta_rw"] = f"FAIL: {e}"

    # Test 2: Pandas conversion
    try:
        pdf = spark.range(100).toPandas()
        assert len(pdf) == 100
        results["pandas_convert"] = "PASS"
    except Exception as e:
        results["pandas_convert"] = f"FAIL: {e}"

    # Test 3: mssparkutils
    try:
        files = mssparkutils.fs.ls("Files/")
        results["mssparkutils"] = "PASS"
    except Exception as e:
        results["mssparkutils"] = f"FAIL: {e}"

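    # Test 4: TRY_CAST (returns NULL with or without ANSI mode, so also log the ANSI setting)
    try:
        result = spark.sql("SELECT TRY_CAST('not_a_number' AS INT)").collect()
        print("spark.sql.ansi.enabled =", spark.conf.get("spark.sql.ansi.enabled"))
        results["ansi_mode"] = "PASS" if result[0][0] is None else "UNEXPECTED"
    except Exception as e:
        results["ansi_mode"] = f"FAIL: {e}"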

    # Test 5: NumPy types
    try:
        import numpy as np
        arr = np.array([1, 2, 3], dtype=np.int_)
        mask = np.array([True, False], dtype=np.bool_)
        results["numpy_types"] = "PASS"
    except Exception as e:
        results["numpy_types"] = f"FAIL: {e}"

    return results
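
The function above returns a dict of statuses, so a scheduled run only fails visibly if something inspects that dict. One way to make failures stop a pipeline is sketched below. The helper names `failed_checks` and `assert_runtime_ok` are hypothetical, and the sample results stand in for a real call to `runtime_compatibility_test()`.

```python
from typing import Dict, List


def failed_checks(results: Dict[str, str]) -> List[str]:
    """Return the names of all checks whose status is anything other than PASS."""
    return sorted(name for name, status in results.items() if status != "PASS")


def assert_runtime_ok(results: Dict[str, str]) -> None:
    """Raise with a readable summary if any smoke check did not pass."""
    failures = failed_checks(results)
    if failures:
        details = "; ".join(f"{name}: {results[name]}" for name in failures)
        raise RuntimeError(f"{len(failures)} smoke check(s) failed: {details}")


# Sample results shaped like the dict returned by the smoke test (made-up values)
sample_results = {
    "delta_rw": "PASS",
    "pandas_convert": "PASS",
    "mssparkutils": "FAIL: name 'mssparkutils' is not defined",
}
sample_failures = failed_checks(sample_results)
```

Calling `assert_runtime_ok(runtime_compatibility_test())` as the last cell of the smoke test notebook makes the notebook run itself fail when any check does.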

🔙 Rollback Procedures¶

Workspace-Level Rollback¶

1. Navigate to Fabric Workspace → Settings → Spark Settings
2. Change Runtime Version back to previous version
3. Wait for runtime pool to restart (~5 minutes)
4. Run smoke test notebook to verify rollback
5. Check Delta table accessibility (protocol version may block!)

Table-Level Rollback¶

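# If a Delta table protocol was upgraded and you need to roll back:
# WARNING: Protocol downgrades are NOT supported by Delta Lake

# Option 1: Restore from deep clone backup
# Confirm the backup is readable BEFORE dropping the upgraded table
spark.table("silver.player_transactions_backup_pre_upgrade").limit(1).collect()
spark.sql("DROP TABLE silver.player_transactions")
spark.sql("""
    CREATE TABLE silver.player_transactions
    DEEP CLONE silver.player_transactions_backup_pre_upgrade
""")

# Option 2: Export and re-import
# Drop first: overwriting an existing table keeps its upgraded protocol,
# and saveAsTable without a mode fails if the table still exists
df = spark.read.format("delta").load("path/to/backup/")
spark.sql("DROP TABLE IF EXISTS silver.player_transactions")
df.write.format("delta").saveAsTable("silver.player_transactions")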

🎰 Casino Workload Impact¶

Notebook	Runtime 2.0 Risk	Key Changes
`01_bronze_slot_telemetry.py`	🟡 Low	mssparkutils path updates
`01_silver_slot_cleansed.py`	🟡 Low	ANSI mode on type casts
`01_gold_slot_performance.py`	🟡 Low	Pandas 2.x for toPandas()
Streaming notebooks	🟠 Medium	Trigger API changes, checkpoint format
ML notebooks	🔴 High	NumPy 2.0 type aliases removed

🏛️ Federal Workload Impact¶

Notebook	Runtime 2.0 Risk	Key Changes
USDA Bronze/Silver/Gold	🟡 Low	Standard Delta operations
SBA loan analysis	🟡 Low	Pandas concat migration
NOAA weather forecasting	🟠 Medium	NumPy 2.0, scikit-learn compat
EPA sensor streaming	🟠 Medium	Streaming API changes
DOI geospatial	🟠 Medium	GeoPandas + NumPy 2.0

🚫 Anti-Patterns¶

Anti-Pattern 1: Upgrading Without Testing¶

# ❌ WRONG: "Just switch the runtime and see what happens"
# ✅ CORRECT: Run the full compatibility scanner and test suite first

Anti-Pattern 2: Upgrading Table Protocols Blindly¶

# ❌ WRONG: Enabling all new Delta features on production tables
spark.sql("ALTER TABLE silver.transactions SET TBLPROPERTIES ('delta.enableDeletionVectors' = 'true')")
# Now Runtime 1.1 workspaces can't read this table!

# ✅ CORRECT: Only upgrade after confirming all consumers are on compatible runtime

Anti-Pattern 3: Ignoring Deprecation Warnings¶

# ❌ WRONG: Suppressing warnings and moving on
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)

# ✅ CORRECT: Fix deprecated usage before it becomes an error
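
One way to act on this in a test suite is to turn deprecation warnings into errors, so deprecated calls fail the build while they are still cheap to fix. The sketch below uses only the standard `warnings` module. The context manager `deprecations_as_errors` and the stand-in function `legacy_helper` are hypothetical names for illustration.

```python
import warnings
from contextlib import contextmanager


@contextmanager
def deprecations_as_errors():
    """Inside this block, any DeprecationWarning is raised as an exception."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", DeprecationWarning)
        yield


def legacy_helper() -> int:
    """Stand-in for a library call that still works but is deprecated."""
    warnings.warn("legacy_helper is deprecated", DeprecationWarning, stacklevel=2)
    return 1


caught = None
try:
    with deprecations_as_errors():
        legacy_helper()
except DeprecationWarning as exc:
    caught = str(exc)
```

Some libraries announce upcoming removals with `FutureWarning` rather than `DeprecationWarning`, so adding a second `simplefilter("error", FutureWarning)` line may be worth considering.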

📚 References¶
