Home > Tutorials > Synapse Analytics to Fabric Migration
🔷 Tutorial 41: Azure Synapse Analytics → Microsoft Fabric Migration¶
Last Updated: 2026-04-27 | Phase: 14 (Wave 4) | Anchor for Wave 4 Migration Tutorials Status: ✅ Final | Maintainer: Platform Team
Third-party references — publicly sourced, good-faith comparison
This page references non-Microsoft products and services. That information is drawn from each vendor's publicly available documentation and is offered for honest, good-faith comparison only. This is a personal project written from a Microsoft Fabric and Azure perspective; it does not claim expertise in, or authority over, any third-party product, and nothing here is an official statement by, or endorsed by, those vendors. Capabilities, pricing, and features change often — always verify against the vendor's current official documentation. Where a third-party offering is the stronger choice, we say so plainly.
🔷 Tutorial 41: Azure Synapse Analytics → Microsoft Fabric Migration¶
| Difficulty | ⭐⭐⭐ Advanced |
| Time | ⏱️ 180-240 minutes |
| Focus | Most-asked enterprise migration: Synapse Dedicated SQL pool, Spark pool, ADLS Gen2, Synapse Pipelines → Fabric Warehouse, Fabric Spark, OneLake, Fabric Data Pipelines |
📊 Progress Tracker¶
| Tutorial | Status |
|---|---|
| 00 — Environment Setup → 38 — DOJ Justice | ✅ Complete (Phases 1-13) |
| 41 — Synapse → Fabric (this tutorial) | 🔵 YOU ARE HERE |
| 42 — Databricks → Fabric | ⬜ |
| 43 — Redshift → Fabric | ⬜ |
| 44 — BigQuery → Fabric | ⬜ |
| 45 — On-Prem SSAS/SSIS/SSRS → Fabric | ⬜ |
| Navigation | |
|---|---|
| ⬅️ Previous | 40 — RAG Production |
| ➡️ Next | 42 — Databricks → Fabric |
📖 Overview¶
Azure Synapse Analytics → Fabric is the most-asked migration because Synapse and Fabric share Microsoft DNA — but the architectural differences are real. This tutorial walks through migrating each Synapse workload type into its Fabric equivalent, with assessment scripts, schema-conversion tools, and validation patterns.
Why Migrate from Synapse to Fabric?¶
| Concern | Synapse Behavior | Fabric Behavior |
|---|---|---|
| Compute model | DWU per workload, fixed pools | CU pool shared across all workloads |
| Storage | ADLS Gen2 separate from compute | OneLake unified, no copy needed |
| Spark + SQL together | Two separate engines, separate billing | Unified, V-Order Delta works for both |
| BI integration | Power BI Premium connector or DirectQuery | Direct Lake (no copy, no refresh) |
| Real-time | Add-on Stream Analytics, Event Hubs | Eventstream + Eventhouse native |
| Lifecycle | Microsoft is investing in Fabric, not Synapse | Synapse SQL pools enter sustaining mode |
What This Tutorial Covers¶
- Inventory + complexity scoring of your Synapse workspace
- T-SQL schema conversion (Synapse → Fabric Warehouse)
- ADLS Gen2 → OneLake shortcut migration (no data copy)
- Spark pool → Fabric Spark notebook migration
- Synapse Pipeline → Fabric Data Pipeline conversion
- DWU → CU sizing
- Validation: row counts, hashes, query parity, performance benchmarks
- Coexistence pattern (Synapse + Fabric in parallel until cutover)
- Cutover playbook
- Decommissioning checklist
🎯 Learning Objectives¶
By the end of this tutorial, you will be able to:
- Run a comprehensive Synapse workspace inventory and complexity score
- Translate Synapse Dedicated SQL pool DDL to Fabric Warehouse DDL (and identify what won't translate)
- Migrate ADLS Gen2 storage to OneLake using shortcuts (no data movement)
- Convert Synapse Spark notebooks to Fabric notebooks
- Convert Synapse Pipelines to Fabric Data Pipelines
- Size F-SKU capacity from existing DWU + Spark pool consumption
- Validate migrated workloads via row count + hash + query parity
- Establish a coexistence period for risk-free cutover
- Decommission Synapse safely without data loss
🏗️ Reference Architecture¶
flowchart LR
subgraph Synapse["🔷 Source: Azure Synapse Analytics"]
SDP[(Dedicated SQL Pool)]
SSP[Spark Pool]
SPIPE[Synapse Pipelines]
SADLS[(ADLS Gen2)]
SSA[Stream Analytics]
end
subgraph Fabric["⚡ Target: Microsoft Fabric"]
FW[(Fabric Warehouse)]
FS[Fabric Spark]
FPIPE[Fabric Pipelines]
OL[(OneLake)]
EVS[Eventstream + Eventhouse]
end
subgraph Migration["🔄 Migration Layer"]
Assess[Assessment Script]
Convert[Schema Converter]
Shortcut[OneLake Shortcut]
Validate[Validation Suite]
end
SDP --> Assess
SSP --> Assess
SPIPE --> Assess
Assess --> Convert
Convert --> FW
Convert --> FS
Convert --> FPIPE
SADLS -->|shortcut, no copy| Shortcut
Shortcut --> OL
SSA --> EVS
FW --> Validate
FS --> Validate
FPIPE --> Validate
OL --> Validate Component Mapping (the canonical reference)¶
| Synapse Component | Fabric Equivalent | Migration Approach |
|---|---|---|
| Dedicated SQL pool (DWUs) | Fabric Warehouse (CUs) | DDL conversion + CTAS bulk load |
| Serverless SQL pool | Fabric SQL endpoint over Lakehouse | Often direct (T-SQL → Lakehouse SQL) |
| Spark pool | Fabric Spark + notebooks | Notebook code mostly portable |
| Synapse Pipelines | Fabric Data Pipelines | Activities mostly compatible (some renamed) |
| Mapping Data Flows | Fabric Dataflow Gen2 | Manual recreation in Power Query |
| ADLS Gen2 (linked) | OneLake shortcut | No data movement needed |
| Synapse Studio notebooks | Fabric notebooks | Minor API tweaks (mssparkutils shim) |
| Spark Job Definitions | Fabric Spark Job Definitions | Direct port |
| Synapse Link for Cosmos DB | Fabric Mirroring for Cosmos DB | Re-configure mirror |
| Linked services | Connections | Recreate; secrets to Key Vault |
| Stream Analytics | Eventstream + Eventhouse | Re-author logic |
| Synapse Pipeline triggers | Fabric Pipeline schedules | Direct |
📋 Prerequisites¶
- Tutorial 00 (Environment Setup) complete
- Tutorials 01-03 (Bronze/Silver/Gold medallion) understood
- Tutorial 12 (CI/CD) for fabric-cicd
- Tutorial 13 (Migration Planning) for governance overview
- Source: Existing Azure Synapse Analytics workspace (read access at minimum)
- Target: Fabric F64+ capacity provisioned
- Service Principal with
Synapse Administratoron source +Workspace Adminon target - Azure CLI ≥ 2.60 with
synapseextension - PowerShell 7.x with
Az.SynapseandAz.Fabricmodules - Estimated 2-4 hours active time + multi-day coexistence period
🚀 Step-by-Step¶
Step 1 — Assess Your Synapse Workspace¶
Run the assessment script to inventory tables, pipelines, notebooks, and complexity.
cd tutorials/41-synapse-to-fabric/
python 01_assessment.py \
--synapse-workspace "syn-prod-ws01" \
--resource-group "rg-analytics-prod" \
--output-dir "./assessment-output"
✅ Verification: Output directory contains: -
inventory.csv— every table, pipeline, notebook with size + last-used + dependency count -complexity_scores.csv— per-workload "Easy / Medium / Hard / Very Hard" classification -dependency_graph.json— what depends on what (used for migration ordering) -unsupported_features.md— Synapse features that don't have a 1:1 Fabric equivalent💡 Tip: Run assessment monthly during the migration project. New tables, new dependencies, and dropped objects all show up here.
Step 2 — Plan Migration Wave Order¶
Use the dependency graph to plan migration waves. Migrate leaves first, roots last.
python 01_assessment.py --command wave-plan \
--inventory ./assessment-output/inventory.csv \
--max-wave-effort-days 10
The tool produces migration-waves.md with: - Wave 1: Independent reference tables, no consumers - Wave 2: Bronze tables (Synapse staging → Fabric Bronze) - Wave 3: Silver / dimension tables - Wave 4: Gold / fact tables - Wave 5: BI semantic models, Power BI rebinding - Wave 6: Decommission Synapse
⚠️ Gotcha: If your Synapse has cross-database queries (linked Synapse SQL pools), those become cross-Lakehouse / cross-Warehouse queries in Fabric. Plan to migrate the database boundary together to avoid mid-migration latency spikes.
Step 3 — Convert Schemas (Dedicated SQL Pool → Fabric Warehouse)¶
python 02_schema_conversion.py \
--source-conn "Server=tcp:syn-prod-ws01.sql.azuresynapse.net;..." \
--target-warehouse "wh-analytics-prod" \
--target-workspace "ws-analytics-prod" \
--output-ddl ./output-ddl/
The script: 1. Reads source schema via INFORMATION_SCHEMA queries 2. Translates types (most are direct; flags incompatibilities) 3. Removes Synapse-specific clauses (DISTRIBUTION = HASH(...), CLUSTERED COLUMNSTORE INDEX) — Fabric Warehouse handles these automatically 4. Generates Fabric Warehouse DDL with comments noting changes 5. Writes per-table CREATE TABLE to output-ddl/{schema}.{table}.sql
Type Conversion Table¶
| Synapse Type | Fabric Warehouse Type | Notes |
|---|---|---|
bigint | bigint | Direct |
int | int | Direct |
decimal(p,s) | decimal(p,s) | Direct |
nvarchar(n) | varchar(n) | UTF-8 default in Fabric |
nvarchar(max) | varchar(max) | Direct |
datetime | datetime2(6) | Recommended upgrade |
time(n) | time(6) | Direct |
uniqueidentifier | uniqueidentifier | Direct |
binary(n) | varbinary(n) | Recommended |
image / text | varchar(max) / varbinary(max) | Deprecated — replace |
xml | NOT SUPPORTED | Convert to varchar(max) + JSON |
geography, geometry | NOT SUPPORTED in Warehouse | Use Lakehouse + sedona |
hierarchyid | NOT SUPPORTED | Convert to materialized path string |
sql_variant | NOT SUPPORTED | Convert to varchar(max) + type column |
Synapse-Specific Features to Remove¶
| Synapse Feature | Action |
|---|---|
DISTRIBUTION = HASH/REPLICATE/ROUND_ROBIN | Remove — Fabric handles automatically |
CLUSTERED COLUMNSTORE INDEX | Remove — Fabric uses Delta + V-Order |
WITH (HEAP) / CLUSTERED INDEX | Remove |
EXTERNAL TABLE over PolyBase | Convert to OneLake shortcut + Lakehouse table |
MASTER KEY / column encryption | Migrate to OneLake Security + sensitivity labels |
WORKLOAD GROUP / Resource governor | Use Fabric workspace + capacity assignment |
✅ Verification:
-- Run in Fabric Warehouse SQL endpoint SELECT COUNT(*) FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_TYPE = 'BASE TABLE'; -- Should equal source Synapse table count (excluding NOT SUPPORTED rows)💡 Tip: Synapse-specific
OPTIONhints (HASH JOIN,LABEL, etc.) almost never need to be kept. Fabric optimizer is different.
Step 4 — Migrate Storage (ADLS Gen2 → OneLake)¶
This is the biggest win in the migration. Don't copy data. Use OneLake shortcuts.
# Create a shortcut from your existing ADLS Gen2 path
az fabric onelake-shortcut create \
--workspace-id "$WS_ID" \
--lakehouse-id "$LH_ID" \
--name "synapse-data-lake" \
--path "Tables/synapse_data_lake" \
--target-adls-uri "abfss://container@account.dfs.core.windows.net/path/"
Or via Bicep:
resource shortcut 'Microsoft.Fabric/workspaces/items/shortcuts@2024-01-01' = {
name: 'synapse-data-lake'
properties: {
target: {
adlsGen2: {
location: 'https://${storageAccount}.dfs.core.windows.net/'
path: '/${container}/${path}'
subpath: '/'
}
}
}
}
Why this is huge: - Petabyte-scale data is not copied - Source ADLS retains full backup - Fabric reads in place at OneLake performance - Zero downtime - Bidirectional during coexistence (Synapse and Fabric both read same data)
✅ Verification: Query a sample table in Fabric and compare row count to Synapse.
# Fabric notebook df = spark.read.format("delta").load("Tables/synapse-data-lake/customer") print(df.count())⚠️ Gotcha: If Synapse stored data as Parquet (not Delta), you'll need to convert to Delta when reading into Fabric Lakehouse Tables. Convert via
CONVERT TO DELTASQL or rewrite via PySpark.
Step 5 — Migrate Synapse Spark Notebooks¶
Synapse and Fabric Spark are very similar. Most notebooks port with minor changes.
Conversion Checklist¶
| Synapse Pattern | Fabric Pattern |
|---|---|
mssparkutils (Synapse) | notebookutils.mssparkutils (Fabric) — same API mostly |
linkedService = "..." | notebookutils.credentials.getToken(...) |
| Spark config via Synapse Studio | Environment file + Variable Library |
%%spark magic | %%spark (works) |
| Notebook → Pipeline activity | Same — just point to new notebook |
# Synapse:
data = mssparkutils.fs.ls("abfss://...")
# Fabric:
from notebookutils import mssparkutils
data = mssparkutils.fs.ls("abfss://...")
✅ Verification: Run the migrated notebook end-to-end. Output should match Synapse output (within floating-point tolerance for aggregations).
💡 Tip: Use Fabric's notebook environment files (Wave 7 doc landing soon) to pin library versions for reproducibility.
Step 6 — Migrate Synapse Pipelines → Fabric Data Pipelines¶
Most activities have direct Fabric equivalents. The conversion script handles the JSON translation.
python 03_pipeline_migration.py \
--source-pipeline-folder ./synapse-pipelines/ \
--target-workspace "$FABRIC_WS_ID" \
--output-dir ./fabric-pipelines/
Activity Compatibility¶
| Synapse Activity | Fabric Activity | Notes |
|---|---|---|
| Copy data | Copy data | Direct |
| Notebook (Synapse) | Notebook (Fabric) | Re-point reference |
| Stored procedure | Stored procedure | Direct (after schema migration) |
| Lookup | Lookup | Direct |
| ForEach / Until / If | ForEach / Until / If | Direct |
| Web | Web | Direct |
| Wait | Wait | Direct |
| Execute Pipeline | Execute Pipeline | Direct |
| Mapping Data Flow | Dataflow Gen2 | Manual recreation in Power Query |
| Spark job definition | Spark job definition | Direct |
| Set variable | Set variable | Direct |
| Switch | Switch | Direct |
⚠️ Gotcha: Mapping Data Flows are the hardest part. Plan to recreate them as Dataflow Gen2 in Power Query. Budget 4-8 hours per non-trivial Data Flow.
💡 Tip: Pipeline parameters become Variable Library entries in Fabric — a cleaner pattern. Cleanup opportunity during migration.
Step 7 — Size Fabric Capacity from Synapse Consumption¶
python 01_assessment.py --command capacity-recommendation \
--inventory ./assessment-output/inventory.csv \
--dwu-baseline 5000 \
--spark-pool-vcores 200 \
--query-history-csv ./synapse-query-history.csv
Output: F-SKU recommendation with rationale.
DWU → CU Rough Sizing (Starting Point — Validate Empirically)¶
| Synapse Workload | Approximate Fabric F-SKU |
|---|---|
| DW100c | F8 |
| DW500c | F32 |
| DW1000c | F64 |
| DW3000c | F128 |
| DW6000c | F256 |
| DW7500c | F512 |
| DW15000c | F1024 |
| DW30000c | F2048 |
⚠️ Gotcha: This is a starting point only. Spark pool consumption + Power BI usage + real-time workloads all draw from the same CU pool in Fabric. Always run a parallel measurement during coexistence.
Step 8 — Validate Migration¶
Validation runs three categories of checks: row counts, content hashes, and query parity.
python 04_validation_suite.py \
--source-conn "$SYN_CONN" \
--target-warehouse "$FAB_WH_ID" \
--tables-yaml ./tables-to-validate.yaml \
--output-report ./validation-report.html
Three-Tier Validation¶
Tier 1: ROW COUNT ─────▶ count(*) match per table
Tier 2: CONTENT HASH ─────▶ hashbytes('SHA2_256', concat_ws('|', columns)) match per row
Tier 3: QUERY PARITY ─────▶ representative aggregations match within tolerance
| Check | Pass Threshold | Action on Fail |
|---|---|---|
| Row count | exact match | Re-run incremental load, check filter predicates |
| Content hash | 100% match | Investigate type conversion or NULL handling |
| Aggregation parity | within 0.001% (floating point) | Type precision review |
| Query latency | within 2× source p95 | Tune V-Order, partition strategy, query plan |
✅ Verification: All three tiers pass for ≥ 95% of in-scope tables before declaring migration complete.
Step 9 — Coexistence Period (Run Both for 2-4 Weeks)¶
During coexistence: - Both Synapse and Fabric read the same OneLake-shortcut data - Writes go to both systems (idempotent design helps; or use Fabric as truth, replicate to Synapse one-way) - BI reports point to Fabric; Synapse only used for legacy edge cases - Daily validation report confirms parity
💡 Tip: Use Action Groups (Wave 1 Bicep module) to alert on validation failures during coexistence.
Step 10 — Cutover¶
When validation has been green for 7 consecutive days:
- Freeze Synapse — pause pipelines, set DBs to read-only
- Final delta sync — capture last incremental changes
- Switch BI — repoint Power BI workspace to Fabric semantic model
- Switch consumers — apps, APIs, integrations to Fabric endpoints
- Monitor for 48 hours — close on-call bridge after no incidents
- Decommission (Step 11)
Step 11 — Decommission Synapse¶
# After 30-day grace period with all validation green:
# 1. Pause Dedicated SQL Pool (no compute charges, data retained)
az synapse sql pool pause --workspace-name $SYN_WS --name $POOL
# 2. Backup metadata for audit
python 05_backup_synapse_metadata.py --workspace $SYN_WS --output ./synapse-backup/
# 3. After 90 more days, delete pool (data is in OneLake/ADLS, not lost)
az synapse sql pool delete --workspace-name $SYN_WS --name $POOL --yes
# 4. After 180 days total, delete workspace if no other workloads
⚠️ Gotcha: ADLS Gen2 data referenced by OneLake shortcuts is NOT decommissioned — Fabric reads in place. Only the Synapse compute layer is removed.
✅ Verification (Final Checklist)¶
- All in-scope Synapse tables have Fabric Warehouse / Lakehouse equivalents
- Three-tier validation green for 7+ consecutive days
- OneLake shortcuts to ADLS Gen2 working at performance parity
- All Synapse Spark notebooks ported and producing identical output
- All Synapse Pipelines converted; Mapping Data Flows recreated in Dataflow Gen2
- Power BI semantic models repointed to Fabric (Direct Lake)
- Apps, APIs, integrations cut over
- F-SKU sized correctly (CU usage in steady-state band, see SLO/SLI doc)
- Synapse paused (or deleted post-grace-period)
- Cutover postmortem published
- Cost reduction realized vs prior Synapse + ADLS bill
🧹 Cleanup¶
This tutorial creates assessment artifacts in ./assessment-output/ and conversion artifacts in ./output-ddl/. Retain for audit; archive to long-term storage.
If your assessment ran against a real Synapse workspace, the read-only queries leave no trace. No rollback needed.
🚦 Next Steps¶
- Tutorial 42 — Databricks → Fabric — for shops with Databricks alongside Synapse
- Tutorial 45 — On-Prem SSAS/SSIS/SSRS → Fabric — for legacy SQL Server BI stacks
- Migration Patterns Best Practices — cross-source patterns
- fabric-cicd Deployment — automate the deployment of migrated artifacts
- SLO/SLI for Fabric — set service objectives for the new workloads
🛠️ Troubleshooting¶
| Issue | Cause | Resolution |
|---|---|---|
Type 'xml' is not supported during DDL apply | Synapse-specific type | Convert column to varchar(max) + parse to JSON in Silver layer |
| Row counts mismatch by ~1% | Different default NULL handling | Inspect type conversions; CDC catch-up may be in flight |
Spark notebook fails on mssparkutils import | Synapse-specific module path | Replace with from notebookutils import mssparkutils |
| OneLake shortcut shows 0 rows | Permissions on ADLS Gen2 | Grant Storage Blob Data Reader to Workspace Identity |
| Pipeline activity "not supported" | Mapping Data Flow | Recreate as Dataflow Gen2 in Power Query |
| Aggregation off by 0.01% | float vs decimal precision | Cast aggregation column to decimal explicitly |
| F-SKU throttles after migration | Workload concurrency higher than Synapse | Scale F-SKU; use capacity-throttling runbook |
| Power BI Direct Lake fails to refresh | Schema drift in Warehouse | Refresh metadata; see workspace monitoring |
📁 Key Files Referenced¶
| Step | File |
|---|---|
| 1, 2, 7 | tutorials/41-synapse-to-fabric/01_assessment.py |
| 3 | tutorials/41-synapse-to-fabric/02_schema_conversion.py |
| 6 | tutorials/41-synapse-to-fabric/03_pipeline_migration.md |
| 8 | tutorials/41-synapse-to-fabric/04_validation_suite.py (planned in Wave 9) |
| 11 | tutorials/41-synapse-to-fabric/05_backup_synapse_metadata.py (planned) |
| All | data_generation/generators/migration/synapse_workload_inventory.py |
📚 References¶
Microsoft Documentation¶
- Synapse to Fabric migration guide
- Fabric Warehouse T-SQL surface area
- OneLake shortcuts
- Fabric Pipelines vs Synapse Pipelines
Related Tutorials & Docs¶
- Tutorial 24 — Snowflake → Fabric (existing)
- Tutorial 10 — Teradata → Fabric (existing)
- Tutorial 42 — Databricks → Fabric (Wave 4)
- Migration Patterns
- fabric-cicd Deployment
- Capacity Planning