Reference Architecture — Fabric vs Synapse vs Databricks¶
TL;DR (2026): Synapse + Databricks is the production backbone today; Fabric is the strategic forward path for new workloads, especially Real-Time Intelligence and Direct Lake semantic models. The right answer is usually "both, sequenced over 18–36 months." Don't pick one universally; pick per-workload using the decision tree below.
The decision¶
flowchart TD
Start[New analytics workload] --> Q1{Is this workload<br/>net-new?}
Q1 -->|No, migrating| Q2{From what?}
Q2 -->|Synapse SQL pool / Spark| KeepSyn[Keep on Synapse;<br/>plan Fabric eval at next<br/>major schema change]
Q2 -->|Databricks AWS/GCP/onprem| KeepDBX[Migrate to Azure Databricks;<br/>see migration playbook]
Q2 -->|Snowflake / Redshift / BigQuery| Q5{Modern lakehouse OK?}
Q5 -->|Yes| Fabric[Fabric Lakehouse<br/>+ dbt + Direct Lake]
Q5 -->|Need Spark feature parity| DBX[Azure Databricks<br/>+ Unity Catalog]
Q1 -->|Yes, greenfield| Q3{What's the<br/>primary workload?}
Q3 -->|Real-time / streaming / IoT| Q3a{Sub-second latency?}
Q3a -->|Yes| RTI[Fabric RTI / Eventhouse<br/>KQL DB]
Q3a -->|Seconds-to-minutes| ASA[Stream Analytics or<br/>Databricks Structured Streaming]
Q3 -->|Heavy ML / DL / GenAI training| DBX
Q3 -->|Classic SQL warehouse / BI| Q4{Power BI primary<br/>consumer?}
Q4 -->|Yes| FabricLH[Fabric Lakehouse<br/>+ Direct Lake]
Q4 -->|No, mixed BI/SQL| Synapse[Synapse Serverless SQL<br/>over Delta]
Q3 -->|Data engineering / dbt / pipelines| Q6{Existing<br/>investment?}
Q6 -->|Heavy ADF| ADF[ADF + dbt + Synapse Spark<br/>or Databricks compute]
Q6 -->|Greenfield| FabricDF[Fabric Data Pipelines<br/>+ dbt-fabric]
Q3 -->|Lakehouse for AI grounding<br/>RAG/agents| Q7{Need vector search?}
Q7 -->|Yes| AISearch[ADLS Delta + AI Search<br/>+ AOAI]
Q7 -->|Use Fabric semantic search| FabAI[Fabric Lakehouse<br/>+ AI skills]
style Fabric fill:#0078d4,color:#fff
style FabricLH fill:#0078d4,color:#fff
style FabricDF fill:#0078d4,color:#fff
style FabAI fill:#0078d4,color:#fff
style RTI fill:#0078d4,color:#fff
style DBX fill:#ff6b35,color:#fff
style Synapse fill:#13a3b5,color:#fff
style KeepSyn fill:#13a3b5,color:#fff
style ASA fill:#13a3b5,color:#fff
style ADF fill:#13a3b5,color:#fff
style AISearch fill:#13a3b5,color:#fff Side-by-side¶
| Dimension | Fabric | Synapse | Databricks |
|---|---|---|---|
| Deployment model | SaaS (capacity SKU F2-F2048) | PaaS (workspaces + pools) | PaaS (workspaces + clusters) |
| Primary storage | OneLake (single namespace) | ADLS Gen2 (you bring) | ADLS Gen2 (you bring) + Unity Catalog |
| Primary table format | Delta Lake (auto-optimized) | Delta Lake or Parquet | Delta Lake (with Liquid Clustering) |
| SQL engine | Lakehouse SQL endpoint, Warehouse | Serverless SQL, Dedicated SQL Pool | Databricks SQL warehouses |
| Spark engine | Fabric Spark (forked from OSS) | Synapse Spark | Databricks Runtime (forked, optimized) |
| Streaming | Real-Time Intelligence (Eventhouse / KQL) | Structured Streaming, ASA bridge | Structured Streaming, Delta Live Tables |
| BI integration | Power BI Direct Lake (best in class) | Power BI Import/DirectQuery | Power BI Import/DirectQuery, Genie |
| Notebooks | Yes (Fabric notebooks) | Yes (Synapse notebooks) | Yes (Databricks notebooks — original UX) |
| ML platform | Fabric Data Science (preview) | Azure ML integration | MLflow native (best in class for ML) |
| Governance | Built-in (OneLake catalog) + Purview | Purview integration | Unity Catalog + Purview |
| Cost model | Capacity-based (F SKU $/hr, smoothed) | Per-pool (DWU) + per-query (serverless) | Per-cluster (DBU/hr) + storage |
| Auto-pause | Capacity is always on (smoothed) | Yes — pause SQL pool, autoscale Spark | Yes — auto-terminate clusters |
| Multi-cloud | Azure-only (AWS S3 read via shortcut) | Azure-only | AWS, Azure, GCP |
| Azure Government | Pre-GA, no MAG production yet | GA | GA |
| Maturity (2026) | GA but rapidly evolving | Mature, stable | Mature, stable |
| Best for | New BI workloads, RTI/IoT, Direct Lake semantic models, Power BI-first orgs | Existing Synapse investments, mixed SQL/Spark, federal/Gov | Heavy ML/DL/GenAI, multi-cloud, Spark experts |
Cost comparison (rough, 2026)¶
For a typical medium analytics workload (~5 TB Delta, 20 dbt models, daily refresh, BI to 200 users):
| Platform | Monthly cost (USD, dev) | Monthly cost (USD, prod) | Notes |
|---|---|---|---|
| Fabric | $260 (F2 8h/day) | $5,200 (F64 24/7) | Capacity is shared across BI + Lakehouse + RTI; smoothing helps |
| Synapse | $400 (Serverless + small Spark) | $4,800 (DW100c + Spark XS) | Serverless wins for spiky workloads; Dedicated wins for predictable |
| Databricks | $500 (Standard, auto-terminate) | $6,500 (Premium SKU + Photon) | DBU pricing varies a LOT by SKU and Photon usage |
These are order-of-magnitude estimates. Actual costs depend on query patterns, idle time, region, and reserved-capacity discounts. Always model with the real Azure Pricing Calculator before committing.
When to combine (not pick one)¶
This is the most common production answer:
flowchart LR
Sources[Sources] --> ADF
Sources --> EH[Event Hubs]
ADF --> ADLS[(ADLS Delta<br/>shared bronze)]
EH --> RTI[Fabric RTI<br/>streaming gold]
ADLS --> DBX[Databricks<br/>silver+gold dbt<br/>+ ML]
DBX --> ADLS
ADLS -.OneLake shortcut.-> Fabric[Fabric Lakehouse<br/>BI surface]
RTI --> Fabric
Fabric --> PBI[Power BI<br/>Direct Lake]
DBX --> ML[Azure ML<br/>training/inference]
DBX --> AOAI[AOAI + AI Search<br/>RAG/agents] - Databricks does the heavy lifting for transformations and ML
- Fabric is the BI presentation layer (Direct Lake reads the same Delta files via OneLake shortcut, no duplication)
- Fabric RTI handles the streaming gold for real-time dashboards
- Synapse is conspicuously absent from this picture for new workloads; it remains a strong choice for existing workloads and Azure Gov where Fabric isn't GA
Workload-fit matrix¶
| Workload | Best | Acceptable | Avoid |
|---|---|---|---|
| Power BI dashboards (large semantic models) | Fabric Direct Lake | Synapse + Import | Databricks SQL alone |
| Heavy Spark ML / GenAI training | Databricks | Synapse Spark | Fabric Spark (immature) |
| Real-time IoT (sub-second) | Fabric RTI / Eventhouse | Stream Analytics | Synapse Spark Streaming |
| Real-time analytics (seconds) | Fabric RTI, Databricks DLT | Synapse Spark Streaming | Synapse SQL |
| Ad-hoc analyst SQL over Delta | Synapse Serverless, Databricks SQL | Fabric Lakehouse SQL | Fabric Warehouse (preview-feel) |
| Federal / Gov workloads (today) | Synapse + Databricks | Synapse only | Fabric (pre-GA in MAG) |
| Multi-cloud (AWS/GCP source) | Databricks | Fabric (S3 shortcuts) | Synapse |
| Cost-sensitive POC | Synapse Serverless | Databricks Standard | Fabric F-SKU (capacity always on) |
| Net-new BI-first org | Fabric | Synapse | Databricks-only |
Migration sequencing (real-world)¶
If you have an existing Synapse + Databricks investment, the typical 18–36 month path is:
gantt
title Synapse+Databricks → Fabric (typical enterprise)
dateFormat YYYY-MM
section Discovery
Audit current workloads :2026-01, 3M
Pick pilot workloads :2026-02, 1M
section Pilot
Fabric capacity provisioned :2026-04, 1M
Pilot 1 BI workload to Fabric :2026-04, 4M
Pilot 1 RTI workload :2026-06, 4M
section Wave 1
Migrate Tier-2 BI to Fabric :2026-08, 6M
Keep ML on Databricks :2026-08, 18M
Keep Synapse SQL pools :2026-08, 12M
section Wave 2
Migrate Tier-1 BI :2027-02, 6M
Decommission first SQL pool :2027-08, 3M
section Steady state
Fabric for BI + RTI :2028-01, 6M
Databricks for ML :2028-01, 6M
Synapse archived :2028-04, 1M The point is don't try to forklift. Move workloads when they're already in flight (schema change, cost optimization, new feature) — never just because of platform fashion.
Trade-offs summary¶
✅ Why Fabric — Best Power BI integration, OneLake unifies storage, RTI is genuinely good, simpler ops model (one capacity) ⚠️ Why not Fabric (yet) — Pre-GA in Gov, immature ML, capacity model can be expensive for spiky workloads, Spark is forked-OSS not Photon
✅ Why Synapse — Mature, Gov GA, Serverless SQL is brilliant for ad-hoc, Dedicated SQL Pool is a real DW ⚠️ Why not Synapse — Microsoft's investment focus is on Fabric; Synapse is in maintenance mode; new features land in Fabric first
✅ Why Databricks — Best Spark/ML/GenAI runtime, Unity Catalog is excellent, multi-cloud, Photon is fast, MLflow native ⚠️ Why not Databricks — Pricier than Fabric for BI-only workloads, separate identity model adds complexity, Power BI integration is good but not Direct Lake-class
Related¶
- ADR 0010 — Fabric Strategic Target
- ADR 0002 — Databricks over OSS Spark
- ADR 0018 — Fabric RTI Adapter
- Decision — Fabric vs Databricks vs Synapse (the quick-pick version)
- Migration — Databricks to Fabric
- Use Case — Unified Analytics on Fabric
- Patterns — Power BI & Fabric Roadmap
- Supercharge Microsoft Fabric — companion site with tutorials, feature guides, and production best practices for Fabric