
Feature Mapping — Databricks to Microsoft Fabric (Complete)

Status: Authored 2026-04-30
Audience: Platform engineers, data architects, and migration leads who need a line-by-line mapping of Databricks capabilities to Fabric equivalents.
Scope: 60 features across compute, storage, governance, ML, streaming, orchestration, SQL, DevOps, and security.


How to read this document

Each feature is mapped with:

  • Databricks capability -- what it does and how it works on Databricks
  • Fabric equivalent -- the closest Fabric feature or workaround
  • Parity level -- Full, Partial, Gap, or Better (Fabric exceeds Databricks)
  • Migration notes -- what to watch for during migration

Parity levels:

| Level | Meaning |
|---|---|
| Full | Fabric provides equivalent or identical capability |
| Partial | Fabric covers most use cases but has specific gaps |
| Gap | No direct Fabric equivalent; workaround or external service required |
| Better | Fabric provides a materially better experience for this capability |

For features marked Partial or Gap, consult the dedicated migration guide linked in each section for workaround details and code examples.


1. Compute

Databricks compute is cluster-based: you provision VMs, configure autoscaling, choose a runtime version, and optionally enable Photon. Fabric Spark is fully serverless -- there are no clusters to manage. Sessions start on demand and consume Capacity Units (CU) from a shared pool.

This is the most significant paradigm shift in the migration. Teams accustomed to tuning cluster sizes, spot instance ratios, and init scripts will find Fabric's hands-off model simpler but less configurable.

| # | Databricks feature | Fabric equivalent | Parity | Migration notes |
|---|---|---|---|---|
| 1 | All-Purpose Clusters (interactive Spark) | Fabric Spark session (notebook-attached) | Partial | No persistent cluster; session starts per notebook. Startup is ~30-60s. No Photon. |
| 2 | Jobs Clusters (ephemeral, scheduled) | Fabric Spark job definition | Full | Submit PySpark/Scala jobs via Data Pipeline. CU-based billing replaces DBU + VM. |
| 3 | Photon (C++ vectorized engine) | None (OSS Spark only) | Gap | Photon-dependent queries may be 2-5x slower on Fabric Spark. Benchmark before migration. See benchmarks.md for measurements. |
| 4 | Serverless Compute (Databricks-managed VMs) | Fabric Spark (always serverless) | Better | All Fabric Spark is serverless -- zero cluster management. Simpler ops model. |
| 5 | GPU Clusters (ML training) | None on Fabric Spark | Gap | Use Azure ML compute for GPU workloads. See ml-migration.md. |
| 6 | Cluster Policies (governance guardrails) | Fabric capacity admin settings | Partial | Control max CU consumption and auto-pause behavior, but less granular than per-cluster policies (no VM family restrictions, no tag enforcement). |
| 7 | Instance Pools (pre-warmed VMs) | Not applicable | N/A | Fabric Spark is serverless; no VM pool concept needed. Sessions start in 30-60s without pre-warming. |
| 8 | Init Scripts (cluster startup customization) | Fabric environment + %pip install | Partial | No arbitrary bash init scripts. System-level packages (e.g., apt-get, custom JDK) are not installable. Use Fabric environments for Python/R library management. |
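
As a sketch of the init-script replacement in row 8 (the package and version below are purely illustrative): session-scoped libraries install inline with %pip, while shared or versioned dependencies belong in a Fabric environment attached to the notebook or set as the workspace default.

```python
# Fabric notebook cell: session-scoped install, the closest analog to a
# pip-based init script. System-level setup (apt-get, custom JDK) has no
# Fabric equivalent -- keep workloads that need it on Databricks.
%pip install great-expectations==0.18.19
```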

Key takeaway: For compute, the main gaps are Photon (performance) and GPU (ML training). If your workloads do not depend on either, Fabric's serverless model is an upgrade.


2. Notebooks and development

Databricks notebooks and Fabric notebooks are conceptually similar: multi-language, cell-based, Spark-attached. The migration is straightforward for PySpark and SQL cells. The main friction points are Scala (not supported in Fabric), dbutils (replaced by mssparkutils), and Databricks Connect (no direct equivalent).

| # | Databricks feature | Fabric equivalent | Parity | Migration notes |
|---|---|---|---|---|
| 9 | Databricks Notebooks (multi-language) | Fabric Notebooks (PySpark, Spark SQL, R) | Full | Similar experience. Fabric notebooks support PySpark, SQL, and R. No Scala support. |
| 10 | %sql magic command | SQL cell type (cell language selector) | Full | Create a SQL cell instead of using the %sql prefix. Syntax is identical. |
| 11 | %python, %r, %scala magic commands | Cell language selector (dropdown) | Partial | PySpark and R are supported. Scala is not available in Fabric notebooks. Rewrite Scala cells in PySpark before migration. |
| 12 | dbutils.fs (file system utilities) | mssparkutils.fs | Full | Direct API equivalent. mssparkutils.fs.ls(), .cp(), .rm(), .head(), .mkdirs(), .mv(), .put(). |
| 13 | dbutils.secrets (secret management) | mssparkutils.credentials + Azure Key Vault | Full | mssparkutils.credentials.getSecret("vault-name", "secret-name"). Requires Key Vault linked to workspace. |
| 14 | dbutils.widgets (parameterized notebooks) | mssparkutils.notebook.getParam() + pipeline parameters | Full | Pass parameters from a Data Pipeline notebook activity or mssparkutils.notebook.run(). |
| 15 | dbutils.notebook.run() (notebook orchestration) | mssparkutils.notebook.run() | Full | Same pattern: call child notebooks with parameters and receive exit values. Can also use Data Pipelines for multi-notebook orchestration. |
| 16 | Databricks Connect (remote Spark from IDE) | Fabric REST API + Lakehouse JDBC/ODBC + VS Code for Fabric (preview) | Partial | No direct Spark Connect equivalent that lets a local Python process submit Spark jobs to a remote cluster. Use the Fabric REST API for job submission, JDBC/ODBC for SQL, or VS Code for Fabric for notebook editing. See notebook-migration.md. |
| 17 | Repos (Git integration) | Fabric Git integration (Azure DevOps, GitHub) | Full | Fabric workspaces sync with Git repos. Items are serialized as JSON/definition files in Git. |
| 18 | Databricks Assistant (AI code help) | Copilot in Fabric notebooks | Full | Both provide AI-assisted code generation, explanation, and debugging in notebooks. |
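
To make the dbutils rows concrete, here is a hedged before/after sketch of a typical utility cell, following the mappings in rows 12-15 above. The vault, secret, parameter, and notebook names are placeholders.

```python
# Converted utility cell: each call replaces the dbutils equivalent noted
# in the comment above it.
from notebookutils import mssparkutils  # preloaded in Fabric notebooks

# dbutils.fs.ls("/mnt/raw") -> list files in the Lakehouse Files section
files = mssparkutils.fs.ls("Files/raw")

# dbutils.secrets.get("scope", "sql-password") -> Key Vault-backed lookup
password = mssparkutils.credentials.getSecret("my-vault-name", "sql-password")

# dbutils.widgets.get("run_date") -> parameter from a pipeline notebook activity
run_date = mssparkutils.notebook.getParam("run_date")

# dbutils.notebook.run("child", 300, {...}) -> same orchestration pattern
exit_value = mssparkutils.notebook.run("child_notebook", 300, {"run_date": run_date})
```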

Key takeaway: Notebooks are the easiest migration surface. 8 of 10 features have Full parity. The two exceptions are Scala (rewrite) and Databricks Connect (use alternatives).


3. SQL analytics

Databricks SQL (DBSQL) is a SQL-first query service backed by Photon-optimized warehouses. Fabric's equivalent is the Lakehouse SQL endpoint (always-on, read-only SQL over Delta tables) plus Power BI for dashboarding. For most BI query patterns, Fabric with Direct Lake is faster and cheaper than DBSQL.

| # | Databricks feature | Fabric equivalent | Parity | Migration notes |
|---|---|---|---|---|
| 19 | Databricks SQL Warehouse (DBSQL) | Fabric Lakehouse SQL endpoint | Better | SQL endpoint is always-on within capacity -- no warehouse to start, no cold-start delay. Read-only; writes go through Spark/Pipelines. |
| 20 | DBSQL Dashboard | Power BI (native in Fabric) | Better | Power BI is a full BI platform vs DBSQL's simple SQL dashboard. Richer visuals, sharing, RLS, and embedding. |
| 21 | DBSQL Alerts | Data Activator (event-driven triggers) | Full | Data Activator monitors data conditions and triggers Teams notifications, pipelines, or emails. Richer trigger types than DBSQL alerts. |
| 22 | DBSQL Query History | Fabric monitoring hub + Capacity Metrics app | Full | Query-level monitoring available in the Fabric admin portal. Historical usage tracked in the Capacity Metrics app. |
| 23 | Parameterized Queries (DBSQL) | Power BI slicers + Lakehouse SQL parameters | Full | Different mechanism (visual slicers instead of SQL parameters) but achieves the same interactive filtering outcome. |
| 24 | Query Federation (DBSQL to external DBs) | Fabric shortcuts + mirroring | Partial | Shortcuts cover ADLS, S3, GCS, Dataverse. Mirroring covers Azure SQL, Cosmos DB, Snowflake. No arbitrary JDBC/ODBC federation to external databases. |
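
For client tools that previously queried a DBSQL warehouse over ODBC, the Lakehouse SQL endpoint accepts standard T-SQL connections. A minimal sketch, assuming the Microsoft ODBC Driver 18 is installed; the server name is a placeholder for the connection string shown in the Lakehouse item's settings in the Fabric portal.

```python
# Query the read-only Lakehouse SQL endpoint from a local client.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=myworkspace.datawarehouse.fabric.microsoft.com;"  # placeholder host
    "Database=my_lakehouse;"
    "Authentication=ActiveDirectoryInteractive;"  # Entra ID sign-in
    "Encrypt=yes;"
)

# The endpoint is read-only: SELECTs work, INSERT/UPDATE/DELETE do not.
rows = conn.execute("SELECT TOP 10 * FROM dbo.sales ORDER BY order_date DESC").fetchall()
for row in rows:
    print(row)
```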

Key takeaway: SQL analytics is a Fabric strength. The always-on SQL endpoint and native Power BI integration make this the highest-ROI migration target.


4. Data engineering and orchestration

Databricks data engineering centers on notebooks, DLT pipelines, Workflows, and Auto Loader. Fabric's equivalent is a combination of Data Pipelines (ADF v2), Spark notebooks, and dbt-fabric. The paradigm shifts from "declarative DLT" to "dbt + pipeline orchestration" -- a well-understood pattern in the analytics engineering community.

| # | Databricks feature | Fabric equivalent | Parity | Migration notes |
|---|---|---|---|---|
| 25 | Delta Live Tables (DLT) | Fabric Data Pipelines + dbt-fabric | Partial | No declarative DLT equivalent in Fabric. Use Data Pipelines for orchestration and dbt for SQL transformations. See dlt-migration.md for detailed conversion patterns. |
| 26 | DLT Expectations (data quality rules) | dbt tests + Great Expectations | Partial | dbt tests provide warn, error, and store_failures behaviors that approximate DLT's expect, expect_or_drop, and expect_or_fail. Setup is manual rather than declarative. |
| 27 | DLT Materialized Views | Lakehouse tables refreshed by dbt or notebook | Full | Write results to Lakehouse tables on a schedule. dbt materialized='table' provides the same outcome. |
| 28 | Databricks Workflows (multi-task job orchestration) | Fabric Data Pipelines | Full | ADF-based orchestration with Fabric-specific activities (notebook, Spark job, copy, dataflow). Richer DAG support than Workflows, with visual designer. |
| 29 | Auto Loader (incremental file ingestion with schema evolution) | Fabric Data Pipelines (copy activity + event trigger) or Spark file streaming | Partial | Data Pipelines can trigger on new files via Storage Events. Spark readStream on files also works. Neither provides Auto Loader's automatic schema inference and evolution. See streaming-migration.md. |
| 30 | Delta table OPTIMIZE / VACUUM | Fabric auto-optimization (V-Order compaction) | Better | Fabric Lakehouse auto-compacts files and applies V-Order during write. No manual OPTIMIZE needed. VACUUM is available for explicit cleanup. |
| 31 | Delta table CLONE (shallow/deep copy) | Fabric shortcut (shallow) or table copy (deep) | Partial | Shortcuts provide zero-copy reference (similar to shallow clone). No direct CLONE SQL command; use CREATE TABLE AS SELECT for deep copy. |
| 32 | Unity Catalog volumes (managed file storage) | OneLake Files section in Lakehouse | Full | Lakehouse Files section stores unstructured files (CSV, JSON, images, etc.) alongside managed Delta tables. Accessible via mssparkutils.fs. |
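
Two of the rows above reduce to a few lines of Spark in a Fabric notebook. A minimal sketch with placeholder table names: a CREATE TABLE AS SELECT standing in for deep CLONE (row 31), and a scheduled table rebuild standing in for a DLT materialized view (row 27).

```python
# Runs in a Fabric notebook, where `spark` is provided by the session.

# Row 31: no CLONE SQL command -- a deep copy is a CREATE TABLE AS SELECT.
spark.sql("""
    CREATE TABLE sales_backup_2026_04_30
    AS SELECT * FROM sales
""")

# Row 27: a DLT materialized view becomes a Lakehouse table rebuilt on a
# schedule (e.g., this cell invoked by a Data Pipeline notebook activity).
daily = (
    spark.table("sales")
         .groupBy("order_date")
         .agg({"amount": "sum"})
         .withColumnRenamed("sum(amount)", "total_amount")
)
daily.write.mode("overwrite").format("delta").saveAsTable("sales_daily_mv")
```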

Key takeaway: DLT migration is the most complex area. Plan 2-10 days per DLT pipeline to convert to dbt + Data Pipeline. Auto Loader replacement requires careful evaluation of your file ingestion patterns.


5. Storage and data lake

This is an area where Fabric has a structural advantage. OneLake is a tenant-wide, unified data lake that every Fabric workspace writes to automatically. Databricks storage is more fragmented across DBFS, external locations, Unity Catalog volumes, and cloud storage mounts.

| # | Databricks feature | Fabric equivalent | Parity | Migration notes |
|---|---|---|---|---|
| 33 | DBFS (Databricks File System) | OneLake | Better | OneLake is tenant-wide (not workspace-scoped), with a single hierarchical namespace. All Fabric items (Lakehouses, Warehouses, etc.) write to OneLake automatically. |
| 34 | External Locations (UC-registered cloud storage) | OneLake shortcuts | Better | Shortcuts present external data (ADLS Gen2, S3, GCS, Dataverse) as native Lakehouse tables/files without copying. No CREATE EXTERNAL LOCATION ceremony or storage credentials to manage. |
| 35 | Delta Lake (open table format) | Delta Lake (same format) | Full | Fabric reads and writes Delta natively. Same Parquet files + _delta_log transaction log. Tables are interoperable between Databricks and Fabric. |
| 36 | Delta Sharing (cross-organization data sharing) | OneLake shortcuts + Fabric data sharing (preview) | Partial | Internal sharing uses shortcuts (zero-copy). Cross-tenant sharing is evolving in Fabric. Delta Sharing protocol is supported for external consumers but requires manual setup. |

Key takeaway: OneLake shortcuts are the foundation of the hybrid architecture. Create shortcuts to existing ADLS paths and both Databricks and Fabric can read the same Delta tables with zero data duplication.
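
To make the takeaway concrete: OneLake exposes an ADLS Gen2-compatible endpoint, so both engines can address the same Delta files. A minimal sketch, assuming a workspace named MyWorkspace, a lakehouse named my_lakehouse, and a table named sales (all placeholders), with an identity on the Databricks side that has OneLake access configured.

```python
# On Fabric: a shortcut to an existing ADLS path surfaces as a normal table.
df_fabric = spark.read.table("sales")  # shortcut-backed Lakehouse table

# On Databricks: read the same Fabric-managed table straight from OneLake
# through its ADLS Gen2-compatible abfss endpoint (auth config not shown).
df_databricks = (
    spark.read.format("delta").load(
        "abfss://MyWorkspace@onelake.dfs.fabric.microsoft.com/"
        "my_lakehouse.Lakehouse/Tables/sales"
    )
)
```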


6. Governance and security

Unity Catalog is Databricks' centralized governance layer with a mature three-level namespace, fine-grained access control, and integrated lineage. Fabric distributes governance across workspace roles, OneLake permissions, Purview, and Entra ID. The total capability is comparable, but the mapping requires careful planning. See unity-catalog-migration.md for the complete mapping guide.

| # | Databricks feature | Fabric equivalent | Parity | Migration notes |
|---|---|---|---|---|
| 37 | Unity Catalog (3-level namespace: catalog.schema.table) | OneLake + Workspace + Lakehouse metadata | Partial | No direct 3-level namespace. Workspace = catalog analog, Lakehouse = schema analog. Cross-referencing requires shortcuts. See unity-catalog-migration.md. |
| 38 | Column-level security (UC column masks) | Fabric Warehouse column-level DENY | Partial | Available only in Fabric Warehouse, not in Lakehouse SQL endpoint. Route sensitive tables to Warehouse if column-level security is required. |
| 39 | Row-level security (UC row filters) | Fabric Warehouse RLS + Power BI RLS | Full | Warehouse supports SQL-based RLS. Power BI adds report-level RLS. Combined, they cover the same use cases as UC row filters. |
| 40 | Data lineage (UC table and column lineage) | Microsoft Purview lineage | Full | Purview tracks lineage across Fabric items, Azure SQL, Synapse, and external sources. Requires Purview setup and scanning. |
| 41 | Data classification (UC tags and metadata) | Purview sensitivity labels + classifications | Full | Purview provides richer classification with Microsoft Information Protection (MIP) integration. Auto-classification supported. |
| 42 | Service principal authentication | Fabric service principal (Entra ID) | Full | Same Entra ID (Azure AD) service principals work for both Databricks and Fabric on Azure. No credential migration needed. |
| 43 | IP access lists (network restrictions) | Fabric Private Links + Entra ID Conditional Access | Full | Use Azure Private Link for network isolation. Entra ID Conditional Access provides policy-based access control (device, location, risk). |
| 44 | Audit logs (account-level audit) | Azure Monitor + Microsoft 365 Unified Audit Log + Fabric admin monitoring | Full | Multiple audit surfaces: Azure Monitor for infrastructure, M365 audit for user actions, Fabric admin for workspace-level events. |

Key takeaway: Governance migration is complex but achievable. The main risk is losing column-level security if tables stay in Lakehouse (route to Warehouse instead). Connect Purview to Fabric early so lineage builds from day one.


7. Machine learning and AI

This is Databricks' strongest advantage. MLflow is the industry standard, Model Serving is production-ready, Feature Store is mature, and GPU clusters are available for training. Fabric's ML surface is functional but less mature. For heavy ML workloads, the recommendation is to keep them on Databricks. See ml-migration.md for the detailed migration guide.

| # | Databricks feature | Fabric equivalent | Parity | Migration notes |
|---|---|---|---|---|
| 45 | MLflow (experiment tracking) | Fabric ML experiments (MLflow API compatible) | Partial | Fabric supports the MLflow API for logging experiments. The experiment viewer is less feature-rich than Databricks. UC model lineage is not replicated. |
| 46 | Model Registry (UC-integrated model catalog) | Fabric ML model registry | Partial | Basic model registry with versioning. No Unity Catalog integration or model lineage graph. |
| 47 | Model Serving (real-time inference endpoints) | Azure ML managed online endpoints | Gap | No native Fabric model serving. Deploy models to Azure ML managed endpoints or Azure Container Apps. Adds a second service to manage. |
| 48 | Feature Store (feature engineering + serving) | Fabric feature engineering (preview, April 2026) | Partial | Preview feature with basic functionality. No online feature serving. Evaluate maturity before migrating. |
| 49 | AutoML (automated model selection) | Fabric AutoML (FLAML-based) | Full | Both provide automated model selection and hyperparameter tuning for tabular data. Fabric uses FLAML under the hood. |
| 50 | Vector Search (embedding similarity search) | Azure AI Search (vector index) | Gap | No native Fabric vector search. Azure AI Search provides vector, hybrid, and keyword search. Requires separate Azure service. |
| 51 | Databricks Apps (hosted ML/data apps) | Azure ML + Azure Container Apps | Gap | No equivalent app hosting in Fabric. Deploy Streamlit/Gradio/Flask apps via Azure Container Apps or Azure App Service. |
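
Because Fabric accepts the MLflow logging API (row 45), existing tracking code typically moves unchanged. A minimal sketch with placeholder experiment, parameter, and metric names; in Fabric, runs attach to a workspace ML experiment item rather than a Databricks workspace experiment.

```python
# Same MLflow calls used on Databricks, executed in a Fabric notebook.
import mlflow

mlflow.set_experiment("churn-model-migration")

with mlflow.start_run(run_name="baseline"):
    mlflow.log_param("max_depth", 6)
    mlflow.log_metric("auc", 0.87)
    # mlflow.sklearn.log_model(model, "model")  # versions into Fabric's registry
```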

Key takeaway: For teams with significant ML workloads, keep ML training and serving on Databricks. Migrate experiment tracking and AutoML for simple models only. The hybrid pattern (Databricks for ML, Fabric for BI) is the recommended approach.


8. Streaming and real-time

Fabric Real-Time Intelligence (RTI) is genuinely better than Databricks for sub-second streaming analytics. Eventhouse + Eventstream provide purpose-built event ingestion and KQL-based querying that outperforms Structured Streaming + DLT for real-time dashboards. For complex streaming ETL (joins, windows, UDFs), Spark Structured Streaming on Fabric is equivalent to Databricks (without Photon). See streaming-migration.md for the complete guide.

| # | Databricks feature | Fabric equivalent | Parity | Migration notes |
|---|---|---|---|---|
| 52 | Structured Streaming (Spark micro-batch/continuous) | Fabric Spark Structured Streaming | Full | Same Spark Structured Streaming API runs in Fabric notebooks. Same readStream / writeStream / trigger patterns. |
| 53 | Structured Streaming + Auto Loader (file-based streaming) | Fabric Spark file streaming + Data Pipeline event triggers | Partial | Auto Loader's glob-based file detection with automatic schema evolution has no exact equivalent. Use Spark readStream.format("json/csv") with maxFilesPerTrigger or Data Pipeline storage event triggers. |
| 54 | Delta Live Tables (streaming mode) | Fabric Real-Time Intelligence (Eventhouse + Eventstream) | Better | Eventhouse provides sub-second ingestion and KQL querying optimized for time-series data. RTI dashboards refresh in real-time. DLT streaming is micro-batch (seconds-minutes). |
| 55 | Kafka / Event Hubs integration | Fabric Eventstream + Eventhouse | Better | Eventstream provides no-code routing from Event Hubs and Kafka to Eventhouse, Lakehouse, or Data Activator. Built-in monitoring. No cluster to manage. |
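
As a sketch of the Auto Loader replacement in row 53 (placeholder paths and schema): the schema must be declared up front, since Fabric's plain file streaming offers no automatic inference or evolution.

```python
# File-based streaming into a bronze Delta table in a Fabric notebook.
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

schema = StructType([
    StructField("order_id", StringType()),
    StructField("amount", DoubleType()),
])

stream = (
    spark.readStream.format("json")
         .schema(schema)                     # required: no inference on streams
         .option("maxFilesPerTrigger", 100)  # bound files consumed per batch
         .load("Files/landing/orders/")
)

(stream.writeStream
       .format("delta")
       .option("checkpointLocation", "Files/checkpoints/orders")
       .trigger(processingTime="1 minute")
       .toTable("orders_bronze"))
```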

Key takeaway: Streaming is a Fabric strength for analytics use cases (dashboards, alerts, KQL queries). For complex streaming ETL that writes to Delta tables, Fabric Spark Structured Streaming works but consider whether the ETL belongs on Databricks (Photon advantage) with results shortcutted to Fabric for BI.


9. DevOps and CI/CD

Databricks Asset Bundles (DABs) and the Databricks CLI provide IaC-style deployment. Fabric uses Git integration (workspace-to-repo sync) and deployment pipelines (dev/test/prod promotion). The Fabric approach is more opinionated (built-in dev/test/prod stages) but less flexible for custom IaC patterns.

| # | Databricks feature | Fabric equivalent | Parity | Migration notes |
|---|---|---|---|---|
| 56 | Repos (workspace Git sync) | Fabric Git integration (Azure DevOps, GitHub) | Full | Connect a Fabric workspace to a Git repo. Items are serialized as definition files. Commit, pull, branch workflows supported. |
| 57 | Databricks Asset Bundles (IaC deployment) | Fabric deployment pipelines | Partial | Deployment pipelines support dev -> test -> prod promotion with built-in UI. Less flexible than DABs for custom CI/CD (no Terraform-style IaC). |
| 58 | REST API (workspace and job management) | Fabric REST API | Full | Comprehensive REST API covering workspaces, items, jobs, shortcuts, and admin operations. |
| 59 | Terraform provider (IaC) | Fabric Terraform provider (preview) | Partial | The Fabric Terraform provider is newer and covers fewer resources than the Databricks provider. Evaluate coverage for your specific IaC needs. |
| 60 | Databricks CLI (command-line tool) | Fabric CLI (preview) + Azure CLI (az commands) | Partial | Azure CLI covers some Fabric operations. The Fabric-specific CLI is evolving and not yet feature-complete. Use the REST API for full coverage. |
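
A hedged sketch of driving Fabric from code via the REST API (row 58), here listing the items in a workspace with a token from azure-identity. The token scope and workspace-items route follow the public API at api.fabric.microsoft.com/v1; verify exact routes and payloads against the current API reference, and treat the workspace ID as a placeholder.

```python
# List Fabric workspace items with a service principal or developer identity.
import requests
from azure.identity import DefaultAzureCredential

token = DefaultAzureCredential().get_token(
    "https://api.fabric.microsoft.com/.default"
).token

workspace_id = "00000000-0000-0000-0000-000000000000"  # placeholder GUID
resp = requests.get(
    f"https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}/items",
    headers={"Authorization": f"Bearer {token}"},
    timeout=30,
)
resp.raise_for_status()
for item in resp.json().get("value", []):
    print(item["type"], item["displayName"])
```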

Key takeaway: DevOps migration is straightforward for teams using Git integration. Teams with heavy DABs or Terraform usage should evaluate the Fabric Terraform provider's coverage before migrating CI/CD pipelines.


10. Summary parity scorecard

| Category | Full | Partial | Gap | Better | N/A | Total |
|---|---|---|---|---|---|---|
| Compute (8) | 1 | 3 | 2 | 1 | 1 | 8 |
| Notebooks & Dev (10) | 8 | 2 | 0 | 0 | 0 | 10 |
| SQL Analytics (6) | 3 | 1 | 0 | 2 | 0 | 6 |
| Data Engineering (8) | 3 | 4 | 0 | 1 | 0 | 8 |
| Storage (4) | 1 | 1 | 0 | 2 | 0 | 4 |
| Governance (8) | 6 | 2 | 0 | 0 | 0 | 8 |
| ML & AI (7) | 1 | 3 | 3 | 0 | 0 | 7 |
| Streaming (4) | 1 | 1 | 0 | 2 | 0 | 4 |
| DevOps (5) | 2 | 3 | 0 | 0 | 0 | 5 |
| Total (60) | 26 | 20 | 5 | 8 | 1 | 60 |

43% Full parity. 13% Better than Databricks. 33% Partial (workable with adjustments). 8% Gap (needs an external service). One feature (instance pools) is N/A under Fabric's serverless model.

Reading the scorecard

  • 57% Full + Better (34 features): These migrate cleanly or improve with Fabric.
  • 33% Partial (20 features): These work but require workarounds, configuration changes, or different tooling. Each is addressed in the dedicated migration guide.
  • 8% Gap (5 features): Photon, GPU clusters, Model Serving, Vector Search, Databricks Apps. All five are in ML/AI and compute. They require external Azure services (Azure ML, Azure AI Search, Azure Container Apps) or staying on Databricks.

Migration priority by parity

| Priority | Category | Parity | Action |
|---|---|---|---|
| 1 -- Migrate first | SQL Analytics, Storage | Mostly Better/Full | Highest ROI, lowest risk |
| 2 -- Migrate next | Notebooks, Governance, DevOps | Mostly Full | Straightforward conversion |
| 3 -- Evaluate carefully | Data Engineering, Streaming | Mixed Full/Partial/Better | DLT and Auto Loader require significant work |
| 4 -- Keep on Databricks | ML & AI, Compute (Photon/GPU) | Mostly Gap/Partial | Hybrid pattern recommended |

11. Gap closure roadmap

Microsoft is actively closing Fabric gaps. The following items are on the public roadmap or in preview as of April 2026. Monitor Microsoft Fabric release notes for GA announcements.

| Gap | Current status | Expected timeline | Interim workaround |
|---|---|---|---|
| Column-level security on Lakehouse | Not available | Roadmap (no date) | Use Fabric Warehouse for sensitive tables |
| Fabric Terraform provider (full coverage) | Preview (partial) | H2 2026 (estimated) | Use Fabric REST API + Azure CLI |
| Feature Store (GA) | Preview | H2 2026 (estimated) | Keep on Databricks or use manual feature tables |
| Fabric CLI (full coverage) | Preview (limited) | 2026 | Use REST API for full operations |
| Native vector search | Not planned | Unknown | Use Azure AI Search |
| Native model serving | Not planned | Unknown | Use Azure ML managed endpoints |
| Photon-equivalent engine | Not planned | Unknown | Accept the perf gap or keep Photon workloads on Databricks |

Recommendation: Do not delay migration waiting for gap closure. Use the hybrid pattern for Gap items today, and re-evaluate quarterly as Fabric matures.



Maintainers: csa-inabox core team
Source finding: CSA-0083 (HIGH, XL) -- approved via AQ-0010 ballot B6
Last updated: 2026-04-30