
Best Practices — Databricks to Fabric Migration

Status: Authored 2026-04-30
Audience: Migration leads, platform engineers, and architects executing a Databricks-to-Fabric migration or building a hybrid Databricks + Fabric architecture.
Scope: Hybrid strategy patterns, workspace mapping, capacity planning, notebook conversion checklist, common pitfalls, and operational runbook.


1. Hybrid Databricks + Fabric strategy

1.1 Why hybrid is the default recommendation

Most enterprises should not attempt a full Databricks-to-Fabric migration. Instead, adopt a hybrid architecture where each platform handles what it does best:

| Platform | Owns | Reason |
|---|---|---|
| Databricks | Heavy ML training, Photon-dependent ETL, multi-cloud reads | GPU clusters, MLflow, Photon performance |
| Fabric | BI semantic models, Power BI, ad-hoc SQL, real-time analytics | Direct Lake, Eventhouse, single capacity billing |
| Shared | Delta tables in ADLS Gen2 | OneLake shortcuts enable both platforms to read the same data |
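
The "Shared" row above is the heart of the hybrid pattern: Databricks writes a Delta table to ADLS Gen2, and a Fabric notebook reads the same files through a OneLake shortcut. A minimal sketch, assuming an illustrative storage account (contosolake) and a shortcut named sales_gold created under the Lakehouse Tables section:

```python
# Databricks side: write a Gold Delta table to shared ADLS Gen2.
# (illustrative storage account / container; substitute your own)
gold_path = "abfss://gold@contosolake.dfs.core.windows.net/sales_gold"

df_sales = spark.createDataFrame(
    [("EMEA", 120.0), ("AMER", 340.5)], ["region", "amount"]
)
df_sales.write.format("delta").mode("overwrite").save(gold_path)

# Fabric side (separate notebook): read the same files through a OneLake
# shortcut named "sales_gold" created under the Lakehouse Tables section
# and pointing at the ADLS path above -- no data is copied.
df = spark.read.format("delta").load("Tables/sales_gold")
df.groupBy("region").sum("amount").show()
```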

1.2 Hybrid architecture reference

flowchart TB
    subgraph Databricks
        ML[ML Training<br/>Photon + GPU]
        ETL[Heavy ETL<br/>Jobs Compute]
        MLF[MLflow + Feature Store]
    end

    subgraph SharedStorage[Shared ADLS Gen2]
        Delta[(Delta Tables<br/>Bronze / Silver / Gold)]
    end

    subgraph Fabric
        LH[Lakehouse<br/>OneLake shortcuts]
        SQL[SQL Endpoint<br/>ad-hoc queries]
        PBI[Power BI<br/>Direct Lake]
        RTI[Real-Time Intelligence<br/>Eventhouse]
        DP[Data Pipelines<br/>orchestration]
    end

    ML --> Delta
    ETL --> Delta
    Delta --> LH
    LH --> SQL
    LH --> PBI
    LH --> RTI
    DP --> LH

1.3 Integration points

| Integration | Mechanism | Notes |
|---|---|---|
| Databricks writes, Fabric reads | OneLake shortcut to ADLS | Zero-copy; Fabric reads Delta files written by Databricks |
| Fabric writes, Databricks reads | Databricks external location on ADLS (same path) | Both engines write to shared ADLS; coordinate schema changes |
| Metadata sync | Document table schemas; no auto-sync | Unity Catalog and OneLake metadata are separate; keep a mapping doc |
| Lineage | Purview scans both (with connectors) | Purview has connectors for Databricks and Fabric |
| Authentication | Shared Entra ID (Azure AD) | Same service principals work for both |
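
The shortcut in the first row is typically created once per shared table path. A hedged sketch using the Fabric REST API shortcuts endpoint; the workspace, Lakehouse, and connection IDs are placeholders, and the request shape should be verified against the current API reference:

```python
import requests

# Placeholders: supply your own IDs and an Entra ID access token.
WORKSPACE_ID = "<workspace-guid>"
LAKEHOUSE_ID = "<lakehouse-guid>"
TOKEN = "<entra-id-access-token>"

# Create a OneLake shortcut under Tables/ that points at the ADLS Gen2
# folder Databricks writes to, so Fabric reads the same Delta files.
url = (
    f"https://api.fabric.microsoft.com/v1/workspaces/{WORKSPACE_ID}"
    f"/items/{LAKEHOUSE_ID}/shortcuts"
)
body = {
    "path": "Tables",
    "name": "sales_gold",
    "target": {
        "adlsGen2": {
            "location": "https://contosolake.dfs.core.windows.net",
            "subpath": "/gold/sales_gold",
            "connectionId": "<adls-connection-guid>",
        }
    },
}
resp = requests.post(url, json=body, headers={"Authorization": f"Bearer {TOKEN}"})
resp.raise_for_status()
print(resp.json())
```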

1.4 What to migrate first

Prioritize workloads by ROI:

| Priority | Workload | ROI driver |
|---|---|---|
| 1 (Week 1-2) | Power BI semantic models (Import -> Direct Lake) | Eliminate DBSQL cost + refresh compute |
| 2 (Week 3-4) | Ad-hoc SQL analytics | Eliminate DBSQL warehouse idle cost |
| 3 (Week 5-8) | dbt transformations (if SQL-first) | Simpler ops, lower CU cost |
| 4 (Week 8-12) | Streaming analytics (DLT -> RTI) | Lower latency, lower cost |
| 5 (Ongoing) | PySpark notebooks (case-by-case) | Evaluate per-notebook |
| Never | ML training (keep on Databricks) | Photon, GPU, MLflow maturity |

2. Workspace mapping

2.1 Databricks-to-Fabric workspace mapping patterns

Pattern A: 1-to-1 (simple)

One Databricks workspace maps to one Fabric workspace:

Databricks: production-workspace  -->  Fabric: Production-Analytics
Databricks: development-workspace -->  Fabric: Development-Analytics

Best for: Small teams, single-project organizations.

Pattern B: Catalog-to-workspace (recommended)

Unity Catalog catalogs map to Fabric workspaces:

UC catalog: production  -->  Fabric workspace: Production
UC catalog: development -->  Fabric workspace: Development
UC catalog: staging     -->  Fabric workspace: Staging

Best for: Organizations using UC for environment isolation.

Pattern C: Domain-to-workspace (enterprise)

Business domains map to separate workspaces:

UC catalog: finance     -->  Fabric workspace: Finance-Analytics
UC catalog: marketing   -->  Fabric workspace: Marketing-Analytics
UC catalog: operations  -->  Fabric workspace: Operations-Analytics

Best for: Large enterprises with domain-driven data ownership.

2.2 Workspace naming conventions

| Convention | Example | Notes |
|---|---|---|
| {Domain}-{Environment} | Finance-Production | Clear domain + environment separation |
| {Team}-{Purpose} | DataEng-ETL | Team-oriented |
| {Project}-{Tier} | CustomerAnalytics-Gold | Medallion-tier separation |

Recommendation: Use {Domain}-{Environment} for production workspaces and {Team}-{Purpose} for development.

2.3 Lakehouse vs Warehouse decision

| Use case | Choose Lakehouse | Choose Warehouse |
|---|---|---|
| Delta table storage | Yes | No |
| PySpark notebooks | Yes | No |
| Direct Lake semantic models | Yes | No (Warehouse uses DirectQuery) |
| T-SQL stored procedures | No | Yes |
| Column-level security | No | Yes |
| Row-level security | No (use PBI RLS) | Yes |
| Cross-database queries | Via shortcuts | Via cross-database SQL |
| dbt target | Either (different adapters) | Either |

Default recommendation: Use Lakehouse for most workloads. Use Warehouse only when you need T-SQL compatibility or fine-grained SQL security.


3. Capacity planning

3.1 Sizing methodology

  1. Measure current Databricks usage: export 3 months of DBU consumption from system.billing.usage (a query sketch follows this list)
  2. Identify peak and average: Calculate peak hour and 24-hour average
  3. Apply smoothing factor: Fabric smoothing averages CU over 24 hours; this usually means you need a smaller SKU than peak suggests
  4. Start one tier lower: Fabric capacity can be scaled up in minutes; start conservatively
  5. Monitor and adjust: Use the Fabric Capacity Metrics app for the first 2-4 weeks
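
A sketch of steps 1-2 as a Databricks notebook cell. It assumes read access to the system.billing.usage system table; the column names used (usage_start_time, usage_quantity) should be verified against your workspace's system-table schema:

```python
# Peak-hour vs. average hourly DBU consumption over the last 90 days.
# Run in a Databricks notebook with access to system tables.
hourly = spark.sql("""
    SELECT
        date_trunc('HOUR', usage_start_time) AS usage_hour,
        SUM(usage_quantity)                  AS dbus
    FROM system.billing.usage
    WHERE usage_start_time >= date_sub(current_date(), 90)
    GROUP BY 1
""")

hourly.selectExpr(
    "max(dbus) AS peak_hour_dbus",
    "avg(dbus) AS avg_hourly_dbus"  # rough proxy for the smoothed 24-hour average
).show()
```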

3.2 Sizing reference

| Current Databricks monthly spend | Starting Fabric SKU | Notes |
|---|---|---|
| < $1,000 | F2 - F4 | Validate whether always-on capacity makes sense |
| $1,000 - $5,000 | F8 - F16 | Good starting point for small teams |
| $5,000 - $20,000 | F16 - F64 | Most mid-size analytics teams |
| $20,000 - $50,000 | F64 - F128 | Includes Power BI Premium (F64+) |
| $50,000 - $100,000 | F128 - F256 | Large enterprise, many concurrent users |
| > $100,000 | F256 - F1024 | Enterprise scale; use reserved capacity |

3.3 Capacity management best practices

| Practice | Description |
|---|---|
| Separate dev and prod capacities | Dev capacity can be paused nights/weekends; prod stays on |
| Reserve base, burst with PAYG | Reserve capacity for steady-state; use PAYG for spikes |
| Monitor smoothing utilization | If the 24-hour average exceeds 80% of capacity, scale up |
| Schedule batch jobs to spread load | Distribute jobs across the day to flatten peaks |
| Pause dev capacity on weekends | Automate with Azure Automation or a Logic App |
| Use F-SKU autoscale (if available) | Some F-SKUs support autoscale; enable for spiky workloads |
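
To automate the pause/resume practice above, a dev capacity can be suspended through the Azure management API (Microsoft.Fabric capacities expose suspend and resume actions). A sketch, with the api-version, resource names, and token handling treated as assumptions to verify:

```python
import requests

# Placeholders: supply your subscription, resource group, capacity name,
# and an ARM access token (e.g. from `az account get-access-token`).
SUB = "<subscription-id>"
RG = "<resource-group>"
CAPACITY = "<dev-fabric-capacity-name>"
ARM_TOKEN = "<arm-access-token>"
API_VERSION = "2023-11-01"  # assumption -- check the current Microsoft.Fabric api-version

def set_capacity_state(action: str) -> None:
    """action is 'suspend' (pause) or 'resume'."""
    url = (
        f"https://management.azure.com/subscriptions/{SUB}/resourceGroups/{RG}"
        f"/providers/Microsoft.Fabric/capacities/{CAPACITY}/{action}"
        f"?api-version={API_VERSION}"
    )
    resp = requests.post(url, headers={"Authorization": f"Bearer {ARM_TOKEN}"})
    resp.raise_for_status()

set_capacity_state("suspend")   # Friday night (Azure Automation / Logic App schedule)
# set_capacity_state("resume")  # Monday morning
```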

4. Notebook conversion checklist

Use this checklist for every notebook being migrated:

Pre-migration assessment

  • Classify notebook: ETL (migrate), ML training (keep on DBR), ad-hoc (migrate), DLT (convert to dbt)
  • Count lines of code: <200 lines = simple, 200-500 = moderate, >500 = complex (consider rewriting)
  • Identify languages: PySpark (migrate), SQL (migrate or convert to dbt), Scala (rewrite in PySpark), R (migrate)
  • List dependencies: External libraries, internal %run references, dbutils calls
  • Identify data sources: ADLS mounts, Unity Catalog tables, DBFS paths
  • Check Photon dependency: Run without Photon on Databricks first; note performance difference

Migration execution

  • Create Fabric environment with required libraries
  • Replace dbutils with mssparkutils (see notebook-migration.md; a before/after sketch follows this list)
  • Update file paths from /mnt/ to OneLake paths
  • Update table references from catalog.schema.table to Lakehouse tables
  • Remove Databricks-specific configs (spark.databricks.*)
  • Convert Scala cells to PySpark
  • Replace %sql magic with SQL cell type
  • Test interactively in Fabric notebook
  • Validate output against Databricks (row counts, schema, sample data)
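
A before/after sketch of the dbutils, path, and table-reference replacements in the checklist above; the mount path, OneLake-relative paths, and table names are illustrative:

```python
# Before (Databricks): DBFS mount paths, dbutils, Unity Catalog three-part names
files = dbutils.fs.ls("/mnt/gold/sales")
dbutils.fs.cp("/mnt/gold/sales/latest.parquet", "/mnt/archive/")
df = spark.read.table("production.sales.orders")

# After (Fabric): OneLake-relative paths, mssparkutils, Lakehouse table names
files = mssparkutils.fs.ls("Files/gold/sales")
mssparkutils.fs.cp("Files/gold/sales/latest.parquet", "Files/archive/")
df = spark.read.table("orders")   # table in the attached Lakehouse
```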

Post-migration

  • Schedule via Data Pipeline (replace Databricks Workflow)
  • Set up monitoring (pipeline alerts, run history)
  • Update downstream consumers (Power BI, APIs, other notebooks)
  • Run parallel for 2 weeks (a reconciliation sketch follows this list)
  • Decommission Databricks notebook (archive to Git, disable job)
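
During the parallel run, a lightweight reconciliation between the two outputs catches silent drift. A sketch that assumes both outputs are readable from a Fabric notebook as Delta tables (one via a shortcut to the Databricks-written path); table names are illustrative:

```python
# Reconcile the Databricks-written and Fabric-written versions of the same
# logical table during the parallel-run window.
dbr_df = spark.read.format("delta").load("Tables/sales_gold_dbr")  # shortcut to Databricks output
fab_df = spark.read.format("delta").load("Tables/sales_gold")      # Fabric-written output

print("row_count_match:", dbr_df.count() == fab_df.count())
print("schema_match:   ", dbr_df.schema == fab_df.schema)

# Rows present in the Databricks output but missing from the Fabric output
# (requires matching schemas).
print("rows only in Databricks output:", dbr_df.exceptAll(fab_df).count())
```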

5. Common pitfalls and mitigations

5.1 Technical pitfalls

| Pitfall | Impact | Mitigation |
|---|---|---|
| Lift-and-shift notebook spaghetti | Moves complexity without improving it | Refactor: convert to dbt models, modular notebooks, or Data Pipelines |
| Assuming Fabric Spark = Photon | 2-3x slower for Photon-dependent queries | Benchmark first; keep Photon workloads on Databricks |
| Hardcoded paths (/mnt/, dbfs:/) | Notebooks fail on Fabric | Audit and replace all paths before migration |
| Missing Scala support | Scala cells fail; the notebook is broken | Rewrite Scala code in PySpark before migration |
| Init script dependencies | System-level packages unavailable | Use Fabric environments for library management |
| Databricks Connect workflows | No direct replacement | Use the Fabric REST API, VS Code for Fabric, or JDBC |
| Unity Catalog column-level security | Not available on Lakehouse | Route sensitive tables to Fabric Warehouse |
| DLT expectations lost | Quality checks disappear silently | Convert to dbt tests with store_failures: true |

5.2 Organizational pitfalls

| Pitfall | Impact | Mitigation |
|---|---|---|
| "All or nothing" migration | Delays value; risks failure | Migrate BI first (weeks), then incrementally |
| No parallel run period | Data discrepancies go undetected | Always run both platforms for 2+ weeks |
| Skipping capacity trial | Over- or under-provisioned | Run a 60-day Fabric trial before committing |
| Forgetting the Power BI team | Report migration bottleneck | Involve PBI developers from Phase 1 |
| Ignoring training | Teams struggle with the new platform | Budget 1-2 weeks of Fabric training per team |
| Not documenting the mapping | Knowledge loss, inconsistent migration | Maintain a living spreadsheet of UC -> Fabric mappings |

5.3 Cost pitfalls

| Pitfall | Impact | Mitigation |
|---|---|---|
| Sizing Fabric by peak DBR usage | Over-provisioned, wasted spend | Use the smoothed 24-hour average for sizing |
| Forgetting to decommission DBR | Paying for both platforms | Set decommission dates per workload; track in the migration plan |
| Not using reserved capacity | 20-40% more expensive | Commit to reserved for base capacity after the trial |
| Running Spark notebooks 24/7 | CU consumed continuously | Use Data Pipelines for scheduled runs, not long-running notebooks |
| Ignoring Power BI Premium savings | Missing a major cost reduction | Verify PBI Premium is included in F64+ before sizing |

6. Operational runbook

6.1 Day-to-day operations comparison

| Operation | Databricks | Fabric |
|---|---|---|
| Start compute | Start cluster (3-7 min) | Start Spark session (30-60s) |
| Scale compute | Resize cluster (add nodes) | Scale capacity SKU (minutes) |
| Monitor jobs | Databricks UI > Workflows | Fabric monitoring hub |
| Monitor costs | Account console > Usage | Azure Cost Management + Capacity Metrics app |
| Deploy changes | Databricks Asset Bundles / Repos | Fabric Git integration + deployment pipelines |
| Manage permissions | Unity Catalog GRANT/REVOKE | Workspace roles + Warehouse SQL |
| Debug failures | Cluster driver logs, Spark UI | Spark UI (in notebook), monitoring hub |
| Manage libraries | Cluster libraries, %pip | Fabric environments, %pip |

6.2 Monitoring setup

After migration, establish these monitoring practices:

  1. Fabric Capacity Metrics app -- install from AppSource; monitors CU consumption
  2. Azure Monitor alerts -- set alerts for capacity utilization > 80%
  3. Data Pipeline alerts -- configure failure notifications for each pipeline
  4. dbt test dashboard -- build Power BI report on store_failures audit tables
  5. OneLake storage monitoring -- track storage growth via Azure portal
  6. Power BI usage metrics -- monitor report views and refresh patterns

6.3 Rollback plan

If a migrated workload does not perform as expected:

  1. Immediate: Re-enable the Databricks job/cluster (should not be deleted during parallel run)
  2. Repoint consumers: Switch Power BI / downstream APIs back to DBSQL endpoint
  3. Investigate: Compare benchmarks, identify performance gap
  4. Decide: Optimize Fabric workload, increase capacity, or keep on Databricks
  5. Document: Update the migration plan with lessons learned

7. Migration timeline template

| Week | Activity | Deliverable |
|---|---|---|
| 1-2 | Assessment: inventory workloads, classify, map | Migration spreadsheet |
| 3 | Capacity trial: provision Fabric, run benchmarks | Sizing recommendation |
| 4 | Design: workspace mapping, security model | Architecture doc |
| 5-6 | Wave 1: OneLake shortcuts, first Direct Lake model | First PBI report on Fabric |
| 7-8 | Wave 2: migrate ad-hoc SQL, simple notebooks | Analysts using Fabric |
| 9-12 | Wave 3: dbt transformations, DLT conversion | Pipelines running on Fabric |
| 13-14 | Wave 4: streaming workloads (if applicable) | RTI / Eventhouse live |
| 15-16 | Validation: parallel run, reconciliation | Sign-off per workload |
| 17-18 | Cutover: decommission Databricks per workload | Cost reduction verified |
| 19-20 | Optimization: capacity right-sizing, reserved purchase | Optimized steady state |

8. Quick reference: key commands

Fabric notebook commands

# List files in OneLake
mssparkutils.fs.ls("Files/")

# Get secret from Azure Key Vault (pass the vault URI)
secret = mssparkutils.credentials.getSecret("https://keyvault-name.vault.azure.net/", "secret-name")

# Notebook parameters: define defaults in the cell toggled as a "parameters cell";
# values passed from a Data Pipeline or notebook.run override them at run time
param_name = "default_value"

# Run another notebook (name, timeout in seconds, parameter map)
result = mssparkutils.notebook.run("other_notebook", 300, {"key": "value"})

# Exit with value
mssparkutils.notebook.exit("SUCCESS")

Fabric REST API (common operations)

# List workspaces
curl -H "Authorization: Bearer $TOKEN" \
     "https://api.fabric.microsoft.com/v1/workspaces"

# List Lakehouse items
curl -H "Authorization: Bearer $TOKEN" \
     "https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}/items?type=Lakehouse"

# Trigger notebook run
curl -X POST -H "Authorization: Bearer $TOKEN" \
     -H "Content-Type: application/json" \
     "https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}/items/{notebook_id}/jobs/instances?jobType=RunNotebook"


Maintainers: csa-inabox core team
Source finding: CSA-0083 (HIGH, XL) -- approved via AQ-0010 ballot B6
Last updated: 2026-04-30