Direct Lake parity¶
This is the hardest single workload to parity. Loom is honest about the gap.
What Fabric does¶
Direct Lake is Power BI semantic-model storage mode where the VertiPaq engine (the columnar in-memory engine behind Power BI Import and Analysis Services) reads Delta-Parquet files directly from OneLake, on demand, without an Import-mode copy and without DirectQuery's per-query federation.
Two flavors exist: - Direct Lake on SQL Endpoint (DL/SQL) — falls back to DirectQuery on guardrail breach - Direct Lake on OneLake (DL/OL) — newer; no fallback; errors on guardrail breach
The novel engineering is the transcoder: VertiPaq reads Parquet column chunks and converts them on-the-fly into VertiPaq's compressed columnstore segments. Framing (not refresh) advances the model to the latest Delta version by dropping changed segments + retaining dictionaries — takes seconds, no data movement.
Sub-second query latency on warm columns. Cold-column first-touch is fast (no refresh!). Per-F-SKU guardrails cap rows, row groups, Parquet files, model size.
V-Order is the proprietary Parquet write-time sort + encoding that gives VertiPaq dramatically better paging behavior — files are still 100% Parquet-spec-compliant.
Why this can't be exactly matched¶
Per temp/fiab-research/03-fabric-only-internals.md §1:
No OSS engine implements columnar-on-Parquet caching with a DAX / MDX surface. The closest open-source projects (DuckDB over Parquet, Apache DataFusion, Velox) lack the DAX engine. Power BI / AAS Tabular is the only engine that speaks DAX, and Microsoft's transcoder is the proprietary IP.
A full OSS replica would be a multi-year engineering project.
CSA Loom parity design — apps/fiab-direct-lake-shim¶
Per ADR fiab-0004: Power BI Premium Import semantic model + notification-driven warm-cache materializer.
Architecture¶
Customer lakehouse Loom Direct-Lake-Shim service
──────────────── ───────────────────────────────
ADLS Gen2 Delta tables
_delta_log/*.json Container App / AKS workload
│ C# .NET 10 + TOM client
│ Storage Event Grid
▼ ┌─────────────────────────────┐
"BlobCreated on /_delta_log/*" ──►│ EventGridTrigger handler │
│ 1. Parse Delta commit info │
│ 2. Identify changed parts │
│ 3. Decide refresh strategy │
│ 4. TOM partition refresh │
│ via XMLA endpoint │
└─────────────────────────────┘
│
▼ refresh via XMLA
Power BI Premium semantic model
(Premium F-SKU capacity)
Refresh policy per table¶
Stored in Cosmos DB; editable via Console "Semantic Model" pane (v1.1) or CLI in v1:
| Policy | Behavior | Use case |
|---|---|---|
partition | Refresh only the affected partition on each Delta commit. Requires partition-aware TMDL. | Time-series facts (e.g., partitioned by event_date) |
full | Full table refresh. Slow for large tables. | Small dimension tables |
directquery-fallback | Marks table as DirectQuery against Synapse Serverless or Databricks SQL. Always live, slower DAX. | Tables exceeding F-SKU memory limits |
composite | TMDL composite mixing the above. | Slowly-changing dims + fast facts |
Refresh latency¶
5-30 seconds median for partition refresh on small-to-medium tables. Minutes for full refresh of large tables. Not sub-second.
Console "Semantic Model" pane¶
- TMDL editor (Monaco with TMDL syntax)
- Visual model designer (drag-drop tables, relationships, hierarchies, measures)
- Per-table refresh-policy picker (partition / full / DirectQuery)
- Deploy button (creates / updates the Power BI Premium semantic model via Power BI REST API)
- DAX editor with Copilot NL→DAX side panel
- Test query against the deployed model
- Lineage from underlying lakehouse tables
Per-boundary behavior¶
| Boundary | F-SKU available | Direct-Lake-Shim |
|---|---|---|
| Commercial | ✅ | ✅ Premium Import + warm-cache |
| GCC | ❌ No F-SKU; P-SKU only | Honest structural gap — no Direct Lake parity |
| GCC-High / IL4 | ✅ | ✅ Premium Import + warm-cache |
| IL5 (v1.1) | ✅ | ✅ Premium Import + warm-cache |
Honest gaps¶
This page is the canonical place where Loom documents what it can't match:
- Sub-second freshness on new commits = not achievable. Expected range: 5-30 seconds for partition refresh; minutes for full table refresh of large tables.
- VertiPaq paging characteristics — Import models have warm/cold query latency same as Fabric Direct Lake on warm data; cold-data first-touch is faster than Fabric Direct Lake (because Import is already in memory) but stale until the next refresh.
- Direct Lake on OneLake (no-fallback) parity — not delivered. Documented as v1 OUT.
- GCC has no Direct Lake parity — F-SKU is unavailable in GCC. This is a structural gap, not timing-fixable.
- V-Order — Loom-written Delta tables don't have V-Order; only matters when Fabric reads them (mitigated by OneLake shortcut + Fabric re-compaction post-migration).
When to use what¶
| Scenario | Recommended approach |
|---|---|
| Aggregate analytics (MTD / QTD / YoY) over partition-aware fact tables | Partition refresh + Direct-Lake-Shim — 5-30 s freshness |
| Slowly-changing dimensions | Full refresh on schedule (nightly / hourly) |
| Tables exceeding F-SKU memory | DirectQuery fallback against Databricks SQL Warehouse / Synapse Serverless |
| Mixed (large facts + slow dims) | Composite TMDL — Import for dims + DirectQuery for facts |
| GCC tenant requiring < 1s freshness | Wait for Fabric Gov GA |
| Sub-second freshness on Gov-H/IL5 | Wait for Fabric Gov GA (no Loom parity available) |
Forward migration¶
When Fabric Gov GA arrives: 1. Re-author semantic models in Power BI Desktop with Direct Lake on OneLake storage mode 2. Point at OneLake shortcuts to Loom's existing lakehouse 3. Retire the Direct-Lake-Shim service — Power BI handles freshness natively
Related¶
- ADR: fiab-0004 Direct Lake parity
- Build PRP: PRP-08 —
apps/fiab-direct-lake-shim/ - Service docs: Direct-Lake Shim service
- Tutorial: Tutorial 03 — Direct Lake parity
- Runbook: Direct-Lake-Shim stuck
- Research:
temp/fiab-research/03-fabric-only-internals.md§1