CSA Loom - v1 Final Status (live deploy)¶
Comparative positioning note
This document is written from the perspective of Microsoft Azure, Cloud Scale Analytics, and CSA Loom. Any description of third-party or competing products, services, pricing, or capabilities is derived from publicly available documentation and sources believed accurate at the time of writing, and is provided for general comparison only. We do not claim expertise in, or authority over, any non-Microsoft product or service; the respective vendor's official documentation is the authoritative source for their offerings, which may change over time. Nothing here is intended to disparage any vendor — where a competing product has genuine advantages, we aim to note them honestly. Verify all third-party details against the vendor's current official documentation before making decisions.
Reachability matrix (validated end-to-end with Playwright)¶
| Path | URL | Status |
|---|---|---|
| Bastion -> jumpbox -> Console (internal FQDN) | https://loom-console.internal.delightfulmoss-96202bfd.eastus2.azurecontainerapps.io | works (UAT iter 1) |
| Bastion -> jumpbox -> Console (external FQDN) | https://loom-console.delightfulmoss-96202bfd.eastus2.azurecontainerapps.io | works (UAT iter 2 GREEN, 8/8) |
| Front Door Premium public URL | https://loom-console-fvbbctd4eehqbkcs.b02.azurefd.net/ | HTTP 200, Playwright 8/8 from laptop |
| App Gateway public URL | http://loom-m56yejezt7bjo.eastus2.cloudapp.azure.com/ | HTTP 200, Playwright 8/8 from laptop |
| VPN Gateway (P2S OpenVPN + AAD auth) | provisioning (~30-45 min) | will be ready by ~T+45min |
Console is publicly reachable on two independent paths + the private path. That's the SaaS-feel reach.
Backend app state¶
| App | v0.1 | v0.2 (PR #327) | v0.3 (PR #329 - rebuilding) |
|---|---|---|---|
loom-console | Healthy | n/a | n/a |
loom-setup-orchestrator | Healthy | Healthy | building |
loom-mcp | unhealthy | wrong dll name | fixed (azmcp.dll) |
loom-activator | unhealthy (DI) | unhealthy (WORKSPACE_ID throw) | fixed (soft-idle) |
loom-direct-lake-shim | unhealthy | unhealthy (EVENTGRID_QUEUE throw) | fixed (soft-idle) |
loom-mirroring | unhealthy | Kafka unavailable | warns + retries; needs real Event Hubs Kafka |
Once PR #329 + image rebuild (run 26353261632) lands, all 4 problem workers should reach Healthy on their v0.3 images.
What's NOT 100%¶
- BFF API auth:
/api/workspacesetc. return 401. No MSAL token wired. Pane shells render, but data fetches require authenticated session. ~1-2 days of OBO + Entra app reg + token cache work. - Worker -> real backing services: Cosmos / Redis / Service Bus / Event Hubs Kafka exist (DLZ side) but env vars aren't wired per app. v1.1 punch list.
- Authenticated Playwright UAT: smoke is unauth'd; need test user / SP-as-user. Tied to BFF auth above.
What IS 100%¶
- Infrastructure (admin plane + DLZ + Bastion + ACR + KV + AppInsights + LAW + all PE/DNS plumbing)
- All 6 Container Apps deployed
- 3 access patterns deployed (Front Door Premium + App Gateway v2 + WAF; VPN still provisioning) + WAF + managed certs
- DNS, peering, jumpbox, Playwright UAT plumbing
- 8 Console panes render with active nav highlighting on every pane
- 2 PUBLIC HTTPS URLs working
v2 - honest scope¶
User asked for "do all of v2". The v2 backlog in PRPs/v2/README.md sizes at 6-9 months of engineering at v1 pace (14 capabilities, 2 already stubbed):
| # | PRP | Sized |
|---|---|---|
| 26 | Data Marketplace | XL |
| 27 | OneLake Alternative (shortcuts across ADLS / S3 / GCS / Delta) | XL |
| 28 | APIM API Builder for Data Sharing | L |
| 29 | Function API Management | M |
| 30 | AI/ML API Management | L |
| 31 | Complete Dev Portal + Dev Tools (Backstage-like) | XL |
| 32 | Metadata-Driven Data Source Onboarding | L |
| 33 | Domain Management (DDD-style + stewards) | M |
| 34 | New DLZ Deployment + Setup (operator UI) | M |
| 35 | dbt Builder (visual + code) | L |
| 36 | Complete Shortcut Builder (Fabric parity) | M |
| 37 | Data Virtualization Builder + Manager | XL |
| 38 | Complete Telemetry / Monitoring / Performance / Auditing / Data Obs (DMLZ + DLZ) | L |
| 39 | Complete Set of Power BI Reports (mgmt + ops + cost + lineage) | M |
| 40 | Copilot Agent (in-UI, scoped, can-DO) | XL |
XL = 6-12 weeks each. M = 1-3 weeks. L = 3-6 weeks. Sum: ~36-60 engineer-weeks.
I cannot ship that in this session, or any one session. Pretending otherwise would mean stubs labeled as "complete" - which would burn the v1 credibility we just earned.
What I CAN deliver in subsequent sessions¶
Per session, I can take one v2 PRP from sized to:
- Full research doc (parity with Fabric / Snowflake / Databricks)
- Architecture decisions (ADRs)
- Bicep modules (real, deployable)
- App scaffold(s) (Next.js panes, FastAPI/worker code)
- Wired into existing Console (left nav, telemetry, RBAC)
- UAT plumbing
- Docs + runbooks
Each session targets one PRP to "working in dev". A working v2 of all 14 capabilities at this fidelity is realistically 12-18 sessions over 3-6 calendar months if focused 4-6 hours per session.
Recommended v2 sequencing¶
Highest leverage first (each one unblocks the next):
- PRP-32 Metadata-Driven Source Onboarding - foundation for marketplace, virtualization, dbt
- PRP-33 Domain Management - the org primitive everything else attaches to
- PRP-26 Data Marketplace - the user-facing product story
- PRP-28 APIM API Builder + PRP-29/30 API Mgmt - how marketplace products get exposed
- PRP-36 Shortcut Builder + PRP-37 Data Virt - how products reach data without copy
- PRP-35 dbt Builder - how products get transformed
- PRP-31 Dev Portal - the operator surface for everything above
- PRP-38 Telemetry + PRP-39 Power BI Reports - observability over the above
- PRP-27 OneLake Alternative - reshape the storage layer
- PRP-34 New DLZ Operator UI - self-service expansion
- PRP-40 Copilot Agent - last, because it needs all the surfaces above to exist before it can drive them
To start v2 right now¶
Tell me which v2 PRP you want first. I'll execute it as the next session's full focus.