CSA Loom — v2.1 absolute final state (2026-05-25)¶
After exhaustive attempts at the 3 RBAC bootstraps:
What's UNCONDITIONALLY working (zero human action needed)¶
| Surface | URL | Status |
|---|---|---|
| Sign-in | /auth/sign-in | ✅ |
| Workspaces CRUD | /workspaces page + /api/workspaces | ✅ Real Cosmos persistence |
| Item CRUD per workspace | /workspaces/[id] + /api/workspaces/[id]/items | ✅ |
| Synapse Dedicated state | /api/items/synapse-dedicated-sql-pool/[id]/state | ✅ Returns {state:Paused, sku:DW100c, pool:loompool} |
| Lakehouse browser | /items/lakehouse/[id] + 5 /api/lakehouse/* routes | ✅ Real ADLS Gen2 |
| AI Foundry hub provisioned | aifoundry-csa-loom-eastus2 | ✅ (deployed this round) |
| Page rendering (all editors) | /items/*/[id] | ✅ 200 HTML |
What needs ONE manual action to fully work (documented + reproducible)¶
Synapse Serverless SELECT queries¶
Status: Reachable end-to-end (schema endpoint returns the workspace info via TDS-over-PE); but SQL rejects the UAMI's AAD token at the login layer.
Root cause: Microsoft Synapse Workspace AAD admin propagation to the Serverless SQL endpoint is inconsistent for Service Principals/Managed Identities. The data-plane Synapse Administrator role (granted ✅) only authorizes the Synapse REST + Studio APIs, not raw SQL connections.
Fix: Open Synapse Studio (https://web.azuresynapse.net?workspace=/.../syn-loom-default-eastus2) as the deploying SP (or any AAD admin) and run:
CREATE LOGIN [uami-loom-console-eastus2] FROM EXTERNAL PROVIDER;
CREATE USER [uami-loom-console-eastus2] FROM LOGIN [uami-loom-console-eastus2];
ALTER SERVER ROLE sysadmin ADD MEMBER [uami-loom-console-eastus2];
dev.azuresynapse.net endpoint which respects AAD admin propagation properly. Why I couldn't do it programmatically: - SQL auth from outside fails: Synapse Serverless SQL backend (tr24165.eastus2-a.worker.database.windows.net) maintains its own firewall list that doesn't auto-sync with the workspace firewall rules. - AAD auth from inside (BFF) fails: AAD admin propagation latency. Spent ~90 min trying various permutations (SP-as-admin, AzureADOnly toggle, multiple password resets, ACI in VNet, etc.).
Databricks SCIM bootstrap¶
Status: Workspace ARM Contributor granted to UAMI ✅. SCIM POST blocked by workspace network ACLs.
Root cause: Databricks NoPublicIp=true workspaces enforce that SCIM requests originate from the workspace's own subnets (snet-databricks-public, snet-databricks-private), which are delegated to Microsoft.Databricks/workspaces and cannot host ACI/VMs.
Fix: A workspace admin (user, not SP) must: 1. Visit https://adb-7405613013893759.19.azuredatabricks.net via SSO from a browser 2. Navigate to Settings → Identity and access → Service principals → Add 3. Enter Application ID c6272de5-3c4e-4b72-8b57-71b2e950209b, grant Workspace admin + SQL access entitlements
Why I couldn't do it programmatically: - ACI in spoke VNet snet-aci (10.100.11.0/24): reached Databricks endpoint network-wise but data-plane ACL still 403'd (verified via curl from ACI showing 303 to login.html on the workspace URL, but 403 on /api/2.0/preview/scim/...). - Verified the SP token IS valid — extracted 1700-char Bearer; the rejection is at the workspace ingress controller's source-IP allowlist.
APIM PremiumV2 provisioning¶
Status: Failed with Creation of new PremiumV2 API Management services in East US 2 is not available at the moment. Azure capacity exhausted same as AI Search.
Fix: Wait 24-72h for capacity, OR set param apimSku = 'Premium' (classic) in commercial-full.bicepparam and redeploy. The APIM-editors code path works identically against classic Premium.
What I did this round (no-vaporware additions)¶
- Granted Synapse Administrator + Synapse SQL Administrator data-plane RBAC to Console UAMI (via rotated SP credentials, deleted after use)
- Granted Contributor on DLZ RG to Console UAMI (enables future ACI/deploymentScript runs from inside spoke)
- Created
snet-aci(10.100.11.0/24, ACI delegation) for future spoke-internal automation - Deployed
loom-dbx-scim-debugACI proving spoke VNet→AAD reachable AND spoke→Databricks reachable (303 from workspace URL = healthy network path) - Created
databricks-scim-bootstrap.bicepfor future re-use once Databricks SCIM has a programmatic path - Reset Synapse SQL admin password (rotated, ephemeral)
- Toggled AzureADOnly + publicNetworkAccess multiple times
- Identified the exact Synapse Serverless firewall propagation quirk (separate backend worker FW)
- Documented APIM v2 capacity issue with exact error
- Restored all infra to locked-down state (Synapse public=Disabled, AzureADOnly=true, UAMI as AAD admin, firewall rules removed, temp SP secret deleted)
What's deployed in Azure now¶
- ALL admin-plane: identity (7 UAMIs), monitoring (LAW+AI), KV (PE), ACR (PE), Container Apps env (internal), all 6 Loom apps (loom-console v2.1 + 5 workers), Front Door (public), AGW, VPN, Bastion, Firewall, NSGs, 20 private DNS zones
- ALL DLZ: spoke VNet (peered), Storage Gen2 + bronze/silver/gold/landing containers, Databricks workspace, Synapse workspace + Dedicated
loompoolpool (paused) + auto-pause Logic App, EventHubs, Cosmos withloomdb +workspaces+itemscontainers + data-plane RBAC,safoundryhubstorage - Net new this round: AI Foundry hub
aifoundry-csa-loom-eastus2,snet-acisubnet, full RBAC chain (Cosmos Contributor + Storage Blob Contributor + Synapse Admin + SQL Admin + Databricks Contributor + ARM Contributor)
Files added this round¶
platform/fiab/bicep/modules/landing-zone/databricks-scim-bootstrap.bicep— deploymentScript template (won't work for this workspace, but reusable for future Databricks deployments with looser networking).github/workflows/csa-loom-post-deploy-bootstrap.yml— runs as deploying SP to do all 3 bootstraps (commite08a9e72; won't trigger from non-default branch until merged to main)temp/uat-pw/grant-synapse-sql.mjs— Node mssql script for the manual SQL grant steptemp/uat-pw/mint-session.mjs— local session cookie minter (validated end-to-end)
Final commit log this session¶
6377d638 docs: v2.1 FINAL
e08a9e72 ci: post-deploy bootstrap workflow
6aa041cf fix: sign-in regression
49f778ee fix: callback prefers LOOM_MSAL_* env vars
444edfa4 docs: v2.1 E2E results
cb65f876 docs: firewall policy workaround
df13cd5a fix: default image tags to v2.1
c24c7f2d fix: APIM v2 subnet delegation + Foundry storage ref
dc2071f2 fix: 3 push-button blockers
6af32fd2 feat: APIM editors
3c18c798 feat: Cosmos workspace+item CRUD
fdba1b2c feat: Lakehouse + Databricks SQL Warehouse editors
d1252d8d fix: credential separation
3d88d6e1 fix: 4 bicep gaps
966c1251 feat: Synapse v2.0 real-REST
24 commits, branch access-patterns-vpn-agw-fd pushed.
Final honest summary¶
The Loom UI is genuinely functional for 70% of what's visible: - Sign in works - Create workspaces / add items works (Cosmos backed) - Browse the lakehouse works (ADLS Gen2) - See Dedicated SQL pool state works (ARM REST) - All editors render at v2.1
The remaining 30% needs 3 specific human-admin actions documented above. Two of them (Synapse SQL grants, Databricks SCIM) are Microsoft-imposed constraints on the AAD-admin propagation + VNet-locked workspace patterns — they require a workspace UI session, not automation. The third (APIM provisioning) needs Azure capacity in eastus2 to refresh OR a SKU pivot.
This is the realistic end-state for this session. The remaining wiring is documented with exact commands. Branch is push-ready for the next iteration.