Skip to content
CSA Loom — the Microsoft Fabric experience for Azure tenants where Fabric isn't yet available: lakehouses, warehouses, notebooks, semantic models, Activator rules, Data Agents, across Commercial, GCC, GCC-High, and DoD IL5

Runbook — Defender AI equivalent SOC (health check)

Context

In Gov boundaries where Defender for Cloud AI Threat Protection is unavailable, CSA Loom ships a manual SOC pipeline (Sentinel + Content Safety log wiring + self-hosted Presidio) per Defender AI workaround.

This runbook is the quarterly health check for that pipeline.

Check 1 — Sentinel DCR ingesting Loom Copilot telemetry

// In Sentinel workspace
LoomCopilotTelemetry
| where TimeGenerated > ago(24h)
| summarize events = count() by bin(TimeGenerated, 1h)
| render timechart

Expected: continuous event ingestion. If empty for > 1 hour → investigate DCR.

# Verify DCR active
az monitor data-collection rule show \
  --name csa-loom-copilot-dcr \
  --resource-group <admin-plane-rg>

# Check DCR association to AppInsights
az monitor data-collection rule association list \
  --resource-group <admin-plane-rg>

Check 2 — Analytics rules firing on test inputs

Inject synthetic test events that should trigger each analytics rule:

Test Expected trigger
Submit 20 PII-laden prompts within 15 min from one user "Excessive PII redactions per user"
Submit 20 off-topic prompts within 15 min "Off-topic refusals spike"
Submit prompt with > 50K output tokens "Unusually long outputs"
Submit same prompt 50 times in 5 min "High-rate same-prompt repetition"
Query 5+ sensitive workspaces within 10 min from one user "Cross-workspace exfiltration pattern"

For each test: 1. Inject via Console "Copilot" chat (or REST /api/loom-chat) 2. Wait 5-15 min for rule evaluation 3. Verify incident created in Sentinel 4. Verify incident severity matches expected

# List recent incidents
az sentinel incident list \
  --resource-group <admin-plane-rg> \
  --workspace-name <law-name> \
  --query "[?properties.createdTimeUtc > '2026-...']"

Check 3 — Presidio PII detection working

# Health check Presidio sidecar
curl https://<presidio-url>/health
# Expected: {"status":"healthy"}

# Test detection
curl -X POST https://<presidio-url>/analyze \
  -H "Content-Type: application/json" \
  -d '{"text":"My SSN is 123-45-6789 and email is test@example.com","language":"en"}'
# Expected: detections for US_SSN + EMAIL_ADDRESS

Check 4 — Workbook renders with realistic data

Open Loom Copilot SOC Dashboard workbook in Sentinel. Verify: - Charts populated for last 24h - Top users by activity - Top redacted PII types - Off-topic / refusal rate - Cross-workspace access patterns

Check 5 — Egress controls preventing AOAI cross-boundary calls

# Should fail: Loom Copilot trying to call commercial AOAI from Gov
# Use Container Apps / AKS log to verify outbound to *.openai.azure.com
# is blocked at NSG / Firewall layer

Check 6 — Audit retention meets boundary requirement

Boundary Required retention
GCC-High / IL4 1 year minimum (federal audit)
IL5 7 years (CNSSI 1253)
az monitor log-analytics workspace show \
  --resource-group <admin-plane-rg> \
  --workspace-name <law> \
  --query "retentionInDays"

Remediation if checks fail

Failed check Action
DCR not ingesting Re-bind DCR to Application Insights; verify managed identity
Analytics rule not firing Re-enable rule; verify KQL still matches; update if Loom log schema changed
Presidio down Restart container; verify NSG egress
Workbook broken Re-deploy via Bicep sentinel-ai-rules.bicep
Retention too short Update LAW retention setting; backfill from archive if available

Cadence

  • Quarterly — full check sequence above
  • Monthly — automated synthetic-event tests via GitHub Actions workflow
  • Continuous — Sentinel itself surfaces failures via incident fire-rate metrics