content-safety — parity gap (validator v2, 2026-05-26)¶
Loom URL: /items/content-safety/new Fabric reference: ai.azure.com — Content Safety Studio (Try it: text/image · Blocklists · Jailbreak detection · Custom categories · Prompt Shields · Protected material) Loom screenshot: temp/parity/content-safety-loom.png
Phase 4¶
| Route | Status | Notes |
|---|---|---|
GET /api/items/content-safety | 200 | Returns default thresholds: hate:4, selfHarm:4, sexual:4, violence:4 |
POST /api/items/content-safety text+image | wired but not exercised | — |
Page shows two cards: Text moderation (Textarea + Analyze button), Image moderation (file input + Analyze button). Pre-text caption: "Default category set + severity thresholds. Custom blocklists land in v2.6." which is at least an honest deferral note.
Phase 3 — Fabric vs Loom¶
| Fabric (Content Safety Studio) element | Loom present? | Severity |
|---|---|---|
| Per-category severity sliders (0–6 for Hate, Sexual, Self-harm, Violence) | NO — thresholds are server-defaults only, no UI to change | MAJOR |
| Blocklist editor (terms · regex · wildcard · case-sensitivity) | NO — explicit "v2.6" deferral but no MessageBar in UI surface to set expectations | MAJOR |
| Custom category wizard (train on labelled examples) | NO | MAJOR |
| Prompt Shields (jailbreak / indirect attack detection) | NO | MAJOR |
| Protected Material detection | NO | MAJOR |
| Text moderation try-it | YES — Textarea + Analyze | — |
| Image moderation try-it | YES — file input + Analyze | — |
| Result rendering: severity badges per category | NO — Loom dumps JSON in a <pre> | MINOR (BLOCKER per build-phase contract #4 — output rendering should be structured) |
| Batch / bulk testing | NO | MINOR |
| API key view + rate-limit info | NO | MINOR |
Functional¶
- Analyze Text / Analyze Image buttons fire real POSTs to BFF (BFF returns 200 with
policiesfor GET; POST not exercised live but route is registered) - Result rendering is
JSON.stringifyin a<pre>block — that fails the structured-output rule
Grade — D¶
Read endpoint is real. Try-it text+image work. But the editor doesn't surface ANY of the Content Safety Studio editing surfaces (sliders, blocklists, Prompt Shields, custom categories) — it's a try-it widget over a no-config policy. The "v2.6 deferral" is honest in the caption but the buttons that LOOK like they configure thresholds don't exist. Grade D.