Audit Trail Immutability: Tamper-Evident Workflows for Compliance¶
WORM Storage, Hash-Chained Logs, and Verifiable Retention for Regulated Workloads
Last Updated: 2026-04-27 | Version: 1.0.0 | Wave 5 Feature: 5.9 | Anchor: SOC 2 Type II Readiness
Disclaimer: This document provides architectural and technical guidance for audit-trail immutability on Microsoft Fabric. It is not legal advice, regulatory interpretation, or a guarantee of compliance with any specific statute (SOX, HIPAA, GDPR, PCI-DSS, 21 CFR Part 11, NIGC MICS, FedRAMP). Audit-evidence requirements vary by jurisdiction, regulator, and contract. Engage qualified legal counsel and your audit firm before relying on these patterns to satisfy any specific obligation.
Table of Contents¶
- Overview: Why Audit Immutability
- The Three Audit Trail Properties
- Regulatory Drivers
- What to Audit: The Catalog
- Implementation Layers
- WORM Storage Patterns
- Tamper Evidence
- Schema Pattern
- Implementation in Fabric
- Retention Policy Implementation
- Access Patterns
- Verification Procedures
- Forensic Use
- Casino Implementation
- Federal Implementation
- Anti-Patterns
- Implementation Checklist
- References
Overview: Why Audit Immutability¶
An audit trail is only as valuable as its trustworthiness. A log that the operator can silently rewrite is no log at all; it's a story. Auditors, regulators, courts, and incident responders all share the same fundamental question: can we prove that what we're seeing is what actually happened?
Audit immutability is the discipline of producing logs that:
- Cannot be silently altered after the fact, even by the people who run the system,
- Can be proven to be untouched at any point in time, and
- Are retained for the period the regulator demands.
For Microsoft Fabric workloads, which sit at the intersection of identity, code execution, data access, and BI, audit immutability spans Workspace Identity activity, fabric-cicd deployments, OneLake data access, BI report exports, AI Functions inference logs, Data Agent tool calls, and the underlying Bicep/IaC provenance. Without an immutable trail, every other compliance investment (SOC 2, ISO 27001, HIPAA, PCI-DSS) is structurally weak.
Why It Matters for Fabric Workloads¶
| Driver | Detail |
|---|---|
| Regulatory examination | SOX, HIPAA, GDPR, PCI-DSS, 21 CFR Part 11, NIGC MICS all require immutable retention |
| Forensic readiness | When an incident hits, the log is the timeline; if tampered, the investigation collapses |
| Insider-threat deterrence | Knowing the trail is unalterable changes behavior |
| Insurance & contracts | Cyber insurers and enterprise customers ask for evidence of tamper-evident logging |
| Litigation & e-discovery | Spoliation claims attach to loggable events that vanish |
| AI governance | Data Agent + AI Function decisions need verifiable lineage |
Anchor reference: This document satisfies SOC 2 Common Criterion CC6.7 (Logging & Monitoring) and supports the evidence-collection requirements mapped in the SOC 2 Type II Readiness anchor doc. It complements Supply Chain Security (CC5.3) and GDPR Right to Deletion (records-of-processing).
The Three Audit Trail Properties¶
An audit trail must satisfy three independent properties. Missing any one collapses the value of the other two.
1. Completeness: All Events Captured¶
If the breach happens at 03:04 UTC and the log starts at 03:05 UTC, the log is worse than useless: it conveys false confidence. Completeness means:
- Every in-scope event is emitted, regardless of outcome (success and failure)
- No code path silently skips logging
- No retry / dead-letter / circuit-breaker can drop events
- Buffered logs survive process crashes (durable transport)
2. Immutability: Cannot Be Altered¶
Once an event is committed to the trail, no human, no script, no privileged role, and no compromised credential can change it. Immutability means:
- Append-only storage at the platform layer (not just the application layer)
- Time-based retention lock (WORM) at the storage tier
- No "edit" or "delete" code path exists for the audit dataset
- Even tenant administrators cannot bypass without a documented break-glass that itself logs
3. Verifiability: Can Prove Non-Tampering¶
Even an immutable store must be provably immutable. Verifiability means:
- Cryptographic chaining (hash of N depends on hash of N-1)
- Periodic external attestation (e.g., Azure Confidential Ledger, Merkle root publication)
- Reproducible integrity-check procedure run by a separate trust domain
- Auditor can independently confirm "no row in this set has changed since acquisition"
Tip: The three properties are independent. WORM storage gives you immutability but not completeness. Hash-chaining gives you verifiability but not retention. You need all three, layered.
Regulatory Drivers¶
Different regulators require different retention windows and tamper protections. Design once for the strictest applicable regime.
| Regulation | Scope | Retention | Tamper Requirement |
|---|---|---|---|
| SOX Section 802 | Public-company financial controls | 7 years | Records cannot be altered/destroyed |
| HIPAA Security Rule (45 CFR 164.312(b)) | PHI access & system activity | 6 years from creation/last-effective-date | Audit controls must record and examine activity |
| GDPR Article 30 | Records of processing activities | "As long as needed" + supervisory access | Demonstrably accurate |
| PCI-DSS Req 10 | Cardholder-data environment | 1 year minimum, 3 months immediately accessible | "Promptly back up audit trail files to a centralized log server or media difficult to alter" |
| 21 CFR Part 11 (FDA) | Electronic records & signatures | Per-record-class retention | Computer-generated, time-stamped audit trails; protected from alteration |
| NIGC MICS (Casino) | Gaming transactions, BSA | 5 years for transactions | Tamper-evident, examiner-accessible |
| FedRAMP AU-9 / AU-11 | Federal cloud workloads | 3 years minimum, longer per system | Protect audit information from unauthorized modification |
| NIST 800-53 AU-9 | All federal systems | Per category | "Protects audit information and audit tools from unauthorized access, modification, and deletion" |
| CJIS Security Policy | Criminal-justice information | 1 year online, 5 years offline | Cryptographic integrity required |
Warning: Compounding rule. When multiple regimes apply to the same dataset, retention is the maximum of all applicable periods, and tamper requirements are the union. A casino slot's W-2G record is simultaneously NIGC MICS (5 yrs) + IRS records (3+ yrs) + state gaming regulator (varies) + SOX (if publicly traded parent), so store it for 7 years, immutable, hash-chained.
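As a sketch, the compounding rule can be expressed as a small resolver. The `Regime` dataclass and the control-name strings here are hypothetical, not a prescribed data model:

```python
# Hypothetical sketch: resolve the effective policy when several regimes
# apply to one dataset. Retention is the max; tamper controls are the union.
from dataclasses import dataclass


@dataclass(frozen=True)
class Regime:
    name: str
    retention_days: int
    tamper_controls: frozenset  # e.g. {"worm", "hash-chain"}


def effective_policy(regimes):
    """Return (retention_days, tamper_controls) for the strictest combination."""
    retention = max(r.retention_days for r in regimes)
    controls = frozenset().union(*(r.tamper_controls for r in regimes))
    return retention, controls


# A casino W-2G record touched by several regimes at once:
regimes = [
    Regime("NIGC MICS", 5 * 365, frozenset({"tamper-evident"})),
    Regime("IRS records", 3 * 365, frozenset()),
    Regime("SOX 802", 2555, frozenset({"worm"})),
]
days, controls = effective_policy(regimes)
# days == 2555 (the 7-year SOX window dominates); controls is the union
```

Design once for the resolved result, not per regime, so one dataset never carries conflicting lifecycle rules.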
What to Audit: The Catalog¶
A common failure mode is "we log everything," which means no one can find anything when it matters, retention costs explode, and PII leaks into the log. Audit by category, with an explicit retention class per category.
The Audit Catalog¶
| Category | Examples | Retention Class | Why |
|---|---|---|---|
| Authentication | Sign-in success, sign-in failure, MFA challenge, token issuance | auth (18 mo SOC 2, 7 yr SOX) | Account compromise timeline |
| Privileged access activations | Entra PIM activations, break-glass account use, JIT elevations | priv-access (7 yr) | Highest-trust events; longest review |
| Configuration changes | Bicep deploys, Workspace settings, capacity scaling, tenant admin settings | config (7 yr SOX) | Reproducibility + change-control |
| Code deploys | fabric-cicd runs, GitHub Actions, deployment-pipeline promotions | deploy (7 yr) | Provenance + rollback |
| Data access | OneLake read, Lakehouse query, Direct Lake fetch (when Confidentiality TSC selected) | data-access (12 mo SOC 2, 6 yr HIPAA) | Insider misuse detection |
| Data exports / downloads | Power BI export, notebook download, REST API extract, semantic-link to_pandas() | data-export (7 yr) | DLP + customer DSAR response |
| Schema changes | DDL on Warehouse / SQL DB / Lakehouse, Delta schema evolution, OneLake catalog edits | schema (7 yr) | Trace data-shape drift |
| RBAC changes | Workspace role grants/revokes, OneLake security policy edits, Entra group changes | rbac (7 yr) | Privilege escalation timeline |
| DSAR / deletion events | GDPR Article 17 requests, CCPA delete, evidence of completion | dsar (3 yr post-fulfillment) | Proof of compliance (see GDPR doc) |
| Compliance reports | CTR, SAR, W-2G generation; HIPAA breach reports; FedRAMP POAMs | compliance-report (7 yr) | Regulator may request original |
| Security alerts & incident actions | Sentinel alerts, IR runbook executions, isolation actions, post-mortems | security (7 yr) | Timeline reconstruction |
| AI agent decisions | Data Agent tool calls, AI Function invocations, model output for regulated decision | ai-decision (7 yr) | Algorithmic accountability |
Each row has a retention class; each retention class has a single policy. This keeps retention administration tractable.
Implementation Layers¶
Treat audit immutability as defense-in-depth across five layers. A break in any one layer compromises the trail.
Layer 1 - Source: Emit at the System That Performed the Action¶
The system performing the action, not a downstream observer, must emit the event. Side-channel logs (e.g., reconstructing intent from network captures) are inferior because they can be silently dropped.
- Workspace Identity for service-principal actions emits to Entra audit
- Bicep deployments emit Activity Log entries
- fabric-cicd emits GitHub Actions run logs
- AI Functions and Data Agents emit invocation logs to Workspace Monitoring
- Notebooks emit `mssparkutils` operations to the Spark monitoring sink
Layer 2 - Transport: TLS, Signed, Ordered¶
Logs in flight must not be tamperable. Every transport hop:
- TLS 1.2+ enforced
- Sender authentication (managed identity, not shared secret)
- Sequence number or monotonic timestamp to detect drops/reorders
- Persistent buffer at sender to survive transport outage
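A minimal sketch of the drop/reorder detection that per-sender sequence numbers enable; the function name and return shape are illustrative:

```python
# Hypothetical receiver-side check: given sequence numbers in arrival order,
# report gaps (dropped events) and values that arrived after a higher number
# (reordered events). Run per sender, since each sender numbers independently.
def check_sequence(seqs):
    """Return (missing, reordered) for one sender's arrival stream."""
    # A value <= its predecessor arrived out of order.
    reordered = [s for prev, s in zip(seqs, seqs[1:]) if s <= prev]
    # Anything inside the observed range that never arrived was dropped.
    expected = set(range(min(seqs), max(seqs) + 1))
    missing = sorted(expected - set(seqs))
    return missing, reordered


missing, reordered = check_sequence([1, 2, 4, 5, 3, 7])
# missing -> [6]; 3 arrived after 5, so it is flagged as reordered
```

Either finding should raise an alert: a gap means an event was lost in flight, which violates the completeness property above.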
Layer 3 - Storage: Immutable / WORM¶
The persistent store must structurally prevent modification:
- Append-only Delta with VACUUM disabled
- Azure Storage immutability policy (time-based)
- Log Analytics archive tier (read-only after archival)
- Replicated copy to a separate trust domain (different subscription / tenant)
Layer 4 - Retention: Per-Category Retention Policy¶
Retention is policy-driven, not ad-hoc. Each retention class has:
- A documented duration (in calendar days)
- A documented authority (which regulation drives it)
- A documented owner (who approves changes)
- An enforcement mechanism (lifecycle automation, immutability lock)
Layer 5 - Access: Audit-Only Role; Auditor Short-Lived Access¶
Reading the audit log is itself a privileged event:
- Read-only role granted to a small Entra group (`grp-audit-readers`)
- Auditors get time-bounded access via PIM (e.g., 30 days, examination window)
- Every read of the audit dataset is itself logged (meta-audit)
- Production engineers do not have standing read access
WORM Storage Patterns¶
Multiple Azure/Fabric services support Write-Once-Read-Many semantics. Choose by data volume, query latency, and retention duration.
Pattern A - Azure Storage Immutability Policy (Time-Based Retention Lock)¶
Best for long-term, low-query-rate archive of the audit log (the canonical copy).
// infra/modules/security/audit-storage-worm.bicep
// =============================================================================
// Immutable audit-log storage with time-based retention lock
// =============================================================================
@description('Storage account name (must be globally unique)')
param storageAccountName string
@description('Azure region')
param location string
@description('Retention period in days. Casino/SOX = 2555 (7 yr). HIPAA = 2190 (6 yr).')
@minValue(365)
@maxValue(36500)
param immutabilityRetentionDays int = 2555
@description('Tags to apply')
param tags object = {}
resource sa 'Microsoft.Storage/storageAccounts@2023-05-01' = {
name: storageAccountName
location: location
tags: tags
sku: { name: 'Standard_GZRS' } // geo-zone-redundant
kind: 'StorageV2'
properties: {
accessTier: 'Cool'
allowBlobPublicAccess: false
minimumTlsVersion: 'TLS1_2'
supportsHttpsTrafficOnly: true
publicNetworkAccess: 'Disabled'
allowSharedKeyAccess: false // managed identity only
encryption: {
services: {
blob: { enabled: true, keyType: 'Account' }
}
keySource: 'Microsoft.KeyVault' // CMK
}
}
}
resource auditContainer 'Microsoft.Storage/storageAccounts/blobServices/containers@2023-05-01' = {
name: '${sa.name}/default/audit-immutable'
properties: {
publicAccess: 'None'
immutableStorageWithVersioning: {
enabled: true
}
}
}
// Time-based retention policy โ locked = cannot be reduced after the lock window
resource immutabilityPolicy 'Microsoft.Storage/storageAccounts/blobServices/containers/immutabilityPolicies@2023-05-01' = {
name: '${sa.name}/default/audit-immutable/default'
properties: {
immutabilityPeriodSinceCreationInDays: immutabilityRetentionDays
allowProtectedAppendWrites: true // append-only blobs supported
}
}
Warning: Once the immutability policy is locked (a separate API call, made after testing), neither the subscription owner nor Microsoft Support can shorten or remove it. Test in a sandbox subscription first.
Pattern B - Log Analytics with Retention Extension¶
Best for active-query audit logs (the operational copy used during incident response).
- Default retention: 30 days. Configure at least 365-730 days for SOC 2.
- Archive tier: extends to 7 years at lower cost; queryable via search jobs.
- Customer-managed key supported.
- See `infra/modules/monitoring/log-analytics.bicep` (`retentionInDays` parameter, max 4383 days).
Pattern C - Azure Data Lake with Sensitivity Label + Retention¶
Best when audit logs are also subject to Purview governance (e.g., contain regulated identifiers).
- Apply Purview sensitivity label `audit-immutable`
- Container-level immutability policy
- Combined with Defender for Storage for unauthorized-access alerting
Pattern D - OneLake with Retention Markers¶
Best when the audit log is the analytical product (e.g., feeding a Power BI compliance dashboard).
- Lakehouse table written with append-only semantics
- VACUUM disabled (retain all Delta history)
- OneLake Security policy restricts write to Workspace Identity only
- Replicate to Pattern A storage for the canonical retained copy
Pattern E - Azure Confidential Ledger (External Attestation)¶
Best as a secondary attestation rather than primary store. Confidential Ledger publishes a Merkle root of submitted entries, providing third-party verifiable receipts at low cost.
| Pattern | Strength | Weakness | Use For |
|---|---|---|---|
| A - Storage WORM | Strongest legal posture | Slow query | Canonical retained copy |
| B - Log Analytics | Fastest query | Retention cost at scale | Operational + 90-day forensic |
| C - ADLS + Purview | Governance integration | Operational complexity | When labels are mandatory |
| D - OneLake | BI-native | Application-layer immutability only | Compliance dashboards |
| E - Confidential Ledger | Cryptographic attestation | Not a primary store | Independent attestation receipt |
Tip: Recommended layering: D (OneLake operational) + B (Log Analytics 12-24 mo) + A (Storage WORM 7+ yr) + E (Confidential Ledger receipts). Each layer compensates for the others' weaknesses.
Tamper Evidence¶
Even immutable storage benefits from cryptographic tamper evidence: it lets an external party verify integrity without trusting the storage operator.
Hash Chain Pattern¶
Each row contains the hash of the previous row. Altering any row invalidates the chain from that row forward.
To verify, recompute every hash_self from hash_prev + payload. Any divergence pinpoints the tampered row.
Merkle Tree for Batch Attestation¶
Per-row chaining is O(N) to verify. For larger volumes, batch rows into Merkle trees (e.g., 1,024 rows per leaf set) and publish only the root to an external attestor. Verification is O(log N) per row.
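A minimal Merkle-root sketch over a batch of leaf payloads, assuming the common convention of duplicating the last node at odd-sized levels (batch size and hash layout are illustrative):

```python
# Hypothetical Merkle-root computation for a batch of audit rows.
# Publishing only the 32-byte root anchors the whole batch externally.
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Compute the Merkle root of a list of leaf payloads (bytes)."""
    if not leaves:
        return _h(b"")
    level = [_h(x) for x in leaves]          # hash each leaf
    while len(level) > 1:
        if len(level) % 2:                   # odd level: duplicate last node
            level.append(level[-1])
        # hash adjacent pairs to form the next level up
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


root = merkle_root([b"row1", b"row2", b"row3"])
# Changing any single row changes the root, so the external attestor can
# detect tampering without ever seeing the rows themselves.
```

To prove one row's membership later, you ship only the sibling hashes along its path to the root, which is the O(log N) verification mentioned above.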
External Attestation Service¶
Publish the Merkle root (or every Nth hash_self) to a service outside the trust boundary:
- Azure Confidential Ledger: append-only, hardware-attested
- Public timestamp service: an RFC 3161 trusted timestamp authority
- Internal cross-tenant: replicate to a separate Azure tenant managed by the audit/legal team
Detect Tampering via Hash Recomputation¶
Run a scheduled integrity job (PySpark notebook below) that recomputes the chain end-to-end and alerts on any mismatch. A mismatch is itself an alertable security event with severity=critical.
Schema Pattern¶
Use a single audit schema across all categories. Variation lives in the context JSON column. Stable column-set is essential for a 7-year retention horizon.
CREATE TABLE audit_immutable (
audit_id STRING NOT NULL, -- UUID v4, generated at source
event_ts TIMESTAMP NOT NULL, -- UTC, set at source, never recomputed
received_ts TIMESTAMP NOT NULL, -- UTC, set at storage, sanity check vs event_ts
subject STRING NOT NULL, -- who (UPN, SP object id, Workspace Identity)
subject_type STRING NOT NULL, -- user | service_principal | managed_identity | system
action STRING NOT NULL, -- canonical verb (READ, WRITE, DEPLOY, DELETE, GRANT)
resource STRING NOT NULL, -- canonical resource path
resource_type STRING NOT NULL, -- workspace | lakehouse | warehouse | semantic_model | bicep_module | ...
outcome STRING NOT NULL, -- success | failure | denied | partial
context STRING, -- JSON: client_ip, user_agent, session_id, request_id, ...
hash_prev STRING NOT NULL, -- hex SHA-256 of previous row's hash_self (or all-zero genesis)
hash_self STRING NOT NULL, -- hex SHA-256(canonical_payload || hash_prev)
signed_by STRING NOT NULL, -- workspace identity client id that wrote this row
retention_class STRING NOT NULL, -- auth | priv-access | config | deploy | data-access | data-export | schema | rbac | dsar | compliance-report | security | ai-decision
schema_version INT NOT NULL -- bump when schema changes; never break old rows
)
USING DELTA
PARTITIONED BY (retention_class) -- Spark SQL cannot partition by an expression like DATE(event_ts); prune by event_ts predicates instead
TBLPROPERTIES (
'delta.appendOnly' = 'true',
'delta.deletedFileRetentionDuration' = 'interval 7300 days',
'delta.logRetentionDuration' = 'interval 7300 days'
);
Field Notes¶
- `audit_id`: UUID v4, unique per event; index for forensic lookup.
- `event_ts` vs `received_ts`: divergence > N minutes is itself anomalous (clock drift or replay attack).
- `subject` + `subject_type`: distinguishes a user from a system actor.
- `action`: drawn from a small canonical vocabulary; arbitrary verbs forbidden.
- `context`: JSON for free-form details; keep small.
- `hash_prev` / `hash_self`: the cryptographic chain.
- `signed_by`: the writer's identity; lets you prove which writer identity emitted which row.
- `retention_class`: drives lifecycle policy.
- `schema_version`: supports forward-compatible evolution; never reuse a value.
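The `event_ts` vs `received_ts` anomaly check can be sketched as follows; `MAX_SKEW` is an assumed tolerance to tune per environment, not a prescribed value:

```python
# Hypothetical skew check: flag rows whose source and storage timestamps
# diverge beyond tolerance (clock drift, delayed replay, back-dated injection).
from datetime import datetime, timedelta, timezone

MAX_SKEW = timedelta(minutes=5)  # assumed tolerance; tune per environment


def is_skew_anomalous(event_ts, received_ts, max_skew=MAX_SKEW):
    """True when the row's timestamps diverge enough to warrant review."""
    return abs(received_ts - event_ts) > max_skew


t0 = datetime(2026, 4, 27, 3, 4, tzinfo=timezone.utc)
is_skew_anomalous(t0, t0 + timedelta(seconds=30))  # False: normal latency
is_skew_anomalous(t0, t0 + timedelta(hours=2))     # True: investigate
```

Flagged rows are not evidence of tampering by themselves, but they are exactly where a replay or back-dating attempt would surface.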
Implementation in Fabric¶
Append-Only Delta Table with VACUUM Disabled¶
# notebooks/audit/00_create_audit_immutable.py
# Run once at workspace bootstrap.
from delta.tables import DeltaTable
from pyspark.sql.types import (
StructType, StructField, StringType, TimestampType, IntegerType,
)
schema = StructType([
StructField("audit_id", StringType(), False),
StructField("event_ts", TimestampType(), False),
StructField("received_ts", TimestampType(), False),
StructField("subject", StringType(), False),
StructField("subject_type", StringType(), False),
StructField("action", StringType(), False),
StructField("resource", StringType(), False),
StructField("resource_type", StringType(), False),
StructField("outcome", StringType(), False),
StructField("context", StringType(), True),
StructField("hash_prev", StringType(), False),
StructField("hash_self", StringType(), False),
StructField("signed_by", StringType(), False),
StructField("retention_class", StringType(), False),
StructField("schema_version", IntegerType(), False),
])
(DeltaTable.createIfNotExists(spark)
.tableName("lh_audit.audit_immutable")
.addColumns(schema)
.partitionedBy("retention_class")
.property("delta.appendOnly", "true")
.property("delta.deletedFileRetentionDuration", "interval 7300 days")
.property("delta.logRetentionDuration", "interval 7300 days")
.property("delta.minReaderVersion", "2")
.property("delta.minWriterVersion", "5")
.execute())
print("audit_immutable table created with appendOnly=true and 20-year file retention.")
Workspace Identity for Audit Writes (No Human-Account Writes)¶
- Only the audit-emitter Workspace Identity has `WRITE` on `lh_audit.audit_immutable`.
- All other identities (users, other workspace identities) have `READ` only, via `grp-audit-readers`.
- OneLake Security policy enforces this at the Lakehouse level.
- Configuration changes to this policy are themselves audit events (meta-audit).
Hash-Chain Append (PySpark)¶
# notebooks/audit/01_append_with_chain.py
"""
Tamper-evident audit-log writer.
Computes hash_prev / hash_self per row before append.
"""
import hashlib
import json
import uuid
from datetime import datetime, timezone
from pyspark.sql import Row
from delta.tables import DeltaTable
GENESIS = "0" * 64 # used when the table is empty
def _canonical_payload(row: dict) -> str:
"""
Stable JSON encoding for hashing. Sorted keys, no whitespace,
UTF-8. Excludes hash_self (chicken-and-egg) and received_ts
(set at storage, not source).
"""
fields = {k: v for k, v in row.items() if k not in ("hash_self", "received_ts")}
return json.dumps(fields, sort_keys=True, separators=(",", ":"), default=str)
def _sha256_hex(s: str) -> str:
return hashlib.sha256(s.encode("utf-8")).hexdigest()
def _last_hash(table_name: str) -> str:
"""Return the hash_self of the most recent row, or GENESIS if empty."""
df = (spark.table(table_name)
.orderBy("event_ts", ascending=False)
.limit(1)
.select("hash_self"))
rows = df.collect()
return rows[0]["hash_self"] if rows else GENESIS
def append_audit_event(
*,
subject: str,
subject_type: str,
action: str,
resource: str,
resource_type: str,
outcome: str,
retention_class: str,
context: dict | None = None,
signed_by: str,
table_name: str = "lh_audit.audit_immutable",
) -> str:
"""Emit a single audit row. Returns the audit_id."""
now = datetime.now(timezone.utc)
audit_id = str(uuid.uuid4())
hash_prev = _last_hash(table_name)
row = {
"audit_id": audit_id,
"event_ts": now,
"received_ts": now,
"subject": subject,
"subject_type": subject_type,
"action": action,
"resource": resource,
"resource_type": resource_type,
"outcome": outcome,
"context": json.dumps(context or {}, sort_keys=True),
"hash_prev": hash_prev,
"signed_by": signed_by,
"retention_class": retention_class,
"schema_version": 1,
}
row["hash_self"] = _sha256_hex(_canonical_payload(row) + hash_prev)
df = spark.createDataFrame([Row(**row)])
df.write.format("delta").mode("append").saveAsTable(table_name)
return audit_id
# Example: log a fabric-cicd deployment
append_audit_event(
subject="sp-fabric-cicd-prod",
subject_type="service_principal",
action="DEPLOY",
resource="workspaces/ws-fabric-prod/items/lh-gold",
resource_type="lakehouse",
outcome="success",
retention_class="deploy",
context={
"git_sha": "6916337",
"actor": "github-actions",
"run_id": "12345678",
},
signed_by="11111111-2222-3333-4444-555555555555",
)
Hash-Chain Verification (PySpark)¶
# notebooks/audit/02_verify_chain.py
"""
Recomputes the hash chain end-to-end. Alerts on any mismatch.
Designed to run nightly as a scheduled job.
"""
import hashlib
import json
GENESIS = "0" * 64
def _canonical(row, exclude=("hash_self", "received_ts")):
d = row.asDict()
fields = {k: v for k, v in d.items() if k not in exclude}
return json.dumps(fields, sort_keys=True, separators=(",", ":"), default=str)
def verify_chain(table_name: str = "lh_audit.audit_immutable") -> dict:
rows = (spark.table(table_name)
.orderBy("event_ts", "audit_id")
.collect())
expected_prev = GENESIS
bad = []
for r in rows:
if r["hash_prev"] != expected_prev:
bad.append({"audit_id": r["audit_id"], "reason": "hash_prev mismatch"})
recomputed = hashlib.sha256(
(_canonical(r) + r["hash_prev"]).encode("utf-8")
).hexdigest()
if recomputed != r["hash_self"]:
bad.append({"audit_id": r["audit_id"], "reason": "hash_self mismatch"})
expected_prev = r["hash_self"]
return {
"rows_checked": len(rows),
"tamper_count": len(bad),
"first_failures": bad[:10],
"status": "OK" if not bad else "TAMPER_DETECTED",
}
result = verify_chain()
print(result)
if result["status"] != "OK":
# Emit a critical security alert via Action Group / Sentinel webhook
raise RuntimeError(f"AUDIT TAMPER DETECTED: {result}")
Replication to Azure Storage with WORM Policy¶
A scheduled pipeline (Copy Job) replicates the OneLake audit table to the WORM storage account nightly. The WORM copy is the canonical retained record for the 7-year horizon; the OneLake copy is the operational, queryable copy for the most recent 12-24 months.
Log Analytics Archive Tier for Long-Term¶
Diagnostic settings emit the same events to Log Analytics. After 12 months, the archive tier extends retention to 7 years at roughly 10x lower cost.
Retention Policy Implementation¶
Per-Category Retention Configuration¶
# config/audit-retention.yaml
retention_classes:
  auth: { days: 2555, authority: "SOX § 802", owner: "ciso@" }
  priv-access: { days: 2555, authority: "SOX § 802 + SOC 2", owner: "ciso@" }
  config: { days: 2555, authority: "SOX § 404", owner: "vp-eng@" }
  deploy: { days: 2555, authority: "SOX § 404", owner: "vp-eng@" }
  data-access: { days: 2190, authority: "HIPAA § 164.312", owner: "privacy@" }
  data-export: { days: 2555, authority: "SOC 2 + DLP policy", owner: "ciso@" }
  schema: { days: 2555, authority: "SOX § 404", owner: "data-platform@" }
  rbac: { days: 2555, authority: "SOC 2 CC6.1", owner: "iam@" }
  dsar: { days: 1095, authority: "GDPR Art. 17 proof", owner: "privacy@" }
  compliance-report: { days: 2555, authority: "BSA + IRS", owner: "compliance@" }
  security: { days: 2555, authority: "SOC 2 CC4.1", owner: "soc@" }
  ai-decision: { days: 2555, authority: "EU AI Act + SOC 2", owner: "ai-governance@" }
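A hypothetical validator for this config, run after loading the YAML into a dict; the 365-day floor is an illustrative assumption, not a regulatory constant:

```python
# Hypothetical config validator: every retention class must carry a duration,
# authority, and owner, and no duration may fall below an assumed floor.
REQUIRED_KEYS = {"days", "authority", "owner"}
MIN_DAYS = 365  # illustrative floor, not a regulatory constant


def validate_retention(classes: dict) -> list:
    """Return a list of human-readable violations (empty list = valid)."""
    errors = []
    for name, cfg in classes.items():
        missing = REQUIRED_KEYS - cfg.keys()
        if missing:
            errors.append(f"{name}: missing {sorted(missing)}")
        elif cfg["days"] < MIN_DAYS:
            errors.append(f"{name}: {cfg['days']} days is below the {MIN_DAYS}-day floor")
    return errors


classes = {
    "deploy": {"days": 2555, "authority": "SOX § 404", "owner": "vp-eng@"},
    "dsar": {"days": 90, "authority": "GDPR Art. 17 proof", "owner": "privacy@"},
}
validate_retention(classes)  # flags dsar: 90 days is below the floor
```

Running a check like this in CI keeps a typo in one retention class from silently shortening a regulated retention window.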
Lifecycle Automation: hot 30d → cool 1yr → archive 7yr¶
// Lifecycle policy on the audit storage account
resource lifecycle 'Microsoft.Storage/storageAccounts/managementPolicies@2023-05-01' = {
name: '${sa.name}/default'
properties: {
policy: {
rules: [
{
name: 'audit-tiering'
enabled: true
type: 'Lifecycle'
definition: {
filters: { blobTypes: ['blockBlob'], prefixMatch: ['audit-immutable/'] }
actions: {
baseBlob: {
tierToCool: { daysAfterModificationGreaterThan: 30 }
tierToArchive: { daysAfterModificationGreaterThan: 365 }
// No delete action โ handled separately via legal-hold-aware job
}
}
}
}
]
}
}
}
Auto-Delete Only After Retention Expiry¶
Deletion at end-of-retention is a separate, audited workflow:
- Identify objects past `retention_class.days`.
- Verify no legal hold is attached.
- Emit an audit event of type `RETENTION_EXPIRY_DELETE` to the audit log itself.
- Delete via the immutability policy's natural expiry (preferred) or operator action with documented authority.
Legal Hold Override¶
A legal hold tag (legal-hold=case-2026-XYZ) freezes deletion regardless of retention expiry. Tags are added by privileged role only and themselves audit-logged. Holds are released only after written authorization from legal counsel.
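The deletion-eligibility rule (past retention and free of holds) can be sketched as (function and parameter names are illustrative):

```python
# Hypothetical eligibility check used by the end-of-retention deletion job:
# an object may be deleted only when it is past its retention window AND
# carries no legal-hold tags.
from datetime import date, timedelta


def eligible_for_deletion(created: date, retention_days: int,
                          legal_holds: set, today: date) -> bool:
    """True only when retention has expired and no hold is attached."""
    past_retention = today >= created + timedelta(days=retention_days)
    return past_retention and not legal_holds


today = date(2026, 4, 27)
eligible_for_deletion(date(2018, 1, 1), 2555, set(), today)                         # True
eligible_for_deletion(date(2018, 1, 1), 2555, {"legal-hold=case-2026-XYZ"}, today)  # False
```

Note the asymmetry: expiry alone never deletes; a single hold tag vetoes deletion regardless of age.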
Access Patterns¶
Read-Only Auditor Access via Entra Group + RBAC Role¶
Entra group: grp-audit-readers
Members: internal compliance team (3-5)
Permanent access: READ on lh_audit.audit_immutable
READ on Log Analytics audit tables
READ on Storage WORM container (via SAS or Entra)
Excluded: production engineers, developers, BI consumers
Time-Bounded Access for Examinations¶
External auditors (CPA firm, regulator) get time-bounded access via Entra PIM:
- Activation window: examination period (e.g., 30 days)
- MFA + Conditional Access required
- Activation reason recorded
- Auto-revocation at window end
- Every read event captured in meta-audit
Every Audit-Log Read Is Itself Audited (Meta-Audit)¶
A read of `audit_immutable` emits an event with `action=READ_AUDIT_LOG`, `retention_class=security`. This recursion is bounded: the meta-audit table is itself an `audit_immutable` row, and reads of the meta-audit are themselves logged. Practically, you stop at depth 1, but you log that fact too.
Tip: If your auditor objects to the meta-audit ("who watches the watchmen?"), add a separate workspace and Workspace Identity for the meta-audit pipeline, owned by an org outside the operational team (e.g., Internal Audit).
Verification Procedures¶
Hash Chain Integrity Check¶
- Cadence: nightly scheduled job (the `verify_chain` notebook above)
- Alert: any non-zero `tamper_count` raises a `severity=critical` Sentinel alert
- Evidence: the verification result row is itself written to a `lh_audit.chain_verification_log` table
Sample-Based Attestation¶
For auditor-requested samples:
- Auditor selects N random rows by `audit_id`.
- Operator extracts the row + the chain back to genesis (or to the last attested checkpoint).
- Auditor independently runs the verification function on the extract.
- Operator provides Confidential Ledger receipts for the chain segment.
- Result is documented in the working papers.
Backup of Audit Logs to Separate Trust Domain¶
The single most valuable defense against insider tampering: a second, independently controlled copy.
- Replicate to a separate Azure tenant (e.g., owned by Internal Audit, not Engineering)
- Or to a separate cloud (e.g., Azure → AWS S3 Object Lock) where the operational team has no privileges
- Compare hash roots weekly; any divergence is investigated as a security incident
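One way to sketch the weekly comparison: each trust domain computes a single fingerprint over its `(audit_id, hash_self)` pairs and only the two digests cross the boundary (function name and pair layout are illustrative):

```python
# Hypothetical cross-trust-domain comparison: each side computes an
# order-independent fingerprint of its copy, so the domains compare one
# 64-character digest instead of exchanging rows.
import hashlib


def trail_fingerprint(rows):
    """SHA-256 over sorted (audit_id, hash_self) pairs."""
    h = hashlib.sha256()
    for audit_id, hash_self in sorted(rows):
        h.update(f"{audit_id}:{hash_self}".encode("utf-8"))
    return h.hexdigest()


primary = [("a1", "h1"), ("a2", "h2")]
replica = [("a2", "h2"), ("a1", "h1")]    # same rows, different order
tampered = [("a1", "h1"), ("a2", "hX")]   # one altered hash_self
trail_fingerprint(primary) == trail_fingerprint(replica)   # True
trail_fingerprint(primary) == trail_fingerprint(tampered)  # False
```

Sorting makes the fingerprint insensitive to replication order while still catching any altered, added, or missing row.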
Forensic Use¶
During Incident: Query Audit Log for Timeline¶
When Incident Response opens an incident, the IR lead queries the audit log for:
- Authentication events for affected identities (24-hour window pre-incident)
- Privileged-access activations
- Configuration changes
- Data-export events
- Schema or RBAC changes
The output is a timeline with cryptographic provenance: admissible-quality evidence.
Retain Extra During Investigation (Legal Hold)¶
The moment an incident is declared, the IR lead places a legal hold on:
- The audit log partitions covering the incident window
- All evidence artifacts referenced from the audit log
- Any retention class that would otherwise expire during the investigation
The hold is released only after the incident is closed and counsel authorizes.
Chain of Custody for Export¶
When audit data leaves the immutable store (e.g., to a regulator or court):
- The export is itself an audit event (`action=AUDIT_EXPORT`).
- The export package contains: rows, the hash-chain segment, Confidential Ledger receipts, and a manifest with the SHA-256 of the package.
- The package is signed with a workspace identity certificate.
- Recipient signs a chain-of-custody acknowledgement.
- The acknowledgement is stored back in the audit log.
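The manifest step in the list above can be sketched as follows; the field names are illustrative, not a mandated format:

```python
# Hypothetical export manifest: the recipient recomputes the SHA-256 over
# the received package and compares it to the manifest before accepting.
import hashlib


def build_manifest(package_bytes: bytes, row_count: int, exported_by: str) -> dict:
    """Minimal manifest accompanying an audit export package."""
    return {
        "sha256": hashlib.sha256(package_bytes).hexdigest(),
        "row_count": row_count,
        "exported_by": exported_by,
    }


package = b'{"rows": [...]}'  # serialized export package (illustrative)
manifest = build_manifest(package, row_count=2, exported_by="sp-audit-export")
manifest["sha256"] == hashlib.sha256(package).hexdigest()  # True by construction
```

Any mismatch between the recomputed digest and the manifest means the package was altered in transit, and the chain-of-custody acknowledgement should be refused.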
Casino Implementation¶
BSA-Relevant Transactions in the Immutable Trail¶
Every BSA-relevant event flows to `audit_immutable`:
- CTR generation (`action=COMPLIANCE_REPORT`, `resource_type=ctr_filing`)
- SAR drafting & filing (`action=COMPLIANCE_REPORT`, `resource_type=sar_filing`)
- W-2G issuance (slot ≥ $1,200; keno ≥ $1,500; poker tournament ≥ $5,000)
- Patron-record changes (KYC/AML)
- Cage transactions โฅ $3,000 (multiple-transaction-log triggers)
- Self-exclusion list edits
5-Year Retention with Auto-Archive¶
NIGC MICS sets a 5-year minimum; many tribal compacts extend to 7-10 years. Configure retention_class=compliance-report to 2,555 days (7 yr) to satisfy the union.
Examiner Access Pattern¶
State and tribal gaming commission examiners receive:
- PIM-activated reader access to the audit Lakehouse for the examination window
- A pre-built KQL workbook scoped to compliance-report partitions
- Ability to request signed exports for off-site review
Federal Implementation¶
DOJ โ Extra Restricted Access, Longer Retention¶
- Audit log lives in a dedicated workspace with Workspace Identity scoped to the DOJ tenant's Entra
- Reader group gated by DOJ background-check attestation
- Retention extended to 10 years for case-related records
- Replication to a Confidential Ledger instance in a GovCloud subscription
- See DOJ federal domain doc for case-handling specifics
### Tribal Health: HIPAA 6 Years

- `retention_class=data-access` set to 2,190 days (6 years from creation/last-effective-date)
- Audit captures every PHI access, including legitimate clinical access
- Annual access-pattern review by Privacy Officer
- See HIPAA section in compliance docs
### SBA: Privacy Act + Records Management

- Privacy Act § 552a system-of-records notice (SORN) referenced in retention policy
- `retention_class=dsar` aligned to the NARA records schedule
- DSAR proof-of-completion records retained 3 years post-fulfillment
### USDA / NOAA / EPA / DOI: FedRAMP AU-9 / AU-11
- Retention 3 years minimum; longer per system-categorization (FIPS 199 Moderate / High)
- WORM via Azure Government Storage with immutability lock
- Cross-region replication for AU-9(2) requirement (separate physical system)
### DOT/FAA: Aviation Safety Records
- 7-year retention typical; some certificate records permanent
- Hash-chain attestation supports FOIA defensibility
## 🚫 Anti-Patterns
| Anti-Pattern | Why It Hurts | What to Do Instead |
|---|---|---|
| Audit log in the same workspace as the audited application | Compromised app credentials can rewrite the log | Separate workspace + Workspace Identity for audit |
| `DELETE` permission ever granted on the audit table | Single rogue actor erases the trail | OneLake Security read-only for everyone except the emitter Workspace Identity |
| `VACUUM` enabled on audit Delta | Time-travel deletion silently destroys history | `delta.appendOnly=true` + 20-year file retention |
| No hash chain ("storage is immutable, that's enough") | Storage ops can still rewrite if break-glass is abused | Chain at the application layer; verify nightly |
| Auditing only successes (or only failures) | Half the timeline; attacker can mask intent | Log both; outcome is a column, not a filter |
| PII inside `context` JSON without minimization | Audit log itself becomes a privacy liability subject to DSAR | Minimize, hash, or tokenize; never raw SSN/PAN |
| Single retention class for all events | Either over-retention (cost) or under-retention (compliance gap) | Per-category retention class |
| No legal-hold mechanism | Litigation hold becomes "delete everything else" by default | Tag-based hold + release workflow with counsel sign-off |
| Auditor reads via shared service-account credential | Repudiation risk; can't tell which auditor read what | Per-auditor PIM activation with named identity |
| "We'll set up the audit log later" | You can't backfill; the gap is permanent | Audit emission is a Day-0 deliverable |
## 📋 Implementation Checklist
Before declaring "audit immutability ready":
- Audit catalog documented; every in-scope event maps to a retention class
- `audit_immutable` Delta table created with `appendOnly=true` and 20-year file retention
- OneLake Security policy: WRITE only by audit-emitter Workspace Identity
- OneLake Security policy: READ only by `grp-audit-readers`
- Hash-chain append helper deployed to `notebooks/audit/`
- Hash-chain verification job scheduled nightly
- Verification failure raises a `severity=critical` Sentinel alert
- Storage account with immutability policy provisioned and locked (after pilot)
- Lifecycle policy: hot → cool (30d) → archive (365d)
- Replication pipeline OneLake → WORM Storage scheduled and verified
- Log Analytics audit table retention ≥ 365 days
- Confidential Ledger or equivalent attestation receiver configured
- Cross-trust-domain backup destination configured (separate tenant or cloud)
- Per-category retention configuration externalized in config file
- Legal-hold tagging procedure documented and tested
- PIM workflow for auditor read access documented and tested
- Meta-audit events firing on every read of `audit_immutable`
- Sample-based attestation procedure documented in compliance runbook
- Chain-of-custody export package format defined and signed
- PII minimization rules enforced in the `context` field (validation in emitter)
- Retention-expiry deletion workflow documented (with audit emission)
- Quarterly access review for `grp-audit-readers` membership
- Annual independent verification (Internal Audit or external)
- Incident-response runbook updated to include audit-log queries
- DR tested: audit log recovers to RPO/RTO targets
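The nightly hash-chain verification item in the checklist can be sketched as a pure function over the chained rows. This is illustrative: a real job would read `audit_immutable` and raise the `severity=critical` Sentinel alert on failure rather than return an index.

```python
import hashlib
import json

def verify_chain(rows: list):
    """Return the index of the first broken row, or None if the chain is intact.

    Each row must satisfy: row_hash == sha256(prev_hash + canonical event JSON),
    and prev_hash must equal the previous row's row_hash (genesis uses 64 zeros).
    """
    expected_prev = "0" * 64
    for i, row in enumerate(rows):
        # Strip the chain columns to recover the canonical event payload.
        event = {k: v for k, v in row.items() if k not in ("prev_hash", "row_hash")}
        payload = json.dumps(event, sort_keys=True)
        recomputed = hashlib.sha256((row["prev_hash"] + payload).encode()).hexdigest()
        if row["prev_hash"] != expected_prev or row["row_hash"] != recomputed:
            return i  # first break: everything at and after this index is suspect
        expected_prev = row["row_hash"]
    return None
```

Because the first failing index bounds the blast radius, the verification alert can state precisely which partitions remain trustworthy — which is what makes the nightly run audit evidence rather than just a health check.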
## 📚 References

### Standards & Regulatory Sources
- NIST SP 800-92 – Guide to Computer Security Log Management
- NIST SP 800-53 Rev. 5 – AU (Audit and Accountability) family
- SOX Section 802 – Records retention
- HIPAA Security Rule – 45 CFR § 164.312(b)
- GDPR Article 30 – Records of processing activities
- PCI-DSS v4.0 – Requirement 10
- 21 CFR Part 11 – Electronic Records; Electronic Signatures
- FedRAMP Security Controls Baseline – AU family
- NIGC MICS – Tier A/B/C
### Microsoft Resources
- Azure Storage Immutable Blob Storage
- Azure Confidential Ledger
- Microsoft Fabric Workspace Monitoring
- Microsoft Fabric OneLake Security
- Log Analytics archive tier
- Microsoft Purview Audit
### Related Wave 5 Docs
- SOC 2 Type II Readiness (Wave 5 anchor)
- ISO 27001 Mapping
- GDPR Right to Deletion
- CCPA Privacy Rights
- STRIDE Threat Model
- Zero-Trust Blueprint
- Data Exfiltration Prevention
- Supply Chain Security
### Related Existing Docs
- SQL Audit Logs & Compliance
- Customer-Managed Keys
- Outbound Access Protection
- Identity & RBAC Patterns
- Monitoring & Observability
- Disaster Recovery & BCDR