🚪 Data Exfiltration Prevention on Microsoft Fabric

Layered Defenses Against Intentional and Accidental Data Egress

Last Updated: 2026-04-27 | Version: 1.0.0 | Anchor: SOC 2 Type II Readiness (Wave 5)

Disclaimer: This document describes architectural and technical controls to reduce the likelihood and impact of data exfiltration. It is not a guarantee. A determined insider with sufficient privilege can defeat any control. Layered defense, behavior monitoring, and process controls are equally important. Engage your security and legal teams before relying on these patterns in regulated environments.


🎯 Overview: The Exfiltration Threat Model

Data exfiltration is the unauthorized movement of data outside the trust boundary of the organization. Unlike unauthorized access (which is about reading), exfiltration is about taking. It is a leading cause of regulated-data breaches and one of the highest-impact failure modes for an analytics platform.

In a Fabric workload, the trust boundary is typically:

  • A specific tenant
  • A workspace or set of workspaces in a domain
  • The customer-controlled storage accounts (CMK-encrypted, OAP-fenced, private-endpoint-isolated)

Anything that crosses that boundary unsupervised is an exfiltration event.

The Four Threat Personas

| Persona | Motivation | Detection difficulty | Typical vector |
|---|---|---|---|
| Insider (malicious) | Resignation, revenge, espionage, financial gain | Hardest (uses legitimate credentials) | Notebook download, Power BI export, COPY INTO to personal storage |
| Insider (compromised) | Phished credentials, malware on workstation | Hard (legitimate user, abnormal behavior) | API token theft, automated scraping via SSMS or REST |
| External attacker (gained access) | Any of the above motives, now operating with stolen identity | Medium (often noisy if logging is on) | Bulk download, mirroring abuse, shortcut creation to attacker-controlled storage |
| Accidental disclosure | Misdirected email, public bucket, screenshot, lost laptop | Easy if labels/DLP fire; invisible otherwise | Power BI sharing to external email, public OneLake shortcut, unencrypted export |

Auditors will ask: "Show me the control that prevents an analyst from copying the player table to their personal OneDrive." You need an answer for every persona, every vector.

What "Prevention" Actually Means

True prevention is rare. Realistic goals:

  1. Eliminate the easy path. A couple of clicks should never yield a CSV of regulated data.
  2. Add friction. A user determined to exfiltrate must defeat multiple layers (network, identity, data, app, audit), each leaving evidence.
  3. Detect within hours, not weeks. Behavior monitoring + DLP + SIEM correlation.
  4. Minimize blast radius. Workspace isolation, encryption, sensitivity labels, OneLake row/column controls.
  5. Preserve audit trail integrity. When (not if) an event happens, you can investigate.

Scope: This is a Wave 5 deep-dive. Closely related Wave 5 docs: Zero-Trust Blueprint, STRIDE Threat Model, Audit Trail Immutability, SOC 2 Type II Readiness. Existing dependencies: OAP, Network Security, Data Governance Deep Dive, OneLake Security.


🛣️ Exfiltration Vectors in Fabric

The table below lists the exfiltration paths a Fabric tenant must consider. Every vector is real and has been observed; a program that addresses only some of them has gaps.

| # | Vector | Description | Primary Mitigation | Reference |
|---|---|---|---|---|
| 1 | COPY INTO to external storage | T-SQL COPY INTO from Warehouse / Lakehouse SQL endpoint to attacker-controlled storage account | Workspace-level destination allowlist + OAP | § COPY INTO Restrictions |
| 2 | Power BI export to Excel/CSV | Right-click → Export → Excel from any visual or table | Tenant + sensitivity-label disable | § Power BI Export |
| 3 | Notebook .ipynb download | Download notebook with embedded result data; query a Lakehouse, then save and download | Workspace policy: disable export; cell-output redaction | § Notebook Download |
| 4 | Lakehouse Files download | Drag-drop or "Download" from Files area in Lakehouse explorer | OneLake security RBAC + workspace policy | OneLake Security |
| 5 | Cross-tenant sharing misuse | "Share" button on a report or item to external Entra tenant | Tenant B2B settings + label-based external block | § Cross-Tenant Sharing |
| 6 | OneLake shortcut to external (S3, GCS) | Create shortcut pointing OUTBOUND to attacker storage; mirroring effect | OAP + connector allowlist | OAP |
| 7 | Mirroring egress | Configure mirroring from Fabric to external Snowflake/etc. (egress mirror) | Disable outbound mirroring; allowlist destinations | § OAP Deep Dive |
| 8 | SQL endpoint client tools | Connect SSMS, Power BI Desktop, Azure Data Studio to Warehouse SQL endpoint and SELECT * | Conditional Access + IP firewall + audit | Network Security |
| 9 | GraphQL API | API for GraphQL exposes structured queries; bulk extraction over HTTPS | Throttling + RBAC + query depth limits + audit | GraphQL feature doc |
| 10 | Eventstream output to external | Eventstream destination set to external Event Hub or Kafka | Destination allowlist + workspace policy | § OAP Deep Dive |
| 11 | SHIR pulling on-prem | Self-hosted Integration Runtime pulls from on-prem source; the same SHIR can write to an attacker-controlled on-prem target | Restrict SHIR sinks; pin to managed VNet | Network Security |
| 12 | Pipeline copy to external sink | Data pipeline Copy activity with external sink (Blob, S3, REST) | Connection allowlist + OAP egress control | OAP |
| 13 | Email subscriptions | Power BI subscription to external email with attachment | Tenant-level external email block | § Cross-Tenant Sharing |
| 14 | Screenshot / photo of screen | Out-of-band; cannot be technically prevented end-to-end | Watermarking + workforce policy + DLP camera detection on managed devices | § Sensitivity Labels |
| 15 | Print to PDF | Browser print → save as PDF → exfiltrate via email | Sensitivity label "no print" protection action | § Sensitivity Labels |
| 16 | Personal device sync (BYOD) | Power BI mobile / OneDrive personal | Conditional Access device compliance | Zero-Trust Blueprint |
16 Personal device sync (BYOD) Power BI mobile / OneDrive personal Conditional Access device compliance Zero-Trust Blueprint

⚠️ Be honest about scope. Vector 14 (screenshot or camera) and parts of vector 16 (BYOD) are organizational/process controls, not technical. Document them in your security awareness training; do not pretend they're solved by a Bicep parameter.


🛡️ Layered Defense Model

No single control prevents exfiltration. Defense-in-depth places obstacles at the network, identity, data, application, and audit layers. An attacker must defeat all five to succeed silently.

flowchart TB
    subgraph Threat["🎯 Threat Actor"]
        Insider[Insider<br/>malicious / compromised]
        External[External<br/>credential theft]
        Accidental[Accidental<br/>misdirection]
    end

    subgraph Network["🌐 Network Layer"]
        OAP[OAP - Outbound<br/>Access Protection]
        PE[Private Endpoints]
        IPF[IP Firewall]
        VNet[Managed VNet]
    end

    subgraph Identity["🆔 Identity Layer"]
        CA[Conditional Access<br/>+ MFA]
        DC[Device Compliance]
        PIM[Entra PIM<br/>just-in-time]
        WI[Workspace Identity]
    end

    subgraph Data["📦 Data Layer"]
        Labels[Sensitivity<br/>Labels]
        DLP[Purview DLP<br/>Policies]
        OLS[OneLake Security<br/>row/column]
        CMK[CMK Encryption]
    end

    subgraph App["⚙️ Application Layer"]
        ExportOff[Export-to-Excel<br/>Disabled]
        DownloadOff[Notebook<br/>Download Off]
        ShareBlock[External Share<br/>Block]
        CopyRestrict[COPY INTO<br/>Allowlist]
    end

    subgraph Audit["📜 Audit & Detection"]
        Logs[Workspace Monitoring<br/>+ Log Analytics]
        Sentinel[Microsoft Sentinel<br/>UEBA]
        Reflex[Data Activator<br/>Reflex]
        SOC[SOC<br/>Investigation]
    end

    subgraph Asset["🔒 Protected Asset"]
        OneLake[(OneLake<br/>regulated data)]
    end

    Threat --> Network
    Network -->|allowed| Identity
    Identity -->|authenticated| Data
    Data -->|labeled| App
    App -->|permitted action| OneLake
    Network -.->|every event| Audit
    Identity -.->|every event| Audit
    Data -.->|policy match| Audit
    App -.->|user action| Audit
    Audit --> SOC

    style Threat fill:#fee,stroke:#c00
    style Asset fill:#efe,stroke:#0a0
    style Audit fill:#eef,stroke:#00c

Layer Responsibilities

| Layer | Stops | Cannot stop |
|---|---|---|
| Network | Egress to non-allowlisted destinations; lateral movement to attacker storage | Egress to allowed destinations being abused |
| Identity | Unauthenticated access, weak-MFA bypass, non-compliant devices | Legitimate user with legitimate credentials acting maliciously |
| Data | Reading data above clearance; un-labeled exfiltration | User with clearance copying data to allowed channel |
| Application | UI-level export/share buttons | API-level access via approved client tools |
| Audit | Nothing in real time; detects after the fact | Sub-second exfiltration of small datasets |

The unique value of layered defense: no layer is asked to be perfect, but the combination makes silent exfiltration impractical.


🚧 OAP: Outbound Access Protection (Deep Dive)

OAP is the single highest-leverage technical control for exfiltration prevention. Reference: existing OAP doc.

What OAP Blocks

| Outbound flow | Blocked by OAP? |
|---|---|
| Notebook → unauthorized ADLS Gen2 | ✅ Yes |
| Pipeline copy → unauthorized Blob | ✅ Yes |
| Notebook → personal OneDrive | ✅ Yes |
| Eventstream → external Event Hub (not allowlisted) | ✅ Yes |
| Mirror destination → external Snowflake (not allowlisted) | ✅ Yes |
| User download via UI (Power BI export, notebook download) | ❌ No: that's a UI action, not an outbound network call from the workspace |
| SQL client (SSMS) reading from Warehouse to local disk | ❌ No: endpoint is allowed; client is downstream |
| User screenshot | ❌ No |

OAP is a network-egress control. It prevents the workspace from sending data outbound. It does not prevent users with legitimate read access from pulling data through approved client tools.

Configuration Patterns

The recommended pattern is default-deny + per-domain allowlist:

| Workspace | OAP allowlist (storage) | OAP allowlist (cross-workspace) | OAP allowlist (connectors) |
|---|---|---|---|
| ws_casino_prod | stcasinoprod (RW), stcasinoarchive (W) | ws_shared_gold (RO) | ADLS Gen2, Eventhouse, Azure SQL |
| ws_federal_doj | stfederaldoj (RW) | none | ADLS Gen2 only |
| ws_tribal_health | sthealthcareprod (RW) | none | Fabric-native only |
| ws_dev_sandbox | stdevsynthetic only | ws_shared_gold (RO) | broad (synthetic data only) |
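
As a rough illustration of the default-deny pattern in the table above, the decision reduces to a set-membership test in which anything not listed is refused. The sketch below is Python for illustration only: `WorkspaceAllowlist`, `ALLOWLIST`, and `is_egress_allowed` are hypothetical names, not Fabric APIs, and the RW/W/RO access levels are left out for brevity. OAP itself enforces the real rule at the platform level.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class WorkspaceAllowlist:
    """Hypothetical in-memory model of one row of the allowlist table."""
    storage: frozenset = field(default_factory=frozenset)
    cross_workspace: frozenset = field(default_factory=frozenset)


# Two rows copied from the table; the remaining rows follow the same shape.
ALLOWLIST = {
    "ws_casino_prod": WorkspaceAllowlist(
        storage=frozenset({"stcasinoprod", "stcasinoarchive"}),
        cross_workspace=frozenset({"ws_shared_gold"}),
    ),
    "ws_federal_doj": WorkspaceAllowlist(storage=frozenset({"stfederaldoj"})),
}


def is_egress_allowed(workspace: str, kind: str, target: str) -> bool:
    """Default-deny: an unknown workspace, kind, or target returns False."""
    rules = ALLOWLIST.get(workspace)
    if rules is None:
        return False
    by_kind = {"storage": rules.storage, "cross_workspace": rules.cross_workspace}
    allowed = by_kind.get(kind)
    return allowed is not None and target in allowed
```

The useful property is that every failure mode (a typo in the workspace name, a new destination kind, a missing row) falls through to a denial rather than an accidental allow.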

Bicep: OAP Module Reference

@description('Workspace ID for OAP target')
param workspaceId string

@description('Approved ADLS Gen2 storage rules')
param storageRules array = [
  {
    storageAccountName: 'stcasinoprod'
    containers: [ 'bronze', 'silver', 'gold' ]
    accessLevel: 'ReadWrite'
  }
]

@description('Approved cross-workspace targets')
param crossWorkspaceRules array = []

@description('Approved external connector types')
param allowedConnectors array = [
  'AzureDataLakeStorageGen2'
  'FabricLakehouse'
  'FabricWarehouse'
  'Eventhouse'
]

resource oap 'Microsoft.Fabric/workspaces/outboundAccessProtection@2026-01-01' = {
  name: '${workspaceId}/default'
  properties: {
    enabled: true
    defaultAction: 'Deny'
    storageRules: storageRules
    crossWorkspaceRules: crossWorkspaceRules
    connectorRules: {
      mode: 'AllowList'
      allowedConnectors: allowedConnectors
    }
  }
}

output oapEnabled bool = oap.properties.enabled

Validation Tests

Every workspace deployment should run an OAP smoke test:

# tests/security/test_oap_egress_block.py
import pytest
from pyspark.sql.utils import AnalysisException

UNAUTHORIZED_PATH = (
    "abfss://exfil@unauthorizedstorage.dfs.core.windows.net/test"
)

def test_oap_blocks_unauthorized_egress(spark):
    """OAP should refuse a write to a non-allowlisted storage account."""
    df = spark.createDataFrame([(1, "synthetic")], ["id", "value"])
    with pytest.raises((AnalysisException, PermissionError)) as exc:
        df.write.format("delta").mode("overwrite").save(UNAUTHORIZED_PATH)
    assert "outbound" in str(exc.value).lower() or "denied" in str(exc.value).lower()

def test_oap_allows_approved_egress(spark, approved_path):
    """OAP should permit writes to allowlisted destinations."""
    df = spark.createDataFrame([(1, "synthetic")], ["id", "value"])
    df.write.format("delta").mode("overwrite").save(approved_path)
    assert spark.read.format("delta").load(approved_path).count() == 1

When OAP Doesn't Help

⚠️ Critical limitation. OAP secures the destination set. It does not stop exfiltration to allowed destinations.

A malicious user with write access to stcasinoprod/bronze (an OAP-allowed account) can still:

  • Copy regulated data into a less-protected container within stcasinoprod
  • Write to stcasinoprod, then access it from outside Fabric via the storage account's own data plane (if storage RBAC permits)
  • Stage data for external retrieval by a separate process they control

OAP must be paired with:

  • Storage-account RBAC and private endpoints (data plane locked down)
  • OneLake security row/column rules (users cannot read what they want to copy)
  • Audit log analysis on writes to monitored containers
  • DLP policies on outbound files


📥 COPY INTO Restrictions

Default Behavior

COPY INTO is a Warehouse / SQL endpoint T-SQL statement that bulk-loads from Azure storage. By default, the destination of a COPY INTO is a Warehouse table, not external storage, so traditional COPY INTO is an ingress tool, not egress.

The exfiltration risk emerges when:

  • Users can run the inverse pattern, writing from the Warehouse to external storage or to another warehouse (CREATE EXTERNAL TABLE AS SELECT, OPENROWSET writes, BCP).
  • Users use SELECT INTO to a remote linked database.
  • Users run PolyBase-style writes, if and when supported.

Workspace Policy to Restrict Destinations

For Fabric Warehouses, use:

  • OAP to block writes from Warehouse endpoints to non-allowlisted storage
  • Workspace IP firewall so the SQL endpoint is reachable only from corporate IPs and Bastion subnets
  • Object-level GRANT/DENY so only specific service accounts can use COPY INTO-style operations

-- Restrict COPY INTO and bulk-load privileges to a service principal only
DENY ADMINISTER BULK OPERATIONS TO [analyst_role];
DENY ALTER ANY EXTERNAL DATA SOURCE TO [analyst_role];
DENY ALTER ANY EXTERNAL FILE FORMAT TO [analyst_role];

-- Allow read-only on regulated tables; no bulk writes
GRANT SELECT ON SCHEMA::gold TO [analyst_role];
DENY INSERT, UPDATE, DELETE ON SCHEMA::gold TO [analyst_role];

Audit Log Analysis

Every COPY INTO and external-table operation appears in the Fabric SQL audit logs. Monitor for unusual patterns.

// Detect bulk-load operations to or from external sources
FabricSQLAuditLogs
| where TimeGenerated > ago(24h)
| where StatementType in ("COPY", "BULK INSERT", "EXTERNAL TABLE", "OPENROWSET")
| where StatementText !contains "stcasinoprod"  // exclude approved sources
    and StatementText !contains "stfederalprod"
| project TimeGenerated, UserPrincipalName, WorkspaceName, DatabaseName,
          StatementType, StatementText, RowsAffected, ClientIP
| order by TimeGenerated desc
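
One weakness of a substring exclusion like the one above is that a statement can mention an approved account name anywhere in its text (for example in a folder path) and slip past the filter. A possible follow-up step, sketched here in Python under the assumption that audit rows have been exported with their statement text, is to extract the storage account host names and compare them to the approved set. `APPROVED_ACCOUNTS` and `unapproved_accounts` are hypothetical names introduced for this sketch.

```python
import re

# Mirrors the two exclusions used in the query above.
APPROVED_ACCOUNTS = {"stcasinoprod", "stfederalprod"}

# Captures the account part of <account>.dfs.core.windows.net or
# <account>.blob.core.windows.net host names.
_ACCOUNT_RE = re.compile(
    r"([a-z0-9]{3,24})\.(?:dfs|blob)\.core\.windows\.net", re.IGNORECASE
)


def unapproved_accounts(statement_text: str) -> set:
    """Return storage accounts referenced in a statement that are not approved."""
    found = {name.lower() for name in _ACCOUNT_RE.findall(statement_text)}
    return found - APPROVED_ACCOUNTS
```

For example, a statement that targets `https://evilstore.dfs.core.windows.net/c/stcasinoprod/` contains the string "stcasinoprod" and would be excluded by the substring filter, while this check still reports `evilstore`.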

📤 Power BI Export Restrictions

Power BI's "Export to Excel" / "Export to CSV" is the most common accidental and intentional exfiltration vector for analytics data. Lock it down by default; allow it only for non-regulated workspaces.

Tenant-Wide Settings

In the Fabric Admin Portal → Tenant settings:

| Setting | Recommended | Rationale |
|---|---|---|
| Export to Excel | Disabled for Confidential and Highly Confidential security groups | Default-allow exposes everything |
| Export underlying data | Disabled by default; allow per workspace | Underlying = the raw query result, often more than the visual |
| Export reports as PowerPoint / PDF | Allow with watermark | Lower-risk than raw data |
| Live connect to dataset from Excel | Restrict to corporate-network only | Excel "Analyze in Excel" is essentially unbounded export |
| Print | Block for Highly Confidential | Print-to-PDF is exfiltration |

Sensitivity Labels with Protection

Apply Microsoft Information Protection labels with encryption + content-marking + access restrictions to regulated reports.

| Label | Encryption | Watermark | Export | Print | Forwarding |
|---|---|---|---|---|---|
| Public | none | none | yes | yes | yes |
| Internal | yes (org) | "Internal" footer | yes | yes | inside org only |
| Confidential | yes (org) | watermark | view-only, no Excel | no | no external |
| Highly Confidential - Casino CTR/SAR | yes (named group) | watermark + user identity | view-only, no Excel | no | no |
| Highly Confidential - PHI | yes (named group) | watermark + user identity | view-only, no Excel | no | no |

When a user opens an Excel file that was exported (before label tightening), MIP encryption keeps the file readable only to authorized identities, even if it leaves the tenant.

Persona: BI Consumer, View-Only

The standard pattern for casino floor managers, federal field staff, healthcare clinical reviewers:

| Permission | Setting |
|---|---|
| Workspace role | Viewer only, never Member |
| Sensitivity label | Confidential or higher applied at semantic-model level |
| Export | Disabled by tenant + label |
| Subscriptions | Disabled |
| Share with external | Blocked by tenant |
| App ownership | Workspace admin only |

The user can interact with reports and dashboards but cannot move data anywhere.


📓 Notebook Download Restrictions

A .ipynb file with cell outputs can contain thousands of rows of regulated data, so a single download can be a major breach.

Workspace Policy: Disable Download

Configure at workspace level (Settings → Security → Item Export):

| Item type | Recommended |
|---|---|
| Notebook download (.ipynb) | Disabled for prod workspaces |
| Notebook download (.py) | Allowed (no embedded data) |
| Lakehouse Files download | Disabled for regulated containers |
| Workspace export | Admin-only, audit-logged |

Cell-Output Hygiene

For workspaces where download must be allowed, train and code-review for:

# ❌ Anti-pattern: large display() calls leave regulated data in cell output
display(spark.table("lh_gold.player_master"))

# ✅ Pattern: redacted display
df = spark.table("lh_gold.player_master")
display(df.limit(10).select("player_id", "join_date"))  # exclude PII columns
print(f"row_count={df.count()}")  # aggregate only

A pre-commit linter or CI check can flag display(...) and .show() of tables tagged Confidential.
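
A minimal sketch of such a check, assuming notebooks are committed as standard `.ipynb` JSON, might look like the following. The function names `risky_calls` and `lint_notebook` are hypothetical, and the sketch flags every `display(...)` and `.show()` call plus any stored cell output; narrowing it to tables tagged Confidential would need a label lookup that is not shown here.

```python
import ast
import json


def risky_calls(source: str) -> list:
    """Return line numbers of display(...) and <expr>.show(...) calls."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        # Cell magics such as %%sql are not valid Python; skip those cells.
        return []
    hits = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_display = isinstance(func, ast.Name) and func.id == "display"
        is_show = isinstance(func, ast.Attribute) and func.attr == "show"
        if is_display or is_show:
            hits.append(node.lineno)
    return sorted(hits)


def lint_notebook(ipynb_text: str) -> list:
    """Return (cell_index, problem) tuples for one notebook's JSON text."""
    findings = []
    cells = json.loads(ipynb_text).get("cells", [])
    for index, cell in enumerate(cells):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # nbformat stores source as a list of lines
            source = "".join(source)
        if cell.get("outputs"):
            findings.append((index, "stored cell output"))
        for line in risky_calls(source):
            findings.append((index, f"display/show call on line {line}"))
    return findings
```

A CI job could run `lint_notebook` over each changed notebook and fail the build when the list is non-empty, leaving reviewers to approve justified exceptions.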

Workspace Identity for Notebook Execution

Notebooks should authenticate to OneLake and external connectors via Workspace Identity (managed identity), not via embedded secrets. This eliminates the "hard-coded SAS token in notebook → leaked notebook → durable credential exposure" path.

# ✅ Workspace Identity: no credential in notebook
df = spark.read.format("delta").load(
    "abfss://gold@stcasinoprod.dfs.core.windows.net/player_master"
)

# ❌ Anti-pattern: embedded SAS token, leaks if notebook is downloaded
sas = "?sv=2023-01-01&ss=b&srt=co&sp=rwdlac&se=2026-12-31..."
df = spark.read.format("delta").load(
    f"https://stcasinoprod.blob.core.windows.net/gold/player_master{sas}"
)
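
To catch the anti-pattern before it is committed, a simple scanner can look for query strings shaped like a SAS token. The sketch below is a heuristic only, written under the assumption that a SAS-like string carries a signed version field (`sv=YYYY-MM-DD`) together with at least one of `sp=`, `se=`, or `sig=`. The name `find_sas_like` is hypothetical, and a dedicated secret scanner will cover far more credential formats.

```python
import re

# A query-string fragment: ?key=value or &key=value, up to whitespace or a quote.
_QUERY_RE = re.compile(r"""[?&][A-Za-z]+=[^\s"']+""")
# Signed version field, e.g. sv=2023-01-01.
_SV_RE = re.compile(r"(?:^|[?&])sv=\d{4}-\d{2}-\d{2}")
# Other fields that typically accompany sv= in a SAS token.
_FIELD_RE = re.compile(r"[?&](?:sig|se|sp)=")


def find_sas_like(text: str) -> list:
    """Return 1-based line numbers that appear to contain a SAS token."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in _QUERY_RE.finditer(line):
            query = match.group(0)
            if _SV_RE.search(query) and _FIELD_RE.search(query):
                hits.append(lineno)
                break  # one finding per line is enough
    return hits
```

Run against the two snippets above, the `abfss://` read produces no finding, while the line that assigns `sas` is reported.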

🌐 Cross-Tenant Sharing Controls

Cross-tenant B2B sharing is a major exfiltration path because once data crosses tenants, your DLP/labels travel only if MIP encryption is enforced and the receiving tenant honors it.

Tenant-Level B2B Settings

In Entra ID → External Identities → Cross-tenant access settings:

| Setting | Recommended |
|---|---|
| Default outbound sharing | Block all, allowlist by partner tenant |
| Default inbound sharing | Block all, allowlist by partner tenant |
| Per-partner outbound | Allow specific Entra tenants of approved partners only |
| Cross-tenant access for Fabric items | Disabled by default; opt-in per workspace by request |
| Per-user external invitation | Restricted to approved roles |

External User Policy

In Fabric Admin Portal → Tenant settings:

| Setting | Recommended |
|---|---|
| External users in workspaces | Disabled, allowlist by Entra group |
| External user content access | Read-only, no export |
| External user share-back | Disabled |

Data Residency

Casino, federal, and healthcare workloads frequently have data residency requirements. Cross-tenant sharing can move data to tenants in other regions.

  • Pin storage to approved in-country regions (US Gov for federal)
  • Tag workspaces with dataResidency: us-gov
  • Conditional Access: block sign-in to in-scope workspaces from non-approved geographies
  • Sensitivity label: Region-Locked: US-Only with named-group encryption

🔍 DLP Integration (Microsoft Purview)

Microsoft Purview Data Loss Prevention extends content-aware policies to Fabric. Purview DLP can scan content and trigger actions when sensitive patterns are detected.

Trigger Conditions

Common DLP rules for Fabric:

| Rule | Trigger | Action |
|---|---|---|
| Bulk PII | Document or query result with 5+ SSN matches or 10+ credit-card matches | Block export; notify user; alert SOC |
| HIPAA PHI | Patient-record patterns (MRN + DOB + diagnosis) | Block + alert |
| CTR/SAR | Currency Transaction Report identifiers | Block + alert + auto-classify Highly Confidential |
| Federal CUI | Controlled Unclassified Information markers | Block + alert |
| Source code with secrets | API key, JWT, connection-string patterns | Warn + alert |

Block / Warn / Audit Modes

DLP policies progress through enforcement modes:

| Mode | Use when | Effect |
|---|---|---|
| Audit-only | Initial rollout; calibrating false-positive rate | Logs match, takes no other action |
| Warn | Steady state for low-severity rules | Shows policy tip; user can override with justification (logged) |
| Block | High-severity rules in production | Action prevented; user notified; SOC alert |

The recommended path: deploy in audit-only for 30 days, tune rules, promote to warn for 30 days, then block.
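
As a small worked example of that path, the promotion dates can be derived from the deployment date. This is a sketch only: `dlp_rollout` and `mode_on` are hypothetical helper names, and the 30-day windows are the defaults from the recommendation above, to be adjusted once tuning data is available.

```python
from datetime import date, timedelta


def dlp_rollout(start: date, audit_days: int = 30, warn_days: int = 30) -> dict:
    """Return the first day of each enforcement mode for a rule deployed on start."""
    warn_start = start + timedelta(days=audit_days)
    block_start = warn_start + timedelta(days=warn_days)
    return {"audit": start, "warn": warn_start, "block": block_start}


def mode_on(day: date, schedule: dict) -> str:
    """Return the enforcement mode in effect on a given day."""
    if day >= schedule["block"]:
        return "block"
    if day >= schedule["warn"]:
        return "warn"
    if day >= schedule["audit"]:
        return "audit-only"
    return "not-deployed"
```

A rule deployed on 2026-05-01 would move to warn on 2026-05-31 and to block on 2026-06-30, provided the false-positive rate observed in each phase is acceptable.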

DLP Policy Example (Purview)

policy:
  name: Casino-Financial-Bulk-PII-Block
  scope:
    fabric_workspaces:
      - ws_casino_prod
      - ws_casino_compliance
  conditions:
    - any_of:
        - sensitive_info_type: U.S. Social Security Number
          min_count: 5
        - sensitive_info_type: Credit Card Number
          min_count: 10
        - keyword_dictionary: ctr_sar_terms
          min_count: 3
  actions:
    - block_export: true
    - block_share_external: true
    - notify_user:
        message: "This dataset contains regulated financial PII and cannot be exported."
    - notify_admin:
        recipients: ["security-ops@contoso.com"]
        severity: high
    - log_event: true
  exceptions:
    - role: "compliance-officer"
      requires_justification: true

🏷️ Sensitivity Label Enforcement

Sensitivity labels are the substrate the entire exfiltration program rides on. Without labels, DLP cannot decide what to protect, and audit cannot determine severity of an event.

Auto-Labeling

Auto-label policies inspect content and apply a label when patterns match:

| Trigger | Label |
|---|---|
| Casino: contains player_id and aggregate amount > $9,999 | Highly Confidential - Casino-Financial |
| Casino: contains player_id alone | Confidential - Casino-PII |
| Federal-DOJ: contains case_id | Highly Confidential - DOJ-Case |
| Tribal Health: contains MRN or ICD-10 | Highly Confidential - PHI |
| SBA: contains borrower_ein and loan_amount | Confidential - SBA-Loan |

Inheritance Through Medallion

Labels should propagate from raw → curated layers. Configure Purview to enforce inheritance.

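# Sketch: verify label propagation in CI. `purview` is a placeholder fixture
# exposing get_label(item_name) -> str; wire it to your own label lookup.
LABEL_RANK = {"Public": 0, "Internal": 1, "Confidential": 2, "Highly Confidential": 3}

def label_rank(label):
    # Rank by base tier; sub-label suffixes such as "PHI" are ignored
    return next(rank for name, rank in LABEL_RANK.items() if label.startswith(name))

def test_label_inheritance(purview):
    bronze_label = purview.get_label("lh_bronze.player_transactions")
    silver_label = purview.get_label("lh_silver.player_transactions_clean")
    gold_label   = purview.get_label("lh_gold.player_kpi")
    # Silver and Gold must be at least as restrictive as Bronze
    assert label_rank(silver_label) >= label_rank(bronze_label)
    assert label_rank(gold_label)   >= label_rank(bronze_label)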

Protection Actions

Each label has content-marking (visible) and protection (cryptographic) settings:

| Label | Watermark | Header/Footer | Encryption | Restrict copy/print | Expiry |
|---|---|---|---|---|---|
| Public | none | none | none | no | none |
| Internal | none | "Contoso Internal" | org-wide | no | none |
| Confidential | "CONFIDENTIAL - {user}" | yes | named groups | yes | none |
| Highly Confidential | "HIGHLY CONFIDENTIAL - {user} - {date}" | yes | named groups | yes | 30 days |

The user-identity watermark is critical: any screenshot taken of a regulated report can be traced back to the viewer.


📈 Egress Monitoring

Detection assumes prevention will fail. Monitor for the patterns prevention couldn't stop.

KQL: Unusual Download Patterns

// Single user downloading large volume from a Lakehouse in a short window
FabricActivityLogs
| where TimeGenerated > ago(1h)
| where Activity in ("ExportReport", "DownloadFile", "ExportToExcel", "DownloadNotebook")
| extend RowsExported = coalesce(tolong(ActivityDetail.rowCount), 0)
| summarize TotalRows = sum(RowsExported), Events = count(),
            Items = make_set(ItemName)
    by UserId, WorkspaceName, bin(TimeGenerated, 5m)
| where TotalRows > 10000 or Events > 20
| order by TotalRows desc

KQL: Off-Hours Activity

// Export and sharing activity outside business hours (TimeGenerated is UTC; shift for local time)
let business_hours = range(7, 18); // hours 07:00-18:59, i.e. 7am-7pm
FabricActivityLogs
| where TimeGenerated > ago(7d)
| extend HourOfDay = datetime_part("hour", TimeGenerated)
| where HourOfDay !in (business_hours)
| where Activity in ("ExportReport", "ExportToExcel", "DownloadFile",
                     "ShareReport", "CreateShortcut")
| summarize Events = count() by UserId, Activity, bin(TimeGenerated, 1d)
| where Events > 5
| order by Events desc

KQL: First-Time-Download

// Detect when a user downloads a report for the first time ever
let baseline = FabricActivityLogs
| where TimeGenerated between (ago(180d) .. ago(1d))
| where Activity == "ExportReport"
| distinct UserId, ReportId;
FabricActivityLogs
| where TimeGenerated > ago(1d)
| where Activity == "ExportReport"
| join kind=leftanti baseline on UserId, ReportId
| project TimeGenerated, UserId, ReportId, WorkspaceName

Sentinel Detection Rules

Promote the highest-fidelity KQL queries into Microsoft Sentinel analytic rules:

| Rule | Severity | Threshold | Response |
|---|---|---|---|
| Bulk-export | High | > 10,000 rows by single user in 1h | Auto-disable session; page SOC |
| Off-hours-export | Medium | privileged user export between 8pm and 6am | Slack to SOC; manual review |
| First-time-export | Low | first export of a report by a user | Audit trail entry; weekly review |
| Cross-tenant-share | High | any share to external tenant on regulated label | Auto-revoke share; page SOC |
| OAP-block-burst | High | > 5 OAP blocks by single user in 1h | Auto-disable session; page SOC |

Alert Thresholds

⚠️ Tune to your environment. A 10,000-row threshold may be normal for a finance analyst building a forecast. The threshold matters less than the delta from that user's baseline. UEBA does this automatically.


🕵️ Detective Controls

User Behavior Analytics (UEBA)

Microsoft Defender for Cloud Apps (formerly MCAS) and Microsoft Sentinel UEBA produce per-user behavioral baselines. Anomalies that warrant alerts:

| Anomaly | Why it matters |
|---|---|
| Activity from new geography | Compromised credential or VPN abuse |
| New device sign-in for privileged user | Possible initial-access stage of a compromise |
| Volumetric anomaly | Bulk download or copy-out |
| Sequence anomaly (e.g., sign-in → broad search → export) | Reconnaissance + exfil pattern |
| Impossible travel | Concurrent use of a compromised credential |

Volume-Based Anomaly Detection

Compare each user's current hour to their median hour over 30 days. Alert when the current hour is > 5x median or > 95th-percentile.

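-- Concept: per-user export-volume score (deviation from the 30-day median)
SELECT user_id, current_hour_rows, z_score
FROM (
    SELECT user_id, current_hour_rows, median_30d_hour, p95_30d_hour,
           (current_hour_rows - median_30d_hour) /
           NULLIF(stddev_30d_hour, 0) AS z_score
    FROM exfil_baseline
) AS scored
WHERE z_score > 3
   OR current_hour_rows > 5 * median_30d_hour
   OR current_hour_rows > p95_30d_hour;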
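
The same rule can be prototyped in Python before committing to a baseline table. The sketch below assumes `history` holds one user's hourly row counts for the trailing 30 days; `is_volume_anomaly` is a hypothetical name, and the 5x multiple and 95th percentile come from the rule stated above.

```python
from statistics import median, quantiles


def is_volume_anomaly(history: list, current: int, multiple: float = 5.0) -> bool:
    """Flag current when it exceeds multiple x median or the 95th percentile."""
    if len(history) < 2:
        return False  # not enough baseline to judge
    baseline_median = median(history)
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    p95 = quantiles(history, n=20)[-1]
    return current > multiple * baseline_median or current > p95
```

Because both thresholds are computed per user, a finance analyst who routinely exports large extracts is not flagged for volumes that would be alarming from a floor manager.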

First-Time-Download Alerts

When a user opens a report or table they've never accessed before AND immediately exports it, that is a high-fidelity signal even at low volume.


👁️ Insider Threat Patterns

Insider exfiltration follows known behavioral patterns. Couple HR signals (resignation date, HR-flagged employees) with technical signals where lawful.

Pre-Resignation Behavior

Common signals in the 30 days before an insider resigns or is terminated:

| Signal | Detection |
|---|---|
| Increased export volume | Volume anomaly vs. own baseline |
| Off-hours activity | Off-hours KQL |
| Access to data outside role | Access-pattern KQL |
| Bulk creation of subscriptions to personal email | Subscription-creation log filter |
| Shortcut creation to external storage | OneLake shortcut audit |
| First-time access to historically-untouched workspaces | Workspace-access KQL |

⚠️ Legal scope. Pre-resignation monitoring requires HR partnership, written workforce policy, and (in many jurisdictions) prior notice in employment agreements. Engage legal before deploying.

Privileged User Oversight

| Role | Oversight |
|---|---|
| Workspace Admin | Two-person rule for sensitive operations; weekly access review |
| Tenant Admin | Entra PIM with approval workflow; session recording |
| Service Principal owner | Quarterly attestation; secret rotation; allowlist of permitted operations |
| External vendor admin | Daily activity report to internal sponsor |

Audit Trail Integrity

If the audit trail can be tampered with by the same insider you're trying to detect, it's not an audit trail. See audit trail immutability for:

  • Immutable Blob storage (WORM)
  • Log forwarding to a separate tenant
  • Tamper-evident hashing (cryptographic chain)
  • Privileged-access PIM gating around log deletion
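
The tamper-evident hashing item can be illustrated with a short hash chain, in which each record's hash covers the previous record's hash so that any later edit breaks verification. This is a minimal sketch under stated assumptions: events are JSON-serializable dictionaries, SHA-256 is the chosen digest, and `chain_append` and `chain_verify` are hypothetical names. It is not a description of how any Microsoft service stores its logs.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(prev_hash: str, event: dict) -> str:
    # Canonical JSON so the same event always hashes to the same value.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def chain_append(chain: list, event: dict) -> dict:
    """Append an event, linking it to the hash of the previous record."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    record = {"event": event, "prev_hash": prev_hash, "hash": _digest(prev_hash, event)}
    chain.append(record)
    return record


def chain_verify(chain: list) -> bool:
    """Recompute every link; any edited, removed, or reordered record fails."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["event"]):
            return False
        prev_hash = record["hash"]
    return True
```

A chain like this only shows that tampering occurred. The latest hash still has to be anchored somewhere the insider cannot write, which is where WORM storage and the separate tenant come in.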

🤝 Vendor / Third-Party Controls

Sub-processors (vendors who process your data) are exfiltration vectors at the contractual layer. Microsoft is one such sub-processor; so are any ISVs you integrate.

Data Processing Agreement (DPA) Requirements

Every vendor processing regulated data should have:

| Clause | What it provides |
|---|---|
| Purpose limitation | Vendor may use data only for the contracted purpose |
| Sub-processor list | Disclosure of all downstream processors |
| Sub-processor approval | Right to object to new sub-processors |
| Encryption requirements | Specifies algorithms and key management |
| Breach notification | SLA for notification (typically 24-72h) |
| Right to audit | Reserved right to audit vendor controls |
| Data return / destruction | At contract end, data returned and destroyed; written attestation |
| Region constraints | Where data may be processed and stored |

Sub-Processor Management

Maintain a register of every vendor with access to regulated data:

| Vendor | Service | Data accessed | DPA on file | SOC 2 Type II | Last review |
|---|---|---|---|---|---|
| Microsoft | Fabric platform | All | Yes | Yes | 2026-01 |
| Snowflake (mirror dest) | Read-only mirror | Gold aggregates | Yes | Yes | 2026-02 |
| ISV-X | Reporting connector | Read-only Confidential | Yes | Type I | 2026-02 |
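
If the register is kept as structured data, gaps can be surfaced automatically ahead of an audit. The sketch below is one possible shape: the field names, the encoding of the SOC 2 column as "Type II" or "Type I", and the 365-day review window are all assumptions made for illustration, and `register_findings` is a hypothetical helper.

```python
from datetime import date


def parse_month(value: str) -> date:
    """Parse a YYYY-MM review stamp as the first day of that month."""
    year, month = value.split("-")
    return date(int(year), int(month), 1)


def register_findings(register: list, today: date, max_age_days: int = 365) -> list:
    """Return (vendor, problem) tuples for rows that need follow-up."""
    findings = []
    for row in register:
        if not row["dpa_on_file"]:
            findings.append((row["vendor"], "no DPA on file"))
        if row["soc2"] != "Type II":
            findings.append((row["vendor"], f"SOC 2 report is {row['soc2']}, not Type II"))
        if (today - parse_month(row["last_review"])).days > max_age_days:
            findings.append((row["vendor"], "review overdue"))
    return findings


# The three rows from the register above, with "Yes" encoded as "Type II".
REGISTER = [
    {"vendor": "Microsoft", "dpa_on_file": True, "soc2": "Type II", "last_review": "2026-01"},
    {"vendor": "Snowflake", "dpa_on_file": True, "soc2": "Type II", "last_review": "2026-02"},
    {"vendor": "ISV-X", "dpa_on_file": True, "soc2": "Type I", "last_review": "2026-02"},
]
```

Evaluated on the document's last-updated date, the only finding is the ISV-X row, whose report is Type I rather than Type II.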

Right to Audit

For high-criticality vendors, exercise the audit clause periodically; even a documentation review counts. Auditors will ask for evidence that this was done.


🎰 Casino Implementation

The casino domain handles NIGC MICS financial-reporting data, player PII, and CTR/SAR records. Exfiltration concerns:

| Data | Sensitivity | Exfil concern |
|---|---|---|
| Slot floor telemetry | Internal | Operational; low regulatory risk |
| Player loyalty / PII | Confidential | Identity-theft risk; reputational |
| Player gambling pattern | Confidential | Regulatory + reputational |
| CTR/SAR financial reports | Highly Confidential | Federal Title 31 (BSA/AML); legal exposure |
| W-2G filings | Confidential | IRS reporting accuracy |

Casino Configuration

| Control | Setting |
|---|---|
| OAP | Enabled. Allow stcasinoprod, stcasinoarchive. Cross-workspace ws_shared_gold (RO). |
| Sensitivity labels | All player PII auto-labeled Confidential; CTR/SAR auto-labeled Highly Confidential |
| Power BI export | Disabled for ws_casino_compliance; enabled with audit for ws_casino_analytics (non-PII aggregates) |
| Notebook download | Disabled for ws_casino_compliance and ws_casino_prod |
| Cross-tenant share | Blocked at tenant level; CTR/SAR labeled to require named-group encryption |
| DLP | Bulk-PII rule: 5+ SSN match → block. CTR-keyword: 3+ → block. |
| Floor staff role | Viewer-only; no export; no print; mobile-app device-compliance required |

Floor Staff: Zero Download Capability

The casino floor manager and surveillance team need real-time visibility but zero download capability. The pattern:

| Layer | Setting |
|---|---|
| Workspace role | Viewer |
| Sensitivity label | Internal (operational only); no PII in their reports |
| Export to Excel | Disabled (label) |
| Print | Disabled (label) |
| Subscriptions | Disabled (tenant) |
| Device | Conditional Access requires compliant managed device |
| Network | Conditional Access requires casino-floor IP range |
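The point of the layered pattern is that every layer must pass before a view is allowed, and that no combination of layers ever grants an egress action. A minimal sketch of that decision logic follows. It models the intent only and is not how Conditional Access or sensitivity labels are evaluated internally; the `Session` record, the action names, and the `10.20.0.0/16` floor range are hypothetical.

```python
import ipaddress
from dataclasses import dataclass

# Hypothetical casino-floor range; substitute the real named location.
FLOOR_NETWORK = ipaddress.ip_network("10.20.0.0/16")
# Actions the pattern never grants to floor staff, whatever the session looks like.
NEVER_GRANTED = frozenset({"export", "print", "subscribe", "download"})


@dataclass(frozen=True)
class Session:
    workspace_role: str
    device_compliant: bool
    source_ip: str


def is_permitted(session, action):
    """Allow only 'view', and only when role, device, and network all pass."""
    if action in NEVER_GRANTED or action != "view":
        return False
    return (
        session.workspace_role == "Viewer"
        and session.device_compliant
        and ipaddress.ip_address(session.source_ip) in FLOOR_NETWORK
    )
```

For example, `is_permitted(Session("Viewer", True, "10.20.5.9"), "view")` is true, while the same session asking for `"export"` is refused regardless of device or network state.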

🏛️ Federal Implementation

DOJ: Restricted Access, No Export

| Control | Setting |
|---|---|
| Workspace | ws_federal_doj only |
| OAP | Allow stfederaldoj only; no cross-workspace |
| Sensitivity label | All case data Highly Confidential - DOJ-Case (named-group encryption) |
| Power BI export | Disabled |
| Notebook download | Disabled |
| Connector allowlist | ADLS Gen2 only |
| Conditional Access | Federal-managed device + Gov network |

Tribal Health: HIPAA, Encryption + Watermark on Every Export

| Control | Setting |
|---|---|
| Workspace | ws_tribal_healthcare |
| OAP | Allow sthealthcareprod only; cross-workspace = none |
| Sensitivity label | All PHI auto-labeled Highly Confidential - PHI |
| Power BI export | Disabled by default. When an exception is approved (research use): MIP encryption mandatory, watermark with user identity, 7-day expiry |
| Notebook download | Disabled |
| DLP | PHI-pattern rule: block + alert |
| Audit retention | 6 years (HIPAA) |
| Business Associate Agreement | On file with Microsoft and any sub-processors |
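The export exception has three conditions that must hold together: encryption, an identity watermark, and a 7-day lifetime. The sketch below shows one way to represent an approved exception so that all three can be checked in one place. The `ExportException` record and its field names are hypothetical; applying the actual MIP encryption and watermark is out of scope here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

EXPORT_TTL = timedelta(days=7)  # "7-day expiry" from the table above


@dataclass(frozen=True)
class ExportException:
    user_principal_name: str
    approved_at: datetime  # timezone-aware approval timestamp
    mip_encrypted: bool

    @property
    def expires_at(self):
        return self.approved_at + EXPORT_TTL

    def watermark(self):
        """Text stamped on the export so a leaked copy identifies the exporter."""
        return f"{self.user_principal_name} | expires {self.expires_at:%Y-%m-%d}"

    def is_usable(self, now):
        """Usable only while encrypted and strictly before expiry."""
        return self.mip_encrypted and now < self.expires_at


if __name__ == "__main__":
    grant = ExportException(
        "analyst@example.org",
        datetime(2026, 4, 1, tzinfo=timezone.utc),
        mip_encrypted=True,
    )
    print(grant.watermark())
```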

SBA: Borrower PII, Opt-In for Sharing

| Control | Setting |
|---|---|
| Workspace | ws_federal_sba |
| OAP | Allow stfederalprod/sba only |
| Sensitivity label | Borrower data Confidential - SBA-Loan |
| Power BI export | Aggregates allowed; row-level borrower data blocked by DLP |
| Cross-agency share | Opt-in per-borrower consent flag in source data; default no-share |
| Audit | All access to borrower-PII rows logged with purpose-of-access |
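The "default no-share" rule matters most when the consent flag is missing or malformed. A minimal sketch of a filter that fails closed is shown below; the `share_consent` field name and the dictionary row shape are assumptions, since the source schema is not specified.

```python
def shareable_rows(rows, consent_field="share_consent"):
    """Keep only rows whose consent flag is explicitly the boolean True.

    Missing, None, False, or non-boolean values (for example the string
    "yes") are all treated as no-share, so bad data fails closed.
    """
    return [row for row in rows if row.get(consent_field) is True]


if __name__ == "__main__":
    borrowers = [
        {"borrower_id": 1, "share_consent": True},
        {"borrower_id": 2, "share_consent": False},
        {"borrower_id": 3},                          # flag missing
        {"borrower_id": 4, "share_consent": "yes"},  # malformed flag
    ]
    print([row["borrower_id"] for row in shareable_rows(borrowers)])
```

Using an identity check against `True` rather than a truthiness test is deliberate: a truthy string or non-zero integer from a dirty source column should not count as consent.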

USDA, NOAA, EPA, DOI: Tiered

| Agency | Notable nuance |
|---|---|
| USDA | Producer survey responses are confidential by statute; aggregate publication only |
| NOAA | Most data is public; protect business email and internal personnel data |
| EPA | Enforcement-related data Highly Confidential; publication data Internal |
| DOI | Tribal-trust resource data Highly Confidential; cultural-resource data restricted |

🚫 Anti-Patterns

| Anti-Pattern | Why It Hurts | What to Do Instead |
|---|---|---|
| Relying solely on RBAC | RBAC controls access, not egress. A user with read access can still exfiltrate. | Layer OAP + DLP + sensitivity labels on top of RBAC |
| Allowlisting "all internal storage" | Insider can stage to an under-monitored internal account, then retrieve from outside Fabric | OAP allowlist must be specific accounts + containers, not wildcards |
| Audit-only DLP forever | Audit tells you what happened; it doesn't stop it | Promote to warn within 30 days, block within 60 |
| No sensitivity labels | DLP and audit cannot prioritize; everything is "data" | Roll out labels before DLP; auto-labeling for known patterns |
| Power BI export enabled tenant-wide | The default-allow option for the most common exfil vector | Default-deny; allow per-workspace by request |
| Notebook download enabled in regulated workspaces | A single download = thousands of rows of data leakage | Disable at workspace policy; enforce in CI |
| Storage SAS tokens in notebooks | Notebook leak = durable credential leak | Workspace Identity only |
| Cross-tenant sharing default-allow | One click can move data to a tenant outside your control | Default-deny; allowlist by partner with DPA |
| No off-hours / volume monitoring | Bulk-exfil events look like normal sessions in real time | UEBA + KQL alerts on behavior anomalies |
| Audit logs stored in same workspace as data | Insider with workspace admin can erase their tracks | Forward to immutable storage in a separate trust boundary |
| Treating OAP as a checkbox | "OAP is on" is meaningless if everything is allowlisted | OAP rule review quarterly; default-deny posture verified |
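The "specific accounts + containers, not wildcards" rule lends itself to a lint step during the quarterly review. The sketch below assumes allowlist entries are written as `account/container` strings, as in `stfederalprod/sba`; that format and the `lint_allowlist` helper are illustrative assumptions, not the format Fabric stores OAP rules in.

```python
WILDCARD_CHARS = frozenset("*?")


def lint_allowlist(entries):
    """Return a list of human-readable problems; an empty list means clean."""
    problems = []
    for entry in entries:
        account, _, container = entry.partition("/")
        if WILDCARD_CHARS & set(entry):
            problems.append(f"{entry}: wildcard not allowed")
        elif not account or not container:
            problems.append(f"{entry}: must name both account and container")
    return problems


if __name__ == "__main__":
    for problem in lint_allowlist(["stfederalprod/sba", "st*", "stcasinoprod"]):
        print(problem)
```

Note that under this stricter reading an account-level entry such as `stcasinoprod` is flagged; teams that deliberately allow whole accounts would relax the second check.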

📋 Implementation Checklist

Before declaring "Data Exfiltration Prevention ready":

Network Layer

  • OAP enabled on every regulated workspace with default-deny
  • OAP allowlists reviewed quarterly with named owner
  • OAP smoke test passes in CI for every workspace deployment
  • Private Endpoints configured for all storage accounts
  • Workspace IP firewall restricts to corporate ranges
  • Managed VNet integrated where applicable
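An OAP smoke test in CI usually asserts two things: allowlisted destinations are still reachable, and at least one known non-allowlisted destination is refused. The sketch below separates that assertion logic from the probe itself. The `probe` callable is an assumption: in practice it would be something executed from inside the workspace (for example a notebook cell attempting a connection), which is not shown here, and the `fake_probe` stand-in exists only to make the example runnable.

```python
def run_smoke_test(probe, must_reach, must_block):
    """Return a list of failures; an empty list means the OAP posture held.

    probe(target) -> bool is supplied by the caller and should return True
    when an outbound connection to `target` succeeds from the workspace.
    """
    failures = []
    for target in must_reach:
        if not probe(target):
            failures.append(f"allowlisted target unreachable: {target}")
    for target in must_block:
        if probe(target):
            failures.append(f"egress NOT blocked: {target}")
    return failures


if __name__ == "__main__":
    # Stand-in probe simulating a correctly fenced workspace.
    allowlisted = {"stcasinoprod", "stcasinoarchive"}

    def fake_probe(target):
        return target in allowlisted

    failures = run_smoke_test(
        fake_probe,
        must_reach=["stcasinoprod"],
        must_block=["stpersonal-example"],  # hypothetical canary destination
    )
    raise SystemExit(1 if failures else 0)
```

Checking both directions matters: a test that only confirms the block can pass on a workspace with no working egress at all, and a test that only confirms reachability says nothing about default-deny.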

Identity Layer

  • Conditional Access enforces MFA for all users
  • Conditional Access enforces compliant managed device for regulated workspaces
  • Entra PIM gates privileged access (just-in-time)
  • Workspace Identity used for service-to-service authentication
  • Quarterly access review process running
  • B2B cross-tenant access default-deny with allowlist

Data Layer

  • Sensitivity labels deployed (Public, Internal, Confidential, Highly Confidential)
  • Auto-labeling rules in place for casino PII, CTR/SAR, PHI, federal CUI
  • Label inheritance verified through medallion (CI test)
  • CMK enabled for storage with rotation policy
  • OneLake security row/column controls in place for confidential tables
  • DLP policies promoted past audit-only mode
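The last item can be checked against the 30/60-day promotion schedule given in the anti-patterns table (warn within 30 days, block within 60). A minimal sketch follows; the mode names and the idea of tracking a per-policy deployment date are assumptions about how a team might record this, not fields exposed by a DLP product.

```python
from datetime import date

MODE_RANK = {"audit": 0, "warn": 1, "block": 2}


def minimum_mode(days_live):
    """Strictest mode a policy should have reached after `days_live` days."""
    if days_live >= 60:
        return "block"
    if days_live >= 30:
        return "warn"
    return "audit"


def is_overdue(mode, deployed_on, today):
    """True when the policy is still in a weaker mode than the schedule requires."""
    required = minimum_mode((today - deployed_on).days)
    return MODE_RANK[mode] < MODE_RANK[required]


if __name__ == "__main__":
    # A policy deployed 45 days ago and still audit-only is overdue for "warn".
    print(is_overdue("audit", date(2026, 1, 1), date(2026, 2, 15)))
```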

Application Layer

  • Power BI export disabled tenant-wide for Confidential and above
  • Notebook download disabled in regulated workspaces
  • Lakehouse Files download disabled in regulated containers
  • Print and forwarding restrictions on Highly Confidential labels
  • Power BI subscription to external email blocked
  • Cross-tenant sharing blocked by tenant policy

Audit & Detection Layer

  • Workspace Monitoring enabled with ≥ 12-month retention
  • Log Analytics retention ≥ 12 months for regulated workspaces (≥ 6 years for HIPAA)
  • Sentinel UEBA enabled
  • Sentinel analytic rules: bulk-export, off-hours, first-time-download, cross-tenant-share, OAP-block-burst
  • Data Activator Reflex on OAP block bursts
  • Logs forwarded to immutable storage in a separate trust boundary
  • Audit trail immutability controls implemented
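The retention items above reduce to a simple comparison once the configured retention per workspace is known. The sketch below assumes that inventory has already been collected into dictionaries with `name`, `retention_days`, and `hipaa` keys (a hypothetical shape), and approximates 12 months as 365 days and 6 years as 2190 days.

```python
REGULATED_MIN_DAYS = 365   # approximately 12 months
HIPAA_MIN_DAYS = 6 * 365   # approximately 6 years, ignoring leap days


def retention_gaps(workspaces):
    """Return (name, configured_days, required_days) for under-retained workspaces."""
    gaps = []
    for ws in workspaces:
        required = HIPAA_MIN_DAYS if ws["hipaa"] else REGULATED_MIN_DAYS
        if ws["retention_days"] < required:
            gaps.append((ws["name"], ws["retention_days"], required))
    return gaps


if __name__ == "__main__":
    inventory = [
        {"name": "ws_casino_prod", "retention_days": 365, "hipaa": False},
        {"name": "ws_tribal_healthcare", "retention_days": 730, "hipaa": True},
    ]
    for name, have, need in retention_gaps(inventory):
        print(f"{name}: {have} days configured, {need} required")
```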

Process & Governance

  • Pre-resignation monitoring policy reviewed by legal
  • Sub-processor register current with DPAs on file
  • DLP false-positive review monthly
  • Insider-threat tabletop exercise run annually
  • Workforce security awareness training completed
  • Exfiltration-incident runbook current and tested

