🔗 Data Sharing & Federation for Microsoft Fabric¶
Cross-Workspace, Cross-Organization, and Cross-Cloud Data Access Patterns
Last Updated: 2026-04-13 | Version: 1.0.0
📑 Table of Contents¶
- 🎯 Overview
- 🏗️ Architecture
- 🔗 Shortcut Patterns
- 📤 Fabric Data Sharing
- 🌐 External Federation
- 🔒 Security & Governance
- 🎰 Casino Industry Sharing
- 🏛️ Federal Agency Sharing
- 📊 Monitoring & Auditing
- ⚠️ Limitations
- 📚 References
🎯 Overview¶
Data sharing and federation in Microsoft Fabric let organizations provide governed access to data across workspace, organizational, and cloud boundaries without copying it. This guide covers OneLake shortcuts, Fabric data sharing, Iceberg endpoint federation, and external catalog integration patterns for casino gaming and federal agency workloads.
Sharing vs. Federation vs. Copying¶
| Approach | Data Movement | Governance | Latency | Cost |
|---|---|---|---|---|
| OneLake Shortcuts | None (zero-copy) | Source workspace controls access | Real-time | Lowest |
| Fabric Data Sharing | None (in-place) | Shared via permissions | Real-time | Low |
| Iceberg Endpoint | None (external read) | External catalog governs | Near-real-time | Low |
| Copy Activity | Full copy | Consumer owns copy | Batch latency | Highest |
| Mirroring | CDC replication | Source changes propagate | Near-real-time | Medium |
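The trade-offs above reduce to a short decision rule. As a sketch (an illustrative helper, not a Fabric API), the choice mirrors the decision framework that follows:

```python
def choose_sharing_approach(
    same_tenant: bool,
    same_workspace: bool = False,
    reads_open_formats: bool = False,
) -> str:
    """Pick a sharing approach from the trade-offs above (illustrative)."""
    if same_tenant and same_workspace:
        return "Direct table reference"
    if same_tenant:
        return "OneLake Shortcut or Fabric Data Sharing"
    if reads_open_formats:  # consumer engine can read Iceberg/Delta
        return "Iceberg Endpoint or Delta Sharing"
    return "Copy Activity or REST API export"

# Example: external partner with a Delta-capable engine
print(choose_sharing_approach(same_tenant=False, reads_open_formats=True))
# → Iceberg Endpoint or Delta Sharing
```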
Decision Framework¶
flowchart TB
Q1{"Is the consumer<br/>inside your Fabric tenant?"}
Q1 -->|Yes| Q2{"Same workspace?"}
Q1 -->|No| Q3{"Can they read<br/>Iceberg/Delta?"}
Q2 -->|Yes| A1["Direct table reference<br/>No sharing needed"]
Q2 -->|No| A2["OneLake Shortcut<br/>or Fabric Data Sharing"]
Q3 -->|Yes| A3["Iceberg Endpoint<br/>or Delta Sharing"]
Q3 -->|No| A4["Copy Activity<br/>or REST API export"]
style Q1 fill:#6C3483,stroke:#333,color:#fff
style Q2 fill:#2471A3,stroke:#333,color:#fff
style Q3 fill:#E67E22,stroke:#333,color:#fff
style A1 fill:#27AE60,stroke:#333,color:#fff
style A2 fill:#27AE60,stroke:#333,color:#fff
style A3 fill:#27AE60,stroke:#333,color:#fff
style A4 fill:#27AE60,stroke:#333,color:#fff
🏗️ Architecture¶
End-to-End Data Sharing Architecture¶
flowchart LR
subgraph Publishers["Data Publishers"]
P1["Casino Property<br/>Workspace"]
P2["Federal Agency<br/>Workspace"]
P3["External Partner<br/>Snowflake / Databricks"]
end
subgraph Sharing["Sharing Mechanisms"]
S1["OneLake<br/>Shortcuts"]
S2["Fabric Data<br/>Sharing"]
S3["Iceberg<br/>Endpoint"]
end
subgraph Consumers["Data Consumers"]
C1["Corporate Analytics<br/>Workspace"]
C2["Gaming Commission<br/>External"]
C3["Cross-Agency<br/>Dashboard"]
C4["External Catalog<br/>Unity / Glue"]
end
P1 -->|Internal| S1
P1 -->|Regulated| S2
P2 -->|Internal| S1
P2 -->|External| S3
P3 -->|Inbound| S1
S1 --> C1
S1 --> C3
S2 --> C2
S3 --> C4
style Publishers fill:#6C3483,stroke:#333,color:#fff
style Sharing fill:#2471A3,stroke:#333,color:#fff
style Consumers fill:#27AE60,stroke:#333,color:#fff
Sharing Layer Components¶
| Component | Role | Scope |
|---|---|---|
| OneLake | Unified storage layer for all Fabric items | Tenant-wide |
| Shortcuts | Virtual pointers to data in OneLake or external storage | Workspace-to-workspace, external clouds |
| Fabric Data Sharing | In-place sharing with granular permissions | Cross-workspace within tenant |
| Iceberg Endpoint | Open table format endpoint for external consumers | Cross-platform federation |
| SQL Analytics Endpoint | T-SQL read access to Lakehouse tables | BI tools, reporting |
🔗 Shortcut Patterns¶
Shortcut Types¶
| Shortcut Type | Source | Target | Data Movement | Authentication |
|---|---|---|---|---|
| Internal (OneLake) | Another Lakehouse in the same tenant | Current Lakehouse | None | Workspace RBAC |
| ADLS Gen2 | Azure Data Lake Storage Gen2 | Current Lakehouse | None | Service principal or Workspace identity |
| Amazon S3 | AWS S3 bucket | Current Lakehouse | None | IAM access key |
| Google Cloud Storage | GCS bucket | Current Lakehouse | None | Service account key |
| Dataverse | Dataverse tables | Current Lakehouse | None | Entra ID |
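The shortcut types in this table share one request shape that differs only in the `target` object. A generic payload builder can make that explicit — a sketch covering the three target types used in the patterns in this guide (`oneLake`, `adlsGen2`, `amazonS3`; field names mirror the REST payloads shown in those patterns):

```python
def build_shortcut_target(kind: str, **kwargs) -> dict:
    """Build the `target` object for a Fabric shortcut creation request.

    Illustrative helper; covers the three target types used in this guide.
    """
    if kind == "oneLake":
        # Internal shortcut: points at another item in the same tenant
        return {"oneLake": {
            "workspaceId": kwargs["workspace_id"],
            "itemId": kwargs["item_id"],
            "path": kwargs["path"],
        }}
    if kind == "adlsGen2":
        # External Azure storage: needs a pre-created connection
        return {"adlsGen2": {
            "location": f"https://{kwargs['account']}.dfs.core.windows.net",
            "subpath": kwargs["subpath"],
            "connectionId": kwargs["connection_id"],
        }}
    if kind == "amazonS3":
        # Cross-cloud: reads incur S3 egress charges (see Tip below)
        return {"amazonS3": {
            "location": f"https://{kwargs['bucket']}.s3.amazonaws.com",
            "subpath": kwargs["subpath"],
            "connectionId": kwargs["connection_id"],
        }}
    raise ValueError(f"Unsupported shortcut type: {kind}")
```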
Pattern 1: Internal OneLake Shortcut¶
Share Gold-layer tables from a tenant workspace to a corporate analytics workspace without copying data.
flowchart LR
subgraph Source["Source: ws-casino-vegas"]
G1["lh_gold_vegas<br/>📊 slot_performance_daily<br/>📊 revenue_by_denom<br/>📊 player_ltv"]
end
subgraph Target["Target: ws-casino-corporate"]
SC["lh_gold_corporate<br/>🔗 vegas/slot_performance_daily<br/>🔗 vegas/revenue_by_denom<br/>🔗 vegas/player_ltv"]
end
G1 -.->|OneLake Shortcut| SC
style Source fill:#6C3483,stroke:#333,color:#fff
style Target fill:#27AE60,stroke:#333,color:#fff
Creating an internal shortcut (REST API):
import requests
def create_internal_shortcut(
target_workspace_id: str,
target_lakehouse_id: str,
shortcut_name: str,
source_workspace_id: str,
source_lakehouse_id: str,
source_table_path: str,
headers: dict
) -> dict:
"""Create an internal OneLake shortcut."""
url = (
f"https://api.fabric.microsoft.com/v1"
f"/workspaces/{target_workspace_id}"
f"/items/{target_lakehouse_id}"
f"/shortcuts"
)
payload = {
"name": shortcut_name,
"path": "Tables",
"target": {
"oneLake": {
"workspaceId": source_workspace_id,
"itemId": source_lakehouse_id,
"path": f"Tables/{source_table_path}"
}
}
}
response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()
return response.json()
# Example: Share Vegas Gold tables to Corporate workspace.
# get_auth_headers() is a helper (defined elsewhere) that returns an
# Entra ID bearer-token header dict: {"Authorization": "Bearer <token>"}
create_internal_shortcut(
target_workspace_id="ws-casino-corporate-id",
target_lakehouse_id="lh-gold-corporate-id",
shortcut_name="vegas_slot_performance_daily",
source_workspace_id="ws-casino-vegas-id",
source_lakehouse_id="lh-gold-vegas-id",
source_table_path="slot_performance_daily",
headers=get_auth_headers()
)
Pattern 2: ADLS Gen2 Shortcut¶
Read data from an external Azure Data Lake Storage account without copying.
def create_adls_shortcut(
workspace_id: str,
lakehouse_id: str,
shortcut_name: str,
adls_account: str,
container: str,
path: str,
connection_id: str,
headers: dict
) -> dict:
"""Create an ADLS Gen2 shortcut."""
url = (
f"https://api.fabric.microsoft.com/v1"
f"/workspaces/{workspace_id}"
f"/items/{lakehouse_id}"
f"/shortcuts"
)
payload = {
"name": shortcut_name,
"path": "Tables",
"target": {
"adlsGen2": {
"location": f"https://{adls_account}.dfs.core.windows.net",
"subpath": f"/{container}/{path}",
"connectionId": connection_id
}
}
}
response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()
return response.json()
# Example: Shortcut to USDA open data in ADLS
create_adls_shortcut(
workspace_id="ws-federal-usda-id",
lakehouse_id="lh-bronze-usda-id",
shortcut_name="usda_crop_production_raw",
adls_account="stfederalopendata",
container="usda",
path="crop_production/delta",
connection_id="conn-usda-adls-id",
headers=get_auth_headers()
)
Pattern 3: Cross-Cloud Shortcuts (S3 / GCS)¶
# Amazon S3 shortcut for external partner data
def create_s3_shortcut(
workspace_id: str,
lakehouse_id: str,
shortcut_name: str,
s3_bucket: str,
s3_path: str,
connection_id: str,
headers: dict
) -> dict:
"""Create an S3 shortcut."""
payload = {
"name": shortcut_name,
"path": "Tables",
"target": {
"amazonS3": {
"location": f"https://{s3_bucket}.s3.amazonaws.com",
"subpath": f"/{s3_path}",
"connectionId": connection_id
}
}
}
url = (
f"https://api.fabric.microsoft.com/v1"
f"/workspaces/{workspace_id}"
f"/items/{lakehouse_id}"
f"/shortcuts"
)
response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()
return response.json()
# Example: Shortcut to partner gaming data in S3
create_s3_shortcut(
workspace_id="ws-casino-vegas-id",
lakehouse_id="lh-bronze-vegas-id",
shortcut_name="partner_loyalty_data",
s3_bucket="partner-gaming-data-prod",
s3_path="loyalty/delta",
connection_id="conn-partner-s3-id",
headers=get_auth_headers()
)
💡 Tip: Cross-cloud shortcuts (S3, GCS) incur cloud egress charges from the source cloud provider. Factor this into your cost model for high-volume data access patterns.
Shortcut Transformation Pattern¶
Shortcuts point to raw data, but consumers need transformed views. Combine shortcuts with PySpark notebooks for "shortcut transformations."
# Fabric notebook source
# MAGIC %md
# MAGIC # Shortcut Transformation: USDA Crop Production
# MAGIC Read from ADLS shortcut, transform, write to Silver Lakehouse
from pyspark.sql import functions as F

# Read from shortcut (Bronze)
raw_df = spark.read.format("delta").table("bronze.usda_crop_production_raw")
# Transform
silver_df = (
raw_df
.filter(F.col("year") >= 2020)
.withColumn("yield_per_acre", F.col("production") / F.col("area_harvested"))
.withColumn("_ingested_at", F.current_timestamp())
.dropDuplicates(["state_fips", "commodity_code", "year"])
)
# Write to Silver (not a shortcut — actual Delta table)
silver_df.write.format("delta").mode("overwrite").saveAsTable("silver.usda_crop_production")
📤 Fabric Data Sharing¶
What is Fabric Data Sharing?¶
Fabric Data Sharing enables in-place data sharing between workspaces within the same Fabric tenant. Unlike shortcuts, sharing creates a governed sharing contract with explicit permissions.
Sharing vs. Shortcuts¶
| Feature | Shortcuts | Data Sharing |
|---|---|---|
| Mechanism | Virtual pointer in OneLake | Sharing contract with permissions |
| Governance | Source workspace RBAC | Explicit share grants + audit trail |
| Discoverability | Must know source path | Shared items appear in consumer workspace |
| Permissions | Inherited from source workspace membership | Granular read/write grants |
| Audit | Standard OneLake access logs | Dedicated sharing audit events |
| Cross-tenant | Not supported | Future support planned |
Creating a Data Share¶
def create_data_share(
source_workspace_id: str,
source_item_id: str,
target_workspace_id: str,
share_name: str,
permissions: list[str],
headers: dict
) -> dict:
"""Create a Fabric Data Share."""
url = (
f"https://api.fabric.microsoft.com/v1"
f"/workspaces/{source_workspace_id}"
f"/items/{source_item_id}"
f"/shares"
)
payload = {
"displayName": share_name,
"targetWorkspace": {
"workspaceId": target_workspace_id
},
"permissions": permissions # e.g., ["Read", "ReadAll"]
}
response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()
return response.json()
# Share compliance data with gaming commission workspace
create_data_share(
source_workspace_id="ws-casino-vegas-id",
source_item_id="lh-gold-vegas-compliance-id",
target_workspace_id="ws-gaming-commission-id",
share_name="vegas-compliance-data",
permissions=["Read"],
headers=get_auth_headers()
)
Share Management Lifecycle¶
flowchart LR
C["Create Share<br/>Publisher defines<br/>scope + permissions"] --> A["Accept Share<br/>Consumer accepts<br/>into workspace"]
A --> U["Use Data<br/>Consumer reads<br/>via Lakehouse/SQL"]
U --> M["Monitor<br/>Audit access<br/>and usage"]
M --> R["Revoke<br/>Publisher removes<br/>access"]
style C fill:#6C3483,stroke:#333,color:#fff
style A fill:#2471A3,stroke:#333,color:#fff
style U fill:#27AE60,stroke:#333,color:#fff
style M fill:#E67E22,stroke:#333,color:#fff
style R fill:#6C3483,stroke:#333,color:#fff
Permission Scoping¶
| Permission | Access Level | Use Case |
|---|---|---|
| Read | Read specific tables in the share | General analytics consumers |
| ReadAll | Read all tables in the shared item | Data engineers needing full access |
| Write | Write back to shared tables | Bi-directional data exchange |
⚠️ Important: Start with the minimum permission level. Use Read with explicit table scoping for most consumers. Only grant ReadAll when the consumer needs access to all current and future tables.
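The least-privilege rule above can be encoded as a small gate in share-provisioning automation — an illustrative helper (not a Fabric API) that defaults to Read and only escalates to ReadAll on an explicit flag:

```python
def choose_permission(
    requested: set,
    available: set,
    needs_future_tables: bool = False,
) -> list:
    """Least-privilege permission choice for a Fabric data share.

    Illustrative helper: defaults to Read; ReadAll only when the consumer
    explicitly needs every current and future table.
    """
    unknown = requested - available
    if unknown:
        raise ValueError(f"Unknown tables requested: {sorted(unknown)}")
    if needs_future_tables:
        return ["ReadAll"]
    return ["Read"]

# Example: commission auditors need one compliance table only
print(choose_permission({"ctr_filings"}, {"ctr_filings", "sar_alerts"}))
# → ['Read']
```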
🌐 External Federation¶
Iceberg Endpoint¶
Fabric's OneLake Iceberg endpoint exposes Lakehouse Delta tables as Iceberg-compatible tables, enabling external engines (Spark, Trino, Snowflake, Databricks) to read Fabric data without copying.
flowchart LR
subgraph Fabric["Microsoft Fabric"]
LH["Lakehouse<br/>Delta Tables"]
ICE["Iceberg Endpoint<br/>OneLake"]
end
subgraph External["External Consumers"]
SNO["Snowflake<br/>External Catalog"]
DBR["Databricks<br/>Unity Catalog"]
TRI["Trino / Starburst<br/>Federated Query"]
CUS["Custom Spark<br/>Application"]
end
LH --> ICE
ICE --> SNO
ICE --> DBR
ICE --> TRI
ICE --> CUS
style Fabric fill:#6C3483,stroke:#333,color:#fff
style External fill:#27AE60,stroke:#333,color:#fff
Iceberg endpoint URL format:
https://onelake.dfs.fabric.microsoft.com/<workspace_id>/<lakehouse_id>/Tables/<table_name>
Reading a Fabric Lakehouse table from Databricks:
# In Databricks: read Fabric Lakehouse table via Iceberg endpoint
spark.read.format("iceberg").load(
"https://onelake.dfs.fabric.microsoft.com/"
"ws-federal-usda-id/lh-gold-usda-id/"
"Tables/crop_production_summary"
)
Configuring Snowflake External Catalog:
-- In Snowflake: create external catalog pointing to Fabric Iceberg endpoint
CREATE OR REPLACE CATALOG INTEGRATION fabric_catalog
CATALOG_SOURCE = ICEBERG_REST
TABLE_FORMAT = ICEBERG
CATALOG_URI = 'https://onelake.dfs.fabric.microsoft.com/<workspace_id>/<lakehouse_id>'
WAREHOUSE = 'COMPUTE_WH'
ENABLED = TRUE;
-- Query Fabric data from Snowflake
SELECT * FROM fabric_catalog.gold.slot_performance_daily
WHERE gaming_date >= '2026-01-01';
External Catalog Integration¶
| External Catalog | Integration Method | Authentication |
|---|---|---|
| Databricks Unity Catalog | Iceberg REST Catalog | OAuth / Service principal |
| AWS Glue | Iceberg endpoint + IAM | Cross-cloud IAM federation |
| Snowflake | External catalog integration | OAuth / Service principal |
| Trino / Starburst | Iceberg connector | Bearer token |
| Apache Hive Metastore | Iceberg endpoint | Kerberos / OAuth |
Delta Sharing (Outbound)¶
Use Delta Sharing to share Fabric data with external partners whose platforms support the open Delta Sharing protocol.
# Configure Delta Sharing provider in Fabric
delta_sharing_config = {
"provider": {
"name": "casino-partner-share",
"sharing_type": "OPEN", # or "MANAGED" for Databricks-to-Databricks
"authentication": "bearer_token",
"tables": [
{
"schema": "gold",
"name": "slot_performance_daily",
"partitions": ["gaming_date"],
"history_sharing": False # Don't share historical versions
},
{
"schema": "gold",
"name": "revenue_summary_monthly",
"partitions": ["report_month"],
"history_sharing": False
}
]
},
"recipients": [
{
"name": "gaming-commission",
"token_lifetime_seconds": 86400,
"allowed_tables": ["slot_performance_daily"]
},
{
"name": "partner-analytics",
"token_lifetime_seconds": 3600,
"allowed_tables": ["revenue_summary_monthly"]
}
]
}
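Each recipient in a configuration like the one above ultimately receives a `.share` profile file containing the endpoint and a bearer token. A minimal generator, sketched with stdlib only — the field names follow the open Delta Sharing protocol's profile format, while the endpoint and token values are placeholders:

```python
import json
from datetime import datetime, timedelta, timezone

def build_share_profile(endpoint: str, token: str, lifetime_seconds: int) -> str:
    """Render a Delta Sharing profile (.share file) for one recipient.

    Field names follow the Delta Sharing protocol's profile format;
    endpoint and token values here are placeholders.
    """
    expires = datetime.now(timezone.utc) + timedelta(seconds=lifetime_seconds)
    return json.dumps(
        {
            "shareCredentialsVersion": 1,
            "endpoint": endpoint,
            "bearerToken": token,
            "expirationTime": expires.strftime("%Y-%m-%dT%H:%M:%SZ"),
        },
        indent=2,
    )

# Example: 24-hour token for the gaming-commission recipient
print(build_share_profile(
    "https://sharing.example.com/delta-sharing",  # placeholder endpoint
    "<recipient-bearer-token>",
    86400,
))
```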
🔒 Security & Governance¶
Data Sharing Security Model¶
flowchart TB
subgraph Governance["Governance Layer"]
SL["Sensitivity Labels<br/>Confidential | Internal | Public"]
PV["Purview Policies<br/>Auto-classification"]
AU["Audit Log<br/>Access tracking"]
end
subgraph Sharing["Sharing Controls"]
RBAC["Workspace RBAC<br/>Admin | Member | Viewer"]
RLS["Row-Level Security<br/>Tenant filtering"]
OLS["OneLake Security<br/>Folder-level ACLs"]
CLS["Column-Level Security<br/>Mask sensitive fields"]
end
subgraph Data["Shared Data"]
PUB["Public Tables<br/>🌐 Open data"]
INT["Internal Tables<br/>🔒 Cross-workspace"]
CONF["Confidential Tables<br/>🔐 Restricted access"]
end
Governance --> Sharing
Sharing --> Data
style Governance fill:#6C3483,stroke:#333,color:#fff
style Sharing fill:#2471A3,stroke:#333,color:#fff
style Data fill:#27AE60,stroke:#333,color:#fff
Sensitivity Label Enforcement¶
| Label | Sharing Allowed | External Sharing | Iceberg Endpoint |
|---|---|---|---|
| Public | ✅ Any workspace | ✅ Allowed | ✅ Enabled |
| Internal | ✅ Same domain only | ❌ Blocked | ⚠️ Requires approval |
| Confidential | ⚠️ Named workspaces only | ❌ Blocked | ❌ Disabled |
| Highly Confidential | ❌ No sharing | ❌ Blocked | ❌ Disabled |
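The label matrix above is also useful as a lookup in provisioning scripts, so a share request can be rejected before it reaches an approver. A sketch — real enforcement lives in tenant and Purview policy, not in code like this:

```python
# The sensitivity-label matrix above, encoded as a lookup (illustrative).
LABEL_POLICY = {
    "Public": {"sharing": "any workspace", "external": True, "iceberg": "enabled"},
    "Internal": {"sharing": "same domain", "external": False, "iceberg": "requires approval"},
    "Confidential": {"sharing": "named workspaces", "external": False, "iceberg": "disabled"},
    "Highly Confidential": {"sharing": "none", "external": False, "iceberg": "disabled"},
}

def external_sharing_allowed(label: str) -> bool:
    """True only when the label permits sharing outside the tenant."""
    return LABEL_POLICY[label]["external"]

def iceberg_endpoint_state(label: str) -> str:
    """Iceberg endpoint posture for a given label."""
    return LABEL_POLICY[label]["iceberg"]

# Example: a Confidential table must never reach the Iceberg endpoint
print(iceberg_endpoint_state("Confidential"))
# → disabled
```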
Data Masking for Shared Data¶
-- Dynamic data masking on shared tables
-- Consumer sees masked PII, publisher sees full data
CREATE TABLE dbo.shared_player_profiles (
player_id INT NOT NULL,
first_name VARCHAR(50) MASKED WITH (FUNCTION = 'partial(1, "XXX", 0)'),
last_name VARCHAR(50) MASKED WITH (FUNCTION = 'partial(1, "XXX", 0)'),
email VARCHAR(100) MASKED WITH (FUNCTION = 'email()'),
ssn_hash VARCHAR(64) MASKED WITH (FUNCTION = 'default()'),
tier_status VARCHAR(20), -- Not masked: safe to share
lifetime_value DECIMAL(12,2) -- Not masked: aggregate metric
);
-- Grant unmask to specific principals
GRANT UNMASK ON dbo.shared_player_profiles TO [sg-casino-compliance-officers];
🎰 Casino Industry Sharing¶
Scenario: Multi-Property Revenue Sharing with Gaming Commission¶
A multi-property casino operator must share regulated compliance data with the state gaming commission while sharing operational KPIs with corporate headquarters.
flowchart TB
subgraph Properties["Casino Properties"]
V["ws-casino-vegas<br/>🎰 Full operational data"]
A["ws-casino-atlantic<br/>🎰 Full operational data"]
end
subgraph Corporate["Corporate HQ"]
C["ws-casino-corporate<br/>📊 Aggregated KPIs only"]
end
subgraph Commission["Gaming Commission"]
GC["ws-gaming-commission<br/>📋 CTR + SAR + W-2G only"]
end
V -->|"Gold shortcuts<br/>(revenue, player KPIs)"| C
A -->|"Gold shortcuts<br/>(revenue, player KPIs)"| C
V -->|"Data Share<br/>(compliance tables only)"| GC
A -->|"Data Share<br/>(compliance tables only)"| GC
style Properties fill:#E67E22,stroke:#333,color:#fff
style Corporate fill:#6C3483,stroke:#333,color:#fff
style Commission fill:#2471A3,stroke:#333,color:#fff
Compliance data sharing configuration:
commission_share_config = {
"share_name": "vegas-gaming-commission-compliance",
"source_workspace": "ws-casino-vegas-prod",
"source_lakehouse": "lh_gold_vegas",
"shared_tables": [
{
"table": "ctr_filings",
"description": "Currency Transaction Reports (>$10,000)",
"columns_included": [
"filing_id", "filing_date", "transaction_amount",
"patron_name_masked", "property_code", "status"
],
"columns_excluded": [
"patron_ssn", "patron_address", "internal_notes"
],
"refresh_frequency": "daily"
},
{
"table": "sar_alerts",
"description": "Suspicious Activity Report alerts",
"columns_included": [
"alert_id", "alert_date", "alert_type",
"transaction_pattern", "amount_range", "status"
],
"columns_excluded": [
"patron_id", "investigation_notes", "officer_id"
],
"refresh_frequency": "daily"
},
{
"table": "w2g_records",
"description": "W-2G jackpot records (≥$1,200 slots)",
"columns_included": [
"record_id", "jackpot_date", "jackpot_amount",
"game_type", "machine_id", "property_code"
],
"columns_excluded": [
"patron_ssn", "patron_tin", "patron_address"
],
"refresh_frequency": "daily"
}
],
"access_controls": {
"allowed_principals": ["sg-gaming-commission-auditors"],
"sensitivity_label": "Confidential",
"audit_all_access": True,
"expiration_days": 365 # Annual renewal required
}
}
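A configuration of this shape is easy to sanity-check in CI before the share is provisioned. A sketch of such a validator — the schema it checks is this guide's illustration, not a Fabric API, and the PII screen is deliberately crude:

```python
def validate_shared_table(table_config: dict) -> list:
    """Sanity-check one shared-table entry from a compliance share config.

    Illustrative checks: included/excluded column lists must not overlap,
    and SSN/TIN-bearing columns must never appear in the included list.
    """
    errors = []
    included = set(table_config.get("columns_included", []))
    excluded = set(table_config.get("columns_excluded", []))
    overlap = included & excluded
    if overlap:
        errors.append(f"columns both included and excluded: {sorted(overlap)}")
    for col in included:
        if "ssn" in col or "tin" in col:  # crude PII screen, for illustration
            errors.append(f"PII-bearing column must not be shared: {col}")
    return errors

# Example: a misconfigured entry is flagged before provisioning
bad = {"columns_included": ["patron_ssn", "status"], "columns_excluded": ["status"]}
print(validate_shared_table(bad))
```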
🏛️ Federal Agency Sharing¶
Scenario 1: EPA → NOAA Cross-Agency Data Sharing via Shortcuts¶
EPA shares air quality monitoring data with NOAA for climate correlation analysis using OneLake shortcuts.
flowchart LR
subgraph EPA["ws-federal-epa"]
EPA_G["lh_gold_epa<br/>📊 air_quality_daily<br/>📊 monitoring_stations<br/>📊 pollutant_trends"]
end
subgraph NOAA["ws-federal-noaa"]
NOAA_B["lh_bronze_noaa<br/>🔗 epa_air_quality_daily<br/>(shortcut)"]
NOAA_S["lh_silver_noaa<br/>Joined: Weather + Air Quality"]
NOAA_G["lh_gold_noaa<br/>Climate-AQ Correlation"]
end
EPA_G -.->|OneLake Shortcut| NOAA_B
NOAA_B --> NOAA_S
NOAA_S --> NOAA_G
style EPA fill:#6C3483,stroke:#333,color:#fff
style NOAA fill:#2471A3,stroke:#333,color:#fff
Implementation:
# Create shortcut from EPA Gold to NOAA Bronze
create_internal_shortcut(
target_workspace_id="ws-federal-noaa-id",
target_lakehouse_id="lh-bronze-noaa-id",
shortcut_name="epa_air_quality_daily",
source_workspace_id="ws-federal-epa-id",
source_lakehouse_id="lh-gold-epa-id",
source_table_path="air_quality_daily",
headers=get_auth_headers()
)
# NOAA Silver notebook: join weather + EPA air quality
from pyspark.sql import functions as F

weather_df = spark.read.format("delta").table("bronze.weather_station_observations")
air_quality_df = spark.read.format("delta").table("bronze.epa_air_quality_daily")  # Via shortcut
# Correlate weather patterns with air quality
correlated_df = (
weather_df.alias("w")
.join(
air_quality_df.alias("aq"),
on=[
F.col("w.station_state") == F.col("aq.state_code"),
F.col("w.observation_date") == F.col("aq.measurement_date")
],
how="inner"
)
.select(
"w.station_id", "w.observation_date", "w.temperature_avg",
"w.humidity_avg", "w.wind_speed_avg",
"aq.pm25_concentration", "aq.ozone_level", "aq.aqi_value"
)
)
correlated_df.write.format("delta").mode("overwrite").saveAsTable(
"silver.weather_air_quality_correlation"
)
Scenario 2: DOI Inter-Bureau Data Sharing¶
The Department of the Interior shares data between bureaus (BLM, NPS, USFWS, USGS) using per-bureau workspaces with shared shortcuts.
# DOI inter-bureau sharing matrix
doi_sharing_matrix = {
"blm": {
"shares_with": ["nps", "usfws"],
"shared_tables": [
"land_management_areas",
"grazing_allotments",
"mineral_leases"
]
},
"nps": {
"shares_with": ["blm", "usfws", "usgs"],
"shared_tables": [
"park_boundaries",
"visitor_statistics",
"wildlife_observations"
]
},
"usfws": {
"shares_with": ["blm", "nps"],
"shared_tables": [
"endangered_species_locations",
"habitat_assessments",
"wildlife_refuge_boundaries"
]
},
"usgs": {
"shares_with": ["blm", "nps", "usfws"],
"shared_tables": [
"geological_surveys",
"water_monitoring_stations",
"seismic_observations"
]
}
}
# Automate shortcut creation for inter-bureau sharing
def provision_doi_shortcuts(sharing_matrix: dict, headers: dict):
"""Create all inter-bureau shortcuts based on sharing matrix."""
for source_bureau, config in sharing_matrix.items():
for target_bureau in config["shares_with"]:
for table in config["shared_tables"]:
create_internal_shortcut(
target_workspace_id=f"ws-doi-{target_bureau}-id",
target_lakehouse_id=f"lh-bronze-{target_bureau}-id",
shortcut_name=f"{source_bureau}_{table}",
source_workspace_id=f"ws-doi-{source_bureau}-id",
source_lakehouse_id=f"lh-gold-{source_bureau}-id",
source_table_path=table,
headers=headers
)
print(f" ✅ {source_bureau}.{table} → {target_bureau}")
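Before provisioning, it helps to expand the same matrix into a flat plan that reviewers can inspect — a dry-run sketch with no API calls:

```python
def plan_doi_shortcuts(sharing_matrix: dict) -> list:
    """Dry-run expansion of a sharing matrix into (source, target, table)
    tuples, for review before any shortcuts are actually created."""
    return [
        (source, target, table)
        for source, config in sharing_matrix.items()
        for target in config["shares_with"]
        for table in config["shared_tables"]
    ]

# Example with a one-bureau matrix
mini = {"blm": {"shares_with": ["nps"], "shared_tables": ["land_management_areas"]}}
print(plan_doi_shortcuts(mini))
# → [('blm', 'nps', 'land_management_areas')]
```

Applied to the full matrix above, this yields 30 planned shortcuts (BLM 2×3, NPS 3×3, USFWS 2×3, USGS 3×3).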
Scenario 3: Federal Open Data Publishing¶
Agencies publish public datasets via a shared open data workspace with Iceberg endpoints for external consumers.
# Federal open data publishing configuration
open_data_config = {
"workspace": "ws-federal-opendata",
"lakehouse": "lh_opendata_public",
"published_datasets": [
{
"agency": "USDA",
"table": "crop_production_annual",
"description": "Annual crop production statistics by state and commodity",
"update_frequency": "annual",
"license": "Public Domain (US Government Work)",
"iceberg_enabled": True
},
{
"agency": "NOAA",
"table": "weather_normals_30yr",
"description": "30-year climate normals by weather station",
"update_frequency": "decennial",
"license": "Public Domain",
"iceberg_enabled": True
},
{
"agency": "EPA",
"table": "air_quality_index_daily",
"description": "Daily AQI readings for all monitoring stations",
"update_frequency": "daily",
"license": "Public Domain",
"iceberg_enabled": True
}
],
"access": {
"authentication": "anonymous_read", # Public data
"rate_limit": "1000 requests/hour",
"format": "Iceberg (Parquet + metadata)"
}
}
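A configuration of this shape can also drive the public catalog page itself. A sketch that renders the published datasets as a Markdown table (field names follow the config illustration above):

```python
def render_dataset_catalog(config: dict) -> str:
    """Render published datasets as a Markdown table for a catalog page.

    Field names follow the open-data config sketch above (illustrative).
    """
    rows = [
        "| Agency | Table | Update Frequency | License |",
        "|---|---|---|---|",
    ]
    for ds in config["published_datasets"]:
        rows.append(
            f"| {ds['agency']} | {ds['table']} "
            f"| {ds['update_frequency']} | {ds['license']} |"
        )
    return "\n".join(rows)
```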
📊 Monitoring & Auditing¶
Share Access Audit¶
// KQL query: Monitor data share access patterns
FabricSharingLogs
| where TimeGenerated > ago(7d)
| summarize
AccessCount = count(),
UniqueUsers = dcount(UserId),
DataTransferredMB = sum(BytesTransferred) / 1048576
by ShareName, SourceWorkspace, TargetWorkspace
| order by AccessCount desc
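The same aggregation can be reproduced offline in Python against exported log records, which is handy for testing alert thresholds before wiring up KQL — a sketch with illustrative field names:

```python
from collections import defaultdict

def summarize_share_access(logs: list) -> dict:
    """Offline equivalent of the KQL summarize above: access count,
    distinct users, and MB transferred per share (field names illustrative)."""
    acc = defaultdict(lambda: {"access_count": 0, "users": set(), "bytes": 0})
    for rec in logs:
        entry = acc[rec["share_name"]]
        entry["access_count"] += 1
        entry["users"].add(rec["user_id"])
        entry["bytes"] += rec["bytes_transferred"]
    return {
        name: {
            "access_count": v["access_count"],
            "unique_users": len(v["users"]),
            "data_transferred_mb": round(v["bytes"] / 1048576, 2),
        }
        for name, v in acc.items()
    }
```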
Shortcut Health Monitoring¶
def check_shortcut_health(workspace_id: str, lakehouse_id: str, headers: dict) -> list:
"""Check the health of all shortcuts in a Lakehouse."""
url = (
f"https://api.fabric.microsoft.com/v1"
f"/workspaces/{workspace_id}"
f"/items/{lakehouse_id}"
f"/shortcuts"
)
    response = requests.get(url, headers=headers)
    response.raise_for_status()
    shortcuts = response.json().get("value", [])
health_report = []
for sc in shortcuts:
status = "HEALTHY"
if sc.get("status") == "Broken":
status = "BROKEN"
elif sc.get("status") == "Unauthorized":
status = "AUTH_FAILED"
health_report.append({
"name": sc["name"],
"type": sc["target"].get("type", "unknown"),
"status": status,
"last_accessed": sc.get("lastAccessedTime")
})
return health_report
⚠️ Limitations¶
| Limitation | Impact | Mitigation |
|---|---|---|
| Cross-tenant shortcuts | Cannot create shortcuts across Entra ID tenants | Use Iceberg endpoint or Delta Sharing for cross-org |
| Shortcut write access | Shortcuts are read-only | Write to the source directly or use Copy Activity |
| S3/GCS egress costs | Cross-cloud reads incur source cloud egress fees | Cache frequently accessed data or use scheduled copy |
| Iceberg metadata refresh | Iceberg endpoint metadata may lag by minutes | Use for analytics workloads, not real-time |
| Sensitivity label inheritance | Shared data inherits source sensitivity label | May prevent downstream sharing if label is too restrictive |
| Shortcut depth | Cannot create shortcuts of shortcuts (one level only) | Design direct shortcut paths from source to consumer |
| Row-level security | RLS on source is not automatically enforced on shortcuts | Implement RLS in consumer workspace or use Data Sharing |
📚 References¶
Microsoft Documentation¶
- OneLake shortcuts overview
- Create ADLS Gen2 shortcuts
- Create S3 shortcuts
- OneLake Iceberg support
- Fabric data sharing
- OneLake security
- Sensitivity labels in Fabric
- Delta Sharing protocol
Related Documents¶
- Multi-Tenant Workspace Architecture — Workspace topology patterns
- OneLake Security — Folder-level access controls
- Data Governance Deep Dive — Classification, RLS, compliance
- Iceberg Interoperability — Iceberg endpoint details
- Migration Patterns — Source-to-Fabric migration playbook
- Outbound Access Protection — Network-level sharing controls