🔗 Tutorial 18: Data Sharing & OneLake Shortcuts¶

Last Updated: 2026-04-15 | Version: 2.0 | Status: ✅ Final | Maintainer: Documentation Team

Third-party references — publicly sourced, good-faith comparison

This page references non-Microsoft products and services. That information is drawn from each vendor's publicly available documentation and is offered for honest, good-faith comparison only. This is a personal project written from a Microsoft Fabric and Azure perspective; it does not claim expertise in, or authority over, any third-party product, and nothing here is an official statement by, or endorsed by, those vendors. Capabilities, pricing, and features change often — always verify against the vendor's current official documentation. Where a third-party offering is the stronger choice, we say so plainly.


Difficulty	Intermediate
Time	1.5 hours
Focus	OneLake, Shortcuts, Cross-Workspace Sharing, Multi-Tenant Architecture

Progress Tracker¶

+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+
|   00    |   01    |   02    |   03    |   04    |   05    |   06    |   07    |   08    |   09    |
|  SETUP  | BRONZE  | SILVER  |  GOLD   |   RT    |   PBI   |  PIPES  |   GOV   | MIRROR  |  AI/ML  |
+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+
|   [x]   |   [x]   |   [x]   |   [x]   |   [x]   |   [x]   |   [x]   |   [x]   |   [x]   |   [x]   |
+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+

+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+
|   10    |   11    |   12    |   13    |   14    |   15    |   16    |   17    |   18    |   19    |
|TERADATA |   SAS   |  CI/CD  |PLANNING | SECURITY|  COST   |  PERF   | MONITOR | SHARING | COPILOT |
+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+
|   [x]   |   [x]   |   [x]   |   [x]   |   [x]   |   [x]   |   [x]   |   [x]   |  [**]   |   [ ]   |
+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+
                                                                         ^
                                                                    YOU ARE HERE

Navigation
Previous	17-Monitoring & Alerting
Next	19-Fabric Copilot & AI Features

🎯 Overview¶

Figure: OneLake Architecture (source: What is OneLake?)

This tutorial explores OneLake shortcuts and data sharing patterns in Microsoft Fabric. You will learn how to share data across workspaces, integrate with external data sources (ADLS Gen2, Amazon S3, Google Cloud Storage), and implement multi-tenant data architectures for enterprise scenarios.

OneLake serves as Microsoft Fabric's unified data lake, providing a single logical storage layer across all Fabric workloads. Shortcuts enable data virtualization, allowing you to reference data from internal Fabric locations or external cloud storage without data duplication.

Key Concepts¶

Concept	Description
OneLake	Unified data lake for all Fabric workloads with hierarchical namespace
Shortcuts	Virtual references to data in other locations without copying
Internal Shortcuts	References to data within Fabric (cross-workspace)
External Shortcuts	References to external cloud storage (ADLS, S3, GCS)
Data Virtualization	Access data in place without movement or duplication

📚 Learning Objectives¶

By the end of this tutorial, you will be able to:

🏗️ Architecture Overview¶

OneLake Unified Data Lake Architecture¶

flowchart TB
    subgraph OneLake["OneLake - Unified Data Lake"]
        subgraph Tenant["Fabric Tenant"]
            subgraph WS1["Workspace: Vegas Casino"]
                LH1[(Lakehouse<br/>Player Data)]
                LH2[(Lakehouse<br/>Slot Telemetry)]
            end

            subgraph WS2["Workspace: Atlantic City Casino"]
                LH3[(Lakehouse<br/>Player Data)]
                LH4[(Lakehouse<br/>Table Games)]
            end

            subgraph WS3["Workspace: Enterprise Analytics"]
                LH5[(Lakehouse<br/>Consolidated View)]
            end
        end
    end

    subgraph External["External Data Sources"]
        ADLS[(Azure Data Lake<br/>Storage Gen2)]
        S3[(Amazon S3)]
        GCS[(Google Cloud<br/>Storage)]
        DV[(Dataverse)]
    end

    %% Internal Shortcuts
    LH1 -.->|Internal Shortcut| LH5
    LH2 -.->|Internal Shortcut| LH5
    LH3 -.->|Internal Shortcut| LH5
    LH4 -.->|Internal Shortcut| LH5

    %% External Shortcuts
    ADLS -.->|External Shortcut| LH1
    S3 -.->|External Shortcut| LH2
    GCS -.->|External Shortcut| LH3
    DV -.->|Dataverse Shortcut| LH4

    style OneLake fill:#1a1a2e,color:#fff
    style WS1 fill:#4a90d9,color:#fff
    style WS2 fill:#4a90d9,color:#fff
    style WS3 fill:#16a085,color:#fff
    style External fill:#8e44ad,color:#fff

Shortcut Types and Use Cases¶

flowchart LR
    subgraph ShortcutTypes["Shortcut Types"]
        direction TB
        INT[Internal Shortcuts]
        EXT[External Shortcuts]
        DV[Dataverse Shortcuts]
    end

    subgraph InternalUses["Internal Use Cases"]
        CROSS[Cross-Workspace<br/>Sharing]
        CENT[Centralized<br/>Analytics]
        DOMAIN[Domain<br/>Federation]
    end

    subgraph ExternalUses["External Use Cases"]
        ADLS_USE[ADLS Gen2<br/>Legacy Data]
        S3_USE[S3 Buckets<br/>Multi-Cloud]
        GCS_USE[GCS<br/>Partner Data]
    end

    subgraph DataverseUses["Dataverse Use Cases"]
        D365[Dynamics 365<br/>Integration]
        POWER[Power Platform<br/>Data]
        CRM[CRM/ERP<br/>Analytics]
    end

    INT --> CROSS
    INT --> CENT
    INT --> DOMAIN

    EXT --> ADLS_USE
    EXT --> S3_USE
    EXT --> GCS_USE

    DV --> D365
    DV --> POWER
    DV --> CRM

    style ShortcutTypes fill:#2c3e50,color:#fff
    style InternalUses fill:#27ae60,color:#fff
    style ExternalUses fill:#e67e22,color:#fff
    style DataverseUses fill:#9b59b6,color:#fff

📋 Prerequisites¶

Before starting this tutorial, ensure you have:

Note: External cloud storage is optional. You can complete the internal shortcuts sections without AWS, GCP, or Azure Storage accounts.

🛠️ Step 1: Understanding OneLake Architecture¶

1.1 OneLake Hierarchy¶

OneLake organizes data in a hierarchical structure:

OneLake (Tenant)
  |
  +-- Workspace (Vegas Casino)
  |     |
  |     +-- Lakehouse (player_analytics)
  |     |     |
  |     |     +-- Tables/
  |     |     |     +-- dim_players
  |     |     |     +-- fact_sessions
  |     |     |
  |     |     +-- Files/
  |     |           +-- raw/
  |     |           +-- processed/
  |     |
  |     +-- Lakehouse (slot_telemetry)
  |           |
  |           +-- Tables/
  |           +-- Files/
  |
  +-- Workspace (Atlantic City Casino)
        |
        +-- Lakehouse (...)

1.2 OneLake Path Format¶

OneLake uses a standardized path format for accessing data:

Component	Path Format	Example
Workspace	`/{workspace_name}`	`/Vegas_Casino`
Lakehouse	`/{workspace}/{lakehouse}.Lakehouse`	`/Vegas_Casino/player_analytics.Lakehouse`
Tables	`/{...}/Tables/{table_name}`	`/.../Tables/dim_players`
Files	`/{...}/Files/{path}`	`/.../Files/raw/data.parquet`

1.3 OneLake Endpoints¶

OneLake exposes multiple access endpoints:

Endpoint	URL Pattern	Use Case
DFS	`https://onelake.dfs.fabric.microsoft.com/{path}`	ADLS Gen2 compatible
Blob	`https://onelake.blob.fabric.microsoft.com/{path}`	Azure Blob compatible
Direct	`abfss://{workspace}@onelake.dfs.fabric.microsoft.com/{item}`	Spark/Notebook access

Example - Accessing OneLake from Spark:

# Access a lakehouse table directly
df = spark.read.format("delta").load(
    "abfss://Vegas_Casino@onelake.dfs.fabric.microsoft.com/player_analytics.Lakehouse/Tables/dim_players"
)

# Access files in OneLake
df_raw = spark.read.parquet(
    "abfss://Vegas_Casino@onelake.dfs.fabric.microsoft.com/player_analytics.Lakehouse/Files/raw/"
)
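
The path format and endpoint patterns above can be combined into a small helper. This is a minimal sketch: the function name `onelake_abfss_path` and its parameters are illustrative and not part of any Fabric SDK. It only formats strings following the `abfss://{workspace}@onelake.dfs.fabric.microsoft.com/{item}` pattern from the endpoint table.

```python
ONELAKE_DFS_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_abfss_path(workspace: str, lakehouse: str, section: str, relative_path: str = "") -> str:
    """Build an abfss:// URI for a lakehouse table or file path.

    Hypothetical helper: it only concatenates the pieces from the path table.
    """
    if section not in ("Tables", "Files"):
        raise ValueError("section must be 'Tables' or 'Files'")
    base = f"abfss://{workspace}@{ONELAKE_DFS_HOST}/{lakehouse}.Lakehouse/{section}"
    # Drop a leading slash so the pieces join with exactly one separator
    relative_path = relative_path.lstrip("/")
    return f"{base}/{relative_path}" if relative_path else base


table_uri = onelake_abfss_path("Vegas_Casino", "player_analytics", "Tables", "dim_players")
files_uri = onelake_abfss_path("Vegas_Casino", "player_analytics", "Files", "raw/")
print(table_uri)
print(files_uri)
```

The resulting strings can be passed to `spark.read.format("delta").load(...)` or `spark.read.parquet(...)` exactly as in the example above.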

🛠️ Step 2: Creating Internal Shortcuts (Cross-Workspace)¶

2.1 Scenario: Casino Property Data Federation¶

In this scenario, we have multiple casino properties with their own lakehouses, and we want to create a consolidated view in an enterprise analytics workspace.

flowchart TB
    subgraph Properties["Casino Properties"]
        direction LR
        VE[(Vegas<br/>Lakehouse)]
        AC[(Atlantic City<br/>Lakehouse)]
        NO[(New Orleans<br/>Lakehouse)]
    end

    subgraph Enterprise["Enterprise Analytics"]
        ENT[(Enterprise<br/>Lakehouse)]
        subgraph Shortcuts["Shortcuts"]
            S1[vegas_players]
            S2[ac_players]
            S3[no_players]
        end
    end

    VE -->|Shortcut| S1
    AC -->|Shortcut| S2
    NO -->|Shortcut| S3

    S1 --> ENT
    S2 --> ENT
    S3 --> ENT

    style Properties fill:#3498db,color:#fff
    style Enterprise fill:#27ae60,color:#fff
    style Shortcuts fill:#f39c12,color:#000

2.2 Creating an Internal Shortcut via UI¶

Navigate to your Enterprise Analytics workspace
Open the target Lakehouse where you want to create the shortcut
In the Lakehouse explorer, right-click on Tables or Files
Select New shortcut
Choose Microsoft OneLake as the source

Configuration:

Setting	Value
Source Type	Microsoft OneLake
Workspace	Vegas Casino (source workspace)
Lakehouse	player_analytics
Source Path	Tables/dim_players
Shortcut Name	vegas_dim_players

2.3 Creating Internal Shortcuts via Notebook¶

# Fabric Notebook: Create Internal Shortcuts Programmatically

from notebookutils import mssparkutils

# Define shortcut configurations
shortcuts = [
    {
        "name": "vegas_players",
        "source_workspace": "Vegas_Casino",
        "source_lakehouse": "player_analytics",
        "source_path": "Tables/dim_players"
    },
    {
        "name": "atlantic_city_players",
        "source_workspace": "Atlantic_City_Casino",
        "source_lakehouse": "player_analytics",
        "source_path": "Tables/dim_players"
    },
    {
        "name": "new_orleans_players",
        "source_workspace": "New_Orleans_Casino",
        "source_lakehouse": "player_analytics",
        "source_path": "Tables/dim_players"
    }
]

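# Get current workspace and lakehouse
# If this configuration key is not set in your runtime, supply the workspace ID directly
current_workspace = spark.conf.get("spark.fabric.workspace.id")
target_lakehouse = "enterprise_analytics"

# Create shortcuts using Fabric REST API
import requests

FABRIC_API = "https://api.fabric.microsoft.com/v1"

def get_headers() -> dict:
    """Build request headers with a bearer token for the Fabric REST API."""
    access_token = mssparkutils.credentials.getToken("pbi")
    return {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json"
    }

def resolve_workspace_id(workspace_name: str) -> str:
    """Look up a workspace ID by display name (reads the first page of results only)."""
    response = requests.get(f"{FABRIC_API}/workspaces", headers=get_headers())
    response.raise_for_status()
    for workspace in response.json().get("value", []):
        if workspace["displayName"] == workspace_name:
            return workspace["id"]
    raise ValueError(f"Workspace not found: {workspace_name}")

def resolve_lakehouse_id(workspace_id: str, lakehouse_name: str) -> str:
    """Look up a lakehouse item ID by display name within a workspace."""
    response = requests.get(
        f"{FABRIC_API}/workspaces/{workspace_id}/items",
        headers=get_headers(),
        params={"type": "Lakehouse"}
    )
    response.raise_for_status()
    for item in response.json().get("value", []):
        if item["displayName"] == lakehouse_name:
            return item["id"]
    raise ValueError(f"Lakehouse not found: {lakehouse_name}")

def create_internal_shortcut(
    target_workspace_id: str,
    target_lakehouse_name: str,
    shortcut_name: str,
    source_workspace: str,
    source_lakehouse: str,
    source_path: str
):
    """Create an internal OneLake shortcut under Tables/ in the target lakehouse."""

    # The shortcuts API identifies workspaces and items by ID, not by name.
    # The names in the configuration must match the display names exactly.
    target_lakehouse_id = resolve_lakehouse_id(target_workspace_id, target_lakehouse_name)
    source_workspace_id = resolve_workspace_id(source_workspace)
    source_lakehouse_id = resolve_lakehouse_id(source_workspace_id, source_lakehouse)

    # Build shortcut payload: "path" is the parent folder, "name" is the shortcut itself
    payload = {
        "path": "Tables",
        "name": shortcut_name,
        "target": {
            "oneLake": {
                "workspaceId": source_workspace_id,
                "itemId": source_lakehouse_id,
                "path": source_path
            }
        }
    }

    # API endpoint for creating shortcuts
    url = f"{FABRIC_API}/workspaces/{target_workspace_id}/items/{target_lakehouse_id}/shortcuts"

    response = requests.post(url, headers=get_headers(), json=payload)

    if response.status_code == 201:
        print(f"Shortcut '{shortcut_name}' created successfully")
        return True
    else:
        print(f"Failed to create shortcut '{shortcut_name}': {response.status_code} {response.text}")
        return False

# Create all shortcuts
for shortcut in shortcuts:
    create_internal_shortcut(
        target_workspace_id=current_workspace,
        target_lakehouse_name=target_lakehouse,
        shortcut_name=shortcut["name"],
        source_workspace=shortcut["source_workspace"],
        source_lakehouse=shortcut["source_lakehouse"],
        source_path=shortcut["source_path"]
    )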
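
Before looping over the definitions and calling the API, it can help to check the list for obvious mistakes. The following is a minimal sketch that runs without a Fabric connection. The function `validate_shortcuts` and its rules are assumptions made for illustration; the required keys simply mirror the `shortcuts` list used in this section.

```python
REQUIRED_KEYS = {"name", "source_workspace", "source_lakehouse", "source_path"}


def validate_shortcuts(definitions: list) -> list:
    """Return human-readable problems; an empty list means the definitions look consistent."""
    problems = []
    seen_names = set()
    for index, definition in enumerate(definitions):
        missing = REQUIRED_KEYS - set(definition)
        if missing:
            problems.append(f"entry {index}: missing keys {sorted(missing)}")
            continue
        name = definition["name"]
        if name in seen_names:
            # Two shortcuts with the same name would collide in the target lakehouse
            problems.append(f"entry {index}: duplicate shortcut name '{name}'")
        seen_names.add(name)
        if not definition["source_path"].startswith(("Tables/", "Files/")):
            problems.append(f"entry {index}: source_path should start with Tables/ or Files/")
    return problems


sample = [
    {"name": "vegas_players", "source_workspace": "Vegas_Casino",
     "source_lakehouse": "player_analytics", "source_path": "Tables/dim_players"},
    {"name": "vegas_players", "source_workspace": "Atlantic_City_Casino",
     "source_lakehouse": "player_analytics", "source_path": "Tables/dim_players"},
    {"name": "new_orleans_players", "source_workspace": "New_Orleans_Casino"},
]
issues = validate_shortcuts(sample)
for issue in issues:
    print(issue)
```

Running the check first keeps a half-finished batch of shortcuts out of the target lakehouse when one definition is malformed.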

2.4 Querying Data Through Shortcuts¶

Once shortcuts are created, query them like regular tables:

# Query consolidated player data from all properties
from pyspark.sql.functions import lit, col

# Read from each shortcut
df_vegas = spark.table("enterprise_analytics.vegas_players") \
    .withColumn("property", lit("Vegas"))

df_atlantic = spark.table("enterprise_analytics.atlantic_city_players") \
    .withColumn("property", lit("Atlantic City"))

df_neworleans = spark.table("enterprise_analytics.new_orleans_players") \
    .withColumn("property", lit("New Orleans"))

# Union all property data
df_all_players = df_vegas.unionByName(df_atlantic).unionByName(df_neworleans)

# Analyze across all properties
player_summary = df_all_players.groupBy("property", "loyalty_tier") \
    .count() \
    .orderBy("property", "loyalty_tier")

display(player_summary)

☁️ Step 3: External Shortcuts to Cloud Storage¶

3.1 ADLS Gen2 Shortcuts¶

Azure Data Lake Storage Gen2 shortcuts are ideal for:

- Migrating existing Azure data lakes to Fabric
- Accessing legacy data without movement
- Hybrid cloud scenarios

flowchart LR
    subgraph Azure["Azure"]
        ADLS[(ADLS Gen2<br/>Storage Account)]
        subgraph Containers["Containers"]
            RAW[raw-data]
            HIST[historical]
            VENDOR[vendor-data]
        end
    end

    subgraph Fabric["Microsoft Fabric"]
        LH[(Lakehouse)]
        subgraph Shortcuts["External Shortcuts"]
            S1[adls_raw]
            S2[adls_historical]
            S3[vendor_slots]
        end
    end

    RAW --> S1
    HIST --> S2
    VENDOR --> S3

    S1 --> LH
    S2 --> LH
    S3 --> LH

    style Azure fill:#0078d4,color:#fff
    style Fabric fill:#742774,color:#fff

Creating ADLS Gen2 Shortcut:

In Lakehouse, click New shortcut
Select Azure Data Lake Storage Gen2
Configure connection:

Setting	Value
URL	`https://youraccount.dfs.core.windows.net/`
Authentication	Organizational account OR Service Principal
Container	`raw-data`
Sub-path	`/casino/slot-telemetry/`
Shortcut Name	`adls_slot_data`
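
The same settings can also be expressed as a request body for the Fabric REST shortcut endpoint. This is a sketch under stated assumptions: it uses the `adlsGen2` target with `location`, `subpath`, and `connectionId` fields, and the connection ID shown is a placeholder for a Fabric cloud connection created beforehand. The helper name `build_adls_shortcut_payload` is hypothetical; verify the field names against the current API reference before relying on them.

```python
import json


def build_adls_shortcut_payload(shortcut_name, account_url, container, sub_path,
                                connection_id, parent_path="Files"):
    """Assemble a shortcut request body for an ADLS Gen2 source (illustrative helper)."""
    location = account_url.rstrip("/")
    # The subpath starts with the container, followed by the folder inside it
    parts = [part.strip("/") for part in (container, sub_path) if part.strip("/")]
    subpath = "/" + "/".join(parts)
    return {
        "path": parent_path,
        "name": shortcut_name,
        "target": {
            "adlsGen2": {
                "location": location,
                "subpath": subpath,
                "connectionId": connection_id,
            }
        },
    }


payload = build_adls_shortcut_payload(
    shortcut_name="adls_slot_data",
    account_url="https://youraccount.dfs.core.windows.net/",
    container="raw-data",
    sub_path="/casino/slot-telemetry/",
    connection_id="00000000-0000-0000-0000-000000000000",  # placeholder, not a real connection
)
print(json.dumps(payload, indent=2))
```

Keeping the payload construction separate from the HTTP call makes it easy to review what will be sent before any shortcut is created.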

3.2 Amazon S3 Shortcuts¶

S3 shortcuts enable multi-cloud data strategies:

# Example: Partner slot machine data in S3

# S3 Shortcut Configuration (via UI)
# URL: https://vendor-slot-data.s3.<region>.amazonaws.com
# Authentication: Access Key + Secret Key
# Path: /igt-machines/telemetry/
# Shortcut Name: igt_telemetry

Creating S3 Shortcut via UI:

In Lakehouse, click New shortcut
Select Amazon S3
Configure connection:

Setting	Value
URL	`https://vendor-slot-data.s3.<region>.amazonaws.com`
Authentication	Access Key and Secret Key
Access Key ID	`AKIA...`
Secret Access Key	`(stored securely)`
Path	`/igt-machines/telemetry/`
Shortcut Name	`igt_vendor_telemetry`

3.3 Google Cloud Storage Shortcuts¶

GCS shortcuts for Google Cloud integration:

Setting	Value
URL	`https://partner-analytics-bucket.storage.googleapis.com`
Authentication	HMAC key (access ID and secret)
Path	`/shared/player-insights/`
Shortcut Name	`gcs_partner_insights`

3.4 External Shortcut Best Practices¶

Practice	Description
Use Service Principals	Avoid personal accounts for production shortcuts
Implement Least Privilege	Grant only necessary permissions on source storage
Monitor Access	Enable logging on source storage accounts
Consider Latency	Cross-region shortcuts may have higher latency
Plan for Egress	External shortcuts incur egress charges from source clouds

🔌 Step 4: Dataverse Shortcuts¶

4.1 Dynamics 365 Integration¶

Dataverse shortcuts connect Fabric directly to Dynamics 365 and Power Platform data:

flowchart LR
    subgraph PowerPlatform["Power Platform"]
        DV[(Dataverse)]
        subgraph Tables["Tables"]
            ACCT[Account]
            CONT[Contact]
            OPPOR[Opportunity]
        end
    end

    subgraph Fabric["Microsoft Fabric"]
        LH[(Lakehouse)]
        subgraph DVShortcuts["Dataverse Shortcuts"]
            S1[dv_accounts]
            S2[dv_contacts]
            S3[dv_opportunities]
        end
    end

    ACCT --> S1
    CONT --> S2
    OPPOR --> S3

    S1 --> LH
    S2 --> LH
    S3 --> LH

    style PowerPlatform fill:#742774,color:#fff
    style Fabric fill:#0078d4,color:#fff

4.2 Creating Dataverse Shortcuts¶

In Lakehouse, click New shortcut
Select Dataverse
Authenticate with your Power Platform credentials
Select the environment containing your Dataverse tables
Choose the tables to shortcut

Example Configuration:

Setting	Value
Environment	Production - Contoso Casino
Table	account
Shortcut Name	dv_crm_accounts

4.3 Querying Dataverse Data¶

# Join Dataverse CRM data with Fabric analytics

# Read player data from Fabric
df_players = spark.table("gold.gold_player_360")

# Read CRM accounts from Dataverse shortcut
df_crm = spark.table("enterprise.dv_crm_accounts")

# Join to enrich player profiles with CRM data
df_enriched = df_players.join(
    df_crm,
    df_players.crm_account_id == df_crm.accountid,
    "left"
).select(
    df_players["*"],
    df_crm.name.alias("crm_company_name"),
    df_crm.revenue.alias("crm_annual_revenue"),
    df_crm.industrycode.alias("crm_industry")
)

display(df_enriched)

🔒 Step 5: Security and Access Control¶

5.1 Security Inheritance Model¶

Shortcuts inherit security from their source with additional controls:

flowchart TB
    subgraph Source["Source Security"]
        SRC_PERM[Source Permissions<br/>Read/Write/Execute]
        SRC_RLS[Source Row-Level<br/>Security]
    end

    subgraph Shortcut["Shortcut Security"]
        SC_PERM[Shortcut Permissions<br/>Inherit + Override]
        SC_ACCESS[Shortcut Access<br/>Control]
    end

    subgraph Consumer["Consumer Access"]
        USER_A[User A: Full Access]
        USER_B[User B: Read Only]
        USER_C[User C: No Access]
    end

    SRC_PERM --> SC_PERM
    SRC_RLS --> SC_ACCESS

    SC_PERM --> USER_A
    SC_ACCESS --> USER_B
    SC_ACCESS -.->|Denied| USER_C

    style Source fill:#27ae60,color:#fff
    style Shortcut fill:#f39c12,color:#000
    style Consumer fill:#3498db,color:#fff

5.2 Permission Requirements¶

Shortcut Type	Source Permission Required	Target Permission Required
Internal OneLake	Read on source item	Write on target lakehouse
ADLS Gen2	Storage Blob Data Reader	Write on target lakehouse
Amazon S3	s3:GetObject, s3:GetBucketLocation, s3:ListBucket	Write on target lakehouse
Google Cloud Storage	storage.objects.get, storage.objects.list	Write on target lakehouse
Dataverse	Read on Dataverse tables	Write on target lakehouse
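
For the Amazon S3 row, the source-side permission can be written as a read-only IAM policy scoped to one bucket and prefix. The sketch below generates such a policy document as a Python dictionary. The helper name `build_s3_read_policy` is hypothetical, and the inclusion of `s3:GetBucketLocation` is an assumption based on Fabric's S3 shortcut requirements; confirm the exact action list against current vendor documentation.

```python
import json


def build_s3_read_policy(bucket: str, prefix: str = "") -> dict:
    """Return a least-privilege, read-only IAM policy for one bucket prefix."""
    prefix = prefix.strip("/")
    bucket_arn = f"arn:aws:s3:::{bucket}"
    object_arn = f"{bucket_arn}/{prefix}/*" if prefix else f"{bucket_arn}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level actions apply to the bucket ARN itself
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": bucket_arn,
            },
            {
                # Object-level reads apply to keys under the prefix
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": object_arn,
            },
        ],
    }


policy = build_s3_read_policy("vendor-slot-data", "/igt-machines/telemetry/")
print(json.dumps(policy, indent=2))
```

Scoping the object statement to the prefix follows the least-privilege practice from Step 3.4: the shortcut can read the telemetry folder and nothing else in the bucket.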

5.3 Implementing Row-Level Security with Shortcuts¶

# Example: Implement RLS for multi-property access

# Create a security mapping table
security_mapping = [
    ("user1@casino.com", "Vegas"),
    ("user1@casino.com", "Atlantic City"),
    ("user2@casino.com", "Vegas"),
    ("user3@casino.com", "New Orleans"),
]

df_security = spark.createDataFrame(
    security_mapping,
    ["user_email", "allowed_property"]
)

# Save security mapping
df_security.write.mode("overwrite").saveAsTable("security.property_access")

# In Semantic Model, apply RLS using DAX
# Role: Property Access (table filter on 'All Players').
# A role filter must evaluate to TRUE or FALSE for each row:
# 'All Players'[property] IN
#     SELECTCOLUMNS(
#         FILTER(
#             'Property Access',
#             'Property Access'[user_email] = USERPRINCIPALNAME()
#         ),
#         "Property", 'Property Access'[allowed_property]
#     )
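
The effect of the mapping table can be checked in plain Python before the rule is configured in the semantic model. This is only a simulation of the filtering logic, written as a sketch: the `players` rows and the helper names are invented for illustration, while the mapping reuses the user and property values from the example above.

```python
security_mapping = [
    ("user1@casino.com", "Vegas"),
    ("user1@casino.com", "Atlantic City"),
    ("user2@casino.com", "Vegas"),
    ("user3@casino.com", "New Orleans"),
]


def allowed_properties(user_email: str, mapping: list) -> set:
    """Collect the properties a user is mapped to."""
    return {prop for email, prop in mapping if email == user_email}


def apply_rls(rows: list, user_email: str, mapping: list) -> list:
    """Keep only the rows whose property the user is allowed to see."""
    allowed = allowed_properties(user_email, mapping)
    return [row for row in rows if row["property"] in allowed]


# Hypothetical player rows, one per property
players = [
    {"player_id": 1, "property": "Vegas"},
    {"player_id": 2, "property": "Atlantic City"},
    {"player_id": 3, "property": "New Orleans"},
]

visible = apply_rls(players, "user2@casino.com", security_mapping)
print(visible)
```

A user with no mapping rows sees nothing, which matches the deny-by-default behavior expected from the role filter.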

5.4 Auditing Shortcut Access¶

# Query Fabric audit logs for shortcut access

# Via Power BI REST API or Fabric Admin API
# GET /admin/activityevents
# Filter: ActivityType eq 'CreateShortcut' or 'AccessShortcut'

# Example audit query
audit_events = """
SELECT
    activity_timestamp,
    user_name,
    activity_type,
    item_name,
    shortcut_source,
    result_status
FROM fabric_audit_logs
WHERE activity_type IN ('CreateShortcut', 'AccessShortcut', 'DeleteShortcut')
  AND activity_timestamp >= current_date - 30
ORDER BY activity_timestamp DESC
"""

6.1 Hub-and-Spoke Architecture¶

flowchart TB
    subgraph Hub["Central Data Hub"]
        MASTER[(Master Data<br/>Lakehouse)]
        REF[(Reference Data<br/>Lakehouse)]
        GOV[(Governance<br/>Catalog)]
    end

    subgraph Spoke1["Vegas Property"]
        VE_LH[(Vegas<br/>Lakehouse)]
        VE_SC1[master_shortcut]
        VE_SC2[ref_shortcut]
    end

    subgraph Spoke2["Atlantic City Property"]
        AC_LH[(Atlantic City<br/>Lakehouse)]
        AC_SC1[master_shortcut]
        AC_SC2[ref_shortcut]
    end

    subgraph Spoke3["New Orleans Property"]
        NO_LH[(New Orleans<br/>Lakehouse)]
        NO_SC1[master_shortcut]
        NO_SC2[ref_shortcut]
    end

    MASTER --> VE_SC1
    MASTER --> AC_SC1
    MASTER --> NO_SC1

    REF --> VE_SC2
    REF --> AC_SC2
    REF --> NO_SC2

    VE_SC1 --> VE_LH
    VE_SC2 --> VE_LH
    AC_SC1 --> AC_LH
    AC_SC2 --> AC_LH
    NO_SC1 --> NO_LH
    NO_SC2 --> NO_LH

    style Hub fill:#9b59b6,color:#fff
    style Spoke1 fill:#3498db,color:#fff
    style Spoke2 fill:#3498db,color:#fff
    style Spoke3 fill:#3498db,color:#fff

6.2 Data Mesh Pattern¶

flowchart TB
    subgraph Domain1["Player Domain"]
        P_PROD[(Player Data<br/>Product)]
        P_PORT[Player Data<br/>Port]
    end

    subgraph Domain2["Gaming Domain"]
        G_PROD[(Gaming Data<br/>Product)]
        G_PORT[Gaming Data<br/>Port]
    end

    subgraph Domain3["Financial Domain"]
        F_PROD[(Financial Data<br/>Product)]
        F_PORT[Financial Data<br/>Port]
    end

    subgraph Central["Self-Service Platform"]
        CATALOG[Data<br/>Catalog]
        DISCOVER[Discovery<br/>Portal]
    end

    subgraph Consumer["Consumer Workspace"]
        CONS_LH[(Analytics<br/>Lakehouse)]
        SC1[player_shortcut]
        SC2[gaming_shortcut]
        SC3[finance_shortcut]
    end

    P_PORT --> CATALOG
    G_PORT --> CATALOG
    F_PORT --> CATALOG

    CATALOG --> DISCOVER

    P_PROD --> SC1
    G_PROD --> SC2
    F_PROD --> SC3

    SC1 --> CONS_LH
    SC2 --> CONS_LH
    SC3 --> CONS_LH

    style Domain1 fill:#e74c3c,color:#fff
    style Domain2 fill:#27ae60,color:#fff
    style Domain3 fill:#3498db,color:#fff
    style Central fill:#f39c12,color:#000
    style Consumer fill:#9b59b6,color:#fff

6.3 Implementing Hub-and-Spoke¶

# Notebook: Configure Hub-and-Spoke Shortcuts

# Central hub publishes master data
# Spoke workspaces create shortcuts to access

# Hub workspace: enterprise-data-hub
# Hub lakehouse: master_data

# Spoke configuration
spokes = [
    {
        "workspace": "Vegas_Casino",
        "lakehouse": "property_analytics",
        "shortcuts": [
            {"name": "master_players", "source_path": "Tables/dim_player_master"},
            {"name": "master_games", "source_path": "Tables/dim_game_master"},
            {"name": "ref_calendar", "source_path": "Tables/dim_calendar"},
            {"name": "ref_geography", "source_path": "Tables/dim_geography"}
        ]
    },
    {
        "workspace": "Atlantic_City_Casino",
        "lakehouse": "property_analytics",
        "shortcuts": [
            {"name": "master_players", "source_path": "Tables/dim_player_master"},
            {"name": "master_games", "source_path": "Tables/dim_game_master"},
            {"name": "ref_calendar", "source_path": "Tables/dim_calendar"},
            {"name": "ref_geography", "source_path": "Tables/dim_geography"}
        ]
    }
]

# Deploy shortcuts to all spokes
hub_workspace = "Enterprise_Data_Hub"
hub_lakehouse = "master_data"

for spoke in spokes:
    print(f"Configuring shortcuts for {spoke['workspace']}...")
    for shortcut in spoke["shortcuts"]:
        # Create shortcut via API
        print(f"  Creating shortcut: {shortcut['name']}")
        # create_internal_shortcut(...)
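
The loop above prints the shortcut names but leaves the API call commented out. One way to bridge the gap is to first expand the spoke configuration into a flat deployment plan that can be reviewed or logged. The sketch below is an illustration only: `plan_spoke_shortcuts` and the keys of the plan entries are hypothetical, and the shortened `spokes` list is a trimmed copy of the configuration above.

```python
def plan_spoke_shortcuts(hub_workspace: str, hub_lakehouse: str, spokes: list) -> list:
    """Expand the spoke configuration into one entry per shortcut to create."""
    plan = []
    for spoke in spokes:
        for shortcut in spoke["shortcuts"]:
            plan.append({
                "target_workspace": spoke["workspace"],
                "target_lakehouse": spoke["lakehouse"],
                "shortcut_name": shortcut["name"],
                # Every spoke shortcut points back at the same hub lakehouse
                "source_workspace": hub_workspace,
                "source_lakehouse": hub_lakehouse,
                "source_path": shortcut["source_path"],
            })
    return plan


spokes = [
    {
        "workspace": "Vegas_Casino",
        "lakehouse": "property_analytics",
        "shortcuts": [
            {"name": "master_players", "source_path": "Tables/dim_player_master"},
            {"name": "ref_calendar", "source_path": "Tables/dim_calendar"},
        ],
    },
    {
        "workspace": "Atlantic_City_Casino",
        "lakehouse": "property_analytics",
        "shortcuts": [
            {"name": "master_players", "source_path": "Tables/dim_player_master"},
            {"name": "ref_calendar", "source_path": "Tables/dim_calendar"},
        ],
    },
]

plan = plan_spoke_shortcuts("Enterprise_Data_Hub", "master_data", spokes)
for entry in plan:
    print(f"{entry['target_workspace']}: {entry['shortcut_name']} <- {entry['source_path']}")
```

Each plan entry carries everything a shortcut-creation call needs, so the plan can be executed, diffed against a previous run, or stored as deployment evidence.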

Scenario	Pattern	Example
Vendor Data Ingestion	External shortcut from vendor storage	IGT slot machine telemetry from S3
Regulatory Reporting	Data copy to external destination	CTR reports to FinCEN ADLS
Partner Analytics	Cross-tenant sharing	Joint marketing analytics with hotel partner
Franchise Data	Multi-tenant with isolation	Franchise property data federation

7.2 Architecture for Vendor Data Integration¶

flowchart TB
    subgraph Vendors["Vendor Data Sources"]
        IGT[(IGT<br/>S3 Bucket)]
        AGS[(AGS<br/>ADLS Account)]
        SCI[(SciGames<br/>GCS Bucket)]
    end

    subgraph Landing["Fabric Landing Zone"]
        LZ_LH[(Landing Zone<br/>Lakehouse)]
        subgraph VendorShortcuts["Vendor Shortcuts"]
            SC_IGT[igt_telemetry]
            SC_AGS[ags_telemetry]
            SC_SCI[sci_telemetry]
        end
    end

    subgraph Processing["Data Processing"]
        BRONZE[(Bronze<br/>Lakehouse)]
        SILVER[(Silver<br/>Lakehouse)]
        GOLD[(Gold<br/>Lakehouse)]
    end

    IGT --> SC_IGT
    AGS --> SC_AGS
    SCI --> SC_SCI

    SC_IGT --> LZ_LH
    SC_AGS --> LZ_LH
    SC_SCI --> LZ_LH

    LZ_LH --> BRONZE
    BRONZE --> SILVER
    SILVER --> GOLD

    style Vendors fill:#e67e22,color:#fff
    style Landing fill:#f39c12,color:#000
    style Processing fill:#27ae60,color:#fff

7.3 Configuring Vendor Shortcuts¶

# Notebook: Configure Vendor Data Shortcuts

# Vendor 1: IGT (Amazon S3)
igt_config = {
    "name": "igt_slot_telemetry",
    "type": "s3",
    "url": "s3://igt-casino-data/",
    "path": "telemetry/2024/",
    "auth": {
        "type": "access_key",
        "key_vault_secret": "igt-s3-access-key"
    }
}

# Vendor 2: AGS (Azure Storage)
ags_config = {
    "name": "ags_game_data",
    "type": "adls_gen2",
    "url": "https://agspartnerdata.dfs.core.windows.net/",
    "container": "casino-export",
    "path": "game-analytics/",
    "auth": {
        "type": "service_principal",
        "key_vault_secret": "ags-sp-credentials"
    }
}

# Vendor 3: Scientific Games (GCS)
sci_config = {
    "name": "sci_lottery_data",
    "type": "gcs",
    "url": "gs://scigames-partner-bucket/",
    "path": "lottery/daily/",
    "auth": {
        "type": "service_account",
        "key_vault_secret": "sci-gcs-sa-key"
    }
}

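# Unified vendor data processing
# Note: standardize_telemetry(df, vendor) is not defined in this tutorial. It is
# assumed to map each vendor's columns onto a shared schema that includes the
# "vendor" and "date" columns used for partitioning below.
def process_vendor_data():
    """Process data from all vendor shortcuts."""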

    # Read from each vendor shortcut
    df_igt = spark.read.parquet("Tables/igt_slot_telemetry")
    df_ags = spark.read.parquet("Tables/ags_game_data")
    df_sci = spark.read.parquet("Tables/sci_lottery_data")

    # Standardize schemas
    df_igt_std = standardize_telemetry(df_igt, "IGT")
    df_ags_std = standardize_telemetry(df_ags, "AGS")
    df_sci_std = standardize_telemetry(df_sci, "SciGames")

    # Union all vendor data
    df_unified = df_igt_std.unionByName(df_ags_std).unionByName(df_sci_std)

    # Write to bronze layer
    df_unified.write \
        .mode("append") \
        .partitionBy("vendor", "date") \
        .saveAsTable("bronze.vendor_telemetry")

    return df_unified.count()
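
`standardize_telemetry` is called above without a definition. One plausible shape is a per-vendor column mapping, shown here on plain dictionaries so the logic can be run anywhere; a Spark version would apply the same renames with `withColumnRenamed`. Everything in this sketch is an assumption: the vendor column names, the target schema, and the helper `standardize_record` are invented for illustration.

```python
# Hypothetical source-to-target column names per vendor
VENDOR_COLUMN_MAPS = {
    "IGT": {"machine": "machine_id", "ts": "event_time", "coin_in_amt": "coin_in"},
    "AGS": {"device_id": "machine_id", "timestamp": "event_time", "wagered": "coin_in"},
}


def standardize_record(record: dict, vendor: str) -> dict:
    """Rename vendor-specific fields to the shared schema and add partition fields."""
    mapping = VENDOR_COLUMN_MAPS[vendor]
    standardized = {target: record[source] for source, target in mapping.items()}
    standardized["vendor"] = vendor
    # ISO timestamps start with YYYY-MM-DD, which serves as the date partition value
    standardized["date"] = standardized["event_time"][:10]
    return standardized


igt_row = {"machine": "SLOT-001", "ts": "2024-01-15T08:30:00", "coin_in_amt": 125.5}
ags_row = {"device_id": "AGS-77", "timestamp": "2024-01-15T09:00:00", "wagered": 40.0}

unified = [standardize_record(igt_row, "IGT"), standardize_record(ags_row, "AGS")]
for row in unified:
    print(row)
```

Because every standardized record ends up with the same keys, a name-based union such as `unionByName` can combine the vendor feeds without positional surprises.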

For sharing data with external partners:

# Export data to partner storage (outbound)
# Note: Use Data Pipelines for scheduled exports

# Example: Export regulatory reports to FinCEN
from pyspark.sql.functions import current_date

# Prepare CTR report
df_ctr = spark.sql("""
    SELECT
        transaction_id,
        player_id,
        transaction_date,
        transaction_amount,
        transaction_type,
        property_id
    FROM gold.regulatory_transactions
    WHERE transaction_amount >= 10000
      AND transaction_date >= current_date() - 1
""")

# Write to external partner storage
df_ctr.write \
    .mode("overwrite") \
    .parquet("abfss://fincen-reports@partnerdata.dfs.core.windows.net/ctr/daily/")
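
Writing with `mode("overwrite")` to a fixed `ctr/daily/` folder replaces the previous run's files. If each day's report needs to be retained, one option is to write into a dated subfolder. The helper below is a sketch; `daily_export_path` and the year/month/day folder layout are assumptions, not a FinCEN or Fabric requirement.

```python
from datetime import date


def daily_export_path(base_uri: str, report: str, run_date: date) -> str:
    """Build a dated output folder so each run lands in its own location."""
    return f"{base_uri.rstrip('/')}/{report}/daily/{run_date:%Y/%m/%d}/"


export_path = daily_export_path(
    "abfss://fincen-reports@partnerdata.dfs.core.windows.net",
    "ctr",
    date(2024, 1, 15),
)
print(export_path)
```

The returned string can replace the hard-coded path in the `.parquet(...)` call, in which case overwrite mode only affects the folder for that run date.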

🏢 Step 8: Multi-Tenant Data Architecture¶

8.1 Multi-Tenant Patterns¶

flowchart TB
    subgraph SharedInfra["Shared Infrastructure"]
        CAP[Fabric Capacity<br/>F64]
        SEC[Shared Security<br/>Policies]
    end

    subgraph Tenant1["Tenant: Vegas Corp"]
        T1_WS1[Vegas Casino<br/>Workspace]
        T1_WS2[Henderson Casino<br/>Workspace]
        T1_LH[(Shared Data<br/>Lakehouse)]
    end

    subgraph Tenant2["Tenant: Atlantic Gaming"]
        T2_WS1[AC Boardwalk<br/>Workspace]
        T2_WS2[AC Marina<br/>Workspace]
        T2_LH[(Shared Data<br/>Lakehouse)]
    end

    subgraph Tenant3["Tenant: Gulf Coast Casinos"]
        T3_WS1[NOLA Casino<br/>Workspace]
        T3_WS2[Biloxi Casino<br/>Workspace]
        T3_LH[(Shared Data<br/>Lakehouse)]
    end

    CAP --> Tenant1
    CAP --> Tenant2
    CAP --> Tenant3

    SEC --> Tenant1
    SEC --> Tenant2
    SEC --> Tenant3

    T1_WS1 -.-> T1_LH
    T1_WS2 -.-> T1_LH
    T2_WS1 -.-> T2_LH
    T2_WS2 -.-> T2_LH
    T3_WS1 -.-> T3_LH
    T3_WS2 -.-> T3_LH

    style SharedInfra fill:#9b59b6,color:#fff
    style Tenant1 fill:#e74c3c,color:#fff
    style Tenant2 fill:#3498db,color:#fff
    style Tenant3 fill:#27ae60,color:#fff

8.2 Tenant Isolation Strategies¶

Strategy	Description	Use Case
Workspace Isolation	Separate workspaces per tenant	Strong isolation, separate billing
Lakehouse Isolation	Separate lakehouses within shared workspace	Medium isolation, shared governance
Schema Isolation	Separate schemas/folders within lakehouse	Basic isolation, shared compute
RLS Isolation	Row-level security filtering	Same data, different views

8.3 Implementing Multi-Tenant Shortcuts¶

# Notebook: Multi-Tenant Data Federation

# Tenant configuration
tenants = [
    {
        "tenant_id": "T001",
        "name": "Vegas Corp",
        "workspaces": ["Vegas_Casino", "Henderson_Casino"],
        "shared_lakehouse": "vegas_corp_shared"
    },
    {
        "tenant_id": "T002",
        "name": "Atlantic Gaming",
        "workspaces": ["AC_Boardwalk", "AC_Marina"],
        "shared_lakehouse": "atlantic_gaming_shared"
    },
    {
        "tenant_id": "T003",
        "name": "Gulf Coast Casinos",
        "workspaces": ["NOLA_Casino", "Biloxi_Casino"],
        "shared_lakehouse": "gulf_coast_shared"
    }
]

def create_tenant_shortcuts(tenant: dict):
    """Create shortcuts for tenant data federation."""

    shared_lakehouse = tenant["shared_lakehouse"]

    for workspace in tenant["workspaces"]:
        # Create shortcut from property workspace to shared lakehouse
        shortcuts = [
            {
                "name": f"{workspace.lower()}_players",
                "source_workspace": workspace,
                "source_lakehouse": "property_analytics",
                "source_path": "Tables/dim_players"
            },
            {
                "name": f"{workspace.lower()}_transactions",
                "source_workspace": workspace,
                "source_lakehouse": "property_analytics",
                "source_path": "Tables/fact_transactions"
            }
        ]

        for sc in shortcuts:
            print(f"Creating shortcut: {sc['name']} in {shared_lakehouse}")
            # create_internal_shortcut(...)

# Create shortcuts for all tenants
for tenant in tenants:
    create_tenant_shortcuts(tenant)
    print(f"Configured shortcuts for tenant: {tenant['name']}")
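
When shortcuts are generated from configuration, a quick check that no tenant's shared lakehouse points at another tenant's workspace can catch isolation mistakes early. The sketch below works on plain dictionaries. The function `find_isolation_violations` and the `target_lakehouse` key on the planned shortcuts are hypothetical; the tenant entries are a trimmed copy of the configuration above. It applies to the tenant-level shared lakehouses only, not to the enterprise view in 8.4, which crosses tenants by design.

```python
def find_isolation_violations(tenants: list, planned_shortcuts: list) -> list:
    """Return names of shortcuts whose source and target belong to different tenants."""
    workspace_owner = {ws: t["tenant_id"] for t in tenants for ws in t["workspaces"]}
    lakehouse_owner = {t["shared_lakehouse"]: t["tenant_id"] for t in tenants}
    violations = []
    for shortcut in planned_shortcuts:
        source_tenant = workspace_owner.get(shortcut["source_workspace"])
        target_tenant = lakehouse_owner.get(shortcut["target_lakehouse"])
        # Unknown owners are treated as violations rather than silently allowed
        if source_tenant is None or target_tenant is None or source_tenant != target_tenant:
            violations.append(shortcut["name"])
    return violations


tenants = [
    {"tenant_id": "T001", "name": "Vegas Corp",
     "workspaces": ["Vegas_Casino", "Henderson_Casino"],
     "shared_lakehouse": "vegas_corp_shared"},
    {"tenant_id": "T002", "name": "Atlantic Gaming",
     "workspaces": ["AC_Boardwalk", "AC_Marina"],
     "shared_lakehouse": "atlantic_gaming_shared"},
]

planned = [
    {"name": "vegas_casino_players", "source_workspace": "Vegas_Casino",
     "target_lakehouse": "vegas_corp_shared"},
    {"name": "ac_marina_players", "source_workspace": "AC_Marina",
     "target_lakehouse": "vegas_corp_shared"},  # crosses tenants on purpose
]

violations = find_isolation_violations(tenants, planned)
print(violations)
```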

8.4 Cross-Tenant Analytics (Enterprise View)¶

# Create enterprise-wide view across all tenants

# Enterprise workspace: "Enterprise_Analytics"
# Enterprise lakehouse: "all_properties"

# Create shortcuts to each tenant's shared lakehouse
enterprise_shortcuts = [
    {"name": "vegas_corp", "source": "vegas_corp_shared"},
    {"name": "atlantic_gaming", "source": "atlantic_gaming_shared"},
    {"name": "gulf_coast", "source": "gulf_coast_shared"}
]

# Query across all tenants
enterprise_summary = spark.sql("""
    SELECT
        'Vegas Corp' as tenant,
        count(*) as player_count,
        sum(lifetime_value) as total_ltv
    FROM enterprise_analytics.vegas_corp.players

    UNION ALL

    SELECT
        'Atlantic Gaming' as tenant,
        count(*) as player_count,
        sum(lifetime_value) as total_ltv
    FROM enterprise_analytics.atlantic_gaming.players

    UNION ALL

    SELECT
        'Gulf Coast' as tenant,
        count(*) as player_count,
        sum(lifetime_value) as total_ltv
    FROM enterprise_analytics.gulf_coast.players
""")

display(enterprise_summary)
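
The three `SELECT` blocks above differ only in the tenant label and schema name, so the statement can be generated from a mapping as tenants are added. This is a sketch: `build_enterprise_summary_sql` is a hypothetical helper, and it assumes the labels and schema names are trusted constants from configuration, since they are interpolated directly into the SQL text.

```python
def build_enterprise_summary_sql(tenant_schemas: dict, lakehouse: str = "enterprise_analytics") -> str:
    """Generate one SELECT per tenant and join them with UNION ALL."""
    selects = [
        f"SELECT '{label}' AS tenant, count(*) AS player_count, "
        f"sum(lifetime_value) AS total_ltv\n"
        f"FROM {lakehouse}.{schema}.players"
        for label, schema in tenant_schemas.items()
    ]
    return "\n\nUNION ALL\n\n".join(selects)


tenant_schemas = {
    "Vegas Corp": "vegas_corp",
    "Atlantic Gaming": "atlantic_gaming",
    "Gulf Coast": "gulf_coast",
}

summary_sql = build_enterprise_summary_sql(tenant_schemas)
print(summary_sql)
```

The generated string can be passed to `spark.sql(...)` in place of the hand-written query.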

⚡ Step 9: Performance Optimization¶

9.1 Shortcut vs Copy Decision Matrix¶

Factor	Use Shortcut	Use Copy
Latency Sensitivity	Can tolerate added read latency	Latency-sensitive workloads
Data Freshness	Need real-time data	Can tolerate stale data
Query Frequency	Occasional queries	Heavy query workload
Data Volume	Large datasets (>1TB)	Small to medium datasets
Cross-Region	Same region	Cross-region
Transform Needs	None or light	Heavy transformation
Cost Optimization	Minimize storage	Minimize egress
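One way to apply the matrix consistently is a simple majority vote over its factors. The sketch below is illustrative only: the `Workload` class, the `recommend` function, and the equal weighting of the five factors are assumptions made for this example, not part of Fabric. Real decisions will usually weight cost and freshness differently.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    needs_fresh_data: bool       # must see source changes immediately
    heavy_query_load: bool       # queried constantly rather than occasionally
    larger_than_1tb: bool        # dataset exceeds roughly 1 TB
    cross_region: bool           # source lives in another region
    heavy_transformation: bool   # data is reshaped substantially before use


def recommend(workload: Workload) -> str:
    """Return 'shortcut' or 'copy' by majority vote over five matrix rows."""
    shortcut_votes = sum([
        workload.needs_fresh_data,
        not workload.heavy_query_load,
        workload.larger_than_1tb,
        not workload.cross_region,
        not workload.heavy_transformation,
    ])
    # Five equally weighted votes (an assumption), so a tie cannot occur.
    return "shortcut" if shortcut_votes >= 3 else "copy"


reference_data = Workload(False, True, False, True, False)
vendor_feed = Workload(True, False, True, False, False)
print(recommend(reference_data))
print(recommend(vendor_feed))
```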

9.2 Performance Considerations

flowchart TB
    subgraph Performance["Performance Factors"]
        LAT[Network Latency]
        BW[Bandwidth]
        CACHE[Caching]
        PART[Partitioning]
    end

    subgraph ShortcutPerf["Shortcut Performance"]
        SP1[Same-Region: Low Latency]
        SP2[Cross-Region: Higher Latency]
        SP3[External: Variable]
    end

    subgraph Optimization["Optimizations"]
        O1[Use Partition Pruning]
        O2[Filter Early in Queries]
        O3[Cache Frequently Accessed]
        O4[Monitor Egress Costs]
    end

    LAT --> SP1
    LAT --> SP2
    BW --> SP3

    SP1 --> O1
    SP2 --> O2
    SP3 --> O3
    SP3 --> O4

    style Performance fill:#3498db,color:#fff
    style ShortcutPerf fill:#27ae60,color:#fff
    style Optimization fill:#f39c12,color:#000

9.3 Query Optimization for Shortcuts

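# Optimize queries against shortcuts

from pyspark.sql.functions import col

# BASELINE: Table API with a filter. Spark evaluates lazily, so the filter
# reaches the scan, but files are skipped only if `date` is a partition
# column or Delta file statistics allow it. Otherwise every file is read.
df_table = spark.table("shortcuts.vendor_telemetry")
result = df_table.filter(col("date") == "2024-01-15").count()

# EQUIVALENT: Path-based read of the Delta table; the same lazy evaluation applies
df_path = spark.read \
    .format("delta") \
    .load("Tables/vendor_telemetry") \
    .filter(col("date") == "2024-01-15")
result = df_path.count()

# BETTER: Filter on partition columns so whole partitions are pruned
# (assumes the table is partitioned by date and vendor)
df_best = spark.sql("""
    SELECT count(*)
    FROM shortcuts.vendor_telemetry
    WHERE date = '2024-01-15'
      AND vendor = 'IGT'
""")

# Inspect the query plan to confirm which filters reach the scan
spark.sql("EXPLAIN EXTENDED SELECT * FROM shortcuts.vendor_telemetry WHERE date = '2024-01-15'").show(truncate=False)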
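Reading an EXPLAIN plan by eye gets tedious across many queries. The sketch below pulls the column names out of the `PartitionFilters` entry that Spark prints for file scans, so a notebook can assert that pruning is happening. The helper name `partition_filter_columns` and the sample plan text are hypothetical, and the plan format varies between Spark versions, so treat the regular expressions as an assumption to verify against real output.

```python
import re

# Matches the bracketed list after "PartitionFilters:" in a file scan node.
PARTITION_FILTERS = re.compile(r"PartitionFilters: \[([^\]]*)\]")
# Matches attribute references such as date#10 and captures the column name.
COLUMN_REF = re.compile(r"([A-Za-z_]\w*)#\d+")


def partition_filter_columns(plan_text):
    """Return the set of column names used in PartitionFilters entries."""
    columns = set()
    for filters in PARTITION_FILTERS.findall(plan_text):
        columns.update(COLUMN_REF.findall(filters))
    return columns


# Hypothetical plan fragment, shaped like Spark's physical plan output.
sample_plan = (
    "FileScan parquet [date#10,vendor#11] Batched: true, "
    "PartitionFilters: [isnotnull(date#10), (date#10 = 2024-01-15)], "
    "PushedFilters: [IsNotNull(vendor), EqualTo(vendor,IGT)]"
)
print(partition_filter_columns(sample_plan))
# In a notebook, the plan text can be taken from:
#   spark.sql("EXPLAIN SELECT ...").first()[0]
```

An empty result for a query that filters on `date` suggests that `date` is not a partition column of the underlying table.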

9.4 Caching Strategy

# Cache frequently accessed shortcut data

# Identify hot data patterns
hot_tables = [
    "shortcuts.master_players",
    "shortcuts.ref_calendar",
    "shortcuts.ref_geography"
]

# Create cached copies for hot data
for table in hot_tables:
    df = spark.table(table)

    # Write to local lakehouse as cached copy
    df.write \
        .mode("overwrite") \
        .format("delta") \
        .saveAsTable(f"cached.{table.split('.')[1]}")

    print(f"Cached {table} locally")

# Schedule cache refresh
# Use Data Pipeline to refresh cached tables daily
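A scheduled refresh does not need to rewrite every cached table on every run. The following sketch selects only the tables whose cached copy is older than a chosen maximum age. The function name `tables_needing_refresh`, the refresh log, and the 24-hour threshold are assumptions for illustration; in practice the timestamps might come from a small control table maintained by the pipeline.

```python
from datetime import datetime, timedelta, timezone


def tables_needing_refresh(last_refreshed, max_age, now=None):
    """Return cached tables that were never refreshed or are older than max_age."""
    now = now or datetime.now(timezone.utc)
    stale = []
    for table, refreshed_at in last_refreshed.items():
        if refreshed_at is None or now - refreshed_at > max_age:
            stale.append(table)
    return sorted(stale)


# Hypothetical refresh log; None means the table has never been cached.
now = datetime(2024, 1, 15, 6, 0, tzinfo=timezone.utc)
refresh_log = {
    "cached.master_players": now - timedelta(hours=30),
    "cached.ref_calendar": now - timedelta(hours=2),
    "cached.ref_geography": None,
}
stale_tables = tables_needing_refresh(refresh_log, timedelta(hours=24), now=now)
print(stale_tables)
```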

🏛️ Step 10: Governance for Shared Data

10.1 Governance Framework

flowchart TB
    subgraph Governance["Data Governance"]
        CATALOG[Data Catalog<br/>Purview]
        LINEAGE[Data Lineage]
        CLASSIFY[Data Classification]
        POLICY[Access Policies]
    end

    subgraph Shortcuts["Shortcut Governance"]
        SC_REG[Shortcut Registry]
        SC_AUDIT[Access Auditing]
        SC_LIFE[Lifecycle Mgmt]
    end

    subgraph Compliance["Compliance"]
        NIGC[Gaming Compliance<br/>NIGC MICS]
        PII[PII Protection]
        GDPR[GDPR / CCPA]
    end

    CATALOG --> SC_REG
    LINEAGE --> SC_REG
    CLASSIFY --> SC_AUDIT

    SC_REG --> NIGC
    SC_AUDIT --> PII
    SC_LIFE --> GDPR

    style Governance fill:#9b59b6,color:#fff
    style Shortcuts fill:#3498db,color:#fff
    style Compliance fill:#e74c3c,color:#fff

10.2 Shortcut Registry

Maintain a registry of all shortcuts for governance:

# Notebook: Shortcut Registry Management

from pyspark.sql.functions import current_timestamp, lit
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

# Shortcut registry schema
registry_schema = StructType([
    StructField("shortcut_id", StringType(), False),
    StructField("shortcut_name", StringType(), False),
    StructField("shortcut_type", StringType(), False),  # internal, adls, s3, gcs, dataverse
    StructField("target_workspace", StringType(), False),
    StructField("target_lakehouse", StringType(), False),
    StructField("source_location", StringType(), False),
    StructField("source_path", StringType(), True),
    StructField("owner", StringType(), False),
    StructField("data_classification", StringType(), False),
    StructField("created_date", TimestampType(), False),
    StructField("last_reviewed_date", TimestampType(), True),
    StructField("status", StringType(), False)  # active, deprecated, pending_review
])

# Register a new shortcut
def register_shortcut(
    shortcut_name: str,
    shortcut_type: str,
    target_workspace: str,
    target_lakehouse: str,
    source_location: str,
    source_path: str,
    owner: str,
    data_classification: str
):
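    """Register a shortcut in the governance registry."""

    import uuid
    from datetime import datetime

    # createDataFrame needs plain Python values in each row; current_timestamp()
    # returns a Column expression and cannot be placed in a row tuple.
    new_record = [(
        str(uuid.uuid4()),
        shortcut_name,
        shortcut_type,
        target_workspace,
        target_lakehouse,
        source_location,
        source_path,
        owner,
        data_classification,
        datetime.now(),  # created_date
        None,            # last_reviewed_date: not reviewed yet
        "active"
    )]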

    df_new = spark.createDataFrame(new_record, registry_schema)

    # Append to registry
    df_new.write.mode("append").saveAsTable("governance.shortcut_registry")

    print(f"Registered shortcut: {shortcut_name}")

# Query registry for compliance review
def get_shortcuts_for_review(days_since_review: int = 90):
    """Get shortcuts that need compliance review."""

    return spark.sql(f"""
        SELECT *
        FROM governance.shortcut_registry
        WHERE status = 'active'
          AND (last_reviewed_date IS NULL
               OR last_reviewed_date < current_date() - {days_since_review})
        ORDER BY created_date
    """)
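Because the registry is append-only, it helps to reject malformed records before they are written. The sketch below checks a record against the non-nullable fields and the allowed values listed in the schema comments. The function name `validate_registry_record` and the dictionary-based record format are assumptions for this example.

```python
# Allowed values taken from the comments in the registry schema.
ALLOWED_TYPES = {"internal", "adls", "s3", "gcs", "dataverse"}
ALLOWED_STATUSES = {"active", "deprecated", "pending_review"}
# Caller-supplied fields that the schema marks as non-nullable.
REQUIRED_FIELDS = (
    "shortcut_name", "shortcut_type", "target_workspace", "target_lakehouse",
    "source_location", "owner", "data_classification", "status",
)


def validate_registry_record(record):
    """Return a list of problems; an empty list means the record is acceptable."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            errors.append(f"missing required field: {field}")

    shortcut_type = record.get("shortcut_type")
    if shortcut_type and shortcut_type not in ALLOWED_TYPES:
        errors.append(f"unknown shortcut_type: {shortcut_type}")

    status = record.get("status")
    if status and status not in ALLOWED_STATUSES:
        errors.append(f"unknown status: {status}")
    return errors


# Hypothetical record with two problems: no owner and an unsupported type.
candidate = {
    "shortcut_name": "adls_vendor_igt_telemetry",
    "shortcut_type": "ftp",
    "target_workspace": "Enterprise_Analytics",
    "target_lakehouse": "all_properties",
    "source_location": "s3://igt-casino-data/telemetry/",
    "owner": "",
    "data_classification": "confidential",
    "status": "active",
}
problems = validate_registry_record(candidate)
print(problems)
```

Calling this check at the top of `register_shortcut` would keep invalid rows out of `governance.shortcut_registry`.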

10.3 Lineage Tracking

# Track data lineage through shortcuts

# Lineage metadata table
lineage_entries = [
    {
        "source": "s3://igt-casino-data/telemetry/",
        "shortcut": "igt_slot_telemetry",
        "target": "bronze.vendor_telemetry",
        "transformation": "raw_ingestion",
        "pipeline": "pl_vendor_ingestion"
    },
    {
        "source": "bronze.vendor_telemetry",
        "shortcut": None,
        "target": "silver.slot_transactions",
        "transformation": "cleanse_standardize",
        "pipeline": "pl_silver_transform"
    },
    {
        "source": "silver.slot_transactions",
        "shortcut": None,
        "target": "gold.slot_analytics",
        "transformation": "aggregate_metrics",
        "pipeline": "pl_gold_aggregation"
    }
]

# Visualize lineage
# Integration with Microsoft Purview provides automatic lineage tracking
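Until Purview lineage is wired up, the manual entries can still answer the basic question of where a table comes from. The sketch below walks the entries backwards from a target to its original source. The function name `upstream_chain` is hypothetical, the entries are abbreviated copies of the ones above, and the walk assumes each target has exactly one upstream source.

```python
def upstream_chain(entries, target):
    """Follow source links backwards from `target`; return the entries visited."""
    by_target = {entry["target"]: entry for entry in entries}
    chain = []
    seen = set()  # guards against cycles in hand-maintained metadata
    current = target
    while current in by_target and current not in seen:
        seen.add(current)
        entry = by_target[current]
        chain.append(entry)
        current = entry["source"]
    return chain


# Abbreviated copies of the lineage entries, kept here so the sketch runs alone.
lineage_entries = [
    {"source": "s3://igt-casino-data/telemetry/", "target": "bronze.vendor_telemetry",
     "pipeline": "pl_vendor_ingestion"},
    {"source": "bronze.vendor_telemetry", "target": "silver.slot_transactions",
     "pipeline": "pl_silver_transform"},
    {"source": "silver.slot_transactions", "target": "gold.slot_analytics",
     "pipeline": "pl_gold_aggregation"},
]

chain = upstream_chain(lineage_entries, "gold.slot_analytics")
for step in chain:
    print(f"{step['target']} <- {step['source']} ({step['pipeline']})")
```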

✅ Validation Checklist

Before completing this tutorial, verify:

OneLake Understanding - Can explain OneLake hierarchy and path formats
Internal Shortcuts - Successfully created cross-workspace shortcuts
External Shortcuts - Configured at least one external shortcut (ADLS/S3/GCS)
Dataverse Integration - Understand Dataverse shortcut creation (if applicable)
Security Inheritance - Verified permissions flow through shortcuts
Cross-Workspace Sharing - Data accessible from target workspace
Query Performance - Tested query performance through shortcuts
Governance Registry - Documented shortcuts in governance registry

🔧 Troubleshooting

Issue	Cause	Solution
Shortcut creation fails	Insufficient permissions	Verify Write access on target lakehouse
Data not visible	Source permissions	Ensure Read access on source data
Slow queries	Cross-region latency	Consider caching or data copy for hot data
Authentication error (ADLS)	Expired credentials	Refresh service principal or managed identity
S3 access denied	IAM policy mismatch	Verify S3 bucket policy allows Fabric access
Shortcut shows stale data	Stale source or cached copy	Shortcuts read the source directly; confirm the source was updated and that the query is not hitting a cached copy (Step 9.4)
Cannot delete shortcut	Items depend on it	Remove dependent items first

📚 Best Practices

Shortcut Design

Use Descriptive Names - Include source indication (e.g., adls_vendor_igt_telemetry)
Document Purpose - Maintain registry with business context
Plan for Failure - External shortcuts depend on source availability
Consider Latency - Same-region shortcuts perform better

Security

Least Privilege - Grant minimum permissions on source and target
Use Service Principals - Avoid personal accounts for production shortcuts
Audit Access - Enable logging on all shortcut sources
Classify Data - Apply sensitivity labels through Purview

Performance

Partition Alignment - Align partitions with query patterns
Cache Hot Data - Copy frequently accessed data locally
Monitor Egress - Track cross-region and external egress costs
Optimize Queries - Use predicate pushdown and filter early

Governance

Lifecycle Management - Review and deprecate unused shortcuts
Version Control - Track shortcut configurations in source control
Compliance Review - Regular review for regulatory compliance
Lineage Documentation - Track data flow through shortcuts

🎉 Summary

In this tutorial, you learned to:

Understand OneLake as Fabric's unified data lake with hierarchical organization
Create internal shortcuts for cross-workspace data sharing without duplication
Configure external shortcuts to ADLS Gen2, Amazon S3, and Google Cloud Storage
Integrate Dataverse data for Dynamics 365 and Power Platform connectivity
Implement security inheritance and access control patterns
Design hub-and-spoke and data mesh architectures using shortcuts
Configure B2B data sharing for vendor and partner integration
Build multi-tenant data architectures with proper isolation
Optimize shortcut performance and understand virtualization trade-offs
Establish governance frameworks for shared data management

➡️ Next Steps

Continue to Tutorial 19: Fabric Copilot & AI Features to learn how to leverage AI-powered assistance for data engineering and analytics tasks in Microsoft Fabric.

Related tutorials:

Tutorial 07: Governance & Purview - Data governance integration
Tutorial 08: Database Mirroring - Alternative data synchronization
Tutorial 10: Teradata Migration - Migration patterns using shortcuts



💬 Questions or issues? Open an issue in the GitHub repository.
