⚙️ Tutorial 00: Environment Setup
Last Updated: 2026-04-15 | Version: 2.0 | Status: ✅ Final | Maintainer: Documentation Team
For the 5-minute version, see docs/QUICK_START.md.
📖 Overview
This tutorial walks you through setting up the complete Microsoft Fabric environment for the Casino/Gaming POC. By the end you will have a workspace, three Lakehouses following the medallion architecture, and a verified Spark environment ready for data ingestion.
graph LR
subgraph "What We're Building"
A[Azure Subscription] --> B[Fabric Workspace]
B --> C["lh_bronze — Raw Data"]
B --> D["lh_silver — Cleansed Data"]
B --> E["lh_gold — Business KPIs"]
end
🎯 Learning Objectives
By the end of this tutorial you will be able to:
- Verify your Fabric capacity is running and ready
- Create and configure a Fabric workspace
- Create three Lakehouses for the medallion architecture
- Understand why Bronze / Silver / Gold layers exist
- Run a test notebook and confirm Spark connectivity
- Set up local development tools for data generation
📋 Prerequisites
Before starting, confirm you have:
| Requirement | How to Get It |
|---|---|
| Azure subscription with Fabric enabled | Sign up for Azure |
| Fabric capacity (F64 recommended, F2 minimum) | Start a free 60-day trial |
| Access to app.fabric.microsoft.com | Sign in with your Microsoft Entra ID account |
| Python 3.10+ (optional, for local data generation) | Download from python.org |
| Azure CLI (optional, for Bicep deployment) | Install Azure CLI |
| VS Code (optional, for editing notebooks locally) | Download VS Code |
🛠️ Step 1: Verify Fabric Capacity
Your Fabric capacity must be Running before any notebooks or data processing will work. A paused or deallocated capacity blocks every downstream step, often without a clear error message.
Check capacity state
- Open the Azure Portal
- Search for Microsoft Fabric and open your capacity resource
- Check the Status field on the overview blade:
| Status | Meaning | Action |
|---|---|---|
| Running | Ready to use | Proceed to Step 2 |
| Paused | Manually paused to save cost | Click Resume — takes 1-2 minutes |
| Deallocated | Stopped by Azure | Click Resume — may take 2-3 minutes |
| Not found | No capacity provisioned | Deploy via Bicep (see QUICK_START.md Step 2) or start a trial |
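If you script this check as part of a pre-flight routine, the table above reduces to a simple lookup. The sketch below is illustrative only: the name `next_action` is hypothetical, and the function takes the status string you read from the portal rather than querying Azure.

```python
# Hypothetical pre-flight helper: maps the Status values from the table above
# to the action this tutorial recommends. It does not call any Azure API.
CAPACITY_ACTIONS = {
    "running": "Proceed to Step 2",
    "paused": "Click Resume (takes 1-2 minutes)",
    "deallocated": "Click Resume (may take 2-3 minutes)",
    "not found": "Deploy via Bicep or start a trial",
}


def next_action(status: str) -> str:
    """Return the recommended action for a capacity status string."""
    key = status.strip().lower()
    if key not in CAPACITY_ACTIONS:
        raise ValueError(f"Unknown capacity status: {status!r}")
    return CAPACITY_ACTIONS[key]


print(next_action("Paused"))
```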
Verify from the Fabric portal
- Open app.fabric.microsoft.com
- Click Settings (gear icon, top right) > Admin portal
- Navigate to Capacity settings
- Confirm your capacity appears and shows Active
If you don't have Admin portal access, ask your tenant admin to confirm capacity status, or use the Azure Portal method above, which only requires Contributor access on the capacity resource.
Checkpoint: Capacity status is Running / Active.
🛠️ Step 2: Create Workspace
- In the Fabric portal left navigation, click Workspaces
- Click + New workspace
- Configure:
| Setting | Value |
|---|---|
| Name | casino-fabric-poc |
| Description | Casino/Gaming Fabric POC — Medallion Architecture |
| License mode | Fabric capacity |
| Capacity | Select your Fabric capacity from the dropdown |
- Click Apply
Configure workspace settings
After the workspace is created:
- Open workspace settings (click the three dots ... next to the workspace name > Workspace settings)
- Under General, set the contact list to include your team
- Under License info, confirm Fabric capacity is selected
Pin the workspace to your favorites (star icon) for quick access from the home page.
Checkpoint: casino-fabric-poc appears in your Workspaces list with a diamond/capacity badge.
🛠️ Step 3: Create Lakehouses
The medallion architecture uses three layers to progressively refine data. Each layer gets its own Lakehouse:
graph TB
subgraph "Medallion Architecture"
direction LR
B["lh_bronze<br/>Raw, append-only<br/>Minimal transformation"] --> S["lh_silver<br/>Cleansed, validated<br/>Schema-enforced, deduped"]
S --> G["lh_gold<br/>Business aggregations<br/>Star schema, KPIs"]
end
| Layer | Lakehouse Name | What Goes Here | Example Tables |
|---|---|---|---|
| Bronze | lh_bronze | Raw data exactly as received. Append-only, no business logic. | bronze_slot_telemetry, bronze_player_profile |
| Silver | lh_silver | Cleansed and validated data. Nulls handled, duplicates removed, schemas enforced. | silver_slot_cleansed, silver_player_master |
| Gold | lh_gold | Business-ready aggregations. Star schema tables, KPIs, compliance reports. | gold_slot_performance, gold_revenue_daily |
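To make the layer responsibilities concrete, here is a minimal plain-Python sketch of the same refinement on a handful of in-memory records. It is not the project's pipeline (the real layers are Delta tables processed with Spark), and the field names `event_id`, `machine_id`, and `coin_in` are invented for illustration.

```python
from collections import defaultdict

# Hypothetical raw slot events; field names are invented for illustration.
bronze = [
    {"event_id": 1, "machine_id": "M1", "coin_in": 10.0},
    {"event_id": 1, "machine_id": "M1", "coin_in": 10.0},  # duplicate delivery
    {"event_id": 2, "machine_id": "M1", "coin_in": None},  # missing value
    {"event_id": 3, "machine_id": "M2", "coin_in": 5.0},
]


def to_silver(rows):
    """Drop rows with a null coin_in and remove duplicate event_ids."""
    seen, cleansed = set(), []
    for row in rows:
        if row["coin_in"] is None or row["event_id"] in seen:
            continue
        seen.add(row["event_id"])
        cleansed.append(row)
    return cleansed


def to_gold(rows):
    """Aggregate cleansed rows into total coin_in per machine."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["machine_id"]] += row["coin_in"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(len(bronze), len(silver), gold)  # 4 2 {'M1': 10.0, 'M2': 5.0}
```

Bronze is left untouched, so a bug in the Silver or Gold logic can be fixed and replayed from the raw records.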
Create each Lakehouse
For each of the three Lakehouses:
- In the casino-fabric-poc workspace, click + New > Lakehouse
- Enter the name (lh_bronze, then lh_silver, then lh_gold)
- Click Create
Verify
Your workspace item list should now show the three Lakehouses: lh_bronze, lh_silver, and lh_gold.
Each Lakehouse will have empty Tables and Files folders — that's expected at this point.
Checkpoint: All three Lakehouses appear in the workspace.
🛠️ Step 4: Connect External Storage (Path A Only)
🔀 Which path are you on?
| | Path A: Production-Aligned | Path B: Quickstart |
|---|---|---|
| Prereq | Deploy infra/main.bicep first | Fabric capacity + workspace only |
| Data location | ADLS Gen2 → OneLake shortcut | Upload directly to OneLake |
| Source path in notebooks | Files/landing_zone/... | Files/raw/... |
| Cost | ~$1-3/day (Purview + Storage + KV + LAW) | Fabric capacity only |
| Best for | Production patterns, governance, security tutorials | Learning medallion flow fast |
Path B users: Skip this step entirely and proceed to Step 5. Upload your generated data directly to lh_bronze/Files/raw/<source>/ via the Fabric UI.
Path A users: Complete this step to connect your Bicep-provisioned ADLS storage.
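Because the two paths read from different folders, a notebook can stay path-agnostic by resolving the root in one place. The sketch below is one hypothetical way to do that: the function names and the `slot_telemetry` source name are assumptions, and only the two root folders come from the comparison above.

```python
# Hypothetical helper; the roots come from the Path A / Path B comparison above.
SOURCE_ROOTS = {
    "A": "Files/landing_zone",  # ADLS Gen2 shortcut (Step 4)
    "B": "Files/raw",           # direct upload to OneLake
}


def source_root(path: str) -> str:
    """Return the Lakehouse-relative root folder for deployment path 'A' or 'B'."""
    key = path.strip().upper()
    if key not in SOURCE_ROOTS:
        raise ValueError("path must be 'A' or 'B'")
    return SOURCE_ROOTS[key]


def path_b_upload_dir(source: str) -> str:
    """Build the Path B upload folder, Files/raw/<source>/."""
    return f"{source_root('B')}/{source.strip('/')}/"


print(source_root("a"))                     # Files/landing_zone
print(path_b_upload_dir("slot_telemetry"))  # Files/raw/slot_telemetry/
```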
If you deployed the ADLS Gen2 storage account via Bicep, connect it as a shortcut to avoid copying data.
- Open lh_bronze in the workspace
- In the Explorer pane, right-click on Files
- Select New shortcut > Azure Data Lake Storage Gen2
- Enter your ADLS DFS endpoint URL and authenticate with your organizational account
- Browse to the landing container
- Name the shortcut landing_zone and click Create
Shortcuts let you query external data in place without copying it into OneLake, saving storage cost and keeping data in sync.
Checkpoint: The landing_zone shortcut appears under Files in lh_bronze, and you can browse its contents.
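The shortcut dialog asks for the DFS endpoint, not the Blob endpoint. As a rough sketch, the URL can be derived from the storage account name as shown below. This assumes the Azure public cloud suffix `dfs.core.windows.net`; the helper name `dfs_endpoint` and the account name `stcasinopoc` are placeholders, not values taken from the Bicep templates.

```python
import re

# Azure storage account names are 3-24 characters, lowercase letters and digits.
_ACCOUNT_NAME = re.compile(r"[a-z0-9]{3,24}")


def dfs_endpoint(account_name: str) -> str:
    """Return the ADLS Gen2 DFS endpoint URL (Azure public cloud) for an account."""
    if not _ACCOUNT_NAME.fullmatch(account_name):
        raise ValueError(f"Invalid storage account name: {account_name!r}")
    return f"https://{account_name}.dfs.core.windows.net"


# "stcasinopoc" is a placeholder account name used only for this example.
print(dfs_endpoint("stcasinopoc"))  # https://stcasinopoc.dfs.core.windows.net
```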
🛠️ Step 5: Configure Workspace Access
- Open workspace settings > Access
- Add team members with appropriate roles:
| Role | Who | Permissions |
|---|---|---|
| Admin | Workspace owners | Full control including delete |
| Member | Data engineers | Edit all items; add users with Member or lower roles |
| Contributor | Developers | Create, edit, and delete items; cannot manage workspace access |
| Viewer | Business users, analysts | Read only |
Be careful with Admin — Admins can delete the entire workspace and all its contents.
🛠️ Step 6: Install Local Tools (Optional)
For local data generation and development, set up Python:
Python environment setup
Run these commands in a terminal from the repo root (Suppercharge_Microsoft_Fabric/):
# Create virtual environment
python -m venv .venv
# Activate — Windows
.venv\Scripts\activate
# Activate — macOS / Linux
source .venv/bin/activate
# Install project dependencies
pip install -r requirements.txt
Verify installation
All python and pip commands in this tutorial series assume you have activated the virtual environment first.
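One way to verify the installation is a short script that checks the interpreter version, whether a virtual environment is active, and whether the expected packages can be imported. This is a sketch, not a script shipped with the repo: `check_environment` is a hypothetical name, and `pandas` is used only because the troubleshooting section mentions it, so substitute the packages listed in requirements.txt.

```python
import importlib.util
import sys


def check_environment(packages, min_version=(3, 10)):
    """Report Python version, venv status, and which packages are not importable."""
    return {
        "python_ok": sys.version_info >= min_version,
        # Inside a virtual environment, sys.prefix differs from sys.base_prefix.
        "venv_active": sys.prefix != sys.base_prefix,
        # Use top-level package names only (e.g. "pandas", not "pandas.io").
        "missing": [p for p in packages if importlib.util.find_spec(p) is None],
    }


# "pandas" is an assumed dependency; replace with names from requirements.txt.
report = check_environment(["pandas"])
print(report)
```

An empty `missing` list together with `venv_active: True` indicates the environment is set up as Step 6 describes.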
✅ Step 7: Verify the Environment
Run a test notebook in Fabric
This test writes and reads a Delta table to confirm Spark and Lakehouse connectivity.
- In the casino-fabric-poc workspace, click + New > Notebook
- In the notebook's Lakehouse explorer (left panel), click Add and attach lh_bronze
- Paste this code into the first cell and click the Run button:
# ---- Run this in a Fabric Notebook (attached to lh_bronze) ----
# Create test data
data = [("environment", "ready"), ("setup", "complete")]
df = spark.createDataFrame(data, ["key", "value"])
# Write to Bronze Lakehouse as a Delta table
df.write.format("delta").mode("overwrite").save("Tables/test_connection")
# Read it back to confirm round-trip works
df_check = spark.read.format("delta").load("Tables/test_connection")
print(f"Row count: {df_check.count()}") # Should print: Row count: 2
display(df_check)
print("Environment verified successfully.")
First-time execution takes 2-3 minutes while Fabric provisions a Spark cluster (cold start). Subsequent cell runs in the same session are fast.
- After the cell succeeds, verify test_connection appears under Tables in the Lakehouse explorer (you may need to click the refresh icon)
- Clean up by running this in a second cell:
# ---- Run this in a Fabric Notebook ----
spark.sql("DROP TABLE IF EXISTS test_connection")
print("Test table cleaned up.")
Checkpoint: The test cell printed Row count: 2 and the table appeared in the explorer.
✅ Validation Checklist
Before proceeding to Tutorial 01, confirm every item:
- Fabric capacity is Running / Active
- Workspace casino-fabric-poc exists with Fabric capacity assigned
- Lakehouse lh_bronze created
- Lakehouse lh_silver created
- Lakehouse lh_gold created
- Test notebook executed successfully (Row count: 2)
- Test table cleaned up
- (Optional) ADLS shortcut landing_zone working
- (Optional) Local Python environment with dependencies installed
🔧 Troubleshooting
Workspace and Capacity
| Error / Symptom | Cause | Fix |
|---|---|---|
| "Fabric is not enabled for your tenant" | Tenant admin hasn't enabled Fabric | Contact your Microsoft Entra tenant admin to enable Fabric in the admin portal |
| "No capacity available" when creating workspace | No Fabric capacity provisioned or trial not started | Start a Fabric trial or deploy capacity via Bicep |
| Workspace shows no capacity badge | Workspace not assigned to a Fabric capacity | Open workspace settings > License info > select your Fabric capacity |
Notebooks and Spark
| Error / Symptom | Cause | Fix |
|---|---|---|
| Cell hangs for 5+ minutes on first run | Spark cold start taking longer than usual | Wait up to 3 minutes. If it exceeds 5, cancel the cell and click Run again — the cluster may have failed to provision |
| AnalysisException: Table not found | Notebook is attached to the wrong Lakehouse or no Lakehouse attached | Click the Lakehouse icon in the notebook sidebar, verify lh_bronze is attached. Remove and re-add if needed |
| TokenExpired or authentication errors | Fabric session token expired (after ~1 hour of inactivity) | Refresh the browser tab and re-run the cell. Your work is saved automatically |
| OutOfMemoryError or SparkException: Job aborted | Data too large for current capacity | Reduce data volume or upgrade from F2 to a larger SKU |
| Capacity is suspended or notebook won't start | Fabric capacity is paused in Azure Portal | Go to Azure Portal > your Fabric capacity > click Resume (1-2 minutes) |
Local Development
| Error / Symptom | Cause | Fix |
|---|---|---|
| ModuleNotFoundError: No module named 'pandas' | Virtual environment not activated | Run .venv\Scripts\activate (Windows) or source .venv/bin/activate (macOS/Linux) before pip install |
| python: command not found | Python not on PATH | Install Python from python.org and ensure "Add to PATH" is checked |
| docker compose not recognized | Using deprecated docker-compose binary | Install Docker Desktop 4.x+ which includes the docker compose v2 plugin |
Pro Tip: If Spark startup fails repeatedly, try detaching and reattaching the Lakehouse to the notebook — this forces a fresh cluster provisioning.
🎉 Summary
You have successfully:
- Verified your Fabric capacity is running
- Created the casino-fabric-poc workspace
- Set up the medallion architecture with lh_bronze, lh_silver, and lh_gold
- Confirmed Spark connectivity with a test notebook
- (Optional) Connected external ADLS storage and set up local dev tools
Your environment is ready for data ingestion.
📚 Resources
- Microsoft Fabric Documentation
- Lakehouse Overview
- Workspace Management
- OneLake Shortcuts
- How to Use Notebooks
🧭 Navigation
| Previous | Up | Next |
|---|---|---|
| N/A — this is the first tutorial | Tutorials Index | Tutorial 01: Bronze Layer |