Data Engineer Quickstart¶
Last Updated: 2026-05-05 | Role: Data Engineer | Goal: Ingest, transform, and serve data through a production-ready medallion architecture in Microsoft Fabric.
Persona & Typical Day¶
You build and maintain data pipelines that move data from source systems into a governed, queryable lakehouse. A typical day involves monitoring pipeline runs, debugging schema drift in bronze tables, optimizing Spark jobs, writing silver-layer transformations, and validating that gold-layer aggregations feed accurate numbers to downstream BI reports.
You care about data quality, pipeline reliability, idempotency, and keeping compute costs under control.
Your First 30 Minutes¶
Follow these steps in order to get a working medallion pipeline running:
1. **Set up your environment** - Create a workspace, provision Lakehouses for bronze/silver/gold, and configure access. Tutorial 00: Environment Setup
2. **Ingest your first Bronze table** - Run a PySpark notebook that lands raw data into the bronze Lakehouse with append-only semantics. Tutorial 01: Bronze Layer
3. **Transform to Silver** - Cleanse, deduplicate, and enforce schemas to produce curated silver tables. Tutorial 02: Silver Layer
4. **Build Gold aggregations** - Create star-schema KPI tables that power Direct Lake reports. Tutorial 03: Gold Layer
5. **Create a Data Factory pipeline** - Orchestrate the bronze-to-gold flow with scheduling and error handling. Tutorial 06: Data Pipelines
Your First Week¶
| Day | Focus | Resource |
|---|---|---|
| 1 | Complete 30-minute path above | Tutorials 00-03, 06 |
| 2 | Add real-time streaming ingestion | Tutorial 04: Real-Time Analytics |
| 3 | Set up Lakehouse schemas and shortcuts | Lakehouse Setup Best Practices |
| 4 | Implement data quality checks | Testing Strategies |
| 5 | Configure CI/CD for notebook deployment | fabric-cicd Deployment |
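Day 4's data quality checks can start as plain assertions before graduating to a testing framework. A minimal pure-Python sketch of the idea, run against rows collected from a silver table (column names and sample data are illustrative, not from the tutorials):

```python
from typing import Any

def check_not_null(rows: list[dict[str, Any]], column: str) -> list[dict[str, Any]]:
    """Return the rows that violate a NOT NULL expectation on `column`."""
    return [r for r in rows if r.get(column) is None]

def check_unique(rows: list[dict[str, Any]], column: str) -> set:
    """Return any duplicated values of `column` (empty set means the check passed)."""
    seen, dupes = set(), set()
    for r in rows:
        value = r.get(column)
        if value in seen:
            dupes.add(value)
        seen.add(value)
    return dupes

# Fail the run loudly instead of letting bad rows reach gold.
sample = [
    {"order_id": "a1", "amount": 10.0},
    {"order_id": "a2", "amount": None},
    {"order_id": "a1", "amount": 7.5},
]
null_violations = check_not_null(sample, "amount")
duplicate_keys = check_unique(sample, "order_id")
```

The same checks translate directly to Spark (`df.filter(col(c).isNull()).count()`, `groupBy(key).count()`) once row counts make collecting to the driver impractical.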
Key Features for Data Engineers¶
| Feature | Doc Link | Why It Matters |
|---|---|---|
| Medallion Architecture | Deep Dive | The foundational pattern for all data transformation layers |
| Spark Notebooks | Best Practices | Your primary development tool for PySpark transformations |
| Data Factory Pipelines | Pipelines & Data Movement | Orchestration, scheduling, and dependency management |
| Lakehouse Setup | Setup Guide | Delta Lake storage, schema enforcement, and table management |
| Mirroring | Mirroring Guide | Near-real-time replication from operational databases |
| Incremental Refresh & CDC | CDC Patterns | Efficient data loading without full reprocessing |
| Dataflow Gen2 | Dataflow Gen2 | Low-code/no-code ETL for lighter transformations |
| Shortcut Transformations | OneLake Shortcuts | Access external data without copying it into OneLake |
| Copy Job CDC | Copy Job Guide | Simplified change data capture for common sources |
Common Pitfalls¶
- **Skipping schema enforcement in Bronze** - Without explicit schemas, downstream Silver notebooks break silently when source columns change. Always define schemas even on raw ingestion.
- **Over-partitioning Delta tables** - Partitioning by high-cardinality columns (e.g., user ID) creates millions of small files. Partition by date or a low-cardinality dimension instead.
- **Ignoring V-Order** - Fabric's V-Order optimization dramatically improves Direct Lake read performance. Make sure gold tables are written with V-Order enabled. See the V-Order Tuning Guide.
- **Not using Lakehouse schemas** - Schemas (GA 2026) let you organize tables into namespaces inside a single Lakehouse. Use them instead of creating multiple Lakehouses for logical separation.
- **Running full refreshes when incremental is possible** - Full table rewrites waste compute. Use merge/upsert patterns and watermark-based incremental loads.
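The merge/watermark pattern from the last pitfall can be sketched as a Spark SQL `MERGE` statement assembled from a high-water mark. This is one possible shape, not the tutorials' implementation: the table names, key, and `updated_at` column are illustrative, and the watermark value would come from querying `MAX(updated_at)` on the target table.

```python
from typing import Optional

def merge_statement(target: str, source: str, key: str,
                    watermark_col: str, watermark: Optional[str]) -> str:
    """Build a watermark-filtered MERGE (Delta Lake Spark SQL) for incremental upserts."""
    # Only rows newer than the target's high-water mark are considered,
    # so unchanged history is never rescanned or rewritten.
    src = (f"(SELECT * FROM {source} WHERE {watermark_col} > '{watermark}')"
           if watermark is not None else source)
    return (
        f"MERGE INTO {target} AS t "
        f"USING {src} AS s "
        f"ON t.{key} = s.{key} "
        "WHEN MATCHED THEN UPDATE SET * "
        "WHEN NOT MATCHED THEN INSERT *"
    )

# In a notebook:
#   wm = spark.table("silver.orders").agg({"updated_at": "max"}).collect()[0][0]
#   spark.sql(merge_statement("silver.orders", "bronze.orders",
#                             "order_id", "updated_at", wm))
```

The Delta Lake Python API (`DeltaTable.forName(...).merge(...)`) expresses the same upsert programmatically; the SQL form is shown here because it is easy to log and inspect from a pipeline.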
Related Resources¶
- **Medallion Architecture** - Deep dive into Bronze, Silver, and Gold layer patterns with partitioning, schema evolution, and optimization guidance.
- **Pipeline Orchestration** - Metadata-driven pipelines, error handling, retry patterns, and scheduling strategies.
- **Performance Tuning** - Spark parallelism, query optimization, and V-Order tuning for production workloads.
- **Error Handling & Monitoring** - Structured error handling, alerting, and pipeline monitoring patterns.