
Git Workflow for Microsoft Fabric Development

Last Updated: 2026-04-27 | Version: 1.0.0


Overview

Microsoft Fabric provides built-in Git integration that syncs workspace items (notebooks, pipelines, semantic models) to a Git repository. This guide establishes the branching strategy, PR workflow, and conflict resolution patterns used in this POC.

Key Principles

  1. Git is the source of truth. Fabric workspaces are deployment targets, not the authoritative version.
  2. All changes go through PRs. No direct workspace edits in staging or production.
  3. Automated deployment. The fabric-cicd library deploys items from Git to workspaces.
  4. Test before merge. Unit tests run in CI; integration testing happens in the dev workspace.

Fabric Git Integration Architecture

Developer Machine                    GitHub                        Fabric Service
+-----------------+     push     +----------------+   fabric-cicd  +------------------+
| VS Code         | ----------> | main branch    | ------------> | Casino-POC-Dev   |
| - Edit .py      |             |                |               |   Notebooks      |
| - Run pytest    |    PR       | feature/*      |               |   Lakehouses     |
| - Commit        | ----------> | phase14/*      |               |   Pipelines      |
+-----------------+             +-------+--------+               +------------------+
                                        |
                                        | merge to release/staging
                                        v
                                +----------------+   fabric-cicd  +------------------+
                                | release/staging| ------------> | Casino-POC-Stg   |
                                +-------+--------+               +------------------+
                                        |
                                        | merge to release/prod
                                        v
                                +----------------+   fabric-cicd  +------------------+
                                | release/prod   | ------------> | Casino-POC-Prod  |
                                +----------------+               +------------------+

Sync Direction

| Direction | When | How |
| --- | --- | --- |
| Fabric → Git | Initial export, one-time migration | Fabric UI: "Connect to Git" |
| Git → Fabric | Every deployment | fabric-cicd via GitHub Actions |
| Local → Git | Every commit | git push |
| Git → Local | Every pull | git pull |

Important: Once Git integration is established, avoid making changes directly in the Fabric workspace. All changes should flow through Git to maintain a single source of truth.


Branch Strategy

main                          (protected, auto-deploy to Dev)
  |
  +-- feature/slot-schema     (developer feature work)
  +-- feature/sar-detection   (developer feature work)
  +-- phase14/wave8-*         (phase-based feature branches)
  |
  +-- release/staging         (protected, deploy to Staging)
  +-- release/prod            (protected, deploy to Production)

Branch Naming Convention

| Pattern | Purpose | Example |
| --- | --- | --- |
| feature/<desc> | New functionality | feature/add-player-dedup |
| fix/<desc> | Bug fixes | fix/ctr-threshold-boundary |
| phase<N>/wave<N>-<desc> | Phase-based work | phase14/wave8-developer-experience |
| release/staging | Staging environment | Long-lived |
| release/prod | Production environment | Long-lived |
| hotfix/<desc> | Emergency production fix | hotfix/sar-false-positive |

Branch Lifecycle

# 1. Create feature branch from main
git checkout main
git pull origin main
git checkout -b feature/add-player-dedup

# 2. Develop, test, commit
# ... edit files ...
pytest validation/unit_tests/ -v
git add notebooks/silver/02_silver_player_cleansed.py
git commit -m "feat(silver): add deduplication logic for player records"

# 3. Push and create PR
git push -u origin feature/add-player-dedup
gh pr create --title "feat(silver): player deduplication" --body "..."

# 4. After merge, delete feature branch
git checkout main
git pull origin main
git branch -d feature/add-player-dedup

Environment Promotion

# Promote main --> staging
git checkout release/staging
git merge main
git push origin release/staging
# GitHub Actions deploys to Casino-POC-Stg via fabric-cicd

# Promote staging --> production (after validation)
git checkout release/prod
git merge release/staging
git push origin release/prod
# GitHub Actions deploys to Casino-POC-Prod via fabric-cicd (requires approval)

Repository Structure for Fabric Items

How Fabric Items Map to Files

| Fabric Item | File Format | Repository Path |
| --- | --- | --- |
| Notebook | .py (Databricks source) | notebooks/<layer>/<name>.py |
| Semantic Model | .bim (JSON) | semantic-models/<name>/model.bim |
| Pipeline | .json (definition) | pipelines/<name>/pipeline-content.json |
| Lakehouse | .platform (metadata) | lakehouses/<name>/.platform |
| Data Generator | .py (Python module) | data_generation/generators/<name>.py |
| Bicep Module | .bicep | infra/modules/<category>/<name>.bicep |

This POC's Structure

Suppercharge_Microsoft_Fabric/
  notebooks/
    bronze/               # 18 Bronze ingestion notebooks
    silver/               # 17 Silver transformation notebooks
    gold/                 # 19 Gold KPI/analytics notebooks
    utils/                # Shared utility modules
    docs/use-cases/       # Applied analytics use cases
  data_generation/
    generators/           # 17 Python data generators
    config/               # YAML configuration
    open_data/            # Federal dataset download scripts
  infra/
    main.bicep            # Root IaC orchestration
    modules/              # Bicep modules
    environments/         # Parameter files per environment
  validation/
    unit_tests/           # 612 unit tests
    great_expectations/   # 9 data quality suites
  scripts/
    fabric-cicd-deploy.py # Deployment script
  .github/workflows/
    deploy-fabric.yml     # CI/CD pipeline

Platform Files

Fabric Git integration creates .platform files alongside items. These contain workspace-specific metadata:

{
  "$schema": "https://developer.microsoft.com/json-schemas/fabric/gitIntegration/platformProperties/2.0.0/schema.json",
  "metadata": {
    "type": "Notebook",
    "displayName": "01_bronze_slot_telemetry"
  },
  "config": {
    "version": "2.0",
    "logicalId": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
  }
}

Note: .platform files are workspace-specific. Do not manually edit them. The fabric-cicd library handles item ID mapping during deployment.


PR Workflow

The Complete Flow

1. Create Branch    2. Develop Locally   3. Push + PR        4. CI Checks
   git checkout -b     edit .py files       git push             pytest
   feature/xyz         run pytest           gh pr create         bicep build
                       commit                                    lint

5. Code Review      6. Merge             7. Auto-Deploy      8. Validate
   Team reviews        Squash merge         fabric-cicd          Run notebook
   Approve PR          to main              deploys to Dev       in workspace

Creating a Pull Request

# After committing and pushing your feature branch
gh pr create \
  --title "feat(bronze): add USDA crop ingestion notebook" \
  --body "$(cat <<'EOF'
## Summary
- Added Bronze notebook for USDA crop production data ingestion
- Schema enforcement with 12 validated fields
- Null rejection for required fields (state_code, year, commodity)

## Test Plan
- [x] Unit tests pass locally (pytest)
- [x] Schema validation tests added
- [ ] Manual validation in Dev workspace after merge

## Compliance
- No PII data involved
- Uses public USDA NASS API

## Related
- Phase 14, Wave 8: Developer Experience
EOF
)"

PR Template

Create .github/PULL_REQUEST_TEMPLATE.md:

## Summary
<!-- 1-3 bullet points describing what changed and why -->

## Type of Change
- [ ] New notebook (Bronze/Silver/Gold)
- [ ] Notebook modification
- [ ] Data generator update
- [ ] Infrastructure (Bicep)
- [ ] Documentation
- [ ] Bug fix
- [ ] Test update

## Test Plan
- [ ] Unit tests pass (`pytest validation/unit_tests/ -v`)
- [ ] New tests added for new logic
- [ ] Manual validation in Dev workspace
- [ ] No regressions in existing tests

## Compliance Checklist
- [ ] No PII in committed code
- [ ] No hardcoded secrets or connection strings
- [ ] FABRIC_POC_HASH_SALT used for PII hashing (not hardcoded salt)
- [ ] Compliance thresholds unchanged (or documented if changed)

## Breaking Changes
<!-- List any breaking changes and migration steps -->
None

Conflict Resolution for Fabric Files

Notebook Conflicts (.py files)

Notebook .py files are standard Python with # COMMAND ---------- cell separators. Git handles these like any Python file:

# Check for conflicts after merge
git merge main
# If conflicts exist:

# Option 1: Open in VS Code and resolve visually
code notebooks/bronze/01_bronze_slot_telemetry.py

# Option 2: Use VS Code merge editor
# Click "Resolve in Merge Editor" when prompted

Best practices for avoiding notebook conflicts:

  1. Keep notebook cells small and focused.
  2. Put shared logic in notebooks/utils/ modules, which rarely conflict.
  3. Avoid editing the same notebook on multiple branches simultaneously.
  4. Use # COMMAND ---------- separators as natural merge boundaries.

Pipeline Conflicts (.json files)

Pipeline definitions are JSON files. Conflicts in JSON are harder to resolve:

# Format the JSON before resolving (python -m json.tool here; jq works too)
python -m json.tool pipelines/main-pipeline/pipeline-content.json > temp/formatted.json

Strategy: If pipeline conflicts are complex, accept one side wholesale (usually the version from main), re-apply your changes manually in the Fabric UI, then re-export.
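
To see what actually differs before deciding, the sketch below (a hypothetical helper, not part of this repo) pulls both sides of the conflict from the Git index -- stage 2 is "ours", stage 3 is "theirs" -- and writes normalized copies so a plain diff is readable:

# compare_pipeline_conflict.py -- hypothetical helper, not part of this repo.
# During a merge conflict the Git index stores stage 2 ("ours") and stage 3
# ("theirs") for each conflicted path; normalize both so a diff is readable.
import json
import subprocess
import sys
from pathlib import Path

def write_normalized(stage: int, path: str, out_dir: Path) -> Path:
    """Fetch one conflict side from the index and rewrite it with sorted keys."""
    raw = subprocess.run(
        ["git", "show", f":{stage}:{path}"],
        capture_output=True, text=True, check=True,
    ).stdout
    out_file = out_dir / f"stage{stage}_{Path(path).name}"
    out_file.write_text(json.dumps(json.loads(raw), indent=2, sort_keys=True))
    return out_file

if __name__ == "__main__":
    conflicted = sys.argv[1]  # e.g. pipelines/main-pipeline/pipeline-content.json
    out_dir = Path("temp")
    out_dir.mkdir(exist_ok=True)
    ours = write_normalized(2, conflicted, out_dir)
    theirs = write_normalized(3, conflicted, out_dir)
    print(f"diff {ours} {theirs}")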

Semantic Model Conflicts (.bim files)

Semantic model definitions are large JSON files (.bim). Manual conflict resolution is error-prone.

Strategy:

  1. Take the version from main (their changes).
  2. Re-apply your measure or table changes using Tabular Editor (the sketch below helps identify which measures differ).
  3. Re-export the .bim file.
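
A quick way to find the measures you need to re-apply is to compare the two .bim files before resolving. The sketch below is a hypothetical helper that assumes the conventional .bim layout (model.tables[].measures[]):

# diff_bim_measures.py -- hypothetical helper, not part of this repo.
# Assumes the conventional .bim layout:
#   {"model": {"tables": [{"name": ..., "measures": [{"name": ...}, ...]}]}}
import json
import sys

def measure_names(bim_path: str) -> set:
    """Return the set of (table, measure) pairs defined in a .bim file."""
    with open(bim_path, encoding="utf-8") as handle:
        model = json.load(handle)["model"]
    return {
        (table["name"], measure["name"])
        for table in model.get("tables", [])
        for measure in table.get("measures", [])
    }

if __name__ == "__main__":
    ours, theirs = measure_names(sys.argv[1]), measure_names(sys.argv[2])
    print("Only in first file: ", sorted(ours - theirs))
    print("Only in second file:", sorted(theirs - ours))

Run it against the two sides extracted with git show :2:<path> and :3:<path>, as shown for pipeline conflicts above.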

.platform File Conflicts

.platform files should never conflict because they are workspace-specific and listed in .gitignore. If they do appear in a conflict:

# Accept theirs (workspace metadata is regenerated by fabric-cicd)
git checkout --theirs '*.platform'
git add '*.platform'

fabric-cicd Deployment Integration

How Deployment Works

The deploy-fabric.yml workflow uses fabric-cicd to deploy items:

# Simplified from .github/workflows/deploy-fabric.yml
- name: Deploy to Fabric
  run: |
    python scripts/fabric-cicd-deploy.py \
      --workspace-id ${{ secrets.FABRIC_WORKSPACE_ID }} \
      --environment ${{ inputs.target_environment }} \
      --item-type-in-scope Notebook Lakehouse SemanticModel \
      ${{ inputs.dry_run && '--dry-run' || '' }}
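
The deployment script itself is not reproduced in this guide. Below is a minimal sketch of what it might look like, assuming the documented fabric-cicd API (FabricWorkspace, publish_all_items, unpublish_orphan_items) and a hand-rolled --dry-run flag that only prints the plan:

# scripts/fabric-cicd-deploy.py -- simplified sketch, not the full script.
# Assumes the documented fabric-cicd API: FabricWorkspace, publish_all_items,
# unpublish_orphan_items. The --dry-run handling here is illustrative only.
import argparse

from fabric_cicd import FabricWorkspace, publish_all_items, unpublish_orphan_items

def main() -> None:
    parser = argparse.ArgumentParser(description="Deploy Fabric items from Git")
    parser.add_argument("--workspace-id", required=True)
    parser.add_argument("--environment", required=True)  # dev | staging | prod
    parser.add_argument("--item-type-in-scope", nargs="+",
                        default=["Notebook", "Lakehouse", "SemanticModel"])
    parser.add_argument("--dry-run", action="store_true")
    args = parser.parse_args()

    workspace = FabricWorkspace(
        workspace_id=args.workspace_id,
        environment=args.environment,
        repository_directory=".",  # root of the Git checkout
        item_type_in_scope=args.item_type_in_scope,
    )

    if args.dry_run:
        print(f"[dry-run] Would publish {args.item_type_in_scope} to "
              f"workspace {args.workspace_id} ({args.environment})")
        return

    publish_all_items(workspace)       # create/update items from the repo
    unpublish_orphan_items(workspace)  # remove items no longer in Git

if __name__ == "__main__":
    main()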

Deployment Flow

git push to main
  |
  v
GitHub Actions: validate
  - bicep build
  - pytest
  |
  v
GitHub Actions: deploy-dev
  - fabric-cicd (dry-run by default)
  |
  v
Manual dispatch: deploy-staging
  - Requires environment approval
  - fabric-cicd deploys to staging workspace
  |
  v
Manual dispatch: deploy-prod
  - Requires environment approval + wait timer
  - fabric-cicd deploys to production workspace

Scoping Deployments

Not every push needs to deploy every item type. The workflow uses path filters:

on:
  push:
    branches: [main]
    paths:
      - 'notebooks/**'
      - 'reports/semantic-model/**'
      - 'reports/report-definitions/**'

Only changes to these paths trigger deployment. Infrastructure changes (Bicep), documentation, and tests do not trigger Fabric deployments.


Handling Merge Conflicts in Notebooks

Common Conflict Scenarios

Scenario 1: Both branches modify the same cell

<<<<<<< HEAD
SOURCE_PATH = "Files/output/bronze_slot_telemetry_v2.parquet"
=======
SOURCE_PATH = "Files/output/bronze_slot_telemetry.parquet"
>>>>>>> feature/update-source

Resolution: Choose the correct path based on the current data format.

Scenario 2: One branch adds a cell, another modifies adjacent cells

# COMMAND ----------

# Cell A (unchanged)

# COMMAND ----------

<<<<<<< HEAD
# Cell B (modified in main)
df = df.withColumn("new_col", lit("v2"))
=======
# Cell B-prime (new cell added in feature branch)
df = df.withColumn("validation_flag", when(col("amount") > 0, True))

# COMMAND ----------

# Cell B (original, unchanged in feature)
>>>>>>> feature/add-validation

Resolution: Keep both changes. The new cell from the feature branch and the modification from main are independent.
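
A resolved version that keeps both -- the feature branch's new cell first, then main's modification of Cell B -- looks like:

# COMMAND ----------

# Cell B-prime (new cell added in feature branch)
df = df.withColumn("validation_flag", when(col("amount") > 0, True))

# COMMAND ----------

# Cell B (modified in main)
df = df.withColumn("new_col", lit("v2"))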

Scenario 3: Import block conflicts

<<<<<<< HEAD
from pyspark.sql.functions import col, current_timestamp, lit, when, hash
=======
from pyspark.sql.functions import col, current_timestamp, input_file_name, lit
>>>>>>> feature/add-source-tracking

Resolution: Merge both import lists:

from pyspark.sql.functions import (
    col,
    current_timestamp,
    hash,
    input_file_name,
    lit,
    when,
)

Prevention: Reducing Notebook Conflicts

  1. Extract shared logic. Functions in notebooks/utils/ are less likely to conflict than inline notebook code.

  2. One concern per notebook. Do not mix ingestion and transformation in the same notebook.

  3. Use thin notebooks. The notebook should be orchestration only; business logic lives in imported modules (see the sketch after this list).

  4. Coordinate through PRs. Before starting work on a notebook someone else is editing, communicate in the PR.
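
To illustrate points 1 and 3, here is a sketch (module and function names are hypothetical) of business logic living in a utils module while the notebook stays thin:

# notebooks/utils/player_cleansing.py -- hypothetical module and function names.
# Shared transformation logic lives here, where merges rarely conflict.
from pyspark.sql import DataFrame
from pyspark.sql.functions import col, row_number
from pyspark.sql.window import Window

def deduplicate_players(df: DataFrame) -> DataFrame:
    """Keep only the most recent record per player_id."""
    window = Window.partitionBy("player_id").orderBy(col("updated_at").desc())
    return (
        df.withColumn("_rn", row_number().over(window))
          .filter(col("_rn") == 1)
          .drop("_rn")
    )

# The notebook cell then stays orchestration-only:
#
# # COMMAND ----------
# from utils.player_cleansing import deduplicate_players
# silver_df = deduplicate_players(bronze_df)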


.gitignore Patterns

Fabric-Specific Entries

This POC's .gitignore includes these Fabric-relevant patterns:

# Fabric workspace metadata (generated by extension, workspace-specific)
.fabric/

# Fabric binary exports (use .py source format instead)
*.lakehouse
*.warehouse
*.notebook

# Power BI binary files (use .bim or TMDL for source control)
*.pbix
*.pbit

# Generated data (too large for Git)
data_generation/output/
output/
*.parquet
*.csv

# Exception: tracked schema and config CSVs
!**/schemas/*.csv
!**/config/*.csv
!sample-data/**/*.csv
!sample-data/**/*.parquet

# Platform metadata (regenerated by fabric-cicd)
# Note: .platform files ARE tracked if using Fabric Git integration directly.
# They are NOT tracked if using fabric-cicd exclusively.

What to Track vs. Ignore

| Track (commit) | Ignore (.gitignore) |
| --- | --- |
| .py notebook source | .ipynb checkpoint files |
| .bicep infrastructure | .json ARM compiled output |
| .bim semantic models | .pbix Power BI binaries |
| requirements.txt | .venv/ virtual environment |
| .github/workflows/ | .fabric/ workspace metadata |
| data_generation/generators/ | data_generation/output/ |
| validation/unit_tests/ | .pytest_cache/, htmlcov/ |

Protected Branches and Policies

Branch Protection Rules

Configure these on GitHub for each protected branch:

main branch:

protection_rules:
  require_pull_request:
    required_reviewers: 1
    dismiss_stale_reviews: true
    require_code_owner_review: false
  require_status_checks:
    strict: true
    contexts:
      - "Run Notebook Unit Tests"
      - "Validate Fabric Items"
  require_linear_history: false
  allow_force_pushes: false
  allow_deletions: false

release/staging and release/prod:

protection_rules:
  require_pull_request:
    required_reviewers: 2
  require_status_checks:
    strict: true
  restrict_pushes:
    - team: platform-leads
  deployment_branch_policy:
    protected_branches: true

CODEOWNERS

Create .github/CODEOWNERS:

# Default: platform team reviews everything
*                           @platform-team

# Notebooks: data engineering team
notebooks/                  @data-engineering-team
data_generation/            @data-engineering-team

# Infrastructure: platform team
infra/                      @platform-team
.github/workflows/          @platform-team

# Compliance-sensitive files: require lead review
notebooks/utils/compliance_rules.py  @data-engineering-lead
validation/unit_tests/test_compliance_pipeline.py  @data-engineering-lead

Code Review Checklist

For Notebook PRs

  • Schema: Does the notebook enforce a schema (StructType), or does it rely on inference?
  • Nulls: Are required columns validated for null values?
  • Idempotency: Can the notebook be re-run without creating duplicates?
  • Paths: Are file paths using the standard pattern (Files/output/...)?
  • Secrets: No hardcoded connection strings, passwords, or API keys?
  • PII: SSN hashing uses the FABRIC_POC_HASH_SALT env var, not a hardcoded salt? (See the sketch after this list.)
  • Tests: Unit tests added or updated for new business logic?
  • Compliance: CTR/SAR/W-2G thresholds match regulatory requirements?
  • Documentation: Markdown cells explain the notebook's purpose and data flow?
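
For the PII item, here is a sketch of salted hashing keyed off the environment variable -- illustrative only, not necessarily this POC's exact implementation; it assumes an existing DataFrame df with a raw ssn column:

# Illustrative only: salted SSN hashing driven by an environment variable.
# Assumes an existing DataFrame `df` with a raw `ssn` column.
import os

from pyspark.sql.functions import col, concat, lit, sha2

salt = os.environ["FABRIC_POC_HASH_SALT"]  # never hardcode the salt

df = (
    df.withColumn("ssn_hash", sha2(concat(lit(salt), col("ssn")), 256))
      .drop("ssn")  # drop the raw value once the hash exists
)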

For Bicep PRs

  • Naming: Resources follow Azure naming conventions with project prefix?
  • Parameters: All configurable values are parameterized (not hardcoded)?
  • What-if: az deployment sub what-if output reviewed?
  • Tags: Resources include required tags (environment, project, owner)?
  • Security: No secrets in parameter defaults?

For Data Generator PRs

  • Base class: Generator inherits from BaseGenerator?
  • Type hints: All functions have type annotations?
  • PII: Uses 900-series synthetic SSNs, not Faker?
  • Determinism: Random seed can be set for reproducibility?
  • Tests: Generator tests added in validation/unit_tests/?

Commit Message Convention

This POC follows the Conventional Commits format:

<type>(<scope>): <subject>

[optional body]

[optional footer]

Types

| Type | When |
| --- | --- |
| feat | New feature (notebook, generator, module) |
| fix | Bug fix |
| docs | Documentation only |
| refactor | Code restructuring without behavior change |
| test | Adding or updating tests |
| chore | Build, CI, or tooling changes |

Scopes

| Scope | Covers |
| --- | --- |
| bronze | Bronze layer notebooks |
| silver | Silver layer notebooks |
| gold | Gold layer notebooks |
| infra | Bicep modules, deployment |
| generators | Data generation code |
| ci | GitHub Actions workflows |
| phase<N>/wave<N> | Phase-specific work |

Examples

feat(bronze): add USDA crop production ingestion notebook
fix(silver): correct SSN hashing to use env var salt
docs(best-practices): add dev experience documentation
refactor(generators): extract base validation into BaseGenerator
test(federal): add 54 unit tests for federal agency generators
chore(ci): update actions/checkout to v4
feat(phase14/wave8): add developer experience best practices

Multi-Developer Workflow

Scenario: Two Developers Editing Different Notebooks

This is the simple case. Both developers work on separate feature branches and merge independently:

# Developer A: working on Bronze notebook
git checkout -b feature/bronze-usda
# edit notebooks/bronze/14_bronze_usda_crop.py

# Developer B: working on Gold notebook
git checkout -b feature/gold-usda-kpi
# edit notebooks/gold/14_gold_usda_performance.py

# No conflicts: both PRs merge cleanly to main

Scenario: Two Developers Editing the Same Notebook

Coordinate through communication and small, focused PRs:

# Developer A: adds schema enforcement (merge first)
git checkout -b feature/slot-schema
# edit notebooks/bronze/01_bronze_slot_telemetry.py -- cell 3 only
# PR title: "feat(bronze): add schema enforcement to slot telemetry"

# Developer B: adds compliance flags (merge after A)
git checkout -b feature/slot-compliance
# Wait for A's PR to merge
git pull origin main  # Get A's changes
git rebase main       # Rebase onto A's changes
# edit notebooks/bronze/01_bronze_slot_telemetry.py -- cell 5 only
# PR title: "feat(bronze): add CTR/SAR flags to slot telemetry"

Scenario: Hotfix While Feature Work is in Progress

# Developer has feature work in progress
git stash  # Save current work

# Create hotfix from main
git checkout main
git pull origin main
git checkout -b hotfix/sar-false-positive
# fix, test, commit
git push -u origin hotfix/sar-false-positive
gh pr create --title "fix(compliance): correct SAR detection boundary"

# After hotfix is merged, return to feature work
git checkout feature/my-feature
git stash pop
git rebase main  # Incorporate the hotfix

Troubleshooting

error: failed to push some refs

Cause:  Remote has changes you don't have locally
Fix:    git pull --rebase origin <branch>

CONFLICT (content): Merge conflict in notebooks/...

Cause:  Two branches modified the same notebook cell
Fix:    Open in VS Code, use the merge editor, resolve, then:
        git add <file>
        git commit

fabric-cicd deployment fails after merge

Cause:  Item definition is invalid (broken JSON, missing required fields)
Fix:    Run validation locally:
        python scripts/fabric-cicd-deploy.py --dry-run

PR checks fail but tests pass locally

Cause:  Python version or dependency mismatch between local and CI
Fix:    Ensure you're using Python 3.11 and that installed package versions match CI
        Check: python --version && pip freeze | grep pyspark

Large file rejected by GitHub

Cause:  Parquet or CSV data files accidentally staged
Fix:    git rm --cached <file>
        Verify .gitignore covers the pattern

References