Tutorial 10: Data Marketplace — Publish, Discover, and Consume Data Products¶
Estimated Time: 60 minutes | Path: F (Platform Operations) | Prerequisites: Tutorial 01
graph LR
subgraph "Data Producers"
DT[Domain Teams] -->|register| MP[Marketplace API]
end
subgraph "Marketplace"
MP --> CO[Cosmos DB]
MP --> PV[Purview Catalog]
MP --> QE[Quality Engine]
end
subgraph "Consumers"
MP -->|discover| U[Users]
U -->|request access| AR[Approval Workflow]
AR -->|grant| RBAC[Azure RBAC]
end What You'll Build¶
A self-service data marketplace where domain teams publish data products and consumers discover, request access to, and monitor quality scores.
Prerequisites¶
- Tutorial 01 completed (foundation platform deployed)
- Azure Cosmos DB account (deployed with foundation)
- Microsoft Purview instance configured
Environment Variables¶
export COSMOS_ENDPOINT="https://<your-cosmos>.documents.azure.com:443/"
export PURVIEW_ENDPOINT="https://<your-purview>.purview.azure.com"
export AZURE_OPENAI_ENDPOINT="https://<your-openai>.openai.azure.com/"
Step 1: Understand the Data Product Schema¶
The marketplace uses a structured schema for data products. Each product includes metadata, quality scores, access policies, and lineage.
# Data Product structure (from portal.shared.api.models.marketplace)
{
"id": "dp-finance-revenue",
"name": "Revenue Summary",
"domain": "finance",
"layer": "gold",
"description": "Aggregated revenue by region and quarter",
"owner": {
"team": "Finance Data Team",
"email": "finance-data@contoso.com"
},
"schema": {
"format": "delta",
"location": "abfss://gold@storage.dfs.core.windows.net/finance/revenue_summary",
"columns": [
{"name": "revenue", "type": "decimal(18,2)", "description": "Total revenue"},
{"name": "region", "type": "string", "description": "Geographic region"},
{"name": "quarter", "type": "string", "description": "Fiscal quarter"}
]
},
"quality": {
"overall_score": 87.3,
"completeness": 92.1,
"accuracy": 88.5,
"timeliness": 85.2,
"gate_status": "PASS"
},
"sla": {
"freshness_hours": 24,
"uptime_percent": 99.5
},
"classification": ["Financial Data", "Confidential"],
"status": "certified"
}
Expected Output: Understanding of the data product model.
Step 2: Register a Data Product¶
Use the marketplace API to register a new data product.
# The marketplace API runs as part of the portal
# Start the portal locally for development
cd portal
pip install -e ".[dev]"
uvicorn portal.shared.api.main:app --reload --port 8000
# Register a data product
curl -X POST http://localhost:8000/api/v1/marketplace/products \
-H "Content-Type: application/json" \
-d '{
"name": "Air Quality Index",
"domain": "environmental",
"layer": "gold",
"description": "Daily air quality measurements by county from EPA AQS",
"owner": {"team": "Environmental Team", "email": "env@contoso.com"},
"schema": {
"format": "delta",
"location": "abfss://gold@storage.dfs.core.windows.net/environmental/air_quality_index",
"columns": [
{"name": "aqi", "type": "integer", "description": "Air Quality Index value"},
{"name": "county_fips", "type": "string", "description": "County FIPS code"},
{"name": "pollutant", "type": "string", "description": "Primary pollutant"},
{"name": "date", "type": "date", "description": "Measurement date"}
]
},
"classification": ["Public Data"],
"status": "endorsed"
}'
Expected Output:
{
"id": "dp-environmental-air-quality",
"status": "endorsed",
"created_at": "2026-04-22T...",
"quality": { "overall_score": null, "gate_status": "PENDING" }
}
Step 3: Run Quality Assessment¶
Trigger a quality assessment for the registered product.
curl -X POST http://localhost:8000/api/v1/marketplace/products/dp-environmental-air-quality/quality \
-H "Content-Type: application/json" \
-d '{"suite": "gold_standard"}'
The quality engine runs Great Expectations checks:
- Completeness: % of non-null values across required columns
- Accuracy: Business rule validation (AQI between 0-500, valid FIPS codes)
- Timeliness: Time since last data refresh vs SLA
- Consistency: Cross-table referential integrity checks
Expected Output:
{
"overall_score": 91.2,
"completeness": 95.0,
"accuracy": 92.1,
"timeliness": 88.0,
"consistency": 89.5,
"gate_status": "PASS",
"details": {
"checks_passed": 14,
"checks_failed": 1,
"checks_warning": 2
}
}
Step 4: Search and Discover Products¶
# Search by keyword
curl "http://localhost:8000/api/v1/marketplace/products?q=environmental"
# Filter by domain
curl "http://localhost:8000/api/v1/marketplace/products?domain=finance&status=certified"
# Get product details
curl "http://localhost:8000/api/v1/marketplace/products/dp-finance-revenue"
Expected Output:
{
"results": [
{
"id": "dp-environmental-air-quality",
"name": "Air Quality Index",
"domain": "environmental",
"quality": { "overall_score": 91.2, "gate_status": "PASS" },
"status": "endorsed"
}
],
"total": 1
}
Step 5: Request Access¶
Consumers request access through the approval workflow.
curl -X POST http://localhost:8000/api/v1/marketplace/access-requests \
-H "Content-Type: application/json" \
-d '{
"product_id": "dp-environmental-air-quality",
"requestor": "analyst@contoso.com",
"justification": "Need AQI data for county health impact analysis",
"access_level": "read",
"duration_days": 90
}'
Expected Output:
{
"request_id": "ar-001",
"status": "pending_approval",
"approver": "env@contoso.com",
"created_at": "2026-04-22T..."
}
The owner receives a notification and can approve/deny via the portal or API.
Step 6: Approve and Grant Access¶
# Owner approves the request
curl -X PUT http://localhost:8000/api/v1/marketplace/access-requests/ar-001 \
-H "Content-Type: application/json" \
-d '{"status": "approved", "notes": "Approved for Q2 analysis"}'
On approval, the system:
- Creates Azure RBAC role assignment for the requestor
- Adds reader access to the storage container
- Registers the access grant in Purview
- Sends notification to the requestor
Step 7: Purview Integration¶
The marketplace syncs with Microsoft Purview for unified governance.
"""Sync marketplace products to Purview catalog."""
from azure.identity import DefaultAzureCredential
import requests
credential = DefaultAzureCredential()
token = credential.get_token("https://purview.azure.net/.default").token
# Create Purview entity for the data product
entity = {
"typeName": "azure_datalake_gen2_path",
"attributes": {
"qualifiedName": "abfss://gold@storage.dfs.core.windows.net/environmental/air_quality_index",
"name": "Air Quality Index",
"description": "Daily air quality measurements by county from EPA AQS",
},
"classifications": [{"typeName": "Microsoft.Public"}],
}
resp = requests.post(
f"{PURVIEW_ENDPOINT}/datamap/api/atlas/v2/entity",
headers={"Authorization": f"Bearer {token}"},
json={"entity": entity},
)
print(f"Purview entity created: {resp.json()['guidAssignments']}")
Step 8: Monitor Data Product Health¶
Set up ongoing monitoring for your registered products.
# Get product health dashboard
curl "http://localhost:8000/api/v1/marketplace/products/dp-environmental-air-quality/health"
Expected Output:
{
"product_id": "dp-environmental-air-quality",
"health": {
"quality_trend": [91.2, 90.8, 92.1, 91.5],
"freshness_status": "on_time",
"last_refresh": "2026-04-22T06:00:00Z",
"consumers": 3,
"active_access_grants": 2,
"sla_compliance": 99.8
}
}
Troubleshooting¶
| Issue | Cause | Solution |
|---|---|---|
401 Unauthorized | Missing or expired token | Run az login and retry |
Quality score null | Assessment not yet run | Trigger quality check (Step 3) |
| Purview sync failed | Network/permission issue | Check Purview RBAC roles |
| Access request stuck | No approver configured | Set product owner email |
What's Next¶
- Tutorial 01: Foundation Platform — deploy the base infrastructure
- Tutorial 02: Data Governance — configure Purview policies
- Tutorial 06: AI Analytics — add AI-powered insights