Skip to content

Cost Optimization Best Practices

Home | Best Practices | Cost Optimization

Status

Best practices for optimizing costs in Cloud Scale Analytics.


Overview

Cost optimization is achieved through:

  • Right-sizing resources
  • Leveraging auto-scaling and auto-pause
  • Using reserved capacity for predictable workloads
  • Implementing efficient data architectures

For detailed guidance, see: Full Cost Optimization Guide


Quick Wins

1. Auto-Pause Dedicated SQL Pools

# Configure auto-pause (15 minutes inactivity)
az synapse sql pool update \
    --name dedicated-pool \
    --workspace-name synapse-ws \
    --resource-group rg-analytics \
    --enable-auto-pause true \
    --auto-pause-delay 15

2. Use Serverless SQL for Ad-Hoc Queries

-- Query external data without dedicated resources
SELECT
    CustomerSegment,
    COUNT(*) AS CustomerCount,
    AVG(TotalPurchases) AS AvgPurchases
FROM OPENROWSET(
    BULK 'https://storage.dfs.core.windows.net/data/customers/*.parquet',
    FORMAT = 'PARQUET'
) AS customers
GROUP BY CustomerSegment;

3. Implement Tiered Storage

Tier Use Case Cost
Hot Active data (< 30 days) $$$$
Cool Infrequent access (30-90 days) $$$
Archive Compliance/historical (> 90 days) $

4. Schedule Spark Clusters

# Auto-terminate idle clusters
cluster_config = {
    "autotermination_minutes": 30,
    "enable_autoscaling": True,
    "min_workers": 1,
    "max_workers": 10
}

Cost Monitoring

Budget Alerts

az consumption budget create \
    --budget-name monthly-analytics \
    --amount 10000 \
    --time-grain Monthly \
    --start-date 2025-01-01 \
    --end-date 2025-12-31 \
    --notifications '{
        "actual_vs_budget_80_percent": {
            "enabled": true,
            "operator": "GreaterThan",
            "threshold": 80,
            "contactEmails": ["team@company.com"]
        }
    }'

Cost Analysis Query

// Monthly cost breakdown by service
AzureDiagnostics
| where TimeGenerated > ago(30d)
| summarize
    EstimatedCost = sum(toreal(Quantity) * toreal(UnitPrice))
    by ResourceType, bin(TimeGenerated, 1d)
| render timechart


Last Updated: January 2025