# 📊 Resource Planner - Interactive Capacity Calculator

## 📋 Overview
The Resource Planner is an interactive calculator that helps you optimize Azure Synapse Analytics resource allocation based on your workload characteristics. This demo provides hands-on experience with capacity planning, performance tuning, and cost optimization for compute and storage resources.
## 🎯 Learning Objectives
By completing this interactive demo, you will be able to:
- Calculate optimal Spark pool sizes for your workloads
- Determine appropriate SQL pool DWU settings
- Estimate storage requirements for Data Lake layers
- Balance performance requirements with cost constraints
- Plan for scalability and growth
- Optimize resource allocation across multiple workloads
- Understand cost implications of different configuration choices
## 🎓 Prerequisites

### Knowledge Requirements
- Basic understanding of cloud computing concepts
- Familiarity with Azure Synapse Analytics components
- Understanding of compute vs. storage trade-offs
- Basic knowledge of workload characteristics (batch, streaming, interactive)
### Technical Requirements
- Modern web browser (Chrome, Edge, Firefox, Safari)
- JavaScript enabled
- Stable internet connection
- Calculator or spreadsheet (optional, for comparisons)
### Recommended Experience
- Experience with Azure resource management
- Understanding of performance metrics (latency, throughput)
- Familiarity with cost optimization strategies
- Basic knowledge of capacity planning concepts
## 🚀 Demo Description

### What This Demo Covers
The Resource Planner demo provides interactive calculators for:
- Spark Pool Sizing: Determine node counts, sizes, and autoscale settings
- SQL Pool Configuration: Calculate DWU requirements and performance tiers
- Storage Planning: Estimate Data Lake storage needs and costs
- Cost Optimization: Compare configuration options and identify savings
- Workload Profiling: Match resources to workload characteristics
- Growth Planning: Project future resource needs based on growth rates
### Demo Features

#### Interactive Input Forms
```html
<!-- Workload Characteristics Input -->
<div class="workload-input-form">
  <h3>Define Your Workload</h3>

  <label for="workload-type">Workload Type:</label>
  <select id="workload-type">
    <option value="batch">Batch Processing</option>
    <option value="streaming">Streaming Analytics</option>
    <option value="interactive">Interactive Queries</option>
    <option value="ml">Machine Learning</option>
    <option value="mixed">Mixed Workload</option>
  </select>

  <label for="data-volume">Data Volume (TB/day):</label>
  <input type="number" id="data-volume" min="0.1" step="0.1" value="1.0">

  <label for="query-complexity">Query Complexity:</label>
  <input type="range" id="query-complexity" min="1" max="10" value="5">
  <span id="complexity-label">Medium</span>

  <label for="concurrent-users">Concurrent Users:</label>
  <input type="number" id="concurrent-users" min="1" value="10">

  <label for="sla-requirement">SLA Requirement (ms):</label>
  <input type="number" id="sla-requirement" min="100" step="100" value="1000">
</div>
```
#### Real-Time Cost Calculator
```javascript
// Cost calculation engine
class ResourceCostCalculator {
  constructor() {
    this.pricingData = {
      // illustrative hourly rates per Spark node size (USD)
      sparkNodeHourly: {
        'Small': 0.30,
        'Medium': 0.60,
        'Large': 1.20,
        'XLarge': 2.40
      },
      // illustrative hourly rates per dedicated SQL pool service level (USD)
      sqlDWUHourly: {
        'DW100c': 1.20,
        'DW500c': 6.00,
        'DW1000c': 12.00,
        'DW2000c': 24.00
      },
      storagePerGBMonth: 0.020 // USD per GB per month
    };
  }

  calculateMonthlyCost(config) {
    const sparkCost = this.calculateSparkCost(config.spark);
    const sqlCost = this.calculateSQLCost(config.sql);
    const storageCost = this.calculateStorageCost(config.storage);
    return {
      spark: sparkCost,
      sql: sqlCost,
      storage: storageCost,
      total: sparkCost + sqlCost + storageCost
    };
  }

  calculateSparkCost(sparkConfig) {
    const nodePrice = this.pricingData.sparkNodeHourly[sparkConfig.nodeSize];
    const hoursPerMonth = sparkConfig.hoursPerDay * 30; // ~30 billing days
    return sparkConfig.nodeCount * nodePrice * hoursPerMonth;
  }

  calculateSQLCost(sqlConfig) {
    const hourlyRate = this.pricingData.sqlDWUHourly[sqlConfig.serviceLevel];
    return hourlyRate * sqlConfig.hoursPerMonth;
  }

  calculateStorageCost(storageConfig) {
    return storageConfig.totalGB * this.pricingData.storagePerGBMonth;
  }
}
```
## 📝 Step-by-Step Guide

### Phase 1: Workload Profiling (8 minutes)

#### Step 1: Define Workload Characteristics
- Open the Resource Planner interface
- Navigate to "Workload Profile" tab
- Fill in your workload details:
- Workload Type: Select "Batch Processing"
- Data Volume: Enter "5.0 TB/day"
- Query Complexity: Set slider to "7 (High)"
- Concurrent Users: Enter "25"
- SLA Requirement: Enter "2000 ms"
Expected Outcome: System displays workload classification and resource recommendations
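Under the hood, this classification step is essentially a rules lookup over the form inputs. A minimal sketch in Python (the thresholds and labels are illustrative assumptions, not the tool's actual rules):

```python
# Minimal classification sketch; thresholds are illustrative only.
def classify_workload(workload_type: str, data_volume_tb: float) -> str:
    volume = "High-Volume" if data_volume_tb >= 5 else "Standard"
    labels = {
        "batch": "Batch Processing",
        "streaming": "Streaming Analytics",
        "interactive": "Interactive Queries",
        "ml": "Machine Learning",
        "mixed": "Mixed Workload",
    }
    return f"{volume} {labels[workload_type]}"

print(classify_workload("batch", 5.0))  # High-Volume Batch Processing
```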
#### Step 2: Review Workload Analysis
```text
Workload Analysis Results:
==========================
Classification: High-Volume Batch Processing
Recommended Architecture: Spark-based with Delta Lake
Peak Resource Needs: 8-10 hours daily
Optimization Strategy: Auto-scaling with scheduled start/stop
Key Metrics:
- Daily Processing Window: 8 hours
- Peak Throughput Required: 640 MB/s
- Average Query Time: 45 seconds
- Data Retention: 90 days
```
Action: Note the recommended architecture pattern
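As a quick cross-check, the sustained throughput implied by these metrics follows directly from the data volume and the processing window; the gap to the 640 MB/s peak figure suggests the planner budgets burst headroom above the sustained rate (our reading, not a documented formula):

```python
# Sustained throughput implied by 5 TB processed in an 8-hour window.
data_mb = 5.0 * 1024 * 1024   # TB -> MB
window_s = 8 * 3600

print(f"{data_mb / window_s:.0f} MB/s sustained")  # ~182 MB/s
# 640 MB/s peak / ~182 MB/s sustained ≈ 3.5x burst headroom
```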
#### Step 3: Set Business Constraints
- Click "Business Constraints" section
- Configure constraints:
- Monthly Budget: $5,000
- Performance Priority: High (slider at 80%)
- Cost Priority: Medium (slider at 50%)
- Uptime Requirements: 99.5%
- Click "Apply Constraints"
Expected Outcome: Calculator adjusts recommendations to fit constraints
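One plausible way the priority sliders can be combined is a weighted score over candidate configurations. This sketch illustrates the idea and is not the tool's actual algorithm; the performance ratings are assumed:

```python
# Rank candidate configurations by weighted performance minus cost penalty.
def score(config, perf_weight=0.8, cost_weight=0.5, budget=5000):
    cost_penalty = config["monthly_cost"] / budget
    return perf_weight * config["performance"] - cost_weight * cost_penalty

candidates = [
    {"name": "Budget",      "performance": 0.55, "monthly_cost": 2160},
    {"name": "Balanced",    "performance": 0.80, "monthly_cost": 3744},
    {"name": "Performance", "performance": 0.95, "monthly_cost": 6912},
]
print(max(candidates, key=score)["name"])  # Balanced
```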
### Phase 2: Spark Pool Optimization (10 minutes)

#### Step 4: Calculate Spark Pool Requirements
- Navigate to "Spark Pool Calculator" tab
- Input your processing requirements:
- Data to Process: 5 TB/day
- Processing Window: 8 hours
- Parallelism Needed: High
- Click "Calculate Optimal Configuration"
Calculation Results:

```text
Recommended Spark Pool Configuration:
=====================================
Node Size: Large (16 vCPU, 128 GB RAM)
Min Nodes: 2
Max Nodes: 10
Autoscale: Enabled
Auto-pause: 15 minutes
Performance Metrics:
- Estimated Processing Time: 6.5 hours
- Throughput: 800 MB/s
- Cores Available: 160 (at max scale)
- Memory Available: 1.28 TB (at max scale)
Cost Estimate:
- Average Nodes Used: 6.5
- Active Hours/Day: 8
- Monthly Cost: $3,744
```
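The "at max scale" figures above are simple products of the node count and the node specs:

```python
node_vcpu, node_ram_gb, max_nodes = 16, 128, 10

print(max_nodes * node_vcpu)    # 160 cores
print(max_nodes * node_ram_gb)  # 1280 GB ≈ 1.28 TB
```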
#### Step 5: Compare Configuration Options
- Click "Compare Configurations" button
- Review side-by-side comparison of 3 options:
| Metric | Budget | Balanced | Performance |
|---|---|---|---|
| Node Size | Medium | Large | XLarge |
| Max Nodes | 6 | 10 | 12 |
| Processing Time | 9.5 hrs | 6.5 hrs | 4.2 hrs |
| Monthly Cost | $2,160 | $3,744 | $6,912 |
| Reliability | 95% | 98% | 99.5% |
- Select "Balanced" configuration
- Click "Save Configuration"
#### Step 6: Configure Auto-Scaling Rules

- Open "Auto-Scaling Settings" panel
- Configure scaling triggers:

```yaml
scaling_rules:
  scale_up:
    - metric: executor_cpu_utilization
      threshold: 80
      action: add_2_nodes
      cooldown: 5_minutes
  scale_down:
    - metric: executor_cpu_utilization
      threshold: 30
      action: remove_1_node
      cooldown: 10_minutes
  scheduled_scaling:
    - schedule: "0 6 * * *"   # 6 AM daily
      action: set_min_nodes_8
    - schedule: "0 15 * * *"  # 3 PM daily
      action: set_min_nodes_2
```

- Click "Apply Scaling Rules"
Expected Outcome: Estimated cost drops by 15-20% with smart auto-scaling
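Conceptually, each trigger compares a metric sample against its threshold and applies the action within the pool's node bounds. A minimal sketch of that evaluation loop (the function shape and defaults are our assumptions):

```python
# Evaluate the CPU-based triggers defined in the YAML above.
def evaluate(cpu_utilization: float, current_nodes: int,
             min_nodes: int = 2, max_nodes: int = 10) -> int:
    if cpu_utilization > 80:                      # scale_up threshold
        return min(current_nodes + 2, max_nodes)  # add_2_nodes
    if cpu_utilization < 30:                      # scale_down threshold
        return max(current_nodes - 1, min_nodes)  # remove_1_node
    return current_nodes

print(evaluate(85, 6))  # 8
print(evaluate(25, 6))  # 5
```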
### Phase 3: SQL Pool Configuration (7 minutes)

#### Step 7: Calculate SQL Pool DWU
- Navigate to "SQL Pool Calculator" tab
- Define query workload:
- Query Type: Mixed (70% simple, 30% complex)
- Concurrent Queries: 15
- Average Query Time Target: 3 seconds
- Data Size: 2 TB
- Click "Recommend DWU Level"
Recommendation Output:

```text
SQL Pool Configuration Recommendation:
======================================
Service Level: DW1000c
DWU: 1000
Max Concurrent Queries: 32
Memory per Query: 1 GB
Performance Projections:
- Simple Query Avg Time: 0.8 seconds
- Complex Query Avg Time: 12 seconds
- Mixed Workload Avg Time: 3.4 seconds
- 95th Percentile: 18 seconds
Monthly Cost: $8,640
```
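The monthly figure assumes 24x7 operation and is easy to verify:

```python
# DW1000c at $12.00/hour, running around the clock for a 30-day month.
print(12.00 * 24 * 30)  # 8640.0 -> $8,640/month
```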
#### Step 8: Optimize for Cost
- Click "Cost Optimization" button
- Review optimization suggestions:
- Pause during off-hours: Save 45% ($3,888/month)
- Use lower DWU for dev/test: Save 15% ($1,296/month)
- Implement result caching: Improve performance 20%
- Select optimization options
- View updated cost: $5,752/month (33% savings)
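The pause saving maps directly onto the fraction of hours the pool stays off. A rough sketch, assuming the pool runs about 55% of the week:

```python
full_monthly = 8640          # DW1000c running 24x7
active_fraction = 0.55       # assumed business-hours schedule

savings = full_monthly * (1 - active_fraction)
print(f"${savings:,.0f}/month saved")  # $3,888 (45%)
```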
### Phase 4: Storage Planning (5 minutes)

#### Step 9: Calculate Storage Requirements
- Navigate to "Storage Calculator" tab
- Input data characteristics:
- Daily Ingestion: 5 TB
- Retention Period: 90 days
- Compression Ratio: 3:1 (Delta Lake)
- Data Lake Zones: Bronze, Silver, Gold
- Redundancy: LRS (Locally Redundant)
Storage Calculation:

```text
Data Lake Storage Estimation:
==============================
Bronze Layer (Raw Data):
- Daily: 5 TB
- 90-day retention: 450 TB
- Format: Parquet (compressed)
- Actual Storage: 150 TB
Silver Layer (Cleansed):
- Daily: 4 TB (20% reduction)
- 180-day retention: 720 TB
- Format: Delta Lake
- Actual Storage: 240 TB
Gold Layer (Curated):
- Daily: 1.5 TB (70% aggregation)
- 365-day retention: 547.5 TB
- Format: Delta Lake
- Actual Storage: 182.5 TB
Total Storage Required: 572.5 TB
Monthly Cost Breakdown:
- Bronze: $3,072
- Silver: $4,915
- Gold: $3,736
- Total: $11,723/month
```
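The estimate follows mechanically from daily volume, retention, and compression. This sketch reproduces it at the assumed $0.020/GB-month rate (small differences from the figures above are rounding):

```python
layers = {
    # layer: (TB/day, retention days, compression ratio)
    "Bronze": (5.0, 90, 3),
    "Silver": (4.0, 180, 3),
    "Gold":   (1.5, 365, 3),
}
total_tb = total_cost = 0.0
for name, (daily_tb, days, ratio) in layers.items():
    stored_tb = daily_tb * days / ratio
    cost = stored_tb * 1024 * 0.020       # GB x $/GB-month
    total_tb += stored_tb
    total_cost += cost
    print(f"{name}: {stored_tb:.1f} TB -> ${cost:,.0f}/month")
print(f"Total: {total_tb:.1f} TB -> ${total_cost:,.0f}/month")
```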
#### Step 10: Optimize Storage Costs
- Click "Storage Optimization Wizard"
- Review optimization recommendations:
- Tier bronze data to Cool after 30 days: Save 25%
- Archive silver data after 120 days: Save 15%
- Enable data lifecycle policies: Save 10%
- Apply optimizations
- New monthly cost: $7,969 (32% savings)
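In Azure Storage, tiering and archival rules like these are expressed as lifecycle management policies. A sketch of what the wizard's first two recommendations might translate to (the container prefixes are hypothetical):

```json
{
  "rules": [
    {
      "enabled": true,
      "name": "cool-bronze-after-30-days",
      "type": "Lifecycle",
      "definition": {
        "actions": {
          "baseBlob": { "tierToCool": { "daysAfterModificationGreaterThan": 30 } }
        },
        "filters": { "blobTypes": ["blockBlob"], "prefixMatch": ["datalake/bronze/"] }
      }
    },
    {
      "enabled": true,
      "name": "archive-silver-after-120-days",
      "type": "Lifecycle",
      "definition": {
        "actions": {
          "baseBlob": { "tierToArchive": { "daysAfterModificationGreaterThan": 120 } }
        },
        "filters": { "blobTypes": ["blockBlob"], "prefixMatch": ["datalake/silver/"] }
      }
    }
  ]
}
```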
### Phase 5: Complete Resource Plan (10 minutes)

#### Step 11: Generate Full Resource Plan
- Navigate to "Summary" tab
- Review consolidated resource plan:
```text
Azure Synapse Analytics Resource Plan
======================================
Compute Resources:
------------------
Spark Pools:
- Pool Name: primary-pool
- Node Size: Large
- Nodes: 2-10 (autoscale)
- Monthly Cost: $3,744
SQL Pools:
- Pool Name: dwh-pool
- DWU Level: DW1000c
- Pause Schedule: Off-hours
- Monthly Cost: $5,752
Storage Resources:
------------------
Data Lake Storage:
- Total Capacity: 572.5 TB
- Tiering: Enabled
- Lifecycle Policies: Active
- Monthly Cost: $7,969
Total Monthly Cost: $17,465
Annual Cost: $209,580
Cost vs. Budget:
Budget: $5,000/month ⚠️ OVER BUDGET
Actual: $17,465/month
Variance: +249%
Recommendations:
1. Consider phased implementation
2. Review data retention policies
3. Optimize query patterns to reduce DWU needs
4. Implement more aggressive auto-scaling
```
#### Step 12: Scenario Planning
- Click "Scenario Analysis" button
- Compare multiple scenarios:
Scenario 1: Aggressive Cost Optimization
- Reduce Spark pool max nodes to 6
- Lower SQL DWU to DW500c
- Reduce retention to 60 days
- Monthly Cost: $8,950 (closest to the $5,000 budget, though still above it)
- Trade-off: 30% longer processing times

Scenario 2: Performance Optimized
- Increase Spark pool max nodes to 15
- Raise SQL DWU to DW2000c
- Full retention policies
- Monthly Cost: $24,600
- Trade-off: 40% faster, higher reliability

Scenario 3: Balanced Approach
- Current configuration with optimizations
- Scheduled scaling
- Smart tiering
- Monthly Cost: $14,200
- Trade-off: Good performance, manageable cost
- Select "Scenario 3: Balanced"
- Click "Export Resource Plan"
#### Step 13: Generate Cost Forecast
- Navigate to "Cost Forecast" section
- Input growth parameters:
- Data Growth Rate: 15% per quarter
- User Growth: 10% per quarter
- New Workloads: 2 per year
- Click "Generate 12-Month Forecast"
Forecast Output:

```text
12-Month Cost Forecast:
=======================
Q1 2025: $14,200/month
Q2 2025: $16,330/month (+15%)
Q3 2025: $18,780/month (+15%)
Q4 2025: $21,597/month (+15%)
Year-End Projection:
- Total Year 1 Cost: $212,620
- Average Monthly: $17,718
Growth Drivers:
1. Data volume increase (60% of growth)
2. Additional workloads (25% of growth)
3. User expansion (15% of growth)
Optimization Opportunities:
- Committed use discounts: Save 15%
- Reserved capacity: Save 20%
- Enterprise agreement: Save 10%
Potential Savings: $42,524/year
```
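The quarterly figures are straight 15% compound growth, rounded each quarter:

```python
monthly = 14200
for quarter in ("Q1", "Q2", "Q3", "Q4"):
    print(f"{quarter}: ${monthly:,}/month")
    monthly = round(monthly * 1.15)
# Q1: $14,200  Q2: $16,330  Q3: $18,780  Q4: $21,597
```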
## 🛠️ Technical Implementation Notes

### Calculation Engine
```python
# Python-based Resource Calculator
from dataclasses import dataclass
from typing import List, Dict
import math
@dataclass
class WorkloadProfile:
"""Workload characteristics for resource planning"""
data_volume_tb: float
processing_window_hours: int
query_complexity: int # 1-10 scale
concurrent_users: int
sla_ms: int
workload_type: str
class SparkPoolCalculator:
"""Calculate optimal Spark pool configuration"""
def __init__(self):
self.node_specs = {
'Small': {'vcpu': 4, 'memory_gb': 32, 'cost_per_hour': 0.30},
'Medium': {'vcpu': 8, 'memory_gb': 64, 'cost_per_hour': 0.60},
'Large': {'vcpu': 16, 'memory_gb': 128, 'cost_per_hour': 1.20},
'XLarge': {'vcpu': 32, 'memory_gb': 256, 'cost_per_hour': 2.40}
}
def calculate_node_requirements(self, workload: WorkloadProfile) -> Dict:
"""Calculate required nodes based on workload"""
        # Required sustained throughput in MB/s (TB -> MB is x 1024 x 1024)
        throughput_required_mb_s = (workload.data_volume_tb * 1024 * 1024) / (
            workload.processing_window_hours * 3600
        )
# Determine node size based on query complexity
if workload.query_complexity <= 3:
node_size = 'Small'
elif workload.query_complexity <= 6:
node_size = 'Medium'
elif workload.query_complexity <= 8:
node_size = 'Large'
else:
node_size = 'XLarge'
node_spec = self.node_specs[node_size]
        # Calculate number of nodes. Assume ~100 MB/s per node at maximum
        # complexity (10), scaling capacity up for simpler workloads
        processing_capacity_per_node = 100 * (10 / workload.query_complexity)
min_nodes = math.ceil(throughput_required_mb_s / processing_capacity_per_node)
max_nodes = min_nodes * 3 # Allow 3x scaling
# Ensure minimum of 2 nodes for reliability
min_nodes = max(2, min_nodes)
return {
'node_size': node_size,
'min_nodes': min_nodes,
'max_nodes': max_nodes,
'vcpu_per_node': node_spec['vcpu'],
'memory_per_node_gb': node_spec['memory_gb'],
'cost_per_hour': node_spec['cost_per_hour'],
'estimated_processing_time_hours': self._estimate_processing_time(
workload, min_nodes, node_size
)
}
def _estimate_processing_time(self, workload: WorkloadProfile,
nodes: int, node_size: str) -> float:
"""Estimate job processing time"""
node_throughput = self.node_specs[node_size]['vcpu'] * 25 # MB/s per node
total_throughput = node_throughput * nodes
data_to_process_mb = workload.data_volume_tb * 1024 * 1024
processing_time_hours = data_to_process_mb / (total_throughput * 3600)
# Add overhead for complexity
complexity_factor = 1 + (workload.query_complexity / 10)
return processing_time_hours * complexity_factor
class SQLPoolCalculator:
"""Calculate optimal SQL Pool DWU"""
def __init__(self):
self.dwu_levels = {
'DW100c': {'dw': 100, 'memory_gb': 60, 'cost_per_hour': 1.20},
'DW500c': {'dw': 500, 'memory_gb': 300, 'cost_per_hour': 6.00},
'DW1000c': {'dw': 1000, 'memory_gb': 600, 'cost_per_hour': 12.00},
'DW2000c': {'dw': 2000, 'memory_gb': 1200, 'cost_per_hour': 24.00},
}
def recommend_dwu(self, workload: WorkloadProfile) -> Dict:
"""Recommend DWU level based on workload"""
# Calculate required concurrency slots
required_slots = workload.concurrent_users * 1.2 # 20% buffer
# Calculate memory requirements (rough estimate)
data_size_gb = workload.data_volume_tb * 1024
memory_needed = (data_size_gb * 0.1) + (required_slots * 1) # GB
# Find appropriate DWU level
for level, specs in sorted(self.dwu_levels.items(),
key=lambda x: x[1]['dw']):
if specs['memory_gb'] >= memory_needed:
return {
'level': level,
'dwu': specs['dw'],
'memory_gb': specs['memory_gb'],
                    # rough heuristic; the service caps actual concurrency
                    # lower at higher levels (e.g., 32 queries at DW1000c)
                    'max_concurrent_queries': specs['dw'] // 100 * 4,
'cost_per_hour': specs['cost_per_hour'],
'monthly_cost_24x7': specs['cost_per_hour'] * 730
}
        # Fall back to the largest level if nothing fits
        specs = self.dwu_levels['DW2000c']
        return {
            'level': 'DW2000c',
            'dwu': specs['dw'],
            'memory_gb': specs['memory_gb'],
            'max_concurrent_queries': specs['dw'] // 100 * 4,
            'cost_per_hour': specs['cost_per_hour'],
            'monthly_cost_24x7': specs['cost_per_hour'] * 730
        }
```
### Cost Optimization Algorithms

```python
class CostOptimizer:
    """Optimize resource configuration for cost.

    Savings helpers such as _calculate_pause_savings and
    _apply_optimizations are assumed to be defined elsewhere;
    only the selection logic is shown here.
    """
def optimize_spark_pool(self, config: Dict, constraints: Dict) -> Dict:
"""Optimize Spark pool configuration"""
optimizations = []
# Auto-pause optimization
if config['active_hours_per_day'] < 24:
savings = self._calculate_pause_savings(config)
optimizations.append({
'strategy': 'Auto-pause during idle time',
'savings_percent': savings['percent'],
'savings_monthly': savings['amount']
})
# Right-sizing optimization
utilization = config.get('average_utilization', 0.5)
if utilization < 0.5:
savings = self._calculate_rightsizing_savings(config, utilization)
optimizations.append({
'strategy': 'Right-size nodes based on utilization',
'savings_percent': savings['percent'],
'savings_monthly': savings['amount']
})
# Scheduled scaling optimization
if self._has_predictable_pattern(config):
savings = self._calculate_scheduled_scaling_savings(config)
optimizations.append({
'strategy': 'Implement scheduled auto-scaling',
'savings_percent': savings['percent'],
'savings_monthly': savings['amount']
})
return {
'original_cost': config['monthly_cost'],
'optimizations': optimizations,
'optimized_cost': self._apply_optimizations(config, optimizations),
'total_savings': sum(o['savings_monthly'] for o in optimizations)
        }
```
### Forecasting Model

```python
class ResourceForecaster:
    """Forecast future resource needs and costs"""

    def __init__(self, spark_calculator, sql_calculator, storage_calculator):
        self.spark_calculator = spark_calculator
        self.sql_calculator = sql_calculator
        self.storage_calculator = storage_calculator
def forecast_growth(self, current_config: Dict,
growth_params: Dict,
months: int = 12) -> List[Dict]:
"""Generate resource and cost forecast"""
forecasts = []
config = current_config.copy()
for month in range(1, months + 1):
            # Apply growth factors (rates are per month; convert quarterly
            # figures, e.g. 15%/quarter ≈ 4.8%/month, before calling)
config['data_volume_tb'] *= (1 + growth_params['data_growth_rate'])
config['concurrent_users'] *= (1 + growth_params['user_growth_rate'])
# Recalculate resources
spark_config = self.spark_calculator.calculate_node_requirements(config)
sql_config = self.sql_calculator.recommend_dwu(config)
storage_config = self.storage_calculator.estimate_capacity(config)
# Calculate costs
total_cost = (
spark_config['monthly_cost'] +
sql_config['monthly_cost_24x7'] +
storage_config['monthly_cost']
)
forecasts.append({
'month': month,
'data_volume_tb': config['data_volume_tb'],
'concurrent_users': config['concurrent_users'],
'spark_cost': spark_config['monthly_cost'],
'sql_cost': sql_config['monthly_cost_24x7'],
'storage_cost': storage_config['monthly_cost'],
'total_cost': total_cost,
'config_changes': self._detect_config_changes(config, spark_config, sql_config)
})
        return forecasts
```
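A hypothetical wiring of the forecaster follows. `StorageCalculator` is not shown above, the calculators are assumed to accept the plain-dict config, and the demo's quarterly growth is converted to monthly rates (1.15^(1/3) - 1 ≈ 0.048):

```python
forecaster = ResourceForecaster(
    spark_calculator=SparkPoolCalculator(),
    sql_calculator=SQLPoolCalculator(),
    storage_calculator=StorageCalculator(),  # assumed, not defined above
)
forecast = forecaster.forecast_growth(
    current_config={'data_volume_tb': 5.0, 'concurrent_users': 25},
    growth_params={'data_growth_rate': 0.048, 'user_growth_rate': 0.032},
    months=12,
)
print(forecast[-1]['total_cost'])  # projected month-12 spend
```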
### Interactive UI Components

```jsx
// React component for Resource Planner
import React, { useState, useEffect } from 'react';
import { LineChart, Line, XAxis, YAxis, Tooltip, Legend } from 'recharts';
const ResourcePlanner = () => {
const [workload, setWorkload] = useState({
dataVolumeTB: 5.0,
processingWindowHours: 8,
queryComplexity: 5,
concurrentUsers: 25,
slaMs: 2000
});
const [recommendation, setRecommendation] = useState(null);
const [forecast, setForecast] = useState([]);
useEffect(() => {
// Fetch recommendations when workload changes
fetchRecommendations();
}, [workload]);
const fetchRecommendations = async () => {
const response = await fetch('/api/resource-planner/recommend', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(workload)
});
const data = await response.json();
setRecommendation(data);
};
const generateForecast = async (growthParams) => {
const response = await fetch('/api/resource-planner/forecast', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ workload, growthParams, months: 12 })
});
const data = await response.json();
setForecast(data);
};
return (
<div className="resource-planner">
<WorkloadForm workload={workload} onChange={setWorkload} />
<RecommendationPanel recommendation={recommendation} />
<CostForecastChart data={forecast} />
<OptimizationSuggestions config={recommendation} />
</div>
);
};

export default ResourcePlanner;
```
## 🎯 Key Takeaways
After completing this demo, you should understand:
- Capacity Planning: How to calculate resource requirements for different workloads
- Cost Optimization: Techniques to reduce costs while maintaining performance
- Trade-offs: Balance between performance, reliability, and cost
- Scaling Strategies: When to use auto-scaling vs. manual scaling
- Storage Management: Optimize storage costs with tiering and lifecycle policies
- Forecasting: Project future needs and budget accordingly
## 🔗 Related Resources

### Documentation
- Cost Optimization Best Practices
- Performance Optimization Guide
- Spark Pool Configuration
- SQL Pool Sizing Guide
### Tutorials
- Right-Sizing Resources Tutorial
- Auto-Scaling Configuration
- Cost Monitoring Setup
## ❓ FAQ
**Q: How accurate are the cost estimates?**
A: Cost estimates are based on current Azure pricing and typical usage patterns. Actual costs may vary based on your specific usage, region, and enterprise agreements.

**Q: Can I import my current resource configuration?**
A: Yes, the tool supports importing configurations from Azure Resource Manager (ARM) templates or directly from your Azure subscription.

**Q: Does this account for reserved capacity discounts?**
A: The calculator can factor in reserved capacity discounts when you enable that option in the pricing settings.

**Q: How often should I review my resource plan?**
A: Review your resource plan quarterly, or whenever workload characteristics or business requirements change significantly.
## 💬 Feedback
Help us improve this resource planner!
Next Steps:

- Try the Cost Calculator
- Explore the Migration Assessment Wizard
- Review the Performance Optimization Guide
Last Updated: December 2025 | Version: 1.0.0