Skip to content

⏰ Tutorial 5: Windowing Functions

Tutorial Duration Level

Master time-based windowing functions for temporal aggregations. Learn tumbling, hopping, sliding, and session windows to analyze streaming data over time intervals.

🎯 Learning Objectives

After completing this tutorial, you will be able to:

  • Use tumbling windows for non-overlapping fixed intervals
  • Implement hopping windows for overlapping time analysis
  • Create sliding windows for continuous event-driven aggregations
  • Apply session windows for activity-based grouping
  • Choose appropriate window types for different scenarios
  • Handle late-arriving events in windowed queries

⏱️ Time Estimate: 40 minutes

  • Tumbling Windows: 10 minutes
  • Hopping Windows: 10 minutes
  • Sliding & Session Windows: 15 minutes
  • Advanced Patterns: 5 minutes

📋 Prerequisites

⏰ Window Types Overview

Window Comparison

Window Type Overlapping Trigger Use Case
Tumbling No Fixed interval Hourly reports, batch processing
Hopping Yes Fixed interval Moving averages, trend analysis
Sliding Yes Event arrival Real-time alerts, continuous metrics
Session No Inactivity gap User sessions, activity tracking

📊 Step 1: Tumbling Windows

1.1 Basic Tumbling Window

Non-overlapping, fixed-size time windows:

-- Calculate metrics every 5 minutes (no overlap)
SELECT
    deviceId,
    location,
    COUNT(*) AS eventCount,
    AVG(temperature) AS avgTemperature,
    MIN(temperature) AS minTemperature,
    MAX(temperature) AS maxTemperature,
    System.Timestamp() AS windowEnd
INTO
    SqlOutput
FROM
    EventHubInput TIMESTAMP BY timestamp
GROUP BY
    deviceId,
    location,
    TumblingWindow(minute, 5);

Window Behavior: - Events from 10:00:00 - 10:04:59 → Window 1 - Events from 10:05:00 - 10:09:59 → Window 2 - No events belong to multiple windows

1.2 Multiple Tumbling Windows

Different intervals for different analyses:

-- 1-minute tumbling window for real-time monitoring
SELECT
    'RealTime' AS reportType,
    COUNT(DISTINCT deviceId) AS activeDevices,
    AVG(temperature) AS avgTemp,
    System.Timestamp() AS windowEnd
INTO
    BlobOutput
FROM
    EventHubInput TIMESTAMP BY timestamp
GROUP BY
    TumblingWindow(minute, 1);

-- 1-hour tumbling window for historical trends
SELECT
    'Hourly' AS reportType,
    DATEPART(hour, System.Timestamp()) AS hourOfDay,
    COUNT(*) AS totalEvents,
    AVG(temperature) AS avgTemp,
    STDEV(temperature) AS tempStdDev,
    System.Timestamp() AS windowEnd
INTO
    SqlOutput
FROM
    EventHubInput TIMESTAMP BY timestamp
GROUP BY
    TumblingWindow(hour, 1);

1.3 Tumbling Window with Complex Aggregations

-- Comprehensive 10-minute device health summary
SELECT
    deviceId,
    location,
    System.Timestamp() AS windowEnd,
    -- Event metrics
    COUNT(*) AS totalEvents,
    COUNT(DISTINCT eventType) AS distinctEventTypes,
    -- Temperature statistics
    AVG(temperature) AS avgTemp,
    MIN(temperature) AS minTemp,
    MAX(temperature) AS maxTemp,
    STDEV(temperature) AS tempVariance,
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY temperature) OVER() AS medianTemp,
    -- Status tracking
    SUM(CASE WHEN status = 'normal' THEN 1 ELSE 0 END) AS normalEvents,
    SUM(CASE WHEN status = 'warning' THEN 1 ELSE 0 END) AS warningEvents,
    SUM(CASE WHEN status = 'critical' THEN 1 ELSE 0 END) AS criticalEvents,
    -- Health score
    CASE
        WHEN AVG(vibration) < 0.5 AND AVG(temperature) BETWEEN 68 AND 75 THEN 100
        WHEN AVG(vibration) < 1.0 THEN 75
        WHEN AVG(vibration) < 1.5 THEN 50
        ELSE 25
    END AS healthScore
INTO
    SqlOutput
FROM
    EventHubInput TIMESTAMP BY timestamp
GROUP BY
    deviceId,
    location,
    TumblingWindow(minute, 10);

🔄 Step 2: Hopping Windows

2.1 Basic Hopping Window

Overlapping windows for trend analysis:

-- 10-minute window, hopping every 5 minutes (50% overlap)
SELECT
    deviceId,
    COUNT(*) AS eventCount,
    AVG(temperature) AS avgTemperature,
    System.Timestamp() AS windowEnd
INTO
    BlobOutput
FROM
    EventHubInput TIMESTAMP BY timestamp
GROUP BY
    deviceId,
    HoppingWindow(minute, 10, 5);  -- (window size, hop size)

Window Behavior: - Window 1: 10:00:00 - 10:09:59 - Window 2: 10:05:00 - 10:14:59 (overlaps with Window 1) - Events in overlap period belong to both windows

2.2 Moving Average with Hopping Window

Calculate rolling averages:

-- 1-hour moving average, updated every 15 minutes
SELECT
    deviceId,
    location,
    AVG(temperature) AS movingAvgTemp,
    AVG(humidity) AS movingAvgHumidity,
    AVG(vibration) AS movingAvgVibration,
    MIN(temperature) AS minTemp,
    MAX(temperature) AS maxTemp,
    COUNT(*) AS sampleCount,
    System.Timestamp() AS windowEnd,
    DATEDIFF(minute, MIN(timestamp), System.Timestamp()) AS windowDurationMinutes
INTO
    SqlOutput
FROM
    EventHubInput TIMESTAMP BY timestamp
GROUP BY
    deviceId,
    location,
    HoppingWindow(hour, 1, 15);  -- 1 hour window, hop every 15 min

2.3 Trend Detection with Hopping Windows

Identify increasing/decreasing trends:

-- Detect temperature trends using overlapping windows
WITH WindowedData AS (
    SELECT
        deviceId,
        AVG(temperature) AS avgTemp,
        System.Timestamp() AS windowEnd
    FROM
        EventHubInput TIMESTAMP BY timestamp
    GROUP BY
        deviceId,
        HoppingWindow(minute, 10, 5)
)
SELECT
    deviceId,
    avgTemp AS currentAvgTemp,
    LAG(avgTemp, 1) OVER (PARTITION BY deviceId LIMIT DURATION(minute, 10)) AS previousAvgTemp,
    CASE
        WHEN avgTemp > LAG(avgTemp, 1) OVER (PARTITION BY deviceId LIMIT DURATION(minute, 10)) + 2 THEN 'Increasing'
        WHEN avgTemp < LAG(avgTemp, 1) OVER (PARTITION BY deviceId LIMIT DURATION(minute, 10)) - 2 THEN 'Decreasing'
        ELSE 'Stable'
    END AS tempTrend,
    windowEnd
INTO
    BlobOutput
FROM
    WindowedData;

📈 Step 3: Sliding Windows

3.1 Basic Sliding Window

Event-driven windows (re-compute on every event):

-- Alert when average temperature exceeds threshold in last 5 minutes
SELECT
    deviceId,
    location,
    AVG(temperature) AS avgTempLast5Min,
    COUNT(*) AS eventCount,
    System.Timestamp() AS alertTime
INTO
    BlobOutput
FROM
    EventHubInput TIMESTAMP BY timestamp
GROUP BY
    deviceId,
    location,
    SlidingWindow(minute, 5)
HAVING
    AVG(temperature) > 80;

Window Behavior: - Recomputed on EVERY event arrival - Looks back 5 minutes from each event - Useful for immediate alerting

3.2 Continuous Anomaly Detection

Detect anomalies using sliding window statistics:

-- Detect temperature spikes using sliding window
SELECT
    deviceId,
    location,
    temperature AS currentTemp,
    AVG(temperature) OVER (PARTITION BY deviceId LIMIT DURATION(minute, 10)) AS avgTempLast10Min,
    STDEV(temperature) OVER (PARTITION BY deviceId LIMIT DURATION(minute, 10)) AS stdDevLast10Min,
    CASE
        WHEN temperature > AVG(temperature) OVER (PARTITION BY deviceId LIMIT DURATION(minute, 10)) +
                           (2 * STDEV(temperature) OVER (PARTITION BY deviceId LIMIT DURATION(minute, 10)))
        THEN 'Spike Detected'
        WHEN temperature < AVG(temperature) OVER (PARTITION BY deviceId LIMIT DURATION(minute, 10)) -
                           (2 * STDEV(temperature) OVER (PARTITION BY deviceId LIMIT DURATION(minute, 10)))
        THEN 'Drop Detected'
        ELSE 'Normal'
    END AS anomalyStatus,
    System.Timestamp() AS detectionTime
INTO
    BlobOutput
FROM
    EventHubInput TIMESTAMP BY timestamp
WHERE
    temperature IS NOT NULL;

3.3 Real-Time Threshold Alerting

Continuous monitoring with sliding windows:

-- Alert when conditions persist over sliding window
SELECT
    deviceId,
    location,
    AVG(vibration) AS avgVibration,
    MAX(vibration) AS maxVibration,
    COUNT(*) AS highVibrationCount,
    'High Vibration Alert' AS alertType,
    System.Timestamp() AS alertTime
INTO
    BlobOutput
FROM
    EventHubInput TIMESTAMP BY timestamp
GROUP BY
    deviceId,
    location,
    SlidingWindow(minute, 3)
HAVING
    AVG(vibration) > 1.0
    AND COUNT(*) >= 5;  -- At least 5 high readings

🎯 Step 4: Session Windows

4.1 Basic Session Window

Group events separated by inactivity timeout:

-- Identify device activity sessions (10-second inactivity timeout)
SELECT
    deviceId,
    location,
    COUNT(*) AS eventsInSession,
    AVG(temperature) AS avgTemp,
    MIN(timestamp) AS sessionStart,
    MAX(timestamp) AS sessionEnd,
    DATEDIFF(second, MIN(timestamp), MAX(timestamp)) AS sessionDurationSeconds,
    System.Timestamp() AS sessionWindowEnd
INTO
    SqlOutput
FROM
    EventHubInput TIMESTAMP BY timestamp
GROUP BY
    deviceId,
    location,
    SessionWindow(second, 10);  -- 10-second timeout

Window Behavior: - Window closes after 10 seconds of no events from device - Session duration varies based on event frequency - Ideal for user activity or device usage patterns

4.2 User Activity Sessions

Track device usage patterns:

-- Analyze device operational sessions
SELECT
    deviceId,
    location,
    COUNT(*) AS operationCount,
    MIN(timestamp) AS sessionStart,
    MAX(timestamp) AS sessionEnd,
    DATEDIFF(minute, MIN(timestamp), MAX(timestamp)) AS sessionDurationMinutes,
    AVG(temperature) AS avgSessionTemp,
    MAX(temperature) AS peakSessionTemp,
    -- Categorize session length
    CASE
        WHEN DATEDIFF(minute, MIN(timestamp), MAX(timestamp)) < 5 THEN 'Short'
        WHEN DATEDIFF(minute, MIN(timestamp), MAX(timestamp)) < 30 THEN 'Medium'
        ELSE 'Long'
    END AS sessionLength,
    System.Timestamp() AS sessionEnd
INTO
    BlobOutput
FROM
    EventHubInput TIMESTAMP BY timestamp
WHERE
    status != 'offline'
GROUP BY
    deviceId,
    location,
    SessionWindow(minute, 5, 60);  -- 5-min timeout, 60-min max duration

4.3 Equipment Usage Analysis

Understand operational patterns:

-- Equipment usage patterns with session windows
SELECT
    SUBSTRING(location, 1, CHARINDEX('/', location) - 1) AS building,
    COUNT(DISTINCT deviceId) AS activeDevicesInSession,
    AVG(DATEDIFF(minute, MIN(timestamp), MAX(timestamp))) AS avgSessionDuration,
    COUNT(*) AS totalOperations,
    MIN(timestamp) AS periodStart,
    MAX(timestamp) AS periodEnd,
    System.Timestamp() AS analysisTime
INTO
    SqlOutput
FROM
    EventHubInput TIMESTAMP BY timestamp
WHERE
    eventType = 'telemetry'
    AND status = 'normal'
GROUP BY
    SUBSTRING(location, 1, CHARINDEX('/', location) - 1),
    SessionWindow(minute, 10, 120);  -- 10-min timeout, 2-hour max

🔍 Step 5: Advanced Windowing Patterns

5.1 Combining Multiple Window Types

Use different windows for different analyses:

-- Real-time alerts (Sliding) + Trend analysis (Hopping)
-- Alert output
SELECT
    'RealTimeAlert' AS outputType,
    deviceId,
    AVG(temperature) AS avgTemp,
    System.Timestamp() AS alertTime
INTO
    BlobOutput
FROM
    EventHubInput TIMESTAMP BY timestamp
GROUP BY
    deviceId,
    SlidingWindow(minute, 5)
HAVING
    AVG(temperature) > 85;

-- Trend output
SELECT
    'TrendAnalysis' AS outputType,
    deviceId,
    AVG(temperature) AS movingAvgTemp,
    STDEV(temperature) AS tempVariability,
    System.Timestamp() AS windowEnd
INTO
    SqlOutput
FROM
    EventHubInput TIMESTAMP BY timestamp
GROUP BY
    deviceId,
    HoppingWindow(minute, 30, 10);

5.2 Nested Windowing

Windows within windows for complex analysis:

-- Calculate hourly max of 5-minute averages
WITH FiveMinuteAverages AS (
    SELECT
        deviceId,
        AVG(temperature) AS avgTemp,
        System.Timestamp() AS windowEnd
    FROM
        EventHubInput TIMESTAMP BY timestamp
    GROUP BY
        deviceId,
        TumblingWindow(minute, 5)
)
SELECT
    deviceId,
    MAX(avgTemp) AS maxOf5MinAverages,
    AVG(avgTemp) AS avgOf5MinAverages,
    COUNT(*) AS numberOfFiveMinWindows,
    System.Timestamp() AS hourlyWindowEnd
INTO
    SqlOutput
FROM
    FiveMinuteAverages TIMESTAMP BY windowEnd
GROUP BY
    deviceId,
    TumblingWindow(hour, 1);

5.3 Window Size Optimization

Choose optimal window sizes based on data characteristics:

-- Multi-resolution analysis
-- High frequency (1-minute windows)
SELECT
    '1min' AS resolution,
    deviceId,
    AVG(temperature) AS avgTemp,
    System.Timestamp() AS windowEnd
INTO
    BlobOutput
FROM
    EventHubInput TIMESTAMP BY timestamp
GROUP BY
    deviceId,
    TumblingWindow(minute, 1);

-- Medium frequency (15-minute windows)
SELECT
    '15min' AS resolution,
    deviceId,
    AVG(temperature) AS avgTemp,
    STDEV(temperature) AS tempStdDev,
    System.Timestamp() AS windowEnd
INTO
    SqlOutput
FROM
    EventHubInput TIMESTAMP BY timestamp
GROUP BY
    deviceId,
    TumblingWindow(minute, 15);

-- Low frequency (1-hour windows)
SELECT
    '1hour' AS resolution,
    deviceId,
    location,
    AVG(temperature) AS avgTemp,
    MIN(temperature) AS minTemp,
    MAX(temperature) AS maxTemp,
    COUNT(*) AS eventCount,
    System.Timestamp() AS windowEnd
INTO
    SqlOutput
FROM
    EventHubInput TIMESTAMP BY timestamp
GROUP BY
    deviceId,
    location,
    TumblingWindow(hour, 1);

🧪 Step 6: Testing and Deployment

6.1 Deploy Windowing Query

# Example: Deploy hopping window moving average query
$query = @"
-- Moving average temperature monitoring
SELECT
    deviceId,
    location,
    AVG(temperature) AS movingAvgTemp,
    MIN(temperature) AS minTemp,
    MAX(temperature) AS maxTemp,
    COUNT(*) AS sampleCount,
    System.Timestamp() AS windowEnd
INTO
    SqlOutput
FROM
    EventHubInput TIMESTAMP BY timestamp
GROUP BY
    deviceId,
    location,
    HoppingWindow(minute, 15, 5);
"@

$query | Out-File -FilePath "windowing-query.sql" -Encoding UTF8

# Stop job
az stream-analytics job stop --name $env:STREAM_JOB --resource-group $env:STREAM_RG

# Update transformation
az stream-analytics transformation update `
    --job-name $env:STREAM_JOB `
    --resource-group $env:STREAM_RG `
    --name "Transformation" `
    --saql @windowing-query.sql

# Start job
az stream-analytics job start `
    --job-name $env:STREAM_JOB `
    --resource-group $env:STREAM_RG `
    --output-start-mode JobStartTime

6.2 Validate Window Output

# Query to check windowed results
$validationQuery = @"
SELECT TOP 20
    deviceId,
    location,
    windowEnd,
    movingAvgTemp,
    sampleCount,
    DATEDIFF(minute, LAG(windowEnd) OVER (PARTITION BY deviceId ORDER BY windowEnd), windowEnd) AS minutesBetweenWindows
FROM SensorReadings
ORDER BY deviceId, windowEnd DESC;
"@

Invoke-Sqlcmd -ConnectionString $connectionString -Query $validationQuery | Format-Table

🎓 Key Concepts Learned

Window Selection Guide

Requirement Recommended Window
Fixed-interval reports Tumbling
Moving averages Hopping
Real-time alerts Sliding
User/device sessions Session
Trend detection Hopping
Anomaly detection Sliding

Best Practices

  • Tumbling: Use for non-overlapping batch processing
  • Hopping: Choose hop size < window size for overlap
  • Sliding: Use sparingly (compute-intensive)
  • Session: Set timeout based on expected inactivity
  • Performance: Larger windows = fewer computations = better performance

Common Patterns

  • Dashboard Metrics: Tumbling windows (1, 5, 15 min)
  • Trend Analysis: Hopping windows (15-60 min window, 5-15 min hop)
  • Alerting: Sliding windows (1-5 min)
  • Usage Analytics: Session windows (5-30 min timeout)

🚀 Next Steps

You've mastered temporal windowing! Continue to:

Tutorial 06: Joins and Temporal Operations →

In the next tutorial, you'll learn:

  • Stream-to-stream joins
  • Stream-to-reference data joins
  • Temporal joins with time constraints
  • Self-joins for pattern detection

📚 Additional Resources


Tutorial Progress: 5 of 11 complete | Next: Joins and Temporal Operations

Last Updated: January 2025