⏰ Tutorial 5: Windowing Functions¶
Master time-based windowing functions for temporal aggregations. Learn tumbling, hopping, sliding, and session windows to analyze streaming data over time intervals.
🎯 Learning Objectives¶
After completing this tutorial, you will be able to:
- ✅ Use tumbling windows for non-overlapping fixed intervals
- ✅ Implement hopping windows for overlapping time analysis
- ✅ Create sliding windows for continuous event-driven aggregations
- ✅ Apply session windows for activity-based grouping
- ✅ Choose appropriate window types for different scenarios
- ✅ Handle late-arriving events in windowed queries
⏱️ Time Estimate: 40 minutes¶
- Tumbling Windows: 10 minutes
- Hopping Windows: 10 minutes
- Sliding & Session Windows: 15 minutes
- Advanced Patterns: 5 minutes
📋 Prerequisites¶
- Completed Tutorial 04: Basic Queries
- Understanding of aggregation functions
- Stream Analytics job running with data flow
⏰ Window Types Overview¶
Window Comparison¶
| Window Type | Overlapping | Trigger | Use Case |
|---|---|---|---|
| Tumbling | No | Fixed interval | Hourly reports, batch processing |
| Hopping | Yes | Fixed interval | Moving averages, trend analysis |
| Sliding | Yes | Event arrival | Real-time alerts, continuous metrics |
| Session | No | Inactivity gap | User sessions, activity tracking |
📊 Step 1: Tumbling Windows¶
1.1 Basic Tumbling Window¶
Non-overlapping, fixed-size time windows:
-- Calculate metrics every 5 minutes (no overlap)
SELECT
deviceId,
location,
COUNT(*) AS eventCount,
AVG(temperature) AS avgTemperature,
MIN(temperature) AS minTemperature,
MAX(temperature) AS maxTemperature,
System.Timestamp() AS windowEnd
INTO
SqlOutput
FROM
EventHubInput TIMESTAMP BY timestamp
GROUP BY
deviceId,
location,
TumblingWindow(minute, 5);
Window Behavior: - Events from 10:00:00 - 10:04:59 → Window 1 - Events from 10:05:00 - 10:09:59 → Window 2 - No events belong to multiple windows
1.2 Multiple Tumbling Windows¶
Different intervals for different analyses:
-- 1-minute tumbling window for real-time monitoring
SELECT
'RealTime' AS reportType,
COUNT(DISTINCT deviceId) AS activeDevices,
AVG(temperature) AS avgTemp,
System.Timestamp() AS windowEnd
INTO
BlobOutput
FROM
EventHubInput TIMESTAMP BY timestamp
GROUP BY
TumblingWindow(minute, 1);
-- 1-hour tumbling window for historical trends
SELECT
'Hourly' AS reportType,
DATEPART(hour, System.Timestamp()) AS hourOfDay,
COUNT(*) AS totalEvents,
AVG(temperature) AS avgTemp,
STDEV(temperature) AS tempStdDev,
System.Timestamp() AS windowEnd
INTO
SqlOutput
FROM
EventHubInput TIMESTAMP BY timestamp
GROUP BY
TumblingWindow(hour, 1);
1.3 Tumbling Window with Complex Aggregations¶
-- Comprehensive 10-minute device health summary
SELECT
deviceId,
location,
System.Timestamp() AS windowEnd,
-- Event metrics
COUNT(*) AS totalEvents,
COUNT(DISTINCT eventType) AS distinctEventTypes,
-- Temperature statistics
AVG(temperature) AS avgTemp,
MIN(temperature) AS minTemp,
MAX(temperature) AS maxTemp,
STDEV(temperature) AS tempVariance,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY temperature) OVER() AS medianTemp,
-- Status tracking
SUM(CASE WHEN status = 'normal' THEN 1 ELSE 0 END) AS normalEvents,
SUM(CASE WHEN status = 'warning' THEN 1 ELSE 0 END) AS warningEvents,
SUM(CASE WHEN status = 'critical' THEN 1 ELSE 0 END) AS criticalEvents,
-- Health score
CASE
WHEN AVG(vibration) < 0.5 AND AVG(temperature) BETWEEN 68 AND 75 THEN 100
WHEN AVG(vibration) < 1.0 THEN 75
WHEN AVG(vibration) < 1.5 THEN 50
ELSE 25
END AS healthScore
INTO
SqlOutput
FROM
EventHubInput TIMESTAMP BY timestamp
GROUP BY
deviceId,
location,
TumblingWindow(minute, 10);
🔄 Step 2: Hopping Windows¶
2.1 Basic Hopping Window¶
Overlapping windows for trend analysis:
-- 10-minute window, hopping every 5 minutes (50% overlap)
SELECT
deviceId,
COUNT(*) AS eventCount,
AVG(temperature) AS avgTemperature,
System.Timestamp() AS windowEnd
INTO
BlobOutput
FROM
EventHubInput TIMESTAMP BY timestamp
GROUP BY
deviceId,
HoppingWindow(minute, 10, 5); -- (window size, hop size)
Window Behavior: - Window 1: 10:00:00 - 10:09:59 - Window 2: 10:05:00 - 10:14:59 (overlaps with Window 1) - Events in overlap period belong to both windows
2.2 Moving Average with Hopping Window¶
Calculate rolling averages:
-- 1-hour moving average, updated every 15 minutes
SELECT
deviceId,
location,
AVG(temperature) AS movingAvgTemp,
AVG(humidity) AS movingAvgHumidity,
AVG(vibration) AS movingAvgVibration,
MIN(temperature) AS minTemp,
MAX(temperature) AS maxTemp,
COUNT(*) AS sampleCount,
System.Timestamp() AS windowEnd,
DATEDIFF(minute, MIN(timestamp), System.Timestamp()) AS windowDurationMinutes
INTO
SqlOutput
FROM
EventHubInput TIMESTAMP BY timestamp
GROUP BY
deviceId,
location,
HoppingWindow(hour, 1, 15); -- 1 hour window, hop every 15 min
2.3 Trend Detection with Hopping Windows¶
Identify increasing/decreasing trends:
-- Detect temperature trends using overlapping windows
WITH WindowedData AS (
SELECT
deviceId,
AVG(temperature) AS avgTemp,
System.Timestamp() AS windowEnd
FROM
EventHubInput TIMESTAMP BY timestamp
GROUP BY
deviceId,
HoppingWindow(minute, 10, 5)
)
SELECT
deviceId,
avgTemp AS currentAvgTemp,
LAG(avgTemp, 1) OVER (PARTITION BY deviceId LIMIT DURATION(minute, 10)) AS previousAvgTemp,
CASE
WHEN avgTemp > LAG(avgTemp, 1) OVER (PARTITION BY deviceId LIMIT DURATION(minute, 10)) + 2 THEN 'Increasing'
WHEN avgTemp < LAG(avgTemp, 1) OVER (PARTITION BY deviceId LIMIT DURATION(minute, 10)) - 2 THEN 'Decreasing'
ELSE 'Stable'
END AS tempTrend,
windowEnd
INTO
BlobOutput
FROM
WindowedData;
📈 Step 3: Sliding Windows¶
3.1 Basic Sliding Window¶
Event-driven windows (re-compute on every event):
-- Alert when average temperature exceeds threshold in last 5 minutes
SELECT
deviceId,
location,
AVG(temperature) AS avgTempLast5Min,
COUNT(*) AS eventCount,
System.Timestamp() AS alertTime
INTO
BlobOutput
FROM
EventHubInput TIMESTAMP BY timestamp
GROUP BY
deviceId,
location,
SlidingWindow(minute, 5)
HAVING
AVG(temperature) > 80;
Window Behavior: - Recomputed on EVERY event arrival - Looks back 5 minutes from each event - Useful for immediate alerting
3.2 Continuous Anomaly Detection¶
Detect anomalies using sliding window statistics:
-- Detect temperature spikes using sliding window
SELECT
deviceId,
location,
temperature AS currentTemp,
AVG(temperature) OVER (PARTITION BY deviceId LIMIT DURATION(minute, 10)) AS avgTempLast10Min,
STDEV(temperature) OVER (PARTITION BY deviceId LIMIT DURATION(minute, 10)) AS stdDevLast10Min,
CASE
WHEN temperature > AVG(temperature) OVER (PARTITION BY deviceId LIMIT DURATION(minute, 10)) +
(2 * STDEV(temperature) OVER (PARTITION BY deviceId LIMIT DURATION(minute, 10)))
THEN 'Spike Detected'
WHEN temperature < AVG(temperature) OVER (PARTITION BY deviceId LIMIT DURATION(minute, 10)) -
(2 * STDEV(temperature) OVER (PARTITION BY deviceId LIMIT DURATION(minute, 10)))
THEN 'Drop Detected'
ELSE 'Normal'
END AS anomalyStatus,
System.Timestamp() AS detectionTime
INTO
BlobOutput
FROM
EventHubInput TIMESTAMP BY timestamp
WHERE
temperature IS NOT NULL;
3.3 Real-Time Threshold Alerting¶
Continuous monitoring with sliding windows:
-- Alert when conditions persist over sliding window
SELECT
deviceId,
location,
AVG(vibration) AS avgVibration,
MAX(vibration) AS maxVibration,
COUNT(*) AS highVibrationCount,
'High Vibration Alert' AS alertType,
System.Timestamp() AS alertTime
INTO
BlobOutput
FROM
EventHubInput TIMESTAMP BY timestamp
GROUP BY
deviceId,
location,
SlidingWindow(minute, 3)
HAVING
AVG(vibration) > 1.0
AND COUNT(*) >= 5; -- At least 5 high readings
🎯 Step 4: Session Windows¶
4.1 Basic Session Window¶
Group events separated by inactivity timeout:
-- Identify device activity sessions (10-second inactivity timeout)
SELECT
deviceId,
location,
COUNT(*) AS eventsInSession,
AVG(temperature) AS avgTemp,
MIN(timestamp) AS sessionStart,
MAX(timestamp) AS sessionEnd,
DATEDIFF(second, MIN(timestamp), MAX(timestamp)) AS sessionDurationSeconds,
System.Timestamp() AS sessionWindowEnd
INTO
SqlOutput
FROM
EventHubInput TIMESTAMP BY timestamp
GROUP BY
deviceId,
location,
SessionWindow(second, 10); -- 10-second timeout
Window Behavior: - Window closes after 10 seconds of no events from device - Session duration varies based on event frequency - Ideal for user activity or device usage patterns
4.2 User Activity Sessions¶
Track device usage patterns:
-- Analyze device operational sessions
SELECT
deviceId,
location,
COUNT(*) AS operationCount,
MIN(timestamp) AS sessionStart,
MAX(timestamp) AS sessionEnd,
DATEDIFF(minute, MIN(timestamp), MAX(timestamp)) AS sessionDurationMinutes,
AVG(temperature) AS avgSessionTemp,
MAX(temperature) AS peakSessionTemp,
-- Categorize session length
CASE
WHEN DATEDIFF(minute, MIN(timestamp), MAX(timestamp)) < 5 THEN 'Short'
WHEN DATEDIFF(minute, MIN(timestamp), MAX(timestamp)) < 30 THEN 'Medium'
ELSE 'Long'
END AS sessionLength,
System.Timestamp() AS sessionEnd
INTO
BlobOutput
FROM
EventHubInput TIMESTAMP BY timestamp
WHERE
status != 'offline'
GROUP BY
deviceId,
location,
SessionWindow(minute, 5, 60); -- 5-min timeout, 60-min max duration
4.3 Equipment Usage Analysis¶
Understand operational patterns:
-- Equipment usage patterns with session windows
SELECT
SUBSTRING(location, 1, CHARINDEX('/', location) - 1) AS building,
COUNT(DISTINCT deviceId) AS activeDevicesInSession,
AVG(DATEDIFF(minute, MIN(timestamp), MAX(timestamp))) AS avgSessionDuration,
COUNT(*) AS totalOperations,
MIN(timestamp) AS periodStart,
MAX(timestamp) AS periodEnd,
System.Timestamp() AS analysisTime
INTO
SqlOutput
FROM
EventHubInput TIMESTAMP BY timestamp
WHERE
eventType = 'telemetry'
AND status = 'normal'
GROUP BY
SUBSTRING(location, 1, CHARINDEX('/', location) - 1),
SessionWindow(minute, 10, 120); -- 10-min timeout, 2-hour max
🔍 Step 5: Advanced Windowing Patterns¶
5.1 Combining Multiple Window Types¶
Use different windows for different analyses:
-- Real-time alerts (Sliding) + Trend analysis (Hopping)
-- Alert output
SELECT
'RealTimeAlert' AS outputType,
deviceId,
AVG(temperature) AS avgTemp,
System.Timestamp() AS alertTime
INTO
BlobOutput
FROM
EventHubInput TIMESTAMP BY timestamp
GROUP BY
deviceId,
SlidingWindow(minute, 5)
HAVING
AVG(temperature) > 85;
-- Trend output
SELECT
'TrendAnalysis' AS outputType,
deviceId,
AVG(temperature) AS movingAvgTemp,
STDEV(temperature) AS tempVariability,
System.Timestamp() AS windowEnd
INTO
SqlOutput
FROM
EventHubInput TIMESTAMP BY timestamp
GROUP BY
deviceId,
HoppingWindow(minute, 30, 10);
5.2 Nested Windowing¶
Windows within windows for complex analysis:
-- Calculate hourly max of 5-minute averages
WITH FiveMinuteAverages AS (
SELECT
deviceId,
AVG(temperature) AS avgTemp,
System.Timestamp() AS windowEnd
FROM
EventHubInput TIMESTAMP BY timestamp
GROUP BY
deviceId,
TumblingWindow(minute, 5)
)
SELECT
deviceId,
MAX(avgTemp) AS maxOf5MinAverages,
AVG(avgTemp) AS avgOf5MinAverages,
COUNT(*) AS numberOfFiveMinWindows,
System.Timestamp() AS hourlyWindowEnd
INTO
SqlOutput
FROM
FiveMinuteAverages TIMESTAMP BY windowEnd
GROUP BY
deviceId,
TumblingWindow(hour, 1);
5.3 Window Size Optimization¶
Choose optimal window sizes based on data characteristics:
-- Multi-resolution analysis
-- High frequency (1-minute windows)
SELECT
'1min' AS resolution,
deviceId,
AVG(temperature) AS avgTemp,
System.Timestamp() AS windowEnd
INTO
BlobOutput
FROM
EventHubInput TIMESTAMP BY timestamp
GROUP BY
deviceId,
TumblingWindow(minute, 1);
-- Medium frequency (15-minute windows)
SELECT
'15min' AS resolution,
deviceId,
AVG(temperature) AS avgTemp,
STDEV(temperature) AS tempStdDev,
System.Timestamp() AS windowEnd
INTO
SqlOutput
FROM
EventHubInput TIMESTAMP BY timestamp
GROUP BY
deviceId,
TumblingWindow(minute, 15);
-- Low frequency (1-hour windows)
SELECT
'1hour' AS resolution,
deviceId,
location,
AVG(temperature) AS avgTemp,
MIN(temperature) AS minTemp,
MAX(temperature) AS maxTemp,
COUNT(*) AS eventCount,
System.Timestamp() AS windowEnd
INTO
SqlOutput
FROM
EventHubInput TIMESTAMP BY timestamp
GROUP BY
deviceId,
location,
TumblingWindow(hour, 1);
🧪 Step 6: Testing and Deployment¶
6.1 Deploy Windowing Query¶
# Example: Deploy hopping window moving average query
$query = @"
-- Moving average temperature monitoring
SELECT
deviceId,
location,
AVG(temperature) AS movingAvgTemp,
MIN(temperature) AS minTemp,
MAX(temperature) AS maxTemp,
COUNT(*) AS sampleCount,
System.Timestamp() AS windowEnd
INTO
SqlOutput
FROM
EventHubInput TIMESTAMP BY timestamp
GROUP BY
deviceId,
location,
HoppingWindow(minute, 15, 5);
"@
$query | Out-File -FilePath "windowing-query.sql" -Encoding UTF8
# Stop job
az stream-analytics job stop --name $env:STREAM_JOB --resource-group $env:STREAM_RG
# Update transformation
az stream-analytics transformation update `
--job-name $env:STREAM_JOB `
--resource-group $env:STREAM_RG `
--name "Transformation" `
--saql @windowing-query.sql
# Start job
az stream-analytics job start `
--job-name $env:STREAM_JOB `
--resource-group $env:STREAM_RG `
--output-start-mode JobStartTime
6.2 Validate Window Output¶
# Query to check windowed results
$validationQuery = @"
SELECT TOP 20
deviceId,
location,
windowEnd,
movingAvgTemp,
sampleCount,
DATEDIFF(minute, LAG(windowEnd) OVER (PARTITION BY deviceId ORDER BY windowEnd), windowEnd) AS minutesBetweenWindows
FROM SensorReadings
ORDER BY deviceId, windowEnd DESC;
"@
Invoke-Sqlcmd -ConnectionString $connectionString -Query $validationQuery | Format-Table
🎓 Key Concepts Learned¶
Window Selection Guide¶
| Requirement | Recommended Window |
|---|---|
| Fixed-interval reports | Tumbling |
| Moving averages | Hopping |
| Real-time alerts | Sliding |
| User/device sessions | Session |
| Trend detection | Hopping |
| Anomaly detection | Sliding |
Best Practices¶
- Tumbling: Use for non-overlapping batch processing
- Hopping: Choose hop size < window size for overlap
- Sliding: Use sparingly (compute-intensive)
- Session: Set timeout based on expected inactivity
- Performance: Larger windows = fewer computations = better performance
Common Patterns¶
- Dashboard Metrics: Tumbling windows (1, 5, 15 min)
- Trend Analysis: Hopping windows (15-60 min window, 5-15 min hop)
- Alerting: Sliding windows (1-5 min)
- Usage Analytics: Session windows (5-30 min timeout)
🚀 Next Steps¶
You've mastered temporal windowing! Continue to:
Tutorial 06: Joins and Temporal Operations →
In the next tutorial, you'll learn:
- Stream-to-stream joins
- Stream-to-reference data joins
- Temporal joins with time constraints
- Self-joins for pattern detection
📚 Additional Resources¶
Tutorial Progress: 5 of 11 complete | Next: Joins and Temporal Operations
Last Updated: January 2025