# Azure Data Explorer (ADX) / KQL Guide
Azure Data Explorer delivers sub-second queries over streaming and time-series data — the hot-path analytics engine in CSA-in-a-Box for real-time dashboards, anomaly detection, and operational intelligence.
## Why ADX
Azure Data Explorer (Kusto) is purpose-built for exploring large volumes of semi-structured telemetry, logs, and time-series data. Where the Delta Lakehouse excels at batch analytics across the medallion layers, ADX excels at interactive, ad-hoc queries over streaming data with sub-second response times. CSA-in-a-Box uses ADX alongside the Lakehouse — not instead of it — for scenarios where detection latency is measured in seconds, not hours.
## Architecture Overview

```mermaid
graph LR
    subgraph Ingestion
        EH[Event Hubs<br/>Streaming Data]
        IOT[IoT Hub<br/>Device Telemetry]
        ADLS[(ADLS Gen2<br/>Historical Batch)]
    end
    subgraph "ADX Cluster"
        DB1[Database: telemetry]
        DB2[Database: security_logs]
        MV[Materialized Views<br/>Pre-aggregated]
        HOT[Hot Cache<br/>SSD — fast queries]
        COLD[Cold Storage<br/>Managed disk — low cost]
    end
    subgraph Consumption
        DASH[ADX Dashboards]
        GRAF[Grafana Plugin]
        PBI[Power BI<br/>DirectQuery]
        API[KQL REST API]
        ALERT[Azure Monitor Alerts]
    end
    subgraph "Long-term Archival"
        EXPORT[Continuous Export]
        ADLS2[(ADLS Gen2<br/>Delta / Parquet)]
    end
    EH -->|Data connection| DB1
    IOT -->|Data connection| DB1
    ADLS -->|One-time ingest| DB2
    EH -->|Data connection| DB2
    DB1 --> MV
    DB1 --> HOT
    DB1 --> COLD
    DB2 --> HOT
    DB2 --> COLD
    HOT --> DASH
    HOT --> GRAF
    MV --> PBI
    HOT --> API
    HOT --> ALERT
    DB1 --> EXPORT
    DB2 --> EXPORT
    EXPORT --> ADLS2
```

## Setup
### Cluster Creation (Bicep)

```bicep
resource adxCluster 'Microsoft.Kusto/clusters@2023-08-15' = {
  name: 'adxcsa${environment}${location}'
  location: location
  sku: {
    name: 'Standard_E8ads_v5' // Dev: Dev(No SLA)_Standard_E2a_v4
    tier: 'Standard'
    capacity: 2 // Node count (min 2 for production)
  }
  identity: {
    type: 'SystemAssigned'
  }
  properties: {
    enableStreamingIngest: true
    enableAutoStop: true // Auto-stop when idle (dev/test)
    enableDiskEncryption: true
    enableDoubleEncryption: false // Set true for IL5/CMMC workloads
    publicNetworkAccess: 'Disabled'
    trustedExternalTenants: []
  }
}
```
### Database Creation

```bicep
resource adxDatabase 'Microsoft.Kusto/clusters/databases@2023-08-15' = {
  parent: adxCluster
  name: 'telemetry'
  location: location
  kind: 'ReadWrite'
  properties: {
    softDeletePeriod: 'P365D' // Keep data for 1 year
    hotCachePeriod: 'P30D' // Keep 30 days in hot cache (SSD)
  }
}
```
### Table Schema Definition (KQL)

```kql
// Create the raw telemetry table
.create table TelemetryRaw (
    DeviceId: string,
    Timestamp: datetime,
    Temperature: real,
    Humidity: real,
    Pressure: real,
    Location: dynamic, // JSON: {"lat": 38.9, "lon": -77.0}
    Tags: dynamic      // JSON: {"site": "dc-east", "floor": 3}
) with (folder = "raw")

// Create ingestion mapping for Event Hub JSON payload
.create table TelemetryRaw ingestion json mapping 'TelemetryMapping'
    '[{"column":"DeviceId","path":"$.device_id","datatype":"string"},'
    ' {"column":"Timestamp","path":"$.timestamp","datatype":"datetime"},'
    ' {"column":"Temperature","path":"$.temperature","datatype":"real"},'
    ' {"column":"Humidity","path":"$.humidity","datatype":"real"},'
    ' {"column":"Pressure","path":"$.pressure","datatype":"real"},'
    ' {"column":"Location","path":"$.location","datatype":"dynamic"},'
    ' {"column":"Tags","path":"$.tags","datatype":"dynamic"}]'

// Set retention and caching policy
.alter table TelemetryRaw policy retention softdelete = 365d
.alter table TelemetryRaw policy caching hot = 30d
```
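Before wiring up a data connection, it is worth confirming that both the table and the mapping landed as intended; two quick checks:

```kql
// Show the table schema in KQL form
.show table TelemetryRaw cslschema

// Show the registered JSON mappings and their column paths
.show table TelemetryRaw ingestion json mappings
```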
## Ingestion
### Ingestion Methods
| Method | Latency | Best for | Configuration |
|---|---|---|---|
| Streaming ingestion | < 10 seconds | Low-latency, low-volume | Enable on cluster + table |
| Queued (batched) ingestion | 1-5 minutes | High throughput, cost-efficient | Default; batching policy |
| Event Hub data connection | 1-5 minutes (queued) or < 10s (streaming) | CSA streaming pipeline | Managed data connection |
| IoT Hub data connection | 1-5 minutes | Device telemetry | Managed data connection |
| One-time ADLS ingest | Minutes to hours | Historical backfill | .ingest into command |
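Two rows in this table map directly to one-line management commands. A minimal sketch, assuming the `TelemetryRaw` table and `TelemetryMapping` from the Setup section; the ADLS path is a placeholder:

```kql
// Streaming ingestion needs the cluster flag (enableStreamingIngest) plus a table-level policy
.alter table TelemetryRaw policy streamingingestion enable

// One-time historical backfill from ADLS (path illustrative)
.ingest into table TelemetryRaw (
    h@'https://csaadls.blob.core.windows.net/history/telemetry-2024.json;managed_identity=system'
) with (format = 'multijson', ingestionMappingReference = 'TelemetryMapping')
```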
### Event Hub Data Connection (Bicep)

```bicep
resource ehDataConnection 'Microsoft.Kusto/clusters/databases/dataConnections@2023-08-15' = {
  parent: adxDatabase
  name: 'dc-telemetry-eh'
  location: location
  kind: 'EventHub'
  properties: {
    eventHubResourceId: eventHub.id
    consumerGroup: 'cg-adx'
    tableName: 'TelemetryRaw'
    mappingRuleName: 'TelemetryMapping'
    dataFormat: 'MULTIJSON'
    compression: 'None'
    managedIdentityResourceId: adxCluster.id
  }
}
```
### Batching Policy
Control the trade-off between ingestion latency and efficiency.
```kql
// Default batching: 5 minutes or 1 GB or 1000 files
// For lower latency (higher cost):
.alter table TelemetryRaw policy ingestionbatching
    @'{"MaximumBatchingTimeSpan": "00:00:30", "MaximumNumberOfItems": 500, "MaximumRawDataSizeMB": 256}'
```
## KQL Essentials
### Core Operators

```kql
// Filter, project, and sort
TelemetryRaw
| where Timestamp > ago(1h)
| where Temperature > 35.0
| project DeviceId, Timestamp, Temperature, Humidity
| order by Temperature desc
| take 100

// Aggregation with time bins
TelemetryRaw
| where Timestamp > ago(24h)
| summarize
    AvgTemp = avg(Temperature),
    MaxTemp = max(Temperature),
    EventCount = count()
    by bin(Timestamp, 1h), DeviceId
| order by Timestamp asc
```
### Time Series Analysis

```kql
// Create a regular time series and detect anomalies
let min_t = ago(7d);
let max_t = now();
let dt = 1h;
TelemetryRaw
| where Timestamp between (min_t .. max_t)
| make-series AvgTemp = avg(Temperature) on Timestamp from min_t to max_t step dt by DeviceId
| extend (anomalies, score, baseline) = series_decompose_anomalies(AvgTemp, 1.5)
| mv-expand Timestamp to typeof(datetime),
    AvgTemp to typeof(real),
    anomalies to typeof(int),
    score to typeof(real),
    baseline to typeof(real)
| where anomalies != 0
| project Timestamp, DeviceId, AvgTemp, baseline, score, AnomalyType = iff(anomalies > 0, "Spike", "Dip")
```
### Anomaly Detection

```kql
// Decompose time series into trend, seasonality, and residual
TelemetryRaw
| where Timestamp > ago(30d)
| make-series AvgTemp = avg(Temperature) on Timestamp step 1h by DeviceId
| extend (flag, score, baseline) = series_decompose_anomalies(AvgTemp)
| render anomalychart with (anomalycolumns=flag)
```
### Geospatial Queries

```kql
// Find devices within a geographic polygon (e.g., a facility boundary)
let facility = dynamic({
    "type": "Polygon",
    "coordinates": [[[-77.05, 38.88], [-77.03, 38.88], [-77.03, 38.90], [-77.05, 38.90], [-77.05, 38.88]]]
});
TelemetryRaw
| where Timestamp > ago(1h)
| extend lat = toreal(Location.lat), lon = toreal(Location.lon)
| where geo_point_in_polygon(lon, lat, facility)
| summarize DeviceCount = dcount(DeviceId), AvgTemp = avg(Temperature)
```
### Render Visualizations

```kql
// Time chart
TelemetryRaw
| where Timestamp > ago(24h)
| summarize AvgTemp = avg(Temperature) by bin(Timestamp, 15m), DeviceId
| render timechart

// Pie chart of events by device
TelemetryRaw
| where Timestamp > ago(1h)
| summarize EventCount = count() by DeviceId
| top 10 by EventCount
| render piechart
```
## Materialized Views
Materialized views pre-aggregate data for dashboard queries, reducing compute at query time.
```kql
// Hourly aggregation — dashboards query this instead of raw table
// (backfill over existing data requires async creation)
.create async materialized-view with (backfill=true) TelemetryHourly on table TelemetryRaw {
    TelemetryRaw
    | summarize
        AvgTemp = avg(Temperature),
        MaxTemp = max(Temperature),
        MinTemp = min(Temperature),
        AvgHumidity = avg(Humidity),
        EventCount = count()
        by bin(Timestamp, 1h), DeviceId
}

// Daily rollup for trend dashboards
.create async materialized-view with (backfill=true) TelemetryDaily on table TelemetryRaw {
    TelemetryRaw
    | summarize
        AvgTemp = avg(Temperature),
        P95Temp = percentile(Temperature, 95),
        EventCount = count()
        by bin(Timestamp, 1d), DeviceId
}
```
**Dashboard Performance**

Point Power BI DirectQuery and Grafana dashboards at materialized views, not raw tables. A dashboard hitting `TelemetryHourly` runs 10-100x faster than one scanning `TelemetryRaw` with a `summarize`.
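Queries can also target only the already-materialized part of a view via the `materialized_view()` function, trading a small amount of freshness for consistently fast scans; a sketch against the view above:

```kql
// Reads only the materialized part: fast and predictable, but excludes
// records newer than the view's materialization lag
materialized_view('TelemetryHourly')
| where Timestamp > ago(24h)
| summarize avg(AvgTemp) by DeviceId
```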
## Data Retention
### Hot Cache vs Cold Storage
| Tier | Storage | Query speed | Cost | Use case |
|---|---|---|---|---|
| Hot cache | SSD on cluster nodes | Sub-second | Higher (compute-bound) | Recent data, active dashboards |
| Cold storage | Managed disk (remote) | Seconds | Lower | Historical queries, compliance |
```kql
// Keep 30 days hot, 365 days total
.alter table TelemetryRaw policy caching hot = 30d
.alter table TelemetryRaw policy retention softdelete = 365d recoverability = enabled
```
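To guarantee a latency-sensitive query never silently falls back to cold-storage reads, restrict its data scope to the hot cache with a client request property; the query fails fast instead of degrading:

```kql
set query_datascope = "hotcache";
TelemetryRaw
| where Timestamp > ago(7d)
| summarize EventCount = count() by DeviceId
```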
### Continuous Export to ADLS
For data older than the ADX retention period, export to ADLS for long-term archival in the Delta Lakehouse.
```kql
// Create external table for export target, partitioned by month
// (external table partitions use functions like startofmonth/bin, not getYear/getMonth)
.create external table TelemetryArchive (
    DeviceId: string,
    Timestamp: datetime,
    Temperature: real,
    Humidity: real,
    Pressure: real
)
kind=storage
partition by (Month: datetime = startofmonth(Timestamp))
pathformat = (datetime_pattern("'year='yyyy'/month='MM", Month))
dataformat=parquet
(
    h@'https://csaadls.blob.core.windows.net/archive/telemetry;managed_identity=system'
)
```
```kql
// Set up continuous export
.create-or-alter continuous-export TelemetryExport
to table TelemetryArchive
with (intervalBetweenRuns=1h, forcedLatency=10m)
<| TelemetryRaw
| project DeviceId, Timestamp, Temperature, Humidity, Pressure
```
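Continuous export runs as a background job, so monitor it rather than assume it is healthy; assuming the export defined above:

```kql
// Last run time and result
.show continuous-export TelemetryExport

// Recent export failures, if any
.show continuous-export TelemetryExport failures
```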
## Fabric Eventhouse
Fabric Eventhouse is ADX running inside Fabric — same KQL engine, same query language, integrated with OneLake and Real-Time Dashboards.
| Feature | Standalone ADX | Fabric Eventhouse |
|---|---|---|
| KQL engine | Same | Same |
| Deployment | Azure resource (Bicep) | Fabric workspace item |
| Storage | Managed by ADX | OneLake (Delta-compatible) |
| Dashboards | ADX Dashboards | Real-Time Dashboards |
| Alerting | Azure Monitor | Fabric Activator (Reflex) |
| Gov availability | GA | Not yet Gov GA (ADR-0018) |
```mermaid
graph LR
    ES[Fabric Eventstream] --> EH2[Eventhouse<br/>KQL Database]
    EH2 --> RTD[Real-Time Dashboard]
    EH2 --> ACT[Activator<br/>Automated Alerts]
    EH2 --> OL[(OneLake<br/>Delta shortcut)]
    OL --> PBI[Power BI<br/>Direct Lake]
```

**Migration Path**

When Fabric reaches Gov GA, ADX workloads migrate to Eventhouse with no KQL changes. The query language, table schemas, and materialized views are identical. See ADR-0018 for the env-gated adapter pattern.
## Dashboards and Visualization
### ADX Dashboards
Native dashboards built into the ADX web UI. Best for ops teams who write KQL directly.
- KQL-native — each tile is a KQL query
- Parameters — dropdown filters backed by KQL queries (see the sketch after this list)
- Auto-refresh — configurable refresh interval (30s minimum)
- Sharing — share via URL or embed in portal
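A parameter dropdown is backed by a plain KQL query; a minimal sketch for a `DeviceId` filter:

```kql
// Backing query for a DeviceId dashboard parameter
TelemetryRaw
| where Timestamp > ago(7d)
| distinct DeviceId
| order by DeviceId asc
```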
### Grafana Plugin
For teams already using Grafana, the Azure Data Explorer plugin provides native KQL support.
```yaml
# Grafana data source provisioning sketch (provisioning/datasources/adx.yaml);
# exact jsonData keys can vary by plugin version, so check the plugin docs
apiVersion: 1
datasources:
  - name: Azure Data Explorer
    type: grafana-azure-data-explorer-datasource
    jsonData:
      clusterUrl: https://adxcsadeveastus.eastus.kusto.windows.net
      defaultDatabase: telemetry
      azureCredentials:
        authType: msi # managed identity; requires Grafana hosted in Azure
```
### Power BI Connector
Use DirectQuery from Power BI to ADX for real-time dashboards that combine streaming data with batch Gold layer data.
```kql
// Optimized view for Power BI (materialized view recommended)
TelemetryHourly
| where Timestamp > ago(30d)
| project Timestamp, DeviceId, AvgTemp, MaxTemp, EventCount
```
**Power BI + ADX Performance**

Always point Power BI DirectQuery at materialized views or functions — never at raw tables. Power BI generates KQL queries for every visual interaction; raw table scans will be slow and expensive.
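A stored function keeps the time window server-side and gives Power BI a stable interface over the view; a sketch, assuming the `TelemetryHourly` view (the function name and parameter are illustrative):

```kql
.create-or-alter function with (
    folder = "powerbi",
    docstring = "Hourly telemetry for Power BI DirectQuery"
) TelemetryForPbi(lookback: timespan = 30d) {
    TelemetryHourly
    | where Timestamp > ago(lookback)
    | project Timestamp, DeviceId, AvgTemp, MaxTemp, EventCount
}
```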
## Security
### Authentication
| Method | Use case | Configuration |
|---|---|---|
| Managed Identity | Azure services (Functions, ADF, Databricks) | Assign Database Viewer or Database User role |
| Entra ID user | Analysts, dashboard users | Add to database principals |
| Service principal | External apps, CI/CD | App registration + database principal |
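The role assignments in this table come down to one management command per principal; a sketch with placeholder IDs:

```kql
// Managed identity or service principal: use its application (client) ID
.add database telemetry viewers ('aadapp=00000000-0000-0000-0000-000000000000;contoso.com') 'CSA Functions managed identity'

// Entra ID group for analysts (group name illustrative)
.add database telemetry users ('aadgroup=csa-analysts@contoso.com') 'CSA analysts'
```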
### Network Security
- Private Endpoints — connect via Private Link; disable public access
- VNet injection — deploy ADX cluster into a VNet (legacy; prefer Private Endpoints)
- Managed Private Endpoints — ADX-initiated connections to sources (Event Hub, ADLS)
### Row-Level Security
```kql
// Create a function that filters rows by the caller's identity.
// Gate on Entra ID group membership via current_principal_is_member_of();
// the group name is illustrative.
.create-or-alter function with (
    folder = "security",
    docstring = "RLS filter for regional access"
) RegionalFilter() {
    let isAdmin = current_principal_is_member_of('aadgroup=csa-adx-admins;contoso.com');
    TelemetryRaw
    | where isAdmin or tostring(Tags.region) == "us-east" // swap in your region-mapping logic
}

// Apply the function as the table's row-level security policy
.alter table TelemetryRaw policy row_level_security enable "RegionalFilter"
```
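Confirm the policy is attached, and exercise the filter directly before relying on it:

```kql
// Show the active row-level security policy
.show table TelemetryRaw policy row_level_security

// Run the RLS function as yourself to sanity-check the filter
RegionalFilter()
| take 10
```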
## Monitoring
### Key Metrics
| Metric | Alert threshold | What it means |
|---|---|---|
| Ingestion latency | > 5 minutes (queued) | Batching delay or source issue |
| Ingestion volume | Drop > 50% | Source failure or data connection issue |
| Cache utilization | > 80% | Consider increasing hot cache or cluster size |
| CPU utilization | > 70% sustained | Scale out or optimize queries |
| Query duration (P95) | > 30 seconds | Missing materialized views or unoptimized KQL |
| Failed ingestions | Any | Schema mismatch or mapping error |
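Failed ingestions (the last row above) are also queryable directly, which is the fastest way to find the offending mapping or payload:

```kql
// Recent ingestion failures with the detail needed to fix mappings
.show ingestion failures
| where FailedOn > ago(1d)
| project FailedOn, Table, FailureKind, Details
```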
### Diagnostic Settings

```bicep
resource adxDiagnostics 'Microsoft.Insights/diagnosticSettings@2021-05-01-preview' = {
  scope: adxCluster
  name: 'adx-diagnostics'
  properties: {
    workspaceId: logAnalyticsWorkspace.id
    logs: [
      { category: 'SucceededIngestion', enabled: true }
      { category: 'FailedIngestion', enabled: true }
      { category: 'Command', enabled: true }
      { category: 'Query', enabled: true }
    ]
    metrics: [
      { category: 'AllMetrics', enabled: true }
    ]
  }
}
```
## Cost Optimization
### Cluster SKU Sizing
| Workload | Recommended SKU | Nodes | Use case |
|---|---|---|---|
| Dev/Test | Dev(No SLA)_Standard_E2a_v4 | 1 | Single-node, no SLA, auto-stop |
| Small production | Standard_E8ads_v5 | 2 | < 1 TB hot data, moderate queries |
| Medium production | Standard_E16ads_v5 | 3-5 | 1-10 TB hot data, concurrent dashboards |
| Large production | Standard_E16ads_v5 | 8+ | > 10 TB hot data, heavy anomaly detection |
**Auto-Stop for Dev/Test**

Enable `enableAutoStop: true` on dev/test clusters. The cluster stops after 24 hours of inactivity and restarts on the next query — saving significant compute cost.
### Cost Levers
| Lever | Action | Impact |
|---|---|---|
| Hot cache period | Reduce from 30d to 7d | Less SSD storage per node |
| Retention period | Export to ADLS, reduce ADX retention | Less managed storage |
| Materialized views | Pre-aggregate; reduce query compute | Lower per-query cost |
| Auto-stop | Enable for non-production | Zero cost when idle |
| Reserved instances | 1-year or 3-year commitment | 30-60% discount |
## Anti-Patterns
**Don't: Use ADX as a data warehouse**

ADX is optimized for time-series and log analytics, not relational star-schema queries. Use the Delta Lakehouse (Databricks/Synapse) for warehouse workloads and ADX for streaming analytics.

**Don't: Query raw tables from dashboards**

Without materialized views, every dashboard refresh scans the raw table. Create materialized views for any query pattern used by dashboards.

**Don't: Set hot cache to maximum retention**

Keeping all data in hot cache is expensive and unnecessary. Only recent data (7-30 days) needs sub-second query speed; older data queries can tolerate seconds of latency from cold storage.

**Don't: Ignore ingestion failures**

Failed ingestions usually indicate schema mismatches between the source payload and the table mapping. Monitor the `FailedIngestion` log and fix mappings immediately — failed events are dropped, not retried.

**Do: Use streaming ingestion for low-latency use cases**

When detection latency must be under 10 seconds (fraud, security), enable streaming ingestion on the table. For everything else, queued ingestion is more cost-efficient.

**Do: Export to ADLS for long-term retention**

Continuous export moves aged data to Parquet in ADLS, where it joins the Delta Lakehouse and is queryable by dbt, Databricks, and Synapse at a fraction of the ADX storage cost.
## Checklist
- ADX cluster deployed via Bicep with streaming ingestion enabled
- Database(s) created with appropriate retention and cache policies
- Table schemas defined with ingestion mappings
- Event Hub data connections configured with dedicated consumer groups
- Materialized views created for dashboard queries
- Continuous export to ADLS configured for long-term archival
- Private Endpoints enabled; public network access disabled
- Managed identity granted `Database Viewer` for downstream services
- Diagnostic settings forwarding to Log Analytics
- Alerting configured for failed ingestions and high CPU
- Dev/test clusters set to auto-stop
- Power BI connector tested against materialized views (not raw tables)