Azure Event Hubs Guide¶
See also: generic Azure reference
For service-agnostic deep-dive content on Azure Event Hubs — architecture, feature reference, code samples, and patterns independent of CSA-in-a-Box — see Azure Event Hubs in the reference library.
Event Hubs is the streaming backbone of CSA-in-a-Box — a managed, Kafka-compatible ingestion layer that buffers high-velocity data into the Bronze layer with zero broker operations. See ADR-0005 for the decision rationale.
Why Event Hubs¶
Event Hubs was chosen over self-hosted Kafka and Confluent Cloud for three reasons: it is Gov-GA with FedRAMP High inheritance, it requires zero ZooKeeper/KRaft management, and its Kafka-protocol endpoint lets existing Kafka producers connect without code changes. Event Hubs Capture writes directly to ADLS Gen2 as Avro or Parquet, giving every vertical example zero-code Bronze ingestion.
Architecture Overview¶
graph LR
subgraph Producers
A[IoT Devices]
B[Application Logs]
C[CDC Streams]
D[Kafka Clients]
end
subgraph "Event Hub Namespace"
NS[Namespace<br/>TLS + Managed Identity]
EH1[Event Hub: telemetry<br/>8 partitions]
EH2[Event Hub: logs<br/>4 partitions]
EH3[Event Hub: cdc<br/>16 partitions]
end
subgraph "Consumer Groups"
CG1[cg-databricks]
CG2[cg-functions]
CG3[cg-adx]
CG4[cg-stream-analytics]
end
subgraph Consumers
DBR[Databricks<br/>Structured Streaming]
FUNC[Azure Functions]
ADX[Azure Data Explorer]
ASA[Stream Analytics]
end
subgraph Archival
CAP[Event Hub Capture]
ADLS[(ADLS Gen2<br/>Bronze Layer)]
end
A --> NS
B --> NS
C --> NS
D -->|Kafka protocol| NS
NS --> EH1
NS --> EH2
NS --> EH3
EH1 --> CG1
EH1 --> CG2
EH2 --> CG3
EH3 --> CG4
CG1 --> DBR
CG2 --> FUNC
CG3 --> ADX
CG4 --> ASA
EH1 --> CAP
EH2 --> CAP
EH3 --> CAP
CAP --> ADLS
Setup¶
Namespace Creation (Bicep)¶
resource ehNamespace 'Microsoft.EventHub/namespaces@2024-01-01' = {
name: 'eh-csa-${environment}-${location}'
location: location
sku: {
name: 'Standard' // Standard | Premium | Dedicated
tier: 'Standard'
capacity: 2 // Throughput units (Standard) or PUs (Premium)
}
properties: {
isAutoInflateEnabled: true
maximumThroughputUnits: 10
kafkaEnabled: true // Enables Kafka-protocol endpoint
minimumTlsVersion: '1.2'
publicNetworkAccess: 'Disabled'
disableLocalAuth: true // Force Entra ID auth only
}
}
Event Hub Configuration¶
resource eventHub 'Microsoft.EventHub/namespaces/eventhubs@2024-01-01' = {
parent: ehNamespace
name: 'telemetry'
properties: {
partitionCount: 8 // Fixed at creation on Standard tier
messageRetentionInDays: 7 // 1-7 (Standard) or 1-90 (Premium)
status: 'Active'
}
}
Consumer Groups¶
Always create a dedicated consumer group for each downstream consumer; never share the $Default group across multiple consumers.
resource cgDatabricks 'Microsoft.EventHub/namespaces/eventhubs/consumergroups@2024-01-01' = {
parent: eventHub
name: 'cg-databricks'
properties: {
userMetadata: 'Databricks Structured Streaming — Bronze ingest'
}
}
resource cgAdx 'Microsoft.EventHub/namespaces/eventhubs/consumergroups@2024-01-01' = {
parent: eventHub
name: 'cg-adx'
properties: {
userMetadata: 'ADX streaming ingestion — hot path analytics'
}
}
Event Hub Capture¶
Capture provides automatic archival of every event to ADLS Gen2 — no consumer code required. This is the primary mechanism for populating the Bronze layer in CSA-in-a-Box.
Configuration¶
resource captureDescription 'Microsoft.EventHub/namespaces/eventhubs@2024-01-01' = {
parent: ehNamespace
name: 'telemetry'
properties: {
partitionCount: 8
messageRetentionInDays: 7
captureDescription: {
enabled: true
encoding: 'Avro' // Avro (default) or AvroDeflate
intervalInSeconds: 300 // Flush every 5 minutes
sizeLimitInBytes: 314572800 // 300 MB max file size
destination: {
name: 'EventHubArchive.AzureBlockBlob'
properties: {
storageAccountResourceId: adlsAccount.id
blobContainer: 'bronze'
archiveNameFormat: '{Namespace}/{EventHub}/{PartitionId}/{Year}/{Month}/{Day}/{Hour}/{Minute}/{Second}'
}
}
}
}
}
Path Patterns¶
| Pattern element | Example | Notes |
|---|---|---|
| {Namespace} | eh-csa-dev-eastus | Identifies the source namespace |
| {EventHub} | telemetry | Topic/event hub name |
| {PartitionId} | 0 through 7 | Partition for ordering |
| {Year}/{Month}/{Day} | 2026/04/29 | Date-based partitioning |
| {Hour}/{Minute}/{Second} | 14/30/00 | Time granularity |
Parquet Capture
Premium and Dedicated tiers support Parquet encoding directly. For Standard tier, Capture writes Avro — convert to Delta in the Bronze-to-Silver dbt step.
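Reading the captured Avro back into Spark is straightforward. A minimal sketch of cracking open Capture output for the Bronze-to-Silver step, assuming the cluster has the spark-avro package available and using an illustrative storage account placeholder:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()

# Illustrative path: container 'bronze' on an ADLS Gen2 account, laid out by the
# archiveNameFormat above ({Namespace}/{EventHub}/{PartitionId}/{Year}/...)
capture_root = "abfss://bronze@<storage-account>.dfs.core.windows.net/eh-csa-dev-eastus/telemetry/"

captured = (
    spark.read.format("avro")
    .option("recursiveFileLookup", "true")   # walk the date/time folder hierarchy
    .load(capture_root)
    # Capture's Avro envelope wraps each event: Body, EnqueuedTimeUtc,
    # SequenceNumber, Offset, Properties, SystemProperties
    .select(
        col("Body").cast("string").alias("body_json"),
        col("EnqueuedTimeUtc").alias("enqueued_at"),
        col("SequenceNumber").alias("sequence_number"),
    )
)

captured.write.format("delta").mode("append").saveAsTable("bronze.telemetry_capture")
```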
Kafka Compatibility¶
Event Hubs exposes a Kafka-protocol endpoint on port 9093 (SASL_SSL). Existing Kafka producers and consumers connect without code changes.
Kafka Client Configuration¶
# bootstrap.servers — use the Event Hubs namespace FQDN
bootstrap.servers=eh-csa-dev-eastus.servicebus.windows.net:9093
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
username="$ConnectionString" \
password="Endpoint=sb://eh-csa-dev-eastus.servicebus.windows.net/;SharedAccessKeyName=...;SharedAccessKey=...";
# Or use OAuth (recommended for production)
sasl.mechanism=OAUTHBEARER
sasl.login.callback.handler.class=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginCallbackHandler
sasl.oauthbearer.token.endpoint.url=https://login.microsoftonline.com/{tenant-id}/oauth2/v2.0/token
sasl.oauthbearer.scope=https://eh-csa-dev-eastus.servicebus.windows.net/.default
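The same properties translate directly to any standard Kafka client library. A minimal producer sketch with confluent-kafka and the connection-string flavor of SASL/PLAIN; the namespace name matches this guide, the connection string is an illustrative placeholder, and with disableLocalAuth enabled you would use the OAuth path above instead:

```python
from confluent_kafka import Producer

# Shared Access Policy connection string with Send rights (illustrative placeholder)
connection_string = "Endpoint=sb://eh-csa-dev-eastus.servicebus.windows.net/;SharedAccessKeyName=send;SharedAccessKey=<key>"

producer = Producer({
    "bootstrap.servers": "eh-csa-dev-eastus.servicebus.windows.net:9093",
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "PLAIN",
    "sasl.username": "$ConnectionString",   # literal value, not a shell variable
    "sasl.password": connection_string,
})

# The Kafka "topic" is simply the event hub name.
producer.produce("telemetry", key=b"device-42", value=b'{"temperature": 21.5}')
producer.flush()
```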
Kafka Feature Subset
Event Hubs supports the Kafka producer and consumer protocols but does not support Kafka Connect, Kafka Streams, KSQL, or broker-side transactions. If you need those, evaluate Confluent Cloud on Azure as an alternate path (see ADR-0005).
Schema Registry¶
Event Hubs includes a built-in Schema Registry for Avro schema management.
Registering a Schema¶
from azure.schemaregistry import SchemaRegistryClient
from azure.identity import DefaultAzureCredential
client = SchemaRegistryClient(
fully_qualified_namespace="eh-csa-dev-eastus.servicebus.windows.net",
credential=DefaultAzureCredential(),
)
schema_definition = """{
"type": "record",
"name": "TelemetryEvent",
"namespace": "com.csa.telemetry",
"fields": [
{"name": "device_id", "type": "string"},
{"name": "timestamp", "type": "long", "logicalType": "timestamp-millis"},
{"name": "temperature", "type": "double"},
{"name": "humidity", "type": "double"}
]
}"""
schema = client.register_schema(
group_name="csa-telemetry",
name="TelemetryEvent",
definition=schema_definition,
format="Avro",
)
print(f"Schema ID: {schema.id}")
Schema Evolution Rules¶
| Change | Backward compatible? | Action required |
|---|---|---|
| Add field with default | Yes | No consumer changes |
| Remove field | No | Update all consumers first |
| Rename field | No | Add new field, deprecate old |
| Change field type | No | Version the schema (v2) |
Streaming Patterns¶
At-Least-Once vs Exactly-Once¶
| Guarantee | How | Trade-off |
|---|---|---|
| At-least-once | Consumer commits offset after processing | Duplicates possible; use idempotent writes |
| Exactly-once | Checkpoint + transactional sink (Databricks Delta) | Higher latency; requires Delta MERGE or dedup |
CSA-in-a-Box Default
The platform defaults to at-least-once with idempotent Delta MERGE in dbt Silver models. This is simpler to operate and handles 99% of use cases.
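A sketch of what an idempotent sink looks like, using a PySpark foreachBatch with a keyed Delta MERGE. Table and key names are illustrative, the equivalent logic in CSA-in-a-Box lives in the dbt Silver models, and parsed refers to the stream built in the Databricks example below:

```python
from delta.tables import DeltaTable

def upsert_to_silver(microbatch_df, batch_id):
    """Replay-safe sink: a redelivered event matches on its key and is not inserted twice."""
    silver = DeltaTable.forName(spark, "silver.telemetry")
    (
        silver.alias("t")
        .merge(
            microbatch_df.dropDuplicates(["device_id", "event_timestamp"]).alias("s"),
            "t.device_id = s.device_id AND t.event_timestamp = s.event_timestamp",
        )
        .whenNotMatchedInsertAll()
        .execute()
    )

(
    parsed.writeStream
    .foreachBatch(upsert_to_silver)
    .option("checkpointLocation", "/mnt/silver/telemetry/_checkpoint")
    .start()
)
```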
Checkpointing¶
Consumers must persist their partition offsets (checkpoints) to survive restarts. Databricks Structured Streaming writes checkpoints to the ADLS path given in checkpointLocation; the Azure Functions Event Hubs trigger checkpoints to Azure Storage blobs.
Partition Assignment¶
- Round-robin (default) — events spread evenly across partitions
- Partition key — events with the same key go to the same partition (ordering guarantee)
- Use partition keys for per-device or per-tenant ordering; avoid hot partitions by choosing high-cardinality keys (see the producer sketch below)
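A minimal producer sketch showing the partition-key behavior, using the azure-eventhub SDK (device ID and payloads are illustrative):

```python
from azure.eventhub import EventData, EventHubProducerClient
from azure.identity import DefaultAzureCredential

producer = EventHubProducerClient(
    fully_qualified_namespace="eh-csa-dev-eastus.servicebus.windows.net",
    eventhub_name="telemetry",
    credential=DefaultAzureCredential(),
)

with producer:
    # Every event in this batch hashes to the same partition, so this device's
    # readings are delivered to consumers in the order they were sent.
    batch = producer.create_batch(partition_key="device-42")
    batch.add(EventData('{"temperature": 21.4}'))
    batch.add(EventData('{"temperature": 21.6}'))
    producer.send_batch(batch)
```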
Databricks Integration¶
Structured Streaming from Event Hubs (PySpark)¶
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json, current_timestamp
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, LongType
spark = SparkSession.builder.getOrCreate()
# Connection configuration
eh_conf = {
"eventhubs.connectionString": sc._jvm.org.apache.spark.eventhubs
.EventHubsUtils
.encrypt(dbutils.secrets.get("kv-csa", "eh-connection-string")),
"eventhubs.consumerGroup": "cg-databricks",
"eventhubs.startingPosition": '{"offset": "-1", "enqueuedTime": null, "isInclusive": true}',
"maxEventsPerTrigger": 10000,
}
# Define the payload schema
schema = StructType([
StructField("device_id", StringType()),
StructField("timestamp", LongType()),
StructField("temperature", DoubleType()),
StructField("humidity", DoubleType()),
])
# Read stream
raw_stream = (
spark.readStream
.format("eventhubs")
.options(**eh_conf)
.load()
)
# Parse the body (bytes → JSON → struct)
parsed = (
raw_stream
.withColumn("body", col("body").cast("string"))
.withColumn("data", from_json(col("body"), schema))
.select(
col("data.device_id").alias("device_id"),
col("data.timestamp").alias("event_timestamp"),
col("data.temperature"),
col("data.humidity"),
col("enqueuedTime").alias("enqueued_at"),
current_timestamp().alias("ingested_at"),
)
)
# Write to Bronze Delta table with checkpointing
(
parsed.writeStream
.format("delta")
.outputMode("append")
.option("checkpointLocation", "/mnt/bronze/telemetry/_checkpoint")
.table("bronze.telemetry_raw")
)
Security¶
Authentication & Authorization¶
| Method | Use case | Recommendation |
|---|---|---|
| Managed Identity | Azure-hosted consumers (Databricks, Functions, ADX) | Preferred for all Azure workloads |
| SAS Policy | External producers, legacy clients | Scope to Send-only or Listen-only; rotate regularly |
| Entra ID OAuth | Kafka clients, service principals | Use Azure Event Hubs Data Sender/Receiver roles |
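A minimal managed-identity consumer sketch with the azure-eventhub SDK. DefaultAzureCredential picks up the managed identity when running in Azure, the identity needs the Azure Event Hubs Data Receiver role on the namespace, and a checkpoint store is omitted here for brevity:

```python
from azure.eventhub import EventHubConsumerClient
from azure.identity import DefaultAzureCredential

consumer = EventHubConsumerClient(
    fully_qualified_namespace="eh-csa-dev-eastus.servicebus.windows.net",
    eventhub_name="telemetry",
    consumer_group="cg-functions",
    credential=DefaultAzureCredential(),
)

def on_event(partition_context, event):
    # Production consumers pass a checkpoint_store to the client and call
    # partition_context.update_checkpoint(event) after processing each event.
    print(partition_context.partition_id, event.body_as_str())

with consumer:
    consumer.receive(on_event=on_event, starting_position="-1")  # "-1" reads from the start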
Network Security¶
- Private Endpoints — deploy Event Hubs with `publicNetworkAccess: 'Disabled'` and connect via Private Link
- IP firewall rules — allowlist specific CIDRs when private endpoints are not feasible
- Service endpoints — legacy option; prefer Private Endpoints for new deployments
Disable Local Auth
Set disableLocalAuth: true on the namespace to force Entra ID authentication and prevent SAS key usage in production.
Monitoring¶
Key Metrics¶
| Metric | Alert threshold | What it means |
|---|---|---|
| Incoming Messages | Baseline + 50% | Burst ingestion; check auto-inflate |
| Outgoing Messages | Drop > 30% | Consumer lag or failure |
| Throttled Requests | Any | Throughput units exhausted |
| Server Errors | Any | Service-side issue; open support ticket |
| Capture Backlog | > 0 for 15 min | Capture falling behind; increase flush interval |
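These metrics are also queryable in code. A sketch assuming the azure-monitor-query package, with an illustrative namespace resource ID, that pulls the last hour of throttled requests:

```python
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import MetricAggregationType, MetricsQueryClient

client = MetricsQueryClient(DefaultAzureCredential())

# Illustrative resource ID of the Event Hubs namespace
resource_id = (
    "/subscriptions/<subscription-id>/resourceGroups/rg-csa-dev"
    "/providers/Microsoft.EventHub/namespaces/eh-csa-dev-eastus"
)

response = client.query_resource(
    resource_id,
    metric_names=["ThrottledRequests"],
    timespan=timedelta(hours=1),
    granularity=timedelta(minutes=5),
    aggregations=[MetricAggregationType.TOTAL],
)

for metric in response.metrics:
    for series in metric.timeseries:
        for point in series.data:
            if point.total:
                print(f"{point.timestamp}: {point.total:.0f} throttled requests")
```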
Diagnostic Settings¶
resource diagnostics 'Microsoft.Insights/diagnosticSettings@2021-05-01-preview' = {
scope: ehNamespace
name: 'eh-diagnostics'
properties: {
workspaceId: logAnalyticsWorkspace.id
logs: [
{ categoryGroup: 'allLogs', enabled: true }
]
metrics: [
{ category: 'AllMetrics', enabled: true }
]
}
}
Cost Optimization¶
| Tier | Best for | Throughput | Price model |
|---|---|---|---|
| Standard | Dev/test, moderate workloads | 1 TU = 1 MB/s in, 2 MB/s out | Per TU-hour + ingress |
| Premium | Production, large partitions | 1 PU = ~5-10 TU equivalent | Per PU-hour |
| Dedicated | Multi-tenant, >1M msg/s | Entire cluster reserved | Per CU-hour |
Auto-Inflate
Enable auto-inflate on Standard tier to automatically scale throughput units from your baseline up to a configured maximum. This prevents throttling during burst ingestion without pre-provisioning peak capacity.
Capture Costs¶
Capture is free to enable — you pay only for the ADLS storage written. At Standard tier, Capture writes Avro; at Premium/Dedicated, Parquet is available and reduces downstream conversion costs.
Anti-Patterns¶
Don't: Share consumer groups across consumers
Each downstream service needs its own consumer group. Sharing one causes receivers to steal partitions from each other, dropping connections and missing events.
Don't: Use Event Hubs for request-reply
Event Hubs is an append-only log. For request-reply messaging, use Azure Service Bus queues or topics.
Don't: Ignore partition count planning
Partition counts are fixed at creation on Standard tier. Start with at least as many partitions as your expected peak consumer parallelism. Over-partitioning is cheap; under-partitioning requires recreating the hub.
Do: Use partition keys for ordering
If per-device or per-tenant ordering matters, set a partition key. Without a key, events are round-robin distributed and order is not guaranteed.
Do: Enable Capture from day one
Capture costs nothing beyond storage. Enable it immediately so the Bronze layer is populated from the start — you cannot retroactively capture events that have already expired.
Checklist¶
- Namespace created with `kafkaEnabled: true` and `disableLocalAuth: true`
- Private Endpoint configured; public network access disabled
- Event hubs created with appropriate partition counts
- Dedicated consumer groups created per downstream consumer
- Capture enabled to ADLS Gen2 Bronze container
- Schema Registry schemas registered for each event type
- Databricks Structured Streaming job tested with checkpoint
- Diagnostic settings forwarding to Log Analytics
- Auto-inflate enabled (Standard) or PU scaling configured (Premium)
- Alerting configured for throttled requests and consumer lag