
📨 Azure Event Hubs

See also: CSA-in-a-Box platform guide

This is the generic Azure reference for Azure Event Hubs. For how CSA-in-a-Box specifically deploys, configures, and integrates this service, see the platform guide: Azure Event Hubs guide.


Big data streaming platform and event ingestion service for millions of events per second.


🌟 Service Overview

Azure Event Hubs is a fully managed, real-time data ingestion service that can stream millions of events per second from any source. It provides a distributed streaming platform with low latency and seamless integration with Azure and third-party services.

🔥 Key Value Propositions

  • Massive Scale: Ingest millions of events per second with elastic throughput
  • Kafka Compatibility: Drop-in replacement for Apache Kafka with native protocol support
  • Auto-Capture: Automatically capture streaming data to Azure Data Lake or Blob Storage
  • Global Distribution: Multi-region replication with geo-disaster recovery
  • Enterprise Security: Advanced authentication, encryption, and network isolation
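
The Kafka compatibility called out above means an existing Kafka client can point at an Event Hubs namespace by changing only its connection settings: broker endpoint `<namespace>.servicebus.windows.net:9093`, SASL/PLAIN with the literal username `$ConnectionString`, and the namespace connection string as the password. A minimal sketch using the `kafka-python` client (the namespace and topic names are placeholders, not values from this document):

```python
# Kafka-compatible access to Event Hubs: the broker is the namespace on port
# 9093, and auth is SASL/PLAIN with the literal username "$ConnectionString".
def kafka_config_for_event_hubs(namespace: str, connection_string: str) -> dict:
    """Build kafka-python settings for an Event Hubs namespace (placeholder values)."""
    return {
        "bootstrap_servers": f"{namespace}.servicebus.windows.net:9093",
        "security_protocol": "SASL_SSL",
        "sasl_mechanism": "PLAIN",
        "sasl_plain_username": "$ConnectionString",
        "sasl_plain_password": connection_string,
    }

if __name__ == "__main__":
    # Imported here so the sketch reads without kafka-python installed
    from kafka import KafkaProducer

    config = kafka_config_for_event_hubs("eventhub-demo-ns", "Endpoint=sb://...;...")
    producer = KafkaProducer(**config)
    # The Event Hub name doubles as the Kafka topic name
    producer.send("telemetry-events", b"hello from a Kafka client")
    producer.flush()
```

No application-level protocol translation is needed; the Event Hub itself appears as a Kafka topic.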

🏗️ Architecture Overview

```mermaid
graph TB
    subgraph "Event Producers"
        IoT[IoT Devices]
        Apps[Applications]
        Logs[Log Collectors]
        APIs[API Services]
    end

    subgraph "Azure Event Hubs"
        subgraph "Event Hub Instance"
            P1[Partition 1]
            P2[Partition 2]
            P3[Partition 3]
            P4[Partition N]
        end

        Schema[Schema Registry]
        Capture[Event Hub Capture]
    end

    subgraph "Event Consumers"
        SA[Stream Analytics]
        Spark[Databricks/Synapse]
        Functions[Azure Functions]
        Apps2[Custom Applications]
    end

    subgraph "Storage"
        ADLS[Data Lake Gen2]
        Blob[Blob Storage]
    end

    IoT --> P1
    Apps --> P2
    Logs --> P3
    APIs --> P4

    P1 --> SA
    P2 --> Spark
    P3 --> Functions
    P4 --> Apps2

    Capture --> ADLS
    Capture --> Blob

    Schema -.-> P1
    Schema -.-> P2
```

💰 Pricing Tiers

🥉 Standard Tier


Best For: Development, testing, and variable production workloads

Features:

  • Throughput Units (TUs): 1-20, with optional auto-inflate
  • Retention: 1-7 days configurable
  • Consumer Groups: Up to 20 per Event Hub
  • Partitions: Up to 32 per Event Hub
  • Kafka Support: ✅ Native protocol support
  • Capture: ✅ To Data Lake or Blob Storage
  • Schema Registry: ✅ Included

Pricing Model:

  • Base charge per Throughput Unit
  • Ingress events (per million)
  • Capture charge (per GB stored)

🥇 Premium Tier


Best For: Production workloads with predictable performance requirements

Features:

  • Processing Units (PUs): 1-16 dedicated capacity
  • Retention: Up to 90 days
  • Consumer Groups: Unlimited
  • Partitions: Up to 100 per Event Hub
  • Performance Isolation: Dedicated resources
  • Enhanced Security: Private Link, customer-managed keys
  • Larger Messages: Up to 1 MB message size

Additional Benefits:

  • Guaranteed capacity and latency
  • Network isolation with Private Link
  • Customer-managed encryption keys
  • Multi-region disaster recovery

🏆 Dedicated Tier


Best For: Mission-critical enterprise workloads with extreme scale requirements

Features:

  • Capacity Units (CUs): Single-tenant deployments
  • Retention: Up to 90 days
  • Throughput: Multiple GB/sec per CU
  • Event Hubs: Unlimited namespaces and Event Hubs
  • Complete Isolation: Physical hardware isolation
  • Bring Your Own Key (BYOK): Full encryption control

Ideal For:

  • Multi-tenant SaaS platforms
  • Extremely high-volume scenarios (>100 MB/sec)
  • Compliance requirements needing physical isolation
  • Predictable monthly costs for large-scale operations

🎯 Core Concepts

Throughput Units (Standard Tier)

A throughput unit controls capacity for Event Hubs:

  • Ingress: Up to 1 MB/sec or 1,000 events/sec per TU
  • Egress: Up to 2 MB/sec or 4,096 events/sec per TU
  • Auto-inflate: Automatically scale TUs based on demand
```bash
# Enable auto-inflate for an Event Hubs namespace
az eventhubs namespace update \
  --resource-group myResourceGroup \
  --name myNamespace \
  --enable-auto-inflate true \
  --maximum-throughput-units 20
```
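
The per-TU limits above (1 MB/sec or 1,000 events/sec ingress, 2 MB/sec egress) imply a simple sizing rule: take the ceiling of each rate divided by its per-TU limit and provision the maximum. A sketch with illustrative workload numbers:

```python
import math

def required_throughput_units(ingress_mb_s: float, egress_mb_s: float,
                              ingress_events_s: float) -> int:
    """Estimate Standard-tier TUs from the documented per-TU limits:
    1 MB/s or 1,000 events/s ingress and 2 MB/s egress per TU."""
    by_ingress_mb = math.ceil(ingress_mb_s / 1.0)
    by_ingress_events = math.ceil(ingress_events_s / 1000.0)
    by_egress_mb = math.ceil(egress_mb_s / 2.0)
    return max(1, by_ingress_mb, by_ingress_events, by_egress_mb)

# Example: 5 MB/s in at 3,500 events/s, fanned out to 8 MB/s of egress
print(required_throughput_units(5, 8, 3500))  # → 5
```

With auto-inflate enabled, this estimate is a reasonable `--maximum-throughput-units` ceiling rather than a fixed allocation.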

Partitions

Partitions are ordered sequences of events within an Event Hub:

  • Purpose: Enable parallel processing and scaling
  • Count: 1-32 (Standard), up to 100 (Premium)
  • Partition Keys: Route related events to same partition
  • Ordering: Guaranteed within a partition, not across partitions
```python
# Send event with partition key for ordering
from azure.eventhub import EventHubProducerClient, EventData

producer = EventHubProducerClient.from_connection_string(
    conn_str="your_connection_string",
    eventhub_name="your_eventhub"
)

# Events with the same partition key go to the same partition
event_data = EventData("Sensor reading: 23.5°C")
producer.send_event(event_data, partition_key="sensor-123")
producer.close()
```

Consumer Groups

Consumer groups enable multiple applications to read from the same Event Hub independently:

  • Default: $Default consumer group always available
  • Isolation: Each consumer group maintains its own offset
  • Limit: Up to 20 (Standard), Unlimited (Premium)
```python
# Read from specific consumer group
from azure.eventhub import EventHubConsumerClient

consumer = EventHubConsumerClient.from_connection_string(
    conn_str="your_connection_string",
    consumer_group="analytics-team",
    eventhub_name="your_eventhub"
)
```

📊 Use Cases

📱 IoT Telemetry Ingestion

Scenario: Ingest millions of sensor readings per second

```mermaid
graph LR
    Devices[IoT Devices] -->|HTTPS/AMQP| EventHub[Event Hubs]
    EventHub --> Stream[Stream Analytics]
    EventHub --> Capture[Capture to ADLS]
    Stream --> Alerts[Real-time Alerts]
    Capture --> Analytics[Batch Analytics]
```

📊 Application Logging & Monitoring

Scenario: Centralized logging for distributed applications

```python
# Send application logs to Event Hubs
import json
import os
from datetime import datetime

from azure.eventhub import EventHubProducerClient, EventData

def send_log_event(level, message, metadata):
    producer = EventHubProducerClient.from_connection_string(
        conn_str=os.getenv("EVENTHUB_CONNECTION_STRING"),
        eventhub_name="application-logs"
    )

    log_event = {
        "timestamp": datetime.utcnow().isoformat(),
        "level": level,
        "message": message,
        "metadata": metadata
    }

    producer.send_event(EventData(json.dumps(log_event)))
    producer.close()
```

🔄 Change Data Capture (CDC)

Scenario: Stream database changes to Event Hubs for downstream processing
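
A common CDC pattern is to wrap each row change in a small envelope and use the row's primary key as the partition key, so all changes to one row land on the same partition and stay ordered. A hedged sketch (the envelope shape, table, and hub names are illustrative, not a CDC standard from this document):

```python
import json
from datetime import datetime, timezone
from typing import Optional

def cdc_envelope(table: str, operation: str, key: str,
                 before: Optional[dict], after: Optional[dict]) -> dict:
    """Wrap one row change in a CDC-style envelope (illustrative shape)."""
    return {
        "table": table,
        "op": operation,            # "insert" | "update" | "delete"
        "key": key,
        "before": before,
        "after": after,
        "captured_at": datetime.now(timezone.utc).isoformat(),
    }

if __name__ == "__main__":
    from azure.eventhub import EventHubProducerClient, EventData

    producer = EventHubProducerClient.from_connection_string(
        conn_str="your_connection_string", eventhub_name="cdc-events"
    )
    change = cdc_envelope("customers", "update", "customer-42",
                          before={"tier": "standard"}, after={"tier": "premium"})
    # Partition key = primary key preserves per-row ordering within a partition
    producer.send_event(EventData(json.dumps(change)), partition_key=change["key"])
    producer.close()
```

Downstream consumers can then replay a single row's history in order by reading its partition.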

📈 Real-time Analytics Pipeline

Scenario: Process streaming data with Stream Analytics and visualize in Power BI
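
The core idea behind this pipeline, tumbling windows, can be shown in plain Python before reaching for Stream Analytics: assign each event to a fixed, non-overlapping time window and aggregate per window. A sketch with illustrative data (this is not a Stream Analytics API):

```python
from collections import defaultdict

def tumbling_window_avg(events, window_seconds):
    """Average the value of (timestamp_seconds, value) events per fixed,
    non-overlapping window, keyed by each window's start time."""
    sums, counts = defaultdict(float), defaultdict(int)
    for ts, value in events:
        window_start = int(ts // window_seconds) * window_seconds
        sums[window_start] += value
        counts[window_start] += 1
    return {w: sums[w] / counts[w] for w in sorted(sums)}

# Temperature readings over 25 seconds, averaged in 10-second windows
readings = [(0, 20.0), (4, 22.0), (11, 24.0), (19, 26.0), (21, 30.0)]
print(tumbling_window_avg(readings, 10))  # → {0: 21.0, 10: 25.0, 20: 30.0}
```

In the managed pipeline, Stream Analytics performs this windowing over the Event Hubs input and pushes each window's result to Power BI.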


🚀 Quick Start

Create Event Hub Namespace and Hub

```bash
# Create resource group
az group create --name rg-eventhub-demo --location eastus

# Create Event Hubs namespace (Standard tier)
az eventhubs namespace create \
  --name eventhub-demo-ns \
  --resource-group rg-eventhub-demo \
  --location eastus \
  --sku Standard \
  --enable-auto-inflate true \
  --maximum-throughput-units 10

# Create Event Hub with 4 partitions
az eventhubs eventhub create \
  --name telemetry-events \
  --namespace-name eventhub-demo-ns \
  --resource-group rg-eventhub-demo \
  --partition-count 4 \
  --message-retention 3

# Create consumer group
az eventhubs eventhub consumer-group create \
  --eventhub-name telemetry-events \
  --namespace-name eventhub-demo-ns \
  --resource-group rg-eventhub-demo \
  --name analytics-consumers
```

Send Events (Python)

```python
import json
from datetime import datetime

from azure.eventhub import EventHubProducerClient, EventData

# Initialize producer
producer = EventHubProducerClient.from_connection_string(
    conn_str="Endpoint=sb://eventhub-demo-ns.servicebus.windows.net/;...",
    eventhub_name="telemetry-events"
)

# Create batch and send events
try:
    event_batch = producer.create_batch()

    for i in range(100):
        event_data = {
            "sensor_id": f"sensor-{i % 10}",
            "temperature": 20 + (i % 15),
            "humidity": 50 + (i % 30),
            "timestamp": datetime.utcnow().isoformat()
        }
        event_batch.add(EventData(json.dumps(event_data)))

    producer.send_batch(event_batch)
    print(f"Sent batch of {len(event_batch)} events")
finally:
    producer.close()
```

Receive Events (Python)

```python
from azure.eventhub import EventHubConsumerClient

def on_event_batch(partition_context, events):
    for event in events:
        print(f"Received event from partition {partition_context.partition_id}")
        print(f"Event data: {event.body_as_str()}")

    # Update checkpoint for this partition
    partition_context.update_checkpoint()

# Initialize consumer
consumer = EventHubConsumerClient.from_connection_string(
    conn_str="Endpoint=sb://eventhub-demo-ns.servicebus.windows.net/;...",
    consumer_group="$Default",
    eventhub_name="telemetry-events"
)

# Start receiving
try:
    with consumer:
        consumer.receive_batch(
            on_event_batch=on_event_batch,
            starting_position="-1"  # Start from beginning
        )
except KeyboardInterrupt:
    print("Stopped receiving")
```
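
Without a checkpoint store, the checkpoints above live only in memory and are lost on restart. The synchronous SDK can persist them to Blob Storage via the `azure-eventhub-checkpointstoreblob` package; a hedged sketch (the connection strings are placeholders and the container-naming helper is an illustrative convention, not part of the SDK):

```python
def checkpoint_container_name(namespace: str, eventhub: str) -> str:
    """Derive a deterministic, lowercase container name for checkpoints
    (illustrative convention; blob container names must be lowercase)."""
    return f"checkpoints-{namespace}-{eventhub}".lower()

if __name__ == "__main__":
    # Imported here so the sketch reads without the SDKs installed
    from azure.eventhub import EventHubConsumerClient
    from azure.eventhub.extensions.checkpointstoreblob import BlobCheckpointStore

    checkpoint_store = BlobCheckpointStore.from_connection_string(
        "your_storage_connection_string",
        checkpoint_container_name("eventhub-demo-ns", "telemetry-events"),
    )
    consumer = EventHubConsumerClient.from_connection_string(
        conn_str="your_eventhub_connection_string",
        consumer_group="$Default",
        eventhub_name="telemetry-events",
        checkpoint_store=checkpoint_store,  # checkpoints now survive restarts
    )
```

With a shared checkpoint store, multiple consumer instances in the same consumer group also coordinate partition ownership automatically.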

📚 Deep Dive Guides

🛠️ Integration Scenarios

🎯 Best Practices

  • Performance Optimization
  • Security Configuration
  • Cost Optimization

Last Updated: 2025-01-28 Service Version: General Availability Documentation Status: Complete