Kappa Architecture

Overview

The Kappa Architecture is a simplification of the Lambda Architecture that removes the batch layer and treats all data as a stream. A single processing layer reads from an immutable, replayable event log, and historical results are recomputed by replaying that log rather than by running a separate batch job.

Core Principles

  • Single Processing Layer: All data is processed as streams, eliminating the batch/speed layer distinction
  • Immutable Log: A durable, replayable event log (e.g., Azure Event Hubs, Apache Kafka) stores all data
  • Reprocessing: Historical data can be reprocessed by replaying the log with updated processing logic
  • Simplicity: One processing codebase to build, test, and operate, instead of the separate batch and speed implementations that the Lambda Architecture requires
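
The reprocessing principle can be illustrated with a minimal sketch. It assumes an in-memory list as a stand-in for a durable log such as Azure Event Hubs or Apache Kafka; the event fields and the function names (append, replay, count_clicks_v1, count_clicks_v2) are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical in-memory stand-in for a durable log (Event Hubs, Kafka).
event_log = []  # append-only: events are added, never updated or deleted


def append(event):
    event_log.append(event)


def replay(logic):
    """Rebuild a view from scratch by running `logic` over every logged event."""
    view = Counter()
    for event in event_log:
        logic(view, event)
    return view


def count_clicks_v1(view, event):
    # Original logic: count every click per user.
    if event["type"] == "click":
        view[event["user"]] += 1


def count_clicks_v2(view, event):
    # Updated logic: ignore clicks flagged as bot traffic.
    if event["type"] == "click" and not event.get("bot", False):
        view[event["user"]] += 1


for e in [
    {"user": "ana", "type": "click"},
    {"user": "ana", "type": "click", "bot": True},
    {"user": "raj", "type": "click"},
    {"user": "raj", "type": "view"},
]:
    append(e)

view_v1 = replay(count_clicks_v1)  # result under the original logic
view_v2 = replay(count_clicks_v2)  # same log, replayed with updated logic
```

Because the log itself is never modified, changing the logic only requires a new replay; no separate batch job has to be kept in sync with the streaming code.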

Architecture Components

Event Log

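  • Stores all events durably and in order (in Azure Event Hubs and Apache Kafka, ordering is guaranteed within a partition, not across partitions)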
  • Enables replay for reprocessing
  • Typically uses Azure Event Hubs or Apache Kafka
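
As a rough sketch of the properties listed above, the following models a partitioned, append-only log with per-partition offsets. The PartitionedLog class and its methods are hypothetical and keep data in memory only; a real log such as Event Hubs or Kafka adds durability, retention, and replication.

```python
import zlib


class PartitionedLog:
    """Hypothetical in-memory model of a partitioned, append-only event log."""

    def __init__(self, partitions=2):
        self._partitions = [[] for _ in range(partitions)]

    def append(self, key, value):
        # A stable hash keeps every event for a key on the same partition,
        # which is what preserves per-key ordering.
        p = zlib.crc32(key.encode("utf-8")) % len(self._partitions)
        self._partitions[p].append((key, value))
        return p, len(self._partitions[p]) - 1  # (partition, offset)

    def read(self, partition, from_offset=0):
        # Reading never removes events, so any consumer can replay at will.
        return list(self._partitions[partition][from_offset:])


log = PartitionedLog(partitions=4)
p1, o1 = log.append("device-1", {"temp": 20})
p2, o2 = log.append("device-1", {"temp": 21})

first_read = log.read(p1)             # full replay of the partition
second_read = log.read(p1)            # identical: reads are non-destructive
tail = log.read(p1, from_offset=o2)   # resume from a known offset
```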

Stream Processing Layer

  • Processes all incoming data as streams
  • Handles both real-time and historical data with the same code; historical data is processed by replaying the event log
  • Uses Azure Stream Analytics, Apache Spark Structured Streaming, or Azure Databricks
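
To make the "same code for real-time and historical data" point concrete, here is a small sketch of a tumbling-window count keyed on event time. It is plain Python rather than Stream Analytics or Structured Streaming; the function name tumbling_counts and the event fields ts and key are assumptions made for the example.

```python
from collections import defaultdict
from itertools import chain


def tumbling_counts(events, window_seconds=60):
    """Count events per (window_start, key), using each event's own timestamp."""
    counts = defaultdict(int)
    for e in events:
        window_start = e["ts"] - (e["ts"] % window_seconds)
        counts[(window_start, e["key"])] += 1
    return dict(counts)


# Events replayed from the log (historical) and events arriving now (live).
replayed = [{"ts": 5, "key": "a"}, {"ts": 59, "key": "a"}, {"ts": 61, "key": "b"}]
live = iter([{"ts": 125, "key": "a"}])

# One function, one code path: the replayed events simply come first.
result = tumbling_counts(chain(replayed, live))
```

Windowing on event time rather than arrival time is what lets a replay produce the same windows that the live run produced.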

Serving Layer

  • Stores processed results for query serving
  • Supports both real-time and historical queries
  • Uses Azure Synapse Analytics, Azure Cosmos DB, or Azure Data Lake Storage
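
One common way to combine a serving layer with reprocessing is to write the replayed output to a new table and switch readers over once it has caught up. The sketch below assumes that approach; ServingStore and its methods are hypothetical and in-memory, not an API of Synapse, Cosmos DB, or Data Lake Storage.

```python
class ServingStore:
    """Hypothetical serving store that keeps one table per processing version."""

    def __init__(self):
        self._tables = {}
        self._active = None

    def write(self, version, key, value):
        # Each version of the processing logic writes to its own table.
        self._tables.setdefault(version, {})[key] = value

    def activate(self, version):
        # Switch queries to a fully rebuilt table in one step.
        if version not in self._tables:
            raise KeyError(version)
        self._active = version

    def query(self, key, default=None):
        if self._active is None:
            return default
        return self._tables[self._active].get(key, default)


store = ServingStore()
store.write("v1", "ana", 2)
store.activate("v1")

store.write("v2", "ana", 1)   # replay fills v2 while v1 keeps serving
before = store.query("ana")   # still answered from v1
store.activate("v2")
after = store.query("ana")    # now answered from v2
```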

Implementation on Azure

Component         | Azure Services
Event Log         | Azure Event Hubs (with retention), Apache Kafka on HDInsight
Stream Processing | Azure Databricks Structured Streaming, Azure Stream Analytics
Serving           | Azure Synapse Serverless SQL, Azure Cosmos DB, Azure Data Lake Storage Gen2
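
As an illustration of wiring the first two rows together, the sketch below builds the options that Spark's Kafka source would need to read an event hub through the Event Hubs Kafka-compatible endpoint. The helper name event_hubs_kafka_options and all values (namespace, hub name, connection string) are placeholders, not taken from this article; on Azure Databricks the login module class is typically prefixed with kafkashaded., which this sketch does not handle.

```python
def event_hubs_kafka_options(namespace, event_hub, connection_string,
                             from_beginning=True):
    """Build Spark Kafka-source options for the Event Hubs Kafka endpoint.

    Hypothetical helper for illustration only.
    """
    jaas = (
        "org.apache.kafka.common.security.plain.PlainLoginModule required "
        f'username="$ConnectionString" password="{connection_string}";'
    )
    return {
        "kafka.bootstrap.servers": f"{namespace}.servicebus.windows.net:9093",
        "subscribe": event_hub,
        "kafka.security.protocol": "SASL_SSL",
        "kafka.sasl.mechanism": "PLAIN",
        "kafka.sasl.jaas.config": jaas,
        # "earliest" replays everything still retained; "latest" reads new events only.
        "startingOffsets": "earliest" if from_beginning else "latest",
    }


# Placeholder values; replace with a real namespace, hub, and connection string.
opts = event_hubs_kafka_options(
    namespace="contoso-ns",
    event_hub="telemetry",
    connection_string="Endpoint=sb://contoso-ns.servicebus.windows.net/;...",
)
```

With an existing SparkSession named spark, these options could be passed as spark.readStream.format("kafka").options(**opts).load(). Starting from "earliest" is what turns a normal streaming job into a reprocessing run over the retained log.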