Skip to content

📈 Analytical Store (HTAP) in Azure Cosmos DB

HTAP

Run analytics on operational data without impacting transactional workloads using Azure Cosmos DB Analytical Store with Synapse Link.


🎯 Overview

Analytical Store is a column-oriented store that auto-syncs with the transactional store, enabling HTAP (Hybrid Transactional/Analytical Processing) scenarios.

Benefits

  • No ETL Required: Automatic sync from transactional to analytical store
  • No RU Impact: Analytics don't consume RUs
  • Near Real-time: Auto-sync within 2-5 minutes
  • Column Store: Optimized for analytical queries

🏗️ Architecture

graph LR
    App[Application] --> Trans[Transactional Store<br/>Row-oriented]
    Trans -.Auto Sync.-> Analytical[Analytical Store<br/>Column-oriented]
    Analytical --> Synapse[Synapse Analytics]
    Synapse --> BI[Power BI]
    Synapse --> ML[Azure ML]

🚀 Enable Analytical Store

# Enable Synapse Link on account
az cosmosdb update \
  --name mycosmosaccount \
  --resource-group myresourcegroup \
  --enable-analytical-storage true

# Create container with analytical store
az cosmosdb sql container create \
  --account-name mycosmosaccount \
  --resource-group myresourcegroup \
  --database-name ecommerce \
  --name orders \
  --partition-key-path "/customerId" \
  --analytical-storage-ttl -1

Query with Synapse Spark

# Synapse Spark notebook
df = spark.read\
    .format("cosmos.olap")\
    .option("spark.synapse.linkedService", "CosmosDBLinkedService")\
    .option("spark.cosmos.container", "orders")\
    .load()

# Run analytics
df.createOrReplaceTempView("orders")
result = spark.sql("""
    SELECT customerId, COUNT(*) as order_count, SUM(total) as total_spent
    FROM orders
    GROUP BY customerId
    ORDER BY total_spent DESC
    LIMIT 100
""")

result.show()

Query with Synapse SQL

-- Synapse serverless SQL pool
SELECT
    customerId,
    COUNT(*) as order_count,
    SUM(total) as total_revenue
FROM OPENROWSET(
    PROVIDER = 'CosmosDB',
    CONNECTION = '<connection-string>',
    OBJECT = 'orders',
    SERVER_CREDENTIAL = 'CosmosDBCredential'
) AS orders
GROUP BY customerId
ORDER BY total_revenue DESC;


Last Updated: 2025-01-28