Azure Cosmos DB Guide¶
!!! info "See also: generic Azure reference"
    For service-agnostic deep-dive content on Azure Cosmos DB — architecture, feature reference, code samples, and patterns independent of CSA-in-a-Box — see Azure Cosmos DB in the reference library.
Overview¶
Azure Cosmos DB is a globally distributed, multi-model database service that provides single-digit millisecond latency at any scale. In CSA-in-a-Box it serves three primary roles:
- Operational data store — metadata catalog, session state, marketplace product registry
- Graph database — knowledge graphs for GraphRAG entity-relationship traversal
- Change-feed source — CDC pipeline feeding the Bronze layer of the lakehouse
Cosmos DB exposes multiple wire-protocol APIs over the same underlying storage engine, so you choose the API that matches your access pattern rather than migrating your data model.
| API | Wire Protocol | Primary Use in CSA-in-a-Box |
|---|---|---|
| NoSQL | REST / SQL-like query | Metadata store, session state, marketplace catalog |
| MongoDB | MongoDB wire protocol | Lift-and-shift of existing MongoDB workloads |
| Gremlin | Apache TinkerPop | Knowledge graphs for GraphRAG (Tutorial 09) |
| PostgreSQL | PostgreSQL (Citus) | Portal persistence, distributed HTAP |
| Table | Azure Table Storage | Legacy Table Storage migrations |
!!! tip "CSA-in-a-Box Defaults"
    New workloads should use the NoSQL API unless you have a specific reason to choose another. GraphRAG workloads use Gremlin. Portal persistence uses PostgreSQL (see ADR-0015).
Architecture¶
The following diagram shows how Cosmos DB APIs map to CSA-in-a-Box use cases and downstream consumers.
```mermaid
flowchart LR
    subgraph Apps["Applications"]
        Portal[Portal SPA]
        AI[AI Services]
        Ingest[Ingestion Pipelines]
    end
    subgraph Cosmos["Azure Cosmos DB"]
        NoSQL[NoSQL API<br/>Metadata & Sessions]
        Gremlin[Gremlin API<br/>Knowledge Graph]
        PG[PostgreSQL API<br/>Portal Persistence]
    end
    subgraph Downstream["Downstream Analytics"]
        CF[Change Feed]
        SL[Synapse Link<br/>Analytical Store]
        Bronze[(ADLS Bronze Layer)]
        Synapse[Synapse Serverless SQL]
    end
    Portal --> PG
    AI --> Gremlin
    Ingest --> NoSQL
    NoSQL --> CF --> Bronze
    NoSQL --> SL --> Synapse
    Gremlin --> AI
    style Cosmos fill:#e3f2fd,stroke:#1565c0
    style Downstream fill:#fff3e0,stroke:#e65100
    style Apps fill:#e8f5e9,stroke:#2e7d32
```

API Selection Guide¶
Comparison Table¶
| Criterion | NoSQL | MongoDB | Gremlin | PostgreSQL (Citus) |
|---|---|---|---|---|
| Query language | SQL-like | MQL | Gremlin traversal | SQL |
| Schema | Flexible JSON | Flexible BSON | Vertices + Edges | Relational |
| Transactions | Transactional batch (same partition) | Multi-document ACID | Per-traversal | Full ACID |
| Indexing | Automatic (tunable) | Automatic + custom | Automatic | Manual (B-tree, GIN) |
| Max document size | 2 MB | 2 MB | N/A (vertex/edge) | Row-level |
| Synapse Link | Yes | Yes | No | No |
| Change feed | Yes | Yes (change streams) | No | Logical replication |
| Best for | Greenfield, high-scale ops | MongoDB migrations | Graph traversal, GraphRAG | HTAP, relational workloads |
| CSA-in-a-Box role | Primary operational store | Migration path only | Knowledge graphs | Portal DB |
When to Use Each API¶
NoSQL API — Choose for all new operational workloads. Best SDK support, broadest feature set (Synapse Link, change feed, hierarchical partitions, priority-based execution), lowest latency in direct mode.
MongoDB API — Choose only when migrating an existing MongoDB application. The wire-protocol compatibility lets you point existing drivers at Cosmos DB with minimal code changes.
Gremlin API — Choose for graph-native workloads where you need traversal queries (shortest path, neighbors-of-neighbors, impact analysis). In CSA-in-a-Box this powers GraphRAG knowledge graphs (see Tutorial 09).
PostgreSQL API (Citus) — Choose when you need relational semantics with horizontal scale-out. CSA-in-a-Box uses this for portal persistence where relational integrity and JOIN support outweigh document flexibility.
Partitioning Strategy¶
The partition key is the single most consequential design decision. You cannot change it after container creation without migrating data.
Partition Key Selection Criteria¶
| Goal | Why It Matters |
|---|---|
| High cardinality | Millions of distinct values prevent hot partitions |
| Even distribution | Uniform write/read spread across physical partitions |
| Query alignment | Most queries should target a single partition |
| Stability | The value must never change for a given document |
Examples by Workload¶
| Workload | Recommended Key | Avoid | Reason |
|---|---|---|---|
| User sessions | /userId | /sessionStatus | Status has 3-4 values (hot) |
| Marketplace catalog | /domainId | /category | Low cardinality |
| IoT telemetry | /deviceId | /timestamp | Recent timestamps create hot partition |
| Multi-tenant SaaS | /tenantId | /region | Region has <10 values |
| Audit log | Synthetic: /tenantId-yearMonth | /action | Action types are few |
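As an illustration of the synthetic-key row above, here is a minimal sketch of computing the `/tenantId-yearMonth` value at write time (the helper name and document shape are hypothetical, not part of CSA-in-a-Box):

```python
from datetime import datetime, timezone

def audit_partition_key(tenant_id: str, when: datetime | None = None) -> str:
    """Build the synthetic tenantId-yearMonth partition key value."""
    when = when or datetime.now(timezone.utc)
    return f"{tenant_id}-{when:%Y%m}"

# The key value is computed once at write time and never changes afterwards
audit_doc = {
    "id": "audit-123",
    "tenantId": "acme",
    "partitionKey": audit_partition_key("acme"),  # e.g. "acme-202601"
    "action": "login",
}
```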
Hierarchical Partition Keys¶
For workloads where a single key produces uneven partitions (e.g., a mega-tenant), use hierarchical partition keys (up to 3 levels):
```json
{
  "partitionKey": {
    "paths": ["/tenantId", "/userId", "/sessionId"],
    "kind": "MultiHash",
    "version": 2
  }
}
```
This distributes a large tenant's data across multiple physical partitions while keeping per-user queries efficient.
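The same policy can be set from the Python SDK. A sketch, assuming an azure-cosmos version with hierarchical partition key support (`kind="MultiHash"`):

```python
from azure.cosmos import CosmosClient, PartitionKey

client = CosmosClient(endpoint, credential)
database = client.get_database_client("csa-metadata")

# Three-level hierarchical key: queries that filter on the prefix levels
# (/tenantId, or /tenantId + /userId) still route efficiently
container = database.create_container_if_not_exists(
    id="sessions",
    partition_key=PartitionKey(
        path=["/tenantId", "/userId", "/sessionId"],
        kind="MultiHash"
    )
)
```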
Cross-Partition Query Costs¶
!!! warning "Cross-Partition Queries"
    A query that omits the partition key in its filter fans out to every physical partition. This multiplies RU cost by the number of physical partitions and adds latency. Always include the partition key in user-facing query paths.
```sql
-- Single-partition (fast, cheap)
SELECT * FROM c WHERE c.tenantId = 'acme' AND c.status = 'active'

-- Cross-partition (expensive, fan-out)
SELECT * FROM c WHERE c.status = 'active'
```
Hot Partition Detection¶
Monitor the Normalized RU Consumption metric per partition key range. If any range consistently exceeds 70%, you have a hot partition. Remediation options:
- Add a suffix to create a synthetic key (e.g., `/tenantId-random(0..9)`)
- Enable hierarchical partitioning
- Redesign the data model to split the hot entity
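One way to watch for hot partitions from code is to query the metric with the azure-monitor-query package. A sketch under stated assumptions (the resource ID and 70% threshold are illustrative):

```python
from datetime import timedelta
from azure.identity import DefaultAzureCredential
from azure.monitor.query import MetricsQueryClient

# Illustrative resource ID for the Cosmos DB account
resource_id = (
    "/subscriptions/<sub>/resourceGroups/csa-rg"
    "/providers/Microsoft.DocumentDB/databaseAccounts/csa-cosmos"
)

metrics = MetricsQueryClient(DefaultAzureCredential())
result = metrics.query_resource(
    resource_id,
    metric_names=["NormalizedRUConsumption"],
    timespan=timedelta(hours=1),
    granularity=timedelta(minutes=5),
    aggregations=["Maximum"]
)

# Flag intervals where any partition key range ran hot (> 70%)
for metric in result.metrics:
    for series in metric.timeseries:
        for point in series.data:
            if point.maximum is not None and point.maximum > 70:
                print(f"{point.timestamp}: {point.maximum:.0f}% normalized RU")
```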
Throughput Management¶
Provisioned vs. Serverless¶
| Model | Best For | Billing | Limits |
|---|---|---|---|
| Provisioned (manual) | Predictable, steady workloads | Per-hour RU/s reservation | Min 400 RU/s |
| Provisioned (autoscale) | Spiky workloads, production | Scales between 10% and 100% of max RU/s; billed hourly at the highest level reached | Min 1,000 max RU/s |
| Serverless | Dev/test, low traffic | Per-request RU charge | Max 5,000 RU/s burst |
!!! tip "CSA-in-a-Box Recommendation"
    Use serverless for sandbox and dev environments. Use autoscale for staging and production. Reserve manual provisioned throughput only for workloads with flat, well-understood demand.
RU Calculation Examples¶
Every operation in Cosmos DB costs Request Units (RU). Rough estimates:
| Operation | Approximate RU Cost |
|---|---|
| Point read (1 KB document by ID + partition key) | 1 RU |
| Point read (10 KB document) | ~3 RU |
| Write (1 KB document) | ~6 RU |
| Write (10 KB document) | ~30 RU |
| Query returning 5 documents (1 KB each, indexed) | ~5 RU |
| Cross-partition query (same result set) | ~5 RU x partition count |
Use the Cosmos DB Capacity Calculator for production sizing.
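To see real charges rather than estimates, the sync Python SDK surfaces the per-request charge in the response headers. A small sketch (container setup omitted):

```python
# Point read, then inspect the RU charge the service reported
item = container.read_item(item="doc-1", partition_key="user-42")
charge = container.client_connection.last_response_headers["x-ms-request-charge"]
print(f"Point read cost: {float(charge):.2f} RU")
```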
Autoscale Configuration¶
```python
from azure.cosmos import CosmosClient, PartitionKey, ThroughputProperties

client = CosmosClient(endpoint, credential)
database = client.create_database_if_not_exists("csa-metadata")

# Autoscale: scales between 1,000 and 10,000 RU/s
container = database.create_container_if_not_exists(
    id="sessions",
    partition_key=PartitionKey(path="/userId"),
    offer_throughput=ThroughputProperties(auto_scale_max_throughput=10000),
    default_ttl=86400  # 24-hour TTL (seconds)
)
```
Burst Capacity¶
Cosmos DB accumulates unused RU/s (up to 300 seconds worth) as burst capacity. Short spikes that exceed provisioned throughput consume from this budget before throttling. No configuration required — it is automatic.
Priority-Based Execution¶
For workloads mixing user-facing queries with background tasks (e.g., change feed processing), use priority-based execution to protect interactive latency:
```python
# High priority — user-facing reads
item = container.read_item(
    item="doc-1",
    partition_key="user-42",
    priority="High"
)

# Low priority — background analytics
results = container.query_items(
    query="SELECT * FROM c WHERE c.type = 'metrics'",
    partition_key="system",
    priority="Low"
)
```
Low-priority requests are throttled first when the container approaches its RU limit.
Data Modeling¶
Embed vs. Reference¶
Cosmos DB is not a relational database. Denormalization is the norm, not the exception.
| Strategy | When to Use | Trade-off |
|---|---|---|
| Embed (nested document) | Data is read together, bounded size, 1:few | Larger documents, atomic writes |
| Reference (separate documents) | Unbounded relationships, 1:many, shared entities | Extra round-trips, no JOINs |
```json
// Embedded: order with line items (bounded, always read together)
{
  "id": "order-001",
  "customerId": "cust-42",
  "items": [
    { "sku": "WIDGET-A", "qty": 2, "price": 19.99 },
    { "sku": "GADGET-B", "qty": 1, "price": 49.99 }
  ],
  "total": 89.97
}

// Referenced: product catalog (shared across orders, updated independently)
{
  "id": "WIDGET-A",
  "productId": "WIDGET-A",
  "name": "Widget Alpha",
  "price": 19.99,
  "category": "widgets"
}
```
Denormalization Patterns for Analytics¶
In analytics contexts, denormalize aggressively to avoid cross-document lookups:
- Duplicate lookup fields — store `customerName` on the order, not just `customerId`
- Pre-aggregate — maintain running totals in the parent document
- Store computed fields — `dayOfWeek`, `fiscalQuarter` at write time to avoid query-time computation
Use the change feed to keep denormalized copies consistent.
TTL for Data Expiration¶
Set `defaultTtl` on the container and per-document `ttl` (in seconds) to automatically expire data at zero RU cost:
```python
# Container-level: all documents expire after 30 days unless overridden
container = database.create_container_if_not_exists(
    id="events",
    partition_key=PartitionKey(path="/deviceId"),
    default_ttl=2592000  # 30 days (seconds)
)

# Document-level override: this document expires in 1 hour
item = {
    "id": "evt-999",
    "deviceId": "sensor-07",
    "payload": { ... },
    "ttl": 3600
}
container.upsert_item(item)
```
Use TTL for session state, cache documents, staging data awaiting lakehouse ingestion, and soft-delete grace periods.
Change Feed to Lakehouse¶
The Cosmos DB change feed provides an ordered stream of inserts and updates (not deletes unless soft-delete + TTL). This is the primary CDC mechanism for feeding operational data into the Bronze layer.
```mermaid
sequenceDiagram
    participant App as Application
    participant Cosmos as Cosmos DB (NoSQL)
    participant CFP as Change Feed Processor
    participant ADLS as ADLS Gen2 (Bronze)
    participant DBR as Databricks / dbt
    App->>Cosmos: Write document
    Cosmos-->>CFP: Change feed event
    CFP->>ADLS: Write JSON / Delta to Bronze
    DBR->>ADLS: Read Bronze, transform to Silver
```

Azure Functions Trigger¶
The simplest integration for low-to-medium throughput:
```python
import json
import logging
import os

import azure.functions as func
from azure.storage.filedatalake import DataLakeServiceClient

app = func.FunctionApp()

@app.cosmos_db_trigger(
    arg_name="documents",
    container_name="metadata",
    database_name="csa-metadata",
    connection="CosmosDBConnection",
    lease_container_name="leases",
    create_lease_container_if_not_exists=True
)
def cosmos_to_bronze(documents: func.DocumentList):
    """Write Cosmos DB changes to ADLS Bronze layer as JSON."""
    if not documents:
        return
    datalake = DataLakeServiceClient.from_connection_string(
        conn_str=os.environ["ADLS_CONNECTION"]
    )
    fs = datalake.get_file_system_client("bronze")
    for doc in documents:
        data = json.loads(doc.to_json())
        path = f"cosmos/metadata/{data['id']}.json"
        file_client = fs.get_file_client(path)
        file_client.upload_data(json.dumps(data), overwrite=True)
    logging.info("Wrote %d documents to Bronze", len(documents))
```
Change Feed Processor (SDK)¶
For higher throughput or custom processing logic, consume the feed directly. Note that the lease-based change feed processor is a .NET and Java SDK feature; the Python SDK exposes a pull model in which you iterate changes and persist your own checkpoint:

```python
from azure.cosmos import CosmosClient

client = CosmosClient(endpoint, credential)
database = client.get_database_client("csa-metadata")
source = database.get_container_client("metadata")

# Pull model: iterate every change from the beginning of the feed
for doc in source.query_items_change_feed(
    is_start_from_beginning=True,
    max_item_count=100
):
    # Transform and write to the lakehouse
    write_to_bronze(doc)

# The continuation token is surfaced in the response headers; persist it and
# pass it back via continuation= to resume where you left off after a restart
continuation = source.client_connection.last_response_headers.get("etag")
```
Exactly-Once Processing¶
The change feed guarantees at-least-once delivery. For exactly-once semantics:
- Idempotent writes — use the document `id` as the Bronze file name (upsert semantics), as sketched below
- Lease container — tracks the continuation token per partition; restarts resume from the last checkpoint
- Deduplication downstream — the dbt Silver layer deduplicates on `id` + `_ts` (the Cosmos timestamp)
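A minimal sketch of the first point, reusing the `fs` file-system client from the Functions example above (`write_to_bronze` is the placeholder helper the pull-model snippet calls; illustrative, not a fixed CSA-in-a-Box API):

```python
import json

def write_to_bronze(doc: dict) -> None:
    """Idempotent Bronze write: the document id doubles as the file name,
    so at-least-once redelivery overwrites the same file instead of
    creating duplicates."""
    path = f"cosmos/metadata/{doc['id']}.json"
    fs.get_file_client(path).upload_data(json.dumps(doc), overwrite=True)
```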
GraphRAG Integration¶
Cosmos DB Gremlin API stores the knowledge graph that powers GraphRAG entity-relationship queries. For the full tutorial, see Tutorial 09 — GraphRAG Knowledge Graphs.
Entity-Relationship Modeling¶
```mermaid
graph LR
    E1[Entity: DataProduct<br/>name, domain, owner] -->|DEPENDS_ON| E2[Entity: DataProduct<br/>name, domain, owner]
    E1 -->|USES| E3[Entity: Column<br/>name, type, pii]
    E4[Entity: Pipeline<br/>name, schedule] -->|PRODUCES| E1
    E5[Entity: Team<br/>name, org] -->|OWNS| E1
    style E1 fill:#bbdefb,stroke:#1565c0
    style E2 fill:#bbdefb,stroke:#1565c0
    style E3 fill:#c8e6c9,stroke:#2e7d32
    style E4 fill:#fff9c4,stroke:#f9a825
    style E5 fill:#f8bbd0,stroke:#c62828
```

Vertices represent entities (data products, columns, pipelines, teams). Edges represent relationships (DEPENDS_ON, USES, PRODUCES, OWNS). This structure supports:
- Impact analysis — "What breaks if I change this column?"
- Lineage traversal — "Where did this data product originate?"
- Community detection — Identify tightly-coupled data domains
Graph Traversal Queries¶
```gremlin
// Find all downstream dependencies of a data product
g.V().has('dataProduct', 'name', 'customer-360')
  .repeat(out('DEPENDS_ON'))
  .until(outE('DEPENDS_ON').count().is(0))
  .path()
  .by('name')

// Find all PII columns in a domain
g.V().has('domain', 'name', 'sales')
  .out('OWNS').out('USES')
  .has('pii', true)
  .values('name')

// Impact analysis: what pipelines are affected?
g.V().has('column', 'name', 'ssn')
  .in('USES').in('PRODUCES')
  .dedup()
  .values('name')
```
Integration with Azure OpenAI¶
GraphRAG combines graph traversal with LLM reasoning. The pattern:
1. Extract entities and relationships from documents using Azure OpenAI
2. Store the graph in Cosmos DB Gremlin
3. Query the graph for context relevant to the user's question
4. Augment the LLM prompt with graph-retrieved context (see the sketch after the connection code below)
```python
from gremlin_python.driver import client as gremlin_client
from gremlin_python.driver import serializer

# Connect to Cosmos DB Gremlin
# (Cosmos DB requires the GraphSON v2 message serializer)
gremlin = gremlin_client.Client(
    url="wss://<account>.gremlin.cosmos.azure.com:443/",
    traversal_source="g",
    username="/dbs/graphdb/colls/knowledge",
    password="<primary-key>",
    message_serializer=serializer.GraphSONSerializersV2d0()
)

# Retrieve context for a question
def get_graph_context(entity_name: str) -> list[dict]:
    """Traverse the knowledge graph for context around an entity."""
    query = (
        f"g.V().has('name', '{entity_name}')"
        f".bothE().otherV().path()"
        f".by(valueMap(true))"
    )
    results = gremlin.submit(query).all().result()
    return results
```
!!! warning "Production Authentication"
    In production, replace the primary key with a managed identity credential. See the Security section below.
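Step 4 of the pattern in sketch form, assuming the `openai` package's Azure client; the deployment name, endpoint, and prompt wording are illustrative:

```python
import json
from openai import AzureOpenAI

llm = AzureOpenAI(
    azure_endpoint="https://<account>.openai.azure.com/",
    api_key="<api-key>",
    api_version="2024-02-01"
)

def answer_with_graph(question: str, entity_name: str) -> str:
    """Augment the LLM prompt with graph-retrieved context."""
    context = get_graph_context(entity_name)
    response = llm.chat.completions.create(
        model="gpt-4o",  # your deployment name
        messages=[
            {"role": "system",
             "content": "Answer using the knowledge-graph context provided."},
            {"role": "user",
             "content": f"Context:\n{json.dumps(context, default=str)}\n\n"
                        f"Question: {question}"},
        ],
    )
    return response.choices[0].message.content
```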
Global Distribution¶
Multi-Region Writes¶
Cosmos DB supports active-active multi-region writes. Each region accepts writes locally and replicates asynchronously.
| Topology | Use Case | Trade-off |
|---|---|---|
| Single-region write, multi-region read | Most workloads, DR with automatic failover | Writes go to one region; reads are local |
| Multi-region write | Active-active global apps, latency-sensitive writes | Conflict resolution required; higher cost |
!!! warning "Multi-Region Writes Are Not Free DR"
    Multi-region writes are for active-active scenarios where write latency matters globally. For DR alone, single-region write with automatic failover is simpler, cheaper, and avoids conflict-resolution complexity.
Consistency Levels¶
Cosmos DB offers five consistency levels, each a trade-off between latency, availability, and data freshness.
| Level | Guarantee | Latency Impact | RU Cost | Best For |
|---|---|---|---|---|
| Strong | Linearizable reads | Highest (cross-region round-trip) | 2x read RU | Financial transactions |
| Bounded Staleness | Reads lag by at most K versions or T seconds | High | 2x read RU | Predictable max staleness |
| Session (default) | Read-your-own-writes within session token | Medium | 1x read RU | User-facing applications |
| Consistent Prefix | Reads never see out-of-order writes | Low | 1x read RU | Event streams, append-only logs |
| Eventual | No ordering guarantees | Lowest | 1x read RU | Analytics, aggregations |
!!! tip "Start with Session"
    Session consistency is the default and the right choice for most workloads. Tighten to Bounded Staleness or Strong only when you have a concrete consistency requirement. Loosen to Eventual only for analytics queries that tolerate stale reads.
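Consistency can also be relaxed per client (never strengthened beyond the account default). A sketch for an analytics-only reader, assuming the account default is Session:

```python
from azure.identity import DefaultAzureCredential
from azure.cosmos import CosmosClient

# Relax consistency to Eventual for this client only;
# other clients keep the account default (Session)
analytics_client = CosmosClient(
    url="https://csa-cosmos.documents.azure.com:443/",
    credential=DefaultAzureCredential(),
    consistency_level="Eventual"
)
```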
Conflict Resolution¶
When multi-region writes are enabled, conflicts (concurrent writes to the same document in different regions) are resolved by one of:
- Last Writer Wins (LWW) — highest `_ts` wins (default)
- Custom conflict resolution — stored procedure or merge function
- Async conflict feed — the application reads the conflict feed and resolves manually
For CSA-in-a-Box, LWW is sufficient for metadata and session workloads. Graph workloads on Gremlin use single-region write to avoid vertex/edge conflict complexity.
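A sketch of pinning LWW to a custom numeric path at container creation (meaningful only on multi-region-write accounts; the default resolution path is `_ts`, and `/lastModified` here is illustrative):

```python
from azure.cosmos import CosmosClient, PartitionKey

client = CosmosClient(endpoint, credential)
database = client.get_database_client("csa-metadata")

# LWW conflict resolution keyed on an application-controlled numeric path
container = database.create_container_if_not_exists(
    id="sessions",
    partition_key=PartitionKey(path="/userId"),
    conflict_resolution_policy={
        "mode": "LastWriterWins",
        "conflictResolutionPath": "/lastModified",
    },
)
```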
Security¶
Managed Identity (Recommended)¶
```python
from azure.identity import DefaultAzureCredential
from azure.cosmos import CosmosClient

# No connection strings or keys in code
credential = DefaultAzureCredential()
client = CosmosClient(
    url="https://csa-cosmos.documents.azure.com:443/",
    credential=credential
)
```
RBAC (Data Plane)¶
Use Cosmos DB built-in roles instead of primary/secondary keys:
| Role | Permissions | Assign To |
|---|---|---|
| Cosmos DB Built-in Data Reader | Read all containers | Analytics pipelines, read replicas |
| Cosmos DB Built-in Data Contributor | Read + write all containers | Application managed identities |
| Custom role | Fine-grained per-container | Least-privilege service identities |
```bash
# Assign data contributor role to a managed identity
az cosmosdb sql role assignment create \
  --account-name csa-cosmos \
  --resource-group csa-rg \
  --role-definition-id "00000000-0000-0000-0000-000000000002" \
  --principal-id "$MI_OBJECT_ID" \
  --scope "/dbs/csa-metadata"
```
Network Isolation¶
| Layer | Configuration |
|---|---|
| Private Endpoint | Cosmos DB accessible only via VNet private IP |
| Service Endpoint | Allow traffic only from specific subnets |
| IP firewall | Restrict to known IP ranges (fallback) |
| Disable public access | Set `publicNetworkAccess: Disabled` after the private endpoint is configured |
Encryption¶
- At rest — enabled by default with Microsoft-managed keys; optionally use customer-managed keys (CMK) in Azure Key Vault
- In transit — TLS 1.2 enforced on all connections
- Client-side encryption — available via SDK for field-level encryption of sensitive data (Always Encrypted)
Diagnostic Logs¶
Enable diagnostic settings to route to Log Analytics for auditing and troubleshooting:
```bash
az monitor diagnostic-settings create \
  --name cosmos-diagnostics \
  --resource "/subscriptions/$SUB/resourceGroups/csa-rg/providers/Microsoft.DocumentDB/databaseAccounts/csa-cosmos" \
  --workspace "/subscriptions/$SUB/resourceGroups/csa-rg/providers/Microsoft.OperationalInsights/workspaces/csa-logs" \
  --logs '[
    {"category": "DataPlaneRequests", "enabled": true},
    {"category": "QueryRuntimeStatistics", "enabled": true},
    {"category": "PartitionKeyStatistics", "enabled": true},
    {"category": "ControlPlaneRequests", "enabled": true}
  ]'
```
Cost Optimization¶
Environment-Tier Strategy¶
| Environment | Throughput Model | Why |
|---|---|---|
| Dev / Sandbox | Serverless | Pay only for requests; zero cost when idle |
| Staging | Autoscale (low max) | Mimics production behavior at reduced scale |
| Production | Autoscale or Reserved | Autoscale for spiky; reserved capacity for predictable (up to 65% savings) |
Reserved Capacity¶
For production workloads with predictable throughput, reserved capacity provides significant savings:
| Term | Discount vs. Pay-as-you-go |
|---|---|
| 1-year reservation | ~20% savings |
| 3-year reservation | ~30-65% savings (varies by region) |
Integrated Cache¶
The integrated cache (built into the Cosmos DB gateway) reduces RU consumption for repeated point reads and queries:
```python
# Enable the dedicated gateway with integrated cache
# (configured at the account level via Azure Portal or Bicep)

# Reads automatically use the cache when routed through the dedicated gateway
item = container.read_item(
    item="doc-1",
    partition_key="user-42",
    # Staleness tolerance: accept cached data up to 60 seconds old
    max_integrated_cache_staleness_in_ms=60000
)
```
Indexing Policy Tuning¶
The default indexing policy indexes every path — expensive for write-heavy workloads. Exclude paths you never query:
```json
{
  "indexingMode": "consistent",
  "automatic": true,
  "includedPaths": [
    { "path": "/tenantId/?" },
    { "path": "/status/?" },
    { "path": "/createdAt/?" }
  ],
  "excludedPaths": [
    { "path": "/payload/*" },
    { "path": "/metadata/*" },
    { "path": "/\"_etag\"/?" }
  ]
}
```
!!! tip "Index What You Query"
    Excluding large nested objects (e.g., `/payload/*`) from indexing can reduce write RU cost by 30-50% on document-heavy workloads.
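The same policy can be applied at container creation from the Python SDK. A sketch mirroring the JSON above:

```python
from azure.cosmos import CosmosClient, PartitionKey

indexing_policy = {
    "indexingMode": "consistent",
    "automatic": True,
    "includedPaths": [
        {"path": "/tenantId/?"},
        {"path": "/status/?"},
        {"path": "/createdAt/?"},
    ],
    "excludedPaths": [
        {"path": "/payload/*"},
        {"path": "/metadata/*"},
        {"path": '/"_etag"/?'},  # system property paths must be quoted
    ],
}

client = CosmosClient(endpoint, credential)
database = client.get_database_client("csa-metadata")
container = database.create_container_if_not_exists(
    id="events",
    partition_key=PartitionKey(path="/tenantId"),
    indexing_policy=indexing_policy
)
```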
Cosmos DB Analytical Store (Synapse Link)¶
The analytical store is an auto-synced, fully isolated columnar store within Cosmos DB. It enables no-ETL analytics over operational data.
```mermaid
flowchart LR
    App[Application] -->|Writes| TX[Transactional Store<br/>Row-oriented]
    TX -->|Auto-sync ~2 min| AN[Analytical Store<br/>Column-oriented]
    AN -->|Synapse Link| SS[Synapse Serverless SQL]
    AN -->|Synapse Link| SP[Synapse Spark]
    SS --> PBI[Power BI]
    SP --> Delta[(Delta Lake)]
    style TX fill:#e3f2fd,stroke:#1565c0
    style AN fill:#fff3e0,stroke:#e65100
```

Enabling Analytical Store¶
```bash
# Enable Synapse Link on the account (one-time, irreversible)
az cosmosdb update \
  --name csa-cosmos \
  --resource-group csa-rg \
  --enable-analytical-storage true

# Enable analytical store on a container
az cosmosdb sql container update \
  --account-name csa-cosmos \
  --database-name csa-metadata \
  --name metadata \
  --resource-group csa-rg \
  --analytical-storage-ttl -1  # -1 = infinite retention
```
Querying from Synapse Serverless SQL¶
```sql
-- No ETL, no RU impact on transactional workload
SELECT
    c.domainId,
    c.status,
    COUNT(*) AS product_count,
    AVG(c.qualityScore) AS avg_quality
FROM OPENROWSET(
    'CosmosDB',
    'Account=csa-cosmos;Database=csa-metadata;Key=<read-only-key>',
    metadata
) AS c
GROUP BY c.domainId, c.status
ORDER BY product_count DESC
```
!!! note "Zero RU Impact"
    Analytical store queries consume Synapse capacity, not Cosmos DB RU/s. Your transactional workload is completely unaffected.
When to Use Analytical Store vs. Change Feed¶
| Scenario | Use Analytical Store | Use Change Feed |
|---|---|---|
| Ad-hoc analytics over current data | Yes | No |
| Dashboard queries (Power BI) | Yes | No |
| Feed data into Delta Lake (Bronze) | No | Yes |
| Event-driven downstream processing | No | Yes |
| Historical trend analysis | Either | Either |
| Real-time CDC to lakehouse | No (2-min lag) | Yes (seconds) |
Anti-Patterns¶
!!! danger "Do Not"
    - Use Cosmos DB as a data warehouse. It is an operational database; use Synapse Link or the change feed to push data into the lakehouse for analytical workloads.
    - Pick a low-cardinality partition key. Keys like `status`, `region`, or `category` create hot partitions, and you cannot change the key later.
    - Default to Strong consistency. It doubles read RU cost and adds cross-region latency; Session consistency handles most use cases.
    - Ignore the indexing policy on write-heavy containers. The default indexes every path, inflating write costs.
    - Run cross-partition queries in user-facing paths. These fan out to every physical partition, multiplying latency and RU cost.
    - Use multi-region writes "for DR". It adds conflict-resolution complexity and cost; single-region write with automatic failover is the correct DR pattern.
!!! success "Do"
    - Use Synapse Link for analytics instead of querying the transactional store directly.
    - Set TTL on time-bounded data (sessions, caches, staging); TTL deletes are free.
    - Use autoscale for production workloads with variable traffic.
    - Use serverless for dev/test to avoid paying for idle RU/s.
    - Monitor partition heat maps via the Normalized RU Consumption metric.
    - Use managed identity instead of connection strings or primary keys.
    - Tune the indexing policy to exclude paths you never query.
    - Use bulk operations for batch inserts (up to 10x cheaper than individual writes).
Production Readiness Checklist¶
- Partition key validated for cardinality, distribution, and query alignment
- Throughput model selected (serverless / autoscale / provisioned)
- Indexing policy tuned (excluded unused paths)
- TTL configured for time-bounded data
- Consistency level set appropriately (Session unless otherwise justified)
- Managed identity configured (no keys in code or config)
- RBAC data-plane roles assigned (least privilege)
- Private endpoint configured; public access disabled
- Diagnostic logs enabled (DataPlaneRequests, QueryRuntimeStatistics, PartitionKeyStatistics)
- Synapse Link enabled (if analytics queries needed)
- Change feed configured (if CDC to lakehouse required)
- Alerts set for throttling (HTTP 429), high RU consumption, and replication lag
- Backup policy reviewed (continuous backup with point-in-time restore recommended)
- Reserved capacity evaluated for production cost savings
Cross-References¶
| Resource | Description |
|---|---|
| Pattern -- Cosmos DB | Partition key, consistency, throughput, and TTL patterns |
| Pattern -- Streaming & CDC | Change feed integration with Event Hubs and lakehouse |
| Tutorial 09 -- GraphRAG Knowledge Graphs | End-to-end GraphRAG with Cosmos DB Gremlin |
| ADR-0015 -- PostgreSQL Portal Persistence | Why portal uses Cosmos DB PostgreSQL API |
| Best Practices -- Cost Optimization | Cross-service cost strategies including Cosmos DB |
| Best Practices -- Performance Tuning | Query optimization and indexing strategies |
| Best Practices -- Security & Compliance | Managed identity, RBAC, and network isolation |
| Cosmos DB Modeling | Official data modeling guidance |
| Cosmos DB Partitioning | Official partitioning documentation |
| Synapse Link for Cosmos DB | Official analytical store documentation |