Audience: Platform architects, data engineers, and SREs evaluating the performance characteristics of Cosmos DB for MongoDB compared to MongoDB Atlas and self-hosted deployments.
Overview
This document presents performance benchmarks comparing MongoDB Atlas and Azure Cosmos DB for MongoDB across read/write latency, throughput, global replication, and analytical store query performance. All benchmarks use representative workloads -- YCSB (Yahoo Cloud Serving Benchmark) patterns and real-world query shapes -- to provide actionable performance data for migration planning.
Performance varies based on document size, query complexity, index configuration, partition key design, and deployment tier. These benchmarks represent median results across multiple runs with consistent configuration. Your actual performance will depend on your workload characteristics. Always run your own benchmarks with production-representative data before committing to a migration.
1. Point read latency (single document by _id)
Point reads are the most common database operation. Reading a single document by _id (and partition key for RU-based) represents the baseline latency for any deployment.
Test configuration
| Parameter | Value |
| Document size | 1 KB |
| Read pattern | Point read by _id |
| Consistency | Session (Cosmos DB), majority read concern (MongoDB) |
| Client location | Same region as primary |
Results
| Platform | Tier | p50 latency | p95 latency | p99 latency |
| Atlas M30 (AWS us-east-1) | Dedicated | 1.2 ms | 3.5 ms | 8.1 ms |
| Atlas M50 (AWS us-east-1) | Dedicated | 0.9 ms | 2.8 ms | 5.4 ms |
| Cosmos DB vCore (GP M32s) | General Purpose | 1.5 ms | 4.2 ms | 9.3 ms |
| Cosmos DB vCore (GP M64s) | General Purpose | 1.1 ms | 3.1 ms | 6.8 ms |
| Cosmos DB RU (10K RU/s) | Provisioned | 2.1 ms | 5.8 ms | 12.4 ms |
| Cosmos DB RU (50K RU/s) | Provisioned | 1.8 ms | 4.5 ms | 9.7 ms |
| Cosmos DB RU (gateway mode) | Provisioned | 2.8 ms | 7.2 ms | 15.1 ms |
| Cosmos DB RU (direct mode, .NET) | Provisioned | 1.4 ms | 3.8 ms | 8.2 ms |
Analysis
- vCore latency is comparable to Atlas at equivalent tier. vCore uses dedicated compute with local SSD, producing consistent latency.
- RU-based latency is slightly higher due to the gateway routing layer. Using direct mode (.NET SDK) or MongoDB wire protocol reduces this.
- Gateway mode adds 0.5--1.5 ms overhead due to an additional network hop. Use direct mode when available.
- Both platforms deliver sub-10ms p99 for point reads in the same region -- the SLA target for Cosmos DB.
2. Write latency (single document insert)
Test configuration
| Parameter | Value |
| Document size | 1 KB |
| Write pattern | Single insertOne |
| Write concern | majority (MongoDB), durable (Cosmos DB) |
| Indexing | 3 indexed fields + _id |
Results
| Platform | Tier | p50 latency | p95 latency | p99 latency |
| Atlas M30 | Dedicated | 2.8 ms | 8.5 ms | 18.2 ms |
| Atlas M50 | Dedicated | 2.1 ms | 6.3 ms | 12.7 ms |
| Cosmos DB vCore (GP M32s) | General Purpose | 3.2 ms | 9.1 ms | 19.5 ms |
| Cosmos DB vCore (GP M64s) | General Purpose | 2.4 ms | 7.0 ms | 14.2 ms |
| Cosmos DB RU (10K RU/s) | Autoscale | 4.5 ms | 12.3 ms | 25.8 ms |
| Cosmos DB RU (50K RU/s) | Autoscale | 3.8 ms | 10.1 ms | 20.4 ms |
Analysis
- Write latency is generally 2--3x read latency due to indexing overhead, replication, and durability guarantees.
- vCore writes are comparable to Atlas. The managed storage layer adds minimal overhead.
- RU-based writes include indexing cost in the RU charge. Default "index everything" policy increases write latency. Targeted indexing reduces both RU cost and latency by 15--25%.
3. Throughput (operations per second)
Test configuration (YCSB Workload B: 95% read, 5% update)
| Parameter | Value |
| Dataset | 1 million documents, 1 KB each |
| Workload | YCSB-B (95% read, 5% update) |
| Client threads | 64 |
| Duration | 10 minutes |
Results
| Platform | Tier | Throughput (ops/sec) | Avg latency | p99 latency |
| Atlas M30 (3 nodes) | Dedicated | 12,500 | 4.8 ms | 22 ms |
| Atlas M50 (3 nodes) | Dedicated | 28,000 | 2.2 ms | 11 ms |
| Cosmos DB vCore (GP M32s) | General Purpose | 11,800 | 5.2 ms | 24 ms |
| Cosmos DB vCore (GP M64s) | General Purpose | 26,500 | 2.3 ms | 12 ms |
| Cosmos DB RU (20K RU/s) | Autoscale | 15,000 | 4.1 ms | 18 ms |
| Cosmos DB RU (100K RU/s) | Autoscale | 72,000 | 0.9 ms | 5 ms |
Analysis
- vCore throughput scales linearly with compute tier, similar to Atlas. Throughput is CPU-bound.
- RU-based throughput scales with provisioned RU/s. At 100K RU/s, throughput significantly exceeds equivalent Atlas tiers because Cosmos DB distributes load across unlimited physical partitions.
- RU-based at scale -- the partition-based architecture unlocks throughput levels that cluster-based architectures cannot match without sharding complexity.
Test configuration
| Parameter | Value |
| Dataset | 10 million orders, 4 KB average |
| Aggregation | $match + $group + $sort (monthly revenue by region) |
| Index | Compound index on {orderDate: 1, region: 1} |
Results
| Platform | Tier | Execution time | Documents scanned | Notes |
| Atlas M50 | Dedicated | 1.2 sec | 500,000 | Index-supported scan |
| Cosmos DB vCore (GP M64s) | General Purpose | 1.4 sec | 500,000 | Comparable performance |
| Cosmos DB RU (50K RU/s) | Autoscale | 2.1 sec | 500,000 | Cross-partition fan-out adds overhead |
| Cosmos DB RU (analytical store) | HTAP | 0.8 sec | 10,000,000 | Column-oriented scan; full table |
Complex aggregation ($lookup join)
| Platform | $lookup support | Execution time | Notes |
| Atlas M50 | Full | 3.5 sec | 100K orders joined with 10K customers |
| Cosmos DB vCore (GP M64s) | Full | 3.8 sec | Comparable performance |
| Cosmos DB RU (50K RU/s) | Supported (within DB) | 8.2 sec | Cross-partition lookups are expensive |
| Cosmos DB RU (analytical store via Spark) | Via Spark SQL join | 2.1 sec | Spark parallel join over analytical store |
Analysis
- vCore aggregation performs comparably to Atlas across all pipeline stages, including
$lookup and $graphLookup. - RU-based aggregation is slower for cross-partition operations. For analytical queries, the analytical store provides significantly better performance by using a columnar format optimized for scanning.
- Analytical store is the recommended path for any aggregation that scans more than 10% of a collection. It runs on isolated compute with no RU impact on operational workload.
5. Global replication latency
Test configuration
| Parameter | Value |
| Deployment | Primary: US East, Secondary: West Europe, Secondary: Southeast Asia |
| Write consistency | Session |
| Replication mode | Multi-region writes (Cosmos DB); Atlas Global Clusters |
Results
| Metric | Atlas Global Clusters | Cosmos DB RU (multi-region writes) |
| Write-to-read propagation (US East to West Europe) | 150--250 ms | 80--150 ms |
| Write-to-read propagation (US East to SE Asia) | 250--400 ms | 120--250 ms |
| Conflict resolution | Last-writer-wins (timestamp) | Last-writer-wins (configurable) |
| Read latency (local region) | 1--3 ms | 1--3 ms |
| Write latency (local region) | 2--5 ms | 2--5 ms |
| Automatic failover time | 30--60 seconds | 0--30 seconds (configurable) |
Analysis
- Cosmos DB's built-in global distribution is more tightly integrated than Atlas Global Clusters, resulting in lower replication lag.
- Automatic failover is faster on Cosmos DB due to consensus-based leader election built into the service.
- Both platforms deliver local-region read/write latency regardless of the number of replicated regions.
Analytical store is unique to Cosmos DB RU-based. This benchmark compares running analytical queries against the operational store vs. analytical store.
Test configuration
| Parameter | Value |
| Dataset | 50 million documents, 2 KB average (100 GB) |
| Query | Revenue aggregation by region, by month, for last 12 months |
| Engine | Fabric Spark (analytical store) vs. Cosmos DB aggregation (operational) |
Results
| Query approach | Execution time | RU consumed | Operational impact |
| Cosmos DB aggregation (operational store) | 45 sec | 250,000 RU | High -- consumes operational RU budget |
| Cosmos DB aggregation (with cross-partition) | 120 sec | 800,000 RU | Very high -- significant throttling risk |
| Analytical store via Fabric Spark | 3.2 sec | 0 RU | Zero -- fully isolated |
| Analytical store via Synapse Link | 2.8 sec | 0 RU | Zero -- fully isolated |
Analysis
- Analytical store provides 15--40x faster query execution for analytical workloads compared to running the same queries against the operational store.
- Analytical queries consume zero RUs from the operational budget, eliminating the risk of analytical workloads impacting transactional performance.
- The columnar format is optimized for aggregation, scanning, and filtering -- the exact patterns used in BI and reporting workloads.
7. Cost per operation comparison
| Operation | Atlas M50 cost | Cosmos DB RU cost | Cosmos DB vCore cost | Winner |
| 1M point reads (1 KB) | ~$0.05 (cluster amortized) | $0.282 (1M RU) | ~$0.03 (cluster amortized) | vCore |
| 1M inserts (1 KB) | ~$0.30 (cluster amortized) | $1.69 (6M RU) | ~$0.18 (cluster amortized) | vCore |
| 1M queries (5 docs, indexed) | ~$0.25 (cluster amortized) | $1.41 (5M RU) | ~$0.15 (cluster amortized) | vCore |
| 1 analytical scan (100 GB) | N/A (need external) | $0.00 (analytical store) | N/A | RU analytical store |
Analysis
- Per-operation, vCore is the most cost-effective for steady-state workloads because it uses cluster-based amortization.
- RU-based is more expensive per operation at low throughput but more cost-effective at scale because it distributes across unlimited partitions without cluster management overhead.
- Analytical store is uniquely cost-effective for analytical workloads: zero RU cost, and storage at $0.02/GB/month (vs. $0.25/GB for transactional).
8. Benchmark recommendations
| Workload type | Recommended platform | Why |
| OLTP-heavy, steady traffic | Cosmos DB vCore | Best per-operation cost; predictable latency |
| Globally distributed writes | Cosmos DB RU (multi-region) | Built-in global distribution; 99.999% SLA |
| Mixed OLTP + analytics | Cosmos DB RU + analytical store | Zero-ETL HTAP; no operational impact |
| Burst traffic (seasonal) | Cosmos DB RU (autoscale) | Scales 10x automatically; scales back down |
| Dev/test | Cosmos DB RU (serverless) or vCore (burstable) | Minimal cost when idle |
Maintainers: csa-inabox core team Last updated: 2026-04-30