🌐 Azure Cosmos DB Quickstart¶
Get started with Azure Cosmos DB in under an hour. Learn to create a globally distributed NoSQL database and perform basic CRUD operations.
🎯 Learning Objectives¶
After completing this quickstart, you will be able to:
- Understand what Azure Cosmos DB is and its key features
- Create a Cosmos DB account with Core (SQL) API
- Create databases, containers, and items
- Perform CRUD operations using Python SDK
- Query data with SQL-like syntax
- Monitor performance and costs
📋 Prerequisites¶
Before starting, ensure you have:
- Azure subscription - Create free account
- Python 3.7+ - Download Python
- Azure Portal access - portal.azure.com
- Code editor - VS Code recommended
- Basic JSON knowledge - Understanding of key-value pairs
🔍 What is Azure Cosmos DB?¶
Azure Cosmos DB is a fully managed, globally distributed, multi-model NoSQL database service designed for:
- Global distribution - Multi-region writes and reads
- Low latency - Single-digit millisecond response times
- High availability - Up to 99.999% availability SLA for multi-region accounts (99.99% for single-region accounts)
- Flexible scaling - Elastic throughput and storage
- Multiple APIs - NoSQL (formerly called Core (SQL)), MongoDB, Cassandra, Gremlin, Table
Key Concepts¶
- Account: Top-level resource containing databases
- Database: Logical namespace for containers
- Container: Collection of items (like a table)
- Item: Individual JSON document
- Partition Key: Property used to distribute data
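To make the relationship between an item and its partition key concrete, here is a minimal local sketch. The helper name `partition_key_value` is hypothetical and not part of any SDK; it only illustrates how a path such as `/category` points at a property inside the item's JSON.

```python
def partition_key_value(item: dict, path: str):
    """Return the value found at a partition key path such as "/category".

    Hypothetical helper for illustration only; it is a local check,
    not a description of how the service resolves partition keys.
    """
    current = item
    for part in path.strip("/").split("/"):
        if not isinstance(current, dict) or part not in current:
            raise KeyError(f"item has no property at partition key path {path!r}")
        current = current[part]
    return current


item = {"id": "laptop-001", "name": "Gaming Laptop", "category": "Electronics"}
print(partition_key_value(item, "/category"))
```

Items that share the same value at this path belong to the same logical partition.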
When to Use Cosmos DB¶
✅ Good For:
- Globally distributed applications
- Low-latency requirements (<10ms)
- IoT and telemetry data
- Real-time analytics
- User profiles and catalogs
- Shopping carts and session data
❌ Not Ideal For:
- Traditional relational data (use SQL Database)
- Complex joins across tables
- Very large analytical queries (use Synapse)
🚀 Step 1: Create Cosmos DB Account¶
Using Azure Portal¶
- Navigate to Azure Portal
  - Go to portal.azure.com
  - Click "Create a resource"
  - Search for "Azure Cosmos DB"
  - Click "Create"
- Select API
  - Choose "Core (SQL)" (labeled "Azure Cosmos DB for NoSQL" in newer portal versions) - Recommended for beginners
  - Click "Create"
- Configure Basics
  - Subscription: Select your subscription
  - Resource Group: Create new "rg-cosmos-quickstart"
  - Account Name: "cosmos-quickstart-[yourname]" (globally unique)
  - Location: Select nearest region
  - Capacity mode: Provisioned throughput
  - Apply Free Tier Discount: Yes (if available)
- Global Distribution
  - Geo-Redundancy: Disable (for quickstart)
  - Multi-region Writes: Disable (for quickstart)
- Review and Create
  - Click "Review + create"
  - Click "Create"
  - Wait 5-10 minutes for deployment
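Because the account name must be globally unique and becomes part of the endpoint URL, it can help to check its shape before submitting the form. The sketch below assumes the naming rule is 3 to 44 characters drawn from lowercase letters, digits, and hyphens; treat that rule as an assumption and confirm it against the current Azure naming documentation. The function name is hypothetical, and the check cannot tell whether a name is already taken.

```python
import re

# Assumed rule: 3-44 characters, lowercase letters, digits, and hyphens only.
_ACCOUNT_NAME = re.compile(r"[a-z0-9-]{3,44}")


def is_valid_account_name(name: str) -> bool:
    """Return True if the name matches the assumed naming rule."""
    return _ACCOUNT_NAME.fullmatch(name) is not None


for candidate in ("cosmos-quickstart-alice", "Cosmos_Quickstart", "ab"):
    print(candidate, is_valid_account_name(candidate))
```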
Using Azure CLI¶
# Set variables
RESOURCE_GROUP="rg-cosmos-quickstart"
LOCATION="eastus"
ACCOUNT_NAME="cosmos-quickstart-$RANDOM"
# Create resource group
az group create \
  --name "$RESOURCE_GROUP" \
  --location "$LOCATION"
# Create Cosmos DB account
az cosmosdb create \
  --name "$ACCOUNT_NAME" \
  --resource-group "$RESOURCE_GROUP" \
  --locations regionName="$LOCATION" failoverPriority=0 \
  --enable-free-tier true
echo "Cosmos DB Account: $ACCOUNT_NAME"
🗄️ Step 2: Create Database and Container¶
Using Azure Portal¶
- Navigate to Data Explorer
  - Go to your Cosmos DB account
  - Click "Data Explorer" in left menu
- Create Database
  - Click "New Database"
  - Database id: "SampleDB"
  - Provision throughput: Uncheck (container-level)
  - Click "OK"
- Create Container
  - Click "New Container"
  - Database id: Use existing "SampleDB"
  - Container id: "Products"
  - Partition key: /category
  - Throughput: 400 RU/s (minimum)
  - Click "OK"
Using Python SDK¶
"""
Create Cosmos DB database and container
"""
from azure.cosmos import CosmosClient, PartitionKey
# Configuration
ENDPOINT = "https://your-account-name.documents.azure.com:443/"
KEY = "your-primary-key" # Get from Azure Portal > Keys
# Create client
client = CosmosClient(ENDPOINT, KEY)
# Create database
database = client.create_database_if_not_exists(id="SampleDB")
print(f"✅ Database created: {database.id}")
# Create container
container = database.create_container_if_not_exists(
    id="Products",
    partition_key=PartitionKey(path="/category"),
    offer_throughput=400
)
print(f"✅ Container created: {container.id}")
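The script above hardcodes the endpoint and key, which is easy to commit to source control by accident. One alternative is to read them from environment variables. This is a minimal sketch; the variable names `COSMOS_ENDPOINT` and `COSMOS_KEY` and the helper `load_cosmos_config` are assumptions chosen for illustration, not names the SDK looks for.

```python
import os


def load_cosmos_config(env=None):
    """Return (endpoint, key) from environment variables.

    COSMOS_ENDPOINT and COSMOS_KEY are hypothetical names chosen for this sketch.
    """
    env = os.environ if env is None else env
    missing = [name for name in ("COSMOS_ENDPOINT", "COSMOS_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["COSMOS_ENDPOINT"], env["COSMOS_KEY"]


# Demonstration with an explicit mapping instead of the real environment
endpoint, key = load_cosmos_config({
    "COSMOS_ENDPOINT": "https://your-account-name.documents.azure.com:443/",
    "COSMOS_KEY": "your-primary-key",
})
print(endpoint)
```

In the quickstart script, the returned pair would be passed straight to `CosmosClient(endpoint, key)`.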
📝 Step 3: Insert Data (Create)¶
Using Data Explorer¶
- Navigate to "Products" container
- Click "New Item"
- Replace default JSON with:
{
  "id": "laptop-001",
  "name": "Gaming Laptop",
  "category": "Electronics",
  "price": 1299.99,
  "inStock": true,
  "specs": {
    "processor": "Intel i7",
    "ram": "16GB",
    "storage": "512GB SSD"
  },
  "tags": ["gaming", "laptop", "high-performance"]
}
- Click "Save"
Using Python SDK¶
"""
Insert items into Cosmos DB
"""
from azure.cosmos import CosmosClient
# Configuration
ENDPOINT = "https://your-account-name.documents.azure.com:443/"
KEY = "your-primary-key"
# Create client and get container
client = CosmosClient(ENDPOINT, KEY)
database = client.get_database_client("SampleDB")
container = database.get_container_client("Products")
# Define products
products = [
    {
        "id": "laptop-001",
        "name": "Gaming Laptop",
        "category": "Electronics",
        "price": 1299.99,
        "inStock": True,
        "specs": {
            "processor": "Intel i7",
            "ram": "16GB",
            "storage": "512GB SSD"
        }
    },
    {
        "id": "desk-001",
        "name": "Standing Desk",
        "category": "Furniture",
        "price": 549.99,
        "inStock": True,
        "dimensions": {
            "width": 60,
            "depth": 30,
            "height": "adjustable"
        }
    },
    {
        "id": "monitor-001",
        "name": "4K Monitor",
        "category": "Electronics",
        "price": 399.99,
        "inStock": False,
        "specs": {
            "size": "27 inch",
            "resolution": "3840x2160"
        }
    }
]
# Insert items
for product in products:
    container.create_item(body=product)
    print(f"✅ Inserted: {product['name']}")
print(f"\n✅ Successfully inserted {len(products)} products")
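Running the insert script a second time fails on the first item, because `create_item()` rejects an id that already exists in the same partition. A re-runnable variant can use `upsert_item()` instead. The sketch below is an assumption-laden illustration: `seed_items` is a hypothetical helper, and `InMemoryContainer` is a stand-in so the example runs without an Azure account; with the real SDK you would pass the `container` object from the script above.

```python
def seed_items(container, items):
    """Upsert each item so the script can be re-run without id conflicts."""
    written = []
    for item in items:
        container.upsert_item(body=item)
        written.append(item["id"])
    return written


class InMemoryContainer:
    """Hypothetical stand-in for a container, keyed by (category, id)."""

    def __init__(self):
        self.items = {}

    def upsert_item(self, body):
        self.items[(body["category"], body["id"])] = dict(body)
        return body


demo = InMemoryContainer()
products = [
    {"id": "laptop-001", "name": "Gaming Laptop", "category": "Electronics"},
    {"id": "desk-001", "name": "Standing Desk", "category": "Furniture"},
]
seed_items(demo, products)
second_run = seed_items(demo, products)  # re-running neither raises nor duplicates
print(second_run, len(demo.items))
```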
📖 Step 4: Read Data¶
Read Single Item¶
"""
Read specific item by id and partition key
"""
# Read item
item_id = "laptop-001"
partition_key = "Electronics"
item = container.read_item(
    item=item_id,
    partition_key=partition_key
)
print(f"Product: {item['name']}")
print(f"Price: ${item['price']}")
print(f"In Stock: {item['inStock']}")
Query Multiple Items¶
"""
Query items using SQL-like syntax
"""
# Query all products
query = "SELECT * FROM Products p"
items = list(container.query_items(
    query=query,
    enable_cross_partition_query=True
))
print(f"Total products: {len(items)}")
# Query by category
query = "SELECT * FROM Products p WHERE p.category = 'Electronics'"
electronics = list(container.query_items(
    query=query,
    enable_cross_partition_query=True
))
for item in electronics:
    print(f"- {item['name']}: ${item['price']}")
# Query in-stock items
query = "SELECT * FROM Products p WHERE p.inStock = true"
in_stock = list(container.query_items(
    query=query,
    enable_cross_partition_query=True
))
print(f"\nIn stock: {len(in_stock)} items")
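The category query above embeds the value in the query string and fans out across partitions. Since `category` is the partition key, the same lookup can likely be scoped to one partition and written with a query parameter. This is a sketch, assuming the `parameters` and `partition_key` arguments of `query_items()` behave as in recent azure-cosmos 4.x releases; `find_by_category` and `RecordingContainer` are hypothetical names, the latter a stand-in so the example runs offline.

```python
def find_by_category(container, category):
    """Query one logical partition using a parameterized filter."""
    return list(container.query_items(
        query="SELECT * FROM Products p WHERE p.category = @category",
        parameters=[{"name": "@category", "value": category}],
        partition_key=category,
    ))


class RecordingContainer:
    """Hypothetical stand-in that records the arguments it receives."""

    def __init__(self, rows):
        self.rows = rows
        self.last_call = None

    def query_items(self, **kwargs):
        self.last_call = kwargs
        wanted = kwargs["parameters"][0]["value"]
        return iter([row for row in self.rows if row["category"] == wanted])


demo = RecordingContainer([
    {"id": "laptop-001", "category": "Electronics"},
    {"id": "desk-001", "category": "Furniture"},
    {"id": "monitor-001", "category": "Electronics"},
])
electronics = find_by_category(demo, "Electronics")
print([item["id"] for item in electronics])
```

Passing the value as a parameter keeps user input out of the query text, and scoping to a partition avoids the cross-partition fan-out.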
Using Data Explorer¶
- Navigate to "Products" container
- Click "New SQL Query"
- Enter a query, for example: SELECT * FROM Products p WHERE p.category = 'Electronics'
- Click "Execute Query"
- View results
✏️ Step 5: Update Data¶
"""
Update existing item
"""
# Read item
item = container.read_item(
    item="laptop-001",
    partition_key="Electronics"
)
# Update fields
item['price'] = 1199.99 # Price reduction
item['inStock'] = True
item['lastUpdated'] = "2025-01-09"
# Replace item
container.replace_item(
    item=item['id'],
    body=item
)
print(f"✅ Updated {item['name']} to ${item['price']}")
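The update above writes a hardcoded `lastUpdated` date. A small helper can merge the changes and stamp the current date instead. This is only a sketch: `with_changes` is a hypothetical name, and it makes a shallow copy, so nested objects such as `specs` are still shared with the original item.

```python
from datetime import datetime, timezone


def with_changes(item, changes, now=None):
    """Return a copy of item with changes applied and lastUpdated set to today (UTC)."""
    now = now or datetime.now(timezone.utc)
    updated = {**item, **changes}
    updated["lastUpdated"] = now.date().isoformat()
    return updated


original = {"id": "laptop-001", "category": "Electronics", "price": 1299.99}
updated = with_changes(
    original,
    {"price": 1199.99},
    now=datetime(2025, 1, 9, tzinfo=timezone.utc),  # fixed date for the demo
)
print(updated)
```

The result would then be passed to `container.replace_item(item=updated["id"], body=updated)` as in the step above.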
🗑️ Step 6: Delete Data¶
"""
Delete item
"""
# Delete item
container.delete_item(
    item="monitor-001",
    partition_key="Electronics"
)
print("✅ Deleted item")
💡 Understanding Partition Keys¶
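The partition key is critical for performance and cost: it determines how items are spread across partitions, and queries that filter on it can be served by a single partition instead of fanning out to all of them.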
Good Partition Keys¶
- ✅ High cardinality (many unique values)
- ✅ Even distribution of data
- ✅ Commonly used in queries
# Examples of good partition keys:
- userId (for user data)
- category (for product catalogs with many categories of similar size)
- deviceId (for IoT data)
- tenantId (for multi-tenant apps)
Bad Partition Keys¶
- ❌ Low cardinality (few unique values)
- ❌ Hot partitions (uneven distribution)
- ❌ Not used in queries
# Examples of bad partition keys:
- country (only ~200 values)
- boolean fields (only 2 values)
- timestamp (creates hot partition)
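One way to sanity-check a candidate key against these criteria is to count how sample data would spread across its values. The sketch below is a rough local heuristic, not a Cosmos DB feature; `partition_distribution` is a hypothetical helper, and a real assessment should also consider request volume per value, not only item counts.

```python
from collections import Counter


def partition_distribution(items, key):
    """Count items per value of `key` and return (counts, share of the largest group)."""
    counts = Counter(item.get(key) for item in items)
    total = sum(counts.values())
    largest_share = max(counts.values()) / total if total else 0.0
    return counts, largest_share


sample = [
    {"id": "laptop-001", "category": "Electronics", "inStock": True},
    {"id": "desk-001", "category": "Furniture", "inStock": True},
    {"id": "monitor-001", "category": "Electronics", "inStock": False},
]
for candidate in ("category", "inStock", "id"):
    counts, share = partition_distribution(sample, candidate)
    print(candidate, dict(counts), round(share, 2))
```

A largest share close to 1.0 on realistic data suggests a hot partition; a share that shrinks as data grows suggests the key distributes well.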
📊 Step 7: Monitor and Optimize¶
View Metrics¶
- Navigate to Cosmos DB account
- Click "Metrics" in left menu
- View:
- Total Requests - Request count
- Request Units - RU/s consumption
- Storage - Data size
- Throttled Requests - 429 errors
Estimate RU/s Cost¶
"""
Get RU charge for operations
"""
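# Query with RU tracking
response = container.query_items(
    query="SELECT * FROM Products",
    enable_cross_partition_query=True
)
items = list(response)
# The RU charge is reported in the response headers of the most recent request;
# query_items() returns an iterator, which has no get() method of its own
charge = container.client_connection.last_response_headers.get('x-ms-request-charge')
print(f"Query consumed: {charge} RU")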
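When a query returns several pages, each page is a separate request with its own charge, so reading the headers once reports only the last page. The sketch below sums the charge page by page. It assumes `by_page()` on the query result and `container.client_connection.last_response_headers` behave as in azure-cosmos 4.x; `query_with_charge` and the `Fake*` classes are hypothetical, the latter added only so the example runs without an account.

```python
import types


def query_with_charge(container, query):
    """Run a query and return (items, total RU charge summed over all pages)."""
    items, total_ru = [], 0.0
    pages = container.query_items(
        query=query, enable_cross_partition_query=True
    ).by_page()
    for page in pages:
        items.extend(page)
        headers = container.client_connection.last_response_headers
        total_ru += float(headers.get("x-ms-request-charge", 0))
    return items, total_ru


class FakePaged:
    """Hypothetical stand-in for the paged result of query_items()."""

    def __init__(self, container, pages):
        self._container, self._pages = container, pages

    def by_page(self):
        for rows, charge in self._pages:
            # Mimic the headers being refreshed each time a page is fetched
            self._container.client_connection.last_response_headers = {
                "x-ms-request-charge": str(charge)
            }
            yield iter(rows)


class FakeContainer:
    """Hypothetical stand-in for a container with two result pages."""

    def __init__(self, pages):
        self.client_connection = types.SimpleNamespace(last_response_headers={})
        self._pages = pages

    def query_items(self, **kwargs):
        return FakePaged(self, self._pages)


demo = FakeContainer([([{"id": "a"}, {"id": "b"}], 2.5), ([{"id": "c"}], 1.25)])
items, total_ru = query_with_charge(demo, "SELECT * FROM Products")
print(len(items), total_ru)
```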
🔧 Troubleshooting¶
Common Issues¶
Error: "Entity with the specified id already exists"
- ✅ Use upsert_item() instead of create_item()
- ✅ Check for duplicate IDs
Error: "Request rate is large" (429)
- ✅ You exceeded provisioned RU/s
- ✅ Solution: Increase RU/s or implement retry logic
Error: "Partition key not found"
- ✅ Ensure item has partition key property
- ✅ Verify partition key path matches container
High Costs
- ✅ Review RU/s consumption in metrics
- ✅ Reduce RU/s when not needed
- ✅ Optimize queries (use indexes)
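The retry logic suggested for 429 errors can be sketched as exponential backoff around any SDK call. This is an illustration under assumptions: `call_with_retry` and `Throttled` are hypothetical names, and the check relies on the raised exception exposing a `status_code` attribute, as the SDK's HTTP response errors do. The SDK also retries throttled requests on its own by default, so a wrapper like this is mainly useful once those built-in retries are exhausted.

```python
import time


def call_with_retry(operation, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call operation(), backing off exponentially while it fails with status 429."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as exc:
            throttled = getattr(exc, "status_code", None) == 429
            if not throttled or attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


class Throttled(Exception):
    """Hypothetical stand-in for a throttling error."""
    status_code = 429


attempts = {"count": 0}


def flaky():
    # Fails twice with a 429, then succeeds
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise Throttled()
    return "ok"


delays = []
result = call_with_retry(flaky, sleep=delays.append)  # record delays instead of sleeping
print(result, delays)
```

With the real container, the operation would be something like `lambda: container.create_item(body=product)`.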
🎓 Next Steps¶
Beginner Practice¶
- Create different document types in same container
- Implement error handling and retries
- Query with filters and ordering
- Add more complex nested data
Intermediate Challenges¶
- Implement stored procedures
- Use change feed for real-time updates
- Set up indexing policies
- Configure TTL (time-to-live)
Advanced Topics¶
- Multi-region setup
- Implement consistency levels
- Use bulk operations
- Integrate with Azure Functions
📚 Additional Resources¶
Documentation¶
Next Tutorials¶
- Synapse Quickstart - Query Cosmos DB with Synapse Link
- Stream Analytics Tutorial - Process Cosmos DB changes
- Data Engineer Path
Tools¶
🧹 Cleanup¶
To avoid Azure charges, delete the resource group (and everything in it) when done:
az group delete --name rg-cosmos-quickstart
Or use Azure Portal:
- Navigate to Resource Groups
- Select "rg-cosmos-quickstart"
- Click "Delete resource group"
- Type resource group name to confirm
- Click "Delete"
🎉 Congratulations!¶
You've successfully:
- ✅ Created Azure Cosmos DB account
- ✅ Created database and container
- ✅ Performed CRUD operations
- ✅ Queried data with SQL syntax
- ✅ Understood partition keys and RU/s
You're ready to build globally distributed NoSQL applications!
Next Recommended Tutorial: Delta Lake Basics for analytics data storage
Last Updated: January 2025 | Tutorial Version: 1.0 | Tested with: Python 3.11, azure-cosmos 4.5.1