
📖 Azure Cloud Scale Analytics Service Catalog


Complete catalog of Azure analytics services with capabilities, use cases, and decision guidance.


📊 Service Overview Matrix

| Service | Category | Complexity | Pricing Model | Primary Use Case |
| --- | --- | --- | --- | --- |
| Azure Synapse Analytics | Analytics Compute | Advanced | Pay-per-use + Reserved | Enterprise Data Warehousing |
| Azure Databricks | Analytics Compute | Advanced | Compute + DBU | Data Science & ML |
| HDInsight | Analytics Compute | Intermediate | VM-based | Big Data Processing |
| Stream Analytics | Streaming | Intermediate | Streaming Units | Real-time Analytics |
| Event Hubs | Streaming | Basic | Throughput Units | Event Ingestion |
| Event Grid | Streaming | Basic | Per Operation | Event Routing |
| Data Lake Gen2 | Storage | Basic | Storage + Transactions | Big Data Storage |
| Cosmos DB | Storage | Intermediate | Request Units | NoSQL Database |
| Azure SQL | Storage | Intermediate | vCore or DTU | Relational Database |
| Data Factory | Orchestration | Intermediate | Pipeline Runs | Data Integration |
| Logic Apps | Orchestration | Basic | Action-based | Workflow Automation |

🎯 Analytics Compute Services

Azure Synapse Analytics (Enterprise)

Purpose: Unified analytics service combining data integration, data warehousing, and analytics.

Key Capabilities:

  • Serverless SQL Pools: Query data directly from data lake
  • Dedicated SQL Pools: Enterprise data warehousing
  • Spark Pools: Big data processing and machine learning
  • Data Integration: Built-in ETL/ELT pipelines
  • Shared Metadata: Unified catalog across compute engines

Best For:

  • Enterprise data warehousing
  • Unified analytics workspaces
  • Large-scale data processing
  • Self-service analytics

Pricing: Per TB of data processed (serverless SQL pools) + provisioned capacity billed hourly, with optional reserved-capacity discounts (dedicated SQL pools)

Documentation: Azure Synapse Guide


Azure Databricks (Data Science)

Purpose: Collaborative analytics platform optimized for machine learning and data science.

Key Capabilities:

  • Collaborative Notebooks: Multi-language data science environment
  • Delta Live Tables: Declarative ETL framework
  • MLflow Integration: End-to-end ML lifecycle management
  • Unity Catalog: Unified data governance
  • Photon Engine: High-performance query engine

Best For:

  • Data science and machine learning
  • Collaborative analytics
  • Advanced data engineering
  • Real-time ML inference

Pricing: Compute costs + Databricks Unit (DBU) charges
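
The two-part pricing above can be illustrated with a rough estimate. This is only a sketch: the rates, the DBU consumption per node-hour, and the names `ClusterCostInputs` and `estimate_cluster_cost` are hypothetical placeholders, not published prices or a Databricks API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterCostInputs:
    nodes: int                # number of VMs in the cluster
    hours: float              # how long the cluster runs
    vm_rate_per_hour: float   # hypothetical VM price per node-hour
    dbu_per_node_hour: float  # hypothetical DBUs consumed per node-hour
    dbu_rate: float           # hypothetical price per DBU


def estimate_cluster_cost(inputs: ClusterCostInputs) -> float:
    """Return the VM compute charge plus the DBU charge."""
    node_hours = inputs.nodes * inputs.hours
    compute_cost = node_hours * inputs.vm_rate_per_hour
    dbu_cost = node_hours * inputs.dbu_per_node_hour * inputs.dbu_rate
    return compute_cost + dbu_cost


if __name__ == "__main__":
    example = ClusterCostInputs(
        nodes=4, hours=10, vm_rate_per_hour=0.50,
        dbu_per_node_hour=1.5, dbu_rate=0.30,
    )
    print(f"Estimated cost: {estimate_cluster_cost(example):.2f}")
```

Actual DBU consumption depends on the VM size, workload type, and pricing tier, so real estimates should use the figures from the Azure pricing page.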

Documentation: Azure Databricks Guide


HDInsight (Migration)

Purpose: Managed Apache Hadoop, Spark, and Kafka clusters in Azure.

Key Capabilities:

  • Multiple Cluster Types: Hadoop, Spark, HBase, Kafka, Interactive Query
  • Enterprise Security: Enterprise Security Package (ESP) integration with Active Directory
  • Custom Applications: Support for custom Hadoop ecosystem tools
  • Hybrid Connectivity: Integration with on-premises systems

Best For:

  • Hadoop migration to cloud
  • Custom big data applications
  • Cost-optimized big data processing
  • Open-source ecosystem requirements

Pricing: VM-based pricing model

Documentation: HDInsight Guide


🔄 Streaming Services

Azure Stream Analytics (Real-time)

Purpose: Real-time analytics service for streaming data processing.

Key Capabilities:

  • SQL-based Queries: Familiar SQL syntax for stream processing
  • Windowing Functions: Tumbling, hopping, sliding, session, and snapshot windows
  • Anomaly Detection: Built-in ML-based anomaly detection
  • Edge Deployment: Run analytics on IoT Edge devices
  • Output Integration: Direct integration with Power BI, SQL, Cosmos DB
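
To make the windowing idea concrete, here is a minimal Python sketch of tumbling (fixed, non-overlapping) and hopping (fixed, overlapping) windows. It is an illustration of the concepts only, not how Stream Analytics is implemented; the function names are hypothetical, and windows are treated as half-open intervals `[start, start + size)` keyed by start time, which is a simplification.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Event = Tuple[int, float]  # (timestamp in seconds, value)


def tumbling_window_counts(events: Iterable[Event], size: int) -> Dict[int, int]:
    """Count events per non-overlapping window, keyed by window start."""
    counts: Dict[int, int] = defaultdict(int)
    for timestamp, _value in events:
        window_start = (timestamp // size) * size
        counts[window_start] += 1
    return dict(counts)


def hopping_window_counts(events: Iterable[Event], size: int, hop: int) -> Dict[int, int]:
    """Count events per overlapping window; a new window starts every `hop` seconds."""
    counts: Dict[int, int] = defaultdict(int)
    for timestamp, _value in events:
        # Latest window containing the event starts at the hop boundary at or before it.
        start = (timestamp // hop) * hop
        # Walk back through every earlier window that still covers the event.
        while start > timestamp - size and start >= 0:
            counts[start] += 1
            start -= hop
    return dict(counts)


if __name__ == "__main__":
    readings = [(1, 20.5), (4, 21.0), (6, 19.8), (11, 22.1)]
    print(tumbling_window_counts(readings, size=5))
    print(hopping_window_counts(readings, size=10, hop=5))
```

With a tumbling window each event is counted exactly once; with a hopping window whose size exceeds the hop, the same event contributes to several windows.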

Best For:

  • IoT device telemetry processing
  • Real-time dashboards
  • Fraud detection
  • Operational monitoring

Pricing: Streaming Units (SU) hourly billing

Documentation: Streaming Services Guide


Azure Event Hubs (Ingestion)

Purpose: Big data streaming platform and event ingestion service.

Key Capabilities:

  • High Throughput: Millions of events per second
  • Kafka Compatibility: Kafka-compatible endpoint that existing Apache Kafka clients can use with configuration changes only
  • Capture Feature: Automatic data archival to storage
  • Schema Registry: Centralized schema management
  • Dedicated Clusters: Isolated, high-performance clusters
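
For Kafka migration, the client-side change is typically limited to connection settings. The sketch below builds a configuration dictionary using librdkafka-style keys, as accepted by clients such as `confluent-kafka`. The helper name `event_hubs_kafka_config`, the namespace, and the connection string are placeholders introduced here for illustration.

```python
from typing import Dict


def event_hubs_kafka_config(namespace: str, connection_string: str) -> Dict[str, str]:
    """Build Kafka client settings that point at an Event Hubs namespace."""
    return {
        # The Kafka endpoint of an Event Hubs namespace listens on port 9093.
        "bootstrap.servers": f"{namespace}.servicebus.windows.net:9093",
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        # The literal string "$ConnectionString" is the expected username.
        "sasl.username": "$ConnectionString",
        "sasl.password": connection_string,
    }


if __name__ == "__main__":
    config = event_hubs_kafka_config(
        namespace="my-namespace",                    # placeholder
        connection_string="<your-connection-string>",  # placeholder
    )
    for key, value in config.items():
        print(f"{key} = {value}")
```

The resulting dictionary could be passed to a Kafka producer or consumer constructor; the Kafka topic name then corresponds to an event hub within the namespace.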

Best For:

  • High-volume event ingestion
  • Kafka migration scenarios
  • Event-driven architectures
  • IoT data collection

Pricing: Throughput Units (standard tier), Processing Units (premium tier), or Capacity Units (dedicated clusters)
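
A rough way to size a standard-tier namespace is to divide the expected load by the per-unit limits. The sketch below assumes the commonly documented limits of 1 MB/s or 1,000 events/s ingress and 2 MB/s egress per Throughput Unit; treat these constants as assumptions and verify them against the current quota documentation. The function name is hypothetical.

```python
import math

# Assumed per-Throughput-Unit limits for the standard tier (verify before use).
INGRESS_MB_PER_TU = 1.0
INGRESS_EVENTS_PER_TU = 1000
EGRESS_MB_PER_TU = 2.0


def required_throughput_units(
    ingress_mb_per_sec: float,
    ingress_events_per_sec: float,
    egress_mb_per_sec: float,
) -> int:
    """Return the smallest whole number of TUs that covers every limit."""
    demand = max(
        ingress_mb_per_sec / INGRESS_MB_PER_TU,
        ingress_events_per_sec / INGRESS_EVENTS_PER_TU,
        egress_mb_per_sec / EGRESS_MB_PER_TU,
    )
    return max(1, math.ceil(demand))


if __name__ == "__main__":
    print(required_throughput_units(3.5, 2500, 8.0))
```

The binding constraint is whichever ratio is largest, so a workload with many small events can need more units than its byte volume alone suggests.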

Documentation: Event Hubs Guide


Azure Event Grid (Routing)

Purpose: Event routing service for building event-driven applications.

Key Capabilities:

  • Event Routing: Intelligent event routing to multiple destinations
  • Custom Topics: Create custom event publishers
  • System Topics: Built-in events from Azure services
  • Dead-lettering: Send events that cannot be delivered to a storage account for later inspection
  • Event Filtering: Route events based on content
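
Event filtering can be pictured as a predicate evaluated per subscription. The following sketch models filtering by event type and subject prefix or suffix in plain Python. It imitates the idea only and does not call the Event Grid SDK; `SubscriptionFilter`, `matches`, and `route` are hypothetical names, and the subscription names are made up.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List


@dataclass
class SubscriptionFilter:
    included_event_types: List[str] = field(default_factory=list)  # empty means all types
    subject_begins_with: str = ""
    subject_ends_with: str = ""


def matches(event: Dict[str, Any], flt: SubscriptionFilter) -> bool:
    """Return True if the event passes the type and subject filters."""
    if flt.included_event_types and event.get("eventType") not in flt.included_event_types:
        return False
    subject = event.get("subject", "")
    return subject.startswith(flt.subject_begins_with) and subject.endswith(flt.subject_ends_with)


def route(
    events: Iterable[Dict[str, Any]],
    subscriptions: Dict[str, SubscriptionFilter],
) -> Dict[str, List[Dict[str, Any]]]:
    """Deliver each event to every subscription whose filter it matches."""
    delivered: Dict[str, List[Dict[str, Any]]] = {name: [] for name in subscriptions}
    for event in events:
        for name, flt in subscriptions.items():
            if matches(event, flt):
                delivered[name].append(event)
    return delivered


if __name__ == "__main__":
    events = [
        {"eventType": "Microsoft.Storage.BlobCreated",
         "subject": "/blobServices/default/containers/raw/blobs/a.csv"},
        {"eventType": "Microsoft.Storage.BlobDeleted",
         "subject": "/blobServices/default/containers/raw/blobs/b.json"},
    ]
    subscriptions = {
        "csv-ingest": SubscriptionFilter(
            included_event_types=["Microsoft.Storage.BlobCreated"],
            subject_ends_with=".csv",
        ),
        "audit-all": SubscriptionFilter(),
    }
    for name, matched in route(events, subscriptions).items():
        print(name, len(matched))
```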

Best For:

  • Event-driven application architectures
  • Serverless workflows
  • System integration
  • Reactive applications

Pricing: Pay-per-operation model

Documentation: Streaming Services Guide


🗃️ Storage Services

Azure Data Lake Storage Gen2 (Big Data)

Purpose: Hierarchical namespace storage optimized for big data analytics.

Key Capabilities:

  • Hierarchical Namespace: Directory and file-level operations
  • Fine-grained ACLs: POSIX-like access control lists on directories and files
  • Multi-protocol Access: Blob and Data Lake APIs
  • Lifecycle Management: Automated data tiering and archival
  • Access Tiers: Hot, cool, and archive storage
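
Lifecycle management rules are essentially age-based decisions about which tier a blob belongs in. The sketch below expresses one such rule in Python. The thresholds of 30 and 180 days and the function name `choose_tier` are hypothetical; real policies are defined as JSON rules on the storage account rather than in application code.

```python
from datetime import date


def choose_tier(
    last_modified: date,
    today: date,
    cool_after_days: int = 30,      # hypothetical threshold
    archive_after_days: int = 180,  # hypothetical threshold
) -> str:
    """Pick a storage tier from the number of days since last modification."""
    age_days = (today - last_modified).days
    if age_days >= archive_after_days:
        return "archive"
    if age_days >= cool_after_days:
        return "cool"
    return "hot"


if __name__ == "__main__":
    today = date(2025, 1, 28)
    for modified in (date(2025, 1, 1), date(2024, 12, 1), date(2024, 1, 1)):
        print(modified.isoformat(), choose_tier(modified, today))
```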

Best For:

  • Data lake implementations
  • Big data analytics storage
  • Data archival and backup
  • Multi-format data storage

Pricing: Storage capacity + transaction costs

Documentation: Data Lake Gen2 Guide


Azure Cosmos DB (NoSQL)

Purpose: Globally distributed, multi-model NoSQL database service.

Key Capabilities:

  • Multiple APIs: NoSQL (formerly SQL API), MongoDB, Cassandra, Gremlin, Table
  • Global Distribution: Multi-region writes and reads
  • Analytical Store: HTAP capabilities with Synapse Link
  • Change Feed: Real-time change data capture
  • Serverless Option: Pay-per-request pricing model

Best For:

  • Globally distributed applications
  • Real-time applications requiring low latency
  • Multi-model data scenarios
  • HTAP workloads with Synapse integration

Pricing: Provisioned throughput (RU/s) or serverless (billed per Request Unit consumed)

Documentation: Storage Services Guide


Azure SQL Database (Relational)

Purpose: Fully managed relational database service.

Key Capabilities:

  • Hyperscale: Massively scalable database architecture
  • Elastic Pools: Shared resources across multiple databases
  • Built-in Intelligence: Automatic tuning and threat detection
  • Always Encrypted: Column-level encryption
  • Temporal Tables: Built-in data history tracking

Best For:

  • Relational data workloads
  • Transactional applications
  • Data marts and reporting
  • Application modernization

Pricing: vCore-based or DTU-based models

Documentation: Storage Services Guide


🔧 Orchestration Services

Azure Data Factory (ETL)

Purpose: Cloud-based data integration service for creating ETL/ELT pipelines.

Key Capabilities:

  • Code-free ETL: Visual pipeline designer
  • Data Flows: Transformation logic with Spark execution
  • Hybrid Integration: On-premises and cloud data sources
  • CI/CD Support: Azure DevOps and GitHub integration
  • Monitoring: Built-in pipeline monitoring and alerting

Best For:

  • Data integration pipelines
  • ETL/ELT processes
  • Data migration projects
  • Scheduled data processing

Pricing: Pipeline orchestration + activity execution costs

Documentation: Data Factory Guide


Azure Logic Apps (Workflow)

Purpose: Serverless workflow automation service.

Key Capabilities:

  • Visual Designer: Drag-and-drop workflow creation
  • 300+ Connectors: Pre-built connectors for popular services
  • B2B Integration: EDI (X12, EDIFACT) and AS2 support
  • Event-driven: Trigger-based workflow execution
  • Enterprise Integration: Integration with on-premises systems

Best For:

  • Business process automation
  • System integrations
  • Event-driven workflows
  • B2B data exchange

Pricing: Pay-per-action execution (Consumption plan); the Standard plan is billed by hosting plan

Documentation: Orchestration Services Guide


🎯 Service Selection Guide

By Use Case

Real-time Analytics

  • Primary: Stream Analytics, Event Hubs
  • Storage: Cosmos DB, Data Lake Gen2
  • Visualization: Power BI Real-time Dashboards

Data Warehousing

  • Primary: Synapse Dedicated SQL Pools
  • Storage: Data Lake Gen2, Azure SQL
  • Orchestration: Data Factory

Data Science & ML

  • Primary: Databricks, Synapse Spark Pools
  • Storage: Data Lake Gen2, Cosmos DB
  • Orchestration: Data Factory, Databricks Workflows

IoT Analytics

  • Primary: Stream Analytics, Event Hubs
  • Edge: Stream Analytics on IoT Edge
  • Storage: Data Lake Gen2, Cosmos DB
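
The use-case recommendations above can also be kept as a small lookup table, for example in an internal tooling script. This is just one possible encoding; `USE_CASE_SERVICES` and `services_for` are hypothetical names, and the entries simply mirror the lists in this section.

```python
from typing import Dict, List

# Recommendations copied from the "By Use Case" lists in this catalog.
USE_CASE_SERVICES: Dict[str, Dict[str, List[str]]] = {
    "real-time analytics": {
        "primary": ["Stream Analytics", "Event Hubs"],
        "storage": ["Cosmos DB", "Data Lake Gen2"],
        "visualization": ["Power BI Real-time Dashboards"],
    },
    "data warehousing": {
        "primary": ["Synapse Dedicated SQL Pools"],
        "storage": ["Data Lake Gen2", "Azure SQL"],
        "orchestration": ["Data Factory"],
    },
    "data science & ml": {
        "primary": ["Databricks", "Synapse Spark Pools"],
        "storage": ["Data Lake Gen2", "Cosmos DB"],
        "orchestration": ["Data Factory", "Databricks Workflows"],
    },
    "iot analytics": {
        "primary": ["Stream Analytics", "Event Hubs"],
        "edge": ["Stream Analytics on IoT Edge"],
        "storage": ["Data Lake Gen2", "Cosmos DB"],
    },
}


def services_for(use_case: str, role: str = "primary") -> List[str]:
    """Return the recommended services for a use case and role, or an empty list."""
    return USE_CASE_SERVICES.get(use_case.strip().lower(), {}).get(role, [])


if __name__ == "__main__":
    print(services_for("Data Warehousing"))
    print(services_for("IoT Analytics", role="edge"))
```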

By Data Volume

Small to Medium (< 1 TB)

  • Azure SQL Database
  • Cosmos DB
  • Stream Analytics (< 100 SU)

Large (1-100 TB)

  • Synapse Dedicated SQL Pools
  • Databricks
  • HDInsight

Very Large (> 100 TB)

  • Synapse Serverless SQL Pools
  • Data Lake Gen2 with Synapse
  • Databricks with Delta Lake
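
The volume bands above translate directly into a simple selection function. The sketch below is a starting point under the stated thresholds, not a sizing tool: `candidates_by_volume` is a hypothetical helper, and real choices also depend on query patterns, concurrency, and budget.

```python
from typing import List


def candidates_by_volume(size_tb: float) -> List[str]:
    """Return the services this catalog suggests for a given data volume in TB."""
    if size_tb < 0:
        raise ValueError("size_tb must be non-negative")
    if size_tb < 1:
        return ["Azure SQL Database", "Cosmos DB", "Stream Analytics (< 100 SU)"]
    if size_tb <= 100:
        return ["Synapse Dedicated SQL Pools", "Databricks", "HDInsight"]
    return [
        "Synapse Serverless SQL Pools",
        "Data Lake Gen2 with Synapse",
        "Databricks with Delta Lake",
    ]


if __name__ == "__main__":
    for size in (0.2, 40, 750):
        print(f"{size} TB -> {', '.join(candidates_by_volume(size))}")
```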

By Budget Considerations

Cost-Optimized

  • HDInsight
  • Synapse Serverless SQL Pools
  • Event Grid

Balanced Performance/Cost

  • Stream Analytics
  • Data Factory
  • Cosmos DB (provisioned throughput)

Performance-Optimized

  • Synapse Dedicated SQL Pools
  • Databricks Premium
  • Event Hubs Dedicated Clusters

📊 Service Comparison Matrix

Analytics Compute Comparison

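| Feature | Synapse | Databricks | HDInsight |
| --- | --- | --- | --- |
| SQL Support | ✅ Native | ✅ Spark SQL | ✅ Hive/Spark SQL |
| Python/R | ✅ Spark | ✅ Native | ✅ Spark |
| Scala/Java | ✅ Spark | ✅ Native | ✅ Native |
| ML Integration | ✅ Built-in | ✅ MLflow | ⚠️ Custom |
| Serverless | ✅ Yes (serverless SQL pools) | ⚠️ Selected workloads (e.g., serverless SQL warehouses) | ❌ No |
| Auto-scaling | ✅ Yes | ✅ Yes | ✅ Yes |
| Enterprise Security | ✅ Microsoft Entra ID (formerly Azure AD) | ✅ Unity Catalog | ✅ ESP |
| Cost Model | Pay-per-use | DBU-based | VM-based |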

Streaming Services Comparison

| Feature | Stream Analytics | Event Hubs | Event Grid |
| --- | --- | --- | --- |
| Processing | ✅ Built-in | ❌ Ingestion only | ❌ Routing only |
| Throughput | Medium (SU-based) | ✅ Very High | High |
| Latency | Sub-second | Milliseconds | Seconds |
| SQL Queries | ✅ Yes | ❌ No | ❌ No |
| Schema Registry | ❌ No | ✅ Yes | ❌ No |
| Event Filtering | ✅ Yes | ❌ No | ✅ Yes |
| Cost Model | SU hourly | TU/CU | Per operation |


Last Updated: 2025-01-28
Next Review: 2025-04-28