Complete Snowflake-to-Azure Feature Mapping¶

Comparative positioning note

This document is written from the perspective of Microsoft Azure, Cloud Scale Analytics, and CSA Loom. Any description of third-party or competing products, services, pricing, or capabilities is derived from publicly available documentation and sources believed accurate at the time of writing, and is provided for general comparison only. We do not claim expertise in, or authority over, any non-Microsoft product or service; the respective vendor's official documentation is the authoritative source for their offerings, which may change over time. Nothing here is intended to disparage any vendor — where a competing product has genuine advantages, we aim to note them honestly. Verify all third-party details against the vendor's current official documentation before making decisions.

Status: Authored 2026-04-30 Audience: Data architects, platform engineers, migration leads Scope: 50+ Snowflake features mapped to Azure equivalents with effort estimates and evidence

How to read this document¶

Each feature is mapped with:

Snowflake feature -- the capability as Snowflake names it
Azure equivalent -- the csa-inabox / Azure service(s) that replace it
Mapping notes -- what changes, what stays the same, what to watch out for
Effort -- XS (hours), S (days), M (1-2 weeks), L (2-4 weeks), XL (4+ weeks)
Gov parity -- whether the Azure equivalent is available in Azure Government
Evidence -- repo paths in csa-inabox that implement or document the pattern

1. Compute and warehousing (8 features)¶

#	Snowflake feature	Azure equivalent	Mapping notes	Effort	Gov parity
1	Virtual warehouses (XS-6XL)	Databricks SQL Warehouses (2XS-4XL)	Size mapping in warehouse-migration.md; auto-stop replaces auto-suspend	M	GA
2	Multi-cluster warehouses	Databricks SQL auto-scaling	Auto-scale clusters per warehouse; scaling is per-node rather than per-clone	M	GA
3	Resource monitors	Azure Cost Management budgets + Databricks budget alerts	Budgets trigger alerts and optional actions; `scripts/deploy/teardown-platform.sh` as hard kill	S	GA
4	Warehouse auto-suspend / auto-resume	SQL Warehouse auto-stop (1 min classic, 10 min serverless)	Functionally equivalent; serverless has faster spin-up	XS	GA
5	Query acceleration service	Databricks Photon engine	Photon accelerates scan-heavy queries automatically; no separate activation	XS	GA
6	Search Optimization Service	Delta Lake Z-ordering + liquid clustering	Z-ORDER on high-cardinality columns; liquid clustering (Databricks Runtime 13.3+) for auto-optimization	S	GA
7	Result caching	Databricks SQL result cache + Delta cache	Automatic result caching at SQL Warehouse level; Delta cache for SSD-accelerated reads	XS	GA
8	Warehouse scheduling (min/max clusters by time)	Databricks SQL Warehouse scaling policies + ADF triggers	Schedule warehouse size changes via ADF or Databricks API; less native than Snowflake scheduler	S	GA

Evidence: csa_platform/unity_catalog_pattern/README.md, docs/adr/0002-databricks-over-oss-spark.md

2. Storage and data formats (7 features)¶

#	Snowflake feature	Azure equivalent	Mapping notes	Effort	Gov parity
9	Micro-partitions (proprietary)	Delta Lake on Parquet (open)	Open format; no vendor lock-in; exit cost is weeks not years	M	GA
10	Time Travel	Delta Lake Time Travel (`VERSION AS OF` / `TIMESTAMP AS OF`)	Semantics equivalent; retention via `delta.deletedFileRetentionDuration`	XS	GA
11	Zero-copy cloning	Delta SHALLOW CLONE / DEEP CLONE	SHALLOW CLONE is metadata-only (equivalent); DEEP CLONE copies data	XS	GA
12	Fail-safe (7-day recovery)	ADLS Gen2 soft delete + GRS replication	Soft delete provides recovery window; GRS for cross-region durability	S	GA
13	External tables (on S3/GCS/Azure)	OneLake shortcuts + Lakehouse Federation	Shortcuts preserve remote as source of truth; Federation queries remote directly	S	GA
14	Iceberg tables	Delta Lake (preferred) or Databricks Iceberg support	csa-inabox standardizes on Delta; ADR-0003 documents the decision	XS	GA
15	Directory tables	Unity Catalog volumes	Volumes provide file-level management; directory tables not needed	S	GA

Evidence: docs/adr/0003-delta-lake-over-iceberg-and-parquet.md, csa_platform/unity_catalog_pattern/onelake_config.yaml

3. Data ingestion (6 features)¶

#	Snowflake feature	Azure equivalent	Mapping notes	Effort	Gov parity
16	Snowpipe (batch auto-ingest)	Databricks Autoloader	Autoloader monitors cloud storage for new files; incremental, exactly-once	M	GA
17	Snowpipe Streaming	Event Hubs + Databricks Structured Streaming	Event Hubs for ingestion; Structured Streaming for processing; sub-second latency	M	GA
18	COPY INTO (bulk load)	COPY INTO (Databricks SQL) / ADF Copy Activity	Nearly identical syntax on Databricks; ADF for orchestrated bulk loads	S	GA
19	PUT/GET (file staging)	AZ CLI upload / ADF / Databricks DBFS	Stage files to ADLS Gen2 or OneLake; no proprietary staging area needed	S	GA
20	External stages (S3/GCS/Azure)	ADLS Gen2 containers + OneLake shortcuts	Mount or shortcut external storage; no intermediate staging needed	S	GA
21	Data loading transformations (COPY INTO ... SELECT)	Autoloader with schema hints + dbt staging models	Inline transformations move to dbt staging layer for better testability	M	GA

Evidence: docs/adr/0005-event-hubs-over-kafka.md, examples/iot-streaming/, domains/shared/pipelines/adf/

4. Data transformation and modeling (6 features)¶

#	Snowflake feature	Azure equivalent	Mapping notes	Effort	Gov parity
22	Dynamic Tables	dbt incremental models + Databricks DLT	Lag-based refresh becomes dbt incremental with merge strategy; heavy CDC uses DLT	M	GA
23	Materialized views	Databricks materialized views (GA) + dbt table materializations	Direct equivalent in Databricks Runtime 13+; dbt `table` materialization as alternative	S	GA
24	Stored procedures (JavaScript/SQL)	Databricks SQL stored procedures + notebooks	SQL procedures translate directly; JavaScript procedures rewrite to Python	M	GA
25	User-defined functions (SQL/JavaScript/Python)	Databricks SQL UDFs + PySpark UDFs	SQL UDFs: near-identical syntax; JavaScript: rewrite; Python: Snowpark to PySpark	M	GA
26	User-defined table functions (UDTFs)	Databricks SQL UDTFs + PySpark UDTFs	Supported in Databricks Runtime 13+; syntax differs slightly	M	GA
27	External functions	Azure Functions + Databricks SQL external models	HTTP endpoints callable from SQL; external models for AI/ML inference	M	GA

Evidence: domains/shared/dbt/dbt_project.yml, domains/finance/dbt/, domains/sales/dbt/

5. Snowpark ecosystem (5 features)¶

#	Snowflake feature	Azure equivalent	Mapping notes	Effort	Gov parity
28	Snowpark Python	PySpark + pandas-on-Spark (Koalas)	DataFrame API translates; Snowpark-specific functions need rewrite	M	GA
29	Snowpark Java/Scala	Spark Java/Scala API	Direct translation; Spark API is the original	M	GA
30	Snowpark ML	MLflow on Databricks	Model training, tracking, registry all in MLflow; richer ecosystem	M	GA
31	Snowpark Container Services	Azure Container Apps + Databricks Model Serving	General compute: Container Apps; inference: Model Serving	L	GA
32	Snowflake Notebooks	Fabric Notebooks + Databricks Notebooks	Richer notebook experience with better collaboration and versioning	S	GA

Evidence: csa_platform/ai_integration/model_serving/, domains/shared/notebooks/

6. Cortex AI services (8 features)¶

#	Snowflake feature	Azure equivalent	Mapping notes	Effort	Gov parity
33	Cortex COMPLETE (text generation)	Azure OpenAI (GPT-4o, GPT-4.1)	Better models; invoked via `ai_query()` or dbt macros	M	GA in Gov
34	Cortex SUMMARIZE	Azure OpenAI with summarization prompt	Prompt-based; more flexible than Cortex's fixed function	S	GA in Gov
35	Cortex TRANSLATE	Azure AI Translator / Azure OpenAI	Dedicated translation service or GPT-4o for context-aware translation	S	GA in Gov
36	Cortex EXTRACT_ANSWER	Azure OpenAI with RAG pattern	AI Search + OpenAI for extractive QA; richer than single-function call	M	GA in Gov
37	Cortex SENTIMENT	Azure AI Language / Azure OpenAI	Dedicated sentiment API or prompt-based via GPT-4o	S	GA in Gov
38	Cortex Search	Azure AI Search (hybrid vector + keyword)	Full hybrid search with vector embeddings; richer relevance tuning	M	GA in Gov
39	Cortex Analyst	Power BI Copilot	Natural-language analytics over semantic models	M	GA in Gov
40	Cortex Guard	Azure AI Content Safety	Content filtering, prompt shields, groundedness detection	M	GA in Gov

Evidence: csa_platform/ai_integration/README.md, csa_platform/ai_integration/rag/pipeline.py, docs/adr/0007-azure-openai-over-self-hosted-llm.md

7. Security and governance (9 features)¶

#	Snowflake feature	Azure equivalent	Mapping notes	Effort	Gov parity
41	Network policies (IP allowlists)	Azure Private Endpoints + NSGs + Firewall	Stronger isolation via Private Endpoints; NSGs for fine-grained control	M	GA
42	Dynamic data masking	Purview sensitivity labels + Unity Catalog MASK functions	Column-level masking via UC; classification-driven via Purview	M	GA
43	Row access policies	Unity Catalog row filters	Rewrite policy body as UC row filter; map CURRENT_ROLE() to Entra groups	M	GA
44	Object tagging (governance tags)	Purview classifications + Unity Catalog tags	Tags flow across catalog; Purview scans auto-classify PII/PHI	S	GA
45	Access history / query history	Purview audit + Azure Monitor + tamper-evident audit	Query audit to Log Analytics; tamper-evident chain (CSA-0016) exceeds Snowflake audit	M	GA
46	Account/Database/Schema hierarchy	Entra tenant / Workspace / UC Catalog / Schema	1:1 mapping; more layers but more granular control	S	GA
47	RBAC (roles + grants)	Entra ID groups + Unity Catalog grants	Roles become Entra groups; GRANT syntax nearly identical in UC	M	GA
48	Data classification	Purview auto-classification (PII, PHI, CUI, financial)	Automated scanning with 200+ built-in classifiers; custom classifiers supported	M	GA
49	Key pair authentication	Entra ID service principals + managed identities	Stronger identity model; no key rotation burden with managed identities	M	GA

Evidence: csa_platform/csa_platform/governance/purview/, csa_platform/unity_catalog_pattern/unity_catalog/, csa_platform/multi_synapse/rbac_templates/

#	Snowflake feature	Azure equivalent	Mapping notes	Effort	Gov parity
50	Secure Data Sharing (intra-Snowflake)	Delta Sharing (open protocol) + OneLake shortcuts	More setup than Snowflake; open protocol works across platforms	L	GA
51	Reader accounts	Delta Sharing recipients (no compute cost to provider)	Recipients bring their own compute; no provider credit consumption	M	GA
52	Data Marketplace	Fabric Data Marketplace + Purview data products	Data product registry with contracts; marketplace discovery	L	GA
53	Data Clean Rooms	Delta Sharing + Purview + Azure Confidential Computing	More stitching than Snowflake; purpose-built UX is a gap	L	Partial
54	Listings (provider/consumer model)	Purview data products + contract.yaml	Data product contracts with SLA, schema, classification metadata	M	GA

Evidence: csa_platform/data_marketplace/, csa_platform/data_marketplace/api/

9. Orchestration and scheduling (4 features)¶

#	Snowflake feature	Azure equivalent	Mapping notes	Effort	Gov parity
55	Tasks (scheduled SQL)	ADF triggers + Databricks Jobs	Schedule triggers for time-based; event triggers for data-arrival	M	GA
56	Task DAGs (dependencies)	dbt ref() DAG + ADF pipeline dependencies	dbt handles model dependencies; ADF for cross-pipeline orchestration	M	GA
57	Streams (CDC)	Delta change-data-feed (CDF) + Databricks DLT	Enable CDF on Delta tables; DLT for streaming CDC pipelines	M	GA
58	Alerts (condition-based notifications)	Azure Monitor alerts + Logic Apps	Monitor metrics trigger alerts; Logic Apps for complex notification workflows	S	GA

Evidence: domains/shared/pipelines/adf/, docs/adr/0001-adf-dbt-over-airflow.md

10. Developer tools and interfaces (5 features)¶

#	Snowflake feature	Azure equivalent	Mapping notes	Effort	Gov parity
59	SnowSQL CLI	Databricks CLI + Azure CLI + dbt CLI	Three CLIs replace one; each is purpose-built for its domain	S	GA
60	Snowsight (web UI)	Databricks SQL Editor + Fabric portal	SQL editing, visualization, dashboards in browser	XS	GA
61	Snowflake Connector for Python	Databricks SDK for Python + ODBC/JDBC	`databricks-sdk` package; ODBC/JDBC for legacy tools	S	GA
62	Snowflake Connector for Spark	Native (Databricks IS Spark)	No connector needed; Spark is the compute engine	XS	GA
63	Snowflake REST API	Databricks REST API + Azure REST APIs	Databricks API for workspace/warehouse/jobs; Azure APIs for platform services	S	GA

11. Miscellaneous features (3 features)¶

#	Snowflake feature	Azure equivalent	Mapping notes	Effort	Gov parity
64	Replication (cross-region / cross-cloud)	ADLS GRS + Databricks disaster recovery	GRS for storage; Databricks workspace DR for compute metadata	M	GA
65	Data Exchange (private marketplace)	Purview data products + Delta Sharing	Private exchanges via Purview-managed data products with access controls	L	GA
66	Snowflake Horizon (governance suite)	Purview + Unity Catalog + dbt contracts	Federated governance: each piece is best-in-class and swappable	M	GA

Summary statistics¶

Category	Features mapped	Avg effort	Gov parity
Compute and warehousing	8	S-M	100% GA
Storage and data formats	7	XS-M	100% GA
Data ingestion	6	S-M	100% GA
Data transformation	6	S-M	100% GA
Snowpark ecosystem	5	S-L	100% GA
Cortex AI services	8	S-M	100% GA in Gov
Security and governance	9	S-M	100% GA
Data sharing	5	M-L	80% GA (Clean Rooms partial)
Orchestration	4	S-M	100% GA
Developer tools	5	XS-S	100% GA
Miscellaneous	3	M-L	100% GA
Total	66		98% GA in Gov

Gaps summary¶

Only one feature has incomplete Gov parity:

Data Clean Rooms -- Snowflake's purpose-built clean-room UX is more turnkey. Azure's stitch (Delta Sharing + Purview + Confidential Computing) works but requires more configuration. For most federal data-sharing scenarios, this is acceptable.

Warehouse Migration -- deep dive on compute translation
Cortex Migration -- deep dive on AI feature migration
Security Migration -- deep dive on governance and access control
Master playbook -- Section 2 for the original capability mapping

Last updated: 2026-04-30 Maintainers: CSA-in-a-Box core team