Azure Synapse Analytics Shared Metadata

Azure Synapse Analytics provides a shared metadata architecture that lets different compute engines, including Apache Spark pools and serverless SQL pools, work against the same database and table definitions. This section documents the shared metadata capabilities, architecture, and best practices.

Documentation

  • Shared Metadata Architecture Overview - Comprehensive guide to the shared metadata architecture, including key components, security model, and best practices.
  • Visual Guides and Diagrams - Visual representations of serverless replicated databases, three-part naming concepts, and layered data architecture.
  • Code Examples - Detailed code samples for implementing shared metadata patterns.

Key Features

  • Single metadata store for multiple compute engines
  • Consistent schema definition across Spark and SQL
  • Unified data governance and lineage
  • Streamlined cross-engine workloads
  • Simplified DevOps management

Architecture Overview

[Figure: Azure Synapse SQL Architecture]

With shared metadata, databases and tables defined in one compute engine are exposed to the others, so the same data can be queried and governed without maintaining a separate set of definitions per engine.

Implementation Patterns

Cross-Engine Table Access

Access tables defined in Spark from SQL:

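-- Access a table created in Spark from the serverless SQL pool.
-- Spark tables are exposed under the dbo schema of the synchronized database.
SELECT TOP 10 * FROM sales_gold.dbo.customer_summary;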

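Access tables defined in SQL from Spark (this uses the Synapse dedicated SQL pool connector rather than metadata synchronization):

# Read a dedicated SQL pool table from Spark; the connector expects <database>.<schema>.<table>
import com.microsoft.spark.sqlanalytics  # makes synapsesql available on the DataFrame reader
customer_df = spark.read.synapsesql("sales_gold.dbo.customer_summary")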
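When queries are generated programmatically, it can help to build the fully qualified table name in one place. The following is a minimal sketch, assuming Spark tables surface in the serverless SQL pool under the `dbo` schema of a database named after the Spark database (the three-part naming mentioned in the visual guides). The helper name `serverless_table_name` is hypothetical and not part of any Synapse library.

```python
def serverless_table_name(spark_database: str, table: str, schema: str = "dbo") -> str:
    """Build a bracket-quoted three-part name: [database].[schema].[table].

    Assumption: the Spark table is exposed under the given schema (dbo by
    default) of a serverless SQL database named after the Spark database.
    """
    def quote(identifier: str) -> str:
        # T-SQL escapes a closing bracket inside a quoted identifier by doubling it.
        return "[" + identifier.replace("]", "]]") + "]"

    return ".".join(quote(part) for part in (spark_database, schema, table))


# Example: compose the query text for the table used in this section.
query = f"SELECT TOP 10 * FROM {serverless_table_name('sales_gold', 'customer_summary')};"
print(query)
```

Quoting every part keeps the generated statement valid even when a database or table name contains spaces or reserved words.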

Metadata Propagation

  • Schema Changes: Changes to Spark databases and tables are synchronized to the serverless SQL pool automatically, though asynchronously, so they may take a short time to appear
  • Statistics: Query optimization statistics are shared for better performance
  • Access Control: Synchronized databases and tables are secured at the underlying storage level, so the same data permissions apply whichever engine reads them
  • Lineage: Data lineage is tracked across different processing engines
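Because propagation between engines may not be instantaneous, a pipeline that creates a table in one engine and immediately queries it from another may need to wait briefly. The following is a minimal polling sketch under that assumption. `wait_until_visible` and the `table_exists` callback are hypothetical names; in practice the callback would run whatever existence check your SQL client supports.

```python
import time
from typing import Callable


def wait_until_visible(
    table_exists: Callable[[], bool],
    timeout_s: float = 60.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll table_exists() until it returns True or the timeout elapses.

    Returns True if the table became visible, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if table_exists():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


# Usage with a stand-in check that succeeds on the third attempt.
attempts = {"count": 0}

def fake_check() -> bool:
    attempts["count"] += 1
    return attempts["count"] >= 3

visible = wait_until_visible(fake_check, timeout_s=10, interval_s=0)
print(visible, attempts["count"])
```

Injecting the check and the sleep function keeps the helper independent of any particular database driver and easy to test.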

Best Practices

  1. Use Consistent Naming Conventions: Adopt a clear naming standard across all engines
  2. Implement Row-Level Security: Apply consistent security at the row level where needed
  3. Establish Data Ownership: Define clear ownership of metadata objects
  4. Document Metadata: Maintain comprehensive documentation of your metadata structure
  5. Regular Validation: Periodically validate metadata consistency across engines
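The validation step in item 5 can be partly automated. The following is a minimal sketch, assuming you have already collected a column-to-type mapping for the same table from each engine (for example from a Spark DataFrame schema and from a SQL catalog query) and have mapped both sides to a common type vocabulary. The function name `diff_schemas` and the sample columns are hypothetical.

```python
from typing import Dict, List


def diff_schemas(spark_cols: Dict[str, str], sql_cols: Dict[str, str]) -> List[str]:
    """Return human-readable differences between two column -> type mappings.

    Names and types are compared case-insensitively. Types are assumed to be
    expressed in a common vocabulary chosen by the caller.
    """
    spark_norm = {name.lower(): dtype.lower() for name, dtype in spark_cols.items()}
    sql_norm = {name.lower(): dtype.lower() for name, dtype in sql_cols.items()}

    issues: List[str] = []
    for name in sorted(spark_norm.keys() - sql_norm.keys()):
        issues.append(f"missing in SQL: {name}")
    for name in sorted(sql_norm.keys() - spark_norm.keys()):
        issues.append(f"missing in Spark: {name}")
    for name in sorted(spark_norm.keys() & sql_norm.keys()):
        if spark_norm[name] != sql_norm[name]:
            issues.append(
                f"type mismatch for {name}: {spark_norm[name]} vs {sql_norm[name]}"
            )
    return issues


# Example with made-up columns: one column missing on each side.
report = diff_schemas(
    {"customer_id": "bigint", "total": "double"},
    {"customer_id": "bigint", "region": "string"},
)
for line in report:
    print(line)
```

An empty result means the two definitions agree. Running such a check on a schedule gives an early signal when the engines drift apart.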