# Spark Pools
Azure Synapse Spark Pools provide scalable Apache Spark compute for big data analytics and machine learning workloads.
## Overview
Spark Pools in Azure Synapse Analytics enable you to:
- Process large-scale data using Apache Spark
- Run machine learning workloads with built-in libraries
- Integrate with Delta Lake for ACID transactions
- Scale compute resources on-demand
## Key Features
- Auto-scaling: Automatically scale nodes based on workload
- Built-in Libraries: Pre-installed Spark, Python, and ML libraries
- Notebook Integration: Interactive development with Synapse notebooks
- Delta Lake Support: ACID transactions and time travel
## Sections
- Delta Lakehouse - Delta Lake implementation patterns
## Getting Started
To create a Spark Pool:
1. Navigate to your Synapse workspace
2. Select "Apache Spark pools" from the left menu
3. Click "+ New" to create a pool
4. Configure node size and auto-scaling settings
5. Review and create
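The portal steps above can also be scripted. The sketch below uses the Azure CLI's `az synapse spark pool create` command with placeholder resource names; parameter names can vary across CLI versions, so verify them with `az synapse spark pool create --help` before running.

```shell
# Assumes an existing Synapse workspace; all names are placeholders.
az synapse spark pool create \
  --name demopool \
  --workspace-name my-synapse-workspace \
  --resource-group my-resource-group \
  --spark-version 3.4 \
  --node-size Medium \
  --node-count 3 \
  --enable-auto-scale true \
  --min-node-count 3 \
  --max-node-count 10
```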
## Best Practices
- Use auto-pause to save costs when pools are idle
- Right-size your nodes based on workload requirements
- Enable dynamic allocation for variable workloads
- Use Delta Lake for production data pipelines
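To make the right-sizing bullet concrete, here is a rough, illustrative sketch, not official Azure guidance: it assumes a 2x in-memory working-set factor (an assumption, tune for your workload) and the typical Synapse node memory sizes (Small 32 GB, Medium 64 GB, Large 128 GB), with the pool minimum of three nodes.

```python
import math

# Typical Azure Synapse node sizes (memory in GB).
NODE_MEMORY_GB = {"Small": 32, "Medium": 64, "Large": 128}


def estimate_node_count(dataset_gb: float, node_size: str,
                        overhead: float = 2.0, min_nodes: int = 3) -> int:
    """Rough estimate of nodes needed to hold ~overhead x the dataset in memory.

    The overhead factor is an assumption for shuffle/caching headroom;
    Synapse Spark pools require at least three nodes.
    """
    needed = math.ceil(dataset_gb * overhead / NODE_MEMORY_GB[node_size])
    return max(needed, min_nodes)
```

For example, a 500 GB dataset on Medium nodes comes out at 16 nodes under these assumptions, while a small 10 GB job still gets the three-node minimum.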
## Related Documentation
Back to Azure Synapse | Documentation Home