= Azure Synapse Analytics Quickstart¶
Get started with Azure Synapse Analytics in under an hour. Create your first workspace, load data, and run your first analytics query.
= Learning Objectives¶
After completing this quickstart, you will be able to:
- Understand what Azure Synapse Analytics is and its components
- Create and configure a Synapse workspace
- Load data into Data Lake Storage
- Query data using Serverless SQL Pool
- Visualize results with built-in charts in Synapse Studio
= Prerequisites¶
- Azure subscription - Create free account
- Basic SQL knowledge - Understanding of SELECT, WHERE, JOIN
- Azure Portal access - portal.azure.com
- 5-10 minutes to complete setup
= What is Azure Synapse Analytics?¶
Azure Synapse is a unified analytics platform that brings together:
- Data Integration: Pipelines for ETL/ELT
- Data Warehousing: Dedicated and Serverless SQL pools
- Big Data Analytics: Apache Spark pools for distributed processing
- Data Exploration: Integrated notebooks and SQL scripts
- Visualization: Power BI integration
Key Components¶
graph LR
    A[Data Sources] --> B[Azure Synapse Workspace]
    B --> C[Serverless SQL Pool]
    B --> D[Dedicated SQL Pool]
    B --> E[Spark Pools]
    B --> F[Pipelines]
    C --> G[Query Results]
    D --> G
    E --> G
= Step 1: Create Synapse Workspace¶
Using Azure Portal¶
- Navigate to Azure Portal
  - Go to portal.azure.com
  - Click "Create a resource"
  - Search for "Azure Synapse Analytics"
  - Click "Create"
- Configure Basics
  - Subscription: Select your subscription
  - Resource Group: Create new "rg-synapse-quickstart"
  - Workspace Name: "synapse-quickstart-[yourname]" (must be globally unique)
  - Region: Select the nearest region
  - Storage Account Name: Create a new storage account (name is auto-generated)
  - File System Name: "data"
- Configure Security
  - SQL Administrator Login: "sqladmin"
  - Password: Create a strong password (save it!)
  - Allow connections from: Select "All networks" for this quickstart
- Review and Create
  - Click "Review + create"
  - Click "Create"
  - Wait 5-10 minutes for deployment
Verify Deployment¶
Once deployment completes:
- Click "Go to resource"
- Click "Open Synapse Studio"
- You should see the Synapse Studio interface
Tip: Bookmark the Synapse Studio URL for easy access
= Step 2: Prepare Sample Data¶
We'll use a sample CSV file with sales data.
Create Sample Data File¶
Create a file named sales_data.csv:
order_id,product,category,amount,order_date,customer_id
1001,Laptop,Electronics,1299.99,2024-01-15,C101
1002,Chair,Furniture,249.99,2024-01-15,C102
1003,Monitor,Electronics,399.99,2024-01-16,C101
1004,Desk,Furniture,549.99,2024-01-16,C103
1005,Keyboard,Electronics,89.99,2024-01-17,C102
1006,Mouse,Electronics,39.99,2024-01-17,C104
1007,Lamp,Furniture,79.99,2024-01-18,C101
1008,Tablet,Electronics,599.99,2024-01-18,C105
1009,Bookshelf,Furniture,299.99,2024-01-19,C103
1010,Phone,Electronics,899.99,2024-01-19,C102
Upload to Data Lake Storage¶
- Open Synapse Studio
- Click the "Data" hub (database icon on the left)
- Select the "Linked" tab
- Expand "Azure Data Lake Storage Gen2"
- Expand your storage account
- Expand the "data" container
- Create Folder
  - Right-click on "data"
  - Select "New folder"
  - Name it "sales"
- Upload File
  - Right-click the "sales" folder
  - Select "Upload files"
  - Choose sales_data.csv
  - Click "Upload"
= Step 3: Query Data with Serverless SQL¶
Serverless SQL Pool lets you query data directly in storage without loading it first.
Create SQL Script¶
- Open Develop Hub
  - Click the "Develop" hub (document icon on the left)
  - Click "+" to add a new resource
  - Select "SQL script"
- Write Query
-- Query sales data from CSV file
SELECT
*
FROM OPENROWSET(
BULK 'https://[your-storage-account].dfs.core.windows.net/data/sales/sales_data.csv',
FORMAT = 'CSV',
PARSER_VERSION = '2.0',
HEADER_ROW = TRUE
) AS sales_data;
- Update Storage Account Name
  - Replace [your-storage-account] with your actual storage account name
  - Find it in the Data hub under Linked > Azure Data Lake Storage Gen2
- Run Query
  - Click the "Run" button
  - View results in the output pane
Analyze Sales Data¶
Try these example queries:
Total Sales by Category:
SELECT
category,
COUNT(*) AS order_count,
SUM(CAST(amount AS DECIMAL(10,2))) AS total_sales,
AVG(CAST(amount AS DECIMAL(10,2))) AS avg_order_value
FROM OPENROWSET(
BULK 'https://[your-storage-account].dfs.core.windows.net/data/sales/sales_data.csv',
FORMAT = 'CSV',
PARSER_VERSION = '2.0',
HEADER_ROW = TRUE
) AS sales_data
GROUP BY category
ORDER BY total_sales DESC;
Top Customers by Spend:
SELECT
customer_id,
COUNT(*) AS order_count,
SUM(CAST(amount AS DECIMAL(10,2))) AS total_spent
FROM OPENROWSET(
BULK 'https://[your-storage-account].dfs.core.windows.net/data/sales/sales_data.csv',
FORMAT = 'CSV',
PARSER_VERSION = '2.0',
HEADER_ROW = TRUE
) AS sales_data
GROUP BY customer_id
ORDER BY total_spent DESC;
Daily Sales Trend:
SELECT
CAST(order_date AS DATE) AS sale_date,
COUNT(*) AS orders,
SUM(CAST(amount AS DECIMAL(10,2))) AS daily_sales
FROM OPENROWSET(
BULK 'https://[your-storage-account].dfs.core.windows.net/data/sales/sales_data.csv',
FORMAT = 'CSV',
PARSER_VERSION = '2.0',
HEADER_ROW = TRUE
) AS sales_data
GROUP BY CAST(order_date AS DATE)
ORDER BY sale_date;
= Step 4: Create a Database View¶
Make queries easier by creating a view.
-- Create database for views
CREATE DATABASE SalesDB;
GO
-- Use the new database
USE SalesDB;
GO
-- Create external data source
CREATE EXTERNAL DATA SOURCE SalesData
WITH (
LOCATION = 'https://[your-storage-account].dfs.core.windows.net/data/sales/'
);
GO
-- Create view
CREATE VIEW vw_Sales AS
SELECT
CAST(order_id AS INT) AS order_id,
product,
category,
CAST(amount AS DECIMAL(10,2)) AS amount,
CAST(order_date AS DATE) AS order_date,
customer_id
FROM OPENROWSET(
BULK 'sales_data.csv',
DATA_SOURCE = 'SalesData',
FORMAT = 'CSV',
PARSER_VERSION = '2.0',
HEADER_ROW = TRUE
) AS sales_data;
GO
Now query the view:
-- Much simpler query!
SELECT
category,
SUM(amount) AS total_sales
FROM vw_Sales
GROUP BY category
ORDER BY total_sales DESC;
= Step 5: Visualize Results¶
Built-in Chart Visualization¶
- Run a query
- Click "Chart" tab in results pane
- Configure chart:
- Chart Type: Column chart
- Category Column: category
- Legend: None
- Value: total_sales
Export Results¶
- CSV: Click "Export" > "CSV"
- Excel: Click "Export" > "Excel"
- Power BI: Click "Chart" > "Export to Power BI"
= Understanding Serverless SQL¶
Key Benefits¶
No Infrastructure Management
- No servers to provision or manage
- Automatically scales based on query complexity
Pay-Per-Query
- Only pay for data processed (per TB scanned)
- No idle costs when not querying
Query Data in Place
- No need to load/copy data
- Query files directly in storage
Support for Multiple Formats
- CSV, Parquet, JSON, Delta Lake
- Compressed files (gzip, snappy)
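As a sketch of the format flexibility, querying a Parquet file looks almost identical to the CSV queries in Step 3; the file name below is hypothetical. Parquet carries its own schema, so no HEADER_ROW or parser options are needed:

```sql
-- Sketch: query a hypothetical Parquet file in place.
-- Parquet stores column names and types, so no HEADER_ROW or casts are required.
SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'https://[your-storage-account].dfs.core.windows.net/data/sales/sales_data.parquet',
    FORMAT = 'PARQUET'
) AS sales_data;
```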
Best Practices¶
- Use Parquet for Large Datasets
  - More efficient than CSV
  - Built-in compression
  - Columnar format (better for analytics)
- Partition Your Data
  - Organize by date: /year=2024/month=01/day=15/
  - Use folder partition elimination in queries
- Create Statistics
  - Improves query performance
  - Auto-creates for frequently queried columns
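Partition elimination can be sketched with the serverless filepath() function, which returns the value matched by the n-th wildcard in the BULK path. The year=*/month=*/ layout below is hypothetical; only folders matching the WHERE clause are read:

```sql
-- Sketch: query a hypothetical date-partitioned folder layout.
-- filepath(1) = value matched by the first *, filepath(2) by the second.
SELECT
    result.filepath(1) AS [year],
    result.filepath(2) AS [month],
    SUM(CAST(amount AS DECIMAL(10,2))) AS total_sales
FROM OPENROWSET(
    BULK 'https://[your-storage-account].dfs.core.windows.net/data/sales/year=*/month=*/*.csv',
    FORMAT = 'CSV',
    PARSER_VERSION = '2.0',
    HEADER_ROW = TRUE
) AS result
WHERE result.filepath(1) = '2024'
  AND result.filepath(2) = '01'
GROUP BY result.filepath(1), result.filepath(2);
```

Because the filter is on filepath(), the serverless engine skips every folder outside year=2024/month=01, which directly reduces the data scanned (and therefore the cost).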
= Troubleshooting¶
Common Issues¶
Error: "External table access failed"
- Check storage account name and container
- Verify file exists in specified path
- Ensure Synapse workspace has storage access
Error: "Permission denied"
- Grant "Storage Blob Data Contributor" role to Synapse workspace MSI
- Go to Storage Account > Access Control (IAM) > Add role assignment
Query Returns No Results
- Verify CSV header row matches query
- Check file path and name spelling
- Ensure file uploaded successfully
= Next Steps¶
Beginner Practice¶
- Upload your own CSV files
- Create additional views
- Join multiple files in a query
- Export results to Power BI
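For the "join multiple files" exercise, OPENROWSET rowsets can be joined like ordinary tables. This sketch assumes a second, hypothetical customers.csv file with customer_id and customer_name columns:

```sql
-- Sketch: join the sales file to a hypothetical customers.csv.
SELECT
    s.order_id,
    s.product,
    CAST(s.amount AS DECIMAL(10,2)) AS amount,
    c.customer_name
FROM OPENROWSET(
    BULK 'https://[your-storage-account].dfs.core.windows.net/data/sales/sales_data.csv',
    FORMAT = 'CSV', PARSER_VERSION = '2.0', HEADER_ROW = TRUE
) AS s
JOIN OPENROWSET(
    BULK 'https://[your-storage-account].dfs.core.windows.net/data/customers/customers.csv',
    FORMAT = 'CSV', PARSER_VERSION = '2.0', HEADER_ROW = TRUE
) AS c
    ON s.customer_id = c.customer_id;
```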
Intermediate Challenges¶
- Query Parquet files instead of CSV
- Create partitioned folder structure
- Build external tables
- Use CETAS to save query results
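The CETAS challenge can be sketched as follows, reusing the SalesDB database, SalesData data source, and vw_Sales view from Step 4. CETAS (CREATE EXTERNAL TABLE AS SELECT) runs a query and writes the result back to storage; the external file format name and output folder here are hypothetical:

```sql
-- Sketch: save an aggregation back to storage with CETAS.
USE SalesDB;
GO

-- CETAS needs a file format; Parquet is the usual choice.
CREATE EXTERNAL FILE FORMAT ParquetFormat
WITH (FORMAT_TYPE = PARQUET);
GO

CREATE EXTERNAL TABLE SalesByCategory
WITH (
    LOCATION = 'aggregated/',      -- folder created under the SalesData location
    DATA_SOURCE = SalesData,
    FILE_FORMAT = ParquetFormat
)
AS
SELECT category, SUM(amount) AS total_sales
FROM vw_Sales
GROUP BY category;
GO
```

After this runs, SalesByCategory is queryable like a table, and the Parquet files it produced are visible in the Data hub under the aggregated/ folder.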
Advanced Topics¶
- Create Spark Pool for big data processing
- Build data pipeline with Data Factory
- Implement row-level security
- Integrate with Azure Purview
= Additional Resources¶
Next Tutorials¶
- Synapse Complete Tutorial - Deep dive into all features
- Data Engineer Learning Path
- Serverless SQL Best Practices
= Cleanup¶
To avoid Azure charges:
Delete Resource Group¶
From the Azure CLI:
az group delete --name rg-synapse-quickstart
Or in Azure Portal:
- Navigate to Resource Groups
- Select "rg-synapse-quickstart"
- Click "Delete resource group"
- Type resource group name to confirm
- Click "Delete"
Note: This deletes the workspace, storage account, and all data!
= Congratulations!¶
You've successfully:
- Created an Azure Synapse Analytics workspace
- Uploaded data to Data Lake Storage
- Queried data with Serverless SQL Pool
- Created database views for easier analysis
- Visualized query results
You're ready to build more complex analytics solutions with Azure Synapse!
Ready for more? Try the Complete Synapse Tutorial
Last Updated: January 2025 | Tutorial Version: 1.0 | Tested with: Azure Synapse Analytics (December 2024 release)