= Azure Synapse Analytics Quickstart¶
Get started with Azure Synapse Analytics in under an hour. Create your first workspace, load data, and run your first analytics query.
= Learning Objectives¶
After completing this quickstart, you will be able to:
- Understand what Azure Synapse Analytics is and its components
- Create and configure a Synapse workspace
- Load data into Data Lake Storage
- Query data using Serverless SQL Pool
- Visualize results with built-in charts in Synapse Studio
= Prerequisites¶
- Azure subscription - Create free account
- Basic SQL knowledge - Understanding of SELECT, WHERE, JOIN
- Azure Portal access - portal.azure.com
- 5-10 minutes to complete setup
= What is Azure Synapse Analytics?¶
Azure Synapse is a unified analytics platform that brings together:
- Data Integration: Pipelines for ETL/ELT
- Data Warehousing: Dedicated and Serverless SQL pools
- Big Data Analytics: Apache Spark pools for distributed processing
- Data Exploration: Integrated notebooks and SQL scripts
- Visualization: Power BI integration
Key Components¶
graph LR
    A[Data Sources] --> B[Azure Synapse Workspace]
    B --> C[Serverless SQL Pool]
    B --> D[Dedicated SQL Pool]
    B --> E[Spark Pools]
    B --> F[Pipelines]
    C --> G[Query Results]
    D --> G
    E --> G
= Step 1: Create Synapse Workspace¶
Using Azure Portal¶
- Navigate to Azure Portal
  - Go to portal.azure.com
  - Click "Create a resource"
  - Search for "Azure Synapse Analytics"
  - Click "Create"
- Configure Basics
  - Subscription: Select your subscription
  - Resource Group: Create new "rg-synapse-quickstart"
  - Workspace Name: "synapse-quickstart-[yourname]" (must be globally unique)
  - Region: Select the nearest region
  - Storage Account Name: Create a new storage account (name is auto-generated)
  - File System Name: "data"
- Configure Security
  - SQL Administrator Login: "sqladmin"
  - Password: Create a strong password (save it!)
  - Allow connections from: Select "All networks" for this quickstart
- Review and Create
  - Click "Review + create"
  - Click "Create"
  - Wait 5-10 minutes for deployment
Verify Deployment¶
Once deployment completes:
- Click "Go to resource"
- Click "Open Synapse Studio"
- You should see the Synapse Studio interface
Tip: Bookmark the Synapse Studio URL for easy access
= Step 2: Prepare Sample Data¶
We'll use a sample CSV file with sales data.
Create Sample Data File¶
Create a file named sales_data.csv:
order_id,product,category,amount,order_date,customer_id
1001,Laptop,Electronics,1299.99,2024-01-15,C101
1002,Chair,Furniture,249.99,2024-01-15,C102
1003,Monitor,Electronics,399.99,2024-01-16,C101
1004,Desk,Furniture,549.99,2024-01-16,C103
1005,Keyboard,Electronics,89.99,2024-01-17,C102
1006,Mouse,Electronics,39.99,2024-01-17,C104
1007,Lamp,Furniture,79.99,2024-01-18,C101
1008,Tablet,Electronics,599.99,2024-01-18,C105
1009,Bookshelf,Furniture,299.99,2024-01-19,C103
1010,Phone,Electronics,899.99,2024-01-19,C102
Upload to Data Lake Storage¶
- Open Synapse Studio
- Click the "Data" hub (database icon on the left)
- Select the "Linked" tab
- Expand "Azure Data Lake Storage Gen2"
- Expand your storage account
- Expand the "data" container
- Create Folder
  - Right-click on "data"
  - Select "New folder"
  - Name it "sales"
- Upload File
  - Right-click the "sales" folder
  - Select "Upload files"
  - Choose sales_data.csv
  - Click "Upload"
= Step 3: Query Data with Serverless SQL¶
Serverless SQL Pool lets you query data directly in storage without loading it first.
Create SQL Script¶
- Open Develop Hub
  - Click the "Develop" hub (document icon on the left)
  - Click "+" to add a new resource
  - Select "SQL script"
- Write Query
-- Query sales data from CSV file
SELECT
*
FROM OPENROWSET(
BULK 'https://[your-storage-account].dfs.core.windows.net/data/sales/sales_data.csv',
FORMAT = 'CSV',
PARSER_VERSION = '2.0',
HEADER_ROW = TRUE
) AS sales_data;
- Update Storage Account Name
  - Replace [your-storage-account] with your actual storage account name
  - Find it in the Data hub under Linked > Azure Data Lake Storage Gen2
- Run Query
  - Click the "Run" button
  - View results in the output pane
Analyze Sales Data¶
Try these example queries:
Total Sales by Category:
SELECT
category,
COUNT(*) AS order_count,
SUM(CAST(amount AS DECIMAL(10,2))) AS total_sales,
AVG(CAST(amount AS DECIMAL(10,2))) AS avg_order_value
FROM OPENROWSET(
BULK 'https://[your-storage-account].dfs.core.windows.net/data/sales/sales_data.csv',
FORMAT = 'CSV',
PARSER_VERSION = '2.0',
HEADER_ROW = TRUE
) AS sales_data
GROUP BY category
ORDER BY total_sales DESC;
Top Customers by Spend:
SELECT
customer_id,
COUNT(*) AS order_count,
SUM(CAST(amount AS DECIMAL(10,2))) AS total_spent
FROM OPENROWSET(
BULK 'https://[your-storage-account].dfs.core.windows.net/data/sales/sales_data.csv',
FORMAT = 'CSV',
PARSER_VERSION = '2.0',
HEADER_ROW = TRUE
) AS sales_data
GROUP BY customer_id
ORDER BY total_spent DESC;
Daily Sales Trend:
SELECT
CAST(order_date AS DATE) AS sale_date,
COUNT(*) AS orders,
SUM(CAST(amount AS DECIMAL(10,2))) AS daily_sales
FROM OPENROWSET(
BULK 'https://[your-storage-account].dfs.core.windows.net/data/sales/sales_data.csv',
FORMAT = 'CSV',
PARSER_VERSION = '2.0',
HEADER_ROW = TRUE
) AS sales_data
GROUP BY CAST(order_date AS DATE)
ORDER BY sale_date;
= Step 4: Create a Database View¶
Make queries easier by creating a view.
-- Create database for views
CREATE DATABASE SalesDB;
GO
-- Use the new database
USE SalesDB;
GO
-- Create external data source
CREATE EXTERNAL DATA SOURCE SalesData
WITH (
LOCATION = 'https://[your-storage-account].dfs.core.windows.net/data/sales/'
);
GO
-- Create view
CREATE VIEW vw_Sales AS
SELECT
CAST(order_id AS INT) AS order_id,
product,
category,
CAST(amount AS DECIMAL(10,2)) AS amount,
CAST(order_date AS DATE) AS order_date,
customer_id
FROM OPENROWSET(
BULK 'sales_data.csv',
DATA_SOURCE = 'SalesData',
FORMAT = 'CSV',
PARSER_VERSION = '2.0',
HEADER_ROW = TRUE
) AS sales_data;
GO
Now query the view:
-- Much simpler query!
SELECT
category,
SUM(amount) AS total_sales
FROM vw_Sales
GROUP BY category
ORDER BY total_sales DESC;
= Step 5: Visualize Results¶
Built-in Chart Visualization¶
- Run a query
- Click "Chart" tab in results pane
- Configure chart:
- Chart Type: Column chart
- Category Column: category
- Legend: None
- Value: total_sales
Export Results¶
- CSV: Click "Export" > "CSV"
- Excel: Click "Export" > "Excel"
- Power BI: Click "Chart" > "Export to Power BI"
= Understanding Serverless SQL¶
Key Benefits¶
No Infrastructure Management
- No servers to provision or manage
- Automatically scales based on query complexity
Pay-Per-Query
- Only pay for data processed (per TB scanned)
- No idle costs when not querying
Query Data in Place
- No need to load/copy data
- Query files directly in storage
Support for Multiple Formats
- CSV, Parquet, JSON, Delta Lake
- Compressed files (gzip, snappy)
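As a sketch of the format flexibility, querying a Parquet file looks almost identical to the CSV queries in Step 3; the file name below is hypothetical. Parquet carries its own schema, so no HEADER_ROW or parser options are needed:

```sql
-- Sketch: query a hypothetical Parquet file in place.
-- Parquet stores column names and types, so no HEADER_ROW or casts are required.
SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'https://[your-storage-account].dfs.core.windows.net/data/sales/sales_data.parquet',
    FORMAT = 'PARQUET'
) AS sales_data;
```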
Best Practices¶
- Use Parquet for Large Datasets
  - More efficient than CSV
  - Built-in compression
  - Columnar format (better for analytics)
- Partition Your Data
  - Organize by date: /year=2024/month=01/day=15/
  - Use folder partition elimination in queries
- Create Statistics
  - Improves query performance
  - Auto-creates for frequently queried columns
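Partition elimination can be sketched with the serverless filepath() function, which returns the value matched by the n-th wildcard in the BULK path. The year=*/month=*/ layout below is hypothetical; only folders matching the WHERE clause are read:

```sql
-- Sketch: query a hypothetical date-partitioned folder layout.
-- filepath(1) = value matched by the first *, filepath(2) by the second.
SELECT
    result.filepath(1) AS [year],
    result.filepath(2) AS [month],
    SUM(CAST(amount AS DECIMAL(10,2))) AS total_sales
FROM OPENROWSET(
    BULK 'https://[your-storage-account].dfs.core.windows.net/data/sales/year=*/month=*/*.csv',
    FORMAT = 'CSV',
    PARSER_VERSION = '2.0',
    HEADER_ROW = TRUE
) AS result
WHERE result.filepath(1) = '2024'
  AND result.filepath(2) = '01'
GROUP BY result.filepath(1), result.filepath(2);
```

Because the filter is on filepath(), the serverless engine skips every folder outside year=2024/month=01, which directly reduces the data scanned (and therefore the cost).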
= Troubleshooting¶
Common Issues¶
Error: "External table access failed"
- Check storage account name and container
- Verify file exists in specified path
- Ensure Synapse workspace has storage access
Error: "Permission denied"
- Grant "Storage Blob Data Contributor" role to Synapse workspace MSI
- Go to Storage Account > Access Control (IAM) > Add role assignment
Query Returns No Results
- Verify CSV header row matches query
- Check file path and name spelling
- Ensure file uploaded successfully
= Next Steps¶
Beginner Practice¶
- Upload your own CSV files
- Create additional views
- Join multiple files in a query
- Export results to Power BI
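For the "join multiple files" exercise, OPENROWSET rowsets can be joined like ordinary tables. This sketch assumes a second, hypothetical customers.csv file with customer_id and customer_name columns:

```sql
-- Sketch: join the sales file to a hypothetical customers.csv.
SELECT
    s.order_id,
    s.product,
    CAST(s.amount AS DECIMAL(10,2)) AS amount,
    c.customer_name
FROM OPENROWSET(
    BULK 'https://[your-storage-account].dfs.core.windows.net/data/sales/sales_data.csv',
    FORMAT = 'CSV', PARSER_VERSION = '2.0', HEADER_ROW = TRUE
) AS s
JOIN OPENROWSET(
    BULK 'https://[your-storage-account].dfs.core.windows.net/data/customers/customers.csv',
    FORMAT = 'CSV', PARSER_VERSION = '2.0', HEADER_ROW = TRUE
) AS c
    ON s.customer_id = c.customer_id;
```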
Intermediate Challenges¶
- Query Parquet files instead of CSV
- Create partitioned folder structure
- Build external tables
- Use CETAS to save query results
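The CETAS challenge can be sketched as follows, reusing the SalesDB database, SalesData data source, and vw_Sales view from Step 4. CETAS (CREATE EXTERNAL TABLE AS SELECT) runs a query and writes the result back to storage; the external file format name and output folder here are hypothetical:

```sql
-- Sketch: save an aggregation back to storage with CETAS.
USE SalesDB;
GO

-- CETAS needs a file format; Parquet is the usual choice.
CREATE EXTERNAL FILE FORMAT ParquetFormat
WITH (FORMAT_TYPE = PARQUET);
GO

CREATE EXTERNAL TABLE SalesByCategory
WITH (
    LOCATION = 'aggregated/',      -- folder created under the SalesData location
    DATA_SOURCE = SalesData,
    FILE_FORMAT = ParquetFormat
)
AS
SELECT category, SUM(amount) AS total_sales
FROM vw_Sales
GROUP BY category;
GO
```

After this runs, SalesByCategory is queryable like a table, and the Parquet files it produced are visible in the Data hub under the aggregated/ folder.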
Advanced Topics¶
- Create Spark Pool for big data processing
- Build data pipeline with Data Factory
- Implement row-level security
- Integrate with Azure Purview
= Additional Resources¶
Next Tutorials¶
- Synapse Complete Tutorial - Deep dive into all features
- Data Engineer Learning Path
- Serverless SQL Best Practices
= Cleanup¶
To avoid Azure charges:
Delete Resource Group¶
From the Azure CLI:
az group delete --name rg-synapse-quickstart
Or in Azure Portal:
- Navigate to Resource Groups
- Select "rg-synapse-quickstart"
- Click "Delete resource group"
- Type resource group name to confirm
- Click "Delete"
Note: This deletes the workspace, storage account, and all data!
= Congratulations!¶
You've successfully:
- Created an Azure Synapse Analytics workspace
- Uploaded data to Data Lake Storage
- Queried data with Serverless SQL Pool
- Created database views for easier analysis
- Visualized query results
You're ready to build more complex analytics solutions with Azure Synapse!
Ready for more? Try the Complete Synapse Tutorial
Last Updated: January 2025 | Tutorial Version: 1.0 | Tested with: Azure Synapse Analytics (December 2024 release)