Skip to content

= Azure Synapse Analytics Quickstart

< Home | = Documentation | < Tutorials | < Beginner | = Synapse

Status Level Duration

Get started with Azure Synapse Analytics in under an hour. Create your first workspace, load data, and run your first analytics query.

< Learning Objectives

After completing this quickstart, you will be able to:

  • Understand what Azure Synapse Analytics is and its components
  • Create and configure a Synapse workspace
  • Load data into Data Lake Storage
  • Query data using Serverless SQL Pool
  • Visualize results in Azure Portal

= Prerequisites

=

What is Azure Synapse Analytics?

Azure Synapse is a unified analytics platform that brings together:

  • Data Integration: Pipelines for ETL/ELT
  • Data Warehousing: Dedicated and Serverless SQL pools
  • Big Data Analytics: Apache Spark pools for distributed processing
  • Data Exploration: Integrated notebooks and SQL scripts
  • Visualization: Power BI integration

Key Components

graph LR
    A[Data Sources] --> B[Azure Synapse Workspace]
    B --> C[Serverless SQL Pool]
    B --> D[Dedicated SQL Pool]
    B --> E[Spark Pools]
    B --> F[Pipelines]
    C --> G[Query Results]
    D --> G
    E --> G

= Step 1: Create Synapse Workspace

Using Azure Portal

  1. Navigate to Azure Portal
  2. Go to portal.azure.com
  3. Click "Create a resource"
  4. Search for "Azure Synapse Analytics"
  5. Click "Create"

  6. Configure Basics

  7. Subscription: Select your subscription
  8. Resource Group: Create new "rg-synapse-quickstart"
  9. Workspace Name: "synapse-quickstart-[yourname]" (must be globally unique)
  10. Region: Select nearest region
  11. Account Name: Create new storage account (auto-generated)
  12. File System Name: "data"

  13. Configure Security

  14. SQL Administrator Login: "sqladmin"
  15. Password: Create strong password (save it!)
  16. Allow connections from: Select "All networks" for quickstart

  17. Review and Create

  18. Click "Review + create"
  19. Click "Create"
  20. Wait 5-10 minutes for deployment

Verify Deployment

Once deployment completes:

  1. Click "Go to resource"
  2. Click "Open Synapse Studio"
  3. You should see the Synapse Studio interface

Tip: Bookmark the Synapse Studio URL for easy access

= Step 2: Prepare Sample Data

We'll use a sample CSV file with sales data.

Create Sample Data File

Create a file named sales_data.csv:

order_id,product,category,amount,order_date,customer_id
1001,Laptop,Electronics,1299.99,2024-01-15,C101
1002,Chair,Furniture,249.99,2024-01-15,C102
1003,Monitor,Electronics,399.99,2024-01-16,C101
1004,Desk,Furniture,549.99,2024-01-16,C103
1005,Keyboard,Electronics,89.99,2024-01-17,C102
1006,Mouse,Electronics,39.99,2024-01-17,C104
1007,Lamp,Furniture,79.99,2024-01-18,C101
1008,Tablet,Electronics,599.99,2024-01-18,C105
1009,Bookshelf,Furniture,299.99,2024-01-19,C103
1010,Phone,Electronics,899.99,2024-01-19,C102

Upload to Data Lake Storage

  1. Open Synapse Studio
  2. Click "Data" hub (database icon on left)
  3. Select "Linked" tab
  4. Expand "Azure Data Lake Storage Gen2"
  5. Expand your storage account
  6. Expand "data" container

  7. Create Folder

  8. Right-click on "data"
  9. Select "New folder"
  10. Name it "sales"

  11. Upload File

  12. Right-click "sales" folder
  13. Select "Upload files"
  14. Choose sales_data.csv
  15. Click "Upload"

=

Step 3: Query Data with Serverless SQL

Serverless SQL Pool lets you query data directly in storage without loading it first.

Create SQL Script

  1. Open Develop Hub
  2. Click "Develop" hub (document icon on left)
  3. Click "+" to add new resource
  4. Select "SQL script"

  5. Write Query

-- Query sales data from CSV file
SELECT
    *
FROM OPENROWSET(
    BULK 'https://[your-storage-account].dfs.core.windows.net/data/sales/sales_data.csv',
    FORMAT = 'CSV',
    PARSER_VERSION = '2.0',
    HEADER_ROW = TRUE
) AS sales_data;
  1. Update Storage Account Name
  2. Replace [your-storage-account] with your actual storage account name
  3. Find it in the Data hub under Linked > Azure Data Lake Storage Gen2

  4. Run Query

  5. Click "Run" button
  6. View results in the output pane

Analyze Sales Data

Try these example queries:

Total Sales by Category:

SELECT
    category,
    COUNT(*) AS order_count,
    SUM(CAST(amount AS DECIMAL(10,2))) AS total_sales,
    AVG(CAST(amount AS DECIMAL(10,2))) AS avg_order_value
FROM OPENROWSET(
    BULK 'https://[your-storage-account].dfs.core.windows.net/data/sales/sales_data.csv',
    FORMAT = 'CSV',
    PARSER_VERSION = '2.0',
    HEADER_ROW = TRUE
) AS sales_data
GROUP BY category
ORDER BY total_sales DESC;

Top Customers by Spend:

SELECT
    customer_id,
    COUNT(*) AS order_count,
    SUM(CAST(amount AS DECIMAL(10,2))) AS total_spent
FROM OPENROWSET(
    BULK 'https://[your-storage-account].dfs.core.windows.net/data/sales/sales_data.csv',
    FORMAT = 'CSV',
    PARSER_VERSION = '2.0',
    HEADER_ROW = TRUE
) AS sales_data
GROUP BY customer_id
ORDER BY total_spent DESC;

Daily Sales Trend:

SELECT
    CAST(order_date AS DATE) AS sale_date,
    COUNT(*) AS orders,
    SUM(CAST(amount AS DECIMAL(10,2))) AS daily_sales
FROM OPENROWSET(
    BULK 'https://[your-storage-account].dfs.core.windows.net/data/sales/sales_data.csv',
    FORMAT = 'CSV',
    PARSER_VERSION = '2.0',
    HEADER_ROW = TRUE
) AS sales_data
GROUP BY CAST(order_date AS DATE)
ORDER BY sale_date;

= Step 4: Create a Database View

Make queries easier by creating a view.

-- Create database for views
CREATE DATABASE SalesDB;
GO

-- Use the new database
USE SalesDB;
GO

-- Create external data source
CREATE EXTERNAL DATA SOURCE SalesData
WITH (
    LOCATION = 'https://[your-storage-account].dfs.core.windows.net/data/sales/'
);
GO

-- Create view
CREATE VIEW vw_Sales AS
SELECT
    CAST(order_id AS INT) AS order_id,
    product,
    category,
    CAST(amount AS DECIMAL(10,2)) AS amount,
    CAST(order_date AS DATE) AS order_date,
    customer_id
FROM OPENROWSET(
    BULK 'sales_data.csv',
    DATA_SOURCE = 'SalesData',
    FORMAT = 'CSV',
    PARSER_VERSION = '2.0',
    HEADER_ROW = TRUE
) AS sales_data;
GO

Now query the view:

-- Much simpler query!
SELECT
    category,
    SUM(amount) AS total_sales
FROM vw_Sales
GROUP BY category
ORDER BY total_sales DESC;

= Step 5: Visualize Results

Built-in Chart Visualization

  1. Run a query
  2. Click "Chart" tab in results pane
  3. Configure chart:
  4. Chart Type: Column chart
  5. Category Column: category
  6. Legend: None
  7. Value: total_sales

Export Results

  • CSV: Click "Export" > "CSV"
  • Excel: Click "Export" > "Excel"
  • Power BI: Click "Chart" > "Export to Power BI"

= Understanding Serverless SQL

Key Benefits

 No Infrastructure Management

  • No servers to provision or manage
  • Automatically scales based on query complexity

 Pay-Per-Query

  • Only pay for data processed (per TB scanned)
  • No idle costs when not querying

 Query Data in Place

  • No need to load/copy data
  • Query files directly in storage

 Support Multiple Formats

  • CSV, Parquet, JSON, Delta Lake
  • Compressed files (gzip, snappy)

Best Practices

  1. Use Parquet for Large Datasets
  2. More efficient than CSV
  3. Built-in compression
  4. Columnar format (better for analytics)

  5. Partition Your Data

  6. Organize by date: /year=2024/month=01/day=15/
  7. Use folder partition elimination in queries

  8. Create Statistics

  9. Improves query performance
  10. Auto-creates for frequently queried columns

=' Troubleshooting

Common Issues

Error: "External table access failed"

  •  Check storage account name and container
  •  Verify file exists in specified path
  •  Ensure Synapse workspace has storage access

Error: "Permission denied"

  •  Grant "Storage Blob Data Contributor" role to Synapse workspace MSI
  •  Go to Storage Account > Access Control (IAM) > Add role assignment

Query Returns No Results

  •  Verify CSV header row matches query
  •  Check file path and name spelling
  •  Ensure file uploaded successfully

< Next Steps

Beginner Practice

  • Upload your own CSV files
  • Create additional views
  • Join multiple files in a query
  • Export results to Power BI

Intermediate Challenges

  • Query Parquet files instead of CSV
  • Create partitioned folder structure
  • Build external tables
  • Use CETAS to save query results

Advanced Topics

  • Create Spark Pool for big data processing
  • Build data pipeline with Data Factory
  • Implement row-level security
  • Integrate with Azure Purview

= Additional Resources

Documentation

Next Tutorials

> Cleanup

To avoid Azure charges:

Delete Resource Group

# Using Azure CLI
az group delete --name rg-synapse-quickstart --yes --no-wait

Or in Azure Portal:

  1. Navigate to Resource Groups
  2. Select "rg-synapse-quickstart"
  3. Click "Delete resource group"
  4. Type resource group name to confirm
  5. Click "Delete"

Note: This deletes the workspace, storage account, and all data!

< Congratulations!

You've successfully:

 Created Azure Synapse Analytics workspace  Uploaded data to Data Lake Storage  Queried data with Serverless SQL Pool  Created database views for easier analysis  Visualized query results

You're ready to build more complex analytics solutions with Azure Synapse!


Ready for more? Try the Complete Synapse Tutorial


Last Updated: January 2025 Tutorial Version: 1.0 Tested with: Azure Synapse Analytics (December 2024 release)