🚀 Tutorial 1: Environment Setup and Prerequisites¶
Set up your Azure environment and local development tools for the complete Stream Analytics tutorial series. This foundation ensures smooth execution of all real-time analytics tutorials.
🎯 Learning Objectives¶
After completing this tutorial, you will be able to:
- ✅ Configure Azure subscription with necessary permissions and quotas
- ✅ Create Event Hubs namespace for data ingestion
- ✅ Set up Stream Analytics workspace for real-time processing
- ✅ Install development tools for testing and monitoring
- ✅ Validate environment setup using test data flows
⏱️ Time Estimate: 30 minutes¶
- Azure Setup: 15 minutes
- Event Hubs Configuration: 10 minutes
- Validation & Testing: 5 minutes
📋 Prerequisites¶
Required Access¶
- Azure Subscription with Contributor or Owner role
- Sufficient credits or payment method configured (~$100 recommended for full series)
- Administrative access to local machine for tool installation
Basic Knowledge¶
- SQL fundamentals - SELECT, WHERE, GROUP BY operations
- Azure fundamentals - Understanding of resource groups and subscriptions
- JSON format - For event data structure
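The events used throughout the series are small JSON documents. As a concrete reference, here is the sensor-event shape used by the test script in Step 4 (field names match that script exactly):

```python
import json

# A sample sensor event matching the shape sent in Step 4.1
event = {
    "deviceId": "test-device-001",
    "temperature": 72.5,
    "humidity": 45.2,
    "timestamp": "2025-01-15T12:00:00Z",
}

# Event Hubs receives each event as a UTF-8 encoded JSON string
payload = json.dumps(event)
print(payload)
```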
🛠️ Step 1: Azure Subscription Setup¶
1.1 Verify Subscription Access¶
First, ensure your Azure subscription has the necessary permissions:
# Login to Azure (will open browser for authentication)
az login
# List available subscriptions
az account list --output table
# Set the subscription you want to use for tutorials
az account set --subscription "your-subscription-id-here"
# Verify current subscription
az account show --output table
Expected Output:
EnvironmentName HomeTenantId Id Name State TenantId
----------------- -------------- ---------- ----------------- ------- -----------
AzureCloud xxxx-xxxx-... yyyy-yyyy-... Your Subscription Enabled xxxx-xxxx-...
1.2 Enable Required Resource Providers¶
Register necessary Azure resource providers:
# Register required providers
$providers = @(
'Microsoft.EventHub',
'Microsoft.StreamAnalytics',
'Microsoft.Storage',
'Microsoft.Sql',
'Microsoft.Web'
)
foreach ($provider in $providers) {
Write-Host "Registering $provider..."
az provider register --namespace $provider --wait
}
# Verify registration status
az provider list --query "[?namespace=='Microsoft.EventHub' || namespace=='Microsoft.StreamAnalytics'].{Provider:namespace, Status:registrationState}" --output table
Expected Output:
Provider Status
-------------------------- ----------
Microsoft.EventHub Registered
Microsoft.StreamAnalytics Registered
1.3 Define Resource Naming Convention¶
Establish consistent naming for all tutorial resources:
# Set base variables for all tutorials
$location = "eastus"
$prefix = "streamtutorial"
$suffix = Get-Random -Minimum 1000 -Maximum 9999
# Define resource names
$resourceGroupName = "$prefix-rg-$suffix"
$eventHubNamespace = "$prefix-eh-$suffix"
$eventHubName = "sensordata"
$storageAccountName = "$($prefix)sa$suffix"
$streamAnalyticsJob = "$prefix-asa-$suffix"
# Save to environment variables for later tutorials
# Note: "User" scope persists across sessions but does NOT update the
# current process, so mirror each value into the session with $env: too
[Environment]::SetEnvironmentVariable("STREAM_RG", $resourceGroupName, "User")
[Environment]::SetEnvironmentVariable("STREAM_EH_NAMESPACE", $eventHubNamespace, "User")
[Environment]::SetEnvironmentVariable("STREAM_EH_NAME", $eventHubName, "User")
[Environment]::SetEnvironmentVariable("STREAM_SA", $storageAccountName, "User")
[Environment]::SetEnvironmentVariable("STREAM_JOB", $streamAnalyticsJob, "User")
[Environment]::SetEnvironmentVariable("STREAM_LOCATION", $location, "User")
$env:STREAM_RG = $resourceGroupName
$env:STREAM_EH_NAMESPACE = $eventHubNamespace
$env:STREAM_EH_NAME = $eventHubName
$env:STREAM_SA = $storageAccountName
$env:STREAM_JOB = $streamAnalyticsJob
$env:STREAM_LOCATION = $location
Write-Host "Resource names configured and saved to environment variables"
🌐 Step 2: Create Core Azure Resources¶
2.1 Create Resource Group¶
Create a dedicated resource group for all streaming resources:
# Create resource group
az group create `
--name $resourceGroupName `
--location $location `
--tags "Environment=Tutorial" "Purpose=StreamAnalytics" "CostCenter=Training"
# Verify creation
az group show --name $resourceGroupName --output table
Expected Output:
Name Location Status
---------------------- ---------- ---------
streamtutorial-rg-1234 eastus Succeeded
2.2 Create Event Hubs Namespace¶
Set up Event Hubs for streaming data ingestion:
# Create Event Hubs namespace (Standard tier for production features)
az eventhubs namespace create `
--name $eventHubNamespace `
--resource-group $resourceGroupName `
--location $location `
--sku Standard `
--capacity 1 `
--enable-auto-inflate false `
--tags "Purpose=DataIngestion"
# Create Event Hub for sensor data
az eventhubs eventhub create `
--name $eventHubName `
--namespace-name $eventHubNamespace `
--resource-group $resourceGroupName `
--partition-count 4 `
--message-retention 1
# Verify Event Hub creation
az eventhubs eventhub show `
--name $eventHubName `
--namespace-name $eventHubNamespace `
--resource-group $resourceGroupName `
--query "{Name:name, Partitions:partitionCount, Retention:messageRetentionInDays}" `
--output table
Expected Output:
Name        Partitions    Retention
----------  ------------  -----------
sensordata  4             1
2.3 Create Shared Access Policies¶
Set up authentication for producers and consumers:
# Create policy for data producers (send only)
az eventhubs eventhub authorization-rule create `
--name SendPolicy `
--eventhub-name $eventHubName `
--namespace-name $eventHubNamespace `
--resource-group $resourceGroupName `
--rights Send
# Create policy for Stream Analytics (listen only)
az eventhubs eventhub authorization-rule create `
--name ListenPolicy `
--eventhub-name $eventHubName `
--namespace-name $eventHubNamespace `
--resource-group $resourceGroupName `
--rights Listen
# Get connection strings for later use
$sendConnectionString = az eventhubs eventhub authorization-rule keys list `
--name SendPolicy `
--eventhub-name $eventHubName `
--namespace-name $eventHubNamespace `
--resource-group $resourceGroupName `
--query primaryConnectionString `
--output tsv
$listenConnectionString = az eventhubs eventhub authorization-rule keys list `
--name ListenPolicy `
--eventhub-name $eventHubName `
--namespace-name $eventHubNamespace `
--resource-group $resourceGroupName `
--query primaryConnectionString `
--output tsv
# Save connection strings securely
# "User" scope persists for future sessions; also set them for this session
[Environment]::SetEnvironmentVariable("STREAM_EH_SEND_CONN", $sendConnectionString, "User")
[Environment]::SetEnvironmentVariable("STREAM_EH_LISTEN_CONN", $listenConnectionString, "User")
$env:STREAM_EH_SEND_CONN = $sendConnectionString
$env:STREAM_EH_LISTEN_CONN = $listenConnectionString
Write-Host "Connection strings saved to environment variables"
2.4 Create Storage Account for Outputs¶
Set up storage for archival and reference data:
# Create storage account
az storage account create `
--name $storageAccountName `
--resource-group $resourceGroupName `
--location $location `
--sku Standard_LRS `
--kind StorageV2 `
--access-tier Hot `
--enable-hierarchical-namespace false
# Create container for raw data archive
az storage container create `
--name "rawdata" `
--account-name $storageAccountName `
--public-access off
# Create container for processed data
az storage container create `
--name "processeddata" `
--account-name $storageAccountName `
--public-access off
# Get storage account key
$storageKey = az storage account keys list `
--account-name $storageAccountName `
--resource-group $resourceGroupName `
--query "[0].value" `
--output tsv
[Environment]::SetEnvironmentVariable("STREAM_SA_KEY", $storageKey, "User")
$env:STREAM_SA_KEY = $storageKey  # also expose to the current session
# Verify containers
az storage container list `
--account-name $storageAccountName `
--account-key $storageKey `
--query "[].name" `
--output table
Expected Output:
Result
-------------
processeddata
rawdata
🔧 Step 3: Install Development Tools¶
3.1 Azure CLI and Extensions¶
Ensure latest Azure CLI is installed:
# Check Azure CLI version (should be 2.50.0 or higher)
az --version
# Install/update Stream Analytics extension
az extension add --name stream-analytics --upgrade
# Verify extension installation
az extension list --query "[?name=='stream-analytics'].{Name:name, Version:version}" --output table
3.2 Install Azure Storage Explorer¶
Download and install for visual data inspection:
- Download from: Azure Storage Explorer
- Install following platform-specific instructions
- Connect using storage account credentials
3.3 Install Visual Studio Code with Extensions¶
Set up development environment:
# Install VS Code (if not already installed)
# Download from: https://code.visualstudio.com/
# Install recommended extensions via command line
code --install-extension ms-azuretools.vscode-azurefunctions
code --install-extension ms-vscode.azure-account
code --install-extension ms-azuretools.vscode-azurestorage
3.4 Install Python and Required Libraries¶
For data generator scripts (used in Tutorial 02):
# Verify Python 3.8+ is installed
python --version
# Create virtual environment for tutorial scripts
python -m venv stream-tutorial-env
# Activate virtual environment
# Windows PowerShell:
.\stream-tutorial-env\Scripts\Activate.ps1
# Install required packages
pip install azure-eventhub==5.11.4
pip install faker==19.12.0
pip install python-dotenv==1.0.0
# Verify installations
pip list | Select-String "azure-eventhub|faker|python-dotenv"
Expected Output:
azure-eventhub     5.11.4
faker              19.12.0
python-dotenv      1.0.0
✅ Step 4: Validate Environment Setup¶
4.1 Create Test Event Hub Message¶
Verify Event Hub can receive data:
# Create test script to send message
$testScript = @'
from azure.eventhub import EventHubProducerClient, EventData
import os
import json
from datetime import datetime
connection_string = os.environ.get("STREAM_EH_SEND_CONN")
eventhub_name = os.environ.get("STREAM_EH_NAME")
if not connection_string or not eventhub_name:
    raise SystemExit("STREAM_EH_SEND_CONN / STREAM_EH_NAME not set for this session - open a new shell or set them with $env: first")
producer = EventHubProducerClient.from_connection_string(
    conn_str=connection_string,
    eventhub_name=eventhub_name
)
test_event = {
"deviceId": "test-device-001",
"temperature": 72.5,
"humidity": 45.2,
"timestamp": datetime.utcnow().isoformat()
}
event_data_batch = producer.create_batch()
event_data_batch.add(EventData(json.dumps(test_event)))
producer.send_batch(event_data_batch)
producer.close()
print("Test event sent successfully!")
'@
# Save and run test script
$testScript | Out-File -FilePath "test_eventhub.py" -Encoding UTF8
python test_eventhub.py
Expected Output:
Test event sent successfully!
4.2 Verify Event Hub Metrics¶
Check that message was received:
# Get Event Hub metrics
az monitor metrics list `
--resource "/subscriptions/$(az account show --query id -o tsv)/resourceGroups/$resourceGroupName/providers/Microsoft.EventHub/namespaces/$eventHubNamespace" `
--metric IncomingMessages `
--start-time (Get-Date).ToUniversalTime().AddMinutes(-10).ToString("yyyy-MM-ddTHH:mm:ss") `
--end-time (Get-Date).ToUniversalTime().ToString("yyyy-MM-ddTHH:mm:ss") `
--interval PT1M `
--query "value[0].timeseries[0].data[-5:]" `
--output table
4.3 Environment Validation Checklist¶
Run through this checklist to ensure setup is complete:
# Create validation script
$validationScript = @'
Write-Host "`n=== Stream Analytics Environment Validation ===" -ForegroundColor Cyan
# Check environment variables
$requiredEnvVars = @(
"STREAM_RG",
"STREAM_EH_NAMESPACE",
"STREAM_EH_NAME",
"STREAM_SA",
"STREAM_LOCATION",
"STREAM_EH_SEND_CONN",
"STREAM_EH_LISTEN_CONN",
"STREAM_SA_KEY"
)
$missingVars = @()
foreach ($var in $requiredEnvVars) {
$value = [Environment]::GetEnvironmentVariable($var, "User")
if ($value) {
Write-Host "[OK] $var is set" -ForegroundColor Green
} else {
Write-Host "[FAIL] $var is missing" -ForegroundColor Red
$missingVars += $var
}
}
# Check Azure resources
Write-Host "`nChecking Azure resources..." -ForegroundColor Cyan
$rgExists = az group exists --name $env:STREAM_RG
Write-Host "[$(if($rgExists -eq 'true'){'OK'}else{'FAIL'})] Resource Group exists" -ForegroundColor $(if($rgExists -eq 'true'){'Green'}else{'Red'})
$ehNamespace = az eventhubs namespace show --name $env:STREAM_EH_NAMESPACE --resource-group $env:STREAM_RG --query name -o tsv 2>$null
Write-Host "[$(if($ehNamespace){'OK'}else{'FAIL'})] Event Hubs Namespace exists" -ForegroundColor $(if($ehNamespace){'Green'}else{'Red'})
$eh = az eventhubs eventhub show --name $env:STREAM_EH_NAME --namespace-name $env:STREAM_EH_NAMESPACE --resource-group $env:STREAM_RG --query name -o tsv 2>$null
Write-Host "[$(if($eh){'OK'}else{'FAIL'})] Event Hub exists" -ForegroundColor $(if($eh){'Green'}else{'Red'})
$sa = az storage account show --name $env:STREAM_SA --resource-group $env:STREAM_RG --query name -o tsv 2>$null
Write-Host "[$(if($sa){'OK'}else{'FAIL'})] Storage Account exists" -ForegroundColor $(if($sa){'Green'}else{'Red'})
# Summary
Write-Host "`n=== Validation Summary ===" -ForegroundColor Cyan
if ($missingVars.Count -eq 0 -and $rgExists -eq 'true' -and $ehNamespace -and $eh -and $sa) {
Write-Host "Environment setup is COMPLETE! Ready for Tutorial 02." -ForegroundColor Green
} else {
Write-Host "Environment setup has ISSUES. Please review errors above." -ForegroundColor Red
}
'@
# Save and run validation
$validationScript | Out-File -FilePath "validate_environment.ps1" -Encoding UTF8
.\validate_environment.ps1
💰 Cost Considerations¶
Expected Monthly Costs (Tutorial Series)¶
| Service | Configuration | Estimated Cost |
|---|---|---|
| Event Hubs Standard | 1M messages/month | $10-20 |
| Storage Account | 100GB Standard LRS | $2-5 |
| Stream Analytics | 1 Streaming Unit | $0 while stopped (billed hourly only while running) |
| Azure SQL Database | Basic tier | $5 (created in later tutorials) |
Total Tutorial Cost: ~$20-30/month if resources kept running
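The total above can be sanity-checked with quick arithmetic. The unit prices below are illustrative assumptions taken from the table's ranges, not current Azure rates; verify against the Azure pricing calculator for your region:

```python
# Illustrative monthly cost estimate -- unit prices are ASSUMPTIONS
# drawn from the table above, not official Azure pricing.
event_hubs_standard = 15.0   # assumed: ~1 TU, ~1M messages/month
storage_100gb_lrs = 3.5      # assumed: ~100 GB Standard LRS
stream_analytics = 0.0       # job stopped when not in use
sql_basic = 5.0              # assumed: Basic tier

total = event_hubs_standard + storage_100gb_lrs + stream_analytics + sql_basic
print(f"Estimated monthly total: ${total:.2f}")  # Estimated monthly total: $23.50
```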
Cost Optimization Tips¶
# Stop Event Hub when not in use (for tutorials only)
# Note: This is NOT recommended for production scenarios
# Delete test resources when series is complete
az group delete --name $resourceGroupName --yes --no-wait
💡 Cost Alert: Set up budget alerts to monitor spending during tutorials
🎓 Key Concepts Learned¶
Event Hubs Architecture¶
- Namespace: Logical container for multiple Event Hubs
- Event Hub: Individual streaming endpoint with partitions
- Partitions: Enable parallel processing (4 partitions = 4 parallel consumers)
- Consumer Groups: Allow multiple applications to read same stream independently
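The partition-key guarantee behind the list above can be modeled in a few lines. This is a simplified sketch: the real Event Hubs service uses its own internal hash, but the contract is the same, namely that equal keys always land on the same partition, which preserves per-device ordering:

```python
import hashlib

PARTITION_COUNT = 4  # matches the Event Hub created in Step 2.2

def assign_partition(partition_key: str, partition_count: int = PARTITION_COUNT) -> int:
    """Simplified model of partition-key routing: a stable hash of the
    key, modulo the partition count. Equal keys -> equal partition."""
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % partition_count

# Events from the same device always map to the same partition
print(f"device-001 -> partition {assign_partition('device-001')}")
```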
Authentication & Security¶
- Shared Access Policies: Fine-grained access control (Send, Listen, Manage)
- Connection Strings: Contain endpoint, policy name, and key
- Environment Variables: Secure way to store credentials locally
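A connection string is just semicolon-separated `Key=Value` pairs, so its three parts (endpoint, policy name, key) are easy to inspect. A small sketch, using a placeholder string rather than a real credential:

```python
# Split an Event Hubs connection string into its parts. partition("=")
# splits at the FIRST "=" only, which matters because base64 keys can
# themselves end in "=".
def parse_connection_string(conn_str: str) -> dict:
    parts = {}
    for segment in conn_str.split(";"):
        if not segment:
            continue
        key, _, value = segment.partition("=")
        parts[key] = value
    return parts

# Placeholder values -- never commit a real SharedAccessKey
example = (
    "Endpoint=sb://streamtutorial-eh-1234.servicebus.windows.net/;"
    "SharedAccessKeyName=SendPolicy;"
    "SharedAccessKey=placeholder-key"
)
parsed = parse_connection_string(example)
print(parsed["SharedAccessKeyName"])  # SendPolicy
```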
Naming Conventions¶
- Use lowercase with hyphens for resource names
- Include resource type abbreviations (rg, eh, sa, asa)
- Add unique suffix to avoid naming conflicts
- Tag resources for cost tracking and organization
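The convention above can be captured as a small helper, mirroring the PowerShell variables from Step 1.3. The special case is storage accounts, whose names must be 3-24 lowercase alphanumerics with no hyphens:

```python
import random

def make_name(prefix: str, resource_type: str, suffix: int) -> str:
    """Build a resource name: lowercase, hyphen-separated, with a type
    abbreviation (rg, eh, sa, asa) and a unique numeric suffix."""
    if resource_type == "sa":  # storage accounts disallow hyphens
        return f"{prefix}sa{suffix}"
    return f"{prefix}-{resource_type}-{suffix}"

suffix = random.randint(1000, 9999)
print(make_name("streamtutorial", "rg", suffix))  # e.g. streamtutorial-rg-4821
print(make_name("streamtutorial", "sa", suffix))  # e.g. streamtutorialsa4821
```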
🚀 Next Steps¶
Your environment is now ready for real-time streaming analytics! Continue to:
Tutorial 02: Data Generator Setup →
In the next tutorial, you'll:
- Create realistic IoT sensor data simulators
- Implement different data generation patterns
- Configure data velocity and variety
- Test Event Hub ingestion at scale
📚 Additional Resources¶
- Azure Event Hubs Documentation
- Azure Stream Analytics Overview
- Event Hubs Quotas and Limits
- Stream Analytics Pricing
🔧 Troubleshooting¶
Issue: Provider Registration Fails¶
Symptoms: Error message "The subscription is not registered to use namespace 'Microsoft.EventHub'"
Solution:
# Register provider manually
az provider register --namespace Microsoft.EventHub --wait
az provider register --namespace Microsoft.StreamAnalytics --wait
# Verify registration
az provider show --namespace Microsoft.EventHub --query "registrationState"
Issue: Resource Name Already Exists¶
Symptoms: "Storage account name is already taken" or similar
Solution:
# Generate new unique suffix
$suffix = Get-Random -Minimum 10000 -Maximum 99999
$storageAccountName = "streamtutorialsa$suffix"
# Update environment variable
[Environment]::SetEnvironmentVariable("STREAM_SA", $storageAccountName, "User")
Issue: Python Package Installation Fails¶
Symptoms: pip install errors or import failures
Solution:
# Upgrade pip first
python -m pip install --upgrade pip
# Install packages with verbose output
pip install azure-eventhub==5.11.4 --verbose
# If SSL errors occur, try:
pip install --trusted-host pypi.org --trusted-host files.pythonhosted.org azure-eventhub
💬 Feedback¶
Was this tutorial helpful? Let us know:
- ✅ Completed successfully - Continue to Tutorial 02
- ⚠️ Had issues - Report a problem
- 💡 Have suggestions - Share feedback
Tutorial Progress: 1 of 11 complete | Next: Data Generator Setup
Last Updated: January 2025