⚡ Integration Runtime Configuration¶
Configure Azure Integration Runtime and Self-Hosted Integration Runtime for secure data movement across cloud and on-premises environments.
📋 Table of Contents¶
- Integration Runtime Overview
- Azure Integration Runtime
- Self-Hosted Integration Runtime
- Azure-SSIS Integration Runtime
- Best Practices
- Next Steps
🌟 Integration Runtime Overview¶
The Integration Runtime (IR) is the compute infrastructure used by Azure Data Factory to provide data integration capabilities across different network environments.
IR Types Comparison¶
| Feature | Azure IR | Self-Hosted IR | Azure-SSIS IR |
|---|---|---|---|
| Location | Azure cloud | On-premises/VM | Azure cloud |
| Management | Fully managed | Self-managed | Partially managed |
| Network | Public internet | Private network | Public/Private |
| Use Case | Cloud data movement | On-premises access | SSIS package execution |
| Scaling | Auto-scale | Manual scale | Manual scale |
| Cost | Pay per use | VM costs + licensing | Compute + licensing |
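The "Use Case" row above can be read as a simple selection rule. The sketch below is one possible encoding of it; `Select-IntegrationRuntimeType` and its two switches are hypothetical names invented for illustration and are not part of any Az module.

```powershell
# Hypothetical helper: maps two workload traits to the IR type suggested
# by the "Use Case" row of the comparison table.
function Select-IntegrationRuntimeType {
    param(
        [switch]$SourceInPrivateNetwork,  # data source is on-premises or in a private network
        [switch]$RunsSsisPackages         # workload executes existing SSIS packages
    )
    if ($SourceInPrivateNetwork) { return 'Self-Hosted IR' }
    if ($RunsSsisPackages)       { return 'Azure-SSIS IR' }
    return 'Azure IR'                     # default: cloud-to-cloud data movement
}

# Example: an on-premises SQL Server source
Select-IntegrationRuntimeType -SourceInPrivateNetwork
```

Real workloads often need more than one IR type in the same factory, so treat the result as a starting point rather than a rule.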
When to Use Each IR Type¶
graph TD
A[Data Source Location?] --> B{Cloud Only?}
B -->|Yes| C[Azure IR]
B -->|No| D{On-Premises?}
D -->|Yes| E[Self-Hosted IR]
D -->|No| F{SSIS Packages?}
F -->|Yes| G[Azure-SSIS IR]
F -->|No| C
☁️ Azure Integration Runtime¶
Azure IR is a fully managed, serverless compute infrastructure.
Create Azure Integration Runtime¶
Using Azure Portal¶
- Navigate to ADF Studio
- Click Manage (toolbox icon)
- Click Integration runtimes
- Click + New
- Select Azure, Self-Hosted and click Continue
- Select Azure and click Continue
- Configure settings:
Azure IR Configuration:
├── Name: AzureIR-EastUS2
├── Description: Auto-resolve Azure IR for East US 2
├── Region: Auto Resolve (recommended)
├── Virtual Network: None (default)
└── Data Flow Runtime:
    ├── Compute type: General purpose
    ├── Core count: 4 cores
    └── Time to live: 10 minutes
Using PowerShell¶
# Create Azure Integration Runtime
$ResourceGroupName = "rg-adf-tutorial-dev"
$DataFactoryName = "adf-tutorial-dev-001"
$IntegrationRuntimeName = "AzureIR-AutoResolve"
Set-AzDataFactoryV2IntegrationRuntime `
-ResourceGroupName $ResourceGroupName `
-DataFactoryName $DataFactoryName `
-Name $IntegrationRuntimeName `
-Type Managed `
-Location "AutoResolve"
# Verify creation
Get-AzDataFactoryV2IntegrationRuntime `
-ResourceGroupName $ResourceGroupName `
-DataFactoryName $DataFactoryName `
-Name $IntegrationRuntimeName
Configure Region-Specific Azure IR¶
To keep data movement in a known region close to your data stores, create region-specific IRs:
# Create East US 2 Integration Runtime
Set-AzDataFactoryV2IntegrationRuntime `
-ResourceGroupName $ResourceGroupName `
-DataFactoryName $DataFactoryName `
-Name "AzureIR-EastUS2" `
-Type Managed `
-Location "East US 2"
# Create West Europe Integration Runtime
Set-AzDataFactoryV2IntegrationRuntime `
-ResourceGroupName $ResourceGroupName `
-DataFactoryName $DataFactoryName `
-Name "AzureIR-WestEurope" `
-Type Managed `
-Location "West Europe"
Configure Data Flow Runtime Settings¶
{
  "name": "AzureIR-DataFlow-Optimized",
  "properties": {
    "type": "Managed",
    "typeProperties": {
      "computeProperties": {
        "location": "AutoResolve",
        "dataFlowProperties": {
          "computeType": "MemoryOptimized",
          "coreCount": 8,
          "timeToLive": 10,
          "cleanup": true
        }
      }
    }
  }
}
🏢 Self-Hosted Integration Runtime¶
Self-Hosted IR allows secure access to data sources in private networks.
Architecture¶
graph LR
subgraph "Azure"
A[Data Factory]
end
subgraph "Corporate Network"
B[Self-Hosted IR]
C[On-Prem SQL Server]
D[File Server]
E[Oracle DB]
end
A <--> B
B --> C
B --> D
B --> E
Create Self-Hosted Integration Runtime¶
Step 1: Create IR in Azure Portal¶
- Navigate to ADF Studio > Manage > Integration runtimes
- Click + New
- Select Self-Hosted
- Configure:
Self-Hosted IR Configuration:
├── Name: SelfHostedIR-Corporate
├── Description: On-premises data access
└── Network environment: Private network
- Click Create
- Copy the Authentication Key (you need it to register the node in Step 2)
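The same step can be scripted instead of clicked through. The sketch below assumes the Az.DataFactory module is installed and a session is already authenticated with `Connect-AzAccount`; the wrapper name `New-SelfHostedIrKey` is hypothetical, while the two cmdlets it calls are the standard Az.DataFactory ones.

```powershell
# Sketch: create the Self-Hosted IR definition and return its first auth key.
# New-SelfHostedIrKey is a hypothetical wrapper name used for illustration.
function New-SelfHostedIrKey {
    param(
        [Parameter(Mandatory)][string]$ResourceGroupName,
        [Parameter(Mandatory)][string]$DataFactoryName,
        [string]$Name = "SelfHostedIR-Corporate"
    )

    # Create (or update) the logical Self-Hosted IR in the factory
    Set-AzDataFactoryV2IntegrationRuntime `
        -ResourceGroupName $ResourceGroupName `
        -DataFactoryName $DataFactoryName `
        -Name $Name `
        -Type SelfHosted `
        -Description "On-premises data access" `
        -Force | Out-Null

    # Fetch the authentication keys; either AuthKey1 or AuthKey2 can register a node
    $Keys = Get-AzDataFactoryV2IntegrationRuntimeKey `
        -ResourceGroupName $ResourceGroupName `
        -DataFactoryName $DataFactoryName `
        -Name $Name

    return $Keys.AuthKey1
}
```

Usage, with the resource names used elsewhere in this tutorial: `$AuthKey = New-SelfHostedIrKey -ResourceGroupName "rg-adf-tutorial-dev" -DataFactoryName "adf-tutorial-dev-001"`. Treat the key as a secret and avoid writing it to logs.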
Step 2: Install Self-Hosted IR on Server¶
System Requirements:
- Windows Server 2012 R2 or later
- .NET Framework 4.7.2 or later
- 2 GHz processor with 4 cores (recommended minimum)
- Minimum 8 GB RAM
- 80 GB available disk space
Installation Steps:
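# Download the installer manually from the page below
# (it is a Download Center details page, not a direct link to the .msi)
$DownloadUrl = "https://www.microsoft.com/en-us/download/details.aspx?id=39717"
$InstallerPath = "C:\Temp\IntegrationRuntime.msi"
# Install silently (run as Administrator); the path is quoted in case it contains spaces
Start-Process msiexec.exe -Wait -ArgumentList "/i `"$InstallerPath`" /quiet"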
# Register with authentication key
$AuthKey = "YOUR_AUTHENTICATION_KEY_FROM_PORTAL"
& "C:\Program Files\Microsoft Integration Runtime\5.0\PowerShellScript\RegisterIntegrationRuntime.ps1" `
-AuthenticationKey $AuthKey
Step 3: Verify Installation¶
# Check IR status
Get-Service -Name "DIAHostService"
# View IR configuration
$ConfigFile = "C:\Program Files\Microsoft Integration Runtime\5.0\Shared\Microsoft.DataTransfer.Gateway.Configuration.json"
Get-Content $ConfigFile | ConvertFrom-Json
Configure High Availability¶
Install multiple nodes for redundancy:
# On second server, use the same authentication key
& "C:\Program Files\Microsoft Integration Runtime\5.0\PowerShellScript\RegisterIntegrationRuntime.ps1" `
-AuthenticationKey $AuthKey
# Verify both nodes in portal
# Navigate to: ADF Studio > Manage > Integration runtimes > SelfHostedIR-Corporate
# You should see 2 nodes
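Node health can also be checked from a script rather than the portal. The sketch below is a small filter over node objects; `Get-UnhealthyIrNode` is a hypothetical helper, and the sample objects are made-up stand-ins that only mimic the `NodeName` and `Status` fields of real node data.

```powershell
# Hypothetical helper: returns the names of nodes whose Status is not "Online".
function Get-UnhealthyIrNode {
    param(
        [Parameter(Mandatory)][AllowEmptyCollection()][object[]]$Nodes
    )
    $Nodes |
        Where-Object { $_.Status -ne 'Online' } |
        ForEach-Object { $_.NodeName }
}

# Made-up sample data standing in for real node objects
$SampleNodes = @(
    [pscustomobject]@{ NodeName = 'Node_1'; Status = 'Online' },
    [pscustomobject]@{ NodeName = 'Node_2'; Status = 'Offline' }
)

# Lists only Node_2 for the sample above
Get-UnhealthyIrNode -Nodes $SampleNodes
```

Against a live factory, the node list would typically come from the `Nodes` property of `Get-AzDataFactoryV2IntegrationRuntime ... -Status`; confirm the property names against the output of your Az.DataFactory version before relying on them.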
Self-Hosted IR Performance Tuning¶
# Modify configuration file
$ConfigPath = "C:\Program Files\Microsoft Integration Runtime\5.0\Shared\Microsoft.DataTransfer.Gateway.Configuration.json"
$Config = Get-Content $ConfigPath | ConvertFrom-Json
# Increase concurrent jobs
$Config.node.maxConcurrentJobs = 8
# Raise the memory usage limit (percent, not MB)
$Config.node.maxMemoryUsagePercentage = 80
# Save configuration
$Config | ConvertTo-Json -Depth 10 | Set-Content $ConfigPath
# Restart service
Restart-Service -Name "DIAHostService"
Network Configuration¶
Open Required Ports¶
# Outbound to Azure (HTTPS)
New-NetFirewallRule -DisplayName "ADF-HTTPS-Outbound" `
-Direction Outbound `
-Protocol TCP `
-RemotePort 443 `
-Action Allow
# Service Bus (for communication)
New-NetFirewallRule -DisplayName "ADF-ServiceBus" `
-Direction Outbound `
-Protocol TCP `
-RemotePort 9350-9354 `
-Action Allow
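After opening the ports, it helps to confirm that outbound traffic actually gets through. The sketch below wraps `Test-NetConnection` (Windows only); the function name `Test-IrOutboundConnectivity` is hypothetical and the default host list is an assumption, so add the endpoints your own factory uses, such as its Service Bus namespace.

```powershell
# Sketch: test outbound TCP reachability from the Self-Hosted IR machine.
# Test-IrOutboundConnectivity is a hypothetical name; the default host list
# is an assumption and should be replaced with your factory's endpoints.
function Test-IrOutboundConnectivity {
    param(
        [string[]]$HostName = @('login.microsoftonline.com', 'download.microsoft.com'),
        [int]$Port = 443
    )
    foreach ($Target in $HostName) {
        # TcpTestSucceeded is $true when the TCP handshake completes
        $Result = Test-NetConnection -ComputerName $Target -Port $Port `
            -WarningAction SilentlyContinue
        [pscustomobject]@{
            HostName  = $Target
            Port      = $Port
            Reachable = $Result.TcpTestSucceeded
        }
    }
}
```

Run `Test-IrOutboundConnectivity` on the IR machine itself. A `Reachable` value of `False` usually points at a firewall or proxy rule rather than at the IR service.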
Configure Proxy (if required)¶
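# Set proxy settings. If your dmgcmd.exe version does not accept this flag,
# set the proxy in Microsoft Integration Runtime Configuration Manager
# (Settings tab, HTTP Proxy) instead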
$ProxyServer = "http://proxy.company.com:8080"
& "C:\Program Files\Microsoft Integration Runtime\5.0\Shared\dmgcmd.exe" `
-SetProxySettings $ProxyServer
🔄 Azure-SSIS Integration Runtime¶
Azure-SSIS IR runs SQL Server Integration Services (SSIS) packages in Azure.
Create Azure-SSIS IR¶
$ResourceGroupName = "rg-adf-tutorial-dev"
$DataFactoryName = "adf-tutorial-dev-001"
$AzureSSISName = "AzureSSISIR-Dev"
$AzureSSISLocation = "East US 2"
# Create Azure-SSIS IR
Set-AzDataFactoryV2IntegrationRuntime `
-ResourceGroupName $ResourceGroupName `
-DataFactoryName $DataFactoryName `
-Name $AzureSSISName `
-Type Managed `
-Location $AzureSSISLocation `
-NodeSize "Standard_D2_v3" `
-NodeCount 2 `
-MaxParallelExecutionsPerNode 2 `
-Edition "Standard" `
-LicenseType "LicenseIncluded"
# Start the IR (takes 15-20 minutes)
Start-AzDataFactoryV2IntegrationRuntime `
-ResourceGroupName $ResourceGroupName `
-DataFactoryName $DataFactoryName `
-Name $AzureSSISName
Configure SSIS Catalog¶
# Configure the SSIS catalog (SSISDB) on Azure SQL Database
# Run this while the IR is stopped, ideally before starting it
$CatalogServerEndpoint = "your-sql-server.database.windows.net"
$CatalogAdminCredential = Get-Credential
Set-AzDataFactoryV2IntegrationRuntime `
-ResourceGroupName $ResourceGroupName `
-DataFactoryName $DataFactoryName `
-Name $AzureSSISName `
-CatalogServerEndpoint $CatalogServerEndpoint `
-CatalogAdminCredential $CatalogAdminCredential `
-CatalogPricingTier "S1"
🎯 Best Practices¶
Azure IR Best Practices¶
- Use Auto-Resolve for Flexibility
  - Automatically selects closest Azure region
  - Reduces data transfer costs
- Region-Specific IRs for Compliance
  - Create dedicated IRs for data residency requirements
  - Example: EU data stays in EU regions
- Data Flow Optimization
  - Use Memory Optimized compute for large transformations
  - Configure appropriate TTL to balance cost and performance
Self-Hosted IR Best Practices¶
- High Availability Setup
- Network Optimization
  - Place IR close to data sources
  - Use dedicated network interface
  - Configure bandwidth throttling if needed
- Security Hardening
- Resource Monitoring
  - Monitor CPU, memory, disk usage
  - Set up alerts for resource exhaustion
  - Plan capacity for peak loads
Cost Optimization¶
| Strategy | Azure IR | Self-Hosted IR |
|---|---|---|
| Scaling | Auto-scale enabled | Right-size VM |
| Data Flow | Configure TTL, use cleanup | N/A |
| Scheduling | Off-peak processing | Schedule during low usage |
| Region | Use local region | N/A |
✅ Validation¶
Test Azure IR¶
# Get IR status (-Status returns the detailed runtime state)
$AzureIR = Get-AzDataFactoryV2IntegrationRuntime `
-ResourceGroupName $ResourceGroupName `
-DataFactoryName $DataFactoryName `
-Name "AzureIR-AutoResolve" `
-Status
Write-Output "State: $($AzureIR.State)"
Test Self-Hosted IR Connectivity¶
- Navigate to ADF Studio > Manage > Integration runtimes
- Find your Self-Hosted IR
- Check Status column (should be "Running")
- Click the IR name
- Review Nodes tab (all nodes should be "Online")
Test Data Source Connection¶
Create a test linked service:
{
  "name": "TestOnPremSqlServer",
  "type": "Microsoft.DataFactory/factories/linkedservices",
  "properties": {
    "type": "SqlServer",
    "typeProperties": {
      "connectionString": "Server=myserver;Database=mydb;Integrated Security=True;",
      "authenticationType": "Windows"
    },
    "connectVia": {
      "referenceName": "SelfHostedIR-Corporate",
      "type": "IntegrationRuntimeReference"
    }
  }
}
Then click Test connection on the linked service in ADF Studio to confirm that the IR can reach the server.
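The linked service definition can also be deployed from a script. The sketch below assumes the JSON above has been saved to a local file and that the Az.DataFactory module is available; `Publish-TestLinkedService` is a hypothetical wrapper around the standard `Set-AzDataFactoryV2LinkedService` cmdlet.

```powershell
# Sketch: deploy a linked service from a JSON definition file.
# Publish-TestLinkedService is a hypothetical wrapper name.
function Publish-TestLinkedService {
    param(
        [Parameter(Mandatory)][string]$ResourceGroupName,
        [Parameter(Mandatory)][string]$DataFactoryName,
        [Parameter(Mandatory)][string]$DefinitionFile,
        [string]$Name = "TestOnPremSqlServer"
    )

    # Fail early with a clear message instead of a cmdlet error
    if (-not (Test-Path -Path $DefinitionFile -PathType Leaf)) {
        throw "Definition file not found: $DefinitionFile"
    }

    # -Force overwrites an existing linked service of the same name without prompting
    Set-AzDataFactoryV2LinkedService `
        -ResourceGroupName $ResourceGroupName `
        -DataFactoryName $DataFactoryName `
        -Name $Name `
        -DefinitionFile $DefinitionFile `
        -Force
}
```

Usage: `Publish-TestLinkedService -ResourceGroupName "rg-adf-tutorial-dev" -DataFactoryName "adf-tutorial-dev-001" -DefinitionFile ".\TestOnPremSqlServer.json"`. The file path is an example; deploying the definition does not test the connection, so the connection test is still a separate step.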
🚨 Troubleshooting¶
Self-Hosted IR Not Connecting¶
# Check service status
Get-Service -Name "DIAHostService" | Select-Object Status, StartType
# Restart service
Restart-Service -Name "DIAHostService"
# Check event logs
Get-EventLog -LogName "Application" -Source "Microsoft.DataTransfer.Gateway" -Newest 50
Performance Issues¶
# Check running vs. allowed concurrent jobs on each node
$IRMetric = Get-AzDataFactoryV2IntegrationRuntimeMetric `
-ResourceGroupName $ResourceGroupName `
-DataFactoryName $DataFactoryName `
-Name "SelfHostedIR-Corporate"
foreach ($Node in $IRMetric.Nodes) {
Write-Output "$($Node.NodeName): $($Node.ConcurrentJobsRunning) of $($Node.ConcurrentJobsLimit) jobs running"
Write-Output "$($Node.NodeName): CPU $($Node.CpuUtilization)%, $($Node.AvailableMemoryInMB) MB memory available"
}
🚀 Next Steps¶
Integration Runtime configured! Proceed to:
→ 04. Linked Services & Datasets - Connect to data sources
Module Progress: 3 of 18 complete
Tutorial Version: 1.0 Last Updated: January 2025