⚡ Integration Runtime Configuration

Configure Azure Integration Runtime and Self-Hosted Integration Runtime for secure data movement across cloud and on-premises environments.

🌟 Integration Runtime Overview

The Integration Runtime (IR) is the compute infrastructure used by Azure Data Factory to provide data integration capabilities across different network environments.

IR Types Comparison

| Feature    | Azure IR            | Self-Hosted IR       | Azure-SSIS IR          |
|------------|---------------------|----------------------|------------------------|
| Location   | Azure cloud         | On-premises/VM       | Azure cloud            |
| Management | Fully managed       | Self-managed         | Partially managed      |
| Network    | Public internet     | Private network      | Public/Private         |
| Use Case   | Cloud data movement | On-premises access   | SSIS package execution |
| Scaling    | Auto-scale          | Manual scale         | Manual scale           |
| Cost       | Pay per use         | VM costs + licensing | Compute + licensing    |

When to Use Each IR Type

graph TD
    A[Data Source Location?] --> B{Cloud Only?}
    B -->|Yes| C[Azure IR]
    B -->|No| D{On-Premises?}
    D -->|Yes| E[Self-Hosted IR]
    D -->|No| F{SSIS Packages?}
    F -->|Yes| G[Azure-SSIS IR]
    F -->|No| C
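
The flowchart above can be read as a small decision function. The following is a minimal Python sketch of that logic only; the function name and its three boolean parameters are illustrative assumptions, not part of any Azure SDK.

```python
def choose_ir_type(cloud_only: bool, on_premises: bool, ssis_packages: bool) -> str:
    """Follow the flowchart: cloud-only, then on-premises, then SSIS."""
    if cloud_only:
        return "Azure IR"
    if on_premises:
        return "Self-Hosted IR"
    if ssis_packages:
        return "Azure-SSIS IR"
    # The flowchart falls back to Azure IR when no other branch applies.
    return "Azure IR"


if __name__ == "__main__":
    # A source behind the corporate firewall maps to the self-hosted runtime.
    print(choose_ir_type(cloud_only=False, on_premises=True, ssis_packages=False))
```

The branch order matters: an on-premises source is routed to Self-Hosted IR before the SSIS question is asked, exactly as in the diagram.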

☁️ Azure Integration Runtime

Azure IR is a fully managed, serverless compute infrastructure.

Create Azure Integration Runtime

Using Azure Portal

  1. Navigate to ADF Studio
  2. Click Manage (toolbox icon)
  3. Click Integration runtimes
  4. Click + New
  5. Select the Azure, Self-Hosted option
  6. On the next page, select Azure
  7. Configure settings:
Azure IR Configuration:
├── Name: AzureIR-AutoResolve
├── Description: Auto-resolve Azure IR
├── Region: Auto Resolve (recommended)
├── Virtual Network: None (default)
└── Data Flow Runtime:
    ├── Compute type: General purpose
    ├── Core count: 4 cores
    └── Time to live: 10 minutes

Using PowerShell

# Create Azure Integration Runtime
$ResourceGroupName = "rg-adf-tutorial-dev"
$DataFactoryName = "adf-tutorial-dev-001"
$IntegrationRuntimeName = "AzureIR-AutoResolve"

Set-AzDataFactoryV2IntegrationRuntime `
    -ResourceGroupName $ResourceGroupName `
    -DataFactoryName $DataFactoryName `
    -Name $IntegrationRuntimeName `
    -Type Managed `
    -Location "AutoResolve"

# Verify creation
Get-AzDataFactoryV2IntegrationRuntime `
    -ResourceGroupName $ResourceGroupName `
    -DataFactoryName $DataFactoryName `
    -Name $IntegrationRuntimeName

Configure Region-Specific Azure IR

To reduce latency or keep data movement inside a specific geography, create region-specific IRs:

# Create East US 2 Integration Runtime
Set-AzDataFactoryV2IntegrationRuntime `
    -ResourceGroupName $ResourceGroupName `
    -DataFactoryName $DataFactoryName `
    -Name "AzureIR-EastUS2" `
    -Type Managed `
    -Location "East US 2"

# Create West Europe Integration Runtime
Set-AzDataFactoryV2IntegrationRuntime `
    -ResourceGroupName $ResourceGroupName `
    -DataFactoryName $DataFactoryName `
    -Name "AzureIR-WestEurope" `
    -Type Managed `
    -Location "West Europe"

Configure Data Flow Runtime Settings

{
  "name": "AzureIR-DataFlow-Optimized",
  "properties": {
    "type": "Managed",
    "typeProperties": {
      "computeProperties": {
        "location": "AutoResolve",
        "dataFlowProperties": {
          "computeType": "MemoryOptimized",
          "coreCount": 8,
          "timeToLive": 10,
          "cleanup": true
        }
      }
    }
  }
}
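
If several Data Flow runtimes are needed, the definition above can be generated rather than hand-edited. This is a minimal sketch that builds a dictionary in the same shape as the JSON shown; the helper name `build_data_flow_ir` and the set of accepted compute types are assumptions for illustration, so check them against the values your factory accepts.

```python
import json

# Assumed set of compute types; verify against your Data Factory before relying on it.
ALLOWED_COMPUTE_TYPES = {"General", "ComputeOptimized", "MemoryOptimized"}


def build_data_flow_ir(name, compute_type="General", core_count=8,
                       ttl_minutes=10, location="AutoResolve", cleanup=True):
    """Return an IR definition dict shaped like the JSON in this section."""
    if compute_type not in ALLOWED_COMPUTE_TYPES:
        raise ValueError(f"Unknown compute type: {compute_type}")
    if core_count <= 0 or ttl_minutes < 0:
        raise ValueError("core_count must be positive and ttl_minutes non-negative")
    return {
        "name": name,
        "properties": {
            "type": "Managed",
            "typeProperties": {
                "computeProperties": {
                    "location": location,
                    "dataFlowProperties": {
                        "computeType": compute_type,
                        "coreCount": core_count,
                        "timeToLive": ttl_minutes,
                        "cleanup": cleanup,
                    },
                }
            },
        },
    }


if __name__ == "__main__":
    definition = build_data_flow_ir("AzureIR-DataFlow-Optimized",
                                    compute_type="MemoryOptimized")
    print(json.dumps(definition, indent=2))
```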

🏢 Self-Hosted Integration Runtime

Self-Hosted IR allows secure access to data sources in private networks.

Architecture

graph LR
    subgraph "Azure"
        A[Data Factory]
    end

    subgraph "Corporate Network"
        B[Self-Hosted IR]
        C[On-Prem SQL Server]
        D[File Server]
        E[Oracle DB]
    end

    A <--> B
    B --> C
    B --> D
    B --> E

Create Self-Hosted Integration Runtime

Step 1: Create IR in Azure Portal

  1. Navigate to ADF Studio > Manage > Integration runtimes
  2. Click + New
  3. Select Self-Hosted
  4. Configure:
Self-Hosted IR Configuration:
├── Name: SelfHostedIR-Corporate
├── Description: On-premises data access
└── Network environment: Private network
  5. Click Create
  6. Copy the Authentication Key (you'll need it in Step 2 to register the node)

Step 2: Install Self-Hosted IR on Server

System Requirements:

  • Windows Server 2012 R2 or later
  • .NET Framework 4.7.2 or later
  • Recommended minimum: 4 CPU cores at 2 GHz
  • Minimum 8 GB RAM
  • 80 GB available disk space

Installation Steps:

# Download the installer manually: the URL below is the Download Center
# landing page, not a direct link to the .msi file
$DownloadUrl = "https://www.microsoft.com/en-us/download/details.aspx?id=39717"
$InstallerPath = "C:\Temp\IntegrationRuntime.msi"

# Install silently (run as Administrator); the path is quoted in case it contains spaces
Start-Process msiexec.exe -Wait -ArgumentList "/i `"$InstallerPath`" /quiet"

# Register with authentication key
$AuthKey = "YOUR_AUTHENTICATION_KEY_FROM_PORTAL"

# dmgcmd.exe ships with the runtime; -RegisterNewNode registers this machine as a node
& "C:\Program Files\Microsoft Integration Runtime\5.0\Shared\dmgcmd.exe" `
    -RegisterNewNode $AuthKey

Step 3: Verify Installation

# Check IR status
Get-Service -Name "DIAHostService"

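# View IR configuration (the file location can differ between versions, so check first)
$ConfigFile = "C:\Program Files\Microsoft Integration Runtime\5.0\Shared\Microsoft.DataTransfer.Gateway.Configuration.json"
if (Test-Path $ConfigFile) {
    Get-Content $ConfigFile -Raw | ConvertFrom-Json
} else {
    Write-Warning "Configuration file not found: $ConfigFile"
}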

Configure High Availability

Install multiple nodes for redundancy:

# On the second server, install the runtime, then register it with the same authentication key
# ($AuthKey must be set in this session, as in Step 2)
& "C:\Program Files\Microsoft Integration Runtime\5.0\Shared\dmgcmd.exe" `
    -RegisterNewNode $AuthKey

# Verify both nodes in portal
# Navigate to: ADF Studio > Manage > Integration runtimes > SelfHostedIR-Corporate
# You should see 2 nodes

Self-Hosted IR Performance Tuning

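# Modify configuration file (path and property names as given in this tutorial;
# confirm they exist in your installed version before editing)
$ConfigPath = "C:\Program Files\Microsoft Integration Runtime\5.0\Shared\Microsoft.DataTransfer.Gateway.Configuration.json"

# Keep a backup so the change can be reverted
Copy-Item $ConfigPath "$ConfigPath.bak"
$Config = Get-Content $ConfigPath -Raw | ConvertFrom-Json

# Increase concurrent jobs
$Config.node.maxConcurrentJobs = 8

# Cap memory usage (a percentage, not MB)
$Config.node.maxMemoryUsagePercentage = 80

# Save configuration
$Config | ConvertTo-Json -Depth 10 | Set-Content $ConfigPath

# Restart service
Restart-Service -Name "DIAHostService"

# The concurrent job limit can also be set per node from Azure PowerShell with
# Update-AzDataFactoryV2IntegrationRuntimeNode and its -ConcurrentJobsLimit parameter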

Network Configuration

Open Required Ports

# Outbound to Azure (HTTPS)
New-NetFirewallRule -DisplayName "ADF-HTTPS-Outbound" `
    -Direction Outbound `
    -Protocol TCP `
    -RemotePort 443 `
    -Action Allow

# Service Bus relay ports (mainly needed by older gateway versions; recent versions typically communicate over 443 only)
New-NetFirewallRule -DisplayName "ADF-ServiceBus" `
    -Direction Outbound `
    -Protocol TCP `
    -RemotePort 9350-9354 `
    -Action Allow
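
After the firewall rules are in place, it can help to confirm that an outbound TCP connection actually succeeds from the IR machine. The following is a minimal sketch using only the Python standard library; the function name `can_connect` is an assumption, and any host name you pass to it (for example your factory's Service Bus endpoint) must come from your own environment.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

A call such as `can_connect("<your-endpoint>", 443)` only proves that the TCP handshake succeeds; it does not validate TLS inspection or proxy authentication, which can still block the runtime.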

Configure Proxy (if required)

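# Set proxy settings (if this dmgcmd option is not available in your version,
# configure the HTTP proxy in Microsoft Integration Runtime Configuration Manager instead)
$ProxyServer = "http://proxy.company.com:8080"

& "C:\Program Files\Microsoft Integration Runtime\5.0\Shared\dmgcmd.exe" `
    -SetProxySettings $ProxyServer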

🔄 Azure-SSIS Integration Runtime

Azure-SSIS IR runs SQL Server Integration Services (SSIS) packages in Azure.

Create Azure-SSIS IR

$ResourceGroupName = "rg-adf-tutorial-dev"
$DataFactoryName = "adf-tutorial-dev-001"
$AzureSSISName = "AzureSSISIR-Dev"
$AzureSSISLocation = "East US 2"

# Create Azure-SSIS IR
Set-AzDataFactoryV2IntegrationRuntime `
    -ResourceGroupName $ResourceGroupName `
    -DataFactoryName $DataFactoryName `
    -Name $AzureSSISName `
    -Type Managed `
    -Location $AzureSSISLocation `
    -NodeSize "Standard_D2_v3" `
    -NodeCount 2 `
    -MaxParallelExecutionsPerNode 2 `
    -Edition "Standard" `
    -LicenseType "LicenseIncluded"

# Start the IR (takes 15-20 minutes)
Start-AzDataFactoryV2IntegrationRuntime `
    -ResourceGroupName $ResourceGroupName `
    -DataFactoryName $DataFactoryName `
    -Name $AzureSSISName
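
The `-NodeCount` and `-MaxParallelExecutionsPerNode` values above together bound how many packages can run at once. A small worked example in Python, where the helper name `max_parallel_executions` is purely illustrative:

```python
def max_parallel_executions(node_count: int, max_parallel_per_node: int) -> int:
    """Upper bound on concurrent package executions across the whole IR."""
    if node_count < 1 or max_parallel_per_node < 1:
        raise ValueError("both values must be at least 1")
    return node_count * max_parallel_per_node


if __name__ == "__main__":
    # 2 nodes x 2 parallel executions per node, as configured above
    print(max_parallel_executions(2, 2))  # 4
```

With the settings in this tutorial the runtime can execute at most four packages concurrently; raising either value increases throughput and cost.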

Configure SSIS Catalog

# Configure with Azure SQL Database (stop the IR first if it is already running;
# its properties cannot be changed while it runs)
$CatalogServerEndpoint = "your-sql-server.database.windows.net"
$CatalogAdminCredential = Get-Credential

Set-AzDataFactoryV2IntegrationRuntime `
    -ResourceGroupName $ResourceGroupName `
    -DataFactoryName $DataFactoryName `
    -Name $AzureSSISName `
    -CatalogServerEndpoint $CatalogServerEndpoint `
    -CatalogAdminCredential $CatalogAdminCredential `
    -CatalogPricingTier "S1"

🎯 Best Practices

Azure IR Best Practices

  1. Use Auto-Resolve for Flexibility
     • Automatically selects closest Azure region
     • Reduces data transfer costs

  2. Region-Specific IRs for Compliance
     • Create dedicated IRs for data residency requirements
     • Example: EU data stays in EU regions

  3. Data Flow Optimization
     • Use Memory Optimized compute for large transformations
     • Configure appropriate TTL to balance cost and performance
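
To make the TTL trade-off concrete, the sketch below estimates how many cold starts a schedule of Data Flow runs would incur and how many idle minutes the warm cluster would be kept alive. It is a simplified model under stated assumptions: runs do not overlap, the cluster stays warm for exactly the TTL after each run, and idle warm time is billed. The function name `simulate_ttl` and the sample schedule are hypothetical.

```python
def simulate_ttl(runs, ttl_minutes):
    """Estimate (cold_starts, idle_minutes) for a schedule.

    runs: iterable of (start_minute, duration_minutes) tuples.
    """
    cold_starts = 0
    idle_minutes = 0
    prev_end = None
    for start, duration in sorted(runs):
        if prev_end is None:
            cold_starts += 1  # the first run always starts a new cluster
        else:
            gap = max(0, start - prev_end)
            if gap > ttl_minutes:
                # The cluster expired: it idled for the full TTL, then restarts cold.
                cold_starts += 1
                idle_minutes += ttl_minutes
            else:
                # The cluster was still warm: only the gap was spent idle.
                idle_minutes += gap
        end = start + duration
        prev_end = end if prev_end is None else max(prev_end, end)
    if prev_end is not None:
        idle_minutes += ttl_minutes  # warm time after the final run
    return cold_starts, idle_minutes


if __name__ == "__main__":
    schedule = [(0, 5), (8, 5), (60, 5)]
    print(simulate_ttl(schedule, ttl_minutes=10))
    print(simulate_ttl(schedule, ttl_minutes=0))
```

Comparing the two results for your own schedule shows whether a longer TTL saves enough cold starts to justify the extra idle time.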

Self-Hosted IR Best Practices

  1. High Availability Setup

    Configuration:
    ├── Multiple nodes (2-4 recommended)
    ├── Load balancing automatic
    ├── Failover automatic
    └── Health monitoring enabled

  2. Network Optimization
     • Place IR close to data sources
     • Use dedicated network interface
     • Configure bandwidth throttling if needed

  3. Security Hardening

    # Force TLS 1.2 for outbound calls made from the current PowerShell session
    [Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12

    # Restrict to specific IP ranges
    # Configure firewall rules

  4. Resource Monitoring
     • Monitor CPU, memory, disk usage
     • Set up alerts for resource exhaustion
     • Plan capacity for peak loads

Cost Optimization

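| Strategy   | Azure IR                   | Self-Hosted IR            |
|------------|----------------------------|---------------------------|
| Scaling    | Auto-scale enabled         | Right-size VM             |
| Data Flow  | Configure TTL, use cleanup | N/A                       |
| Scheduling | Off-peak processing        | Schedule during low usage |
| Region     | Use local region           | N/A                       |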

✅ Validation

Test Azure IR

# Get IR status
$AzureIR = Get-AzDataFactoryV2IntegrationRuntime `
    -ResourceGroupName $ResourceGroupName `
    -DataFactoryName $DataFactoryName `
    -Name "AzureIR-AutoResolve"

Write-Output "State: $($AzureIR.State)"

Test Self-Hosted IR Connectivity

  1. Navigate to ADF Studio > Manage > Integration runtimes
  2. Find your Self-Hosted IR
  3. Check Status column (should be "Running")
  4. Click the IR name
  5. Review Nodes tab (all nodes should be "Online")

Test Data Source Connection

Create a test linked service:

{
  "name": "TestOnPremSqlServer",
  "type": "Microsoft.DataFactory/factories/linkedservices",
  "properties": {
    "type": "SqlServer",
    "typeProperties": {
      "connectionString": "Server=myserver;Database=mydb;Integrated Security=True;",
      "authenticationType": "Windows",
      "userName": "<domain\\username>",
      "password": {
        "type": "SecureString",
        "value": "<password>"
      }
    },
    "connectVia": {
      "referenceName": "SelfHostedIR-Corporate",
      "type": "IntegrationRuntimeReference"
    }
  }
}

Then open the linked service in ADF Studio and click Test connection.
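
Before testing in the portal, it can be useful to confirm that a linked service definition really routes through the intended runtime. This is a minimal sketch that inspects the `connectVia` block of a definition like the one above; the helper name `integration_runtime_of` is an assumption for illustration.

```python
import json


def integration_runtime_of(linked_service: dict):
    """Return the referenced IR name, or None when no connectVia block is present."""
    connect_via = linked_service.get("properties", {}).get("connectVia")
    if not connect_via:
        # Without connectVia, the service uses the factory's default Azure IR.
        return None
    if connect_via.get("type") != "IntegrationRuntimeReference":
        raise ValueError("connectVia.type must be IntegrationRuntimeReference")
    return connect_via["referenceName"]


if __name__ == "__main__":
    definition = json.loads("""
    {
      "name": "TestOnPremSqlServer",
      "properties": {
        "type": "SqlServer",
        "connectVia": {
          "referenceName": "SelfHostedIR-Corporate",
          "type": "IntegrationRuntimeReference"
        }
      }
    }
    """)
    print(integration_runtime_of(definition))
```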

🚨 Troubleshooting

Self-Hosted IR Not Connecting

# Check service status
Get-Service -Name "DIAHostService" | Select-Object Status, StartType

# Restart service
Restart-Service -Name "DIAHostService"

# Check event logs (self-hosted IR events appear under Applications and Services Logs;
# the log name below may vary by version)
Get-WinEvent -LogName "Connectors - Integration Runtime" -MaxEvents 50

Performance Issues

# Check current jobs on each node
$IRStatus = Get-AzDataFactoryV2IntegrationRuntimeMetric `
    -ResourceGroupName $ResourceGroupName `
    -DataFactoryName $DataFactoryName `
    -Name "SelfHostedIR-Corporate"

# Metrics are reported per node rather than on a top-level Properties object
foreach ($Node in $IRStatus.Nodes) {
    Write-Output "Node: $($Node.NodeName)"
    Write-Output "Active Jobs: $($Node.ConcurrentJobsRunning)"
    Write-Output "Max Jobs: $($Node.ConcurrentJobsLimit)"
}
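
Once job counts are available, a simple ratio shows which nodes are close to their concurrency limit. The sketch below assumes you have already exported the figures into plain dictionaries; the keys `name`, `running`, and `limit`, the 0.8 threshold, and the function name `saturated_nodes` are all illustrative assumptions rather than an Azure API.

```python
def saturated_nodes(nodes, threshold=0.8):
    """Return names of nodes whose running/limit ratio is at or above the threshold."""
    flagged = []
    for node in nodes:
        limit = node["limit"]
        if limit <= 0:
            continue  # skip nodes without a usable limit
        if node["running"] / limit >= threshold:
            flagged.append(node["name"])
    return flagged


if __name__ == "__main__":
    sample = [
        {"name": "Node_1", "running": 7, "limit": 8},
        {"name": "Node_2", "running": 2, "limit": 8},
    ]
    print(saturated_nodes(sample))
```

Nodes that are flagged consistently are candidates for a higher concurrent job limit or for adding another node to the runtime.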

🚀 Next Steps

Integration Runtime configured! Proceed to:

04. Linked Services & Datasets - Connect to data sources


Module Progress: 3 of 18 complete

Tutorial Version: 1.0 | Last Updated: January 2025