🔄 Multi-Source Data Integration¶
Integrate data from diverse sources including databases, file systems, APIs, cloud services, and SaaS applications into a unified data platform.
📋 Table of Contents¶
- Integration Architecture
- Database Sources
- File-Based Sources
- API and Web Services
- Cloud Storage
- SaaS Applications
- Integration Patterns
- Next Steps
🏗️ Integration Architecture¶
graph TB
subgraph "Data Sources"
A[SQL Databases]
B[File Systems]
C[REST APIs]
D[Cloud Storage]
E[SaaS Apps]
end
subgraph "Azure Data Factory"
F[Self-Hosted IR]
G[Azure IR]
H[Pipeline Orchestration]
end
subgraph "Unified Data Platform"
I[Data Lake Gen2]
J[Azure Synapse]
K[Azure SQL DB]
end
A --> F
B --> F
C --> G
D --> G
E --> G
F --> H
G --> H
H --> I
H --> J
H --> K 🗄️ Database Sources¶
SQL Server (On-Premises)¶
{
"name": "OnPremSqlServer_LinkedService",
"properties": {
"type": "SqlServer",
"typeProperties": {
"connectionString": "Server=SERVER01;Database=SalesDB;",
"userName": "sqluser",
"password": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault_LinkedService",
"type": "LinkedServiceReference"
},
"secretName": "onprem-sql-password"
}
},
"connectVia": {
"referenceName": "SelfHostedIR",
"type": "IntegrationRuntimeReference"
}
}
}
Oracle Database¶
{
"name": "OracleDB_LinkedService",
"properties": {
"type": "Oracle",
"typeProperties": {
"connectionString": "Host=oracle-server;Port=1521;Sid=ORCL;",
"username": "oracleuser",
"password": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault_LinkedService",
"type": "LinkedServiceReference"
},
"secretName": "oracle-password"
}
},
"connectVia": {
"referenceName": "SelfHostedIR",
"type": "IntegrationRuntimeReference"
}
}
}
PostgreSQL¶
{
"name": "PostgreSQL_LinkedService",
"properties": {
"type": "PostgreSql",
"typeProperties": {
"connectionString": "Server=postgres-server;Database=mydb;Port=5432;",
"username": "pguser",
"password": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault_LinkedService",
"type": "LinkedServiceReference"
},
"secretName": "postgres-password"
}
}
}
}
MySQL¶
{
"name": "MySQL_LinkedService",
"properties": {
"type": "MySql",
"typeProperties": {
"connectionString": "Server=mysql-server;Database=mydb;Port=3306;",
"username": "mysqluser",
"password": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault_LinkedService",
"type": "LinkedServiceReference"
},
"secretName": "mysql-password"
}
}
}
}
📁 File-Based Sources¶
File Share (SMB)¶
{
"name": "FileShare_LinkedService",
"properties": {
"type": "FileServer",
"typeProperties": {
"host": "\\\\fileserver\\share",
"userId": "domain\\username",
"password": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault_LinkedService",
"type": "LinkedServiceReference"
},
"secretName": "file-share-password"
}
},
"connectVia": {
"referenceName": "SelfHostedIR",
"type": "IntegrationRuntimeReference"
}
}
}
SFTP¶
{
"name": "SFTP_LinkedService",
"properties": {
"type": "Sftp",
"typeProperties": {
"host": "sftp.example.com",
"port": 22,
"authenticationType": "Basic",
"userName": "sftpuser",
"password": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault_LinkedService",
"type": "LinkedServiceReference"
},
"secretName": "sftp-password"
},
"skipHostKeyValidation": false
}
}
}
FTP¶
{
"name": "FTP_LinkedService",
"properties": {
"type": "FtpServer",
"typeProperties": {
"host": "ftp.example.com",
"port": 21,
"authenticationType": "Basic",
"userName": "ftpuser",
"password": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault_LinkedService",
"type": "LinkedServiceReference"
},
"secretName": "ftp-password"
},
"enableSsl": true
}
}
}
🌐 API and Web Services¶
REST API with OAuth2¶
{
"name": "RestAPI_OAuth2_LinkedService",
"properties": {
"type": "RestService",
"typeProperties": {
"url": "https://api.example.com",
"enableServerCertificateValidation": true,
"authenticationType": "OAuth2ClientCredential",
"clientId": "your-client-id",
"clientSecret": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault_LinkedService",
"type": "LinkedServiceReference"
},
"secretName": "api-client-secret"
},
"tokenEndpoint": "https://login.example.com/oauth2/token",
"scope": "api://default/.read"
}
}
}
REST API with API Key¶
{
"name": "RestAPI_ApiKey_LinkedService",
"properties": {
"type": "RestService",
"typeProperties": {
"url": "https://api.example.com",
"authenticationType": "Anonymous",
"authHeaders": {
"X-API-Key": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault_LinkedService",
"type": "LinkedServiceReference"
},
"secretName": "api-key"
}
}
}
}
}
HTTP Endpoint¶
{
"name": "HTTP_LinkedService",
"properties": {
"type": "HttpServer",
"typeProperties": {
"url": "https://data.example.com",
"authenticationType": "Basic",
"userName": "httpuser",
"password": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault_LinkedService",
"type": "LinkedServiceReference"
},
"secretName": "http-password"
}
}
}
}
☁️ Cloud Storage¶
AWS S3¶
{
"name": "AmazonS3_LinkedService",
"properties": {
"type": "AmazonS3",
"typeProperties": {
"accessKeyId": "your-access-key",
"secretAccessKey": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault_LinkedService",
"type": "LinkedServiceReference"
},
"secretName": "aws-secret-key"
}
}
}
}
Google Cloud Storage¶
{
"name": "GoogleCloudStorage_LinkedService",
"properties": {
"type": "GoogleCloudStorage",
"typeProperties": {
"accessKeyId": "your-access-key",
"secretAccessKey": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault_LinkedService",
"type": "LinkedServiceReference"
},
"secretName": "gcs-secret-key"
}
}
}
}
Azure Blob Storage (Other Subscription)¶
{
"name": "AzureBlobExternal_LinkedService",
"properties": {
"type": "AzureBlobStorage",
"typeProperties": {
"serviceEndpoint": "https://externalaccount.blob.core.windows.net/",
"authenticationType": "ServicePrincipal",
"servicePrincipalId": "your-sp-id",
"servicePrincipalKey": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault_LinkedService",
"type": "LinkedServiceReference"
},
"secretName": "sp-key"
},
"tenant": "tenant-id"
}
}
}
📱 SaaS Applications¶
Salesforce¶
{
"name": "Salesforce_LinkedService",
"properties": {
"type": "Salesforce",
"typeProperties": {
"environmentUrl": "https://login.salesforce.com",
"username": "salesforce@example.com",
"password": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault_LinkedService",
"type": "LinkedServiceReference"
},
"secretName": "salesforce-password"
},
"securityToken": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault_LinkedService",
"type": "LinkedServiceReference"
},
"secretName": "salesforce-token"
}
}
}
}
Dynamics 365¶
{
"name": "Dynamics365_LinkedService",
"properties": {
"type": "Dynamics",
"typeProperties": {
"deploymentType": "Online",
"organizationName": "your-org",
"authenticationType": "Office365",
"username": "user@example.com",
"password": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault_LinkedService",
"type": "LinkedServiceReference"
},
"secretName": "dynamics-password"
}
}
}
}
ServiceNow¶
{
"name": "ServiceNow_LinkedService",
"properties": {
"type": "ServiceNow",
"typeProperties": {
"endpoint": "https://instance.service-now.com",
"authenticationType": "Basic",
"username": "serviceuser",
"password": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault_LinkedService",
"type": "LinkedServiceReference"
},
"secretName": "servicenow-password"
}
}
}
}
SAP Table¶
{
"name": "SapTable_LinkedService",
"properties": {
"type": "SapTable",
"typeProperties": {
"server": "sap-server",
"systemNumber": "00",
"clientId": "100",
"userName": "sapuser",
"password": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault_LinkedService",
"type": "LinkedServiceReference"
},
"secretName": "sap-password"
}
},
"connectVia": {
"referenceName": "SelfHostedIR",
"type": "IntegrationRuntimeReference"
}
}
}
🔄 Integration Patterns¶
Pattern 1: Hub and Spoke¶
Centralize data from multiple sources into a data lake.
Pattern 2: Federation¶
Query data in-place without movement.
Pattern 3: Incremental Load¶
Load only changed data using watermarks.
{
"source": {
"type": "SqlServerSource",
"sqlReaderQuery": "SELECT * FROM Sales WHERE ModifiedDate > '@{pipeline().parameters.LastRunTime}' AND ModifiedDate <= '@{pipeline().parameters.CurrentRunTime}'"
}
}
Pattern 4: Full Load with Archive¶
Complete refresh with historical archiving.
1. Archive current data to history container
2. Truncate target table
3. Load full dataset from source
4. Validate record counts
✅ Hands-On Exercise¶
Multi-Source Integration Pipeline¶
Create a pipeline that integrates data from:
- SQL Server - Customer data
- CSV Files - Transaction logs
- REST API - Product catalog
- Salesforce - Opportunity data
Destination: Azure Data Lake Gen2 with organized folder structure:
data/
├── customers/
│ └── year=2025/month=01/day=15/
├── transactions/
│ └── year=2025/month=01/day=15/
├── products/
│ └── year=2025/month=01/day=15/
└── opportunities/
└── year=2025/month=01/day=15/
Implementation Steps¶
- Create all required linked services
- Create datasets for each source
- Create copy activities
- Configure parallel execution
- Add error handling
- Test the pipeline
📚 Additional Resources¶
🚀 Next Steps¶
Multi-source integration mastered! Proceed to:
→ 06. Secure Connectivity Patterns - Implement enterprise security
Module Progress: 5 of 18 complete
Tutorial Version: 1.0 Last Updated: January 2025