
Oracle Data Migration to Azure

Data movement strategies and tools: SSMA data migration, Azure DMS, Azure Data Factory, Oracle Data Pump + AzCopy, Fabric Mirroring for Oracle, and GoldenGate for CDC.


Choosing a data migration approach

The right tool depends on data volume, downtime tolerance, and whether you need ongoing replication or a one-time migration.

- **< 100 GB, downtime OK:** SSMA or ora2pg built-in data migration
- **100 GB - 10 TB, minimal downtime:** Azure Database Migration Service (DMS)
- **> 10 TB, complex:** Oracle Data Pump export + AzCopy + target import
- **Zero downtime, ongoing replication:** GoldenGate or Fabric Mirroring
- **Ongoing analytics replication:** Fabric Mirroring for Oracle or ADF CDC pipelines

1. Migration approach comparison

| Approach | Max data size | Downtime | Complexity | Ongoing replication | Best for |
|---|---|---|---|---|---|
| SSMA data migration | ~100 GB | Hours | Low | No | Small databases, SQL MI target |
| ora2pg data migration | ~100 GB | Hours | Low | No | Small databases, PostgreSQL target |
| Azure DMS (offline) | 10+ TB | Hours-days | Medium | No | Medium databases, one-time migration |
| Azure DMS (online) | 10+ TB | Minutes | Medium | During migration | Medium databases, minimal downtime |
| Data Pump + AzCopy | 50+ TB | Hours-days | Medium | No | Very large databases, bulk migration |
| Azure Data Factory | Unlimited | Variable | Medium | Yes (scheduled) | Batch replication, ETL integration |
| Fabric Mirroring | Varies | Near-zero | Low | Yes (continuous) | Analytics replication to OneLake |
| GoldenGate | Unlimited | Near-zero | High | Yes (continuous) | Zero-downtime, Oracle-to-Oracle |

2. SSMA data migration

SSMA for Oracle includes built-in data migration for Azure SQL MI targets.

2.1 Data migration workflow

Oracle Source ──► SSMA Client ──► Azure SQL MI Target
              ┌──────┴──────┐
              │             │
         Client-side    Server-side
         (small DBs)    (large DBs)

2.2 Configuration

# In SSMA Project Settings > Migration:
# - Migration Engine: Server Side Data Migration (for databases > 10 GB)
# - Batch Size: 10000 (adjust based on row size)
# - Parallel Data Migration: Enable
# - Number of Parallel Processes: 4 (adjust based on network bandwidth)
# - Extended Data Migration Options:
#   - Truncate target tables before migration: Yes (for fresh migration)
#   - Retain identity values: Yes

2.3 Running data migration

# In SSMA:
# 1. Ensure schema is already converted and deployed to target
# 2. Right-click source schema in Oracle Metadata Explorer
# 3. Select "Migrate Data"
# 4. Monitor progress in the Migration Report
# 5. Review error log for failed rows

3. Azure Database Migration Service (DMS)

Azure DMS provides managed data migration with both offline and online modes.

3.1 Offline migration (one-time)

# Create DMS instance
az dms create \
    --resource-group rg-migration \
    --name dms-oracle-migration \
    --location eastus \
    --sku-name Standard_4vCores

# Create migration project
az dms project create \
    --resource-group rg-migration \
    --service-name dms-oracle-migration \
    --name oracle-to-sqlmi \
    --source-platform Oracle \
    --target-platform SQLMI

# Create and run migration task
# (Use Azure Portal for GUI-based task creation, or REST API)

3.2 Online migration (continuous sync)

Online migration uses Oracle LogMiner to capture changes during migration:

Phase 1: Full load (bulk copy of existing data)
    Oracle ──► DMS ──► Azure SQL MI

Phase 2: Incremental sync (LogMiner CDC)
    Oracle ──► LogMiner ──► DMS ──► Azure SQL MI
    (Continuous until cutover)

Phase 3: Cutover
    Stop application writes to Oracle
    Wait for DMS sync to complete
    Switch application to Azure SQL MI
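
Before cutover, it helps to confirm that application writes to the source have actually stopped. Below is a minimal sketch, assuming the python-oracledb driver and a monitoring account with SELECT on V$TRANSACTION and V$DATABASE (the DSN and credentials are placeholders); it polls for open transactions and reports the SCN to record once the source is quiet.

# Sketch: confirm the Oracle source is quiet before switching applications.
# Assumes python-oracledb; DSN and credentials are placeholders.
import time
import oracledb

def wait_for_quiesce(dsn, user, password, checks=5, interval=10):
    with oracledb.connect(user=user, password=password, dsn=dsn) as conn:
        cur = conn.cursor()
        for _ in range(checks):
            cur.execute("SELECT COUNT(*) FROM v$transaction")
            active = cur.fetchone()[0]
            cur.execute("SELECT current_scn FROM v$database")
            scn = cur.fetchone()[0]
            print(f"active transactions={active}, current_scn={scn}")
            if active == 0:
                print(f"Source is quiet; record SCN {scn} as the cutover watermark.")
                return
            time.sleep(interval)
        raise RuntimeError("Active transactions remain; do not cut over yet.")

# wait_for_quiesce("oracle-prod.agency.gov:1521/FEDDB", "dms_user", "***")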

3.3 DMS requirements for Oracle source

-- On Oracle source: Enable supplemental logging
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;

-- Create migration user with required privileges
CREATE USER dms_user IDENTIFIED BY "***";
GRANT CREATE SESSION TO dms_user;
GRANT SELECT ANY TABLE TO dms_user;
GRANT SELECT ANY TRANSACTION TO dms_user;
GRANT SELECT ON V_$ARCHIVED_LOG TO dms_user;
GRANT SELECT ON V_$LOG TO dms_user;
GRANT SELECT ON V_$LOGFILE TO dms_user;
GRANT SELECT ON V_$DATABASE TO dms_user;
GRANT SELECT ON V_$THREAD TO dms_user;
GRANT SELECT ON V_$PARAMETER TO dms_user;
GRANT SELECT ON V_$NLS_PARAMETERS TO dms_user;
GRANT SELECT ON V_$TIMEZONE_NAMES TO dms_user;
GRANT SELECT ON ALL_INDEXES TO dms_user;
GRANT SELECT ON ALL_OBJECTS TO dms_user;
GRANT SELECT ON ALL_TABLES TO dms_user;
GRANT SELECT ON ALL_USERS TO dms_user;
GRANT SELECT ON ALL_TAB_COLUMNS TO dms_user;
GRANT SELECT ON DBA_OBJECTS TO dms_user;
GRANT LOGMINING TO dms_user;  -- Oracle 12c+
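
A quick way to confirm these prerequisites took effect is to query V$DATABASE for the log mode and supplemental-logging flags. A minimal sketch, assuming python-oracledb; connection details are placeholders.

# Sketch: check DMS prerequisites on the Oracle source (archivelog mode
# and supplemental logging). Connection details are placeholders.
import oracledb

def check_dms_prereqs(dsn, user, password):
    with oracledb.connect(user=user, password=password, dsn=dsn) as conn:
        cur = conn.cursor()
        cur.execute("""
            SELECT log_mode, supplemental_log_data_min, supplemental_log_data_all
            FROM v$database
        """)
        log_mode, supp_min, supp_all = cur.fetchone()
        print(f"log_mode={log_mode}, supplemental_min={supp_min}, supplemental_all={supp_all}")
        if log_mode != "ARCHIVELOG":
            print("WARNING: database is not in ARCHIVELOG mode; LogMiner CDC will not work.")
        if supp_min == "NO":
            print("WARNING: minimal supplemental logging is not enabled.")

# check_dms_prereqs("oracle-prod.agency.gov:1521/FEDDB", "dms_user", "***")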

4. Oracle Data Pump + AzCopy

For very large databases (10+ TB), the fastest bulk path is an Oracle Data Pump export, an AzCopy upload to Azure Blob Storage, and a bulk import on the target.

4.1 Export with Data Pump

# Create directory object for export
sqlplus / as sysdba << 'EOF'
CREATE OR REPLACE DIRECTORY dp_export_dir AS '/opt/oracle/export';
GRANT READ, WRITE ON DIRECTORY dp_export_dir TO migration_user;
EXIT;
EOF

# Full database export (parallel)
expdp migration_user/*** \
    DIRECTORY=dp_export_dir \
    DUMPFILE=feddb_%U.dmp \
    LOGFILE=feddb_export.log \
    SCHEMAS=APP_SCHEMA \
    PARALLEL=8 \
    FILESIZE=10G \
    COMPRESSION=ALL \
    EXCLUDE=STATISTICS

# Table-level export for largest tables
expdp migration_user/*** \
    DIRECTORY=dp_export_dir \
    DUMPFILE=large_table_%U.dmp \
    LOGFILE=large_table_export.log \
    TABLES=APP_SCHEMA.TRANSACTIONS \
    PARALLEL=8 \
    QUERY="WHERE transaction_date >= TO_DATE('2020-01-01','YYYY-MM-DD')"

4.2 Transfer with AzCopy

# Install AzCopy
# Download from https://aka.ms/downloadazcopy-v10-linux

# Upload to Azure Blob Storage
azcopy copy '/opt/oracle/export/feddb_*.dmp' \
    'https://stmigration.blob.core.windows.net/oracle-export/?sv=...' \
    --recursive \
    --put-md5 \
    --log-level INFO

# For large transfers, raise AzCopy's request concurrency
# (note: --cap-mbps only limits throughput, it does not add streams)
export AZCOPY_CONCURRENCY_VALUE=32
azcopy copy '/opt/oracle/export/' \
    'https://stmigration.blob.core.windows.net/oracle-export/' \
    --recursive \
    --cap-mbps 5000
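
Before deleting the local dump files, the uploads can be spot-checked against the Content-MD5 that `--put-md5` stored on each blob. A minimal sketch using the azure-storage-blob SDK with DefaultAzureCredential (the azcopy example above uses a SAS URL instead); account URL, container, and local path are placeholders.

# Sketch: compare local dump-file MD5 digests with the Content-MD5 stored by
# azcopy --put-md5. Account URL, container, and local path are placeholders.
import hashlib
from pathlib import Path
from azure.identity import DefaultAzureCredential
from azure.storage.blob import ContainerClient

def verify_uploads(account_url, container, local_dir):
    client = ContainerClient(account_url, container, credential=DefaultAzureCredential())
    for path in sorted(Path(local_dir).glob("*.dmp")):
        digest = hashlib.md5()
        with path.open("rb") as f:
            for chunk in iter(lambda: f.read(8 * 1024 * 1024), b""):
                digest.update(chunk)
        props = client.get_blob_client(path.name).get_blob_properties()
        remote_md5 = bytes(props.content_settings.content_md5 or b"")
        print(f"{path.name}: {'OK' if digest.digest() == remote_md5 else 'MISMATCH'}")

# verify_uploads("https://stmigration.blob.core.windows.net", "oracle-export", "/opt/oracle/export")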

4.3 Import to target

For Azure SQL MI:

# Azure SQL MI cannot import Data Pump dumps directly; extract table data to
# CSV (or let SSMA/DMS move it) and bulk-load with BCP after schema conversion

# BCP bulk import (per table)
bcp dbo.transactions in /mnt/export/transactions.csv \
    -S mi-instance.database.windows.net \
    -d FEDDB \
    -U migration_admin \
    -c -t "," -r "\n" \
    -b 50000 \
    -h "TABLOCK"

For Azure PostgreSQL:

# Data Pump dumps cannot be restored into PostgreSQL directly;
# use ora2pg to extract the data as PostgreSQL-compatible COPY files, then load with psql

# Or use COPY command for CSV data
psql -h pg-flex.postgres.database.azure.com \
     -U migration_admin \
     -d feddb \
     -c "\COPY app_schema.transactions FROM '/mnt/export/transactions.csv' WITH CSV HEADER"

5. Azure Data Factory for ongoing replication

ADF provides scheduled batch replication from Oracle sources to Azure targets and OneLake.

5.1 Oracle source linked service

{
    "name": "OracleSource",
    "properties": {
        "type": "Oracle",
        "typeProperties": {
            "connectionString": "Host=oracle-prod.agency.gov;Port=1521;SID=FEDDB;User Id=adf_reader;Password=***;"
        }
    }
}
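
The same linked service can also be registered programmatically. A rough sketch, assuming the azure-mgmt-datafactory SDK's OracleLinkedService model; the subscription, resource group, factory name, and connection string are placeholders, and in practice the password belongs in Azure Key Vault rather than inline.

# Sketch: register the Oracle linked service with the azure-mgmt-datafactory SDK.
# Resource names and the connection string are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import LinkedServiceResource, OracleLinkedService

def create_oracle_linked_service(subscription_id):
    client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)
    oracle_ls = OracleLinkedService(
        connection_string=(
            "Host=oracle-prod.agency.gov;Port=1521;SID=FEDDB;"
            "User Id=adf_reader;Password=***;"  # in practice, reference Azure Key Vault
        ),
    )
    client.linked_services.create_or_update(
        "rg-migration", "adf-oracle-migration", "OracleSource",
        LinkedServiceResource(properties=oracle_ls),
    )

# create_oracle_linked_service("<subscription-id>")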

5.2 Incremental copy pipeline

{
    "name": "Oracle_Incremental_Copy",
    "properties": {
        "activities": [
            {
                "name": "LookupWatermark",
                "type": "Lookup",
                "typeProperties": {
                    "source": {
                        "type": "AzureSqlSource",
                        "sqlReaderQuery": "SELECT watermark_value FROM watermark_table WHERE table_name = 'transactions'"
                    },
                    "dataset": { "referenceName": "WatermarkDataset" }
                }
            },
            {
                "name": "CopyOracleToOneLake",
                "type": "Copy",
                "dependsOn": [
                    {
                        "activity": "LookupWatermark",
                        "dependencyConditions": ["Succeeded"]
                    }
                ],
                "typeProperties": {
                    "source": {
                        "type": "OracleSource",
                        "oracleReaderQuery": {
                            "value": "SELECT * FROM APP_SCHEMA.TRANSACTIONS WHERE UPDATED_AT > TO_TIMESTAMP('@{activity('LookupWatermark').output.firstRow.watermark_value}', 'YYYY-MM-DD HH24:MI:SS.FF')",
                            "type": "Expression"
                        }
                    },
                    "sink": {
                        "type": "LakehouseTableSink",
                        "tableActionOption": "Append"
                    }
                }
            },
            {
                "name": "UpdateWatermark",
                "type": "SqlServerStoredProcedure",
                "dependsOn": [
                    {
                        "activity": "CopyOracleToOneLake",
                        "dependencyConditions": ["Succeeded"]
                    }
                ],
                "typeProperties": {
                    "storedProcedureName": "sp_update_watermark",
                    "storedProcedureParameters": {
                        "table_name": { "value": "transactions" },
                        "watermark_value": { "value": "@{utcNow()}" }
                    }
                }
            }
        ],
        "parameters": {}
    }
}
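
The high-water-mark pattern is easier to see stripped of the pipeline JSON. A minimal sketch in plain Python, assuming python-oracledb against the source; read_watermark, save_watermark, and write_to_target are placeholder stubs for the watermark table and the OneLake/SQL sink. One refinement over the pipeline above: it advances the watermark to the maximum UPDATED_AT actually copied rather than utcNow(), which avoids missing rows that commit with earlier timestamps.

# Sketch of the watermark-driven incremental copy the pipeline above implements.
# oracledb is real; the three helpers below are placeholder stubs.
import oracledb

def read_watermark(table):
    # placeholder: read the last high-water mark from the watermark table
    return "2020-01-01 00:00:00.000000"

def save_watermark(table, value):
    # placeholder: persist the new high-water mark
    print(f"new watermark for {table}: {value}")

def write_to_target(rows):
    # placeholder: append rows to the OneLake / SQL MI target
    print(f"copied {len(rows)} rows")

def incremental_copy(dsn, user, password):
    watermark = read_watermark("transactions")
    with oracledb.connect(user=user, password=password, dsn=dsn) as conn:
        cur = conn.cursor()
        cur.execute(
            """
            SELECT t.*, t.updated_at
            FROM app_schema.transactions t
            WHERE t.updated_at > TO_TIMESTAMP(:wm, 'YYYY-MM-DD HH24:MI:SS.FF')
            ORDER BY t.updated_at
            """,
            wm=watermark,
        )
        rows = cur.fetchall()
    if rows:
        write_to_target(rows)
        new_watermark = max(r[-1] for r in rows)   # max UPDATED_AT copied, not utcNow()
        save_watermark("transactions", new_watermark)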

6. Fabric Mirroring for Oracle

Fabric Mirroring for Oracle (preview) provides near-real-time replication from Oracle databases to OneLake.

6.1 Supported sources

| Source | Mirroring support |
|---|---|
| Oracle Database@Azure | Preview |
| Azure SQL MI | GA |
| Azure SQL Database | GA |
| Cosmos DB | GA |
| On-premises Oracle (via Self-hosted IR) | Roadmap |

6.2 Setup in Fabric portal

1. Navigate to Fabric workspace
2. New > Mirrored Database > Oracle Database
3. Provide connection details:
   - Host: <exadata-ip-or-hostname>
   - Port: 1521
   - Service Name: FEDDB
   - Authentication: Username/Password or Managed Identity
4. Select tables to mirror
5. Configure mirroring options:
   - Initial snapshot: Yes
   - Continuous replication: Yes
6. Start mirroring

6.3 CSA-in-a-Box integration

Once mirrored, Oracle tables appear as Delta Lake tables in OneLake:

# In a Fabric notebook or dbt model, reference mirrored data
# dbt source definition:
# sources:
#   - name: oracle_mirror
#     schema: oracle_feddb
#     tables:
#       - name: transactions
#       - name: employees
#       - name: departments

# Fabric notebook (PySpark)
df = spark.read.format("delta").load(
    "abfss://workspace@onelake.dfs.fabric.microsoft.com/oracle_mirror.Lakehouse/Tables/transactions"
)
df.createOrReplaceTempView("transactions")
spark.sql("SELECT department_id, SUM(amount) FROM transactions GROUP BY department_id").show()

7. GoldenGate for CDC (Oracle-to-Oracle)

For Oracle Database@Azure migrations, GoldenGate provides real-time replication with zero downtime.

7.1 Architecture

Source Oracle ──► GoldenGate Extract ──► Trail Files ──► GoldenGate Replicat ──► Oracle DB@Azure
                                       (Optional: to Kafka/Event Hubs
                                        for Azure analytics integration)

7.2 GoldenGate for heterogeneous targets

GoldenGate also supports replication to non-Oracle targets:

| Target | GoldenGate support | Use case |
|---|---|---|
| Oracle DB@Azure | Full (native) | Oracle-to-Oracle migration |
| Azure SQL MI | Via GoldenGate for SQL Server | Heterogeneous replication |
| Azure PostgreSQL | Via GoldenGate for PostgreSQL | Heterogeneous replication |
| Kafka / Event Hubs | Via GoldenGate for Big Data | Analytics pipeline integration |

8. Data validation

8.1 Row count validation

-- Oracle source (NUM_ROWS reflects optimizer statistics; gather fresh stats or use COUNT(*) for exact counts)
SELECT table_name, num_rows
FROM all_tables
WHERE owner = 'APP_SCHEMA'
ORDER BY table_name;

-- Azure SQL MI target
SELECT t.name AS table_name,
       SUM(p.rows) AS row_count
FROM sys.tables t
JOIN sys.partitions p ON t.object_id = p.object_id AND p.index_id IN (0,1)
GROUP BY t.name
ORDER BY t.name;

-- PostgreSQL target (n_live_tup is an estimate; use COUNT(*) for exact counts)
SELECT schemaname, relname AS table_name, n_live_tup AS row_count
FROM pg_stat_user_tables
WHERE schemaname = 'app_schema'
ORDER BY relname;
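
The queries above can be wired into a single comparison. A minimal sketch, assuming python-oracledb for the source and pyodbc for Azure SQL MI; it compares exact COUNT(*) values per table, since NUM_ROWS and n_live_tup are only statistics. Connection strings and the table list are placeholders.

# Sketch: compare exact per-table row counts between the Oracle source and the
# Azure SQL MI target. Connection strings and table list are placeholders.
import oracledb
import pyodbc

TABLES = ["TRANSACTIONS", "EMPLOYEES", "DEPARTMENTS"]  # hypothetical table list

def compare_row_counts():
    ora = oracledb.connect(user="dms_user", password="***",
                           dsn="oracle-prod.agency.gov:1521/FEDDB")
    mi = pyodbc.connect(
        "DRIVER={ODBC Driver 18 for SQL Server};"
        "SERVER=mi-instance.database.windows.net;DATABASE=FEDDB;"
        "UID=migration_admin;PWD=***;Encrypt=yes"
    )
    try:
        for table in TABLES:
            ora_count = ora.cursor().execute(
                f"SELECT COUNT(*) FROM APP_SCHEMA.{table}").fetchone()[0]
            mi_count = mi.cursor().execute(
                f"SELECT COUNT(*) FROM dbo.{table}").fetchone()[0]
            flag = "" if ora_count == mi_count else "  <-- MISMATCH"
            print(f"{table}: oracle={ora_count}, sqlmi={mi_count}{flag}")
    finally:
        ora.close()
        mi.close()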

8.2 Checksum validation

-- Oracle: Generate table checksum
SELECT ORA_HASH(
    LISTAGG(employee_id || '|' || NVL(TO_CHAR(salary), 'NULL') || '|' || name, ',')
    WITHIN GROUP (ORDER BY employee_id)
) AS table_checksum
FROM employees;

-- Azure SQL MI: aggregate checksum (note: CHECKSUM_AGG and ORA_HASH use different
-- algorithms, so these values cannot be compared across engines directly)
SELECT CHECKSUM_AGG(CHECKSUM(employee_id, salary, name)) AS table_checksum
FROM employees;
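
Because the two engines hash differently (and LISTAGG is limited to 4000 or 32767 bytes, so the Oracle query only works for small tables), a portable alternative is to pull the same canonical projection from both systems and hash it identically outside the database. A minimal sketch, assuming python-oracledb and pyodbc; connections, columns, and table names are placeholders.

# Sketch: engine-neutral checksum. Pull the same ordered projection from both
# systems, hash it identically, and compare digests. Connections, table, and
# columns are placeholders; NULLs and numeric formatting must be normalised
# the same way on both sides.
import hashlib
import oracledb
import pyodbc

def table_digest(rows):
    h = hashlib.sha256()
    for row in rows:
        h.update("|".join("NULL" if v is None else str(v) for v in row).encode())
        h.update(b"\n")
    return h.hexdigest()

ora = oracledb.connect(user="dms_user", password="***",
                       dsn="oracle-prod.agency.gov:1521/FEDDB")
src = ora.cursor().execute(
    "SELECT employee_id, salary, name FROM app_schema.employees ORDER BY employee_id"
).fetchall()

mi = pyodbc.connect("DRIVER={ODBC Driver 18 for SQL Server};"
                    "SERVER=mi-instance.database.windows.net;DATABASE=FEDDB;"
                    "UID=migration_admin;PWD=***;Encrypt=yes")
tgt = mi.cursor().execute(
    "SELECT employee_id, salary, name FROM dbo.employees ORDER BY employee_id"
).fetchall()

print("match" if table_digest(src) == table_digest(tgt) else "MISMATCH")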

8.3 Business rule validation

-- Validate critical business aggregates match between source and target
-- Example: Total revenue by month should match exactly

-- Oracle source
SELECT TO_CHAR(transaction_date, 'YYYY-MM') AS month,
       SUM(amount) AS total_amount,
       COUNT(*) AS transaction_count
FROM transactions
GROUP BY TO_CHAR(transaction_date, 'YYYY-MM')
ORDER BY month;

-- Azure SQL MI target
SELECT FORMAT(transaction_date, 'yyyy-MM') AS month,
       SUM(amount) AS total_amount,
       COUNT(*) AS transaction_count
FROM transactions
GROUP BY FORMAT(transaction_date, 'yyyy-MM')
ORDER BY month;

9. Data migration best practices

  1. Migrate schema first, data second. Convert and validate the schema before moving data.
  2. Disable indexes and constraints during bulk load. Re-enable after data migration for faster loading.
  3. Parallelize large table migrations. Split large tables by partition key or date range (see the sketch after this list).
  4. Validate incrementally. Do not wait until all data is migrated to start validation.
  5. Plan for LOBs separately. BLOB and CLOB columns require special handling and are significantly slower to migrate.
  6. Test with production-scale data. Migration tools that work on 10 GB often fail at 1 TB.
  7. Keep Oracle read-only during cutover. Use ALTER DATABASE OPEN READ ONLY during final sync.
  8. Document the watermark. Record the exact timestamp or SCN used for the final incremental sync.

Maintainers: csa-inabox core team · Last updated: 2026-04-30