Oracle to Azure Migration -- Best Practices¶

Assessment methodology, complexity tiers, workload decomposition, application testing strategy, parallel-run validation, and CSA-in-a-Box integration for analytics on migrated Oracle data.

1. Assessment methodology¶

1.1 Discovery phase¶

Before any migration work begins, conduct a thorough Oracle estate discovery:

# Run SSMA Assessment Report (for Azure SQL MI targets)
# Produces: Conversion statistics, object inventory, complexity scoring

# Run ora2pg assessment (for PostgreSQL targets)
ora2pg -c ora2pg.conf -t SHOW_REPORT --estimate_cost

# Both tools provide:
# - Object count by type (tables, views, procedures, functions, packages)
# - PL/SQL line count and complexity
# - Oracle-specific feature usage
# - Estimated conversion effort

1.2 Assessment dimensions¶

Dimension	What to capture	How to capture
Schema complexity	Tables, views, sequences, indexes, constraints	SSMA assessment, ora2pg report
PL/SQL complexity	Procedures, functions, packages (line count, cyclomatic complexity)	SSMA detailed report, manual code review
Oracle feature usage	RAC, Data Guard, Partitioning, VPD, AQ, Spatial, Oracle Text	DBA interview + `V$OPTION`, `DBA_FEATURE_USAGE_STATISTICS`
Data volume	Database size, table sizes, growth rate	`DBA_SEGMENTS`, `DBA_TABLESPACES`
Application dependencies	Applications connecting to each database, connection methods	Application inventory, TNS listener logs
Performance baseline	Top SQL, execution frequency, response times	AWR reports, ASH data
Security model	Users, roles, VPD policies, TDE, audit policies	`DBA_USERS`, `DBA_ROLE_PRIVS`, `DBA_POLICIES`
Integration points	Database links, external tables, AQ subscribers, GoldenGate streams	`DBA_DB_LINKS`, `DBA_EXTERNAL_TABLES`

1.3 Oracle feature usage query¶

-- Run on each Oracle database to discover feature usage
SELECT name, currently_used, detected_usages, first_usage_date, last_usage_date
FROM dba_feature_usage_statistics
WHERE currently_used = 'TRUE'
  AND dbid = (SELECT dbid FROM v$database)
ORDER BY name;

-- Key features to watch for:
-- "Real Application Clusters (RAC)" -> Requires HA architecture decision
-- "Partitioning" -> Map to target partitioning
-- "Virtual Private Database (VPD)" -> Map to RLS
-- "Oracle Advanced Security" -> Map to TDE/Key Vault
-- "Oracle Spatial" -> Map to PostGIS or SQL Server Spatial
-- "Advanced Queuing" -> Map to Service Bus
-- "Oracle Text" -> Map to Full-Text Search
-- "In-Memory Column Store" -> Map to Columnstore indexes
-- "Oracle Data Guard" -> Map to geo-replication/failover groups

2. Complexity tiers¶

2.1 Tier classification¶

Classify each Oracle database into a complexity tier to guide migration approach and timeline:

Tier	Criteria	Migration approach	Timeline	Risk
Tier 1: Simple	< 50 tables, < 10 stored procs, no Oracle-specific features, < 10 GB	Automated (SSMA/ora2pg), minimal manual work	4-6 weeks	Low
Tier 2: Standard	50-200 tables, 10-100 stored procs, some DECODE/NVL patterns, 10-100 GB	Automated + manual fixes, focused testing	8-12 weeks	Medium
Tier 3: Complex	200+ tables, 100+ stored procs, packages, partitioning, triggers, 100 GB-1 TB	Automated assessment + significant manual PL/SQL conversion	16-24 weeks	High
Tier 4: Enterprise	500+ tables, complex PL/SQL packages (10K+ lines each), RAC, VPD, AQ, Spatial, > 1 TB	Phased migration, dedicated conversion team, extensive testing	24-40+ weeks	Very High

2.2 Scoring model¶

Assign points for each complexity factor:

Factor	Points	Description
PL/SQL lines: 0-1K	1	Trivial
PL/SQL lines: 1K-10K	3	Moderate
PL/SQL lines: 10K-50K	7	Complex
PL/SQL lines: 50K+	15	Enterprise
Oracle packages	3 per package	Each package requires decomposition
CONNECT BY queries	2 each	Recursive CTE conversion
Autonomous transactions	5 each	Significant redesign
VPD policies	3 each	RLS policy creation
Oracle Spatial usage	10	PostGIS or SQL Spatial migration
Advanced Queuing	10	Service Bus architecture
RAC dependency	15	HA architecture redesign
Database links	3 each	Cross-database query refactoring

Total score interpretation:

Score	Tier	Recommended approach
1-10	Tier 1 (Simple)	Automated migration, minimal manual work
11-30	Tier 2 (Standard)	Automated + focused manual conversion
31-60	Tier 3 (Complex)	Phased migration, dedicated team
61+	Tier 4 (Enterprise)	Consider Oracle DB@Azure for short-term, phased displacement

3. Workload decomposition¶

3.1 Decompose by migration target¶

Not all databases in an Oracle estate should go to the same target:

Oracle Estate (100 databases)
    │
    ├── Tier 1 + Tier 2 (70 databases) ──► Azure SQL MI or PostgreSQL
    │     Standard OLTP, moderate PL/SQL
    │     Timeline: 12-24 weeks (wave-based)
    │
    ├── Tier 3 (20 databases) ──► Azure SQL MI (with dedicated PL/SQL conversion)
    │     Complex PL/SQL, partitioning, triggers
    │     Timeline: 24-36 weeks
    │
    ├── Tier 4 (5 databases) ──► Oracle DB@Azure
    │     EBS, deep PL/SQL, RAC, cannot refactor
    │     Timeline: 8-12 weeks (lift and shift)
    │
    └── Retire (5 databases) ──► Archive and decommission
          Legacy, no active consumers
          Timeline: 4-8 weeks

3.2 Wave planning¶

Group databases into migration waves of 3-5 databases each:

Wave	Databases	Criteria	Duration
Wave 1 (Pilot)	2-3 Tier 1 databases	Lowest risk, highest visibility	6-8 weeks
Wave 2	5 Tier 1-2 databases	Standard OLTP, growing confidence	8-10 weeks
Wave 3	5-8 Tier 2 databases	Standard complexity, established patterns	8-10 weeks
Wave 4	5-8 Tier 2-3 databases	Increasing complexity	12-16 weeks
Wave 5	Remaining Tier 2-3	All remaining displacement targets	12-20 weeks
Wave 6	Tier 4 databases	Oracle DB@Azure (if applicable)	8-12 weeks

3.3 Pilot database selection criteria¶

Select the pilot database(s) to maximize learning while minimizing risk:

Tier 1 or low Tier 2 complexity
Non-mission-critical (dev/test acceptable for pilot)
Representative of common patterns in the estate
Willing application team / stakeholder
< 10 GB data volume
< 20 stored procedures
No Oracle-specific features (RAC, VPD, AQ)
Well-documented application with existing test suite

4. Application testing strategy¶

4.1 Testing layers¶

                    ┌────────────────────┐
                    │   User Acceptance  │  ← Business users validate workflows
                    │   Testing (UAT)    │
                    ├────────────────────┤
                    │   Performance      │  ← Load testing at production scale
                    │   Testing          │
                    ├────────────────────┤
                    │   Integration      │  ← Cross-system data flow validation
                    │   Testing          │
                    ├────────────────────┤
                    │   Functional       │  ← Feature-by-feature validation
                    │   Testing          │
                    ├────────────────────┤
                    │   Unit Testing     │  ← Stored procedure / function testing
                    │   (Database)       │
                    └────────────────────┘

4.2 Database unit testing¶

Test every converted stored procedure and function:

-- Azure SQL MI: tSQLt framework for database unit testing
-- Install tSQLt: https://tsqlt.org

EXEC tSQLt.NewTestClass 'TestEmployeeFunctions';
GO

CREATE PROCEDURE TestEmployeeFunctions.[test get_salary returns correct value]
AS
BEGIN
    -- Arrange
    EXEC tSQLt.FakeTable 'dbo.employees';
    INSERT INTO dbo.employees (employee_id, salary) VALUES (1001, 85000.00);

    -- Act
    DECLARE @result decimal(10,2);
    SET @result = dbo.get_salary(1001);

    -- Assert
    EXEC tSQLt.AssertEquals 85000.00, @result;
END;
GO

-- Run all tests
EXEC tSQLt.RunAll;

-- PostgreSQL: pgTAP framework for database unit testing
CREATE EXTENSION IF NOT EXISTS pgtap;

SELECT plan(3);

-- Test function returns correct value
SELECT is(
    app_schema.get_employee_salary(1001),
    85000.00::numeric,
    'get_salary returns correct value for employee 1001'
);

-- Test function returns NULL for non-existent employee
SELECT is(
    app_schema.get_employee_salary(99999),
    NULL::numeric,
    'get_salary returns NULL for non-existent employee'
);

-- Test procedure raises exception for invalid input
SELECT throws_ok(
    'CALL app_schema.update_salary(99999, 50000)',
    'P0001',
    'Employee not found'
);

SELECT * FROM finish();

4.3 Data validation queries¶

Run on both source (Oracle) and target (Azure) to compare:

-- 1. Row count per table
-- 2. Checksum per table (see data-migration.md)
-- 3. NULL count per column (detect conversion errors)
-- 4. Min/Max/Avg for numeric columns
-- 5. Distinct count for categorical columns
-- 6. Date range validation
-- 7. Foreign key integrity check
-- 8. Business aggregate validation (monthly totals, etc.)

4.4 Performance testing¶

# Use Apache JMeter, k6, or Locust for load testing

# k6 example: Test migrated API endpoint
# k6 run --vus 100 --duration 30m load-test.js

# Compare metrics:
# - Response time (p50, p95, p99)
# - Throughput (requests/second)
# - Error rate
# - Database CPU / memory / IOPS during test

5. Parallel-run validation¶

5.1 Parallel-run architecture¶

                    ┌──────────────┐
                    │  Application │
                    │  (writes to  │
                    │   Oracle)    │
                    └──────┬───────┘
                           │
                    ┌──────▼───────┐
                    │   Oracle     │ ← Primary (still active)
                    │   Source     │
                    └──────┬───────┘
                           │
                    ┌──────▼───────┐
                    │  Replication │  CDC / GoldenGate / ADF
                    │  Layer       │
                    └──────┬───────┘
                           │
                    ┌──────▼───────┐
                    │  Azure SQL   │ ← Secondary (shadow mode)
                    │  MI / PG     │
                    └──────┬───────┘
                           │
                    ┌──────▼───────┐
                    │  Validation  │  Row counts, checksums,
                    │  Framework   │  query result comparison
                    └──────────────┘

5.2 Validation framework¶

Run automated comparisons during the parallel-run period:

# Parallel-run validation script (Python)
import pyodbc
import cx_Oracle
import hashlib

def compare_table(table_name, oracle_conn, azure_conn):
    """Compare row counts and sample checksums between Oracle and Azure."""

    # Row count
    ora_count = oracle_conn.execute(
        f"SELECT COUNT(*) FROM {table_name}"
    ).fetchone()[0]

    az_count = azure_conn.execute(
        f"SELECT COUNT(*) FROM {table_name}"
    ).fetchone()[0]

    count_match = ora_count == az_count

    # Sample checksum (first 1000 rows by PK)
    # ... implementation depends on table structure

    return {
        "table": table_name,
        "oracle_count": ora_count,
        "azure_count": az_count,
        "count_match": count_match,
        "variance_pct": abs(ora_count - az_count) / max(ora_count, 1) * 100
    }

5.3 Cutover criteria¶

Proceed to cutover when all criteria are met:

Row counts match within 0.01% variance (accounts for in-flight transactions)
Business aggregates match exactly (monthly totals, balances)
All automated tests pass on target
Performance is within 20% of Oracle baseline (acceptable for cost savings)
No P1 or P2 defects open for 5+ consecutive business days
Application team sign-off
DBA team sign-off
Security team sign-off (FedRAMP controls validated)
Rollback plan tested and documented

6. CSA-in-a-Box integration for analytics¶

6.1 Post-migration analytics pattern¶

After migrating Oracle to Azure, integrate with CSA-in-a-Box for analytics:

Migrated Database (Azure SQL MI / PostgreSQL / Oracle DB@Azure)
    │
    ├── Fabric Mirroring (for SQL MI and Oracle DB@Azure)
    │   └── OneLake (Delta Lake tables)
    │
    ├── ADF Pipelines (for PostgreSQL and other sources)
    │   └── OneLake (Delta Lake tables)
    │
    └── CSA-in-a-Box Medallion Architecture
        ├── Bronze: Raw mirrored data (schema-on-read)
        ├── Silver: Cleaned, validated, typed (dbt models)
        ├── Gold: Business-ready aggregates (dbt models + contracts)
        │
        ├── Purview: Classifications, lineage, catalog
        ├── Power BI: Direct Lake semantic model + reports
        └── AI Foundry: Azure OpenAI for NL analytics

6.2 dbt model for migrated Oracle data¶

# domains/shared/dbt/models/sources.yml
sources:
    - name: oracle_migrated
      description: "Data migrated from Oracle Database to Azure"
      meta:
          migration_date: "2026-04-30"
          source_system: "Oracle 19c FEDDB"
          target_database: "Azure SQL MI"
      tables:
          - name: employees
            description: "Employee records (migrated from Oracle HR)"
            columns:
                - name: employee_id
                  description: "Primary key (was Oracle NUMBER(10))"
                  tests: [not_null, unique]
                - name: salary
                  description: "Employee salary (was Oracle NUMBER(10,2))"
                  tests: [not_null]

-- domains/shared/dbt/models/silver/stg_employees.sql
WITH source AS (
    SELECT * FROM {{ source('oracle_migrated', 'employees') }}
),

cleaned AS (
    SELECT
        employee_id,
        UPPER(TRIM(first_name)) AS first_name,
        UPPER(TRIM(last_name)) AS last_name,
        department_id,
        CAST(salary AS decimal(10,2)) AS salary,
        hire_date,
        CASE status
            WHEN 'A' THEN 'Active'
            WHEN 'I' THEN 'Inactive'
            WHEN 'T' THEN 'Terminated'
            ELSE 'Unknown'
        END AS status_description,
        CURRENT_TIMESTAMP AS _loaded_at
    FROM source
    WHERE employee_id IS NOT NULL
)

SELECT * FROM cleaned

6.3 Post-migration Purview setup¶

# Register migrated database in Purview
# Using Purview automation from CSA-in-a-Box:
# csa_platform/csa_platform/governance/purview/purview_automation.py

# 1. Register Azure SQL MI data source in Purview
# 2. Run scan to discover all tables and columns
# 3. Apply classifications:
#    - PII columns (SSN, email, phone) -> pii_classifications.yaml
#    - CUI columns (case data, security) -> government_classifications.yaml
#    - PHI columns (health data) -> phi_classifications.yaml
# 4. Verify lineage: Oracle -> ADF/Mirroring -> OneLake -> dbt -> Power BI

7. Common pitfalls and how to avoid them¶

Pitfall	Impact	Prevention
Underestimating PL/SQL complexity	Timeline overrun, budget overrun	Run SSMA/ora2pg assessment before committing to timeline
Ignoring Oracle DATE vs SQL Server DATE	Data truncation (time component lost)	Map Oracle DATE to datetime2(0), not date
Not testing at production scale	Performance surprises in production	Load test with production-volume data before cutover
Migrating analytics to OLTP database	Performance degradation on target	Use Fabric Mirroring + CSA-in-a-Box for analytics
Forgetting to disable Oracle monitoring	Alerts from decommissioned Oracle	Decommission Oracle monitoring agents after cutover
Not planning for rollback	Stuck with broken migration	Maintain Oracle read-only during parallel run
Converting everything at once	Risk concentration	Wave-based approach (3-5 databases per wave)
Ignoring connection pooling	Connection exhaustion on Azure	Implement PgBouncer or application-level pooling
Skipping index review	Poor query performance on target	Review and recreate indexes for target optimizer
Not updating statistics	Query plan regression	Run statistics update after data migration

8. Post-migration optimization checklist¶

Maintainers: csa-inabox core team Last updated: 2026-04-30