Migration Best Practices: Informatica to Azure¶

Lessons learned, common pitfalls, and proven patterns for successful Informatica-to-Azure migrations.

Pre-migration assessment checklist¶

Before committing to a migration, complete this assessment to scope the effort accurately and identify risks early.

Informatica inventory¶

Complexity scoring¶

Score each mapping on a 1-5 scale for each dimension:

Dimension	1 (Simple)	3 (Medium)	5 (Complex)
Transformation count	1-3	4-10	11+
Source count	1	2-3	4+
Lookup count	0	1-3	4+
Custom code (Java/SQL)	None	Some expressions	Custom Java/Python
SCD handling	None	Type 1	Type 2 or 3
Error handling	None	Basic	Complex (router, error rows)
Reusable components	None	Uses mapplets	Creates/modifies mapplets
Data volume (daily rows)	<1M	1-50M	>50M

Total score interpretation:

Score range	Tier	Migration action	Estimated effort
8-12	A (Simple)	Direct convert to dbt	1-2 days
13-24	B (Medium)	Decompose and refactor	3-7 days
25-35	C (Complex)	Re-architect	7-20 days
Any	D (Decommission)	Archive metadata, delete	0.5 days

Risk register template¶

Risk	Likelihood	Impact	Mitigation
PowerCenter developers resist code-first approach	High	High	Invest in training early; pair programming with dbt-experienced engineers
MDM replacement takes longer than estimated	High	High	Start MDM track independently; consider Profisee as interim
IDQ rule discovery is incomplete	Medium	High	Parallel run for 6+ weeks with comprehensive reconciliation
ADF Self-Hosted IR performance issues	Medium	Medium	Size IR VMs appropriately; test with production data volumes
License termination penalties	Medium	Medium	Review contract terms; negotiate with Informatica early
Downstream consumer disruption	High	High	Communicate early; provide parallel run period
Data format differences (Oracle vs Azure SQL)	Medium	Low	Test data type conversions; document edge cases
Network latency (on-prem sources via Self-Hosted IR)	Low	Medium	Deploy IR close to source; optimize query pushdown

Conversion priority strategy¶

Principle: Simple first, complex last¶

Migrate in order of complexity, not business importance. This builds team confidence and establishes patterns before tackling hard problems.

graph LR
    A[Tier A: Simple mappings] --> B[Tier B: Medium mappings]
    B --> C[Tier C: Complex mappings + DQ]
    C --> D[MDM replacement]

    style A fill:#90EE90
    style B fill:#FFFF99
    style C fill:#FFB347
    style D fill:#FF6B6B

Wave planning by business domain¶

Within each tier, organize by business domain for atomic cutovers:

Wave	Business domain	Tier A mappings	Tier B mappings	Tier C mappings	Duration
1 (Pilot)	Finance (GL, AP)	15	5	2	8-10 weeks
2	Sales (Orders, Revenue)	20	10	5	10-14 weeks
3	HR + Operations	25	15	5	12-16 weeks
4	DQ + Governance	-	10	15	10-14 weeks
5	MDM (if applicable)	-	-	10	16-24 weeks

What to migrate first within each wave¶

Reference data pipelines -- lookups, dimension loads (lowest risk, highest reuse)
Simple fact loads -- extract, transform, load (establish the pattern)
Incremental loads -- establish watermark and incremental patterns
Complex transformations -- multi-source joins, SCD, business logic
DQ rules -- after data flows are migrated, apply quality checks
MDM workflows -- last, after all source pipelines are stable

IICS vs PowerCenter: migration differences¶

Key differences in migration approach¶

Aspect	PowerCenter migration	IICS migration
Infrastructure	Decommission servers, DR, network	Cancel SaaS subscription
Complexity	Higher (on-prem, more legacy)	Lower (already cloud)
Agent migration	N/A (PowerCenter uses Integration Service)	Secure Agent -> Self-Hosted IR
Deployment model	Repository export -> Git + CI/CD	Org export -> Git + CI/CD
Typical estate age	10-20 years	3-8 years
Legacy accumulation	High (many Tier D candidates)	Lower (newer estate)
Team readiness	Often needs more retraining	Often closer to cloud-ready
Timeline	12-36 months	8-24 months

IICS-specific considerations¶

IPU analysis first. Analyze actual IPU consumption to right-size Azure resources
Connector mapping. IICS premium connectors may have free ADF equivalents
CLAIRE AI features. Some IICS-specific AI recommendations have no direct equivalent; document which you rely on
Application Integration. IICS CAI -> Logic Apps requires architecture redesign, not just mapping
Data Integration Elastic. Elastic mappings on Spark may need Databricks, not just dbt

Parallel-run validation¶

Why parallel run is mandatory¶

Parallel run is the highest-confidence validation strategy. Both platforms run the same logic; you compare outputs.

Parallel-run setup¶

graph TB
    Source[(Source Data)] --> PC[PowerCenter Pipeline]
    Source --> ADF[ADF + dbt Pipeline]
    PC --> PC_Target[(PowerCenter Output)]
    ADF --> ADF_Target[(dbt Output)]
    PC_Target --> Reconcile{Reconciliation Report}
    ADF_Target --> Reconcile
    Reconcile --> Match[Match: Continue migration]
    Reconcile --> Mismatch[Mismatch: Investigate and fix]

Reconciliation checks¶

Check	SQL pattern	Tolerance
Row count match	`SELECT COUNT() FROM pc_output EXCEPT SELECT COUNT() FROM dbt_output`	0 rows difference
Sum of numeric columns	`SELECT ABS(SUM(pc.amount) - SUM(dbt.amount)) FROM ...`	< $0.01 (rounding)
Distinct key count	`SELECT COUNT(DISTINCT key) FROM ...`	0 difference
NULL distribution	`SELECT COUNT(*) WHERE col IS NULL FROM ...`	0 difference
Date range coverage	`SELECT MIN(date), MAX(date) FROM ...`	Identical ranges
String value comparison (sample)	`SELECT TOP 100 pc.val, dbt.val FROM ... WHERE pc.val != dbt.val`	0 differences (or documented)

Automated reconciliation dbt test¶

-- tests/reconciliation/assert_daily_sales_matches.sql
WITH pc AS (
    SELECT
        order_date,
        COUNT(*) AS row_count,
        SUM(order_amount) AS total_amount
    FROM {{ source('powercenter', 'fact_daily_sales_pc') }}
    GROUP BY order_date
),

dbt_output AS (
    SELECT
        order_date,
        COUNT(*) AS row_count,
        SUM(order_amount) AS total_amount
    FROM {{ ref('fct_daily_sales') }}
    GROUP BY order_date
)

SELECT
    COALESCE(pc.order_date, d.order_date) AS order_date,
    pc.row_count AS pc_rows,
    d.row_count AS dbt_rows,
    pc.total_amount AS pc_amount,
    d.total_amount AS dbt_amount,
    ABS(COALESCE(pc.total_amount, 0) - COALESCE(d.total_amount, 0)) AS amount_diff
FROM pc
FULL OUTER JOIN dbt_output d ON pc.order_date = d.order_date
WHERE pc.row_count != d.row_count
   OR ABS(COALESCE(pc.total_amount, 0) - COALESCE(d.total_amount, 0)) > 0.01

Parallel-run duration guidelines¶

Migration complexity	Recommended parallel-run duration	Notes
Simple pipelines (Tier A)	7-14 days	Quick validation
Medium pipelines (Tier B)	14-21 days	Ensure edge cases are covered
Complex pipelines (Tier C)	21-30 days	Cover month-end, quarter-end cycles
DQ rules	30 days	Must cover all rule trigger conditions
MDM workflows	45-60 days	Golden record reconciliation takes time

Team retraining: from GUI to code-first¶

The skill transition¶

PowerCenter skill	dbt equivalent	Training path
PowerCenter Designer (visual mapping)	dbt model (SQL file)	SQL fundamentals + dbt tutorial
Workflow Manager (GUI orchestration)	ADF pipeline (JSON/Bicep)	ADF portal tutorial + Bicep basics
Session log analysis	dbt logs + ADF Monitor	Azure Monitor training
Repository management	Git (branch, commit, PR)	Git fundamentals (1-2 days)
Informatica Admin Console	Azure Portal + dbt Cloud	Azure fundamentals + dbt Cloud tour

Training program structure¶

Week	Topic	Activities	Outcome
1	SQL refresher	SQL exercises on real data	Comfortable with CTEs, JOINs, window functions
2	dbt fundamentals	dbt tutorial (https://courses.getdbt.com)	Build first dbt project
3	dbt intermediate	Incremental models, snapshots, macros, tests	Convert first Tier A mapping
4	ADF fundamentals	ADF portal + pipeline creation	Build first ADF pipeline
5	Git workflow	Branch, commit, PR, code review	Complete first PR-based deployment
6	Paired conversion	Convert Tier A/B mappings with mentor	Build confidence and speed
7-8	Independent conversion	Convert mappings independently; mentor available	Fully productive

Common retraining challenges¶

Challenge	Solution
"I can't see the data flow visually"	Use dbt docs graph + CTRL+click through `ref()` chains; ADF pipeline view provides visual orchestration
"How do I debug without session log?"	`dbt debug` + `{{ log() }}` + Azure SQL query profiler + ADF Monitor
"I miss drag-and-drop"	ADF Mapping Data Flows provide visual interface for cases that need it; dbt is faster for most work
"Git is confusing"	Start with GitHub Desktop (GUI); transition to CLI later
"I don't know SQL well enough"	SQL is the foundation; invest 1-2 weeks in intensive SQL training before dbt
"Testing feels like extra work"	Tests save time in production; demonstrate with a caught-by-test example

Success metrics for retraining¶

Metric	Target	Measurement
Time to first dbt model	< 1 week	Tracked per developer
Mappings converted per week	3-5 (Tier A) by week 4	Weekly sprint velocity
dbt test coverage	> 80% of models have tests	`dbt test` output
Code review turnaround	< 24 hours	PR metrics
Developer satisfaction	> 70% positive	Monthly survey

Common pitfalls and how to avoid them¶

1. Trying to recreate PowerCenter's visual UX in dbt¶

Problem: Teams spend weeks trying to build visual mapping tools on top of dbt or using ADF Mapping Data Flows for everything.

Solution: Accept the paradigm shift. dbt is code-first. The visual UX is dbt docs generate (for documentation) and ADF pipeline designer (for orchestration). Use Mapping Data Flows only for flows that require a visual interface (analyst-built, non-production).

2. Migrating everything at once (big bang)¶

Problem: Attempting to migrate the entire Informatica estate in one release. Massive risk, long stabilization period, overwhelmed team.

Solution: Wave-based migration by business domain. Each wave has its own parallel run, cutover, and stabilization. Never more than one wave in flight.

3. Leaving DQ rules unimplemented¶

Problem: PowerCenter mappings are migrated but IDQ rules are deferred. Data quality issues surface in production weeks later.

Solution: For every mapping migrated, migrate its associated DQ rules at the same time. Use dbt tests as the primary mechanism. Never cut over a pipeline without its DQ tests in place.

4. Underestimating MDM replacement¶

Problem: MDM is treated as "just another migration" and scoped at 4-6 weeks. It invariably takes 4-6 months.

Solution: Scope MDM as a separate workstream with its own timeline. Consider Profisee as a purpose-built replacement. Assess whether full MDM is actually needed (often it is not).

5. Not training PowerCenter developers¶

Problem: PowerCenter developers are expected to write dbt models without training. They struggle, become frustrated, and resist the migration.

Solution: Invest 2-4 weeks in structured training before starting conversion work. Pair experienced dbt developers with PowerCenter developers. Budget 15-20% of migration cost for training.

6. Sequencing by mapping count instead of business domain¶

Problem: Team migrates the "easiest" mappings first regardless of business domain. This means no domain can fully cut over until all domains' easiest mappings are done.

Solution: Sequence by business domain. Migrate ALL of Finance (simple through complex) before starting Sales. This enables atomic domain cutover.

7. Forgetting B2B/EDI¶

Problem: B2B Data Exchange and EDI integrations are discovered late in the migration and require separate architecture (Logic Apps + APIM).

Solution: Inventory B2B/EDI in the assessment phase. Plan Logic Apps + APIM replacement explicitly. B2B is often a separate team and a separate workstream.

8. Not planning for license termination¶

Problem: Migration completes but Informatica license auto-renews because no one sent the termination notice (typically 90-day notice required).

Solution: Review Informatica contract terms during assessment. Send termination notice at the start of the last wave, timed to expire after parallel run completes.

9. Ignoring the repository metadata¶

Problem: PowerCenter repository metadata (lineage, mapping history, session statistics) is deleted when servers are decommissioned. Historical lineage is lost.

Solution: Export PowerCenter repository metadata to Purview before decommission. Use Purview's custom lineage API to preserve historical data flow documentation.

10. Over-engineering the target architecture¶

Problem: Team designs a complex multi-service architecture (ADF + Databricks + Synapse + Fabric + dbt) when a simpler stack would suffice.

Solution: Start with ADF + dbt + Azure SQL. Add Databricks only for Spark workloads. Add Synapse only for very large analytical queries. Add Fabric only for specific Fabric features (Direct Lake, mirroring). Keep it simple.

Team structure recommendations¶

Core migration team (mid-sized engagement)¶

Role	Count	Responsibility
Migration lead / architect	1	Overall architecture, risk management, stakeholder communication
dbt developer (senior)	1-2	Establish patterns, macros, project structure; mentor others
dbt developer (converting)	2-4	Convert mappings to dbt models; former PowerCenter developers
ADF developer	1	Pipeline creation, Linked Services, IR configuration, triggers
DQ engineer	1	Convert IDQ rules to dbt tests + Great Expectations
Platform engineer	1	Bicep IaC, CI/CD, Azure Monitor, Purview automation
Change management	0.5	Training, communication, user feedback
PowerCenter SME	1	Knowledge transfer, mapping analysis, validation

Timeline estimation guidelines¶

Estate size	Mappings	Workflows	IDQ rules	MDM	Estimated duration
Small	50-100	10-30	0-50	No	12-16 weeks
Medium	100-300	30-100	50-200	No	16-28 weeks
Medium + DQ	100-300	30-100	200+	No	24-36 weeks
Large	300-800	100-300	200+	Yes	36-52 weeks
Enterprise	800+	300+	500+	Yes	52-78 weeks

Multiply by 1.2x for federal/government deployments (compliance overhead).

Success criteria¶

Define clear, measurable success criteria before starting:

Criterion	Target	How to measure
Data accuracy	100% match with PowerCenter output	Reconciliation tests (automated)
Pipeline reliability	99.5% success rate (30-day rolling)	ADF Monitor metrics
Performance	Equal or better than PowerCenter	Execution time comparison
Cost	60%+ reduction in year 1	Azure Cost Management vs Informatica invoices
Developer velocity	2x improvement by month 6	Pipelines delivered per sprint
Test coverage	80%+ of models have automated tests	dbt test output
Team satisfaction	70%+ positive by month 3	Monthly survey
Downtime during cutover	< 4 hours per wave	Measured per cutover

PowerCenter Migration Guide -- Detailed PowerCenter-specific guidance
IICS Migration Guide -- IICS-specific migration
Data Quality Migration Guide -- IDQ replacement patterns
MDM Migration Guide -- MDM replacement architecture
Benchmarks -- Performance comparisons
Migration Playbook -- End-to-end migration guide

Last updated: 2026-04-30 Maintainers: CSA-in-a-Box core team Related: Migration Playbook | Tutorials | Benchmarks