Prep Migration: Tableau Prep to Power Query / Dataflows / dbt¶
A comprehensive guide for migrating Tableau Prep Builder flows to Power Query M, Dataflow Gen2, or dbt models.
Overview¶
Tableau Prep Builder provides a visual, node-based data preparation experience. Power BI offers three alternatives, each suited to different scenarios:
- Power Query (M language) — built into Power BI Desktop, good for light-to-medium transforms
- Dataflow Gen2 — cloud-based, reusable, shareable Power Query with scheduled refresh
- dbt (data build tool) — SQL-based transformation in the data warehouse layer
This guide maps every Tableau Prep step type to its equivalent in each tool and provides guidance on which tool to choose.
1. When to use which tool¶
1.1 Decision matrix¶
| Scenario | Recommended tool | Why |
|---|---|---|
| Light transforms (rename, filter, type cast) | Power Query in semantic model | Fastest; stays in the Power BI workflow |
| Reusable transforms shared across reports | Dataflow Gen2 | Cloud-based; multiple reports consume the output |
| Complex business logic (multi-table joins, aggregations) | dbt on csa-inabox | SQL-based; version-controlled; testable; runs in the data warehouse |
| Data scientists / advanced analysts | Fabric notebooks (Python/Spark) | Full programming language for complex preparation |
| Simple file unions (monthly CSV files) | Power Query in semantic model | Folder connector handles this natively |
| ETL replacing Prep Conductor scheduling | Dataflow Gen2 or dbt + ADF/Fabric pipeline | Both support scheduled execution |
1.2 The csa-inabox recommendation¶
flowchart TD
START[Tableau Prep flow] --> Q1{Is the logic<br/>complex SQL?}
Q1 -->|Yes| DBT[dbt model<br/>Silver/Gold layer]
Q1 -->|No| Q2{Is the output<br/>shared across<br/>multiple reports?}
Q2 -->|Yes| DF[Dataflow Gen2<br/>Reusable cloud transform]
Q2 -->|No| Q3{Light shaping<br/>only?}
Q3 -->|Yes| PQ[Power Query<br/>In semantic model]
Q3 -->|No| DF

**Prefer dbt for transformation logic**
With csa-inabox, transformation logic should live in dbt models, not in Power Query or Dataflows. dbt models are version-controlled, testable, and produce Delta tables that Power BI can read via Direct Lake. Power Query should handle only the last mile of shaping between Gold tables and the semantic model.
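As an illustration of that last mile, here is a minimal sketch that reads a dbt-produced Gold table through the Lakehouse SQL endpoint and applies only presentation-level renames; the endpoint, database, table, and column names are hypothetical.
// Last-mile shaping only: read a Gold table and rename columns for the semantic model
// (endpoint, database, table, and column names are hypothetical)
let
    Source = Sql.Database(
        "your-endpoint.datawarehouse.fabric.microsoft.com", "GoldLakehouse"),
    RegionalSales = Source{[Schema = "dbo", Item = "regional_sales"]}[Data],
    Renamed = Table.RenameColumns(RegionalSales,
        {{"total_sales", "Total Sales"}, {"customer_count", "Customer Count"}})
in
    Renamed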
2. Step-by-step mapping: Tableau Prep to Power Query¶
2.1 Input step¶
| Tableau Prep | Power Query equivalent | M code example |
|---|---|---|
| Connect to database table | Get Data → Database connector | = Sql.Database("server", "database") |
| Connect to CSV file | Get Data → Text/CSV | = Csv.Document(File.Contents("path")) |
| Connect to Excel file | Get Data → Excel | = Excel.Workbook(File.Contents("path")) |
| Connect to Tableau Extract | Not applicable | Extracts do not exist in Power BI; connect to the original source instead |
| Connect to published data source | Live connection to semantic model | Use "Power BI datasets" connector |
| Wildcard union (multiple files) | Folder connector | = Folder.Files("folder_path") then combine |
Example: Folder connector for multiple CSV files
// Tableau Prep: Input → Wildcard Union → *.csv from folder
// Power Query M:
let
    Source = Folder.Files("\\server\data\monthly"),
    // Keep only CSV files (case-insensitive extension match)
    FilteredCSV = Table.SelectRows(Source, each Text.Lower([Extension]) = ".csv"),
    // Parse each file and promote its first row to headers before combining,
    // so header rows do not end up as data rows in the combined table
    Parsed = Table.AddColumn(FilteredCSV, "Data",
        each Table.PromoteHeaders(
            Csv.Document([Content], [Delimiter=",", Encoding=65001]))),
    CombinedData = Table.Combine(Parsed[Data])
in
    CombinedData
2.2 Clean step¶
| Tableau Prep operation | Power Query equivalent | M code |
|---|---|---|
| Rename column | Right-click → Rename | = Table.RenameColumns(Source, {{"old", "new"}}) |
| Remove column | Right-click → Remove | = Table.RemoveColumns(Source, {"ColName"}) |
| Change data type | Transform → Data Type | = Table.TransformColumnTypes(Source, {{"Col", type number}}) |
| Filter rows | Home → Keep Rows / Remove Rows | = Table.SelectRows(Source, each [Col] > 100) |
| Replace values | Transform → Replace Values | = Table.ReplaceValue(Source, "old", "new", Replacer.ReplaceText, {"Col"}) |
| Split column | Transform → Split Column | = Table.SplitColumn(Source, "Col", Splitter.SplitTextByDelimiter("-")) |
| Merge columns | Add Column → Custom Column | = Table.AddColumn(Source, "Full", each [First] & " " & [Last]) |
| Remove duplicates | Home → Remove Rows → Remove Duplicates | = Table.Distinct(Source, {"KeyCol"}) |
| Remove nulls | Column filter → Remove Empty | = Table.SelectRows(Source, each [Col] <> null) |
| Calculate field (row-level) | Add Column → Custom Column | = Table.AddColumn(Source, "NewCol", each [Price] * [Qty]) |
| Group & Replace (manual mapping) | Replace Values or conditional column | = Table.AddColumn(Source, "Group", each if [Region] = "NY" then "Northeast" else [Region]) |
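Example: Chained clean step. The operations above compose as sequential steps in a single query; a minimal sketch with hypothetical column names:
// Tableau Prep: Clean Step → rename, change type, filter, remove duplicates
// Power Query M (column names hypothetical):
let
    Source = ...,
    Renamed = Table.RenameColumns(Source, {{"cust_nm", "CustomerName"}}),
    Typed = Table.TransformColumnTypes(Renamed, {{"Amount", type number}}),
    Filtered = Table.SelectRows(Typed, each [Amount] > 0),
    Deduped = Table.Distinct(Filtered, {"OrderID"})
in
    Deduped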
2.3 Join step¶
| Tableau Prep join | Power Query merge | Notes |
|---|---|---|
| Inner join | Merge Queries → Inner | JoinKind.Inner |
| Left join | Merge Queries → Left Outer | JoinKind.LeftOuter |
| Right join | Merge Queries → Right Outer | JoinKind.RightOuter |
| Full outer join | Merge Queries → Full Outer | JoinKind.FullOuter |
| Left anti join (not in right) | Merge Queries → Left Anti | JoinKind.LeftAnti |
| Right anti join | Merge Queries → Right Anti | JoinKind.RightAnti |
| Multi-key join | Merge on multiple columns | Select multiple columns in the merge dialog |
Example: Left outer join
// Tableau Prep: Join Step → Left Join on CustomerID
// Power Query M:
let
Orders = ...,
Customers = ...,
Merged = Table.NestedJoin(
Orders, {"CustomerID"},
Customers, {"CustomerID"},
"CustomerDetails",
JoinKind.LeftOuter
),
Expanded = Table.ExpandTableColumn(Merged, "CustomerDetails",
{"CustomerName", "Region", "Segment"})
in
Expanded
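For a multi-key join, pass multiple columns in each key list; the Shipments table and its key columns below are hypothetical:
// Multi-key merge (second table and key columns hypothetical):
Merged = Table.NestedJoin(
    Orders, {"CustomerID", "OrderDate"},
    Shipments, {"CustomerID", "OrderDate"},
    "ShipmentDetails",
    JoinKind.LeftOuter
)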
2.4 Union step¶
| Tableau Prep union | Power Query append | Notes |
|---|---|---|
| Union two tables | Append Queries | = Table.Combine({Table1, Table2}) |
| Union multiple tables | Append Queries (three or more) | Select multiple tables in the Append dialog |
| Wildcard union (files) | Folder connector | Combine files from a folder |
| Mismatched columns | Append handles automatically | Columns not in both tables get null values |
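Example: Append with mismatched columns. Table.Combine aligns columns by name; in this sketch (hypothetical tables), rows from the table missing a column receive null in that column:
// Power Query M (table and column names hypothetical):
let
    Sales2023 = #table({"Product", "Sales"}, {{"A", 100}}),
    Sales2024 = #table({"Product", "Sales", "Channel"}, {{"B", 200, "Web"}}),
    // Sales2023 rows get null in the Channel column
    Combined = Table.Combine({Sales2023, Sales2024})
in
    Combined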
2.5 Pivot and unpivot¶
| Tableau Prep pivot | Power Query equivalent | Notes |
|---|---|---|
| Pivot (rows to columns) | Transform → Pivot Column | Select value column, choose aggregation |
| Unpivot (columns to rows) | Transform → Unpivot Columns | Select columns to unpivot |
| Unpivot other columns | Transform → Unpivot Other Columns | Select columns to keep, unpivot the rest |
Example: Unpivot monthly columns
// Tableau Prep: Pivot → Columns to Rows on month columns
// Source table: Product, Jan, Feb, Mar, Apr, ...
// Power Query M:
let
Source = ...,
Unpivoted = Table.UnpivotOtherColumns(Source, {"Product"}, "Month", "Sales")
in
Unpivoted
// Result: Product | Month | Sales (one row per product per month)
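The reverse direction (Rows to Columns) uses Table.Pivot with an aggregation, here applied to the Unpivoted result above:
// Tableau Prep: Pivot → Rows to Columns
// Power Query M:
= Table.Pivot(Unpivoted, List.Distinct(Unpivoted[Month]), "Month", "Sales", List.Sum)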
2.6 Aggregate step¶
| Tableau Prep aggregate | Power Query Group By | Notes |
|---|---|---|
| Group by dimensions, aggregate measures | Transform → Group By | = Table.Group(Source, {"Dim"}, {{"Total", each List.Sum([Amount])}}) |
| Multiple aggregations | Group By with multiple columns | Add multiple aggregation columns in the dialog |
| Count distinct | Group By → Count Distinct Rows | each Table.RowCount(Table.Distinct(_)) |
Example: Group by with multiple aggregations
// Tableau Prep: Aggregate → Group by Region, SUM(Sales), AVG(Profit)
// Power Query M:
let
Source = ...,
Grouped = Table.Group(Source, {"Region"}, {
{"Total Sales", each List.Sum([Sales]), type number},
{"Avg Profit", each List.Average([Profit]), type number},
{"Order Count", each Table.RowCount(_), Int64.Type}
})
in
Grouped
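To count distinct values of a single column (rather than distinct whole rows as in the table above), aggregate over the column as a list; the CustomerID column is hypothetical:
// Additional aggregation column inside Table.Group (column name hypothetical):
{"Distinct Customers", each List.Count(List.Distinct([CustomerID])), Int64.Type}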
2.7 Output step¶
| Tableau Prep output | Power Query equivalent | Notes |
|---|---|---|
| Published data source | Semantic model (dataset) | Publish to Power BI Service |
| Hyper file extract | Import mode dataset | Data stored in Power BI |
| CSV file | Not applicable in Power Query | Use Power Automate or ADF for file output |
| Database table | Dataflow Gen2 → Lakehouse table | Dataflow outputs to Fabric Lakehouse |
3. Step-by-step mapping: Tableau Prep to dbt¶
3.1 When to use dbt instead of Power Query¶
Use dbt when:
- The transformation involves complex SQL joins across multiple tables
- The logic should be version-controlled and tested
- Multiple consumers (Power BI, APIs, notebooks) need the same output
- The transformation is part of the csa-inabox Silver/Gold layer
3.2 Prep steps to dbt equivalents¶
| Tableau Prep step | dbt equivalent | Example |
|---|---|---|
| Input | source() macro | {{ source('raw', 'orders') }} |
| Clean (filter, rename) | SQL SELECT with aliases | SELECT col AS new_name FROM source WHERE condition |
| Join | SQL JOIN | FROM orders LEFT JOIN customers ON ... |
| Union | SQL UNION ALL | SELECT * FROM table1 UNION ALL SELECT * FROM table2 |
| Pivot | SQL PIVOT or CASE WHEN | Use conditional aggregation |
| Unpivot | SQL UNPIVOT or UNION | Stack columns with UNION ALL |
| Aggregate | SQL GROUP BY | SELECT dim, SUM(measure) FROM ... GROUP BY dim |
| Output | dbt materialization (table/view) | {{ config(materialized='table') }} |
3.3 Example: Complete Prep flow as dbt model¶
-- Tableau Prep flow:
-- Input: raw_orders + raw_customers
-- Join: LEFT JOIN on customer_id
-- Filter: order_date >= '2024-01-01'
-- Calculate: total_amount = quantity * unit_price
-- Aggregate: by region, SUM(total_amount)
-- Output: regional_sales
-- dbt model: models/gold/regional_sales.sql
{{ config(materialized='table') }}
WITH orders AS (
SELECT * FROM {{ ref('stg_orders') }}
WHERE order_date >= '2024-01-01'
),
customers AS (
SELECT * FROM {{ ref('stg_customers') }}
),
joined AS (
SELECT
o.*,
c.region,
c.segment,
o.quantity * o.unit_price AS total_amount
FROM orders o
LEFT JOIN customers c ON o.customer_id = c.customer_id
),
aggregated AS (
SELECT
region,
SUM(total_amount) AS total_sales,
COUNT(DISTINCT customer_id) AS customer_count,
AVG(total_amount) AS avg_order_value
FROM joined
GROUP BY region
)
SELECT * FROM aggregated
4. Prep Conductor to Dataflow/dbt scheduling¶
4.1 Scheduling comparison¶
| Tableau Prep Conductor | Power BI / Fabric equivalent | Notes |
|---|---|---|
| Linked task (run flow after extract) | Dataflow refresh → Dataset refresh chain | Configure dataset to refresh after dataflow |
| Schedule (time-based) | Scheduled refresh | Configure in dataset/dataflow settings |
| Ad-hoc run | Manual refresh button | "Refresh now" in Power BI Service |
| Flow failure notification | Refresh failure email | Configure in dataset settings |
| Flow run history | Refresh history | View in dataset/dataflow settings |
| Prep Conductor license | Included in Fabric/Premium | No additional license needed |
4.2 Refresh chain example¶
// Tableau: Prep flow runs → publishes data source → extract refreshes → workbook updated
// Power BI / Fabric:
// 1. Fabric Pipeline runs dbt models (updates Gold tables)
// 2. Dataflow Gen2 refreshes (if used for additional shaping)
// 3. Semantic model refresh triggers (Import mode) or data is immediately available (Direct Lake)
// 4. Subscriptions send updated reports
// Configuration:
// - Fabric Pipeline: scheduled via Fabric or ADF
// - Dataflow: scheduled in workspace settings
// - Dataset: "Refresh after dataflow" option or scheduled independently
5. Migration checklist for Prep flows¶
For each Tableau Prep flow:
- Document the flow structure (input, clean, join, aggregate, output steps)
- Identify the source systems and connection details
- Determine the appropriate target tool (Power Query, Dataflow Gen2, or dbt)
- Rebuild the transformation logic in the target tool
- Validate output by comparing row counts and key measures (see the validation sketch after this checklist)
- Configure scheduling (refresh frequency, chaining)
- Set up failure notifications
- Update dependent reports to use the new data source
- Decommission the Tableau Prep flow
- Remove the Prep Conductor schedule from Tableau Server
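A quick validation sketch in M, assuming the migrated output is a query named RegionalSales with a total_sales column (both names hypothetical); compare the values against the original Prep flow output:
// Validation query: row count and a key total to compare against the Prep output
let
    RowCount = Table.RowCount(RegionalSales),
    TotalSales = List.Sum(RegionalSales[total_sales]),
    Summary = #table({"Metric", "Value"},
        {{"Row count", RowCount}, {"Total sales", TotalSales}})
in
    Summary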
5.1 Effort estimation¶
| Prep flow complexity | Characteristics | Estimated migration effort |
|---|---|---|
| Simple | 1-2 inputs, filters, renames, single output | 2-4 hours |
| Medium | 2-3 inputs, joins, calculated fields, pivots | 4-8 hours |
| Complex | 4+ inputs, multiple joins, aggregations, conditional logic | 8-16 hours |
| Very complex | Complex with parameterized queries, LOD-like logic, custom SQL | 16-32 hours |
6. Power Query M quick reference¶
6.1 Essential M functions for Prep migrants¶
// Connect to SQL Server
= Sql.Database("server.database.windows.net", "mydatabase")
// Filter rows
= Table.SelectRows(Source, each [Amount] > 0 and [Date] >= #date(2024, 1, 1))
// Add calculated column
= Table.AddColumn(Source, "Revenue", each [Qty] * [Price], type number)
// Rename columns
= Table.RenameColumns(Source, {{"old_name", "new_name"}, {"old2", "new2"}})
// Change data types
= Table.TransformColumnTypes(Source, {{"Date", type date}, {"Amount", type number}})
// Remove columns
= Table.RemoveColumns(Source, {"UnneededCol1", "UnneededCol2"})
// Replace values
= Table.ReplaceValue(Source, null, 0, Replacer.ReplaceValue, {"Amount"})
// Merge (join) tables
= Table.NestedJoin(Orders, {"CustID"}, Customers, {"CustID"}, "Cust", JoinKind.LeftOuter)
// Expand merged table
= Table.ExpandTableColumn(Merged, "Cust", {"Name", "Region"})
// Append (union) tables
= Table.Combine({Table1, Table2, Table3})
// Group by (aggregate)
= Table.Group(Source, {"Region"}, {{"Total", each List.Sum([Amount]), type number}})
// Unpivot columns
= Table.UnpivotOtherColumns(Source, {"Product"}, "Month", "Sales")
// Pivot column
= Table.Pivot(Source, List.Distinct(Source[Month]), "Month", "Sales", List.Sum)
// Conditional column
= Table.AddColumn(Source, "Tier", each if [Amount] > 1000 then "High" else "Low")
// Remove duplicates
= Table.Distinct(Source, {"KeyCol"})
// Sort
= Table.Sort(Source, {{"Date", Order.Descending}})
Last updated: 2026-04-30
Maintainers: CSA-in-a-Box core team
Related: Data Source Migration | Feature Mapping | Migration Playbook