Implementing CDC with Declarative Pipelines (AUTO CDC / SCD Type 1 and 2)

    Who this is for:

    Architecture / Concept Overview: Implementing CDC with Declarative Pipelines (AUTO CDC / SCD Type 1 and 2)

    CDC pipelines consume a change feed — typically from a database's transaction log — and apply those changes to a target Delta table. Declarative Pipelines automate the merge logic, deduplication, ordering, and optionally track historical versions of each record.

    %%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%% flowchart LR classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED DB[(Source Database)]:::source --> WAL[Transaction Log / CDC]:::ingestion WAL --> CF[Change Feed - Streaming Table]:::storage CF --> AC[APPLY CHANGES INTO]:::processing AC --> SCD1[SCD Type 1 - Current State]:::serving AC --> SCD2[SCD Type 2 - Full History]:::serving

    *CDC flow from source database through the change feed to SCD Type 1 or Type 2 target tables.*

    %%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%% graph TD classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED SCD[SCD Comparison]:::processing SCD --> T1[SCD Type 1]:::serving SCD --> T2[SCD Type 2]:::storage T1 --> T1A[Overwrites existing rows]:::serving T1 --> T1B[Current state only]:::serving T1 --> T1C[Simpler queries]:::serving T2 --> T2A[Preserves row history]:::storage T2 --> T2B[Adds validity columns]:::storage T2 --> T2C[Supports time-travel queries]:::storage

    *SCD Type 1 vs Type 2: trade-offs between simplicity and historical completeness.*

    Key Terms

    Prerequisites and Setup

    • A Declarative Pipeline with Unity Catalog enabled.
    • A streaming table containing the CDC change feed (from Lakeflow Connect, Debezium, or another CDC tool).
    • The change feed must include: primary key columns, a sequence/ordering column, and optionally an operation type column.

    Step-by-Step Implementation

      Configuration Reference

      Implementing CDC with Declarative Pipelines (AUTO CDC / SCD Type 1 and 2) configuration options
      ParameterDescriptionRequired
      targetName of the target streaming tableYes
      sourceName of the source streaming table containing CDC eventsYes
      keysList of primary key columnsYes
      sequence_byColumn expression for ordering changesYes
      stored_as_scd_type1 for current-state-only, 2 for full historyNo (default: 1)
      apply_as_deletesExpression identifying delete operationsNo
      except_column_listColumns to exclude from the targetNo
      track_history_column_listColumns to track for SCD Type 2 (subset tracking)No
      column_listExplicit list of columns to includeNo

      Monitoring, Cost, and Security Considerations

      Common Pitfalls and Recommended Patterns

        Frequently Asked Questions