Data Lineage: Tracking Data Flows Across Your Organisation

    Who this is for:

    Architecture / Concept Overview

    Unity Catalog captures lineage at runtime — every notebook, job, pipeline, and SQL query that reads from or writes to a Unity Catalog table generates a lineage event.

    %%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
    flowchart LR
        classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        NB[Notebook] -->|Writes| SILVER[Silver Table]
        JOB[Scheduled Job] -->|Writes| GOLD[Gold Table]
        SQL[SQL Query] -->|Reads| GOLD
        SILVER -->|Read by| JOB
        UC[Unity Catalog<br/>Lineage Capture] -.->|Tracks| NB
        UC -.->|Tracks| JOB
        UC -.->|Tracks| SQL
        NB:::source
        JOB:::processing
        SQL:::serving
        SILVER:::storage
        GOLD:::storage
        UC:::governance

    *Figure 1 — Unity Catalog transparently captures lineage from notebooks, jobs, and SQL queries at runtime.*

    Lineage operates at two granularity levels: table-level (which tables feed which tables) and column-level (which columns flow into which columns).

    %%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
    graph TD
        classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        TL[Table-Level Lineage] --> UP_TBL[Upstream Tables<br/>What feeds this table?]
        TL --> DOWN_TBL[Downstream Tables<br/>What depends on this table?]
        CL[Column-Level Lineage] --> UP_COL[Upstream Columns<br/>Which source columns feed this column?]
        CL --> DOWN_COL[Downstream Columns<br/>Which columns depend on this column?]
        TL:::governance
        UP_TBL:::processing
        DOWN_TBL:::serving
        CL:::governance
        UP_COL:::processing
        DOWN_COL:::serving

    *Figure 2 — Two levels of lineage granularity: table-level for understanding data flow, column-level for impact analysis.*
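
The upstream and downstream questions in Figure 2 amount to graph traversals over lineage edges. The following is a minimal sketch in plain Python, not Unity Catalog's own implementation: the table and column names (`main.sales.raw_orders`, `amount`, and so on) are hypothetical, and the edge lists stand in for whatever lineage events have actually been captured. The same traversal serves both granularities because only the node type changes.

```python
from collections import defaultdict, deque

# Hypothetical table-level edges (source, target). The silver -> gold hop
# mirrors Figure 1; the raw table is an assumed upstream source.
TABLE_EDGES = [
    ("main.sales.raw_orders", "main.sales.silver_orders"),
    ("main.sales.silver_orders", "main.sales.gold_revenue"),
]

# Hypothetical column-level edges: each node is a (table, column) pair.
COLUMN_EDGES = [
    (("main.sales.raw_orders", "amount"), ("main.sales.silver_orders", "amount")),
    (("main.sales.silver_orders", "amount"), ("main.sales.gold_revenue", "total_revenue")),
]


def build_graph(edges):
    """Return (upstream, downstream) adjacency maps for a list of edges."""
    upstream = defaultdict(set)
    downstream = defaultdict(set)
    for source, target in edges:
        downstream[source].add(target)
        upstream[target].add(source)
    return upstream, downstream


def reachable(graph, start):
    """Breadth-first walk: every node reachable from `start` in `graph`."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


table_up, table_down = build_graph(TABLE_EDGES)
column_up, column_down = build_graph(COLUMN_EDGES)

# Table level: what depends on the raw table?
impacted_tables = reachable(table_down, "main.sales.raw_orders")

# Column level: which source columns feed gold_revenue.total_revenue?
feeding_columns = reachable(column_up, ("main.sales.gold_revenue", "total_revenue"))
```

Walking the downstream map answers "what breaks if I change this?", while walking the upstream map answers "where did this value come from?".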

    Key Terms

    Prerequisites and Setup

    • Unity Catalog enabled on the workspace
    • Tables registered in Unity Catalog (lineage is not captured for legacy Hive metastore tables)
    • Compute that supports lineage capture (SQL warehouses, jobs, notebooks using Unity Catalog-enabled clusters)
    • Access to system tables for programmatic lineage queries

    Step-by-Step Implementation

      Configuration Reference

      Data Lineage: Tracking Data Flows Across Your Organisation configuration options

      | Item | Description | Key columns / notes |
      | --- | --- | --- |
      | system.access.table_lineage | Table-level lineage events | source_table_full_name, target_table_full_name, event_time |
      | system.access.column_lineage | Column-level lineage events | source_column_name, target_column_name, event_time |
      | Lineage retention | Default 1 year | Configure via account settings |
      | Supported compute | SQL warehouses, jobs, notebooks | Must use Unity Catalog-enabled clusters |
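
The system tables above can be queried like any other table. As a hedged sketch, the helper below builds a SQL statement that lists the tables written from a given source table over a recent window, using only the `system.access.table_lineage` columns named in the table. The function name `downstream_tables_query`, the 30-day default, and the example table name are illustrative assumptions, and the identifier check is a deliberately conservative choice that rejects names needing backtick quoting.

```python
import re

# Conservative three-part name check (catalog.schema.table). This is an
# assumption made for safety: names that need backtick quoting are rejected.
_FULL_NAME = re.compile(r"^[A-Za-z0-9_]+(\.[A-Za-z0-9_]+){2}$")


def downstream_tables_query(table_full_name: str, days: int = 30) -> str:
    """Build SQL listing tables that were written from `table_full_name`."""
    if not _FULL_NAME.match(table_full_name):
        raise ValueError(f"expected catalog.schema.table, got {table_full_name!r}")
    if days <= 0:
        raise ValueError("days must be a positive integer")
    return (
        "SELECT DISTINCT target_table_full_name\n"
        "FROM system.access.table_lineage\n"
        f"WHERE source_table_full_name = '{table_full_name}'\n"
        "  AND target_table_full_name IS NOT NULL\n"
        f"  AND event_time >= current_timestamp() - INTERVAL {int(days)} DAYS"
    )


# Hypothetical table name, used only for illustration.
query = downstream_tables_query("main.sales.silver_orders", days=7)
print(query)
```

In a Databricks notebook, the resulting string could be passed to `spark.sql(query)`, assuming the caller has been granted access to the system tables. The same pattern should extend to `system.access.column_lineage` by filtering on `source_column_name` and selecting `target_column_name`.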

      Monitoring, Cost, and Security Considerations

      Common Pitfalls and Recommended Patterns

        Frequently Asked Questions