Flows, Streaming Tables, and Materialized Views Explained

    Who this is for:

    Architecture / Concept Overview: Flows, Streaming Tables, and Materialized Views Explained

    Every Declarative Pipeline consists of flows connecting datasets (streaming tables, materialized views, or views). A flow represents one transformation path, while the dataset type determines how data is stored and refreshed.

    %%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
    flowchart LR
        classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        SRC[Source Data]:::source --> F1[Flow 1: Ingest]:::ingestion
        F1 --> ST[Streaming Table: raw_events]:::storage
        ST --> F2[Flow 2: Clean]:::processing
        F2 --> ST2[Streaming Table: clean_events]:::storage
        ST2 --> F3[Flow 3: Aggregate]:::processing
        F3 --> MV[Materialized View: hourly_stats]:::serving

    *Three flows connecting source data through streaming tables to a materialized view.*
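
    The topology in the diagram can be modeled in a few lines of plain Python. This is only an illustrative sketch, not the Databricks API: the `Flow` dataclass and the `refresh_order` helper are hypothetical names, and the dataset names are taken from the diagram. It shows that each flow is an edge from one dataset to another, and that a valid refresh order follows from those edges.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass(frozen=True)
class Flow:
    """One transformation path from a source dataset to a target dataset."""
    name: str
    source: str
    target: str


# The three flows from the diagram (flow names are illustrative).
FLOWS = [
    Flow("ingest", "source_data", "raw_events"),
    Flow("clean", "raw_events", "clean_events"),
    Flow("aggregate", "clean_events", "hourly_stats"),
]


def refresh_order(flows):
    """Return dataset names so that every input precedes its output."""
    sorter = TopologicalSorter()
    for flow in flows:
        # The target depends on the source, so the source must come first.
        sorter.add(flow.target, flow.source)
    return list(sorter.static_order())


print(refresh_order(FLOWS))
```

    Declaring flows as data and deriving the order from their dependencies mirrors the declarative idea: the author states what feeds what, and the ordering is computed rather than hand-written.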

    %%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
    graph TD
        classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        COMP[Dataset Comparison]:::processing
        COMP --> ST[Streaming Table]:::storage
        COMP --> MV[Materialized View]:::serving
        COMP --> VW[View]:::processing
        ST --> SP1[Append-only]:::storage
        ST --> SP2[Incremental processing]:::storage
        ST --> SP3[Persisted to Delta]:::storage
        MV --> MP1[Recomputed on change]:::serving
        MV --> MP2[Supports aggregations]:::serving
        MV --> MP3[Persisted to Delta]:::serving
        VW --> VP1[Not persisted]:::processing
        VW --> VP2[Re-evaluated each run]:::processing
        VW --> VP3[Intermediate step only]:::processing

    *Comparison of the three dataset types and their key characteristics.*
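
    The refresh behavior compared above can be made concrete with a toy model in plain Python. This is a sketch under stated assumptions, not how Databricks implements these datasets: the classes `StreamingTable` and `MaterializedView` and the function `view` are hypothetical, in-memory lists stand in for Delta tables, and the materialized view here always recomputes from scratch, which is a simplification.

```python
from collections import Counter


class StreamingTable:
    """Toy model: append-only output, each input row is processed once."""

    def __init__(self, transform):
        self.transform = transform
        self.rows = []    # stands in for the persisted table
        self.offset = 0   # how much of the input has already been consumed

    def refresh(self, source_rows):
        new_rows = source_rows[self.offset:]
        self.rows.extend(self.transform(r) for r in new_rows)
        self.offset = len(source_rows)
        return len(new_rows)  # number of rows processed in this refresh


class MaterializedView:
    """Toy model: persisted result, recomputed from the full input."""

    def __init__(self, query):
        self.query = query
        self.result = None

    def refresh(self, source_rows):
        self.result = self.query(source_rows)
        return len(source_rows)  # number of rows read in this refresh


def view(query, source_rows):
    """Toy model: nothing is stored; the query runs on every read."""
    return query(source_rows)


events = [{"hour": 9, "value": " a "}, {"hour": 9, "value": "b"}]

clean_events = StreamingTable(lambda r: {**r, "value": r["value"].strip()})
hourly_stats = MaterializedView(lambda rows: dict(Counter(r["hour"] for r in rows)))

first = clean_events.refresh(events)      # processes both existing rows
events.append({"hour": 10, "value": "c"})
second = clean_events.refresh(events)     # processes only the newly added row
hourly_stats.refresh(clean_events.rows)   # aggregates over all cleaned rows

print(first, second, hourly_stats.result)
```

    The second refresh of `clean_events` touches one row instead of three, which is the incremental property in the comparison, while `hourly_stats` reads its whole input to produce the aggregate.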

    Key Terms

    Prerequisites and Setup

    • A Databricks workspace with Unity Catalog enabled.
    • An existing Declarative Pipeline or permission to create one.
    • Familiarity with Python or SQL for defining pipeline datasets.

    Step-by-Step Implementation

      Configuration Reference

      Flows, Streaming Tables, and Materialized Views Explained configuration options
      | Parameter | Applies To | Description | Default |
      | --- | --- | --- | --- |
      | table_properties.quality | All tables | Metadata tag for medallion layer | None |
      | spark.databricks.delta.optimizeWrite.enabled | Streaming Tables | Auto-optimize file sizes on write | true |
      | pipelines.maxFlowRetryAttempts | Flows | Retry count for failed flows | 2 |
      | continuous | Pipeline | Enables continuous processing mode | false |
      | photon | Pipeline | Enables Photon-accelerated execution | false |
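
      As a rough illustration of where these options might be set, the sketch below builds a pipeline settings document as a Python dictionary and prints it as JSON. It assumes that the pipeline-level options (`continuous`, `photon`) sit at the top level of the settings, that the dotted keys go into a string-valued `configuration` map, and that `quality` is supplied per table through its table properties. The pipeline name and the `silver` value are hypothetical; verify the exact layout against your workspace before relying on it.

```python
import json

# Pipeline-level settings, using the defaults listed in the table above.
pipeline_settings = {
    "name": "events_pipeline",  # hypothetical pipeline name
    "continuous": False,        # triggered rather than continuous processing
    "photon": False,            # Photon-accelerated execution disabled
    "configuration": {
        # Assumption: configuration values are passed as strings.
        "pipelines.maxFlowRetryAttempts": "2",
        "spark.databricks.delta.optimizeWrite.enabled": "true",
    },
}

# Per-table metadata, assumed to be supplied where the table is defined
# rather than in the pipeline settings. "silver" is an illustrative value.
clean_events_table_properties = {"quality": "silver"}

print(json.dumps(pipeline_settings, indent=2))
```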

      Monitoring, Cost, and Security Considerations

      Common Pitfalls and Recommended Patterns

        Frequently Asked Questions