01_ingest.py

    Who this is for:

    Architecture / Concept Overview: 01_ingest.py

    Control flow tasks extend the standard DAG model with decision nodes (If/Else) and loop nodes (For Each). These let you route execution down different paths based on task values or parameters, and iterate over collections like file lists, table names, or date ranges.

    %%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%% flowchart LR classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED START[Ingest Data]:::ingestion --> CHECK{If: rows > 0?}:::governance CHECK --> |True| TRANSFORM[Transform]:::processing CHECK --> |False| SKIP[Log: No New Data]:::source TRANSFORM --> LOOP[For Each: Region]:::serving LOOP --> AGG1[Aggregate EMEA]:::storage LOOP --> AGG2[Aggregate APAC]:::storage LOOP --> AGG3[Aggregate AMER]:::storage AGG1 --> NOTIFY[Notify]:::governance AGG2 --> NOTIFY AGG3 --> NOTIFY SKIP --> NOTIFY

    *A job with If/Else branching and For-Each looping over regions.*

    %%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%% graph TD classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED CF[Control Flow Tasks]:::processing CF --> IFE[If/Else Task]:::governance CF --> FE[For-Each Task]:::serving CF --> RI[Run If Condition]:::processing IFE --> C1[Condition expression]:::governance IFE --> C2[True branch tasks]:::serving IFE --> C3[False branch tasks]:::source FE --> F1[Input collection]:::serving FE --> F2[Nested task template]:::processing FE --> F3[Concurrency control]:::processing RI --> R1[ALL_SUCCESS]:::processing RI --> R2[AT_LEAST_ONE_SUCCESS]:::processing RI --> R3[NONE_FAILED]:::processing RI --> R4[ALL_DONE]:::processing

    *Control flow capabilities in Lakeflow Jobs.*

    Key Terms

    Prerequisites and Setup

    • Familiarity with Lakeflow Jobs task definitions and dependencies.
    • Notebooks that set task values for use in conditions.
    • A workspace with the latest Jobs features enabled.

    Step-by-Step Implementation

      Configuration Reference

      01_ingest.py configuration options
      ParameterDescriptionDefault
      condition_task.opComparison operator: EQUAL_TO, NOT_EQUAL, GREATER_THAN, GREATER_THAN_OR_EQUAL, LESS_THAN, LESS_THAN_OR_EQUALRequired
      condition_task.leftLeft operand (supports template expressions)Required
      condition_task.rightRight operand (supports template expressions)Required
      for_each_task.inputsJSON array or template expression resolving to a listRequired
      for_each_task.concurrencyMax parallel iterations1
      run_ifExecution condition: ALL_SUCCESS, AT_LEAST_ONE_SUCCESS, NONE_FAILED, ALL_DONE, AT_LEAST_ONE_FAILEDALL_SUCCESS

      Monitoring, Cost, and Security Considerations

      Common Pitfalls and Recommended Patterns

        Frequently Asked Questions