01_ingest.py
Who this is for:
Architecture / Concept Overview: 01_ingest.py
Control flow tasks extend the standard DAG model with decision nodes (If/Else) and loop nodes (For Each). These let you route execution down different paths based on task values or parameters, and iterate over collections like file lists, table names, or date ranges.
%%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
flowchart LR
classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
START[Ingest Data]:::ingestion --> CHECK{If: rows > 0?}:::governance
CHECK --> |True| TRANSFORM[Transform]:::processing
CHECK --> |False| SKIP[Log: No New Data]:::source
TRANSFORM --> LOOP[For Each: Region]:::serving
LOOP --> AGG1[Aggregate EMEA]:::storage
LOOP --> AGG2[Aggregate APAC]:::storage
LOOP --> AGG3[Aggregate AMER]:::storage
AGG1 --> NOTIFY[Notify]:::governance
AGG2 --> NOTIFY
AGG3 --> NOTIFY
SKIP --> NOTIFY
*A job with If/Else branching and For-Each looping over regions.*
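As a rough sketch, the job in the diagram could be written as a Jobs-API-style task list built in Python. The field names follow the Configuration Reference on this page. The task keys, notebook paths, the task value key `rows`, and the choice of `AT_LEAST_ONE_SUCCESS` for the final task are illustrative assumptions, not taken from the original pipeline.

```python
import json

# Regions shown in the diagram.
REGIONS = ["EMEA", "APAC", "AMER"]


def notebook(path, params=None):
    """Build a notebook_task block. All paths used here are hypothetical."""
    task = {"notebook_path": path}
    if params:
        task["base_parameters"] = params
    return task


def build_tasks(regions):
    """Return a task list mirroring the diagram: ingest, branch, loop, notify."""
    return [
        {"task_key": "ingest", "notebook_task": notebook("/Workspace/demo/01_ingest")},
        {
            # If/Else node: compares a task value set by the ingest notebook.
            "task_key": "check_rows",
            "depends_on": [{"task_key": "ingest"}],
            "condition_task": {
                "op": "GREATER_THAN",
                "left": "{{tasks.ingest.values.rows}}",  # assumed task value key
                "right": "0",
            },
        },
        {
            # Runs only on the True branch.
            "task_key": "transform",
            "depends_on": [{"task_key": "check_rows", "outcome": "true"}],
            "notebook_task": notebook("/Workspace/demo/02_transform"),
        },
        {
            # Runs only on the False branch.
            "task_key": "log_no_new_data",
            "depends_on": [{"task_key": "check_rows", "outcome": "false"}],
            "notebook_task": notebook("/Workspace/demo/log_no_new_data"),
        },
        {
            # For Each node: one nested task run per region.
            "task_key": "aggregate_regions",
            "depends_on": [{"task_key": "transform"}],
            "for_each_task": {
                "inputs": json.dumps(regions),
                "concurrency": len(regions),
                "task": {
                    "task_key": "aggregate_region",
                    "notebook_task": notebook(
                        "/Workspace/demo/03_aggregate", {"region": "{{input}}"}
                    ),
                },
            },
        },
        {
            # Only one branch ever runs, so the default ALL_SUCCESS is not used here.
            "task_key": "notify",
            "depends_on": [
                {"task_key": "aggregate_regions"},
                {"task_key": "log_no_new_data"},
            ],
            "run_if": "AT_LEAST_ONE_SUCCESS",
            "notebook_task": notebook("/Workspace/demo/notify"),
        },
    ]


tasks = build_tasks(REGIONS)
print(json.dumps([t["task_key"] for t in tasks]))
```

The list can be placed under the `tasks` key of a job definition. Setting `concurrency` to the number of regions lets all three aggregations run in parallel, matching the fan-out in the diagram.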
%%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
graph TD
classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
CF[Control Flow Tasks]:::processing
CF --> IFE[If/Else Task]:::governance
CF --> FE[For-Each Task]:::serving
CF --> RI[Run If Condition]:::processing
IFE --> C1[Condition expression]:::governance
IFE --> C2[True branch tasks]:::serving
IFE --> C3[False branch tasks]:::source
FE --> F1[Input collection]:::serving
FE --> F2[Nested task template]:::processing
FE --> F3[Concurrency control]:::processing
RI --> R1[ALL_SUCCESS]:::processing
RI --> R2[AT_LEAST_ONE_SUCCESS]:::processing
RI --> R3[NONE_FAILED]:::processing
RI --> R4[ALL_DONE]:::processing
*Control flow capabilities in Lakeflow Jobs.*
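To make the Run If conditions concrete, here is a deliberately simplified model of how each one might be evaluated against the final states of a task's upstream dependencies. The three state names (`SUCCESS`, `FAILED`, `SKIPPED`) and the function `should_run` are assumptions made for this sketch. The real scheduler tracks more states, so treat this as an aid to intuition rather than a specification.

```python
def should_run(run_if, upstream_states):
    """Decide whether a task runs, given the terminal states of its dependencies.

    Simplified model: each state is "SUCCESS", "FAILED", or "SKIPPED"
    (SKIPPED stands for a dependency that did not run, for example the
    untaken branch of an If/Else task).
    """
    any_success = any(s == "SUCCESS" for s in upstream_states)
    any_failed = any(s == "FAILED" for s in upstream_states)

    if run_if == "ALL_SUCCESS":
        return all(s == "SUCCESS" for s in upstream_states)
    if run_if == "AT_LEAST_ONE_SUCCESS":
        return any_success
    if run_if == "NONE_FAILED":
        # No failures, and at least one dependency actually ran.
        return not any_failed and any_success
    if run_if == "ALL_DONE":
        # Every dependency has finished, whatever its outcome.
        return True
    if run_if == "AT_LEAST_ONE_FAILED":
        return any_failed
    raise ValueError(f"unknown run_if value: {run_if}")


# A task that joins two If/Else branches sees one SUCCESS and one SKIPPED.
branches = ["SUCCESS", "SKIPPED"]
for condition in ("ALL_SUCCESS", "AT_LEAST_ONE_SUCCESS", "NONE_FAILED", "ALL_DONE"):
    print(condition, should_run(condition, branches))
```

Under this model, a task that joins both branches of an If/Else would not run with the default `ALL_SUCCESS`, which is why a join task typically needs a more permissive condition.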
Key Terms
Prerequisites and Setup
- Familiarity with Lakeflow Jobs task definitions and dependencies.
- Notebooks that set task values for use in conditions.
- A workspace in which the If/Else and For Each task types are available in Lakeflow Jobs.
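For the task-values prerequisite, a notebook might publish a row count that a later If/Else condition reads. The sketch below assumes the count is stored under a key named `rows` and that the publishing task is called `ingest`. Both names, and the helpers `LocalTaskValues`, `publish_row_count`, and `condition_reference`, are hypothetical. In a Databricks notebook you would pass `dbutils.jobs.taskValues` instead of the local stand-in.

```python
class LocalTaskValues:
    """Local stand-in exposing the same set(key=..., value=...) call shape
    as dbutils.jobs.taskValues, so the sketch runs outside Databricks."""

    def __init__(self):
        self.store = {}

    def set(self, key, value):
        self.store[key] = value


def publish_row_count(task_values, rows):
    """Record how many rows were ingested so downstream conditions can read it."""
    count = len(rows)
    task_values.set(key="rows", value=count)
    return count


def condition_reference(task_key, value_key):
    """Build the template expression a condition operand would use."""
    return "{{tasks." + task_key + ".values." + value_key + "}}"


local_values = LocalTaskValues()
publish_row_count(local_values, [{"id": 1}, {"id": 2}])
print(local_values.store)                    # {'rows': 2}
print(condition_reference("ingest", "rows"))  # {{tasks.ingest.values.rows}}
```

Passing the task-values object in as an argument keeps the ingest logic testable locally, without changing the call made inside the notebook.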
Step-by-Step Implementation
Configuration Reference
| Parameter | Description | Default |
|---|---|---|
| `condition_task.op` | Comparison operator: EQUAL_TO, NOT_EQUAL, GREATER_THAN, GREATER_THAN_OR_EQUAL, LESS_THAN, LESS_THAN_OR_EQUAL | Required |
| `condition_task.left` | Left operand (supports template expressions) | Required |
| `condition_task.right` | Right operand (supports template expressions) | Required |
| `for_each_task.inputs` | JSON array or template expression resolving to a list | Required |
| `for_each_task.concurrency` | Max parallel iterations | 1 |
| `run_if` | Execution condition: ALL_SUCCESS, AT_LEAST_ONE_SUCCESS, NONE_FAILED, ALL_DONE, AT_LEAST_ONE_FAILED | ALL_SUCCESS |
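The table above can also serve as a pre-submission check. The following sketch validates a single task dictionary against the parameters and allowed values listed in the table. The function `validate_task` and its messages are hypothetical, it assumes `for_each_task.inputs` is passed as a string, and it only knows the values listed on this page.

```python
import json

# Allowed values copied from the Configuration Reference table.
CONDITION_OPS = {
    "EQUAL_TO", "NOT_EQUAL", "GREATER_THAN",
    "GREATER_THAN_OR_EQUAL", "LESS_THAN", "LESS_THAN_OR_EQUAL",
}
RUN_IF_VALUES = {
    "ALL_SUCCESS", "AT_LEAST_ONE_SUCCESS", "NONE_FAILED",
    "ALL_DONE", "AT_LEAST_ONE_FAILED",
}


def validate_task(task):
    """Return a list of problems found in one task dict (empty if none)."""
    problems = []

    cond = task.get("condition_task")
    if cond is not None:
        for field in ("op", "left", "right"):
            if field not in cond:
                problems.append(f"condition_task.{field} is required")
        if "op" in cond and cond["op"] not in CONDITION_OPS:
            problems.append(f"unknown condition_task.op: {cond['op']}")

    loop = task.get("for_each_task")
    if loop is not None:
        inputs = loop.get("inputs")
        if inputs is None:
            problems.append("for_each_task.inputs is required")
        elif not inputs.lstrip().startswith("{{"):
            # Not a template expression, so it must parse as a JSON array.
            try:
                parsed = json.loads(inputs)
            except ValueError:
                parsed = None
            if not isinstance(parsed, list):
                problems.append("for_each_task.inputs must be a JSON array or template")
        if loop.get("concurrency", 1) < 1:
            problems.append("for_each_task.concurrency must be at least 1")

    run_if = task.get("run_if", "ALL_SUCCESS")
    if run_if not in RUN_IF_VALUES:
        problems.append(f"unknown run_if: {run_if}")

    return problems


example = {
    "task_key": "aggregate_regions",
    "for_each_task": {"inputs": '["EMEA", "APAC", "AMER"]', "concurrency": 3},
}
print(validate_task(example))  # []
```

Template expressions are accepted without further checks because they resolve only at run time.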