Integrating Databricks with Microsoft Azure Services (ADF, Synapse, Fabric)
Who this is for:
Architecture / Concept Overview
%%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
flowchart LR
classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
SOURCES[Azure Data Sources]:::source
ADF[Azure Data Factory]:::ingestion
DBX[Azure Databricks]:::processing
ADLS[ADLS Gen2]:::storage
SYNAPSE[Synapse Analytics]:::serving
FABRIC[Microsoft Fabric]:::serving
SOURCES --> ADF --> DBX
DBX --> ADLS
ADLS --> SYNAPSE
ADLS --> FABRIC
ADF --> SYNAPSE
*Azure Data Factory orchestrates data movement into Databricks for processing; results are stored in ADLS Gen2 and consumed by Synapse or Fabric. ADF can also load Synapse directly.*
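To make the ADF-to-Databricks hop concrete, here is a minimal sketch that builds the JSON definition of an ADF pipeline containing one Databricks Notebook activity. The pipeline, linked service, and notebook names (`IngestAndTransform`, `AzureDatabricksLS`, `/Shared/etl/transform_sales`) and the `run_date` parameter are hypothetical placeholders, not values from this guide. The code only assembles the definition; deploying it to a factory is out of scope here.

```python
import json


def databricks_notebook_activity(name, linked_service, notebook_path, parameters=None):
    """Build an ADF 'DatabricksNotebook' activity definition as a plain dict."""
    return {
        "name": name,
        "type": "DatabricksNotebook",
        "linkedServiceName": {
            "referenceName": linked_service,
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            "notebookPath": notebook_path,
            # baseParameters are handed to the notebook as key/value parameters.
            "baseParameters": dict(parameters or {}),
        },
    }


def pipeline_definition(name, activities):
    """Wrap activities in the JSON shape ADF uses for a pipeline resource."""
    return {"name": name, "properties": {"activities": list(activities)}}


# Hypothetical names, for illustration only.
activity = databricks_notebook_activity(
    name="TransformSales",
    linked_service="AzureDatabricksLS",
    notebook_path="/Shared/etl/transform_sales",
    parameters={"run_date": "2024-01-01"},
)
pipeline = pipeline_definition("IngestAndTransform", [activity])
print(json.dumps(pipeline, indent=2))
```

Keeping the definition as data makes it easy to review in source control before it is published to the factory.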
%%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
graph TD
classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
INT[Azure Integration Patterns]:::source
ORCH[Orchestration]:::ingestion
STORAGE_P[Storage]:::storage
ANALYTICS[Analytics]:::serving
GOV[Governance]:::governance
INT --> ORCH
INT --> STORAGE_P
INT --> ANALYTICS
INT --> GOV
ORCH --> ADF_P[ADF Pipelines]:::ingestion
ORCH --> ADF_T[ADF Databricks Activity]:::ingestion
STORAGE_P --> ADLS_P[ADLS Gen2 + Unity Catalog]:::storage
STORAGE_P --> DELTA[Delta Lake on ADLS]:::storage
ANALYTICS --> SYN[Synapse Serverless Pools]:::serving
ANALYTICS --> FAB[Fabric Lakehouse]:::serving
GOV --> PURVIEW[Microsoft Purview]:::governance
GOV --> ENTRA["Entra ID (Azure AD)"]:::governance
*Azure services complement Databricks across orchestration, storage, analytics, and governance layers.*
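The storage and analytics patterns meet at a Delta table path in ADLS Gen2: Databricks addresses it with an `abfss://` URI, and a Synapse serverless SQL pool can read the same folder with `OPENROWSET`. The sketch below derives both strings from one set of inputs. The account, container, and path (`mystorageacct`, `lake`, `silver/sales`) are made-up examples, and the helper names are illustrative rather than part of any SDK.

```python
def abfss_uri(account, container, path=""):
    """Return the abfss:// URI that addresses a location in ADLS Gen2."""
    base = f"abfss://{container}@{account}.dfs.core.windows.net"
    path = path.strip("/")
    return f"{base}/{path}" if path else base


def synapse_delta_query(account, container, path, top=10):
    """Return T-SQL that reads a Delta folder from a Synapse serverless SQL pool."""
    url = f"https://{account}.dfs.core.windows.net/{container}/{path.strip('/')}/"
    return (
        f"SELECT TOP {int(top)} *\n"
        f"FROM OPENROWSET(BULK '{url}', FORMAT = 'DELTA') AS delta_rows;"
    )


# Hypothetical storage account, container, and table path.
print(abfss_uri("mystorageacct", "lake", "silver/sales"))
print(synapse_delta_query("mystorageacct", "lake", "silver/sales", top=5))
```

Deriving both forms from the same inputs helps keep the Databricks writer and the Synapse reader pointed at the same folder.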
Key Terms
Prerequisites and Setup
- Azure subscription with Databricks workspace deployed
- Azure Data Factory instance in the same region
- ADLS Gen2 storage account configured as Unity Catalog storage
- Service principal with Contributor role on the Databricks workspace
- VNet peering configured for private connectivity (production environments)
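The checklist above can be expressed as a small pre-flight check that runs before any pipeline is wired up. This is only a sketch: the `Environment` fields are a hypothetical summary that you would fill in from your own inventory or deployment outputs, and nothing here calls Azure. Treating `Owner` as an acceptable substitute for `Contributor` is an assumption of this example.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    """Hypothetical summary of the resources named in the prerequisites."""
    databricks_region: str
    adf_region: str
    adls_hierarchical_namespace: bool  # must be enabled for ADLS Gen2
    service_principal_role: str
    vnet_peering: bool
    production: bool = False


def unmet_prerequisites(env):
    """Return a description of each prerequisite that is not satisfied."""
    problems = []
    if env.databricks_region.lower() != env.adf_region.lower():
        problems.append("ADF and Databricks are in different regions")
    if not env.adls_hierarchical_namespace:
        problems.append("storage account does not have hierarchical namespace enabled")
    # Assumption: Owner is accepted as a superset of Contributor.
    if env.service_principal_role not in {"Contributor", "Owner"}:
        problems.append("service principal lacks Contributor on the workspace")
    if env.production and not env.vnet_peering:
        problems.append("production environment without VNet peering")
    return problems


# Example: a development environment that skips VNet peering passes the check.
dev = Environment("eastus", "eastus", True, "Contributor", vnet_peering=False)
print(unmet_prerequisites(dev))
```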
Step-by-Step Implementation
Configuration Reference
| Integration | Auth Method | Network | Use Case |
|---|---|---|---|
| ADF → Databricks | Managed Identity (MSI) | VNet / Public | Pipeline orchestration |
| Databricks → ADLS | Access Connector / MSI | Private Endpoint | Data read/write |
| Synapse → ADLS Delta | Managed Identity | Private Endpoint | SQL analytics on Delta |
| Fabric → Databricks | Delta Sharing token | Public (TLS) | Cross-platform data sharing |
| Purview → Databricks | Service Principal | VNet | Lineage and governance |
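As an illustration of the "ADF → Databricks" row, the sketch below builds an ADF linked service definition that authenticates with the factory's managed identity instead of a personal access token. The property names follow the Azure Databricks linked service JSON as I understand it, so verify them against the current ADF documentation before relying on them. The linked service name, workspace URL, resource ID, and cluster ID are placeholders.

```python
import json


def databricks_linked_service_msi(name, workspace_url, workspace_resource_id, cluster_id):
    """Build an ADF linked service definition for Azure Databricks using MSI auth."""
    return {
        "name": name,
        "properties": {
            "type": "AzureDatabricks",
            "typeProperties": {
                "domain": workspace_url,
                # MSI: ADF authenticates as its managed identity, so no token is stored.
                "authentication": "MSI",
                "workspaceResourceId": workspace_resource_id,
                "existingClusterId": cluster_id,
            },
        },
    }


# Placeholder values; substitute the identifiers of your own workspace.
linked_service = databricks_linked_service_msi(
    name="AzureDatabricksLS",
    workspace_url="https://adb-<workspace-id>.<n>.azuredatabricks.net",
    workspace_resource_id=(
        "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
        "/providers/Microsoft.Databricks/workspaces/<workspace-name>"
    ),
    cluster_id="<cluster-id>",
)
print(json.dumps(linked_service, indent=2))
```

For this to work, the factory's managed identity also needs access to the workspace, which is where the Contributor role from the prerequisites comes in.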