Securing Databricks on AWS (Networking, IAM, and Encryption)
Who this is for:
Architecture / Concept Overview
%%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
flowchart LR
classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
USERS[Corporate Users] -->|PrivateLink| FE_EP[Front-End VPC Endpoint]
FE_EP -->|Private| CP[Databricks Control Plane]
CLUSTER[Cluster Nodes] -->|PrivateLink| BE_EP[Back-End VPC Endpoint]
BE_EP -->|Private| CP
CLUSTER -->|VPC Endpoint| S3[S3 Encrypted Storage]
CLUSTER -->|VPC Endpoint| STS[STS for Temp Creds]
CLUSTER -->|VPC Endpoint| KMS[KMS for Decryption]
USERS:::source
FE_EP:::governance
CP:::processing
BE_EP:::governance
CLUSTER:::processing
S3:::storage
STS:::governance
KMS:::governance
*Fully private connectivity architecture: PrivateLink for the front-end and back-end paths, and VPC endpoints for S3, STS, and KMS.*
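
The AWS-side endpoints in the diagram (S3, STS, KMS) can be tracked as plain data and compared against what a VPC already has. The following is a minimal sketch rather than a definitive implementation. The function names `aws_service_endpoints` and `missing_endpoints` are hypothetical, and modeling S3 as a gateway endpoint is an assumption (an interface endpoint for S3 also exists). The dictionary keys mirror the `ServiceName` and `VpcEndpointType` parameters of the EC2 `CreateVpcEndpoint` API. The Databricks-owned front-end and back-end endpoint service names are region-specific and are deliberately left out.

```python
from __future__ import annotations


def aws_service_endpoints(region: str) -> dict[str, dict[str, str]]:
    """Return the AWS-owned endpoints from the diagram, keyed by diagram label.

    Assumption: S3 uses a gateway endpoint; STS and KMS use interface endpoints.
    """
    return {
        "S3": {"ServiceName": f"com.amazonaws.{region}.s3", "VpcEndpointType": "Gateway"},
        "STS": {"ServiceName": f"com.amazonaws.{region}.sts", "VpcEndpointType": "Interface"},
        "KMS": {"ServiceName": f"com.amazonaws.{region}.kms", "VpcEndpointType": "Interface"},
    }


def missing_endpoints(region: str, existing_service_names: set[str]) -> list[str]:
    """List diagram labels whose service name is not among the VPC's existing endpoints."""
    required = aws_service_endpoints(region)
    return sorted(
        label
        for label, spec in required.items()
        if spec["ServiceName"] not in existing_service_names
    )
```

Calling `missing_endpoints("us-east-1", {"com.amazonaws.us-east-1.s3"})` would report that the STS and KMS endpoints still need to be created. The set of existing service names is assumed to come from whatever inventory the team already keeps (for example, the output of an endpoint listing call).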
%%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
graph TD
classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
SEC[Security Controls] --> NET[Network Layer]
SEC --> IAM_L[IAM Layer]
SEC --> ENC[Encryption Layer]
SEC --> AUDIT[Audit Layer]
NET --> PL[PrivateLink]
NET --> SG[Security Groups]
NET --> NACL[Network ACLs]
IAM_L --> CROSS[Cross-Account Role - Least Privilege]
IAM_L --> IP_R[Instance Profiles - Scoped]
IAM_L --> UC_R[Unity Catalog Credentials]
ENC --> S3_ENC[S3 SSE-KMS]
ENC --> EBS_ENC[EBS Volume Encryption]
ENC --> TLS[TLS 1.2+ In Transit]
AUDIT --> CT[CloudTrail]
AUDIT --> DBX_AL[Databricks Audit Logs]
AUDIT --> VPC_FL[VPC Flow Logs]
SEC:::governance
NET:::storage
IAM_L:::governance
ENC:::processing
AUDIT:::ingestion
PL:::storage
SG:::storage
NACL:::storage
CROSS:::governance
IP_R:::governance
UC_R:::governance
S3_ENC:::processing
EBS_ENC:::processing
TLS:::processing
CT:::ingestion
DBX_AL:::ingestion
VPC_FL:::ingestion
*Defense-in-depth security model across network, IAM, encryption, and audit layers.*
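
As one illustration of the encryption layer, the S3 SSE-KMS and TLS-in-transit controls can be backed by a bucket policy. The sketch below builds such a policy as a Python dictionary. The function name `encryption_bucket_policy` and the bucket name in the usage note are hypothetical. The condition keys `aws:SecureTransport` and `s3:x-amz-server-side-encryption` are standard AWS policy keys. Note two limits of this sketch: it requires TLS but does not pin a minimum TLS version, and it denies uploads that omit the encryption header even when default bucket encryption is configured, which may or may not suit a given workload.

```python
import json


def encryption_bucket_policy(bucket: str) -> dict:
    """Build a bucket policy that denies non-TLS access and non-KMS uploads."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Reject any request that does not arrive over TLS.
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
            {
                # Reject uploads that do not explicitly request SSE-KMS.
                "Sid": "DenyNonKmsUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"{bucket_arn}/*",
                "Condition": {
                    "StringNotEquals": {"s3:x-amz-server-side-encryption": "aws:kms"}
                },
            },
        ],
    }


def policy_json(bucket: str) -> str:
    """Serialize the policy for use in a console, CLI, or IaC template."""
    return json.dumps(encryption_bucket_policy(bucket), indent=2)
```

For example, `policy_json("example-data-bucket")` returns a JSON document that could be reviewed by the security team before it is attached to the bucket.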
Key Terms
Prerequisites and Setup
- Databricks workspace deployed in a customer-managed VPC (required for full security controls)
- AWS PrivateLink service available in the workspace region
- KMS keys created for S3, EBS, and managed services encryption
- CloudTrail enabled in the account with data event logging for S3
- Security team alignment on network security requirements and compliance frameworks
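
The CloudTrail prerequisite (data event logging for S3) can be expressed as an event selector. The sketch below only builds the selector structure and does not call AWS. The function name `s3_data_event_selector` and the bucket names are hypothetical. The field names follow the CloudTrail `EventSelector` structure, where a bucket ARN with a trailing slash selects all objects in that bucket.

```python
def s3_data_event_selector(bucket_names: list) -> dict:
    """Build a CloudTrail event selector that logs S3 object-level events.

    Each bucket is given as a plain name; the trailing slash in the ARN
    means "every object in this bucket".
    """
    return {
        "ReadWriteType": "All",           # log both reads and writes
        "IncludeManagementEvents": True,  # keep control-plane API events too
        "DataResources": [
            {
                "Type": "AWS::S3::Object",
                "Values": [f"arn:aws:s3:::{name}/" for name in bucket_names],
            }
        ],
    }
```

A selector built with `s3_data_event_selector(["example-root-bucket", "example-data-bucket"])` could then be passed to whatever tool manages the trail, assuming the trail itself already exists.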
Step-by-Step Implementation
Configuration Reference
| Security Control | AWS Service | Scope | Compliance Mapping |
|---|---|---|---|
| PrivateLink (Front-End) | VPC Endpoints | User-to-workspace | SOC 2 CC6.1, HIPAA |
| PrivateLink (Back-End) | VPC Endpoints | Compute plane to control plane | SOC 2 CC6.6 |
| Security Groups | EC2 | Cluster node traffic | SOC 2 CC6.1 |
| KMS Encryption (S3) | KMS | Data at rest | SOC 2 CC6.1, HIPAA, FedRAMP |
| KMS Encryption (EBS) | KMS | Compute volumes | SOC 2 CC6.1, HIPAA |
| TLS 1.2+ | Built-in | Data in transit | PCI DSS 4.1 |
| CloudTrail | CloudTrail | API audit trail | SOC 2 CC7.2 |
| Databricks Audit Logs | Databricks | Workspace activity | SOC 2 CC7.2 |
| VPC Flow Logs | VPC | Network traffic | SOC 2 CC7.2, forensics |
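
When preparing audit evidence, it can help to query the compliance mapping by framework. The sketch below transcribes the "Compliance Mapping" column of the table into a dictionary. The names `COMPLIANCE_MAP` and `controls_for` are hypothetical, and the "forensics" note on VPC Flow Logs is omitted because it is a use case rather than a framework.

```python
# Compliance mapping transcribed from the configuration reference table.
COMPLIANCE_MAP = {
    "PrivateLink (Front-End)": ["SOC 2 CC6.1", "HIPAA"],
    "PrivateLink (Back-End)": ["SOC 2 CC6.6"],
    "Security Groups": ["SOC 2 CC6.1"],
    "KMS Encryption (S3)": ["SOC 2 CC6.1", "HIPAA", "FedRAMP"],
    "KMS Encryption (EBS)": ["SOC 2 CC6.1", "HIPAA"],
    "TLS 1.2+": ["PCI DSS 4.1"],
    "CloudTrail": ["SOC 2 CC7.2"],
    "Databricks Audit Logs": ["SOC 2 CC7.2"],
    "VPC Flow Logs": ["SOC 2 CC7.2"],
}


def controls_for(framework_prefix: str) -> list:
    """Return controls whose mapping starts with the given prefix, in table order."""
    return [
        control
        for control, mappings in COMPLIANCE_MAP.items()
        if any(m.startswith(framework_prefix) for m in mappings)
    ]
```

For instance, `controls_for("HIPAA")` lists the three controls the table maps to HIPAA, and `controls_for("SOC 2 CC7.2")` lists the audit-layer controls.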