- Condition: sts:ExternalId matches your Databricks account ID

    Who this is for:

    Architecture / Concept Overview

    %%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
    flowchart LR
        classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        DEPLOY[Deployment Attempt] -->|Fail| DIAG{Diagnose}
        DIAG -->|IAM| IAM_FIX[Fix IAM Policies/Trust]
        DIAG -->|Network| NET_FIX[Fix VPC/Subnet/SG]
        DIAG -->|Storage| S3_FIX[Fix S3 Bucket Config]
        DIAG -->|Compute| EC2_FIX[Fix Quota/Instance Type]
        DIAG -->|DNS| DNS_FIX[Fix DNS Resolution]
        IAM_FIX --> RETRY[Retry Deployment]
        NET_FIX --> RETRY
        S3_FIX --> RETRY
        EC2_FIX --> RETRY
        DNS_FIX --> RETRY
        RETRY -->|Success| DONE[Workspace Running]
        DEPLOY:::source
        DIAG:::ingestion
        IAM_FIX:::governance
        NET_FIX:::storage
        S3_FIX:::storage
        EC2_FIX:::processing
        DNS_FIX:::serving
        RETRY:::processing
        DONE:::serving

    *Troubleshooting workflow for Databricks on AWS deployment failures.*

    %%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
    graph TD
        classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        TOP5[Top 5 Deployment Pitfalls] --> P1[1. Cross-Account Trust Missing External ID]
        TOP5 --> P2[2. Subnet Has No Route to Internet]
        TOP5 --> P3[3. S3 Bucket Policy Denies Databricks]
        TOP5 --> P4[4. Security Group Blocks Control Plane]
        TOP5 --> P5[5. EC2 Service Quota Exceeded]
        P1 --> F1[STS AssumeRole fails silently]
        P2 --> F2[Cluster nodes cannot reach control plane]
        P3 --> F3[Workspace creation stuck or failed]
        P4 --> F4[Cluster state stuck in Pending]
        P5 --> F5[RunInstances API throttled]
        TOP5:::source
        P1:::governance
        P2:::storage
        P3:::storage
        P4:::governance
        P5:::processing
        F1:::ingestion
        F2:::ingestion
        F3:::ingestion
        F4:::ingestion
        F5:::ingestion

    *Top five deployment pitfalls and their downstream failure symptoms.*
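
Pitfall 1 comes down to a single condition in the cross-account role's trust policy: `sts:ExternalId` must match your Databricks account ID. The sketch below checks a trust policy document for that condition, assuming the JSON has already been retrieved (for example with `aws iam get-role`). The function name `external_id_matches`, the sample policy, and the placeholder IDs are illustrative assumptions, not values from Databricks or AWS documentation.

```python
def external_id_matches(trust_policy, expected_external_id):
    """Return True if an Allow statement for sts:AssumeRole requires
    sts:ExternalId (via StringEquals) to equal expected_external_id."""
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM accepts a single statement object
        statements = [statements]
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if "sts:AssumeRole" not in actions:
            continue
        value = stmt.get("Condition", {}).get("StringEquals", {}).get("sts:ExternalId")
        # The condition value may be a single string or a list of strings.
        values = [value] if isinstance(value, str) else (value or [])
        if expected_external_id in values:
            return True
    return False


# Hypothetical trust policy; the principal account and external ID are placeholders.
EXAMPLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::000000000000:root"},
            "Action": "sts:AssumeRole",
            "Condition": {
                "StringEquals": {
                    "sts:ExternalId": "11111111-2222-3333-4444-555555555555"
                }
            },
        }
    ],
}

if __name__ == "__main__":
    ok = external_id_matches(EXAMPLE_POLICY, "11111111-2222-3333-4444-555555555555")
    print("External ID condition present and matching:", ok)
```

This check is deliberately strict: it compares keys with exact case and only looks at `StringEquals`, so a policy that uses a different operator or key casing would need a broader check.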

    Key Terms

    Prerequisites and Setup

    • AWS CLI configured with admin access to the affected account
    • Access to the Databricks account console and workspace (if deployed)
    • CloudTrail enabled with management events logging
    • Familiarity with IAM policy simulator and VPC reachability analyzer
    • Access to Databricks cluster event logs and driver logs
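
With CloudTrail management events enabled, failed role assumptions can be pulled out of the event history. The following is a minimal sketch that filters the JSON printed by `aws cloudtrail lookup-events` for `AssumeRole` records carrying an error code. It assumes the output has already been saved and parsed into a dict, and that the relevant events are present in the account being queried; the function name `failed_assume_role_events` is hypothetical.

```python
import json


def failed_assume_role_events(lookup_output):
    """Return a summary of AssumeRole events that recorded an error.

    lookup_output is the parsed JSON from `aws cloudtrail lookup-events`,
    where each entry's "CloudTrailEvent" field is itself a JSON string.
    """
    failures = []
    for event in lookup_output.get("Events", []):
        record = json.loads(event.get("CloudTrailEvent") or "{}")
        if record.get("eventName") != "AssumeRole":
            continue
        if not record.get("errorCode"):
            continue  # successful calls carry no errorCode
        failures.append(
            {
                "time": record.get("eventTime"),
                "errorCode": record.get("errorCode"),
                "errorMessage": record.get("errorMessage"),
            }
        )
    return failures


# Hypothetical sample shaped like lookup-events output.
SAMPLE_OUTPUT = {
    "Events": [
        {
            "CloudTrailEvent": json.dumps(
                {
                    "eventTime": "2024-01-01T00:00:00Z",
                    "eventSource": "sts.amazonaws.com",
                    "eventName": "AssumeRole",
                    "errorCode": "AccessDenied",
                    "errorMessage": "placeholder message",
                }
            )
        },
        {
            "CloudTrailEvent": json.dumps(
                {
                    "eventTime": "2024-01-01T00:05:00Z",
                    "eventSource": "sts.amazonaws.com",
                    "eventName": "AssumeRole",
                }
            )
        },
    ]
}

if __name__ == "__main__":
    for failure in failed_assume_role_events(SAMPLE_OUTPUT):
        print(failure["time"], failure["errorCode"], failure["errorMessage"])
```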

    Step-by-Step Implementation

      Configuration Reference

      | Error Symptom | Root Cause | Diagnostic Command | Fix |
      |---|---|---|---|
      | Workspace stuck in PROVISIONING | Cross-account role trust failure | Check CloudTrail for AssumeRole errors | Fix External ID in trust policy |
      | Cluster PENDING then TERMINATED | No outbound connectivity | Check route tables for NAT/IGW route | Add NAT Gateway and route |
      | CLOUD_PROVIDER_LAUNCH_FAILURE | EC2 quota exceeded | aws service-quotas get-service-quota | Request quota increase |
      | S3 access denied in notebooks | Instance profile missing permissions | IAM policy simulator | Add S3 actions to instance profile |
      | Workspace creation API returns 400 | Invalid network configuration | Validate subnet/SG IDs exist | Confirm subnets are in correct VPC |
      | Cluster logs show TLS errors | Security group blocking 443 egress | Check egress rules | Allow 443 outbound to 0.0.0.0/0 |
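
Two of the checks in this table (a NAT/IGW default route, and 443 egress in the security group) can be run against saved CLI output. The sketch below assumes a single route table entry from `aws ec2 describe-route-tables` and a single security group entry from `aws ec2 describe-security-groups`, both already parsed into dicts. The function names are hypothetical, and the checks cover IPv4 only.

```python
def has_default_route(route_table):
    """True if the route table has an active 0.0.0.0/0 route through a
    NAT gateway or an internet gateway."""
    for route in route_table.get("Routes", []):
        if route.get("DestinationCidrBlock") != "0.0.0.0/0":
            continue
        if route.get("State") != "active":
            continue  # e.g. a "blackhole" route to a deleted NAT gateway
        gateway_id = route.get("GatewayId") or ""
        if route.get("NatGatewayId") or gateway_id.startswith("igw-"):
            return True
    return False


def allows_https_egress(security_group):
    """True if an egress rule permits TCP 443 to 0.0.0.0/0."""
    for rule in security_group.get("IpPermissionsEgress", []):
        protocol = rule.get("IpProtocol")
        if protocol == "-1":  # all protocols, all ports
            port_ok = True
        elif protocol == "tcp":
            port_ok = rule.get("FromPort", 0) <= 443 <= rule.get("ToPort", 65535)
        else:
            port_ok = False
        open_cidr = any(
            ip_range.get("CidrIp") == "0.0.0.0/0"
            for ip_range in rule.get("IpRanges", [])
        )
        if port_ok and open_cidr:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical inputs shaped like the AWS CLI JSON output.
    route_table = {
        "Routes": [
            {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local", "State": "active"},
            {"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-0example", "State": "active"},
        ]
    }
    security_group = {
        "IpPermissionsEgress": [
            {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
             "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}
        ]
    }
    print("default route:", has_default_route(route_table))
    print("443 egress:", allows_https_egress(security_group))
```

A passing result only shows that the configuration permits the traffic on paper; network ACLs, firewalls, or DNS can still block connectivity.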

      Monitoring, Cost, and Security Considerations

      Common Pitfalls and Recommended Patterns

        Frequently Asked Questions