Creating a Databricks Workspace on Google Cloud (Step-by-Step Guide)
Who this is for:
Architecture / Concept Overview
%%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
flowchart LR
classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
ADMIN[Platform Admin] -->|gcloud + API| PROJ[GCP Project Setup]
PROJ -->|Enable| APIS[Required APIs]
APIS -->|Create| VPC_R[VPC + Subnets]
VPC_R -->|Create| SA[Service Account]
SA -->|Create| GCS_R[GCS Root Bucket]
GCS_R -->|Deploy| WS[Databricks Workspace]
WS -->|Validate| CLUSTER[Test Cluster]
ADMIN:::source
PROJ:::ingestion
APIS:::ingestion
VPC_R:::storage
SA:::governance
GCS_R:::storage
WS:::processing
CLUSTER:::serving
*Workspace provisioning flow from project setup through resource creation to deployment validation.*
%%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
graph TD
classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
WS_ARCH[Workspace Architecture] --> CP[Control Plane - Databricks Managed]
WS_ARCH --> DP[Data Plane - Customer Project]
CP --> UI[Workspace UI]
CP --> API[REST API]
CP --> SCHED[Job Scheduler]
DP --> GKE_N[GKE Node Pool]
DP --> GCS_N[GCS Root Bucket]
DP --> VPC_N[Customer VPC]
GKE_N --> PODS[Spark Driver + Executor Pods]
WS_ARCH:::processing
CP:::processing
DP:::storage
UI:::serving
API:::serving
SCHED:::processing
GKE_N:::processing
GCS_N:::storage
VPC_N:::storage
PODS:::processing
*Deployed workspace architecture showing control plane services and customer data plane resources.*
%%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
flowchart LR
classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
S1[1. Enable APIs] --> S2[2. Create VPC]
S2 --> S3[3. Create Subnet]
S3 --> S4[4. Create Service Account]
S4 --> S5[5. Create GCS Bucket]
S5 --> S6[6. Deploy Workspace]
S6 --> S7[7. Verify Status]
S7 --> S8[8. Test Cluster]
S1:::source
S2:::storage
S3:::storage
S4:::governance
S5:::storage
S6:::processing
S7:::serving
S8:::serving
*Sequential deployment checklist.*
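As a rough illustration of checklist steps 1 through 5, the sketch below assembles the corresponding `gcloud` invocations as argument lists and prints them without running anything. Every resource name, CIDR, and region in `Plan` is a hypothetical placeholder, and the list of APIs to enable is an assumption, not a confirmed requirement. Steps 6 through 8 happen on the Databricks side and are not covered here.

```python
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    # All values below are hypothetical placeholders for illustration.
    project: str = "my-project"
    region: str = "us-central1"
    network: str = "databricks-vpc"
    subnet: str = "databricks-nodes"
    node_range: str = "10.0.0.0/22"
    pod_range: str = "10.4.0.0/14"
    service_range: str = "10.8.0.0/20"
    service_account: str = "databricks-ws"
    bucket: str = "my-project-databricks-root"


def build_commands(p: Plan) -> list[list[str]]:
    """Return gcloud argument lists for checklist steps 1-5, in order."""
    proj = f"--project={p.project}"
    return [
        # 1. Enable APIs (this particular API list is an assumption).
        ["gcloud", "services", "enable", "compute.googleapis.com",
         "container.googleapis.com", "storage.googleapis.com", proj],
        # 2. Create a custom-mode VPC so subnets are defined explicitly.
        ["gcloud", "compute", "networks", "create", p.network,
         "--subnet-mode=custom", proj],
        # 3. Create the node subnet with secondary ranges for pods and services.
        ["gcloud", "compute", "networks", "subnets", "create", p.subnet,
         f"--network={p.network}", f"--region={p.region}",
         f"--range={p.node_range}",
         f"--secondary-range=pods={p.pod_range},services={p.service_range}",
         proj],
        # 4. Create the service account.
        ["gcloud", "iam", "service-accounts", "create", p.service_account,
         "--display-name=Databricks workspace", proj],
        # 5. Create a regional bucket for workspace root storage.
        ["gcloud", "storage", "buckets", "create", f"gs://{p.bucket}",
         f"--location={p.region}", proj],
    ]


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for argv in build_commands(Plan()):
        print(shlex.join(argv))
```

Printing the commands first makes it easy to review them against your own network plan before running anything against a real project.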
Key Terms
Prerequisites and Setup
- GCP project with billing enabled and Organization Admin or Project Owner role
- `gcloud` CLI installed and authenticated
- Databricks account created (via GCP Marketplace or accounts.gcp.databricks.com)
- Network plan: VPC CIDR, a subnet for GKE nodes (/22 or larger), and secondary IP ranges for pods and services
- Cloud KMS key ring and key created (for CMEK; optional but recommended)
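A network plan can be sanity-checked before any resources exist. The sketch below uses Python's standard `ipaddress` module to confirm that the node subnet is /22 or larger and that the node, pod, and service ranges do not overlap. The function name `check_network_plan` and the example CIDRs are hypothetical, and the checks cover only the constraints listed above.

```python
import ipaddress
from itertools import combinations


def check_network_plan(node_subnet: str, pod_range: str, service_range: str) -> list[str]:
    """Return a list of problems found in the plan (empty if none)."""
    ranges = {
        "node subnet": ipaddress.ip_network(node_subnet),
        "pod range": ipaddress.ip_network(pod_range),
        "service range": ipaddress.ip_network(service_range),
    }
    problems = []

    # A larger prefix length means a smaller block, so /23 or /24 fails here.
    if ranges["node subnet"].prefixlen > 22:
        problems.append(f"node subnet {node_subnet} is smaller than /22")

    # Every pair of ranges must be disjoint.
    for (name_a, net_a), (name_b, net_b) in combinations(ranges.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{name_a} overlaps {name_b}")

    return problems


if __name__ == "__main__":
    print(check_network_plan("10.0.0.0/22", "10.4.0.0/14", "10.8.0.0/20"))
```

An empty list means the plan passes these checks. `ipaddress.ip_network` raises `ValueError` for a malformed CIDR or one with host bits set, which catches typos early.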
Step-by-Step Implementation
Configuration Reference
| Parameter | Description | Default / Requirement | Recommended |
|---|---|---|---|
| Location | GCP region for the workspace | required | Match data locality |
| Subnet CIDR | IP range for GKE nodes | /22 minimum | /20 for large deployments |
| GKE Master IP Range | Control plane CIDR | /28 required | Non-overlapping /28 |
| Pod IP Range | Secondary range for Kubernetes pods | /14 | /14 for scale |
| Service IP Range | Secondary range for Kubernetes services | /20 | /20 |
| GCS Root Bucket | Workspace storage | required | Regional, CMEK encrypted |
| Pricing Tier | Workspace capabilities | STANDARD | PREMIUM for governance |
| Private Service Connect | Private control plane connectivity | disabled | enabled for production |
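To make the sizes in the table concrete, the following sketch models the parameters as a small dataclass, checks the constraints the table states (node subnet /22 or larger, a /28 master range, a known pricing tier), and computes how many addresses each range provides. `WorkspaceSettings` and its field names are hypothetical and do not correspond to any Databricks or GCP API. The default CIDRs are example values chosen only to match the prefix lengths in the table.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class WorkspaceSettings:
    # Hypothetical container for the table's parameters; not a real API type.
    location: str                              # required, no default
    gcs_root_bucket: str                       # required, no default
    subnet_cidr: str = "10.0.0.0/22"           # example /22 node subnet
    gke_master_range: str = "10.3.0.0/28"      # example /28 control plane range
    pod_range: str = "10.4.0.0/14"             # example /14 pod range
    service_range: str = "10.8.0.0/20"         # example /20 service range
    pricing_tier: str = "STANDARD"
    private_service_connect: bool = False

    def validate(self) -> list[str]:
        """Check the constraints stated in the table; return any problems."""
        problems = []
        if ipaddress.ip_network(self.subnet_cidr).prefixlen > 22:
            problems.append("subnet_cidr must be /22 or larger")
        if ipaddress.ip_network(self.gke_master_range).prefixlen != 28:
            problems.append("gke_master_range must be a /28")
        # Only the tiers named in the table are accepted here.
        if self.pricing_tier not in ("STANDARD", "PREMIUM"):
            problems.append("pricing_tier must be STANDARD or PREMIUM")
        return problems

    def address_counts(self) -> dict[str, int]:
        """Total addresses in each range (2 ** (32 - prefix length))."""
        fields = ("subnet_cidr", "gke_master_range", "pod_range", "service_range")
        return {f: ipaddress.ip_network(getattr(self, f)).num_addresses for f in fields}


if __name__ == "__main__":
    settings = WorkspaceSettings(location="us-central1", gcs_root_bucket="example-bucket")
    print(settings.validate())
    print(settings.address_counts())
```

With the defaults, the counts are 1,024 addresses for the /22 node subnet, 16 for the /28 master range, 262,144 for the /14 pod range, and 4,096 for the /20 service range. Moving the node subnet to the recommended /20 raises it to 4,096.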