Alternatively, configure a specific service account

    Who this is for:

    Architecture / Concept Overview: Alternatively, configure a specific service account

    %%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
    flowchart LR
        classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED

        DBX[Databricks Workspace] -->|Read/Write| GCS[GCS Delta Lake]
        DBX -->|Read/Write| BQ[BigQuery Tables]
        DBX -->|Train| MODEL[ML Model]
        MODEL -->|Register| MLFLOW[MLflow Registry]
        MLFLOW -->|Deploy| VERTEX[Vertex AI Endpoint]
        GCS -->|External Table| BQ
        BQ -->|BI| LOOKER[Looker / Data Studio]
        VERTEX -->|Predict| APP[Applications]

        DBX:::processing
        GCS:::storage
        BQ:::serving
        MODEL:::processing
        MLFLOW:::governance
        VERTEX:::serving
        LOOKER:::serving
        APP:::source

    *End-to-end integration flow showing Databricks as the processing hub connecting GCS storage, BigQuery analytics, and Vertex AI serving.*

    %%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
    graph TD
        classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED

        GCS_INT[GCS Integration] --> DIRECT[Direct gs:// Access]
        GCS_INT --> MOUNT[DBFS Mount]
        GCS_INT --> UC_EXT[Unity Catalog External Location]
        BQ_INT[BigQuery Integration] --> SPARK_BQ[Spark BigQuery Connector]
        BQ_INT --> BQ_FED[BigQuery Federated - Delta Tables]
        BQ_INT --> BQ_MAT[Materialized Views]
        VERTEX_INT[Vertex AI Integration] --> MLFLOW_D[MLflow Model Export]
        VERTEX_INT --> FEAT[Feature Store Sync]
        VERTEX_INT --> PIPE[Vertex Pipelines]

        GCS_INT:::storage
        DIRECT:::storage
        MOUNT:::storage
        UC_EXT:::governance
        BQ_INT:::serving
        SPARK_BQ:::serving
        BQ_FED:::serving
        BQ_MAT:::serving
        VERTEX_INT:::processing
        MLFLOW_D:::processing
        FEAT:::processing
        PIPE:::processing

    *Integration patterns for each GCP service showing multiple connectivity approaches.*

    Key Terms

    Prerequisites and Setup

    • Databricks workspace deployed on GCP with a running cluster
    • GCS buckets created for your data lake zones
    • BigQuery dataset created in the target project
    • Service account with appropriate IAM roles for GCS, BigQuery, and Vertex AI
    • Python packages: google-cloud-bigquery, google-cloud-aiplatform (installed on cluster)
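One quick way to confirm the last prerequisite is to check, from a notebook cell, that both packages can be imported. The sketch below is illustrative only: the helper names `is_importable` and `missing_packages` are hypothetical, and the mapping from pip package name to import path reflects how these two packages are normally imported.

```python
import importlib.util
from typing import Dict, List

# pip package name -> module path it is normally imported under
REQUIRED_MODULES: Dict[str, str] = {
    "google-cloud-bigquery": "google.cloud.bigquery",
    "google-cloud-aiplatform": "google.cloud.aiplatform",
}


def is_importable(module_name: str) -> bool:
    """Return True if the module can be located without fully importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (e.g. "google.cloud") is itself missing.
        return False


def missing_packages(required: Dict[str, str] = REQUIRED_MODULES) -> List[str]:
    """Return the pip names of required packages that are not importable."""
    return sorted(pkg for pkg, mod in required.items() if not is_importable(mod))


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If the list is non-empty, install the named packages on the cluster before continuing.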

    Step-by-Step Implementation

      Configuration Reference

      Alternatively, configure a specific service account configuration options
      | Integration       | Configuration                        | Auth Method                         | Performance Notes                         |
      |-------------------|--------------------------------------|-------------------------------------|-------------------------------------------|
      | GCS Read/Write    | gs:// path, format("delta")          | Workload Identity / Service Account | Parallel reads via Spark partitions       |
      | BigQuery Read     | format("bigquery"), table option     | Service Account                     | Uses Storage Read API for high throughput |
      | BigQuery Write    | format("bigquery"), temp GCS bucket  | Service Account                     | Stages data in GCS then bulk loads        |
      | BigQuery External | Delta Lake external table            | BigQuery service account            | Direct read from GCS, no copy needed      |
      | Vertex AI Deploy  | aiplatform.Model.upload()            | Service Account                     | Model artifacts must be in GCS            |
      | MLflow Registry   | Automatic with Databricks            | Workspace authentication            | Tracks versions and deployment stage      |
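The first three rows and the Vertex AI row of the table can be sketched as small Python helpers. This is a minimal illustration, not the article's own implementation: the helper names are hypothetical, `spark` and `df` stand for an existing SparkSession and DataFrame on a cluster that already has the Spark BigQuery connector and service account access configured, and the table, bucket, and image values are placeholders you would supply. `temporaryGcsBucket` is the connector option that names the staging bucket for writes.

```python
from typing import Any, Dict


def read_delta_from_gcs(spark: Any, path: str) -> Any:
    """Read a Delta table directly from a gs:// path."""
    if not path.startswith("gs://"):
        raise ValueError(f"expected a gs:// path, got {path!r}")
    return spark.read.format("delta").load(path)


def read_bigquery_table(spark: Any, table: str) -> Any:
    """Read a BigQuery table ("project.dataset.table") through the Spark connector."""
    return spark.read.format("bigquery").option("table", table).load()


def write_bigquery_table(df: Any, table: str, temp_gcs_bucket: str,
                         mode: str = "append") -> None:
    """Write a DataFrame to BigQuery, staging the data in a GCS bucket first."""
    (
        df.write.format("bigquery")
        .option("table", table)
        .option("temporaryGcsBucket", temp_gcs_bucket)
        .mode(mode)
        .save()
    )


def vertex_upload_kwargs(display_name: str, artifact_uri: str,
                         serving_container_image_uri: str) -> Dict[str, str]:
    """Build keyword arguments for aiplatform.Model.upload().

    Rejects artifact locations outside GCS, since the model artifacts
    must be in GCS.
    """
    if not artifact_uri.startswith("gs://"):
        raise ValueError("model artifacts must be in GCS (gs:// URI)")
    return {
        "display_name": display_name,
        "artifact_uri": artifact_uri,
        "serving_container_image_uri": serving_container_image_uri,
    }
```

On a cluster, the last helper would be used as `aiplatform.Model.upload(**vertex_upload_kwargs(...))` after `from google.cloud import aiplatform` and `aiplatform.init(...)`. Taking `spark` and `df` as parameters keeps the helpers free of cluster-specific setup.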

      Monitoring, Cost, and Security Considerations

      Common Pitfalls and Recommended Patterns

        Frequently Asked Questions