SQL Warehouses: Serverless vs Classic Options

SQL warehouses are Databricks compute endpoints optimised exclusively for SQL queries, dashboards, and BI tool connections. Serverless SQL warehouses start in seconds with zero infrastructure management, while Classic SQL warehouses run in your cloud account with full network control. Choose serverless for most analytical workloads and classic only when you need custom VPC configurations or specific compliance requirements.

  • Understand the architecture and trade-offs of serverless vs classic SQL warehouses
  • Create, configure, and size SQL warehouses for different workloads
  • Connect BI tools and optimise warehouse performance and cost

Who this is for: Data analysts, BI engineers, and platform administrators who run SQL workloads and connect BI tools to Databricks.

Part of the Databricks Compute section of the Databricks tutorial series.

Architecture / Concept Overview: SQL Warehouses: Serverless vs Classic Options

SQL warehouses provide a SQL-native compute layer with built-in concurrency management, automatic scaling, and Photon acceleration. Unlike general-purpose clusters, SQL warehouses are purpose-built for query execution and do not support Python, Scala, or R. They accept connections from the SQL editor, dashboards, JDBC/ODBC drivers, and partner BI tools.

%%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%% flowchart LR classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED Analyst[SQL Editor]:::source --> SW[SQL Warehouse]:::serving BI[BI Tool via JDBC]:::source --> SW Dash[Dashboard]:::source --> SW SW --> Photon[Photon Engine]:::processing Photon --> DL[(Delta Lake)]:::storage

*SQL warehouses accept queries from the SQL editor, BI tools, and dashboards, executing them through Photon on Delta Lake.*

The key architectural difference between serverless and classic lies in where the compute runs and who manages it.

%%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%% graph TD classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED subgraph Serverless SCP[Databricks Account]:::serving --> SWH[Managed Warehouse]:::serving end subgraph Classic YourVPC[Your Cloud VPC]:::processing --> CWH[Self-Managed Warehouse]:::processing end SWH --> Store[(Cloud Storage)]:::storage CWH --> Store UC[Unity Catalog]:::governance -.->|governs| SWH UC -.->|governs| CWH

*Serverless warehouses run in a Databricks-managed account; classic warehouses run in your VPC. Both access the same storage.*

SQL warehouses scale by adding or removing clusters (not individual workers) to handle concurrent queries.

%%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%% flowchart LR classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED Queries[Incoming Queries]:::source --> LB[Load Balancer]:::governance LB --> Cl1[Cluster 1]:::serving LB --> Cl2[Cluster 2]:::serving LB --> ClN[Cluster N]:::serving

*SQL warehouses scale horizontally by adding clusters behind a load balancer to serve concurrent queries.*

Key Terms

SQL Warehouse
A managed SQL compute endpoint for running queries, dashboards, and BI connections on Databricks.
T-Shirt Sizing
SQL warehouse capacity specified as Small, Medium, Large, etc., representing increasing compute power.
Max Clusters
The maximum number of parallel clusters a SQL warehouse can scale to for concurrency.
Auto-Stop
Automatic warehouse suspension after a configurable idle period to stop billing.
Photon
The C++ vectorised engine that all SQL warehouses use for high-performance query execution.
Query History
A log of all queries executed on a warehouse, including duration, DBU cost, and user.

Prerequisites and Setup

  • A Databricks workspace with SQL warehouse permissions
  • Unity Catalog enabled for data access governance
  • A BI tool with JDBC/ODBC support (if connecting externally)
  • Understanding of query concurrency requirements

Step-by-Step Implementation

    Configuration Reference

    SQL Warehouses: Serverless vs Classic Options configuration options
    SettingServerlessClassic
    Start timeSeconds3-7 minutes
    InfrastructureDatabricks-managedYour cloud VPC
    PhotonAlways onAlways on
    Auto-stopConfigurable (5+ min)Configurable (10+ min)
    Max clustersUp to account limitUp to account limit
    SizingT-shirt (2X-Small to 4X-Large)T-shirt (2X-Small to 4X-Large)
    Custom networkingNoYes (VPC peering, Private Link)
    DBU rateHigher per-DBULower per-DBU
    Idle costNear zero (fast stop)VM cost until stopped

    Monitoring, Cost, and Security Considerations

    Monitoring

    Use query history in the SQL warehouse UI and system.query.history system table to track per-query duration, queued time, and resource usage. Alert on queries that exceed duration thresholds or warehouses that fail to auto-stop.

    Cost Optimisation

    - Set auto-stop to the shortest acceptable interval (5-10 minutes for serverless, 10-15 for classic).

    - Right-size the warehouse: start with Small and scale up only if query latency is unacceptable.

    - Use max clusters to cap horizontal scaling and prevent runaway concurrency costs.

    - Serverless eliminates idle VM costs but has a higher per-DBU rate — compare total cost for your query volume.

    Security and Governance

    - SQL warehouses enforce Unity Catalog permissions; every query runs with the caller's identity.

    - Classic warehouses support Private Link and VPC peering for private connectivity.

    - Serverless warehouses use encryption in transit and at rest with optional customer-managed keys.

    - Use warehouse permissions to control who can use or manage each warehouse.

    Common Pitfalls and Recommended Patterns

    • Over-sizing the warehouse: a Small warehouse handles many analytical queries; measure before upsizing.
    • Setting max clusters too high: each additional cluster costs DBUs; cap it at expected peak concurrency.
    • Not setting auto-stop: warehouses left running overnight are a common cost overrun.
    • Using a general-purpose cluster for SQL workloads: SQL warehouses offer better concurrency, caching, and cost for SQL.
    • Ignoring query history: use it to find expensive queries and optimize them before scaling compute.
    • Mixing workloads on one warehouse: separate exploratory and production dashboards to avoid contention.

    Frequently Asked Questions

    Can I use a SQL warehouse for Python or Scala?

    No. SQL warehouses support only SQL. For Python, Scala, or R workloads, use an all-purpose cluster, job cluster, or serverless notebook.

    How does the SQL warehouse handle concurrent queries?

    A single cluster in the warehouse can handle multiple concurrent queries. When concurrency exceeds the capacity of one cluster, the warehouse scales out by adding more clusters (up to max_num_clusters).

    Should I use Pro or Classic warehouse type?

    Pro warehouses support additional features like query profiling, intelligent workload management, and predictive optimisation. Use Pro unless you have a strong reason to use Classic.

    What is the smallest SQL warehouse I can create?

    The smallest size is 2X-Small, which is suitable for light ad-hoc queries. For dashboards with concurrent users, start with Small or Medium.