Databricks Notebooks

Databricks Notebooks are the primary interactive development environment on the platform, supporting Python, SQL, Scala, and R in a single document with real-time collaboration, built-in visualisation, and direct access to Spark and Unity Catalog. Notebooks serve as the entry point for data exploration, pipeline prototyping, and ad-hoc analysis, and can be scheduled as production jobs. This page is the pillar overview for the entire Notebooks section.

  • Understand the notebook interface and its role in the Databricks workflow
  • Learn the capabilities that set Databricks notebooks apart from Jupyter and other tools
  • Navigate to focused tutorials on each notebook feature and best practice

Who this is for: Data engineers, analysts, and data scientists who use notebooks for development, exploration, or production workloads on Databricks.

Architecture / Concept Overview: Databricks Notebooks

Databricks Notebooks are served from the Databricks control plane and execute code on attached compute resources (clusters, serverless, or SQL warehouses). Each notebook consists of ordered cells that can contain code, markdown, or SQL. The control plane stores the notebook and dispatches cell commands, while variables and session state live in the language interpreter on the attached compute, which also handles distributed execution.

%%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
flowchart LR
  classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  User[Developer]:::source --> NB[Notebook]:::serving
  NB --> Cluster[Compute Cluster]:::processing
  NB --> Serverless[Serverless Compute]:::serving
  Cluster --> DL[(Delta Lake)]:::storage
  Serverless --> DL
  NB --> UC[Unity Catalog]:::governance

*Notebooks connect to clusters or serverless compute, accessing data through Unity Catalog.*

Notebooks support multiple languages within a single document using magic commands, making them versatile for mixed workloads.

%%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
graph TD
  classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  NB[Notebook Document]:::serving --> Python[Python Cells]:::processing
  NB --> SQL[SQL Cells]:::processing
  NB --> Scala[Scala Cells]:::processing
  NB --> R[R Cells]:::processing
  NB --> MD[Markdown Cells]:::source
  NB --> Widgets[Interactive Widgets]:::ingestion

*A single notebook can contain Python, SQL, Scala, R, and markdown cells with interactive widgets.*
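To make the mixed-language model concrete, the sketch below splits a notebook exported in Python source format into cells and reads each cell's language from its magic command. It assumes the export layout of a `# Databricks notebook source` header, `# COMMAND ----------` separators, and `# MAGIC` prefixes on non-default-language cells; `split_cells` and `SAMPLE` are hypothetical names used only for illustration, and this is not how Databricks itself parses notebooks.

```python
HEADER = "# Databricks notebook source"
CELL_SEPARATOR = "# COMMAND ----------"
MAGIC_MARKER = "# MAGIC"

# Language magics map to a language name; other magics (%run, %pip) pass through.
LANGUAGE_MAGICS = {
    "%python": "python",
    "%sql": "sql",
    "%scala": "scala",
    "%r": "r",
    "%md": "markdown",
}


def _strip_magic(line):
    """Remove the '# MAGIC' marker and the single space that follows it."""
    rest = line[len(MAGIC_MARKER):]
    return rest[1:] if rest.startswith(" ") else rest


def split_cells(source, default_language="python"):
    """Return a list of (language, body) tuples, one per non-empty cell."""
    text = source.strip()
    if text.startswith(HEADER):
        text = text[len(HEADER):]
    cells = []
    for raw in text.split(CELL_SEPARATOR):
        lines = raw.strip().splitlines()
        if not lines:
            continue
        if not lines[0].startswith(MAGIC_MARKER):
            # No magic command: the cell uses the notebook's default language.
            cells.append((default_language, "\n".join(lines)))
            continue
        stripped = [_strip_magic(ln) if ln.startswith(MAGIC_MARKER) else ln
                    for ln in lines]
        tokens = stripped[0].split(None, 1)
        magic = tokens[0] if tokens else ""
        inline = tokens[1] if len(tokens) > 1 else ""
        language = LANGUAGE_MAGICS.get(magic, magic or default_language)
        body = "\n".join([inline] + stripped[1:]).strip()
        cells.append((language, body))
    return cells


SAMPLE = """# Databricks notebook source
rows = [1, 2, 3]

# COMMAND ----------

# MAGIC %sql
# MAGIC SELECT 1 AS one

# COMMAND ----------

# MAGIC %md
# MAGIC ## Notes
"""

for language, body in split_cells(SAMPLE):
    print(language, "|", body)
```

The first cell carries no magic command, so it falls back to the default language passed to `split_cells`.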

%%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
flowchart LR
  classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  Explore[Explore Data]:::source --> Prototype[Prototype Pipeline]:::processing
  Prototype --> Schedule[Schedule as Job]:::serving
  Schedule --> Monitor[Monitor Results]:::governance

*Notebooks support the full development lifecycle: explore, prototype, schedule, and monitor.*

Key Terms

Notebook
An interactive document composed of ordered cells that execute code, display markdown, or render visualisations.
Cell
A single executable unit within a notebook that contains code, SQL, or markdown.
Magic Command
A cell prefix that overrides the notebook's default language for that cell (%python, %sql, %scala, %r), renders markdown (%md), or invokes a utility such as %run or %pip.
Widget
An interactive input control (text, dropdown, combobox, multiselect) that parameterises notebook execution.
Notebook Workflow
Running one notebook from another using dbutils.notebook.run(), passing parameters as a map and receiving the string that the called notebook returns through dbutils.notebook.exit().
Co-Authoring
Real-time collaborative editing where multiple users work on the same notebook simultaneously.
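As an illustration of the Widget term above, here is a minimal sketch of reading a runtime parameter from a text widget. It relies on `dbutils.widgets.text` and `dbutils.widgets.get`, which the Databricks runtime provides inside a notebook; the helper name `get_param`, the environment-variable fallback, and the `run_date` parameter are assumptions added so the snippet also runs outside Databricks.

```python
import os


def get_param(name, default):
    """Read a parameter from a notebook widget, or from the environment elsewhere."""
    try:
        # `dbutils` is injected into the notebook namespace by the Databricks runtime.
        widgets = dbutils.widgets  # noqa: F821
    except NameError:
        # Assumption for local runs: fall back to an upper-cased environment variable.
        return os.environ.get(name.upper(), default)
    widgets.text(name, default)  # declare a text widget with a default value
    return widgets.get(name)     # current value, including one passed in by a job


run_date = get_param("run_date", "2024-01-01")
print(f"Processing data for {run_date}")
```

Because the lookup of `dbutils` depends on notebook globals, a helper like this would need `dbutils` passed in explicitly if it were moved into an imported module.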

Prerequisites and Setup

  • A Databricks workspace with notebook access
  • Compute resources (cluster, serverless, or SQL warehouse) to attach to
  • Basic familiarity with Python, SQL, or another supported language
  • Unity Catalog enabled for governed data access

Step-by-Step Implementation

    Configuration Reference

    Databricks Notebooks configuration options
    | Feature | Description | Where to Learn More |
    | --- | --- | --- |
    | Multi-language support | Python, SQL, Scala, R in one notebook | Supported Languages tutorial |
    | Magic commands | %python, %sql, %run, %pip | Magic Commands tutorial |
    | Widgets | Interactive parameters | Widgets tutorial |
    | Co-authoring | Real-time multi-user editing | Collaboration tutorial |
    | Scheduling | Run notebooks as jobs | Creating and Scheduling tutorial |
    | Debugging | Interactive breakpoint debugger | Debugging tutorial |
    | Unit testing | Test notebook code with frameworks | Unit Testing tutorial |
    | AI assistant | Data Science Agent integration | AI Assistant tutorial |
    | Import/Export | DBC, Jupyter, Python formats | Import/Export tutorial |

    Monitoring, Cost, and Security Considerations

    Monitoring

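    Output and status for scheduled notebook runs are available in the job run history. Inside a running notebook, the undocumented dbutils.notebook.entry_point accessor exposes run context metadata rather than logs. Track which notebooks run most frequently and their resource consumption through system tables.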

    Cost Optimisation

    - A notebook incurs compute cost for as long as its attached compute is running, not only while cells execute. Use auto-termination to stop idle clusters.

    - For ad-hoc exploration, use serverless notebooks to avoid paying for cluster idle time.

    - Schedule notebooks as jobs on job clusters for lower DBU rates in production.
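The job-cluster recommendation above can be sketched as a job definition that runs a notebook on a cluster created for the run. The field names follow the Jobs API create-job request as commonly documented, but should be checked against the current API reference. The helper `build_notebook_job`, the notebook path, the runtime version, the node type, and the worker count are all placeholder assumptions.

```python
import json


def build_notebook_job(name, notebook_path, parameters, cron, timezone="UTC"):
    """Build a job definition that runs one notebook on a dedicated job cluster."""
    return {
        "name": name,
        "schedule": {
            "quartz_cron_expression": cron,
            "timezone_id": timezone,
        },
        "tasks": [
            {
                "task_key": "run_notebook",
                "notebook_task": {
                    "notebook_path": notebook_path,
                    # Delivered to the notebook as widget values.
                    "base_parameters": parameters,
                },
                # A new cluster per run, instead of an all-purpose cluster.
                "new_cluster": {
                    "spark_version": "15.4.x-scala2.12",  # placeholder runtime
                    "node_type_id": "i3.xlarge",          # placeholder node type
                    "num_workers": 2,
                },
            }
        ],
    }


spec = build_notebook_job(
    name="nightly-orders",
    notebook_path="/Workspace/Users/someone@example.com/orders",  # hypothetical path
    parameters={"run_date": "2024-01-01"},
    cron="0 0 2 * * ?",  # 02:00 every day, Quartz syntax
)
print(json.dumps(spec, indent=2))
```

The printed JSON could then be supplied to the Jobs REST API or to the CLI's job creation command.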

    Security and Governance

    - Notebook access is controlled by workspace permissions (can read, can run, can edit, can manage).

    - Unity Catalog enforces data access policies regardless of notebook language.

    - Use Git folders (formerly Repos) for version control and code review before promotion to production.

    Common Pitfalls and Recommended Patterns

    • Using notebooks for everything: extract reusable logic into Python modules and test them outside notebooks.
    • Not using version control: connect notebooks to Git repos for history, review, and rollback.
    • Running all exploration on expensive clusters: use serverless for ad-hoc work.
    • Ignoring cell execution order: notebook state reflects the order in which cells were run, not their position on the page, so rerunning cells out of order can leave variables holding stale or unexpected values.
    • Hardcoding parameters: use widgets for runtime parameters instead of editing code.
    • Skipping the AI assistant: the Data Science Agent can generate code, explain errors, and suggest optimisations.
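The execution-order pitfall in the list above is easy to reproduce. The following sketch simulates three notebook cells sharing one namespace and runs them in two different orders; `CELLS` and `run_cells` are hypothetical stand-ins for a notebook session, not a Databricks API.

```python
# Three "cells" in page order. Cell 3 reassigns a variable that cell 2 reads.
CELLS = {
    1: "threshold = 10",
    2: "kept = [v for v in values if v > threshold]",
    3: "threshold = 50",
}


def run_cells(order, values):
    """Execute cells in the given order against one shared namespace."""
    namespace = {"values": values}
    for cell_id in order:
        exec(CELLS[cell_id], namespace)
    return namespace["kept"]


data = [5, 20, 80]

# Top-to-bottom run: cell 2 sees threshold = 10.
print(run_cells([1, 2, 3], data))  # [20, 80]

# Cell 2 rerun after cell 3: same code, different result.
print(run_cells([1, 3, 2], data))  # [80]
```

Running every cell from the top in a fresh session before scheduling a notebook is one way to confirm that the result does not depend on leftover state.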

    Frequently Asked Questions

    How are Databricks notebooks different from Jupyter notebooks?

    Databricks notebooks offer built-in collaboration, native Spark integration, multi-language support in one notebook, scheduling, and Unity Catalog governance. Jupyter notebooks can be imported into Databricks.

    Can I use notebooks for production workloads?

    Yes. Notebooks can be scheduled as jobs and monitored through the jobs UI. For complex production pipelines, consider extracting logic into modules and using notebooks as thin orchestration wrappers.
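A minimal sketch of the "thin wrapper" idea: the logic lives in a plain function that a test framework can exercise without a cluster, and the notebook only calls it. The function `latest_per_key` and its sample records are hypothetical. In practice the same split would apply to functions that take and return Spark DataFrames, and plain Python data is used here only so the example runs anywhere.

```python
def latest_per_key(records, key, order_by):
    """Keep, for each key value, the record with the largest order_by value."""
    latest = {}
    for record in records:
        k = record[key]
        if k not in latest or record[order_by] > latest[k][order_by]:
            latest[k] = record
    return list(latest.values())


def test_latest_per_key():
    """A pytest-style test that needs no notebook or cluster."""
    records = [
        {"id": 1, "updated": 1, "status": "new"},
        {"id": 1, "updated": 3, "status": "shipped"},
        {"id": 2, "updated": 2, "status": "new"},
    ]
    result = latest_per_key(records, key="id", order_by="updated")
    assert result == [records[1], records[2]]


test_latest_per_key()
```

The notebook cell then shrinks to an import and a single call, which keeps the scheduled artifact small and the tested code in version control.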

    Which language should I use?

    Python is the most popular and has the broadest library ecosystem. Use SQL for data analysis and transformations. Scala is useful for performance-critical code. R is best for statistical analysis.

    Can I run notebooks from the CLI?

    Yes. Create a job with a notebook task using databricks jobs create, then trigger it with databricks jobs run-now. From inside the workspace, dbutils.notebook.run() runs one notebook from another.
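For the `dbutils.notebook.run()` route, a common pattern is to wrap the call in a small retry loop. The sketch below takes the runner as an argument so it can be exercised outside Databricks; `run_with_retry` and `make_flaky_runner` are hypothetical helpers, and the only Databricks call assumed is `dbutils.notebook.run(path, timeout_seconds, arguments)`, which returns the string the child notebook passes to `dbutils.notebook.exit()`.

```python
import json


def run_with_retry(run, path, timeout_seconds, arguments, max_retries=2):
    """Call run(path, timeout_seconds, arguments), retrying on any exception."""
    attempts = 0
    while True:
        try:
            return run(path, timeout_seconds, arguments)
        except Exception:
            attempts += 1
            if attempts > max_retries:
                raise  # give up and surface the last error


def make_flaky_runner(failures):
    """Stand-in for dbutils.notebook.run that fails a fixed number of times."""
    state = {"remaining": failures}

    def run(path, timeout_seconds, arguments):
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise RuntimeError("simulated transient failure")
        # A child notebook would return this through dbutils.notebook.exit(...).
        return json.dumps({"path": path, "arguments": arguments})

    return run


# Inside a notebook, the first argument would be dbutils.notebook.run instead.
result = run_with_retry(make_flaky_runner(2), "./child_notebook", 600,
                        {"run_date": "2024-01-01"})
print(json.loads(result)["arguments"]["run_date"])
```

Returning JSON from the child notebook is a convention rather than a requirement, since the return value is a plain string.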