Check the Spark session

Databricks Notebooks provide an interactive, cell-based development environment where you write and execute code directly against Spark clusters or serverless compute, with built-in visualisation, collaboration, and scheduling. They combine the familiarity of Jupyter-style notebooks with Databricks-specific features like multi-language support, Unity Catalog integration, and real-time co-authoring. Start here to understand the notebook interface and workflow.

  • Understand the notebook interface, cell types, and execution model
  • Create your first notebook and connect it to compute
  • Learn the core actions: run cells, view results, and save work

Who this is for: Anyone new to Databricks who wants to start working with notebooks for data exploration, analysis, or pipeline development.

Part of the Databricks Notebooks section of the Databricks tutorial series.

Architecture / Concept Overview: Check the Spark session

A notebook is a sequence of cells stored in the Databricks workspace. When you attach a notebook to compute and run a cell, the code is sent to the cluster driver, executed (with Spark operations potentially distributed across workers), and the results are returned to the notebook UI. Session state (variables, imports, temporary views) persists across cell executions until the notebook is detached or the cluster restarts.
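As a minimal sketch of what checking the Spark session might look like in a Python cell: Databricks predefines a `spark` variable in an attached notebook. The helper name `describe_spark_session` and the fields it reports are illustrative assumptions, not something prescribed by this tutorial. The guard lets the same code run outside Databricks, where no `spark` object exists.

```python
def describe_spark_session(session):
    """Summarise a SparkSession-like object (hypothetical helper)."""
    return {
        # Spark version of the attached compute
        "version": session.version,
        # conf.get(key, default) falls back to the default when the key is unset
        "app_name": session.conf.get("spark.app.name", "unknown"),
    }


# Databricks predefines `spark` in an attached notebook; elsewhere it is missing.
session = globals().get("spark")
if session is None:
    print("No `spark` object found: attach the notebook to compute first.")
else:
    print(describe_spark_session(session))
```

If the cell prints a version, the notebook is attached and the session is live; if `spark` is undefined, the notebook is most likely not attached to compute.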

%%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
flowchart LR
  classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  Cell[Notebook Cell]:::source --> Driver[Spark Driver]:::processing
  Driver --> Workers[Spark Workers]:::processing
  Workers --> Data[(Delta Lake)]:::storage
  Workers --> Driver
  Driver --> Results[Cell Results]:::serving

*Code flows from the notebook cell to the Spark driver, distributes across workers, reads from storage, and returns results.*

The notebook workspace organises notebooks into folders with version history and permissions.

%%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
graph TD
  classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  WS[Workspace]:::governance --> Home["/Users/you/"]:::source
  WS --> Shared["/Shared/"]:::source
  WS --> Repos["/Repos/"]:::source
  Home --> NB1[Exploration Notebook]:::serving
  Shared --> NB2[Team Notebook]:::serving
  Repos --> NB3[Git-Synced Notebook]:::serving

*Notebooks live in workspace folders: personal (/Users), shared (/Shared), or Git-synced (/Repos).*

%%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
flowchart LR
  classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
  Edit[Edit Cell]:::source --> Run[Run Cell]:::processing
  Run --> View[View Results]:::serving
  View --> Iterate[Iterate]:::source
  Iterate --> Edit

*The notebook workflow is an iterative loop: edit, run, view results, and iterate.*

Key Terms

Notebook
An interactive document of ordered cells for code execution, visualisation, and documentation.
Cell
A single block of a notebook containing either code (Python, SQL, Scala, or R) or markdown.
Attach
Connecting a notebook to a compute resource (cluster or serverless) for execution.
Session
The runtime state (variables, imports, Spark context) maintained while a notebook is attached to compute.
Workspace
The file system within Databricks that stores notebooks, folders, and other assets.
Revision History
Automatic version snapshots of notebook content for auditing and rollback.
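To make the Session term concrete, here is a toy model in plain Python. It is only an analogy under stated assumptions: the class `NotebookSession` and its methods are hypothetical, and real session state lives on the attached compute, not in a local dictionary. The sketch shows cells sharing one namespace and what happens to later cells once that state is cleared.

```python
class NotebookSession:
    """Toy model of a notebook session: all cells share one namespace."""

    def __init__(self):
        self.namespace = {}

    def run_cell(self, source):
        # Every cell executes against the same namespace, so names persist.
        exec(source, self.namespace)

    def clear_state(self):
        # Comparable to detaching and reattaching: all names are lost.
        self.namespace = {}


session = NotebookSession()
session.run_cell("threshold = 10")          # cell 1 defines a variable
session.run_cell("result = threshold * 2")  # cell 2 relies on cell 1
print(session.namespace["result"])          # prints 20

session.clear_state()
try:
    # Re-running only cell 2 now fails: its dependency is gone.
    session.run_cell("result = threshold * 2")
except NameError as err:
    print(f"After clearing state: {err}")
```

The same mechanism explains why running cells out of order can appear to work in a long-lived session and then fail on a fresh attach.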

Prerequisites and Setup

  • A Databricks workspace with notebook access
  • A running cluster or serverless compute to attach to
  • Basic familiarity with Python or SQL
  • A web browser (notebooks run in the Databricks web UI)

Step-by-Step Implementation

    Configuration Reference

    Check the Spark session configuration options
    | Feature | Shortcut / Location | Description |
    |---|---|---|
    | Run cell | Shift + Enter | Execute current cell and move to next |
    | Run all | Ctrl + Shift + Enter | Execute all cells in order |
    | Add cell | Click + between cells | Insert a new cell |
    | Change language | Cell menu → Language | Switch cell to Python, SQL, Scala, or R |
    | Toggle markdown | %md at cell top | Render cell as markdown |
    | Clear state | Detach/Reattach or Clear State | Reset all variables and imports |
    | Revision history | File → Revision History | View and restore previous versions |
    | Comments | Highlight text → Comment | Add inline comments for collaboration |

    Monitoring, Cost, and Security Considerations

    Monitoring

    Each cell shows execution time and the compute resource used. The notebook activity log records who ran what and when. For scheduled notebooks, job run history provides detailed execution logs.

    Cost Optimisation

    - Detach notebooks from clusters when you are not actively working, and set an auto-termination period so that idle clusters shut down instead of accruing cost.

    - Use serverless compute for ad-hoc exploration to cut cluster startup waits and avoid paying for idle time.

    - Unpersist cached DataFrames (df.unpersist()) when they are no longer needed to free executor memory.
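Freeing memory held by DataFrames can be sketched as follows, assuming the DataFrames were explicitly cached with `cache()` or `persist()`; an uncached DataFrame is evaluated lazily and does not pin executor memory on its own. The helper name `release_cached` is hypothetical; `unpersist()` is the standard PySpark DataFrame method.

```python
def release_cached(*dataframes):
    """Unpersist each DataFrame and return how many were released (hypothetical helper)."""
    for df in dataframes:
        # DataFrame.unpersist() marks the DataFrame as non-persistent
        # and removes its cached blocks from memory and disk.
        df.unpersist()
    return len(dataframes)
```

In a notebook this might be called as `release_cached(orders_df, customers_df)` once the analysis that needed those caches is finished; both DataFrame names are placeholders.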

    Security and Governance

    - Notebook permissions control who can view, run, edit, or manage each notebook.

    - Unity Catalog enforces data access policies regardless of which notebook runs the query.

    - Revision history provides an audit trail of all changes to notebook content.

    - Use Repos for Git-based version control and code review workflows.

    Common Pitfalls and Recommended Patterns

    • Running cells out of order: this creates hidden state dependencies; use "Run All" to verify top-to-bottom execution.
    • Not detaching from clusters: an idle cluster keeps consuming DBUs until it is terminated or its auto-termination timeout elapses, so detach when you finish and keep auto-termination enabled.
    • Hardcoding values: use widgets or parameters instead of editing cell code for different runs.
    • Storing sensitive data in notebook cells: use secrets (dbutils.secrets.get()) instead.
    • Skipping markdown cells: document your analysis inline for future you and your collaborators.
    • Ignoring revision history: use it to roll back unintended changes rather than trying to undo manually.
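The widget and secrets recommendations above can be sketched together. This is a minimal illustration, not a prescribed pattern: the function name `read_run_settings`, the widget name `run_date`, its default value, and the secret scope and key (`demo-scope`, `api-token`) are all hypothetical. `dbutils` is passed in as an argument so the function can also be exercised outside Databricks with a stand-in object.

```python
def read_run_settings(dbutils):
    """Read a run parameter and a credential without hardcoding either."""
    # Create a text widget with a default value, then read its current value.
    # Arguments: widget name, default value, label shown in the notebook UI.
    dbutils.widgets.text("run_date", "2024-01-01", "Run date")
    run_date = dbutils.widgets.get("run_date")

    # Fetch the credential from a secret scope instead of pasting it into a cell.
    # "demo-scope" and "api-token" are placeholder names.
    token = dbutils.secrets.get(scope="demo-scope", key="api-token")
    return run_date, token


# `dbutils` is predefined only inside a Databricks notebook.
if "dbutils" in globals():
    run_date, _token = read_run_settings(globals()["dbutils"])
    print(f"Running for {run_date}")
```

Changing the widget value in the notebook UI, or passing it as a job parameter, then changes the run without editing any cell code.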

    Frequently Asked Questions

    Can I use Databricks notebooks offline?

    No. Notebooks require a connection to the Databricks workspace and an attached compute resource to execute code. You can export notebooks for offline viewing.

    How do Databricks notebooks compare to Jupyter?

    Databricks notebooks offer native Spark integration, multi-language support in one document, real-time collaboration, built-in scheduling, Unity Catalog governance, and the Data Science Agent. Jupyter notebooks can be imported into Databricks.

    Can multiple people edit the same notebook at the same time?

    Yes. Databricks supports real-time co-authoring where multiple users can edit and run cells simultaneously, similar to Google Docs.

    Where are notebooks stored?

    Notebooks are stored in the Databricks workspace, which is backed by the control plane. For Git integration, use Repos to sync notebooks with an external Git repository.