Importing and Exporting Notebooks

Databricks supports importing and exporting notebooks in multiple formats, including DBC archives, Jupyter (.ipynb), and Python (.py), SQL, Scala, and R source files, so you can migrate work between workspaces, share with colleagues outside Databricks, or integrate with CI/CD pipelines. Use the workspace UI for one-off transfers, the CLI or REST API for scripted and bulk operations, and Repos for ongoing Git-based synchronisation.

Import notebooks from Jupyter, source files, or DBC archives into Databricks
Export notebooks in the format best suited for your use case
Automate import/export with the CLI and REST API

Who this is for: Engineers and analysts who need to move notebooks between environments, share them outside Databricks, or integrate them with version control systems.

Part of the Databricks Notebooks section of the Databricks tutorial series.

Architecture / Concept Overview: Importing and Exporting Notebooks

Notebooks in Databricks are stored in the workspace file system managed by the control plane. Import converts external files into workspace notebook objects; export converts workspace notebooks back into portable file formats. The DBC format is Databricks-proprietary and preserves all notebook metadata, while source formats (.py, .sql) contain only code.

%%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
flowchart LR
    classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    External[External Files]:::source --> Import[Import]:::ingestion
    Import --> WS[Workspace Notebook]:::serving
    WS --> Export[Export]:::ingestion
    Export --> Files[Portable Files]:::source

*Import converts external files into workspace notebooks; export converts them back into portable formats.*

%%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
graph TD
    classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    DBC[DBC Archive]:::governance --> Full[Full Metadata]:::governance
    IPYNB[Jupyter .ipynb]:::processing --> CellOut[Cells + Outputs]:::processing
    PY[Python .py]:::source --> CodeOnly[Code Only]:::source
    SQL[SQL .sql]:::source --> CodeOnly
    HTML[HTML Export]:::serving --> ReadOnly[Read-Only View]:::serving

*Each format preserves different levels of notebook content, from full metadata (DBC) to code only (source).*

%%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
flowchart LR
    classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    UI[Workspace UI]:::source --> Action[Import / Export]:::processing
    CLI[Databricks CLI]:::source --> Action
    API[REST API]:::source --> Action
    Repos[Git Repos]:::serving --> Sync[Continuous Sync]:::serving

*Four methods for import/export: UI for one-off, CLI for scripting, API for automation, Repos for continuous sync.*

Key Terms

DBC Archive: A Databricks-proprietary archive format that bundles notebooks with metadata, dashboards, and folder structure.
Jupyter Notebook (.ipynb): The open JSON format used by Jupyter, containing cells with code and outputs.
Source Format: Plain text files (.py, .sql, .scala, .r) containing notebook code with cell separators.
Workspace Import: The process of uploading external files into the Databricks workspace as notebook objects.
Repos: Git integration for continuous synchronisation between workspace notebooks and external repositories.
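To make the "cell separators" in the Source Format entry concrete, here is a minimal sketch that splits a Python source-format notebook into its cells. It assumes the markers Databricks is generally understood to write for Python notebooks: a `# Databricks notebook source` header line and `# COMMAND ----------` between cells. Other languages use their own comment prefix, so treat the constants as assumptions to check against a real export. The function name `split_cells` is hypothetical.

```python
# Markers assumed for Python source-format notebooks; verify against your own exports.
HEADER = "# Databricks notebook source"
SEPARATOR = "# COMMAND ----------"


def split_cells(source: str) -> list[str]:
    """Split a source-format notebook into a list of non-empty cell bodies."""
    lines = source.splitlines()
    # Drop the header line if the file starts with it.
    if lines and lines[0].strip() == HEADER:
        lines = lines[1:]

    cells, current = [], []
    for line in lines:
        if line.strip() == SEPARATOR:
            # A separator closes the cell collected so far.
            cells.append("\n".join(current).strip())
            current = []
        else:
            current.append(line)
    cells.append("\n".join(current).strip())

    # Ignore cells that contain only whitespace.
    return [cell for cell in cells if cell]


example = "\n".join([
    "# Databricks notebook source",
    "x = 1",
    "",
    "# COMMAND ----------",
    "",
    "print(x)",
])
cells = split_cells(example)
```

Because the file is plain text with comment-based markers, it diffs cleanly in Git, which is why source format suits code review.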

Prerequisites and Setup

A Databricks workspace with import/export permissions
The Databricks CLI installed and configured for scripted operations
Source files or archives to import
A Git repository for Repos-based synchronisation

Step-by-Step Implementation
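As an illustration of automating an import through the REST API, the sketch below builds (but does not send) a request to the workspace import endpoint, `POST /api/2.0/workspace/import`, with the notebook body base64-encoded. The host, token, and workspace path are placeholders, and the helper name `build_import_request` is hypothetical; check the field names against the API reference for your workspace before relying on them.

```python
import base64
import json
import urllib.request


def build_import_request(host, token, workspace_path, source_code,
                         language="PYTHON", overwrite=False):
    """Build a request that would import source code as a workspace notebook."""
    payload = {
        "path": workspace_path,   # target notebook path in the workspace
        "format": "SOURCE",       # plain source code, no outputs
        "language": language,     # needed for SOURCE imports
        # The API expects the file content as a base64 string.
        "content": base64.b64encode(source_code.encode("utf-8")).decode("ascii"),
        "overwrite": overwrite,   # without this, an existing notebook causes an error
    }
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/workspace/import",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values for illustration only.
request = build_import_request(
    host="https://example.cloud.databricks.com",
    token="dapi-EXAMPLE",
    workspace_path="/Users/someone@example.com/etl",
    source_code="print('hello')",
    overwrite=True,
)
```

Passing `request` to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the payload easy to inspect in a dry run.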

Configuration Reference

Notebook format options for import and export
Format	Extension	Preserves Outputs	Preserves Metadata	Use Case
DBC	`.dbc`	Yes	Yes (full)	Backup, migration between workspaces
Jupyter	`.ipynb`	Yes	Partial	Sharing with Jupyter users
Source	`.py`, `.sql`, `.scala`, `.r`	No	No	Git, CI/CD, code review
HTML	`.html`	Yes	No	Read-only sharing
R Markdown	`.Rmd`	No	No	R workflows
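When scripting imports, the table above can be turned into a small lookup that picks an import format from a file extension. This is a sketch: the format and language strings follow the names the workspace import API is generally documented to accept (`SOURCE`, `JUPYTER`, `DBC`, `HTML`, `R_MARKDOWN`), and leaving the language unset for non-source formats is an assumption. The function name `import_settings` is hypothetical.

```python
from pathlib import PurePath

# Source files need an explicit notebook language.
SOURCE_LANGUAGES = {".py": "PYTHON", ".sql": "SQL", ".scala": "SCALA", ".r": "R"}

# Other formats carry their own structure; language is left unset (assumption).
OTHER_FORMATS = {".dbc": "DBC", ".ipynb": "JUPYTER", ".html": "HTML", ".rmd": "R_MARKDOWN"}


def import_settings(filename):
    """Return (format, language) for a file, based only on its extension."""
    ext = PurePath(filename).suffix.lower()
    if ext in SOURCE_LANGUAGES:
        return ("SOURCE", SOURCE_LANGUAGES[ext])
    if ext in OTHER_FORMATS:
        return (OTHER_FORMATS[ext], None)
    raise ValueError(f"Unsupported notebook extension: {ext!r}")
```

Raising on unknown extensions, rather than guessing, keeps stray files such as data or README files from being imported as notebooks.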

Monitoring, Cost, and Security Considerations

Monitoring

Track workspace import/export operations through the audit log. Large DBC archive imports can take time; monitor for completion and errors. Repos sync status is visible in the workspace UI.

Cost Optimisation

- Import/export operations do not consume compute resources or DBUs.

- Use Repos for ongoing synchronisation instead of repeated manual imports to save time and reduce errors.

- Clean up unused imported notebooks to reduce workspace clutter.

Security and Governance

- Exported notebooks may contain sensitive outputs, credentials, or data; treat exports as confidential.

- DBC archives include all cell outputs, which could contain PII or query results.

- Source format exports include only code, which is safer for version control, although any credentials hard-coded in cells are still exported.

- Use workspace permissions to control who can import or export notebooks.
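If a notebook has to be shared as Jupyter (.ipynb) but its outputs may contain sensitive data, one option is to clear the outputs from the exported file first. The sketch below relies on the standard Jupyter notebook JSON layout, where code cells carry `outputs` and `execution_count` fields. The helper name `strip_outputs` is hypothetical, and it does not remove secrets written in the code itself.

```python
import json


def strip_outputs(notebook_json: str) -> str:
    """Return the notebook JSON with all code-cell outputs removed."""
    notebook = json.loads(notebook_json)
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop rendered results, tables, plots
            cell["execution_count"] = None  # reset the run counter
    return json.dumps(notebook, indent=1)


# A tiny in-memory notebook used only for illustration.
sample = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Report"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 3,
            "source": ["df.head()"],
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["alice@example.com"]}],
        },
    ],
})
cleaned = json.loads(strip_outputs(sample))
```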

Common Pitfalls and Recommended Patterns

Exporting DBC archives with sensitive outputs: use source format for version control to avoid leaking data.
Importing Jupyter notebooks with incompatible libraries: verify dependencies are available on the cluster.
Not using --overwrite when re-importing: without it, the CLI fails if the notebook already exists.
Relying on manual import/export instead of Repos: use Git integration for ongoing development.
Importing large DBC archives into the wrong folder: always verify the target path before importing.
Forgetting that source format loses outputs and metadata: re-run cells after importing source files.
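Two of the pitfalls above (a missing --overwrite and an unverified target path) can be guarded against by building the CLI command in code before running it. This sketch assumes the newer Databricks CLI, where the directory import subcommand is spelled `import-dir` (the legacy CLI spells it `import_dir`); confirm the spelling and flags with `databricks workspace --help` for your installed version. The helper name `import_dir_command` is hypothetical, and the command is only constructed here, not executed.

```python
from pathlib import PurePosixPath


def import_dir_command(local_dir, workspace_dir, overwrite=False):
    """Build the argument list for a directory import via the Databricks CLI."""
    target = PurePosixPath(workspace_dir)
    # Workspace paths are absolute; a relative path is likely a mistake.
    if not target.is_absolute():
        raise ValueError(f"Workspace path must start with '/': {workspace_dir!r}")

    command = ["databricks", "workspace", "import-dir", local_dir, str(target)]
    if overwrite:
        # Needed when notebooks already exist at the target path.
        command.append("--overwrite")
    return command


command = import_dir_command("./notebooks", "/Shared/etl", overwrite=True)
```

The resulting list can be printed for review and then passed to `subprocess.run(command, check=True)` once the target path has been confirmed.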

Frequently Asked Questions

Can I import a folder of notebooks at once?

Yes. Use `databricks workspace import_dir` (spelled `import-dir` in newer versions of the CLI) to import an entire directory of source files, or import a DBC archive that contains multiple notebooks.

Does importing overwrite existing notebooks?

Only if you use the --overwrite flag. Without it, the import fails if a notebook already exists at the target path.

Can I export notebooks with their outputs?

Yes. Use DBC, Jupyter, or HTML formats to preserve cell outputs. Source format exports only the code.
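For scripted exports that keep outputs, the format is chosen in the request. The sketch below builds the URL for the workspace export endpoint (`GET /api/2.0/workspace/export`) and decodes the base64 `content` field that the JSON response is expected to carry. The host and path are placeholders, the helper names are hypothetical, and the response shape should be checked against the API reference.

```python
import base64
import json
from urllib.parse import urlencode


def build_export_url(host, workspace_path, fmt="JUPYTER"):
    """Build the export URL; fmt can be e.g. SOURCE, JUPYTER, HTML, or DBC."""
    query = urlencode({"path": workspace_path, "format": fmt})
    return f"{host.rstrip('/')}/api/2.0/workspace/export?{query}"


def decode_export_response(body: str) -> bytes:
    """Decode the base64 'content' field of an export response into raw bytes."""
    return base64.b64decode(json.loads(body)["content"])


# Placeholder host and path for illustration only.
url = build_export_url("https://example.cloud.databricks.com", "/Shared/report", fmt="JUPYTER")

# A simulated response body, standing in for what the service would return.
fake_body = json.dumps({"content": base64.b64encode(b'{"cells": []}').decode("ascii")})
exported = decode_export_response(fake_body)
```

Writing `exported` to a file with the matching extension (for example `.ipynb` for `JUPYTER`) gives the portable copy of the notebook.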

Should I use DBC or source format for backups?

Use DBC for full backups (preserves outputs and metadata). Use source format for Git-based version control (cleaner diffs, no binary data).