Lakeguard is the security technology built into Databricks that enforces strong user isolation on shared (Standard) clusters, preventing users from accessing each other's data, credentials, or intermediate results. It combines process isolation, filesystem restrictions, and network controls to make shared clusters as secure as dedicated single-user clusters. Lakeguard enables cost-efficient multi-tenant compute without sacrificing data protection.
- Understand how Lakeguard enforces user isolation on shared clusters
- Learn the security boundaries Lakeguard creates between users
- Configure clusters and policies to take advantage of Lakeguard
Who this is for: Platform administrators and security engineers responsible for compute security and user isolation on Databricks.
Part of the Databricks Compute section of the Databricks tutorial series.
Architecture / Concept Overview
Lakeguard creates a security boundary around each user session on a shared cluster. Each user's code runs in an isolated process with its own Python/R environment, restricted filesystem access, and network policies. Even though multiple users share the same cluster JVM, Lakeguard prevents lateral movement between sessions by blocking access to environment variables, local files, and driver ports that could leak another user's data.
*Lakeguard wraps each user session in an isolated sandbox, with Unity Catalog enforcing per-user data access.*
*Lakeguard restricts each user's process to approved filesystem paths, controlled network access, and blocked environment variables.*
*Without Lakeguard, shared cluster users can access the full filesystem and environment. Lakeguard restricts both.*
Key Terms
- Lakeguard: Databricks security technology that enforces user isolation on shared (Standard access mode) clusters.
- Process Isolation: Running each user's code in a separate OS process to prevent data leakage between sessions.
- Standard Access Mode: The cluster mode (formerly Shared) that enables multi-user access with Lakeguard isolation.
- Sandbox: The restricted execution environment Lakeguard creates for each user session.
- Unity Catalog: The governance layer that enforces per-user data access permissions, complementing Lakeguard's compute isolation.
Prerequisites and Setup
- A Databricks workspace with Unity Catalog enabled
- Clusters configured with Standard (USER_ISOLATION) access mode
- Understanding of multi-tenant security requirements
- Workspace admin or security admin role for configuration
Step-by-Step Implementation
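As a minimal sketch of the first step, the following builds a cluster specification that requests Standard access mode by setting `data_security_mode` to `USER_ISOLATION`, the value named in the prerequisites. The helper name, cluster name, runtime version, node type, and worker count are placeholder assumptions. The payload is only built and printed here, not sent to a workspace.

```python
import json


def build_standard_cluster_spec(name, spark_version, node_type_id, num_workers):
    """Return a cluster spec dict that requests Standard (USER_ISOLATION) access mode."""
    return {
        "cluster_name": name,
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        "num_workers": num_workers,
        # USER_ISOLATION is the access mode this guide associates with Lakeguard.
        "data_security_mode": "USER_ISOLATION",
    }


# Placeholder values (assumptions); substitute ones valid for your workspace.
spec = build_standard_cluster_spec(
    name="shared-analytics",
    spark_version="<runtime-version>",
    node_type_id="<node-type>",
    num_workers=2,
)
print(json.dumps(spec, indent=2))
```

Keeping the access mode inside a helper, rather than passing it as a parameter, means callers cannot accidentally request an unprotected mode.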
Configuration Reference
| Security Feature | Standard (Lakeguard) | Dedicated | No Isolation |
|---|---|---|---|
| Multi-user support | Yes | No | Yes (no security) |
| Filesystem restriction | Yes | No | No |
| Environment variable access | Blocked | Full | Full |
| Custom JVM libraries | Blocked | Allowed | Allowed |
| FUSE mounts | Blocked | Allowed | Allowed |
| Network egress control | Restricted | Open | Open |
| Unity Catalog enforcement | Per-user | Single identity | None |
| Temporary view isolation | Yes | N/A | No |
Monitoring, Cost, and Security Considerations
Monitoring
Audit cluster creation events to confirm that every shared cluster uses USER_ISOLATION mode. Flag any cluster whose `data_security_mode` is NONE (the "No isolation shared" mode, called NO_ISOLATION in this guide), because it bypasses Lakeguard protections.
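The audit described above can be sketched as a check over an inventory of cluster records. This example assumes each record is a dict with `cluster_name` and `data_security_mode` keys, for example as exported from a cluster listing. The function name and sample inventory are hypothetical, and treating a missing mode as unprotected is a conservative assumption rather than documented behaviour.

```python
# Mode names this guide describes as bypassing Lakeguard.
UNPROTECTED_MODES = {"NONE", "NO_ISOLATION"}


def find_unprotected_clusters(clusters):
    """Return (name, mode) pairs for clusters that lack Lakeguard isolation."""
    flagged = []
    for cluster in clusters:
        mode = cluster.get("data_security_mode")
        # Assumption: a record with no access mode is treated as unprotected.
        if mode is None or mode in UNPROTECTED_MODES:
            flagged.append((cluster.get("cluster_name", "<unnamed>"), mode))
    return flagged


# Hypothetical inventory for illustration only.
inventory = [
    {"cluster_name": "shared-analytics", "data_security_mode": "USER_ISOLATION"},
    {"cluster_name": "legacy-etl", "data_security_mode": "NONE"},
    {"cluster_name": "adhoc"},
]

for name, mode in find_unprotected_clusters(inventory):
    print(f"{name}: access mode {mode!r} bypasses Lakeguard")
```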
Cost Optimisation
- Lakeguard enables safe multi-user clusters, which are more cost-efficient than giving each user a dedicated cluster.
- One Standard cluster serving five users costs less than five Dedicated clusters.
- The isolation overhead is minimal and does not significantly affect query performance.
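To make the five-user comparison concrete, here is a rough arithmetic sketch. Every number in it is a hypothetical assumption (a flat rate per node-hour, two nodes per Dedicated cluster, four nodes for the shared Standard cluster); actual pricing depends on instance types, workload, and how large the shared cluster has to be.

```python
def hourly_cost(cluster_count, nodes_per_cluster, node_rate):
    """Total hourly cost for a fleet of identically sized clusters."""
    return cluster_count * nodes_per_cluster * node_rate


NODE_RATE = 1.0  # hypothetical cost units per node-hour

# Five users, each with a small Dedicated cluster (driver + 1 worker).
dedicated = hourly_cost(cluster_count=5, nodes_per_cluster=2, node_rate=NODE_RATE)

# One larger Standard cluster shared by the same five users (driver + 3 workers).
standard = hourly_cost(cluster_count=1, nodes_per_cluster=4, node_rate=NODE_RATE)

savings = (dedicated - standard) / dedicated
print(f"Dedicated: {dedicated}, Standard: {standard}, savings: {savings:.0%}")
```

Under these assumptions the shared cluster wins mainly because it runs one driver instead of five and pools worker capacity that would otherwise sit idle.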
Security and Governance
- Lakeguard prevents users from reading local files, environment variables, or network ports that could leak credentials.
- Unity Catalog handles data access control (which tables and columns a user can read); Lakeguard handles compute isolation (preventing lateral movement on the cluster).
- Both layers must be active for full security: Lakeguard without Unity Catalog leaves data ungoverned, and Unity Catalog without Lakeguard leaves compute boundaries open.
Common Pitfalls and Recommended Patterns
- Assuming Unity Catalog alone provides compute isolation: Unity Catalog governs data access, not process-level security on the cluster.
- Using NO_ISOLATION mode for convenience: this disables all boundaries between users on the same cluster.
- Installing libraries that require filesystem access on Standard clusters: those libraries may be blocked by Lakeguard.
- Not auditing cluster access modes: a single unrestricted cluster can undermine your security posture.
- Using Dedicated clusters for users who only need Python and SQL: Standard with Lakeguard is cheaper and equally secure.
- Forgetting to test isolation after policy changes: verify with a simple cross-session access test.
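Several of these pitfalls can be reduced by pinning the access mode in a cluster policy. The sketch below assumes the cluster policy definition format in which a `fixed` rule forces an attribute to a single value. The `violates_policy` helper is a hypothetical local check for illustration, not the enforcement logic Databricks itself applies.

```python
import json

# Policy definition that pins every cluster created under it to Standard access mode.
policy_definition = {
    "data_security_mode": {
        "type": "fixed",
        "value": "USER_ISOLATION",
        "hidden": True,  # hide the field so users are not offered a choice
    },
}


def violates_policy(cluster_spec, definition):
    """Return the attribute names whose value differs from a 'fixed' rule."""
    violations = []
    for attribute, rule in definition.items():
        if rule.get("type") == "fixed" and cluster_spec.get(attribute) != rule["value"]:
            violations.append(attribute)
    return violations


print(json.dumps(policy_definition, indent=2))
print(violates_policy({"data_security_mode": "NONE"}, policy_definition))
```

Running the same check against existing cluster specs after a policy change is one lightweight way to confirm that nothing slipped through before doing a full cross-session access test.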
Frequently Asked Questions
Does Lakeguard affect performance?
The performance impact is minimal. Process isolation adds a small overhead, but for most workloads the difference is negligible compared to the security benefit.
Can I use Scala on a Lakeguard-protected cluster?
Scala is supported on Standard clusters but with restrictions: custom JARs and arbitrary JVM code are blocked because they could bypass process isolation.
Is Lakeguard the same as Unity Catalog?
No. They complement each other. Unity Catalog controls what data a user can access (tables, columns, rows). Lakeguard controls compute isolation — preventing users from accessing each other's processes, files, and environment on a shared cluster.
Do serverless notebooks use Lakeguard?
Serverless compute provides equivalent isolation through containerised execution environments. The result is similar — users cannot access each other's data — but the implementation differs from classic cluster Lakeguard.
When should I use Dedicated instead of Lakeguard?
Use Dedicated clusters when workloads require features blocked by Lakeguard: custom JVM libraries, GPU access, FUSE mounts, or full filesystem access.