Data Management and Unity Catalog

    Who this is for:

    Architecture / Concept Overview: Data Management and Unity Catalog

    Unity Catalog provides a unified three-level namespace (catalog.schema.table) and a single permission model across Databricks workspaces, engines, and clouds.

    %%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
    flowchart LR
        classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        USERS[Users & Apps] --> WS1[Workspace A]
        USERS --> WS2[Workspace B]
        WS1 --> UC[Unity Catalog<br/>Centralised Governance]
        WS2 --> UC
        UC --> META[Metastore<br/>Metadata & Permissions]
        META --> MANAGED[Managed Storage<br/>Delta Tables]
        META --> EXTERNAL[External Locations<br/>S3 · ADLS · GCS]
        USERS:::source
        WS1:::processing
        WS2:::processing
        UC:::governance
        META:::governance
        MANAGED:::storage
        EXTERNAL:::storage

    *Figure 1 — Unity Catalog provides centralised governance across multiple workspaces with managed and external storage.*
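To make the namespace concrete: objects governed by Unity Catalog are addressed with a three-level name, catalog.schema.table, and Figure 1 distinguishes managed storage from external locations. The sketch below, in plain Python with no Databricks dependency, builds such names and the matching CREATE TABLE text. The catalog, schema, table, and bucket names are hypothetical, and the generated SQL is an illustration to adapt, not a statement of what any particular workspace requires.

```python
from typing import Dict, Optional


def quote_ident(name: str) -> str:
    """Backtick-quote one identifier, doubling any embedded backticks."""
    if not name:
        raise ValueError("identifier must not be empty")
    return "`" + name.replace("`", "``") + "`"


def qualified_name(catalog: str, schema: str, table: str) -> str:
    """Join the three namespace levels into catalog.schema.table."""
    return ".".join(quote_ident(part) for part in (catalog, schema, table))


def create_table_sql(name: str, columns: Dict[str, str],
                     location: Optional[str] = None) -> str:
    """Return CREATE TABLE text for a managed or an external table.

    Without a location the table is managed (data lives in managed storage);
    with a location it is external (data lives at the given path).
    """
    cols = ", ".join(f"{quote_ident(col)} {dtype}" for col, dtype in columns.items())
    sql = f"CREATE TABLE IF NOT EXISTS {name} ({cols})"
    if location is not None:
        sql += f" LOCATION '{location}'"
    return sql


# Hypothetical names, for illustration only.
orders = qualified_name("main", "sales", "orders")
managed_sql = create_table_sql(orders, {"id": "BIGINT", "amount": "DOUBLE"})
external_sql = create_table_sql(
    orders,
    {"id": "BIGINT", "amount": "DOUBLE"},
    location="s3://example-bucket/sales/orders",
)
print(managed_sql)
print(external_sql)
```

An external table's path is assumed to fall under a registered external location; the sketch does not verify that.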

    The governance capabilities of Unity Catalog span six pillars, from access control through audit.

    %%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
    graph TD
        classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
        UC[Unity Catalog Governance] --> AC[Access Control<br/>Privileges · Row Filters · Column Masks]
        UC --> LIN[Lineage<br/>Table & Column-level tracking]
        UC --> DISC[Discovery<br/>Catalog Explorer · Search · Tags]
        UC --> AI[AI Insights<br/>Auto comments · Profiling]
        UC --> QUAL[Data Quality<br/>Monitors · Anomaly detection]
        UC --> AUDIT[Audit<br/>System tables · Usage logs]
        UC:::governance
        AC:::processing
        LIN:::serving
        DISC:::serving
        AI:::processing
        QUAL:::ingestion
        AUDIT:::source

    *Figure 2 — Six pillars of Unity Catalog governance: access control, lineage, discovery, AI insights, data quality, and audit.*
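As an illustration of the access-control pillar, the following Python sketch assembles the statements a group would typically need to read one table: usage on the parent catalog and schema plus SELECT on the table, with an optional row filter and column mask. The group, table, column, and function names are hypothetical, and the filter and mask functions are assumed to exist already as SQL UDFs. Treat the output as a starting point to check against the Databricks SQL reference, not as the definitive grant set for a real deployment.

```python
from typing import List


def read_grants(catalog: str, schema: str, table: str, principal: str) -> List[str]:
    """Statements that let `principal` read a single table.

    Reading a table needs usage on its parent catalog and schema
    in addition to SELECT on the table itself.
    """
    return [
        f"GRANT USE CATALOG ON CATALOG `{catalog}` TO `{principal}`",
        f"GRANT USE SCHEMA ON SCHEMA `{catalog}`.`{schema}` TO `{principal}`",
        f"GRANT SELECT ON TABLE `{catalog}`.`{schema}`.`{table}` TO `{principal}`",
    ]


def row_filter_sql(table_fqn: str, filter_fn: str, column: str) -> str:
    """Attach an existing boolean UDF as a row filter on one column."""
    return f"ALTER TABLE {table_fqn} SET ROW FILTER {filter_fn} ON ({column})"


def column_mask_sql(table_fqn: str, column: str, mask_fn: str) -> str:
    """Attach an existing UDF as a mask on one column."""
    return f"ALTER TABLE {table_fqn} ALTER COLUMN {column} SET MASK {mask_fn}"


# Hypothetical principal, objects, and UDFs, for illustration only.
statements = read_grants("main", "sales", "orders", "analysts")
statements.append(row_filter_sql("main.sales.orders", "main.sales.region_filter", "region"))
statements.append(column_mask_sql("main.sales.orders", "email", "main.sales.mask_email"))

for stmt in statements:
    print(stmt + ";")
```

Granting to a group instead of individual users keeps the statement list stable as membership changes in the identity provider.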

    Key Terms

    Prerequisites and Setup

    • Databricks account with Unity Catalog enabled
    • Account admin role for initial metastore and workspace setup
    • Cloud storage for managed and external locations
    • Identity provider, such as Microsoft Entra ID (formerly Azure AD) or Okta, for user and group synchronisation

    Step-by-Step Implementation

      Configuration Reference

      Data Management and Unity Catalog configuration options
      | Setting | Scope | Recommended Value |
      | --- | --- | --- |
      | Metastore per region | Account | One metastore per cloud region |
      | Default catalog | Workspace | Set to team or environment catalog |
      | Identity federation | Account | Enable SCIM sync from your IdP |
      | Managed storage | Metastore | Dedicated bucket/container per metastore |
      | External locations | Metastore | Register all production storage paths |
      | Audit logging | Account | Enable for compliance and troubleshooting |
      | Predictive optimisation | Catalog/Schema | Enable for frequently queried assets |
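Two of the recommendations above lend themselves to a simple automated check: one metastore per cloud region, and every production storage path registered as an external location. The Python sketch below works on plain lists that you would populate from your own inventory. The input shapes, function names, and sample values are assumptions made for illustration, and nothing here calls a Databricks API.

```python
from collections import Counter
from typing import List, Tuple


def regions_with_multiple_metastores(metastores: List[Tuple[str, str]]) -> List[str]:
    """Return regions that host more than one metastore.

    `metastores` is a list of (metastore_name, region) pairs.
    """
    counts = Counter(region for _, region in metastores)
    return sorted(region for region, n in counts.items() if n > 1)


def _with_slash(path: str) -> str:
    """Normalise a path to end in exactly one slash for prefix matching."""
    return path.rstrip("/") + "/"


def unregistered_paths(production_paths: List[str],
                       external_locations: List[str]) -> List[str]:
    """Return production paths not covered by any registered location URL."""
    prefixes = [_with_slash(url) for url in external_locations]
    return [
        path for path in production_paths
        if not any(_with_slash(path).startswith(prefix) for prefix in prefixes)
    ]


# Hypothetical inventory, for illustration only.
metastores = [("ms-eu", "eu-west-1"), ("ms-us", "us-east-1"), ("ms-us-2", "us-east-1")]
duplicates = regions_with_multiple_metastores(metastores)

missing = unregistered_paths(
    ["s3://prod/sales/orders", "s3://prod-raw/events"],
    ["s3://prod/sales"],
)
print(duplicates)
print(missing)
```

Matching on a slash-terminated prefix avoids treating `s3://prod-raw` as if it were covered by a location registered at `s3://prod`.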

      Monitoring, Cost, and Security Considerations

      Common Pitfalls and Recommended Patterns

        Frequently Asked Questions