Unity Catalog started as a data governance solution for tables, views, and volumes, providing fine-grained access control, data lineage, and audit logging. With the rise of GenAI, it has expanded to govern assets across the AI lifecycle: models, vector indexes, feature tables, and AI agents.

This evolution matters because AI governance is becoming a regulatory requirement, and most organizations are struggling to retrofit governance onto AI systems that were built without it.

What Unity Catalog Now Governs

Traditional Data Assets

  • Tables and views: Row-level and column-level security, data masking, access audit trails
  • Volumes: Unstructured data (PDFs, images, documents) with the same access controls as structured data
  • Functions: User-defined functions with permission management

ML and AI Assets

  • Registered models: ML models with versioning, lineage tracking, and permission-based access. Know which data trained which model version.
  • Feature tables: Shared feature definitions with access control, preventing unauthorized teams from accessing sensitive features.
  • Model serving endpoints: Govern who can deploy and query model endpoints.

GenAI Assets

  • Vector search indexes: Govern access to vector stores used in RAG systems. Users can only search vectors derived from documents they're authorized to access.
  • AI agents: Register, version, and govern AI agents with full audit logging of agent actions and decisions.
  • Foundation model access: Control which teams can access which foundation models through Model Serving endpoints.

Why Unified Governance Matters for GenAI

The Access Control Challenge

In a RAG system, the LLM generates responses based on retrieved documents. If a junior employee asks a question and the system retrieves board-level financial documents, the answer can expose their contents. That is a data breach, even though the employee never opened the documents directly.

Unity Catalog solves this by enforcing access controls at the retrieval layer. Vector search queries are filtered by the user's permissions. If you can't access the source document, you can't retrieve its vectors.
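
The idea of filtering at the retrieval layer can be illustrated with a minimal sketch in plain Python. It is not the Unity Catalog or Vector Search API; the Chunk class, the required_group field, and retrieve_for_user are hypothetical names chosen for illustration. The point it shows is that chunks the user may not read are dropped before ranking, so they never reach the LLM prompt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    text: str
    score: float          # similarity score returned by the vector search
    required_group: str   # hypothetical: group allowed to read the source document


def retrieve_for_user(candidates, user_groups, top_k=3):
    """Drop chunks the user may not read, then keep the best top_k by score."""
    allowed = [c for c in candidates if c.required_group in user_groups]
    return sorted(allowed, key=lambda c: c.score, reverse=True)[:top_k]


candidates = [
    Chunk("board-q3", "Board-level financial forecast", 0.93, "board"),
    Chunk("handbook", "Expense policy for all staff", 0.81, "all-staff"),
    Chunk("eng-wiki", "Deployment checklist", 0.64, "engineering"),
]

# The same query, issued by two users with different group memberships.
junior_results = retrieve_for_user(candidates, {"all-staff", "engineering"})
board_results = retrieve_for_user(candidates, {"all-staff", "board"})
```

In this sketch the junior employee's results contain only the handbook and the engineering wiki, even though the board document has the highest similarity score. Filtering before ranking, rather than after generation, is the design choice that keeps restricted text out of the prompt entirely.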

The Lineage Challenge

When an AI agent makes a decision, regulators want to know: what data informed that decision? What model was used? What version? What was the model's training data?

Unity Catalog tracks this lineage end-to-end: from source data through feature engineering, model training, and inference. Every prediction can be traced back to its origins.
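
To make "traced back to its origins" concrete, here is a small sketch of a lineage walk in plain Python. The asset names and the UPSTREAM mapping are invented for illustration and do not reflect how Unity Catalog stores lineage; the sketch only shows the traversal a lineage query performs conceptually.

```python
from collections import deque

# Hypothetical lineage edges: each asset maps to the assets it was derived from.
UPSTREAM = {
    "prediction:loan-123": ["model:credit_risk/v3"],
    "model:credit_risk/v3": ["table:features.credit_features"],
    "table:features.credit_features": [
        "table:raw.applications",
        "table:raw.bureau_scores",
    ],
    "table:raw.applications": [],
    "table:raw.bureau_scores": [],
}


def trace_upstream(asset, upstream=UPSTREAM):
    """Return every asset that `asset` depends on, nearest first (breadth-first)."""
    seen = {asset}
    order = []
    queue = deque([asset])
    while queue:
        current = queue.popleft()
        for parent in upstream.get(current, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


origins = trace_upstream("prediction:loan-123")
```

Starting from a single prediction, the walk reaches the model version, then the feature table, then the two raw source tables, which are the answers to the regulator's questions above.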

The Compliance Challenge

For high-risk AI systems, the EU AI Act requires technical documentation covering, among other things, training data, performance metrics, known limitations, and intended use. Unity Catalog's metadata management and audit logging provide the infrastructure for this documentation.
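
One way to keep this documentation from drifting is to treat it as structured metadata and check it for gaps. The sketch below is a plain-Python illustration; the AISystemRecord class and its field names are assumptions made for this example, not a Unity Catalog schema or a format prescribed by the regulation.

```python
from dataclasses import dataclass, field, fields


@dataclass
class AISystemRecord:
    """Hypothetical documentation record for one AI system."""
    name: str
    intended_use: str = ""
    training_data: list = field(default_factory=list)
    performance_metrics: dict = field(default_factory=dict)
    known_limitations: list = field(default_factory=list)


def missing_sections(record):
    """Names of documentation sections that are still empty."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


record = AISystemRecord(
    name="credit_risk",
    intended_use="Rank loan applications for manual review",
    training_data=["raw.applications"],
)
gaps = missing_sections(record)
```

Here the check reports that performance metrics and known limitations are still undocumented, which is the kind of gap worth catching before an audit rather than during one.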

Practical Implementation

Setting Up Model Governance

  1. Register all ML models in Unity Catalog's model registry
  2. Define access policies: who can read, deploy, and modify each model
  3. Enable lineage tracking to connect models to their training data and features
  4. Set up alerts for model deployments and permission changes
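
Steps 2 and 4 above can be sketched as a small policy object. This is a plain-Python illustration, not Unity Catalog's permission model: ModelPolicy, the three action names, and the events list are hypothetical, and in practice the grants would be managed in Unity Catalog itself.

```python
from dataclasses import dataclass, field

ACTIONS = {"read", "deploy", "modify"}  # the three actions named in step 2


@dataclass
class ModelPolicy:
    model_name: str
    grants: dict = field(default_factory=dict)  # group -> set of allowed actions
    events: list = field(default_factory=list)  # change log an alerting hook could consume

    def grant(self, group, action):
        if action not in ACTIONS:
            raise ValueError(f"unknown action: {action}")
        self.grants.setdefault(group, set()).add(action)
        # Record every permission change so it can be alerted on (step 4).
        self.events.append(("grant", group, action))

    def is_allowed(self, groups, action):
        """True if any of the caller's groups holds the requested action."""
        return any(action in self.grants.get(g, set()) for g in groups)


policy = ModelPolicy("main.ml.credit_risk")  # hypothetical three-level model name
policy.grant("data-science", "read")
policy.grant("ml-platform", "read")
policy.grant("ml-platform", "deploy")
```

With these grants, a member of data-science can read the model but not deploy it, and each of the three permission changes leaves an entry in the event log.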

Setting Up Vector Search Governance

  1. Create vector search indexes linked to source Delta tables
  2. Inherit access controls from the source table, so that users who can't query the table can't search its vectors
  3. Enable audit logging for all vector search queries
  4. Monitor which users access which document collections
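
Steps 3 and 4 amount to collecting query records and summarizing them per user. A minimal sketch, assuming audit records are available as dictionaries with user and index keys (a made-up shape; real audit log schemas will differ):

```python
from collections import defaultdict

# Hypothetical audit records for vector search queries.
audit_log = [
    {"user": "ana", "index": "main.docs.hr_index", "query": "parental leave"},
    {"user": "ana", "index": "main.docs.eng_index", "query": "deploy steps"},
    {"user": "raj", "index": "main.docs.hr_index", "query": "expense limits"},
    {"user": "ana", "index": "main.docs.hr_index", "query": "holiday policy"},
]


def collections_by_user(records):
    """Map each user to the sorted list of indexes they have queried."""
    usage = defaultdict(set)
    for record in records:
        usage[record["user"]].add(record["index"])
    return {user: sorted(indexes) for user, indexes in usage.items()}


usage = collections_by_user(audit_log)
```

A summary like this makes unusual access patterns easy to spot, for example a user who suddenly starts querying a document collection outside their team.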

Setting Up Agent Governance

  1. Register agents in Unity Catalog with their tool definitions and permissions
  2. Define which tools each agent can access (principle of least privilege)
  3. Log all agent actions for audit and debugging
  4. Implement approval workflows for agents that take high-impact actions
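
The four steps above can be sketched together in a few lines of plain Python. GovernedAgent, its fields, and the tool names are hypothetical and stand in for whatever agent framework is in use; the sketch only shows the control flow: an allow-list check, an approval gate for high-impact tools, and a log entry for every attempt.

```python
from dataclasses import dataclass, field


@dataclass
class GovernedAgent:
    name: str
    allowed_tools: set                                    # least-privilege allow-list (step 2)
    high_impact_tools: set = field(default_factory=set)   # tools needing approval (step 4)
    action_log: list = field(default_factory=list)        # audit trail (step 3)

    def call_tool(self, tool, approved=False):
        """Decide whether a tool call may run and log the outcome either way."""
        if tool not in self.allowed_tools:
            outcome = "denied"
        elif tool in self.high_impact_tools and not approved:
            outcome = "pending_approval"
        else:
            outcome = "executed"
        self.action_log.append((self.name, tool, outcome))
        return outcome


agent = GovernedAgent(
    name="refund-bot",
    allowed_tools={"lookup_order", "issue_refund"},
    high_impact_tools={"issue_refund"},
)

lookup = agent.call_tool("lookup_order")
refund = agent.call_tool("issue_refund")
approved_refund = agent.call_tool("issue_refund", approved=True)
blocked = agent.call_tool("delete_customer")
```

Denied and pending calls are logged alongside executed ones, so the audit trail shows what the agent attempted as well as what it actually did.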

The Bottom Line

Unity Catalog's expansion into AI governance isn't just a product feature. It reflects a fundamental truth: you can't govern AI without governing the data it uses, the models it runs, and the actions it takes. A unified governance layer across all these assets is the most practical approach at enterprise scale. If you're building GenAI on Databricks, Unity Catalog should be your governance foundation from day one.