AI Gateway: Governing and Monitoring Access to AI Models

Who this is for:

Architecture / Concept Overview: AI Gateway: Governing and Monitoring Access to AI Models

AI Gateway sits between applications and model providers, intercepting every request for governance and monitoring.

%%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
flowchart LR
    classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    APP1[App A] -->|Request| GW[AI Gateway]
    APP2[App B] -->|Request| GW
    APP3[Agent C] -->|Request| GW
    GW -->|Rate Limit| RL[Rate Limiter]
    GW -->|Log| LOG[Request Logger]
    GW -->|Route| HOSTED[Databricks Models]
    GW -->|Route| OPENAI[OpenAI]
    GW -->|Route| ANTHRO[Anthropic]
    GW -->|Failover| FALLBACK[Fallback Provider]
    APP1:::source
    APP2:::source
    APP3:::source
    GW:::governance
    RL:::processing
    LOG:::storage
    HOSTED:::serving
    OPENAI:::ingestion
    ANTHRO:::ingestion
    FALLBACK:::ingestion

*AI Gateway architecture: applications route through a single proxy with rate limiting, logging, and multi-provider failover.*
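The routing and failover path in the diagram can be sketched in a few lines of Python. This is a toy, in-process illustration, not the Databricks implementation: the `Gateway` class, `ProviderError`, and the two provider functions are hypothetical names invented here, and real providers would be HTTP calls rather than local functions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


class ProviderError(Exception):
    """Raised by a (hypothetical) provider callable that cannot serve a request."""


@dataclass
class Gateway:
    # Ordered provider table: the first entry is the primary route,
    # later entries are tried only if earlier ones fail.
    providers: Dict[str, Callable[[str], str]]
    log: List[dict] = field(default_factory=list)

    def handle(self, app: str, prompt: str) -> str:
        for name, call in self.providers.items():
            try:
                response = call(prompt)
            except ProviderError as exc:
                # Record the failed attempt, then fall through to the next provider.
                self.log.append({"app": app, "provider": name, "ok": False, "error": str(exc)})
                continue
            self.log.append({"app": app, "provider": name, "ok": True})
            return response
        raise ProviderError("all providers failed")


def flaky_primary(prompt: str) -> str:
    # Stand-in for a provider outage.
    raise ProviderError("primary unavailable")


def fallback(prompt: str) -> str:
    return f"fallback answered: {prompt}"


gateway = Gateway(providers={"primary": flaky_primary, "fallback": fallback})
result = gateway.handle("App A", "hello")
print(result)
for entry in gateway.log:
    print(entry)
```

Because every request passes through `handle`, the log captures both the failed primary attempt and the successful fallback, which is the property that makes a single proxy useful for monitoring.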

%%{init: {"theme":"base","themeVariables":{"background":"#0B0E14","primaryTextColor":"#E0E6ED","lineColor":"#5D6470","darkMode":true,"primaryColor":"#2E4A4A","secondaryColor":"#374151","secondaryTextColor":"#E0E6ED","tertiaryColor":"#111827","tertiaryTextColor":"#E0E6ED","edgeLabelBackground":"#1f2937"}}}%%
graph TD
    classDef source fill:#3F4B59,stroke:#9CA3AF,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    classDef ingestion fill:#5A4B36,stroke:#C9A86B,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    classDef processing fill:#535072,stroke:#8E82B4,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    classDef storage fill:#2E4A4A,stroke:#5FAFA8,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    classDef serving fill:#3D5550,stroke:#6BB7AA,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    classDef governance fill:#5A3F52,stroke:#C28BB0,stroke-width:2px,rx:8,ry:8,color:#E0E6ED
    FEATURES[AI Gateway Features] --> RATE[Rate Limiting]
    FEATURES --> LOGGING[Usage Logging]
    FEATURES --> ROUTING[Intelligent Routing]
    FEATURES --> GUARDRAILS_F[Guardrails]
    RATE --> PER_USER[Per-User Limits]
    RATE --> PER_EP[Per-Endpoint Limits]
    RATE --> TOKEN_LIM[Token Budget Limits]
    LOGGING --> TOKENS[Token Usage]
    LOGGING --> LATENCY_L[Latency Metrics]
    LOGGING --> PAYLOAD[Request/Response Payloads]
    ROUTING --> FALLBACK_R[Provider Failover]
    ROUTING --> LOAD_BAL[Load Balancing]
    FEATURES:::governance
    RATE:::processing
    LOGGING:::storage
    ROUTING:::serving
    GUARDRAILS_F:::ingestion
    PER_USER:::source
    PER_EP:::source
    TOKEN_LIM:::source
    TOKENS:::source
    LATENCY_L:::source
    PAYLOAD:::source
    FALLBACK_R:::source
    LOAD_BAL:::source

*AI Gateway feature set: rate limiting, logging, routing, and guardrails.*
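To make the per-user and per-endpoint limits concrete, here is a minimal sketch of a fixed-window call limiter. It only illustrates the general technique; how AI Gateway enforces limits internally is not described here, and the `FixedWindowLimiter` class and its parameters are hypothetical. The clock is injectable so the behavior can be checked without waiting for a real window to elapse.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Allow at most `calls` requests per key in each fixed time window."""

    def __init__(self, calls: int, window_seconds: int,
                 clock: Callable[[], float] = time.time):
        self.calls = calls
        self.window_seconds = window_seconds
        self.clock = clock
        # Maps (key, window index) -> number of calls admitted in that window.
        self.counts: Dict[Tuple[str, int], int] = defaultdict(int)

    def allow(self, key: str) -> bool:
        window = int(self.clock() // self.window_seconds)
        bucket = (key, window)
        if self.counts[bucket] >= self.calls:
            return False  # budget for this key and window is spent
        self.counts[bucket] += 1
        return True


# Simulated clock so the example is deterministic.
now = [0.0]
limiter = FixedWindowLimiter(calls=2, window_seconds=60, clock=lambda: now[0])

decisions = [limiter.allow("alice") for _ in range(3)]  # third call exceeds the limit
other_user = limiter.allow("bob")                       # separate key, separate budget
now[0] = 60.0                                           # move into the next window
after_reset = limiter.allow("alice")

print(decisions, other_user, after_reset)
```

Using the user identity as the key gives a per-user limit; using the endpoint name as the key gives a per-endpoint limit. This sketch never evicts old windows, so a long-running version would need to prune `counts`.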

Key Terms

Prerequisites and Setup

Databricks workspace (Premium or Enterprise).
Admin access to configure AI Gateway routes and rate limits.
Provider API keys stored in Databricks Secrets for external models.
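As an illustration of the last prerequisite, the sketch below builds an external-model definition that points at a stored secret instead of embedding the raw key. The `{{secrets/<scope>/<key>}}` reference form and the field names (`external_model`-style `name`, `provider`, `task`, `openai_config.openai_api_key`) are written from memory of the Databricks external models documentation and should be treated as assumptions to verify; the scope name, key name, and model name are hypothetical.

```python
import json


def secret_ref(scope: str, key: str) -> str:
    """Build a secret reference string (assumed form: {{secrets/<scope>/<key>}})."""
    return "{{secrets/" + scope + "/" + key + "}}"


# Hypothetical scope and key names; the raw API key never appears in the config.
external_model = {
    "name": "gpt-4o",          # hypothetical model name
    "provider": "openai",
    "task": "llm/v1/chat",
    "openai_config": {
        "openai_api_key": secret_ref("ai-gateway", "openai-api-key"),
    },
}

print(json.dumps(external_model, indent=2))
```

Keeping only the reference in configuration means the definition can be stored in version control and shared with reviewers without exposing the provider credential.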

Step-by-Step Implementation

Configuration Reference

AI Gateway: Governing and Monitoring Access to AI Models configuration options
| Parameter | Default | Description |
| --- | --- | --- |
| `rate_limits[].key` | (no default) | Scope: `user` or `endpoint` |
| `rate_limits[].renewal_period` | (no default) | Window: `minute`, `hour`, or `day` |
| `rate_limits[].calls` | (no default) | Maximum calls per window |
| `usage_tracking_config.enabled` | `false` | Enable token usage tracking |
| `inference_table_config.enabled` | `false` | Log request/response payloads |
| `guardrails` | `{}` | Input/output safety filters |
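The parameters in this table can be assembled into a single configuration object. The sketch below builds one as a Python dictionary and checks it against the allowed values listed in the table; the `validate_gateway_config` helper is hypothetical, the limits of 100 and 1000 calls are arbitrary example values, and the exact payload your workspace accepts (for example, which renewal periods are supported, or additional required fields) should be confirmed against the API reference.

```python
import json

# Allowed values as listed in the configuration table in this document.
ALLOWED_KEYS = {"user", "endpoint"}
ALLOWED_PERIODS = {"minute", "hour", "day"}


def validate_gateway_config(config: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issues found."""
    errors = []
    for i, limit in enumerate(config.get("rate_limits", [])):
        if limit.get("key") not in ALLOWED_KEYS:
            errors.append(f"rate_limits[{i}].key must be one of {sorted(ALLOWED_KEYS)}")
        if limit.get("renewal_period") not in ALLOWED_PERIODS:
            errors.append(f"rate_limits[{i}].renewal_period must be one of {sorted(ALLOWED_PERIODS)}")
        calls = limit.get("calls")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(calls, int) or isinstance(calls, bool) or calls <= 0:
            errors.append(f"rate_limits[{i}].calls must be a positive integer")
    return errors


config = {
    "rate_limits": [
        {"key": "user", "renewal_period": "minute", "calls": 100},
        {"key": "endpoint", "renewal_period": "minute", "calls": 1000},
    ],
    "usage_tracking_config": {"enabled": True},
    "inference_table_config": {"enabled": True},
    "guardrails": {},
}

errors = validate_gateway_config(config)
bad_errors = validate_gateway_config(
    {"rate_limits": [{"key": "team", "renewal_period": "minute", "calls": 0}]}
)

print(json.dumps(config, indent=2))
print(errors)
print(bad_errors)
```

Validating locally before submitting a configuration catches typos such as an unsupported scope or a zero call limit before they reach the endpoint.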

Monitoring, Cost, and Security Considerations

Common Pitfalls and Recommended Patterns

Frequently Asked Questions
