Every major model, one routing layer.

Switch a provider with a config change. Mix providers per endpoint. Use the same OpenAI SDK across all of them.

Coming soonComing soon Provider docs ↗

CAPABILITY MATRIX

What's wired today

Every provider is in the gateway as a first-class adapter. Your code never knows the difference.

Provider

Chat

Streaming

Tool calling

Embeddings

OpenAI

✓

Anthropic

✓

—

Google Gemini

✓

—

Cohere

✓

—

✓

Mistral

✓

Groq

✓

—

Together AI

✓

Ollama (local)

✓

CUSTOM PROVIDERS

Bring your own OpenAI-compatible endpoint.

Anything that speaks the OpenAI Chat Completions wire format — vLLM, LM Studio, llama.cpp's server, an internal vLLM cluster, an experimental research endpoint — drops in as a custom provider via an Ollama-style configuration.

Per-endpoint base URL + auth header
Tool calling + streaming work as long as upstream conforms
Mix custom and first-party providers in the same project
Cost dashboard, audit, guardrails apply uniformly

Roadmap

Coming next

AWS Bedrock adapter (Claude / Titan / Llama)
Google Vertex AI native auth
Azure OpenAI dedicated endpoint mapping
DeepSeek / Replicate adapters

Vote on what we ship next: Coming soonComing soon.

One token, every provider.

Issue a single PromptGate token; route to OpenAI, Anthropic, or any of the 8 with the same SDK and base URL. Provider keys stay encrypted on your side.

Install Community Edition Agent Proxy story →