8 first-class providers · plus any OpenAI-compatible custom endpoint

Every major model, one routing layer.

Switch providers with a config change. Mix providers per endpoint. Use the same OpenAI SDK across all of them.

CAPABILITY MATRIX

What's wired today

Every provider is in the gateway as a first-class adapter. Your code never knows the difference.

Provider          Chat   Streaming   Tool calling   Embeddings
OpenAI             ✓        ✓            ✓             ✓
Anthropic          ✓        ✓            ✓             —
Google Gemini      ✓        ✓            ✓             ✓
Cohere             ✓        ✓            ✓             ✓
Mistral            ✓        ✓            ✓             ✓
Groq               ✓        ✓            ✓             —
Together AI        ✓        ✓            ✓             ✓
Ollama (local)     ✓        ✓            ✓             ✓
CUSTOM PROVIDERS

Bring your own OpenAI-compatible endpoint.

Anything that speaks the OpenAI Chat Completions wire format (an internal vLLM cluster, LM Studio, llama.cpp's server, an experimental research endpoint) drops in as a custom provider via an Ollama-style configuration.

  • Per-endpoint base URL + auth header
  • Tool calling + streaming work as long as the upstream endpoint conforms
  • Mix custom and first-party providers in the same project
  • Cost dashboard, audit, guardrails apply uniformly
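
To make the bullets above concrete, here is a minimal sketch of how a custom endpoint can sit next to first-party providers in one registry. The config keys, provider names, and the `resolve` helper are illustrative assumptions, not PromptGate's documented schema.

```python
# Hypothetical provider registry sketch; key names are assumptions,
# not the documented PromptGate config format.
FIRST_PARTY = {
    "openai": {"base_url": "https://api.openai.com/v1"},
    "anthropic": {"base_url": "https://api.anthropic.com/v1"},
}

# An Ollama-style custom entry: per-endpoint base URL + auth header.
CUSTOM = {
    "lab-vllm": {
        "base_url": "http://vllm.internal:8000/v1",  # any OpenAI-compatible server
        "auth_header": {"Authorization": "Bearer ${LAB_VLLM_KEY}"},
    },
}

def resolve(provider: str) -> dict:
    """Look up a provider; custom and first-party share one namespace."""
    registry = {**FIRST_PARTY, **CUSTOM}
    if provider not in registry:
        raise KeyError(f"unknown provider: {provider}")
    return registry[provider]

# A custom endpoint resolves exactly like a built-in one.
print(resolve("lab-vllm")["base_url"])
```

Because custom and first-party entries live in one map, cost tracking, audit, and guardrails can treat them uniformly downstream.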
Roadmap

Coming next

  • AWS Bedrock adapter (Claude / Titan / Llama)
  • Google Vertex AI native auth
  • Azure OpenAI dedicated endpoint mapping
  • DeepSeek / Replicate adapters

Vote on what we ship next.

One token, every provider.

Issue a single PromptGate token; route to OpenAI, Anthropic, or any of the eight providers with the same SDK and base URL. Provider keys stay encrypted on your side.
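
The point above can be sketched as a request builder: one base URL, one token, and only the model string selects the provider. The gateway URL, token format, and `provider/model` naming here are illustrative assumptions, not documented PromptGate values.

```python
# Hypothetical values for illustration only.
GATEWAY_BASE = "https://gateway.example.com/v1"  # one base URL for every provider
PROMPTGATE_TOKEN = "pg-example-token"            # one token, every provider

def chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI Chat Completions request.

    Only the model string changes per provider; the URL, auth header,
    and message payload stay identical.
    """
    return {
        "url": f"{GATEWAY_BASE}/chat/completions",
        "headers": {"Authorization": f"Bearer {PROMPTGATE_TOKEN}"},
        "body": {"model": model, "messages": [{"role": "user", "content": prompt}]},
    }

# Same wire format, three different providers:
for model in ("openai/gpt-4o", "anthropic/claude-sonnet", "groq/llama-3.1-8b"):
    req = chat_request(model, "ping")
    assert req["url"].endswith("/chat/completions")  # endpoint never changes
```

Since the payload is the plain Chat Completions shape, any OpenAI SDK pointed at the gateway's base URL produces the same request without code changes.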

Install Community Edition Agent Proxy story →