8 First-class providers · plus any OpenAI-compatible custom endpoint
Every major model, one routing layer.
Switch a provider with a config change. Mix providers per endpoint. Use the same OpenAI SDK across all of them.
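A sketch of what that looks like on the wire (the gateway URL, token format, and provider-prefixed model names below are illustrative assumptions, not documented PromptGate values):

```python
import json

# Assumed values for illustration only; the real gateway URL, token
# format, and model-naming convention may differ.
GATEWAY_BASE = "https://gateway.example.com/v1"
API_KEY = "pg_example_token"

def chat_payload(model: str, user_message: str) -> dict:
    """Build an OpenAI Chat Completions request against the gateway.

    Switching providers is only a different model string; the endpoint,
    headers, and body shape stay identical.
    """
    return {
        "url": f"{GATEWAY_BASE}/chat/completions",
        "headers": {"Authorization": f"Bearer {API_KEY}"},
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": user_message}],
        }),
    }

# Same call shape, two different providers.
via_openai = chat_payload("openai/gpt-4o", "Hello")
via_anthropic = chat_payload("anthropic/claude-sonnet", "Hello")
```

Because the request never changes shape, any OpenAI SDK pointed at the gateway's base URL works unmodified.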
Provider docs ↗
CAPABILITY MATRIX
What's wired today
Every provider is in the gateway as a first-class adapter. Your code never knows the difference.
| Provider | Chat | Streaming | Tool calling | Embeddings |
| --- | --- | --- | --- | --- |
| OpenAI | ✓ | ✓ | ✓ | ✓ |
| Anthropic | ✓ | ✓ | ✓ | — |
| Google Gemini | ✓ | ✓ | — | — |
| Cohere | ✓ | ✓ | — | ✓ |
| Mistral | ✓ | ✓ | ✓ | ✓ |
| Groq | ✓ | ✓ | ✓ | — |
| Together AI | ✓ | ✓ | ✓ | ✓ |
| Ollama (local) | ✓ | ✓ | ✓ | ✓ |
CUSTOM PROVIDERS
Bring your own OpenAI-compatible endpoint.
Anything that speaks the OpenAI Chat Completions wire format — LM Studio, llama.cpp's server, an internal vLLM cluster, an experimental research endpoint — drops in as a custom provider via an Ollama-style configuration.
- Per-endpoint base URL + auth header
- Tool calling + streaming work as long as upstream conforms
- Mix custom and first-party providers in the same project
- Cost dashboard, audit, guardrails apply uniformly
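As a sketch, a custom-provider entry might look like the following (the file layout, key names, and `${...}` substitution here are assumptions; the actual Ollama-style schema may differ):

```yaml
# Hypothetical custom-provider entry; real key names may differ.
providers:
  my-vllm:
    type: openai_compatible
    base_url: http://vllm.internal:8000/v1
    auth_header: "Bearer ${VLLM_API_KEY}"
    capabilities:
      streaming: true
      tool_calling: true
```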
Roadmap
Coming next
- AWS Bedrock adapter (Claude / Titan / Llama)
- Google Vertex AI native auth
- Azure OpenAI dedicated endpoint mapping
- DeepSeek / Replicate adapters
Vote on what we ship next: coming soon.
One token, every provider.
Issue a single PromptGate token; route to OpenAI, Anthropic, or any of the 8 with the same SDK and base URL. Provider keys stay encrypted on your side.