Product type: ai_gateway

AI Gateway

The most opinionated of the five types. Each endpoint pins a provider + model + prompt and validates input/output. Sessions, streaming, schemas, failover, budgets: all configured per endpoint.

WHAT'S IN THE BOX

Endpoints as first-class artefacts

Each AI endpoint carries every knob you'd want for a production-quality LLM call.

Pinned provider + model

One endpoint = one upstream model. Or pick a Provider Template if you want to share configuration across endpoints.

System prompt + template

A system prompt and an optional {{input}}-templated user prompt. Re-render with new context, not a new endpoint.

JSON Schema validation

Optional input + output JSON Schemas enforced via justinrainbow/json-schema. 422 on invalid output.
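
For example, a minimal output schema of the kind an endpoint might enforce (the field names are illustrative, not part of the product):

{
  "type": "object",
  "required": ["summary", "key_points"],
  "properties": {
    "summary": {"type": "string"},
    "key_points": {"type": "array", "items": {"type": "string"}}
  },
  "additionalProperties": false
}

If the model's reply fails validation against the configured schema, the gateway answers 422 instead of passing the output through.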

Streaming

Per-endpoint streaming_enabled flag. SSE pass-through with full provider semantics intact.
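
With streaming_enabled on an endpoint, the same POST comes back as an SSE stream; a minimal sketch with curl (slug and token are placeholders, as in the example further down):

# -N turns off curl's buffering so SSE events print as they arrive
curl -N -X POST https://promptgate.your.co/api/<uuid>/summarize \
  -H "Authorization: Bearer pg_live_..." \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Long article goes here..."}]}'

# → the body arrives as the provider's own SSE events rather than one JSON document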

Sessions

Server-side conversation state with TTL, max-messages, max-tokens. Session UUID flows transparently through the gateway.
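
A rough sketch of how a session can be carried across calls; the X-Session-Id header below is hypothetical, shown only to illustrate the flow, and the real field name may differ:

# first call: the gateway creates server-side conversation state
# and hands back a session UUID (header name here is hypothetical)
curl -i -X POST https://promptgate.your.co/api/<uuid>/summarize \
  -H "Authorization: Bearer pg_live_..." \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "First message"}]}'

# later calls: send the same UUID so earlier turns are replayed into the
# conversation, bounded by the endpoint's TTL, max-messages and max-tokens
curl -X POST https://promptgate.your.co/api/<uuid>/summarize \
  -H "Authorization: Bearer pg_live_..." \
  -H "X-Session-Id: <session-uuid>" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Follow-up question"}]}'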

Failover chain

Try secondary credentials when the primary fails. Transparent to the client; logged with the routing decision.

Rate limits

Per-minute and per-hour caps. 429 + Retry-After header on breach. Cache-driver agnostic.
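
Once a cap is exceeded, the gateway answers with 429 and a Retry-After header; an illustrative check with curl (the status line and header value below are a sketch, not exact output):

# -i prints response headers so the 429 and Retry-After are visible
curl -i -X POST https://promptgate.your.co/api/<uuid>/summarize \
  -H "Authorization: Bearer pg_live_..." \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "One request too many"}]}'

# → after the per-minute or per-hour cap is hit:
# HTTP/1.1 429 Too Many Requests
# Retry-After: 30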

Budgets

Per-request token cap + monthly USD budget. When a cap is set, a breach returns 422 before the provider call; null = unlimited.
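
When a cap is set, an over-budget request fails before any provider call is made; a hedged sketch from the client side (very-long-request.json is just a placeholder payload):

# a request that exceeds the per-request token cap, or arrives after the
# monthly USD budget is spent, is rejected without touching the provider
curl -i -X POST https://promptgate.your.co/api/<uuid>/summarize \
  -H "Authorization: Bearer pg_live_..." \
  -H "Content-Type: application/json" \
  -d @very-long-request.json

# → HTTP/1.1 422 Unprocessable Entity  (nothing is billed upstream)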

Routing rules (YAML)

Pick provider/model based on input size, schema presence, monthly_spend_pct, time_of_day. First match wins; no match → endpoint default.
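
A hedged sketch of what a rule file could look like; the keys mirror the signals above, but the exact schema and the provider/model names are illustrative:

# first matching rule wins; if nothing matches, the endpoint default applies
rules:
  - when:
      input_tokens_gt: 4000        # large inputs: route to a long-context model
    use:
      provider: anthropic
      model: claude-3-5-sonnet
  - when:
      monthly_spend_pct_gt: 80     # close to budget: prefer a cheaper model
    use:
      provider: openai
      model: gpt-4o-mini
  - when:
      schema_present: true         # structured output: keep the stronger model
    use:
      provider: openai
      model: gpt-4o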

EXAMPLE

Call an AI Gateway endpoint

One slug per endpoint. The slug is the public path; the endpoint's pinned model + prompt + schema take care of everything else.

  • System prompt baked in — clients only send a user message
  • Output schema validated server-side
  • Tokens, latency, status logged in gateway_logs
  • Cost rolls into the project dashboard automatically
# POST to the endpoint slug
curl -X POST https://promptgate.your.co/api/<uuid>/summarize \
  -H "Authorization: Bearer pg_live_..." \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [
          {"role": "user", "content": "Long article goes here..."}
        ]
      }'

# → returns the validated, schema-constrained response

DECISION HELPER

Pick this type when…

Use AI Gateway when

  • You ship a real product feature with one or more LLM-backed endpoints
  • You want to enforce a system prompt + schema on every call
  • You need sessions, failover, budgets, or rate limits per endpoint
  • You expose specific endpoints to specific clients (different scopes per token)

Pick something else when

  • Your app uses an OpenAI SDK and you just want to swap the base URL → Agent Proxy or ai_wrapper
  • You want to proxy a non-LLM HTTP API → API Gateway
  • You want to aggregate multiple MCP servers behind one endpoint → MCP Gateway

Ready to ship?

Pull the image, create your first AI Gateway project, and define a single endpoint. The wizard takes you through every knob in seven tabs.

  • Install Community Edition
  • AI Endpoints docs ↗