Merged
48 changes: 33 additions & 15 deletions assets/agw-docs/snippets/llm-comparison.md
@@ -1,18 +1,36 @@
Review the following table to compare agentgateway's support of different LLM provider APIs.

| API | OpenAI | Anthropic | Amazon Bedrock | Azure | Google Gemini | Google Vertex AI | GitHub Copilot |
|-----|:------:|:---------:|:--------------:|:------------:|:-------------:|:----------------:|:---------------:|
| Completions<br>`/v1/chat/completions` | ✅ Native | ✅ Translation | ✅ Translation| ✅ Native | ✅ Native`*`| ✅ Native`†` | ✅ Native |
| Responses<br>`/v1/responses` | ✅ Native | ❌ No | ✅ Translation| ✅ Native| ❌ No | ❌ No | ❌ No |
| Messages<br>`/v1/messages` | ✅ Translation | ✅ Native | ✅ Translation | ✅ Translation | ✅ Translation | ✅ Native`†` | ✅ Translation |
| Embeddings<br>`/v1/embeddings` | ✅ Native | ❌ No | ✅ Translation | ✅ Native | ❌ No | ✅ Translation | ❌ No |
| Realtime<br>`/v1/realtime` | ✅ Native | ❌ No | ❌ No | ❌ No | ❌ No | ❌ No | ❌ No |
| Token Count<br>`/v1/messages/count_tokens` | ❌ No | ✅ Native| ✅ Translation | ❌ No| ❌ No | ✅ Translation | ❌ No |
| Provider | Chat Completions | Responses | Messages | Embeddings | Realtime | Count Tokens | Rerank |
|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| <img src="/integrations/providers/openai.svg" alt="" width="20" height="20" style="vertical-align:middle;margin-right:0.4rem;"> OpenAI | ✅ | ✅ | ✅¹ | ✅ | ✅ | ✅² | - |
| <img src="/integrations/providers/anthropic.svg" alt="" width="20" height="20" style="vertical-align:middle;margin-right:0.4rem;"> Anthropic | ✅¹ | ◇ | ✅ | - | - | ✅ | - |
| <img src="/integrations/providers/bedrock.svg" alt="" width="20" height="20" style="vertical-align:middle;margin-right:0.4rem;"> Bedrock | ✅¹ | ✅¹ | ✅¹ | ✅¹ | - | ✅⁴ | ✅¹ |
| <img src="/integrations/providers/azure.svg" alt="" width="20" height="20" style="vertical-align:middle;margin-right:0.4rem;"> Azure | ✅ | ✅ | ✅¹ | ✅ | - | ✅² | ⚠️³ |
| <img src="/integrations/providers/gemini.svg" alt="" width="20" height="20" style="vertical-align:middle;margin-right:0.4rem;"> Gemini | ✅ | ✅¹ | ✅¹ | ✅ | - | ✅² | - |
| <img src="/integrations/providers/vertex.svg" alt="" width="20" height="20" style="vertical-align:middle;margin-right:0.4rem;"> Vertex AI | ✅⁴ | ◇ | ✅⁴ | ✅¹ | - | ✅⁴ | ✅¹ |
| <img src="/integrations/providers/copilot.svg" alt="" width="20" height="20" style="vertical-align:middle;margin-right:0.4rem;"> Copilot | ✅ | ✅ | ✅¹ | ◇ | - | ✅² | ⚠️³ |
| <img src="/integrations/providers/cohere.svg" alt="" width="20" height="20" style="vertical-align:middle;margin-right:0.4rem;"> Cohere | ✅ | ✅¹ | ✅¹ | ✅ | - | ✅² | ✅ |
| <img src="/integrations/providers/ollama.svg" alt="" width="20" height="20" style="vertical-align:middle;margin-right:0.4rem;"> Ollama | ✅ | ✅ | ✅¹ | ✅ | - | ✅² | - |
| <img src="/integrations/providers/baseten.svg" alt="" width="20" height="20" style="vertical-align:middle;margin-right:0.4rem;"> Baseten | ✅ | ✅¹ | ✅ | - | - | ✅² | - |
| <img src="/integrations/providers/cerebras.svg" alt="" width="20" height="20" style="vertical-align:middle;margin-right:0.4rem;"> Cerebras | ✅ | ✅¹ | ✅¹ | - | - | ✅² | - |
| <img src="/integrations/providers/deepinfra.svg" alt="" width="20" height="20" style="vertical-align:middle;margin-right:0.4rem;"> DeepInfra | ✅ | ✅¹ | ✅ | ✅ | - | ✅² | - |
| <img src="/integrations/providers/deepseek.svg" alt="" width="20" height="20" style="vertical-align:middle;margin-right:0.4rem;"> DeepSeek | ✅ | ✅¹ | ✅ | - | - | ✅² | - |
| <img src="/integrations/providers/groq.svg" alt="" width="20" height="20" style="vertical-align:middle;margin-right:0.4rem;"> Groq | ✅ | ✅ | ✅¹ | - | - | ✅² | - |
| <img src="/integrations/providers/huggingface.svg" alt="" width="20" height="20" style="vertical-align:middle;margin-right:0.4rem;"> Hugging Face | ✅ | ✅ | ✅¹ | - | - | ✅² | - |
| <img src="/integrations/providers/mistral.svg" alt="" width="20" height="20" style="vertical-align:middle;margin-right:0.4rem;"> Mistral | ✅ | ✅¹ | ✅¹ | ✅ | - | ✅² | - |
| <img src="/integrations/providers/openrouter.svg" alt="" width="20" height="20" style="vertical-align:middle;margin-right:0.4rem;"> OpenRouter | ✅ | ✅ | ✅ | ✅ | - | ✅² | ✅ |
| <img src="/integrations/providers/togetherai.svg" alt="" width="20" height="20" style="vertical-align:middle;margin-right:0.4rem;"> Together AI | ✅ | ✅¹ | ✅¹ | ✅ | - | ✅² | ✅ |
| <img src="/integrations/providers/xai.svg" alt="" width="20" height="20" style="vertical-align:middle;margin-right:0.4rem;"> xAI | ✅ | ✅ | ✅¹ | - | ✅ | ✅² | - |
| <img src="/integrations/providers/fireworks.svg" alt="" width="20" height="20" style="vertical-align:middle;margin-right:0.4rem;"> Fireworks | ✅ | ✅ | ✅ | ✅ | - | ✅² | ✅ |

**Notes**:
- **✅ Native**: Agentgateway has complete support for the API, and the provider supports the API natively. This allows Agentgateway to pass unknown fields through unchanged, so proxying likely works even if you use extra fields or new models.
- **✅ Translation**: Agentgateway translates from one API to another. As such, agentgateway only supports fields that it is aware of. New models or LLM APIs might require code changes before they are fully supported.
- **❌ No**: Agentgateway does not currently support the API for this provider.
- `*`: Agentgateway supports the API natively via a compatibility endpoint. Note that Google Gemini does a translation for their Completions API support.
- `†`: Agentgateway supports the API natively via translation to Anthropic. Support in Vertex AI differs depending on the model type.
- Both streaming and non-streaming options for the Completions, Responses, and Messages APIs are supported.
Legend:

| Symbol | Meaning |
|--------|--------------------------------------------------------------------------------|
| ✅ | Supported natively |
| ✅¹ | Supported via Agentgateway translation |
| ✅² | Supported via a local estimate computed by Agentgateway |
| ⚠️³ | Passthrough/provider-dependent; works only with a compatible upstream endpoint |
| ✅⁴ | Supported, but behavior depends on model family or provider route |
| ◇ | Not currently implemented in Agentgateway |
| - | Provider does not offer this capability |
22 changes: 9 additions & 13 deletions assets/agw-docs/standalone/deployment/binary.md
@@ -4,7 +4,7 @@ To run agentgateway as a standalone binary, follow the steps to download, instal

{{% steps %}}

### Step 1: Download and install
### Download and install

Download and install the agentgateway binary. Alternatively, you can manually download the binary from the [agentgateway releases page](https://github.com/agentgateway/agentgateway/releases/latest).

@@ -79,7 +79,7 @@ Password:
agentgateway installed into /usr/local/bin/agentgateway
```

### Step 2: Verify the installation
### Verify the installation

Verify that the `agentgateway` binary is installed.

@@ -99,26 +99,22 @@ Example output with the latest version, {{< reuse "agw-docs/versions/n-patch.md"
}
```

### Step 3: Create a configuration file
### Run agentgateway

Create a [configuration file]({{< link-hextra path="/configuration/" >}}) for agentgateway. In this example, `config.yaml` is used. You might start with [this simple example configuration file](https://agentgateway.dev/examples/basic/config.yaml).
To run agentgateway, execute the binary. Configuration is stored in `~/.config/agentgateway`.

```yaml
{{< github url="https://agentgateway.dev/examples/basic/config.yaml" >}}
```sh
agentgateway
```

### Step 4: Run agentgateway
To specify an explicit configuration file, use `-f`:

```sh
agentgateway -f config.yaml
```

Example output:
You might start with [this simple example configuration file](https://agentgateway.dev/examples/basic/config.yaml).

```
info state_manager loaded config from File("config.yaml")
info app serving UI at http://localhost:15000/ui
info proxy::gateway started bind bind="bind/3000"
```
Open <http://localhost:15000/ui> to get started!

{{% /steps %}}
59 changes: 4 additions & 55 deletions assets/agw-docs/standalone/virtual-keys.md
@@ -198,6 +198,10 @@ EOF

LLMs typically charge per input and output token. Without spending control, users can quickly generate large bills by submitting long prompts, streaming or retrying requests, or running recursive agent loops. To protect against unexpected bills, scaling surprises, and abuse, use token-based rate limits to cap the number of tokens that can be used.

{{< callout type="warning" >}}
`localRateLimit` is a **gateway-wide** limit, not a per-key limit. It enforces a single shared token budget across **all** requests and API keys.
{{< /callout >}}

### How rate limiting works

Agentgateway checks token-based rate limits in two phases:
@@ -352,61 +356,6 @@ EOF

With this setting, requests are denied immediately if the estimated prompt token count exceeds the available budget.

## Add a global token budget

{{< callout type="warning" >}}
`localRateLimit` is a **gateway-wide** limit, not a per-key limit. It enforces a single shared token budget across **all** requests and API keys.
{{< /callout >}}

To add a token budget that limits total token usage across all requests using more advanced routing options, use the routing-based configuration format with `localRateLimit`.

{{< callout type="info" >}}
Rate limiting requires the `binds/listeners/routes` configuration format because `localRateLimit` is an HTTP-level policy. For more information, see the [Routing-based configuration guide]({{< link-hextra path="/llm/configuration-modes/" >}}).
{{< /callout >}}

```yaml
cat <<'EOF' > config.yaml
# yaml-language-server: $schema=https://agentgateway.dev/schema/config

binds:
- port: 4000
  listeners:
  - routes:
    - backends:
      - ai:
          name: openai
          provider:
            openAI:
              model: gpt-3.5-turbo
      policies:
        apiKey:
          mode: strict
          keys:
          - key: sk-alice-abc123def456
            metadata:
              user: alice
          - key: sk-bob-xyz789uvw012
            metadata:
              user: bob
        backendAuth:
          key: "$OPENAI_API_KEY"
        localRateLimit:
        - maxTokens: 100000
          tokensPerFill: 100000
          fillInterval: 86400s
          type: tokens
EOF
```

| Setting | Description |
| -- | -- |
| `backendAuth` | The API key used to authenticate with the LLM provider backend. For configuration options, see [Manage API keys]({{< link-hextra path="/llm/api-keys/" >}}). |
| `localRateLimit` | Token-based rate limiting applied globally to **all** requests through this route, regardless of which API key is used. |
| `maxTokens` | The maximum number of tokens available in the shared budget. |
| `tokensPerFill` | The number of tokens added during each refill. |
| `fillInterval` | The interval between refills. Use `86400s` for a daily budget. |
| `type` | Set to `tokens` for token-based limits. Use `requests` for request-based limits. |

For more information about rate limiting configuration options, see [Rate limits]({{< link-hextra path="/configuration/resiliency/rate-limits/" >}}).
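
The refill settings (`maxTokens`, `tokensPerFill`, `fillInterval`) can be read as a conventional token bucket. The following is a minimal sketch of that arithmetic, assuming the budget refills in whole intervals and is capped at `maxTokens`. The `refill` and `spend` functions are hypothetical helpers for illustration only and are not part of agentgateway, whose internal behavior may differ.

```sh
# Illustrative only: models the refill settings as a conventional token bucket.
# This is an assumption about the semantics, not agentgateway's implementation.
MAX_TOKENS=100000       # maxTokens: capacity of the shared budget
TOKENS_PER_FILL=100000  # tokensPerFill: tokens added at each refill
FILL_INTERVAL=86400     # fillInterval: seconds between refills

# refill CURRENT ELAPSED_SECONDS: prints the budget after refills, capped at capacity
refill() {
  fills=$(( $2 / FILL_INTERVAL ))
  total=$(( $1 + fills * TOKENS_PER_FILL ))
  if [ "$total" -gt "$MAX_TOKENS" ]; then total=$MAX_TOKENS; fi
  echo "$total"
}

# spend CURRENT COST: prints the remaining budget, or fails if the cost does not fit
spend() {
  if [ "$2" -gt "$1" ]; then return 1; fi
  echo $(( $1 - $2 ))
}

# One key spends most of the shared budget, so a second key is denied.
remaining=$(spend "$MAX_TOKENS" 60000)
spend "$remaining" 50000 || echo "denied: shared budget exhausted"
# After one full interval, the shared budget is back at capacity.
refill "$remaining" "$FILL_INTERVAL"
```

Because the bucket is shared, the sketch tracks a single `remaining` value no matter which API key sends the request. This is the gateway-wide behavior that the warning callout describes.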

## Monitor per-key spending
47 changes: 32 additions & 15 deletions content/docs/standalone/main/deployment/docker/_index.md
@@ -6,45 +6,62 @@ description: Overview of how to deploy agentgateway with Docker.

To run agentgateway as a Docker container, agentgateway publishes official Docker images at `cr.agentgateway.dev/agentgateway`.

Before you begin, create a [configuration file]({{< link-hextra path="/configuration/" >}}) for agentgateway. In this example, `config.yaml` is used.
You might start with [this simple example configuration file](https://agentgateway.dev/examples/basic/config.yaml).

## Docker

To run agentgateway with Docker, mount your configuration file into the container and expose any necessary ports.
To run agentgateway with Docker, you may either mount your [configuration file]({{< link-hextra path="/configuration/" >}}) directly, or mount a directory
and create the configuration in the UI:

```sh
docker run -v ./config.yaml:/config.yaml -p 3000:3000 \
cr.agentgateway.dev/agentgateway:v{{< reuse "agw-docs/versions/n-patch.md" >}} \
-f /config.yaml
mkdir agentgateway-config
docker run \
--user "$(id -u):$(id -g)" \
-v ./agentgateway-config:/config \
-p 3000:3000 -p 4000:4000 -p 127.0.0.1:15000:15000 \
cr.agentgateway.dev/agentgateway:v{{< reuse "agw-docs/versions/n-patch.md" >}}
```

By default, the agentgateway admin UI listens on localhost, which is not exposed outside of the container.
To access the UI, you can change the bind address and expose the port.
In this mode, agentgateway automatically creates a configuration file that sets up logging and exposes the admin UI.
The `--user` flag runs the container as the current user so that the container can read and write the configuration.

Alternatively, you can provide an explicit configuration file. By default, the agentgateway admin UI listens on localhost, which is not exposed outside of the container.
The following example sets the optional `ADMIN_ADDR` environment variable to expose the UI.

```sh
docker run -v ./config.yaml:/config.yaml -p 3000:3000 \
-p 127.0.0.1:15000:15000 -e ADMIN_ADDR=0.0.0.0:15000 \
docker run \
--user "$(id -u):$(id -g)" \
-v ./config.yaml:/config.yaml \
-p 3000:3000 -p 4000:4000 -p 127.0.0.1:15000:15000 \
-e ADMIN_ADDR=0.0.0.0:15000 \
cr.agentgateway.dev/agentgateway:v{{< reuse "agw-docs/versions/n-patch.md" >}} \
-f /config.yaml
```

Open <http://localhost:15000/ui> to get started!

## Docker Compose

To run agentgateway in Docker Compose, follow a similar approach to mount the configuration file and expose the ports.
To run agentgateway in Docker Compose, follow the same approach as above. Create a directory for the configuration and start the service.

```sh
mkdir agentgateway-config
docker compose up
```

```yaml
services:
agentgateway:
container_name: agentgateway
restart: unless-stopped
image: cr.agentgateway.dev/agentgateway:v{{< reuse "agw-docs/versions/n-patch.md" >}}
# Replace with your user and group IDs, such as the output of: id -u && id -g
user: "1000:1000"
ports:
- "3000:3000"
- "4000:4000"
- "127.0.0.1:15000:15000"
volumes:
- ./config.yaml:/config.yaml
environment:
- ADMIN_ADDR=0.0.0.0:15000
command: ["-f", "/config.yaml"]
- ./agentgateway-config:/config
```
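
The `user` value in the Compose file must match the owner of the mounted directory. As a small sketch, the following prints the current user's IDs in the `uid:gid` form that the `user` field expects. The `current_user_ids` function name is hypothetical and used only for illustration.

```sh
# Prints "uid:gid" for the current user, the same value that
# `--user "$(id -u):$(id -g)"` passes to `docker run`.
current_user_ids() {
  printf '%s:%s\n' "$(id -u)" "$(id -g)"
}

current_user_ids
```

Copy the printed value into the `user` field so that the container can read and write files in `./agentgateway-config`.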

Open <http://localhost:15000/ui> to get started!
3 changes: 2 additions & 1 deletion content/docs/standalone/main/llm/_index.md
@@ -8,4 +8,5 @@ next: /reference/observability
test: skip
---

Consume LLM services by setting up AI backends for your LLM providers.
Agentgateway can act as a feature-rich AI/LLM gateway that proxies traffic between your applications and LLM providers.
This lets you connect to thousands of LLM models through a unified interface that provides governance, observability, and reliability controls.