25 commits
b83e4ed
Build custom vllm-runtime image
treydock Jun 9, 2026
1a82ad3
Fix new job
treydock Jun 9, 2026
cf663c8
Add dynamo image-puller to help ensure dynamo images are always on wo…
treydock Jun 9, 2026
c323f02
Add initial docs and fix daemonset command
treydock Jun 9, 2026
dd67958
Add HF token secret and cache PVC
treydock Jun 9, 2026
c83c16b
Change how HF token is set and add namespace to everything
treydock Jun 9, 2026
eb0dd71
Also set service account on PVC
treydock Jun 9, 2026
7b90bc1
Various fixes
treydock Jun 9, 2026
2480b19
Add frontend and open-webui
treydock Jun 9, 2026
f39117d
Avoid space issues hopefully
treydock Jun 9, 2026
74f4e5f
Add worker logic
treydock Jun 9, 2026
f98ac16
Each DGD needs a frontend and is self-contained so point Open WebUI a…
treydock Jun 9, 2026
5e2c0ad
Fix frontend resource requests
treydock Jun 9, 2026
8659216
Make models public by default
treydock Jun 9, 2026
c271b48
Update docs
treydock Jun 9, 2026
d104cce
Set OPENAI_API_BASE_URLS so multiple can be set
treydock Jun 9, 2026
5501608
Use served-model-name to set model name
treydock Jun 9, 2026
d5b87db
Support frontend args and env
treydock Jun 9, 2026
bebf7dc
Base model alias off a new property to allow different name
treydock Jun 10, 2026
dcced6f
Support disaggregated and model shared args
treydock Jun 10, 2026
4515bd5
More central way to determine GPUs
treydock Jun 10, 2026
d325c92
Ensure RDMA works for disaggregated workflows
treydock Jun 10, 2026
1d41945
Need NIXL telemetry enabled for podmonitor
treydock Jun 10, 2026
9d1a89f
Support multinode and kv routing and ensure proper kv-transfer-config…
treydock Jun 10, 2026
dd7da9e
Do not force CUPY cache path
treydock Jun 10, 2026
12 changes: 12 additions & 0 deletions .github/workflows/release_docker.yaml
@@ -21,10 +21,22 @@ jobs:
          - nominatim
          - postgresql-nominatim
          - cryosparc
          - vllm-runtime
    name: Release ${{ matrix.image }} Docker image
    steps:
      - name: Checkout
        uses: actions/checkout@v6
      - name: Free Disk Space (Ubuntu)
        if: matrix.image == 'vllm-runtime'
        uses: jlumbroso/free-disk-space@v1.3.1
        with:
          tool-cache: false
          android: true
          dotnet: true
          haskell: true
          large-packages: true
          docker-images: true
          swap-storage: true
      - name: Login to DockerHub
        if: matrix.image != 'dcgm-exporter'
        uses: docker/login-action@v4
34 changes: 34 additions & 0 deletions .github/workflows/test_docker.yaml
@@ -0,0 +1,34 @@
name: Test Docker

on:
  pull_request:
    paths:
      - docker-images/**
      - .github/workflows/test_docker.yaml

jobs:
  test-docker:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        image:
          - vllm-runtime
    name: Test ${{ matrix.image }} Docker image
    steps:
      - name: Checkout
        uses: actions/checkout@v6
      - name: Free Disk Space (Ubuntu)
        if: matrix.image == 'vllm-runtime'
        uses: jlumbroso/free-disk-space@v1.3.1
        with:
          tool-cache: false
          android: true
          dotnet: true
          haskell: true
          large-packages: true
          docker-images: true
          swap-storage: true
      - name: Build ${{ matrix.image }} image
        run: |
          make -f docker-images/${{ matrix.image }}/Makefile build
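
The build step in this workflow can be reproduced outside GitHub Actions by assembling the same `make` invocation for a given matrix image. The following is a minimal sketch, assuming it runs from the repository root and that the image's Makefile provides the `build` target the workflow calls. The helper name `build_command` and its `repo_root` parameter are hypothetical and are not part of the PR.

```python
from pathlib import PurePosixPath
from typing import List


def build_command(image: str, repo_root: str = ".") -> List[str]:
    """Return the make invocation the workflow runs for one matrix image.

    Hypothetical helper: it only mirrors the workflow's `run` line,
    `make -f docker-images/<image>/Makefile build`.
    """
    makefile = PurePosixPath(repo_root) / "docker-images" / image / "Makefile"
    return ["make", "-f", str(makefile), "build"]


if __name__ == "__main__":
    # Print the command rather than executing it, so the sketch has no side effects.
    print(" ".join(build_command("vllm-runtime")))
```

Passing the returned list to `subprocess.run(..., check=True)` would run the build itself. Note that the workflow frees disk space first for `vllm-runtime`, which suggests the build needs a lot of disk space.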
5 changes: 5 additions & 0 deletions README.md
@@ -8,6 +8,7 @@ Charts for deploying OSC specific Kubernetes services using Helm
- [webservice](#webservice)
- [osc-open-webui](#osc-open-webui)
- [database](#database)
- [dynamo](#dynamo)
- [Helm Chart Values Patterns](#helm-chart-values-patterns)
- [Environment-Specific Values](#environment-specific-values)
- [Infrastructure charts](#infrastructure-charts)
@@ -62,6 +63,10 @@ The [database](charts/database/README.md) chart provides database service deploy

It includes integration with osc-common for consistent security and configuration practices.

### dynamo

The [dynamo](charts/dynamo/README.md) chart handles deploying Dynamo and resources related to Dynamo at OSC.

## Helm Chart Values Patterns

When using the OSC Puppet infrastructure, some values are automatically added to Helm charts, particularly for webservices.
23 changes: 23 additions & 0 deletions charts/dynamo/.helmignore
@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
35 changes: 35 additions & 0 deletions charts/dynamo/Chart.yaml
@@ -0,0 +1,35 @@
apiVersion: v2
name: dynamo
description: A Helm chart to deploy Dynamo resources at OSC

# A chart can be either an 'application' or a 'library' chart.
#
# Application charts are a collection of templates that can be packaged into versioned archives
# to be deployed.
#
# Library charts provide useful utilities or functions for the chart developer. They're included as
# a dependency of application charts to inject those utilities and functions into the rendering
# pipeline. Library charts do not define any templates and therefore cannot be deployed.
type: application

# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.1.0

# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
# follow Semantic Versioning. They should reflect the version the application is using.
# It is recommended to use it with quotes.
appVersion: "1.2.0"
maintainers:
  - name: treydock
dependencies:
  - name: osc-common
    version: 0.14.2
    repository: https://osc.github.io/osc-helm-charts/
    # repository: file://../osc-common
  - name: osc-open-webui
    version: 0.7.3
    repository: https://osc.github.io/osc-helm-charts/
    # repository: file://../osc-open-webui
98 changes: 98 additions & 0 deletions charts/dynamo/README.md
@@ -0,0 +1,98 @@
# dynamo

![Version: 0.1.0](https://img.shields.io/badge/Version-0.1.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.2.0](https://img.shields.io/badge/AppVersion-1.2.0-informational?style=flat-square)

A Helm chart to deploy Dynamo resources at OSC

## dynamo Chart

This chart deploys Dynamo resources. It requires the Dynamo platform and operator to already be deployed.

Components:

* image-puller
  * A DaemonSet that pulls the Dynamo images and then sleeps, so those images stay present on all nodes
* Items to support Dynamo
  * PVC for cache storage
  * Hugging Face secret
* Dynamo frontend
* Open WebUI

## Usage

The following are the minimal values needed to deploy this chart:

```yaml
---
global:
  imagePullSecret:
    password: <webservices-read password>
  ingress:
    host: <Ingress host>
    hostAlias: <Ingress host alias>
  webui_secret_key: <Open WebUI webui secret>
  models:
    qwen3:
      model: Qwen/Qwen3-0.6B
      disagg: true
      kvRouting: true
      args:
        - --max-model-len
        - "20480"
      decode:
        args: []
        cpu: 2
        memory: 6Gi
        gpu: 1
        gpuType: NVIDIA-A100-PCIE-40GB-MIG-1g.5gb
        nodes: 1
      prefill:
        args: []
        cpu: 2
        memory: 6Gi
        gpu: 1
        gpuType: NVIDIA-A100-PCIE-40GB-MIG-1g.5gb
        nodes: 1
hfToken:
  value: <Huggingface token>
```

## Requirements

| Repository | Name | Version |
|------------|------|---------|
| https://osc.github.io/osc-helm-charts/ | osc-common | 0.14.2 |
| https://osc.github.io/osc-helm-charts/ | osc-open-webui | 0.7.3 |

## Values

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| global.oscServiceAccount | string | `"dynamo"` | The service account used by OSC deployments. |
| global.environment | string | `"production"` | The deployment's OSC environment |
| global.nodeSelectorRole | string | `"webservices"` | The OSC node role to use with nodeSelector |
| global.imagePullSecret.name | string | `"osc-registry"` | imagePullSecret name |
| global.imagePullSecret.registry | string | `"docker-registry.osc.edu"` | imagePullSecret registry |
| global.imagePullSecret.username | string | `"robot$webservices-read"` | imagePullSecret username |
| global.imagePullSecret.password | string | **required** | imagePullSecret password. This value will be set by OSC's Puppet. |
| global.fileset | string | `"PZS0645"` | The fileset for storage |
| global.storageClass | string | `"local-ess"` | The storage class to use for PV claims |
| global.ingress.host | string | `""` | Ingress host name |
| global.ingress.hostAlias | string | `""` | Ingress host alias |
| global.auth.allowGroups | list | `["oscall","PZS0645"]` | Restrict access to these groups |
| global.webui_secret_key | string | **required** | The Open WebUI secret key |
| global.models | object | `{}` | Define models |
| image.repository | string | `"kubernetes/ai-dynamo/vllm-runtime"` | The repository path to main vllm runtime image |
| image.tag | string | The chart's appVersion | The vllm runtime image tag |
| image.release | string | `"0"` | The release of the custom OSC vllm runtime image |
| hfToken.value | string | **required** | The HF token for Hugging Face |
| cache | object | `{}` | |
| defaultGpuType | string | `"NVIDIA-A100-PCIE-40GB-MIG-7g.40gb"` | The default GPU type |
| rdmaResource | string | `"rdma/shared_mlx5"` | The RDMA resource name in Kubernetes |
| kvTransferConfig | object | `{"kv_connector":"NixlConnector","kv_role":"kv_both"}` | The configuration of kv-transfer-config |
| kvEventsConfig | object | `{"enable_kv_cache_events":true,"endpoint":"tcp://*:20080","publisher":"zmq","topic":"kv-events"}` | The configuration for kv-events-config |
| osc-open-webui.open-webui.image.tag | string | `"0.9.6"` | The version of Open WebUI |
| osc-open-webui.open-webui.sso.enableRoleManagement | bool | `true` | Enables role access controls in Open WebUI |
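
The table above marks three values as **required**. A values file can be checked for them before running Helm. The following is a minimal sketch, assuming the values have already been loaded into a nested Python dict (for example with a YAML parser). The names `REQUIRED_KEYS`, `lookup`, and `missing_required` are hypothetical. This is not how the chart itself enforces these values, which the README does not describe.

```python
from typing import Any, Dict, List

# Dotted keys that the Values table marks as **required**.
REQUIRED_KEYS = (
    "global.imagePullSecret.password",
    "global.webui_secret_key",
    "hfToken.value",
)


def lookup(values: Dict[str, Any], dotted_key: str) -> Any:
    """Walk a nested dict by dotted key; return None if any level is missing."""
    node: Any = values
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def missing_required(values: Dict[str, Any]) -> List[str]:
    """Return the required keys that are absent or set to an empty string."""
    return [key for key in REQUIRED_KEYS if lookup(values, key) in (None, "")]


if __name__ == "__main__":
    # Placeholder values only; real secrets would come from Puppet or a secret store.
    example = {
        "global": {"imagePullSecret": {"password": "example"}},
        "hfToken": {"value": ""},
    }
    print(missing_required(example))
```

With the placeholder input above, the check reports `global.webui_secret_key` (absent) and `hfToken.value` (empty).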

----------------------------------------------
Autogenerated from chart metadata using [helm-docs v1.14.2](https://github.com/norwoodj/helm-docs/releases/v1.14.2)
65 changes: 65 additions & 0 deletions charts/dynamo/README.md.gotmpl
@@ -0,0 +1,65 @@
{{ template "chart.header" . }}
{{ template "chart.deprecationWarning" . }}

{{ template "chart.badgesSection" . }}

{{ template "chart.description" . }}

## dynamo Chart

This chart deploys Dynamo resources. It requires the Dynamo platform and operator to already be deployed.

Components:

* image-puller
  * A DaemonSet that pulls the Dynamo images and then sleeps, so those images stay present on all nodes
* Items to support Dynamo
  * PVC for cache storage
  * Hugging Face secret
* Dynamo frontend
* Open WebUI

## Usage

The following are the minimal values needed to deploy this chart:

```yaml
---
global:
  imagePullSecret:
    password: <webservices-read password>
  ingress:
    host: <Ingress host>
    hostAlias: <Ingress host alias>
  webui_secret_key: <Open WebUI webui secret>
  models:
    qwen3:
      model: Qwen/Qwen3-0.6B
      disagg: true
      kvRouting: true
      args:
        - --max-model-len
        - "20480"
      decode:
        args: []
        cpu: 2
        memory: 6Gi
        gpu: 1
        gpuType: NVIDIA-A100-PCIE-40GB-MIG-1g.5gb
        nodes: 1
      prefill:
        args: []
        cpu: 2
        memory: 6Gi
        gpu: 1
        gpuType: NVIDIA-A100-PCIE-40GB-MIG-1g.5gb
        nodes: 1
hfToken:
  value: <Huggingface token>
```

{{ template "chart.requirementsSection" . }}

{{ template "chart.valuesSection" . }}

{{ template "helm-docs.versionFooter" . }}
96 changes: 96 additions & 0 deletions charts/dynamo/templates/_helpers.tpl
@@ -0,0 +1,96 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "dynamo.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
If release name contains chart name it will be used as a full name.
*/}}
{{- define "dynamo.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}

{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "dynamo.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/*
Common labels
*/}}
{{- define "dynamo.labels" -}}
helm.sh/chart: {{ include "dynamo.chart" . }}
{{ include "dynamo.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}

{{/*
Selector labels
*/}}
{{- define "dynamo.selectorLabels" -}}
app.kubernetes.io/name: {{ include "dynamo.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}

{{/*
Create the name of the service account to use
*/}}
{{- define "dynamo.serviceAccountName" -}}
{{- if .Values.serviceAccount.create }}
{{- default (include "dynamo.fullname" .) .Values.serviceAccount.name }}
{{- else }}
{{- default "default" .Values.serviceAccount.name }}
{{- end }}
{{- end }}

{{/*
image-puller labels
*/}}
{{- define "dynamo.image-puller.labels" -}}
helm.sh/chart: {{ include "dynamo.chart" . }}
{{ include "dynamo.image-puller.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}

{{/*
image-puller selector labels
*/}}
{{- define "dynamo.image-puller.selectorLabels" -}}
app.kubernetes.io/name: image-puller
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}

{{/* Image tag helper */}}
{{- define "dynamo.image" -}}
{{- $tag := printf "%s-osc-r%s" (.Values.image.tag | default .Chart.AppVersion) .Values.image.release }}
{{- printf "%s/%s:%s" .Values.global.imagePullSecret.registry .Values.image.repository $tag }}
{{- end }}

{{- define "dynamo.openai.urls" -}}
{{- $urls := list }}
{{- range $name, $model := .Values.global.models }}
{{- $urls = append $urls (printf "http://%s-frontend.%s.svc.cluster.local:8000/v1" $name $.Release.Namespace) }}
{{- end }}
{{- $urls | join ";" }}
{{- end }}
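
The last two helpers, `dynamo.image` and `dynamo.openai.urls`, are easier to review with concrete inputs. The following is a rough Python mirror of their string logic, not the chart's code. The function names and the `dynamo` namespace in the usage lines are hypothetical. The registry, repository, appVersion, and release defaults come from the chart's README and `Chart.yaml`. Model names are sorted because Go templates visit map keys in sorted order when ranging over a map.

```python
from typing import Iterable, Optional


def dynamo_image(
    registry: str,
    repository: str,
    app_version: str,
    tag: Optional[str] = None,
    release: str = "0",
) -> str:
    """Mirror of dynamo.image: <registry>/<repository>:<tag>-osc-r<release>.

    An unset or empty tag falls back to the chart appVersion, like `default`.
    """
    full_tag = f"{tag or app_version}-osc-r{release}"
    return f"{registry}/{repository}:{full_tag}"


def openai_urls(model_names: Iterable[str], namespace: str) -> str:
    """Mirror of dynamo.openai.urls: one frontend URL per model, joined by ';'."""
    return ";".join(
        f"http://{name}-frontend.{namespace}.svc.cluster.local:8000/v1"
        for name in sorted(model_names)
    )


if __name__ == "__main__":
    print(dynamo_image("docker-registry.osc.edu",
                       "kubernetes/ai-dynamo/vllm-runtime", "1.2.0"))
    # -> docker-registry.osc.edu/kubernetes/ai-dynamo/vllm-runtime:1.2.0-osc-r0
    print(openai_urls(["qwen3"], "dynamo"))
    # -> http://qwen3-frontend.dynamo.svc.cluster.local:8000/v1
```

Each model gets its own frontend Service URL. This matches the commit notes that every DGD is self-contained and that `OPENAI_API_BASE_URLS` is set so that multiple URLs can be given.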