llm-d incubation
Incubating components of llm-d, a Kubernetes-native high-performance distributed LLM inference framework
Popular repositories
- llm-d-modelservice Public
Helm charts for deploying models with llm-d
- llm-d-fast-model-actuation Public
Kubernetes controllers for fast model actuation using vLLM sleep/wake and launcher-based model swapping
- batch-gateway Public
The batch gateway is an llm-d implementation of the OpenAI batch inference API
- py-inference-scheduler Public
Python-based inference scheduler for reinforcement learning
Repositories
Showing 10 of 14 repositories
- llm-d-fast-model-actuation Public
Kubernetes controllers for fast model actuation using vLLM sleep/wake and launcher-based model swapping
- llm-d-planner Public
- weight-propagation-interface Public
- llm-d-skills Public
- hermes Public
Hermes is a cluster configuration scanning and self-test generation tool for llm-d inference workloads