Commit f1c0755

douhashi and claude committed
Remove unsupported --chat-template-kwargs flag
The current vllm/vllm-openai:gemma4 image does not support this flag. Disabling thinking will become possible once the image is updated with --default-chat-template-kwargs from vllm-project/vllm#39027.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 2a4bd0d commit f1c0755
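As a hedged sketch of the follow-up the commit message anticipates: once the image update from vllm-project/vllm#39027 lands, the script could append the new flag to its comma-separated command string. The short `VLLM_CMD` value here is a toy placeholder, not the real one from the script, and the flag's availability is an assumption pending that update.

```shell
# Toy stand-in for the real VLLM_CMD built in scripts/deploy-runpod.sh.
VLLM_CMD="model,--host,0.0.0.0,--port,8000"

# Assumed future addition (not supported by the current gemma4 image):
# disable thinking by default via --default-chat-template-kwargs.
VLLM_CMD="${VLLM_CMD},--default-chat-template-kwargs,{\"enable_thinking\": false}"

echo "${VLLM_CMD}"
```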

File tree

1 file changed: +1 -1 lines changed


scripts/deploy-runpod.sh

Lines changed: 1 addition & 1 deletion

@@ -86,7 +86,7 @@ done
 # --default-chat-template-kwargs '{"enable_thinking": true}'
 # --chat-template examples/tool_chat_template_gemma4.jinja
 # Related: vllm-project/vllm#38855, block/goose#6192
-VLLM_CMD="${MODEL_NAME},--served-model-name,${MODEL_NAME},gpt-4o-mini,--max-model-len,${MAX_MODEL_LENGTH},--gpu-memory-utilization,${GPU_MEMORY_UTILIZATION},--dtype,${DTYPE},--api-key,${VLLM_API_KEY},--enable-auto-tool-choice,--tool-call-parser,gemma4,--reasoning-parser,gemma4,--chat-template-kwargs,enable_thinking=false,--host,0.0.0.0,--port,8000"
+VLLM_CMD="${MODEL_NAME},--served-model-name,${MODEL_NAME},gpt-4o-mini,--max-model-len,${MAX_MODEL_LENGTH},--gpu-memory-utilization,${GPU_MEMORY_UTILIZATION},--dtype,${DTYPE},--api-key,${VLLM_API_KEY},--enable-auto-tool-choice,--tool-call-parser,gemma4,--reasoning-parser,gemma4,--host,0.0.0.0,--port,8000"

 # ===== Create Template =====
 echo "==> Creating template: ${TEMPLATE_NAME}"
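For readers unfamiliar with the comma-separated form of VLLM_CMD above, a minimal sketch of how such a string splits into individual container arguments, presumably the form the RunPod template consumes. The model name `my-model` is a placeholder, not a value from the script:

```shell
# Toy comma-separated command string, mirroring the shape of VLLM_CMD.
VLLM_CMD="my-model,--served-model-name,my-model,--host,0.0.0.0,--port,8000"

# Split on commas into an array of individual arguments.
IFS=',' read -r -a ARGS <<< "${VLLM_CMD}"

# Print one argument per line to inspect the result.
printf '%s\n' "${ARGS[@]}"
```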

0 commit comments

Comments
 (0)