Pull requests: InternLM/lmdeploy
[ascend] remove t().contiguous() in updating moe weights
#4516
opened Apr 10, 2026 by
wanfengcxz
Contributor
Draft
[ci] change test whl into python 312 and use test images
#4513
opened Apr 9, 2026 by
zhulinJulia24
Collaborator
add explicit trust_remote_code controls to resolve the security issue
#4511
opened Apr 8, 2026 by
lvhan028
Collaborator
feat: Add TurboQuant (quant_policy=42) support for KV Cache Quantization
#4510
opened Apr 8, 2026 by
windreamer
Collaborator
make fp8 model quantized by llm-compressor can be inferenced in turbomind
enhancement
New feature or request
#4509
opened Apr 8, 2026 by
43758726
Collaborator
[Fix]: Handle None scales in generate_zero_point for mixed-format layers
improvement
#4505
opened Apr 7, 2026 by
lingyezhixing
fix: handle missing KV cache without crashing engine
Bug:P0
#4497
opened Apr 4, 2026 by
lvhan028
Collaborator
feat(turbomind): integrate cublasGemmGroupedBatchedEx for Qwen3.5 MoE inference on Blackwell GPUs with memory copy optimizations
enhancement
New feature or request
#4490
opened Apr 3, 2026 by
hd9568
fix lite module for transformers>=5.0
improvement
#4488
opened Apr 2, 2026 by
43758726
Collaborator
Integrate deep-ep nccl backend
enhancement
New feature or request
#4477
opened Mar 27, 2026 by
irexyc
Collaborator
[refactor] [api_server] [1/N] Improve reasoning and tool-call parsers
improvement
#4468
opened Mar 26, 2026 by
lvhan028
Collaborator
feat: Turbomind linear gdn prefix caching
enhancement
New feature or request
#4465
opened Mar 25, 2026 by
lapy
Contributor
feat: implement Turbomind vision encoder support for Qwen3VL/3.5 families
enhancement
New feature or request
#4460
opened Mar 24, 2026 by
lapy
Contributor
[Feature] Support n parameter in /v1/chat/completions and /v1/completions
improvement
#4419
opened Mar 17, 2026 by
ziyangliu-666
Fix Structured Output for GPT-OSS Models
#4386
opened Mar 2, 2026 by
windreamer
Collaborator