5 changes: 4 additions & 1 deletion lmdeploy/turbomind/deploy/parameter.py
@@ -97,7 +97,10 @@ def __call__(self, f, g, i):
         scales = self._get(g, 'scales')
         f(i, scales, 'scales', to_half, apply_gs=['w2'])
         if self.compressed_tensors and not self.has_zero_point:
-            zeros = generate_zero_point(scales)
+            if scales is not None and all(s is not None for s in scales):
+                zeros = generate_zero_point(scales)
+            else:
+                zeros = scales
Comment on lines +100 to +103
Copilot AI Apr 8, 2026

The new branch that skips generate_zero_point when scales (or any element within it) is None isn’t covered by the existing compressed-tensors tests. Please add a unit test that exercises QuantWeightOnly with compressed-tensors keys where weight_scale is a tuple containing None entries (e.g., all None for a missing self_attn layer) and asserts the call does not crash and that zeros is passed through consistently.
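
A minimal sketch of what such a test could look like, written against stand-ins rather than the real lmdeploy objects. `fake_generate_zero_point` and `zeros_for` are hypothetical names introduced only for illustration: the first replaces `generate_zero_point` with a function that fails on `None` entries, and the second copies the guard from this diff so it can be exercised in isolation. Plain floats stand in for the scale tensors. A real test would drive `QuantWeightOnly.__call__` with compressed-tensors keys instead.

```python
# Hypothetical stand-ins: these names and behaviours are assumptions made for
# illustration and are not part of lmdeploy's API.
def fake_generate_zero_point(scales):
    """Stand-in for generate_zero_point: one zero per scale entry.

    Like a function that expects real tensors, it raises TypeError if any
    entry is None.
    """
    return tuple(0 * s for s in scales)


def zeros_for(scales, make_zeros=fake_generate_zero_point):
    """Copy of the guard added in this diff, isolated for testing."""
    if scales is not None and all(s is not None for s in scales):
        return make_zeros(scales)
    return scales


def test_none_scales_pass_through():
    # scales missing entirely
    assert zeros_for(None) is None
    # every entry missing, e.g. a layer without self_attn weights
    all_none = (None, None, None)
    assert zeros_for(all_none) is all_none
    # a single missing entry also skips zero-point generation
    partial = (1.0, None)
    assert zeros_for(partial) is partial


def test_present_scales_generate_zero_points():
    # with every entry present, the generator is called as before
    assert zeros_for((1.0, 2.0)) == (0.0, 0.0)
```

The pass-through assertions use `is` so that the test pins down the exact behaviour of the `else` branch, which hands `scales` back unchanged as `zeros`.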

         else:
             zeros = self._get(g, 'qzeros')
         f(i, zeros, 'zeros', to_half, apply_gs=['w2'])