Commit 2d03817
committed
refactor: introduce QuantPolicy enum and consolidate TurboQuant utilities
- Add QuantPolicy IntEnum to replace magic numbers (0, 4, 8, 42) for KV cache
quantization policies: NONE, INT4, INT8, TURBO_QUANT
- Update TurbomindEngineConfig and PytorchEngineConfig to use QuantPolicy type
- Extract TurboQuant utilities (Hadamard rotation, Lloyd-Max codebook) from
fill_kv_cache.py into new dedicated module turbo_quant.py
- Rename butterfly_rotate/butterfly_rotate_inv to hadamard_rotate/
hadamard_rotate_inv for naming accuracy (the transform uses Hadamard matrix)
- Update all call sites across attention kernels, cache engine, and tests
- Update test fixtures and assertions to use QuantPolicy constants
This improves type safety, code readability, and maintains backward
compatibility through enum integer values matching previous magic numbers.1 parent bbde920 commit 2d03817
File tree
17 files changed
+362
-301
lines changed- lmdeploy
- pytorch
- backends
- cuda/attention
- engine
- kernels/cuda
- tests
- pytorch/kernel
- test_lmdeploy
17 files changed
+362
-301
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
16 | 16 | | |
17 | 17 | | |
18 | 18 | | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
19 | 27 | | |
20 | 28 | | |
21 | 29 | | |
| |||
298 | 306 | | |
299 | 307 | | |
300 | 308 | | |
301 | | - | |
| 309 | + | |
| 310 | + | |
302 | 311 | | |
303 | 312 | | |
304 | 313 | | |
| |||
403 | 412 | | |
404 | 413 | | |
405 | 414 | | |
406 | | - | |
| 415 | + | |
407 | 416 | | |
408 | 417 | | |
409 | 418 | | |
| |||
440 | 449 | | |
441 | 450 | | |
442 | 451 | | |
443 | | - | |
| 452 | + | |
| 453 | + | |
444 | 454 | | |
445 | 455 | | |
446 | 456 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2 | 2 | | |
3 | 3 | | |
4 | 4 | | |
5 | | - | |
| 5 | + | |
6 | 6 | | |
7 | 7 | | |
8 | 8 | | |
| 9 | + | |
| 10 | + | |
9 | 11 | | |
10 | 12 | | |
11 | 13 | | |
| |||
18 | 20 | | |
19 | 21 | | |
20 | 22 | | |
21 | | - | |
| 23 | + | |
22 | 24 | | |
23 | 25 | | |
24 | 26 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | 2 | | |
3 | | - | |
4 | 3 | | |
5 | 4 | | |
6 | 5 | | |
| 6 | + | |
7 | 7 | | |
8 | 8 | | |
9 | 9 | | |
| |||
40 | 40 | | |
41 | 41 | | |
42 | 42 | | |
43 | | - | |
| 43 | + | |
44 | 44 | | |
45 | 45 | | |
46 | 46 | | |
| |||
279 | 279 | | |
280 | 280 | | |
281 | 281 | | |
282 | | - | |
| 282 | + | |
283 | 283 | | |
284 | | - | |
285 | | - | |
286 | | - | |
287 | | - | |
| 284 | + | |
| 285 | + | |
| 286 | + | |
| 287 | + | |
288 | 288 | | |
289 | 289 | | |
290 | | - | |
| 290 | + | |
291 | 291 | | |
292 | 292 | | |
293 | 293 | | |
| |||
309 | 309 | | |
310 | 310 | | |
311 | 311 | | |
312 | | - | |
313 | | - | |
| 312 | + | |
| 313 | + | |
314 | 314 | | |
315 | 315 | | |
316 | 316 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
4 | 4 | | |
5 | 5 | | |
6 | 6 | | |
| 7 | + | |
7 | 8 | | |
8 | 9 | | |
9 | 10 | | |
| |||
405 | 406 | | |
406 | 407 | | |
407 | 408 | | |
408 | | - | |
| 409 | + | |
409 | 410 | | |
410 | 411 | | |
411 | 412 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2 | 2 | | |
3 | 3 | | |
4 | 4 | | |
5 | | - | |
| 5 | + | |
6 | 6 | | |
7 | 7 | | |
8 | 8 | | |
9 | | - | |
| 9 | + | |
10 | 10 | | |
11 | 11 | | |
12 | 12 | | |
| |||
98 | 98 | | |
99 | 99 | | |
100 | 100 | | |
101 | | - | |
| 101 | + | |
102 | 102 | | |
103 | 103 | | |
104 | 104 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
4 | 4 | | |
5 | 5 | | |
6 | 6 | | |
7 | | - | |
8 | 7 | | |
9 | 8 | | |
10 | 9 | | |
| |||
20 | 19 | | |
21 | 20 | | |
22 | 21 | | |
| 22 | + | |
23 | 23 | | |
24 | 24 | | |
25 | 25 | | |
| |||
140 | 140 | | |
141 | 141 | | |
142 | 142 | | |
143 | | - | |
| 143 | + | |
144 | 144 | | |
145 | 145 | | |
146 | 146 | | |
| |||
155 | 155 | | |
156 | 156 | | |
157 | 157 | | |
158 | | - | |
| 158 | + | |
| 159 | + | |
159 | 160 | | |
160 | 161 | | |
161 | 162 | | |
| |||
167 | 168 | | |
168 | 169 | | |
169 | 170 | | |
170 | | - | |
| 171 | + | |
171 | 172 | | |
172 | 173 | | |
173 | 174 | | |
| |||
183 | 184 | | |
184 | 185 | | |
185 | 186 | | |
186 | | - | |
| 187 | + | |
187 | 188 | | |
188 | 189 | | |
189 | 190 | | |
190 | | - | |
| 191 | + | |
191 | 192 | | |
192 | 193 | | |
193 | 194 | | |
| |||
209 | 210 | | |
210 | 211 | | |
211 | 212 | | |
212 | | - | |
| 213 | + | |
213 | 214 | | |
214 | 215 | | |
215 | 216 | | |
| |||
228 | 229 | | |
229 | 230 | | |
230 | 231 | | |
231 | | - | |
| 232 | + | |
232 | 233 | | |
233 | 234 | | |
234 | 235 | | |
235 | 236 | | |
236 | 237 | | |
237 | 238 | | |
238 | 239 | | |
239 | | - | |
| 240 | + | |
240 | 241 | | |
241 | 242 | | |
242 | 243 | | |
243 | | - | |
| 244 | + | |
244 | 245 | | |
245 | | - | |
| 246 | + | |
246 | 247 | | |
247 | 248 | | |
248 | 249 | | |
| |||
0 commit comments