Pull request overview
This PR refactors the decoding-loop “step state” management by moving prefill/decode update logic into paradigm-specific StepInputs subclasses, removing the previous cross-strategy delegation methods.
Changes:
- Introduces a new `StepInputs` ABC and adds paradigm implementations for AR, DLLM, and AR-Spec.
- Simplifies strategy ABCs by removing now-unneeded “delta/merge/step” update hooks, and updates the model agent loop to call `StepInputs` directly.
- Moves AR helper logic into `ar/model_inputs.py` and updates DLLM/AR-Spec code to align with the new lifecycle (`merge_prefill`, `reindex`, `step_decode`).
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| lmdeploy/pytorch/strategies/base/step_inputs.py | Adds the new StepInputs base class and defines the decoding-loop lifecycle contract. |
| lmdeploy/pytorch/strategies/ar/step_inputs.py | Implements AR-specific step-state transitions and sampling-delta handling. |
| lmdeploy/pytorch/strategies/dllm/step_inputs.py | Implements DLLM-specific step-state transitions including mask-aware sampling delta updates. |
| lmdeploy/pytorch/strategies/ar_spec/step_inputs.py | Implements AR-Spec-specific step-state transitions including draft-token and mRoPE handling. |
| lmdeploy/pytorch/engine/model_agent/agent.py | Replaces the old inline StepInputs with strategy_factory.build_step_inputs() and calls into the new API. |
| lmdeploy/pytorch/strategies/base/__init__.py | Extends StrategyFactoryBase with build_step_inputs(). |
| lmdeploy/pytorch/strategies/base/model_agent.py | Removes obsolete ABC hooks (update_*_for_next_step, update_extra_inputs) now owned by StepInputs. |
| lmdeploy/pytorch/strategies/base/model_inputs.py | Removes obsolete ABC hooks (merge, update_inputs) now owned by StepInputs. |
| lmdeploy/pytorch/strategies/base/sampling.py | Removes obsolete ABC hooks (merge/step/update_sampling_delta) now owned by StepInputs. |
| lmdeploy/pytorch/strategies/ar/model_inputs.py | Promotes AR decoding helpers (get_model_inputs_next_decoding, index_select_model_inputs) to module-level functions. |
| lmdeploy/pytorch/strategies/ar/model_agent.py | Removes AR “next step” update methods now implemented in ARStepInputs. |
| lmdeploy/pytorch/strategies/ar/sampling.py | Removes delta/merge/step methods now implemented in ARStepInputs. |
| lmdeploy/pytorch/strategies/ar/__init__.py | Adds build_step_inputs() factory method for AR. |
| lmdeploy/pytorch/strategies/dllm/model_agent.py | Removes DLLM “next step” update methods now implemented in DLLMStepInputs. |
| lmdeploy/pytorch/strategies/dllm/model_inputs.py | Removes DLLM merge/update methods now implemented in DLLMStepInputs. |
| lmdeploy/pytorch/strategies/dllm/sampling.py | Removes delta/merge/step methods now implemented in DLLMStepInputs; aligns repeated attrs with new ngram field names. |
| lmdeploy/pytorch/strategies/dllm/__init__.py | Adds build_step_inputs() factory method for DLLM. |
| lmdeploy/pytorch/strategies/ar_spec/model_agent.py | Removes AR-Spec “next step” update methods now implemented in ARSpecStepInputs. |
| lmdeploy/pytorch/strategies/ar_spec/model_inputs.py | Removes AR-Spec merge/update methods now implemented in ARSpecStepInputs. |
| lmdeploy/pytorch/strategies/ar_spec/__init__.py | Adds build_step_inputs() factory method for AR-Spec. |
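The `build_step_inputs()` rows in the table describe a factory hook: each paradigm's strategy factory returns its own step-state object, so the agent loop never names a concrete class. A minimal sketch of that pattern follows. Every class and function name here is hypothetical and only mirrors the shape described in the table; the constructor arguments of the real factories are not shown in this PR summary and are omitted.

```python
from abc import ABC, abstractmethod


class StepStateSketch:
    """Placeholder for a paradigm-specific step-state object (hypothetical)."""

    paradigm = "base"


class ARStepStateSketch(StepStateSketch):
    paradigm = "ar"


class DLLMStepStateSketch(StepStateSketch):
    paradigm = "dllm"


class StrategyFactorySketch(ABC):
    """Hypothetical analogue of StrategyFactoryBase; only the new hook is shown."""

    @abstractmethod
    def build_step_inputs(self) -> StepStateSketch:
        ...


class ARFactorySketch(StrategyFactorySketch):
    def build_step_inputs(self) -> StepStateSketch:
        return ARStepStateSketch()


class DLLMFactorySketch(StrategyFactorySketch):
    def build_step_inputs(self) -> StepStateSketch:
        return DLLMStepStateSketch()


def make_step_state(strategy_factory: StrategyFactorySketch) -> StepStateSketch:
    # The caller only sees the factory, never a concrete paradigm class.
    return strategy_factory.build_step_inputs()


ar_state = make_step_state(ARFactorySketch())
dllm_state = make_step_state(DLLMFactorySketch())
```

With this shape, adding a paradigm means adding one state class and one factory override, with no change to the calling loop.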
- Paradigm-specific `StepInputs` subclasses now own the update logic, replacing the old `StepInputs` in `agent.py`, which delegated to 8 scattered methods across 3 strategy classes.
- Removed from the strategy ABCs: `update_prefill_for_next_step`, `update_decoding_for_next_step`, `update_extra_inputs` from `ModelAgentStrategy`; `update_inputs`, `merge` from `ModelInputsStrategy`; `merge_sampling_delta`, `step_sampling_delta`, `update_sampling_delta` from `SamplingStrategy`. All were called exclusively from `StepInputs`.
- Renamed lifecycle methods: `merge` → `merge_prefill`, `update_delta` → `reindex`, `step` → `step_decode`.

Motivation
Before this change, understanding the update lifecycle for a single paradigm (e.g., AR) required reading 3 files (`ar/model_agent.py`, `ar/model_inputs.py`, `ar/sampling.py`). Adding a new field to `ModelInputs` meant updating methods in all 3 files. Now each paradigm's update logic lives in a single `step_inputs.py` file.

What changed
New files (4):
- `strategies/base/step_inputs.py`: abstract `StepInputs` base class with lifecycle docs
- `strategies/ar/step_inputs.py`: `ARStepInputs` with all AR update logic
- `strategies/dllm/step_inputs.py`: `DLLMStepInputs` with all DLLM update logic
- `strategies/ar_spec/step_inputs.py`: `ARSpecStepInputs` with all AR Spec update logic

Key modifications:
- `engine/model_agent/agent.py`: removed the old `StepInputs` class (~85 lines); construction now uses `strategy_factory.build_step_inputs()`
- `strategies/base/`: removed 8 abstract methods from the 3 strategy ABCs
- `strategies/{ar,dllm,ar_spec}/`: removed inlined methods from strategy implementations, added `build_step_inputs()` factory methods
- `ar/model_inputs.py`: `index_select` promoted to standalone `index_select_model_inputs()`; `get_model_inputs_next_decoding` moved here from `ar/model_agent.py`

Test
qwen3.5-35b-a3b gpqa
qwen3.5-35b-a3b MTP aime2025
qwen3.5-35b-a3b ruler
Ran on a selection of 64 samples; results are the same as on the main branch.