Remove truncation_mode from DPO #5372
albertvillanova wants to merge 4 commits into huggingface:main
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
```python
rejected_mask = [[0] * len(example["prompt_ids"]) + [1] * len(example["rejected_ids"]) for example in examples]

if self.max_length is not None:
    if self.truncation_mode == "keep_start":
```
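The two truncation modes discussed in this thread can be illustrated with a minimal standalone sketch (not TRL's actual collator code; the helper name and signature are made up for illustration):

```python
# Illustrative sketch of the "keep_start" vs "keep_end" truncation modes.
# This is NOT TRL's collator implementation, just a demonstration of the
# two behaviors the PR proposes to collapse into one.

def truncate(ids, max_length, mode="keep_start"):
    """Truncate a token-id list to at most max_length tokens.

    mode="keep_start": keep the beginning, drop the end (the behavior
    the PR standardizes on). mode="keep_end": keep the end instead.
    """
    if max_length is None or len(ids) <= max_length:
        return ids
    if mode == "keep_start":
        return ids[:max_length]
    return ids[-max_length:]

ids = [1, 2, 3, 4, 5, 6]
assert truncate(ids, 4, "keep_start") == [1, 2, 3, 4]
assert truncate(ids, 4, "keep_end") == [3, 4, 5, 6]
assert truncate(ids, None) == ids  # no truncation when max_length is None
```

After this PR, only the `keep_start` branch remains for DPO, which is why the `self.truncation_mode` check above can be deleted.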
Truncation mode removal not propagated to SFT trainer
Low Severity
The truncation_mode removal from DataCollatorForPreference and DPOConfig was not propagated to the SFT trainer, which has the identical pattern in DataCollatorForLanguageModeling (including the same keep_start/keep_end if/elif/else block) and SFTConfig. The VLM keep_end guard is also duplicated verbatim in SFTTrainer.__init__. This creates an inconsistency between the two main-code trainers.
Additional Locations (1)
Triggered by project rule: BUGBOT.md
As commented in the description, if there is agreement, I'd be happy to follow up and apply the same change across other trainers for consistency.
I think removing the truncation-side option would be reasonable. I checked a few other repos that implement DPO/SFT, and the implementation landscape seems fairly consistent: for SFT the case is a bit less clear, but for DPO in particular, left truncation seems unusual. If it is supported, it is generally opt-in behavior rather than the standard path. We could also check papers, but that would likely take quite a bit more time for what is probably the same conclusion. cc @kashif if you have some insights
Not for this PR, but we might want to re-add this option in the future to enable dropping overlong examples, which I think makes more sense.
I don't want to rush this before v1; I recommend keeping it.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 2 total unresolved issues (including 1 from previous review).
Reviewed by Cursor Bugbot for commit 4bf9351. Configure here.
```diff
-        Truncation mode to use when the sequence exceeds `max_length`. Possible values are `"keep_end"` and
-        `"keep_start"`.
+        Maximum length of the tokenized sequence. Sequences longer than `max_length` are truncated from the end.
+        If `None`, no truncation is applied.
```
Inconsistent truncation_mode removal across trainers violates consistency rule
Medium Severity
The truncation_mode parameter and its associated logic (config field, collator field, truncation branching, VLM guard) are duplicated across the DPO and SFT trainers. This PR removes truncation_mode from DPOConfig, DataCollatorForPreference, and DPOTrainer, but the identical pattern remains in SFTConfig, DataCollatorForLanguageModeling, and SFTTrainer. Per the project's AGENTS.md consistency rules, changes to duplicated logic across trainers must be propagated to all copies.
Additional Locations (1)
Triggered by project rule: ../.ai/AGENTS.md
We may want to have a "drop" truncation mode in the future, i.e. drop overlong samples instead of truncating them. Maybe keep this arg (and avoid a breaking change) and just deprecate.

OK, I agree that a "drop" truncation mode will be useful. So let's keep it.
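The proposed "drop" mode could look something like the following sketch. This is illustrative only: the function name is hypothetical, and the example dicts mirror the `prompt_ids`/`chosen_ids` field names seen in the collator, but this is not TRL's API.

```python
# Hypothetical "drop" truncation mode: instead of truncating overlong
# examples to max_length, filter them out of the batch entirely.
# Field names (prompt_ids, chosen_ids) are illustrative, not TRL's API.

def drop_overlong(examples, max_length):
    kept = []
    for ex in examples:
        total = len(ex["prompt_ids"]) + len(ex["chosen_ids"])
        if total <= max_length:
            kept.append(ex)
    return kept

examples = [
    {"prompt_ids": [1, 2], "chosen_ids": [3, 4]},          # length 4: kept
    {"prompt_ids": [1, 2, 3], "chosen_ids": [4, 5, 6, 7]},  # length 7: dropped
]
assert len(drop_overlong(examples, max_length=5)) == 1
```

One design consideration with dropping rather than truncating is that batch sizes become data-dependent, so in practice the filtering would likely happen at dataset-preprocessing time rather than in the collator.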


Remove `truncation_mode` from DPO.

This PR simplifies and standardizes the truncation logic for tokenized sequences in the DPO Trainer by removing the `truncation_mode` option. All truncation now consistently keeps the start of the sequence and truncates from the end, eliminating ambiguity and reducing configuration complexity. Documentation and code are updated accordingly.

Motivation
I'd like to propose a potential simplification: removing the `truncation_mode` option altogether.

The motivation is to reduce complexity and avoid supporting behaviors that are either rarely used or potentially undesirable in practice. Before going further, I wanted to get feedback on whether this kind of simplification would be acceptable, and whether dropping this functionality would be considered reasonable.
This is just an initial proposal for discussion. If there is agreement, I'd be happy to follow up and apply the same change across other trainers for consistency. Otherwise, no worries at all: feel free to disregard this suggestion and close the PR.
CC: @qgallouedec
Changes
Truncation logic simplification:

- Removed the `truncation_mode` parameter and all related logic from `DPOConfig`, `DataCollatorForPreference`, and `DPOTrainer`, so sequences longer than `max_length` are always truncated from the end.

Documentation updates:

- Removed references to `truncation_mode` and clarified that truncation always happens from the end.

Note
Medium Risk
Removes a public configuration option and changes truncation behavior for DPO runs that previously relied on `keep_end`, which could alter training inputs and results. Risk is contained to DPO preprocessing/collation and is covered by updating/removing the associated regression test.

Overview
DPO no longer supports configurable truncation behavior. `truncation_mode` is removed from `DPOConfig`, `DataCollatorForPreference`, and the `DPOTrainer` wiring; when `max_length` is set, sequences are now always truncated by keeping the start and cutting off the end.

Documentation/help text is updated to reflect the fixed truncation behavior, the VLM-specific `keep_end` validation in `DPOTrainer` is deleted, and the corresponding test that asserted this error is removed.