
Remove truncation_mode from DPO#5372

Closed

albertvillanova wants to merge 4 commits into huggingface:main from albertvillanova:rm-truncation-mode-dpo

Conversation

@albertvillanova
Member

@albertvillanova albertvillanova commented Mar 25, 2026

Remove truncation_mode from DPO.

This PR simplifies and standardizes the truncation logic for tokenized sequences in the DPO Trainer by removing the truncation_mode option. All truncation now consistently keeps the start of the sequence and truncates from the end, eliminating ambiguity and reducing configuration complexity. Documentation and code are updated accordingly.

Motivation

I'd like to propose a potential simplification: removing the truncation_mode option altogether.

The motivation is to reduce complexity and avoid supporting behaviors that are either rarely used or potentially undesirable in practice. Before going further, I wanted to get feedback on whether this kind of simplification would be acceptable, and whether dropping this functionality would be considered reasonable.

This is just an initial proposal for discussion. If there is agreement, I'd be happy to follow up and apply the same change across other trainers for consistency. Otherwise, no worries at all: feel free to disregard this suggestion and close the PR.

CC: @qgallouedec

Changes

Truncation logic simplification:

  • Removed the truncation_mode parameter and all related logic from DPOConfig, DataCollatorForPreference, and DPOTrainer, so sequences longer than max_length are always truncated from the end.

Documentation updates:

  • Updated docstrings and help texts to reflect the new, simplified truncation behavior, removing references to truncation_mode and clarifying that truncation always happens from the end.
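A minimal sketch of the simplified behavior (illustrative only, not TRL's actual collator code; the helper name `truncate_keep_start` is hypothetical):

```python
# Illustrative sketch of the simplified truncation: with truncation_mode
# removed, sequences longer than max_length always keep the start of the
# sequence and lose the end.
def truncate_keep_start(ids, max_length):
    if max_length is None:
        return ids           # no truncation configured
    return ids[:max_length]  # keep the first max_length tokens, drop the tail

prompt_plus_completion = list(range(10))
print(truncate_keep_start(prompt_plus_completion, 6))  # [0, 1, 2, 3, 4, 5]
```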

Note

Medium Risk
Removes a public configuration option and changes truncation behavior for DPO runs that previously relied on keep_end, which could alter training inputs and results. Risk is contained to DPO preprocessing/collation and is covered by updating/removing the associated regression test.

Overview
DPO no longer supports configurable truncation behavior. truncation_mode is removed from DPOConfig, DataCollatorForPreference, and the DPOTrainer wiring; when max_length is set, sequences are now always truncated by keeping the start and cutting off the end.

Documentation/help text is updated to reflect the fixed truncation behavior, the VLM-specific keep_end validation in DPOTrainer is deleted, and the corresponding test that asserted this error is removed.

Reviewed by Cursor Bugbot for commit 4bf9351. Bugbot is set up for automated code reviews on this repo.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

rejected_mask = [[0] * len(example["prompt_ids"]) + [1] * len(example["rejected_ids"]) for example in examples]

if self.max_length is not None:
    if self.truncation_mode == "keep_start":
Truncation mode removal not propagated to SFT trainer

Low Severity

The truncation_mode removal from DataCollatorForPreference and DPOConfig was not propagated to the SFT trainer, which has the identical pattern in DataCollatorForLanguageModeling (including the same keep_start/keep_end if/elif/else block) and SFTConfig. The VLM keep_end guard is also duplicated verbatim in SFTTrainer.__init__. This creates an inconsistency between the two trainers.


Member Author

As commented in the description, if there is agreement, I'd be happy to follow up and apply the same change across other trainers for consistency.

@qgallouedec qgallouedec requested review from kashif March 25, 2026 15:33
@qgallouedec
Member

qgallouedec commented Mar 25, 2026

I think removing the truncation-side option would be reasonable.

I checked a few other repos that implement DPO/SFT, and keep_end-style behavior looks more like an exception than the norm:

  • torchtune: it supports the equivalent of keep_end via truncation_type="left", but the default is still "right": link
  • OpenRLHF: I did not find any keep_end-style option for DPO/SFT. It relies on the tokenizer truncation path. Interestingly, after truncation it forces the last token to EOS: link
  • LLaMA-Factory: its preprocessing mostly preserves the beginning of the sequence, not the end: link
  • Unsloth: it mostly delegates this behavior to trl, so it is not really an independent reference point
  • Axolotl: I did not find an explicit keep_end mode either; for DPO it mostly inherits TRL behavior and drops overlong pairs

So the implementation landscape seems fairly consistent: keep_end can exist as an explicit option, but it is rarely the default, and in several repos it is not exposed at all.

For SFT the case is a bit less clear, but for DPO in particular, left truncation seems unusual. If it is supported, it is generally an opt-in behavior rather than the standard path.
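The difference between the two modes can be sketched with a hypothetical helper (not taken from any of the repos above):

```python
# Hypothetical helper contrasting the two modes. "keep_start" (right
# truncation) preserves the beginning of the sequence; "keep_end" (left
# truncation) preserves the final tokens.
def truncate(ids, max_length, mode="keep_start"):
    if len(ids) <= max_length:
        return ids
    if mode == "keep_start":
        return ids[:max_length]   # drop tokens from the end
    return ids[-max_length:]      # "keep_end": drop tokens from the start

ids = [101, 102, 103, 104, 105]
print(truncate(ids, 3, "keep_start"))  # [101, 102, 103]
print(truncate(ids, 3, "keep_end"))    # [103, 104, 105]
```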

We could also check papers, but that would likely take quite a bit more time for what is probably the same conclusion.

cc @kashif if you have some insights

@qgallouedec
Member

qgallouedec commented Mar 25, 2026

Not for this PR, but we might want to re-add this option in the future to enable dropping overlong examples, which I think makes more sense.
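The suggested "drop" mode could, hypothetically, look something like the following sketch (the helper name, the `input_ids` field, and the batch shape are all assumptions for illustration):

```python
# Hypothetical illustration of a future "drop" truncation mode: instead of
# truncating, overlong examples are filtered out of the batch entirely.
def drop_overlong(examples, max_length):
    return [ex for ex in examples if len(ex["input_ids"]) <= max_length]

batch = [{"input_ids": [1, 2, 3]}, {"input_ids": [1, 2, 3, 4, 5, 6]}]
print(drop_overlong(batch, 4))  # [{'input_ids': [1, 2, 3]}]
```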

@qgallouedec
Member

I don't want to rush this before v1. I recommend keeping truncation_mode and deciding later whether we should stop supporting "keep_end".

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).



Removed docstring:

    Truncation mode to use when the sequence exceeds `max_length`. Possible values are `"keep_end"` and `"keep_start"`.

Updated docstring:

    Maximum length of the tokenized sequence. Sequences longer than `max_length` are truncated from the end. If `None`, no truncation is applied.

Inconsistent truncation_mode removal across trainers violates consistency rule

Medium Severity

The truncation_mode parameter and its associated logic (config field, collator field, truncation branching, VLM guard) are duplicated across the DPO and SFT trainers. This PR removes truncation_mode from DPOConfig, DataCollatorForPreference, and DPOTrainer, but the identical pattern remains in SFTConfig, DataCollatorForLanguageModeling, and SFTTrainer. Per the project's AGENTS.md consistency rules, changes to duplicated logic across trainers must be propagated to all copies.



@qgallouedec
Member

qgallouedec commented Apr 6, 2026

We may want to have a "drop" truncation mode in the future, i.e., drop overlong samples instead of truncating them. Maybe keep this arg (and avoid a breaking change) and just deprecate "keep_end"?

@albertvillanova
Member Author

OK, I agree that a "drop" truncation mode would be useful. So let's keep truncation_mode and address the rest of the changes in separate PRs. I'm working on this.
