huggingface / trl Public

Fork 2.6k
Star 17.9k

Pull requests: huggingface/trl

Labels 37 Milestones 0

122 Open 2,895 Closed

Pull requests list

Add GPT-OSS tool calling support

#5464 opened Apr 6, 2026 by qgallouedec

Add GLM-4-MoE tool calling support

#5463 opened Apr 6, 2026 by qgallouedec

Add supports_tool_calling utility and validate tool support at init

#5462 opened Apr 6, 2026 by qgallouedec

GOLDTrainer VLM support

#5461 opened Apr 6, 2026 by Strongich

4 of 8 tasks

Move chat templates from inline strings to .jinja files

#5459 opened Apr 5, 2026 by qgallouedec

Narrow prefix-preserving check to the actual requirement

#5458 opened Apr 5, 2026 by qgallouedec

[docs] Clarify dtype defaults between trf v5 and TRL

#5457 opened Apr 4, 2026 by casinca

2 of 4 tasks

fix _get_per_token_logps_and_entropies return type

#5456 opened Apr 4, 2026 by kashif

2 of 8 tasks

Gemma 4 support

#5453 opened Apr 4, 2026 by qgallouedec

[AsyncGRPO] Support async tool calls in AsyncRolloutWorker

#5446 opened Apr 3, 2026 by PoilZero

5 of 8 tasks

Simplify _get_tool_suffix_ids

#5440 opened Apr 2, 2026 by qgallouedec

FIPO loss

#5434 opened Apr 2, 2026 by kdubovikov

4 of 8 tasks

feat(async-grpo): add sampling parameter parity

#5418 opened Mar 31, 2026 by kdubovikov

4 of 8 tasks

Delta weight sync using Xet buckets

#5417 opened Mar 31, 2026 by AmineDiro • Draft

8 tasks

fix(async-grpo): honor model init dtype

#5416 opened Mar 31, 2026 by kdubovikov

3 of 8 tasks

Skip redundant forward pass for on-policy vLLM importance sampling

#5413 opened Mar 31, 2026 by GJ98

3 of 8 tasks

add JEPO trainer

#5411 opened Mar 31, 2026 by zbills

3 of 7 tasks

Add log_multimodal param to GRPOConfig and RLOOConfig to control image logging

#5408 opened Mar 30, 2026 by apardyl

3 of 8 tasks

Add DistillationTrainer for efficient on-policy distillation

#5407 opened Mar 30, 2026 by cmpatino

3 of 5 tasks

Add length-normalized sigmoid loss type to DPO trainer

#5406 opened Mar 30, 2026 by BrownianNotion

5 of 8 tasks

Add per-sample tool filtering to GRPOTrainer via tools column

#5398 opened Mar 27, 2026 by lailanelkoussy

3 tasks done

Add tool calling support to RLOOTrainer

#5395 opened Mar 27, 2026 by qgallouedec

feat(grpo): add stop_tool_names for immediate agent loop termination

#5390 opened Mar 27, 2026 by lailanelkoussy

Fix DAPO token-level loss to use prompt-level aggregation

#5381 opened Mar 26, 2026 by matdou

2 of 5 tasks

Remove truncation_mode from DPO

#5372 opened Mar 25, 2026 by albertvillanova
