Pull requests: huggingface/trl
Add supports_tool_calling utility and validate tool support at init
#5462
opened Apr 6, 2026 by
qgallouedec
Move chat templates from inline strings to .jinja files
#5459
opened Apr 5, 2026 by
qgallouedec
Narrow prefix-preserving check to the actual requirement
#5458
opened Apr 5, 2026 by
qgallouedec
[docs] Clarify dtype defaults between trf v5 and TRL
#5457
opened Apr 4, 2026 by
casinca
2 of 4 tasks
fix _get_per_token_logps_and_entropies return type
#5456
opened Apr 4, 2026 by
kashif
2 of 8 tasks
[AsyncGRPO] Support async tool calls in AsyncRolloutWorker
#5446
opened Apr 3, 2026 by
PoilZero
5 of 8 tasks
feat(async-grpo): add sampling parameter parity
#5418
opened Mar 31, 2026 by
kdubovikov
4 of 8 tasks
fix(async-grpo): honor model init dtype
#5416
opened Mar 31, 2026 by
kdubovikov
3 of 8 tasks
Skip redundant forward pass for on-policy vLLM importance sampling
#5413
opened Mar 31, 2026 by
GJ98
3 of 8 tasks
Add log_multimodal param to GRPOConfig and RLOOConfig to control image logging
#5408
opened Mar 30, 2026 by
apardyl
3 of 8 tasks
Add DistillationTrainer for efficient on-policy distillation
#5407
opened Mar 30, 2026 by
cmpatino
3 of 5 tasks
Add length-normalized sigmoid loss type to DPO trainer
#5406
opened Mar 30, 2026 by
BrownianNotion
5 of 8 tasks
Add per-sample tool filtering to GRPOTrainer via tools column
#5398
opened Mar 27, 2026 by
lailanelkoussy
3 tasks done
feat(grpo): add stop_tool_names for immediate agent loop termination
#5390
opened Mar 27, 2026 by
lailanelkoussy
Fix DAPO token-level loss to use prompt-level aggregation
#5381
opened Mar 26, 2026 by
matdou
2 of 5 tasks