7 changes: 7 additions & 0 deletions docs/source/customization.md
@@ -5,6 +5,13 @@ TRL is designed with modularity in mind so that users are able to efficiently cu
> [!NOTE]
> Although these examples use the [`DPOTrainer`], these customization methods apply to most (if not all) trainers in TRL.

> [!NOTE]
> [Since Transformers v5](https://github.com/huggingface/transformers/pull/42805), `from_pretrained` infers the dtype from the model's config (e.g., `bfloat16`) instead of defaulting to `float32`. When you load a model yourself, pass `dtype` explicitly if you need a specific precision.
> When TRL handles model loading (i.e., you pass a model name string to the trainer), it defaults to `float32`.
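
For example, to request a specific precision when loading the model yourself, you can pass `dtype` explicitly (a minimal sketch; the tiny model name here is just a lightweight placeholder, substitute your own):

```python
import torch
from transformers import AutoModelForCausalLM

# Pass `dtype` explicitly so the precision does not depend on the model's config.
# "sshleifer/tiny-gpt2" is only an illustrative placeholder model.
model = AutoModelForCausalLM.from_pretrained(
    "sshleifer/tiny-gpt2",
    dtype=torch.float32,
)
```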

## Use different optimizers and schedulers

By default, the [`DPOTrainer`] creates a `torch.optim.AdamW` optimizer. You can create and define a different optimizer and pass it to [`DPOTrainer`] as follows:
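
A minimal sketch of the pattern, assuming the `optimizers=(optimizer, scheduler)` tuple that TRL trainers inherit from `transformers.Trainer` (the tiny model name is an illustrative placeholder):

```python
import torch
from transformers import AutoModelForCausalLM

# Illustrative placeholder model; substitute your own.
model = AutoModelForCausalLM.from_pretrained("sshleifer/tiny-gpt2")

# Create a custom optimizer and (optionally) a learning-rate scheduler.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-5)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)

# Pass both to the trainer; either element may be None to keep the default:
# trainer = DPOTrainer(
#     model=model,
#     args=training_args,
#     train_dataset=dataset,
#     optimizers=(optimizer, scheduler),
# )
```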