
Commit eb07373

Merge pull request huggingface#165 from huggingface/smangrul/add-trl-example-in-readme
minor changes
2 parents 8777b56 + f1980e9 commit eb07373

2 files changed (+14 / -9 lines)


README.md

+5 / -4
@@ -125,14 +125,15 @@ Try out the 🤗 Gradio Space which should run seamlessly on a T4 instance:
 
 ![peft lora dreambooth gradio space](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/peft/peft_lora_dreambooth_gradio_space.png)
 
-### Parameter Efficient Tuning of LLMs for RLHF components such as Ranker and Policy [ToDo]
-Here is an exmaple in trl library on using PEFT+INT8 for tuning policy model: [gpt2-sentiment_peft.py](https://github.com/lvwerra/trl/blob/main/examples/sentiment/scripts/gpt2-sentiment_peft.py)
+### Parameter Efficient Tuning of LLMs for RLHF components such as Ranker and Policy
+- Here is an example in the [trl](https://github.com/lvwerra/trl) library using PEFT+INT8 for tuning the policy model: [gpt2-sentiment_peft.py](https://github.com/lvwerra/trl/blob/main/examples/sentiment/scripts/gpt2-sentiment_peft.py)
+- Example using PEFT for both reward model and policy [ToDo]
 
 ### INT8 training of large models in Colab using PEFT LoRA and bits_and_bytes
 
-Here is now a demo on how to fine tune [OPT-6.7b](https://huggingface.co/facebook/opt-6.7b) (14GB in fp16) in a Google colab: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1jCkpikz0J2o20FBQmYmAGdiKmJGOMo-o?usp=sharing)
+- Here is a demo on how to fine-tune [OPT-6.7b](https://huggingface.co/facebook/opt-6.7b) (14GB in fp16) in a Google Colab: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1jCkpikz0J2o20FBQmYmAGdiKmJGOMo-o?usp=sharing)
 
-Here is now a demo on how to fine tune [whishper-large](openai/whisper-large-v2) (1.5B params) (14GB in fp16) in a Google colab: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1DOkD_5OUjFa0r5Ik3SgywJLJtEo2qLxO?usp=sharing) and [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1vhF8yueFqha3Y3CpTHN6q9EVcII9EYzs?usp=sharing)
+- Here is a demo on how to fine-tune [whisper-large](https://huggingface.co/openai/whisper-large-v2) (1.5B params) (14GB in fp16) in a Google Colab: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1DOkD_5OUjFa0r5Ik3SgywJLJtEo2qLxO?usp=sharing) and [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1vhF8yueFqha3Y3CpTHN6q9EVcII9EYzs?usp=sharing)
 
 ### Save compute and storage even for medium and small models
 
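Both README bullets added above rely on the same recipe: load the base model in 8-bit with bitsandbytes and attach a small LoRA adapter via PEFT, so only the adapter weights are trained. Below is a minimal sketch of that setup; the gpt2 model name, the LoRA hyperparameters, and the use of `prepare_model_for_int8_training` are illustrative assumptions, not the exact code of the linked trl script or Colab notebooks.

```python
# Hedged sketch: 8-bit base model + LoRA adapter, the pattern the linked examples build on.
# Assumes bitsandbytes and accelerate are installed and a CUDA GPU is available.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training

model_name = "gpt2"  # assumed; the trl sentiment example tunes a gpt2-family policy model
model = AutoModelForCausalLM.from_pretrained(model_name, load_in_8bit=True, device_map="auto")

# Freeze the int8 base weights and prepare the model for training (e.g. layer norms in fp32).
model = prepare_model_for_int8_training(model)

# Attach a LoRA adapter; the hyperparameters below are illustrative, not the examples' values.
lora_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, bias="none", task_type="CAUSAL_LM")
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA matrices require gradients
```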

examples/int8_training/peft_bnb_whisper_large_v2_training.ipynb

+9 / -5
@@ -1295,13 +1295,16 @@
 "from transformers import Seq2SeqTrainer, TrainerCallback, TrainingArguments, TrainerState, TrainerControl\n",
 "from transformers.trainer_utils import PREFIX_CHECKPOINT_DIR\n",
 "\n",
+"\n",
 "class SavePeftModelCallback(TrainerCallback):\n",
 "    def on_save(\n",
-"        self, args: TrainingArguments, state: TrainerState, control: TrainerControl, **kwargs,\n",
+"        self,\n",
+"        args: TrainingArguments,\n",
+"        state: TrainerState,\n",
+"        control: TrainerControl,\n",
+"        **kwargs,\n",
 "    ):\n",
-"        checkpoint_folder = os.path.join(\n",
-"            args.output_dir, f\"{PREFIX_CHECKPOINT_DIR}-{state.global_step}\"\n",
-"        ) \n",
+"        checkpoint_folder = os.path.join(args.output_dir, f\"{PREFIX_CHECKPOINT_DIR}-{state.global_step}\")\n",
 "\n",
 "        peft_model_path = os.path.join(checkpoint_folder, \"adapter_model\")\n",
 "        kwargs[\"model\"].save_pretrained(peft_model_path)\n",
@@ -1311,6 +1314,7 @@
 "        os.remove(pytorch_model_path)\n",
 "        return control\n",
 "\n",
+"\n",
 "trainer = Seq2SeqTrainer(\n",
 "    args=training_args,\n",
 "    model=model,\n",
@@ -1319,7 +1323,7 @@
 "    data_collator=data_collator,\n",
 "    # compute_metrics=compute_metrics,\n",
 "    tokenizer=processor.feature_extractor,\n",
-"    callbacks=[SavePeftModelCallback]\n",
+"    callbacks=[SavePeftModelCallback],\n",
 ")\n",
 "model.config.use_cache = False  # silence the warnings. Please re-enable for inference!"
 ]
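For context on the callback being reformatted in this diff: SavePeftModelCallback writes only the PEFT adapter weights into an adapter_model subfolder at each checkpoint and removes the full pytorch_model.bin, keeping checkpoints small. A minimal sketch of reloading such a checkpoint for inference follows; the output directory and step number are hypothetical, and the 8-bit loading simply mirrors the notebook's training setup.

```python
# Hedged sketch: reload an adapter checkpoint produced by SavePeftModelCallback.
from transformers import WhisperForConditionalGeneration
from peft import PeftModel

base_model = WhisperForConditionalGeneration.from_pretrained(
    "openai/whisper-large-v2", load_in_8bit=True, device_map="auto"
)
# "output_dir/checkpoint-1000" is a hypothetical Trainer checkpoint folder;
# the callback stores the adapter under its "adapter_model" subfolder.
model = PeftModel.from_pretrained(base_model, "output_dir/checkpoint-1000/adapter_model")
model.eval()  # only the small adapter weights were saved; the base model stays frozen
```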
