Releases: a-r-r-o-w/finetrainers
v0.0.1
FineTrainers v0.0.1 🧪
FineTrainers is a work-in-progress library to support (accessible) training of diffusion models. The following models are currently supported (based on Diffusers):
- CogVideoX T2V (versions 1.0 and 1.5)
- LTX Video
- Hunyuan Video
The legacy/deprecated scripts also support CogVideoX I2V and Mochi.
Currently, LoRA and full-rank finetuning are supported. Over time, more models and training techniques will be added. We thank our many contributors for their amazing work on improving finetrainers. They are mentioned below in the "New Contributors" section.
In a short timespan, finetrainers has found its way into multiple research works, which has been a very motivating factor for us. They are mentioned in the "Featured Projects" section of the README. We hope you find them interesting, and we encourage you to keep building on these ideas while sharing your research artifacts openly!
Some artifacts that we've released ourselves are available here: https://huggingface.co/finetrainers
We plan to focus on the core algorithms/models that users most want supported quickly, based primarily on the feedback we've received (thanks to everyone who's spoken with me about this -- your time is invaluable!). The major asks are:
- more models and faster support for newer models (we will open this up for contributions once a major, currently pending PR is merged, and add many ourselves!)
- compatibility with UIs that do not support standardized implementations from diffusers (we will write two-way conversion scripts for new models that are added to diffusers, so that it is easy to obtain original-format weights from diffusers-format weights)
- more algorithms (Control LoRA, ControlNets for video models and VideoJAM are some of the highly asked techniques -- we will prioritize this!)
- Dataset QoL changes (this is a WIP in an open but pending PR)
Let us know what you'd like to see next & stay tuned for interesting updates!
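To illustrate the two-way conversion idea mentioned above, here is a minimal sketch of renaming state-dict keys between a diffusers-style layout and an original-format layout. The key names below are purely hypothetical examples, not the actual mapping for any supported model; real conversion scripts may also need to split, merge, or transpose tensors.

```python
# Hypothetical one-to-one key rename map (example names only).
DIFFUSERS_TO_ORIGINAL = {
    "transformer_blocks.0.attn1.to_q.weight": "blocks.0.attn.q_proj.weight",
    "transformer_blocks.0.attn1.to_k.weight": "blocks.0.attn.k_proj.weight",
}
# The reverse direction is just the inverted map.
ORIGINAL_TO_DIFFUSERS = {v: k for k, v in DIFFUSERS_TO_ORIGINAL.items()}

def convert_state_dict(state_dict, key_map):
    """Rename keys according to key_map; unmapped keys pass through unchanged."""
    return {key_map.get(k, k): v for k, v in state_dict.items()}
```

Because the reverse map is derived from the forward one, converting diffusers-format weights to the original format and back recovers the input exactly, which is the property a two-way script needs.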
output.mp4
What's Changed
- CogVideoX LoRA and full finetuning by @a-r-r-o-w in #1
- Low-bit memory optimizers, CpuOffloadOptimizer, Memory Reports by @a-r-r-o-w in #3
- Pin memory support in dataloader by @a-r-r-o-w in #5
- DeepSpeed fixes by @a-r-r-o-w in #7
- refactor readme i. by @sayakpaul in #8
- DeepSpeed and DDP Configs by @a-r-r-o-w in #10
- Full finetuning memory requirements by @a-r-r-o-w in #9
- Multi-GPU parallel encoding support for training videos. by @zRzRzRzRzRzRzR in #6
- CogVideoX I2V; CPU offloading; Model README descriptions by @a-r-r-o-w in #11
- add VideoDatasetWithResizeAndRectangleCrop dataset resize crop by @glide-the in #13
- add "max_sequence_length": model_config.max_text_seq_length, by @glide-the in #15
- readme updates + refactor by @a-r-r-o-w in #14
- Update README.md by @a-r-r-o-w in #17
- merge by @zRzRzRzRzRzRzR in #18
- Draft of Chinese README by @zRzRzRzRzRzRzR in #19
- docs: update README.md by @eltociear in #21
- Update requirements.txt (fixed typo) by @Nojahhh in #24
- Update README and Contribution guide by @zRzRzRzRzRzRzR in #20
- Lower requirements versions by @a-r-r-o-w in #27
- Update for Windows compatibility by @Nojahhh in #32
- [Docs] : Update README.md by @FarukhS52 in #35
- Improve dataset preparation support + multiresolution prep by @a-r-r-o-w in #39
- Update prepare_dataset.sh by @a-r-r-o-w in #42
- improve dataset preparation by @a-r-r-o-w in #43
- more dataset fixes by @a-r-r-o-w in #49
- fix: correct type in .py files by @DhanushNehru in #52
- fix: resuming from a checkpoint when using deepspeed. by @sayakpaul in #38
- Windows support for T2V scripts by @a-r-r-o-w in #48
- Fixed optimizers parsing error in bash scripts by @Nojahhh in #61
- Update readme to install diffusers from source by @Yuancheng-Xu in #59
- Update README.md by @a-r-r-o-w in #73
- add some script of lora test by @zRzRzRzRzRzRzR in #66
- I2V multiresolution finetuning by removing learned PEs by @a-r-r-o-w in #31
- adaption for CogVideoX1.5 by @jiashenggu in #92
- docs: fix help message in args.py by @Leojc in #98
- sft with multigpu by @zhipuch in #84
- [feat] add Mochi-1 trainer by @sayakpaul in #90
- wandb tracker in scheduling problems during the training initiation and training stages by @glide-the in #100
- fix format specifier. by @sayakpaul in #104
- Unbound fix by @glide-the in #105
- feat: support checkpointing saving and loading by @sayakpaul in #106
- RoPE fixes for 1.5, bfloat16 support in prepare_dataset, gradient_accumulation grad norm undefined fix by @a-r-r-o-w in #107
- Update README.md to include mochi-1 trainer by @sayakpaul in #112
- add I2V sft and fix an error by @jiashenggu in #97
- LTX Video by @a-r-r-o-w in #123
- Hunyuan Video LoRA by @a-r-r-o-w in #126
- Precomputation of conditions and latents by @a-r-r-o-w in #129
- Grad Norm tracking in DeepSpeed by @a-r-r-o-w in #148
- fix validation bug by @a-r-r-o-w in #149
- [feat] support DeepSpeed. by @sayakpaul in #139
- [optimization] support 8bit optims from bitsandbytes by @sayakpaul in #163
- [Chore] bulk update styling and formatting by @sayakpaul in #170
- Update README.md to fix graph paths by @sayakpaul in #171
- Support CogVideoX T2V by @sayakpaul in #165
- Fix scheduler bugs by @sayakpaul in #177
- scheduler fixes part ii by @sayakpaul in #178
- [CI] add a workflow to do quality checks. by @sayakpaul in #180
- support model cards by @sayakpaul in #176
- [docs] refactor docs for easier info parsing by @sayakpaul in #175
- Allow images; Remove LLM generated prefixes; Allow JSON/JSONL; Fix bugs by @a-r-r-o-w in #158
- simplify docs part ii by @sayakpaul in #190
- Update requirements by @a-r-r-o-w in #189
- Fix minor bug with function call that doesn't exist. by @ArEnSc in #195
- Precomputation folder name based on model name by @a-r-r-o-w in #196
- Better defaults for LTXV by @a-r-r-o-w in #198
- [core] Fix loading of precomputed conditions and latents by @sayakpaul in #199
- Epoch loss by @a-r-r-o-w in #201
- Shell script to minimally test supported models on a real dataset by @sayakpaul in #204
- Update pr_tests.yml to update ruff version by @sayakpaul in #205
- Fix the checkpoint dir bug in `get_intermediate_ckpt_path` by @Awcrr in #207
- Argument descriptions by @a-r-r-o-w in #208
- Improve argument handling by @a-r-r-o-w in #209
- Helpful messages by @a-r-r-o-w in #210
- Full Finetuning for LTX pos...