
v2.7

@oobabooga released this 09 Apr 17:49

Changes

  • Add ExLlamaV3 support (#6832). This is done through a new ExLlamav3_HF loader that uses the same samplers as Transformers and ExLlamav2_HF. Wheels compiled with GitHub Actions are included for both Linux and Windows, eliminating manual installation steps. Note: these wheels require a compute capability of 8.0 or greater, at least for now (see the launch sketch after this list).
  • Add a new chat style: Dark (#6817).
  • Cap the default context length at 8192 to prevent OOM errors, and show the model's maximum context length in the UI (#6835).
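
A minimal launch sketch for the new loader, assuming the project's existing command-line conventions; the flag names and the model name are illustrative, not prescriptive:

```bash
# Sketch only (assumed flags, placeholder model name):
# load an EXL3-quantized model with the new ExLlamav3_HF loader.
# The model is expected to live in a directory under models/.
python server.py --model my-exl3-model --loader ExLlamav3_HF
```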

🔧 Bug fixes

  • Fix a matplotlib bug in the Google Colab notebook.
  • Fix links in the ngrok extension README (#6826). Thanks @KPCOFGS.

🔄 Backend updates

  • Transformers: Bump to 4.50.
  • CUDA: Bump to 12.4.
  • PyTorch: Bump to 2.6.0.
  • FlashAttention: Bump to v2.7.4.post1.
  • PEFT: Bump to 0.15. This should make axolotl-trained LoRAs compatible with the project (a brief usage sketch follows).
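
As a rough illustration of the LoRA point above, assuming the project's existing `--lora` flag and placeholder model/adapter names:

```bash
# Sketch only (assumed flags, placeholder names):
# apply an axolotl-trained LoRA adapter on top of a base model at load time.
python server.py --model my-base-model --lora my-axolotl-lora
```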