Releases · Lightning-AI/litgpt
v0.5.8
Many great updates!
What's Changed
- add missing r1 prompt style by @ali-alshaar7 in #1929
- fix incremental save for PyTorch 2.6 by @t-vi in #1928
- Update pyproject.toml by @syntheticgio in #1939
- fix: resolve failing CI by @Borda in #1944
- handle wrapped thundermodules in generate by @t-vi in #1955
- fix skip condition by @t-vi in #1956
- ci: use HF cache by @Borda in #1958
- nits for CI by @Andrei-Aksionov in #1940
- ci: split HF caching by @Borda in #1960
- bump: PT 2.6 + `bitsandbytes` & standalone tests by @Borda in #1959
- prune whitespaces for code readability by @Borda in #1962
- fixing various typos in examples & tutorials by @Borda in #1963
- fix `n_query_groups` for llama-3.1-405b by @ysjprojects in #1946
- tests: mark flaky test due to connection issues by @Borda in #1964
- Fix: incorrect gradient accumulation steps bug by @ysjprojects in #1947
- fix: use default `num_nodes=1` for back-compatibility by @Borda in #1967
- Do not wrap LoRA layers with FSDP by @janEbert in #1538
- Speculative decoding: Base implementation by @Andrei-Aksionov in #1938 (see the sketch after this list)
- Better clarity on SFT dataset attributes by @ysjprojects in #1970
- Enforce Consistent Formatting and Validation for YAML Files by @Borda in #1977
- Apply Standard Formatting and Fix Import & Test Name Issues by @Borda in #1981
- Setting `config.sliding_window_layer_stride` explicitly by @ysjprojects in #1972
- feat: add linear rope type by @k223kim in #1982
- feat: update tests for transformers 4.50.2 by @k223kim in #1983
- fix: `test_tokenizer_against_hf` by @Borda in #1984
- feat: replace sliding window type with offset by @k223kim in #1989
- ci: with `pull_request_target` by @Borda in #1992
- Phi4 mini by @ysjprojects in #1949
- aggregate `val_loss` by @ysjprojects in #1971
- feat: add local base freq for rope by @k223kim in #1993
- test: flexible wait for serve start by @Borda in #1996
- fix: replace sliding window configuration parameters with sliding window indices by @k223kim in #1995
- QwQ-32B by @ysjprojects in #1952
- feat: run thunder tests as part of LitGPT CI by @deependujha in #1975
- try pyupgrade-up py38 by @Borda in #1999
- [1/4] feat: add gemma 3 27b by @k223kim in #1998
- [2/4] add gemma 3 1b by @k223kim in #2000
- feat: add gemma-3-12b by @k223kim in #2002
- [3/4] feat: add gemma 3 4b by @k223kim in #2001
- Add resume for adapter_v2, enable continued finetuning for adapter by @altria-zewei-wang in #1354
- Fix/loading gemma 3 1b by @pquadri in #2004
- feat: add gemma 3 in readme and tutorials by @k223kim in #2005
- add borda as codeowner by @t-vi in #2007
- example for full finetuning with python code by @astrobdr in #1331
- feat: add tests for gemma3 by @k223kim in #2006
- [pre-commit.ci] pre-commit suggestions by @pre-commit-ci in #2009
- building tutorials as mkdocs pages by @Borda in #2011
- Add mlflow logger support by @topikachu in #1985
- fix support for `litserve>0.2.4` by @ali-alshaar7 in #1994
- Cast tensors in KVCache only when needed by @Andrei-Aksionov in #2017
- feat: load only text weights from multimodal gemma by @pquadri in #2008
- Feature: Adds support for OpenAISpec in litgpt serve by @bhimrazy in #1943 (usage example after this list)
- fix typo by @Lynsoo in #2018
- drop upper bounds in dependencies by @t-vi in #2022
- prepare 0.5.8 by @t-vi in #2023
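For context on the speculative decoding entry (#1938): a small draft model proposes several tokens cheaply, and the large target model verifies them in a single forward pass. Below is a minimal greedy-verification sketch of the general technique, not LitGPT's actual implementation; `draft_model` and `target_model` are assumed to be callables mapping a `(1, T)` tensor of token ids to `(1, T, vocab)` logits.

```python
import torch

def speculative_generate(target_model, draft_model, tokens, k=4, max_len=64):
    """Greedy speculative decoding sketch: a cheap draft model proposes k
    tokens, the expensive target model verifies them in one forward pass."""
    while tokens.size(0) < max_len:
        # 1) Draft k candidate tokens autoregressively with the small model.
        draft = tokens.clone()
        for _ in range(k):
            logits = draft_model(draft.unsqueeze(0))[0, -1]
            draft = torch.cat([draft, logits.argmax().view(1)])
        # 2) Score the whole draft with the target model in a single pass.
        verified = target_model(draft.unsqueeze(0))[0].argmax(dim=-1)
        proposed = draft[tokens.size(0):]          # the k drafted tokens
        expected = verified[tokens.size(0) - 1:]   # target's pick per slot (+1 bonus)
        # 3) Accept the longest agreeing prefix, plus one "free" target token.
        agree = (proposed == expected[: proposed.size(0)]).long()
        n_accept = int(agree.cumprod(0).sum())
        tokens = torch.cat([tokens, expected[: n_accept + 1]])
    return tokens
```

Each iteration emits at least one token (the target's own prediction), so output matches target-only greedy decoding while the target runs far fewer forward passes whenever the draft agrees often.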
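And for the OpenAISpec entry (#1943): once `litgpt serve` exposes an OpenAI-compatible endpoint, any OpenAI-style client can talk to it. A hypothetical request against typical litserve defaults; host, port, path, and model name here are assumptions, so check the serve docs for the exact flags.

```python
import requests

# Assumed local endpoint following the OpenAI chat-completions spec.
resp = requests.post(
    "http://127.0.0.1:8000/v1/chat/completions",
    json={
        "model": "litgpt",  # model name is illustrative
        "messages": [{"role": "user", "content": "Hello, what is LitGPT?"}],
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```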
New Contributors
- @syntheticgio made their first contribution in #1939
- @deependujha made their first contribution in #1975
- @altria-zewei-wang made their first contribution in #1354
- @pquadri made their first contribution in #2004
- @astrobdr made their first contribution in #1331
- @pre-commit-ci made their first contribution in #2009
- @topikachu made their first contribution in #1985
- @bhimrazy made their first contribution in #1943
- @Lynsoo made their first contribution in #2018
Full Changelog: v0.5.7...v0.5.8
v0.5.7
What's Changed
- Add Deepseek r1 distill llama models by @ali-alshaar7 in #1922
New Contributors
- @ali-alshaar7 made their first contribution in #1922
Full Changelog: v0.5.6...v0.5.7
v0.5.6
v0.5.5
What's Changed
- Post-release setup for 0.5.5.dev1 by @Andrei-Aksionov in #1885
- Falcon3 by @ysjprojects in #1881
- ChatML prompt template by @ysjprojects in #1882
- Small fixes and refactoring by @mseeger in #1861
- Drop interleave placement in QKV matrix by @Andrei-Aksionov in #1013
- Bump PyTorch, PyTorch-Lightning and BnB versions by @Andrei-Aksionov in #1893
- Pin version of mistune in check links workflow by @Andrei-Aksionov in #1895
- Skip converting .safetensors to .bin by @ysjprojects in #1853
- Some improvements for KV caching by @mseeger in #1891
- added query-key norm to accommodate OLMo2 by @ysjprojects in #1894 (see the sketch after this list)
- Improve HF download speed by @rasbt in #1899
- Bump version for 0.5.5 release by @rasbt in #1901
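On the query-key norm entry (#1894): OLMo 2 normalizes queries and keys before attention, which LitGPT's attention needed to accommodate. A minimal sketch of the idea; shapes and module names are assumptions rather than LitGPT's internals, and `nn.RMSNorm` needs PyTorch ≥ 2.4.

```python
import torch
from torch import nn
from torch.nn import functional as F

class QKNormAttention(nn.Module):
    """Attention fragment with per-head RMSNorm on queries and keys."""

    def __init__(self, head_dim: int):
        super().__init__()
        self.q_norm = nn.RMSNorm(head_dim)
        self.k_norm = nn.RMSNorm(head_dim)

    def forward(self, q, k, v):
        # q, k, v: (batch, n_heads, seq_len, head_dim)
        q = self.q_norm(q)  # normalizing q/k bounds the attention logits,
        k = self.k_norm(k)  # which helps stabilize training of deep models
        return F.scaled_dot_product_attention(q, k, v, is_causal=True)
```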
Full Changelog: v0.5.4...v0.5.5
v0.5.4
What's Changed
- 0.5.3 post release setup by @rasbt in #1817
- Add cff file by @rasbt in #1818
- Deprecate Support for Dolly, Nous-Hermes, Redpajama-Incite, Vicuna and H2O Danube Models. by @ParagEkbote in #1821
- Adding OLMo by @aflah02 in #1827
- Adding Qwen2.5 by @ysjprojects in #1834
- Restore SlimPajama preprocessing code by @aflah02 in #1840
- Add QwQ-32B-Preview by @ysjprojects in #1844
- Add Mixtral-8x22B by @ysjprojects in #1845
- add Llama-3.3-70B-Instruct by @ysjprojects in #1859
- add Salamandra by @ysjprojects in #1857
- Qwen2.5: fix block size for Coder series by @ysjprojects in #1856
- fix: add missing "," by @vra in #1855
- fix llama3.3 readme url by @ysjprojects in #1862
- Set `torch.load(..., weights_only=False)` in litgpt/api.py by @Andrei-Aksionov in #1874 (see the example after this list)
- Add Qwen2.5 math by @ysjprojects in #1863
- Add SmolLM2 by @ysjprojects in #1848
- Add Mistral-Large-Instruct-2411 by @ysjprojects in #1876
- Bump version for 0.5.4 release by @Andrei-Aksionov in #1883
- Temporarily remove Thunder to make a release by @Andrei-Aksionov in #1884
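The `weights_only` change above matters because PyTorch has been moving `torch.load` toward `weights_only=True` by default, which rejects checkpoints containing arbitrary pickled objects. The fix pins the old behaviour explicitly:

```python
import torch

# weights_only=True (the newer default) only unpickles tensors and a small
# allowlist of types; checkpoints holding other Python objects need
# weights_only=False. Only do this for checkpoints you trust, since
# unrestricted unpickling can execute arbitrary code.
state = torch.load("checkpoint.pth", map_location="cpu", weights_only=False)
```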
New Contributors
- @ParagEkbote made their first contribution in #1821
- @ysjprojects made their first contribution in #1834
- @vra made their first contribution in #1855
Full Changelog: v0.5.3...v0.5.4
v0.5.3
What's Changed
- Post-release setup for 0.5.3.dev1 by @rasbt in #1799
- Add Phi 3 128k model by @deveworld in #1800
- Add token counts to compute performance by @rasbt in #1801
- Fixed the issue that precision is always "32-true". by @jianpingw in #1802
- Add Nvidia Llama 3.1 70B Nemotron weights by @rasbt in #1803
- Choose evaluation example from test set by @rasbt in #1804
- Pretrain tok sec by @rasbt in #1805
- typo in convert_to_litgpt command by @wasifferoze in #1807
- Move distributed all_reduce import into a function by @IvanYashchuk in #1810 (see the sketch after this list)
- Remove hardcoded 32-precision conversion by @rasbt in #1814
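The `all_reduce` change above is the deferred-import pattern: pulling `torch.distributed` inside the function keeps it off the module-import path, presumably so single-device runs and tracing compilers never have to touch it. A generic sketch; the PR's actual function may differ.

```python
import torch

def maybe_all_reduce(tensor: torch.Tensor) -> torch.Tensor:
    # Import inside the function so torch.distributed is only loaded
    # when a process group might actually exist.
    import torch.distributed as dist

    if dist.is_available() and dist.is_initialized():
        dist.all_reduce(tensor)
    return tensor
```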
New Contributors
- @deveworld made their first contribution in #1800
- @jianpingw made their first contribution in #1802
- @wasifferoze made their first contribution in #1807
- @IvanYashchuk made their first contribution in #1810
Full Changelog: v0.5.2...v0.5.3
v0.5.2
v0.5.1
What's Changed
- v0.5.0 post release setup by @rasbt in #1774
- Be more specific about missing RoPE parameters by @rasbt in #1781
- Use correct Llama 3.1 and 3.2 context lengths by @rasbt in #1779
- Fixing Llama 3.1 and 3.2 Maximum Context Length by @rasbt in #1782
- Use more realistic RoPE tests by @rasbt in #1785
- AMD (MI250X) support by @TensorTemplar in #1775
- Tidy up RoPE by @rasbt in #1786 (see the sketch after this list)
- Bump version for 0.5.1 bugfix release by @rasbt in #1787
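Several of the entries above touch RoPE, so a refresher helps: rotary position embeddings encode position by rotating pairs of query/key channels through angles proportional to the token index. A minimal non-interleaved sketch; Llama 3.1/3.2 additionally rescale the frequencies for long contexts, which is what the fixes above adjust.

```python
import torch

def apply_rope(x: torch.Tensor, theta: float = 10000.0) -> torch.Tensor:
    """Rotate channel pairs of x (..., seq_len, head_dim) by position-
    dependent angles; head_dim must be even."""
    *_, seq_len, head_dim = x.shape
    half = head_dim // 2
    # One frequency per channel pair: theta ** (-2i / head_dim)
    freqs = theta ** (-torch.arange(half, dtype=torch.float32) / half)
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * freqs
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat((x1 * cos - x2 * sin, x1 * sin + x2 * cos), dim=-1)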
New Contributors
- @TensorTemplar made their first contribution in #1775
Full Changelog: v0.5.0...v0.5.1
v0.5.0
What's Changed
- Post 0.4.13 release set up by @rasbt in #1755
- Add missing explanation on how to use a finetuned model by @rasbt in #1756
- Bump lightning version to latest stable release (2.4.0) by @rasbt in #1765
- Improve rope by @rasbt in #1745
- Add bnb.nn.StableEmbedding for quantized training by @rasbt in #1770 (see the example after this list)
- [fix][1760] Added fix for the missing `context` key issue in dolly! by @pytholic in #1766
- Fix Llama 3.2 tokenizer by @rasbt in #1772
Full Changelog: v0.4.13...v0.5.0
v0.4.13
What's Changed
- Make 0.4.13.dev1 version by @rasbt in #1722
- Enable MPS support for LitGPT by @rasbt in #1724
- Simplify MPS support by @rasbt in #1726
- Add Chainlit Studio by @rasbt in #1728
- Fixing the tokenizer for slimpajama data preparation by @tomaslaz in #1734
- Add pretrain conversion by @rasbt in #1735
- Typo fix and formatting improvements in API Trainer docs by @rasbt in #1736
- bump macos to m1 by @t-vi in #1725
- Improve filepath handling in unit tests by @rasbt in #1737
- Add a more informative message in case text exceeds context size by @rasbt in #1738
- Update Thunder README.md by @rasbt in #1740
- Add sliding window attention to Mistral and Phi 3 by @rasbt in #1741 (see the sketch after this list)
- Extend context length for sliding window tests by @rasbt in #1742
- Fix jsonargparse version by @rasbt in #1748
- Update RoPE tests by @rasbt in #1746
- Make json parsing more robust by @rasbt in #1749
- Support for optimizers which don't have a "fused" parameter, such as grokadamw and 8-bit bnb by @mtasic85 in #1744
- Increase rtol and atol in Gemma 2 for macOS by @rasbt in #1751
- Repair json files by @rasbt in #1752
- Llama 3.2 weights by @rasbt in #1750
- Bump version to 0.4.13 for new release by @rasbt in #1753
- Temporarily take out thunder dependency for deployment by @rasbt in #1754
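On the sliding window attention entries above: instead of full causal attention, each position attends only to the last `window` tokens. A sketch of the boolean mask (True = may attend), usable with `F.scaled_dot_product_attention`:

```python
import torch

def sliding_window_mask(seq_len: int, window: int) -> torch.Tensor:
    """Causal mask limited to a lookback of `window` tokens:
    position i attends to j where i - window < j <= i."""
    i = torch.arange(seq_len).unsqueeze(1)
    j = torch.arange(seq_len).unsqueeze(0)
    return (j <= i) & (j > i - window)
```

Bounding the lookback caps attention memory for long sequences, which is presumably why the tests in #1742 extend the context length past the window.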
Full Changelog: v0.4.12...v0.4.13