What's Changed
- release 0.5.2 by @winglian in #2086
- use pep440 instead of semver by @winglian in #2088
- Fix duplication of plugin callbacks by @chiragjn in #2090
- fix inference when no chat_template is set, fix unsloth dora check by @winglian in #2092
- Enable Ascend NPU support by @MengqingCao in #1758
- Bump liger-kernel requirements to 0.4.2 by @bursteratom in #2096
- fix None-type not iterable error when deepspeed is left blank w/ use_… by @bursteratom in #2087
- actions/create-release is unmaintained, and doesn't create proper release notes by @winglian in #2098
- updated colab notebook by @bursteratom in #2074
- .gitignore additions by @tmm1 in #349
- move shared pytest conftest to top level tests by @winglian in #2099
- add finetome dataset to fixtures, check eval_loss in test by @winglian in #2106
- fix: ds3 and fsdp lmbench eval by @NanoCode012 in #2102
- support separate lr for embeddings, similar to loraplus by @winglian in #1910
- add e2e tests for Unsloth qlora and test the builds by @winglian in #2093
- Add Exact Deduplication Feature to Preprocessing Pipeline by @olivermolenschot in #2072
- various tests fixes for flakey tests by @winglian in #2110
- build causal_conv1d and mamba-ssm into the base image by @winglian in #2113
- make the eval size smaller for the resume test by @winglian in #2111
- use pytest sugar and verbose for more info during ci by @winglian in #2112
- Check torch version for ADOPT optimizer + integrating new ADOPT updates by @bursteratom in #2104
- fix(vlm): handle legacy conversation data format and check image in data by @NanoCode012 in #2018
- Add ds model card, rebased by @bursteratom in #2101
- fix so inference can be run against quantized models without adapters by @winglian in #1834
- fix merge conflict of duplicate max_steps in config for relora by @winglian in #2116
- feat: add cut_cross_entropy by @NanoCode012 in #2091
- fix(readme): update cuda instructions during preprocess by @NanoCode012 in #2114
- fix optimizer reset for relora sft by @winglian in #1414
- prepare plugins needs to happen so registration can occur to build the plugin args by @winglian in #2119
- add missing fixture decorator for predownload dataset by @winglian in #2117
- replace tensorboard checks with helper function by @winglian in #2120
- drop unnecessary BNB_CUDA_VERSION env var from docker as it just results in warnings by @winglian in #2121
- update fix_untrained_tokens from unsloth with additional fixes by @winglian in #2122
- cleanup the readme, add Modal as sponsor by @winglian in #2130
- fix license header for fix_untrained_tokens from unsloth-zoo by @winglian in #2129
- CLI Implementation with Click by @djsaunde in #2107
- auto-versioning and adding axolotl.version by @djsaunde in #2127
- remove accidentally included symlink by @winglian in #2131
- upgrade bnb 0.45.0 and peft by @winglian in #2126
- Fix broken CLI; remove duplicate metadata from setup.py by @djsaunde in #2136
- reduce test concurrency to avoid HF rate limiting, test suite parity by @winglian in #2128
- Fix llama type model check by @chiragjn in #2142
- Transformers 4.47.0 by @winglian in #2138
- [tests] reset known modules that are patched on each test function end by @winglian in #2147
- add --version support to axolotl cli by @winglian in #2152
- fix for auto_map check when using remote code and multipack for models like deepseek by @winglian in #2151
- bump autoawq to 0.2.7.post3 by @winglian in #2150
- Transformers version flexibility and FSDP optimizer patch by @winglian in #2155
- add additional fft deepspeed variants by @winglian in #2153
- Fixing issue#2134 Axolotl Crashes At The End Of Training If Base Model Is Local by @bursteratom in #2140
- use manual version for now by @winglian in #2156
- fix: duplicate mlflow logging by @NanoCode012 in #2109
- upgrade deepspeed to 0.16.1 by @winglian in #2157
- add missing init to optimizers path by @winglian in #2160
- feat: add kto example by @NanoCode012 in #2158
- don't add dataset tags if empty due to all local data paths by @winglian in #2162
- fix: chat_template masking due to truncation, consolidate turn build and keys within field by @NanoCode012 in #2123
- need to update deepspeed version in extras too by @winglian in #2161
- [docs] Update README Quickstart to use CLI by @winglian in #2137
- fix release command by @winglian in #2163
- make sure to checkout tag before creating release by @winglian in #2164
New Contributors
Full Changelog: v0.5.2...v0.6.0