# 🚀 LLM Foundry v0.11.0

## New Features
### LLM Foundry CLI Commands (#1337, #1345, #1348, #1354)

We've added CLI commands for our commonly used scripts. For example, instead of calling `composer llm-foundry/scripts/train.py parameters.yaml`, you can now run `composer -c llm-foundry train parameters.yaml`.
### Docker Images Contain All Optional Dependencies (#1431)

LLM Foundry Docker images now include all optional dependencies.
### Support for Llama3 Rope Scaling (#1391)

To use it, add the following to your model config:

```yaml
model:
  name: mpt_causal_lm
  attn_config:
    rope: true
    ...
    rope_impl: hf
    rope_theta: 500000
    rope_hf_config:
      type: llama3
      ...
```
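For intuition, llama3-style rope scaling rescales the rotary frequencies based on their wavelength: short-wavelength (high-frequency) components are kept, long-wavelength ones are divided by a scale factor, and the band in between is smoothly interpolated. A minimal sketch of that rule, with illustrative function name and Llama 3.1-style default constants (not the Foundry implementation):

```python
import math

def llama3_scale_freqs(freqs, factor=8.0, low_freq_factor=1.0,
                       high_freq_factor=4.0, old_context_len=8192):
    """Rescale rotary frequencies llama3-style (illustrative sketch)."""
    low_freq_wavelen = old_context_len / low_freq_factor
    high_freq_wavelen = old_context_len / high_freq_factor
    scaled = []
    for freq in freqs:
        wavelen = 2 * math.pi / freq
        if wavelen < high_freq_wavelen:
            # Short wavelength: keep the frequency as-is.
            scaled.append(freq)
        elif wavelen > low_freq_wavelen:
            # Long wavelength: fully scale down by the factor.
            scaled.append(freq / factor)
        else:
            # In-between band: interpolate smoothly between the two regimes.
            smooth = (old_context_len / wavelen - low_freq_factor) / (
                high_freq_factor - low_freq_factor)
            scaled.append((1 - smooth) * freq / factor + smooth * freq)
    return scaled
```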
### Tokenizer Registry (#1386)
We now have a tokenizer registry so you can easily add custom tokenizers.
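The registries boil down to a name-to-class mapping that configs can reference by name. A self-contained sketch of the pattern (the real entrypoints live under `llmfoundry.registry`; the `Registry` class and names below are illustrative):

```python
class Registry:
    """Toy name -> class registry, illustrating the pattern."""
    def __init__(self):
        self._items = {}

    def register(self, name):
        # Used as a decorator: @tokenizers.register('my_tokenizer')
        def decorator(cls):
            self._items[name] = cls
            return cls
        return decorator

    def get(self, name):
        return self._items[name]

tokenizers = Registry()

@tokenizers.register('my_tokenizer')
class MyTokenizer:
    def __call__(self, text):
        return text.split()
```

A config could then reference the tokenizer by its registered name, and the framework would look it up and construct it from the registry.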
### LoadPlanner and SavePlanner Registries (#1358)
We now have LoadPlanner and SavePlanner registries so you can easily add custom checkpoint loading and saving logic.
### Faster Auto-packing (#1435)

Auto-packing startup is now much faster. To use auto-packing with finetuning datasets, add `packing_ratio: auto` to your config like so:

```yaml
train_loader:
  name: finetuning
  dataset:
    ...
    packing_ratio: auto
```
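For intuition, packing concatenates several short examples into a single sequence of at most `max_seq_len` tokens, and the packing ratio is roughly the number of raw examples per packed sequence. A toy first-fit-decreasing sketch of the idea (not the Foundry algorithm, which estimates the ratio from the data):

```python
def pack_greedy(seq_lens, max_seq_len):
    """Pack sequence lengths into bins of at most max_seq_len tokens
    using first-fit decreasing; returns the list of bins."""
    bins = []
    for n in sorted(seq_lens, reverse=True):
        for b in bins:
            if sum(b) + n <= max_seq_len:
                b.append(n)
                break
        else:
            bins.append([n])
    return bins

lengths = [5, 3, 2, 7, 1]
bins = pack_greedy(lengths, max_seq_len=8)
packing_ratio = len(lengths) / len(bins)  # raw examples per packed sequence
```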
## What's Changed
- Extra serverless by @XiaohanZhangCMU in #1320
- Fixing sequence_id =-1 bug, adding tests by @ShashankMosaicML in #1324
- Registry docs update by @dakinggg in #1323
- Add dependabot by @dakinggg in #1322
- `HUGGING_FACE_HUB_TOKEN` -> `HF_TOKEN` by @dakinggg in #1321
- Bump version by @b-chu in #1326
- Relax hf hub pin by @dakinggg in #1314
- Error if metadata matches existing keys by @dakinggg in #1313
- Update transformers requirement from <4.41,>=4.40 to >=4.42.3,<4.43 by @dependabot in #1327
- Bump einops from 0.7.0 to 0.8.0 by @dependabot in #1328
- Bump onnxruntime from 1.15.1 to 1.18.1 by @dependabot in #1329
- Bump onnx from 1.14.0 to 1.16.1 by @dependabot in #1331
- Currently multi-gpu generate does not work with hf.generate for hf checkpoints. This PR fixes that. by @ShashankMosaicML in #1332
- Fix registry for callbacks with configs by @mvpatel2000 in #1333
- Adding a child class of hf's rotary embedding to make hf generate work on multiple gpus. by @ShashankMosaicML in #1334
- Add a config arg to just save an hf checkpoint by @dakinggg in #1335
- Deepcopy config in callbacks_with_config by @mvpatel2000 in #1336
- Avoid HF race condition by @dakinggg in #1338
- Nicer error message for undefined symbol by @dakinggg in #1339
- Bump sentencepiece from 0.1.97 to 0.2.0 by @dependabot in #1342
- Removing logging exception through update run metadata by @jjanezhang in #1292
- [MCLOUD-4910] Escape UC names during data prep by @naren-loganathan in #1343
- Add CLI for train.py by @KuuCi in #1337
- Add fp32 to the set of valid inputs to attention layer by @j316chuck in #1347
- Log all extraneous_keys in one go for ease of development by @josejg in #1344
- Fix MLFlow Save Model for TE by @j316chuck in #1353
- Add flag for saving only composer checkpoint by @irenedea in #1356
- Expose flag for should_save_peft_only by @irenedea in #1357
- Command utils + train by @KuuCi in #1361
- Readd Clear Resolver by @KuuCi in #1365
- Add Eval to Foundry CLI by @KuuCi in #1345
- Enhanced Logging for convert_delta_to_json and convert_text_to_mds by @vanshcsingh in #1366
- Add convert_dataset_hf to CLI by @KuuCi in #1348
- Add missing init by @KuuCi in #1368
- Make ICL dataloaders build lazily by @josejg in #1359
- Add option to unfuse Wqkv by @snarayan21 in #1367
- Add convert_dataset_json to CLI by @KuuCi in #1349
- Add convert_text_to_mds to CLI by @KuuCi in #1352
- Fix hf dataset hang on small dataset by @dakinggg in #1370
- Add LoadPlanner and SavePlanner registries by @irenedea in #1358
- Load config on rank 0 first by @dakinggg in #1371
- Add convert_finetuning_dataset to CLI by @KuuCi in #1354
- Allow for transforms on the model before MLFlow registration by @snarayan21 in #1372
- Allow flash attention up to 3 by @dakinggg in #1377
- Update accelerate requirement from <0.26,>=0.25 to >=0.32.1,<0.33 by @dependabot in #1341
- update runners by @KevDevSha in #1360
- Allow for multiple workers when autopacking by @b-chu in #1375
- Allow train.py-like config for eval.py by @josejg in #1351
- Fix load and save planner config logic by @irenedea in #1385
- Do dtype conversion in torch hook to save memory by @irenedea in #1384
- Get a shared file system safe signal file name by @dakinggg in #1381
- Add transformation method to hf_causal_lm by @irenedea in #1383
- [kushalkodnad/tokenizer-registry] Introduce new registry for tokenizers by @kushalkodn-db in #1386
- Bump transformers version to 4.43.1 by @dakinggg in #1388
- Add convert_delta_to_json to CLI by @KuuCi in #1355
- Revert "Use utils to get shared fs safe signal file name (#1381)" by @dakinggg in #1389
- Avoid race condition in convert text to mds script by @dakinggg in #1390
- Refactor loss function for ComposerMPTCausalLM by @irenedea in #1387
- Revert "Allow for multiple workers when autopacking (#1375)" by @dakinggg in #1392
- Bump transformers to 4.43.2 by @dakinggg in #1393
- Support rope scaling by @milocress in #1391
- Removing the extra LlamaRotaryEmbedding import by @ShashankMosaicML in #1394
- Dtensor oom by @dakinggg in #1395
- Condition the meta initialization for hf_causal_lm on pretrain by @irenedea in #1397
- Fix license link in readme by @dakinggg in #1398
- Enable passing epsilon when building norm layers by @gupta-abhay in #1399
- Add pre register method for mlflow by @dakinggg in #1396
- add it by @dakinggg in #1400
- Remove orig params default by @dakinggg in #1401
- Add spin_dataloaders flag by @dakinggg in #1405
- Remove curriculum learning error when duration less than saved timestamp by @b-chu in #1406
- Set pretrained model name correctly, if provided, in HF Checkpointer by @snarayan21 in #1407
- Enable QuickGelu Function for CLIP models by @gupta-abhay in #1408
- Bump streaming version to v0.8.0 by @mvpatel2000 in #1411
- Kevin/ghcr build by @KevDevSha in #1413
- Update accelerate requirement from <0.33,>=0.25 to >=0.25,<0.34 by @dependabot in #1403
- Update huggingface-hub requirement from <0.24,>=0.19.0 to >=0.19.0,<0.25 by @dependabot in #1379
- Make Pytest log in color in Github Action by @eitanturok in #1412
- Read Package Version Better by @eitanturok in #1415
- Log original config by @josejg in #1410
- Replace pydocstyle with Ruff by @eitanturok in #1417
- test cpu by @KevDevSha in #1416
- Update pr-gpu.yaml by @KevDevSha in #1420
- Additional registry entrypoint documentation by @dakinggg in #1414
- Remove type ignore by @dakinggg in #1421
- Update pytest-cov requirement from <5,>=4 to >=4,<6 by @dependabot in #1423
- Bump onnx from 1.16.1 to 1.16.2 by @dependabot in #1425
- Add transforms to logged config by @b-chu in #1428
- Install all optional dependencies in the docker images by @dakinggg in #1431
- Raise error when not enough data when converting text to MDS by @KuuCi in #1430
- Bump yaml versions by @dakinggg in #1433
- Automatically get the portion of the dataset config that is constructor args by @dakinggg in #1434
- Remove flash patching for HF by @dakinggg in #1436
- Fix the context size in long context gauntlet for wikiqa by @bfontain in #1439
- Update mlflow requirement from <2.15,>=2.14.1 to >=2.14.1,<2.16 by @dependabot in #1424
- Add special errors for bad chat/ift types by @milocress in #1437
- Make autopacking faster by @b-chu in #1435
- Use the pretrained generation config if it exists for HF models by @irenedea in #1440
## New Contributors
- @dependabot made their first contribution in #1327
- @naren-loganathan made their first contribution in #1343
- @vanshcsingh made their first contribution in #1366
- @KevDevSha made their first contribution in #1360
- @kushalkodn-db made their first contribution in #1386
- @gupta-abhay made their first contribution in #1399
- @bfontain made their first contribution in #1439
**Full Changelog**: v0.10.0...v0.11.0