
MPT-7B-8k models

MPT-7B-8k is a family of 7B-parameter open-source LLMs with an 8k context length, trained with the MosaicML platform. It contains two models, both of which can be used commercially:

  • MPT-7B-8k: A decoder-style transformer pretrained starting from MPT-7B, but updating the sequence length to 8k and training for an additional 500B tokens, resulting in a total of 1.5T tokens of text and code. License: CC-BY-SA-3.0
  • MPT-7B-8k-Instruct: A model for long-form instruction following (especially summarization and question answering). Built by finetuning MPT-7B-8k on several carefully curated datasets. License: CC-BY-SA-3.0
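
To try either checkpoint, the sketch below shows one way to load it with Hugging Face `transformers`. The Hub names (`mosaicml/mpt-7b-8k`, `mosaicml/mpt-7b-8k-instruct`) and the `trust_remote_code=True` flag are assumptions based on how other MPT releases are packaged, not something stated in this README.

```python
# Minimal sketch: loading MPT-7B-8k with Hugging Face transformers.
# Assumes the checkpoints are published on the Hub as mosaicml/mpt-7b-8k
# and mosaicml/mpt-7b-8k-instruct; trust_remote_code=True is needed
# because MPT uses a custom model class.
import torch
import transformers

name = "mosaicml/mpt-7b-8k"  # or "mosaicml/mpt-7b-8k-instruct"

tokenizer = transformers.AutoTokenizer.from_pretrained(name)
model = transformers.AutoModelForCausalLM.from_pretrained(
    name,
    torch_dtype=torch.bfloat16,  # half precision to reduce memory use
    trust_remote_code=True,
)
model.eval()
```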

MPT-7B-8k FAQ

When would I choose…

  • MPT-7B-8k over MPT-7B? Use MPT-7B-8k in most cases, except when coding or reasoning ability is the only criterion, in which case you should evaluate both models.
  • MPT-7B-8k-Instruct over MPT-7B-Instruct? MPT-7B-8k-Instruct excels at long-form instruction following; use it when your inputs are longer than 2048 tokens or for summarization and question answering (see the sketch below).
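
As an illustration of the long-input use case, the sketch below summarizes a long document with MPT-7B-8k-Instruct, reusing the model and tokenizer loaded above. The file name is hypothetical, and the Alpaca-style prompt template is an assumption carried over from MPT-7B-Instruct; check the model card for the exact format.

```python
# Minimal sketch: long-form summarization with MPT-7B-8k-Instruct,
# using the model/tokenizer loaded in the snippet above (with
# name = "mosaicml/mpt-7b-8k-instruct").
import torch

# Hypothetical input file, e.g. several thousand tokens long.
document = open("long_report.txt").read()

# Assumed Alpaca-style prompt format; verify against the model card.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n"
    "### Instruction:\nSummarize the following document:\n\n"
    f"{document}\n\n### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens after the prompt.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```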