GitHub - neuml/txtinstruct: 📚 Datasets and models for instruction-tuning

Datasets and models for instruction-tuning

txtinstruct is a framework for training instruction-tuned models.

The objective of this project is to support open data, open models, and integration with your own data. One of the biggest problems today is the lack of licensing clarity around instruction-following datasets and large language models. txtinstruct makes it easy to build your own instruction-following datasets and use those datasets to train instruction-tuned models.

txtinstruct is built with Python 3.8+ and txtai.

Installation

The easiest way to install is via pip and PyPI:

pip install txtinstruct

You can also install txtinstruct directly from GitHub. Using a Python virtual environment is recommended.

pip install git+https://github.com/neuml/txtinstruct

Python 3.8+ is supported.
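After installing, it can help to confirm that the interpreter meets the 3.8+ requirement and that pip actually registered the package. The following is a minimal sketch using only the Python standard library; the helper names `python_supported` and `installed_version` are hypothetical and are not part of txtinstruct.

```python
import sys
from importlib import metadata  # available in the standard library since Python 3.8

# Minimum interpreter version stated in this README
MIN_PYTHON = (3, 8)


def python_supported(version_info=sys.version_info):
    """Return True when the given interpreter version is 3.8 or newer."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def installed_version(package):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python supported:", python_supported())
    print("txtinstruct version:", installed_version("txtinstruct") or "not installed")
```

If the second line reports "not installed", the pip command most likely ran against a different interpreter or virtual environment than the one executing this script.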

See this link to help resolve environment-specific install issues.

Examples

The following example notebooks show how to build models with txtinstruct.

Notebook	Description
Introducing txtinstruct	Build instruction-tuned datasets and models

