Kurtis

Kurtis is an experimental fine-tuning, evaluation, and inference tool for small language models (SLMs), such as Hugging Face's SmolLM2.

Screenshots: Kurtis in action (evaluation and chat).

The final Kurtis model lets users ask questions about mental health topics. However, please note the following disclaimer:

Disclaimer

  • Kurtis is not a substitute for professional mental health therapy or advice.
  • The responses generated by this model may be biased or inaccurate.
  • For any serious or urgent mental health concerns, please consult with a licensed professional.
  • Kurtis is intended as a supportive tool for casual conversations, not for diagnosing or treating any mental health conditions.

Getting Started

Prerequisites

  • uv for dependency management.
  • Python 3.10 or higher.
  • Docker (optional, for the containerized workflow).

Installation

Install the necessary dependencies using uv:

uv sync
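
This installs the project dependencies into a uv-managed virtual environment, which is why the remaining commands are invoked through uv run. As a quick sanity check of the installation, you can print the CLI help described later in this README:

uv run kurtis --help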

Usage

You can use Kurtis either to train the model or to start a chat session with it.

Train the Model

To train the model using the provided configuration:

uv run kurtis --train --config-module kurtis.config.default
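
Training consumes the QA datasets, so the preprocessing step presumably has to run first (see --preprocessing in the options below); a typical sequence would be:

uv run kurtis --preprocessing --config-module kurtis.config.default
uv run kurtis --train --config-module kurtis.config.default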

Start a Chat Session

To start a conversation with the Kurtis model:

uv run kurtis --chat --config-module kurtis.config.default
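
The -o/--output-dir option described below is used to save or load the model and checkpoints, so a model trained into a custom directory can presumably be loaded back for chat; the path below is only an illustrative placeholder:

uv run kurtis --chat --config-module kurtis.config.default -o ./my-kurtis-model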

Command-Line Options

You can view all available command-line options using the --help flag:

uv run kurtis --help

The output will display:

Usage: kurtis [OPTIONS]

  Main function to handle training and interaction with the Kurtis model.

Options:
  --preprocessing           Pre-process the QA datasets.
  --train                   Train the model using QA datasets
  --chat                    Interact with the trained model.
  --eval-model              Evaluate model.
  --push-model              Push model to huggingface.
  --push-datasets           Push datasets to huggingface.
  -o, --output-dir TEXT     Directory to save or load the model and
                            checkpoints
  -c, --config-module TEXT  Kurtis python config module.
  --debug                   Enable debug mode for verbose output
  --help                    Show this message and exit.
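
Putting the flags together, a complete pass over the pipeline could look like the sketch below; the output directory is only a placeholder, and pushing to Hugging Face presumably requires valid credentials:

uv run kurtis --preprocessing --config-module kurtis.config.default
uv run kurtis --train --config-module kurtis.config.default -o ./output
uv run kurtis --eval-model --config-module kurtis.config.default -o ./output
uv run kurtis --push-model --config-module kurtis.config.default -o ./output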

Makefile

A Makefile is included to help automate common development and testing tasks.

make preprocessing  - Preprocess the data using a pre-trained LLM.
make train          - Train the model.
make chat           - Start a prompt session with the model.
make install        - Install project dependencies using uv.
make cleanup        - Remove all files in the output directory with confirmation.
make eval_model     - Evaluate the model.
make push           - Push datasets to Hugging Face.
make docker_build   - Build the Docker image for the project.
make docker_push    - Push the Docker image to the registry.
make docker_run     - Run the Docker container with output mounted.
make docker_train   - Run the training script inside the Docker container.
make docker_chat    - Start a prompt session inside the Docker container.
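
For a fully containerized workflow, the Docker targets can be chained; for example, assuming Docker is installed and the image builds cleanly:

make docker_build
make docker_train
make docker_chat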

Evaluation Results

Evaluation Dataset

Metric      SmolLM2-1.7B   SmolLM2-360M   Qwen2.5-0.5B
ROUGE-2     0.0377         0.0298         0.0274
Accuracy    81.81%         80.46%         80.80%
F1 Score    86.28%         85.35%         85.45%
Precision   91.30%         90.91%         90.70%
Recall      81.81%         80.46%         80.80%
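
These figures can presumably be reproduced with the evaluation entry points described above, for example:

uv run kurtis --eval-model --config-module kurtis.config.default

or, equivalently, make eval_model.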

License

This project is licensed under the MIT License - see the LICENSE file for details.