Name		Name	Last commit message	Last commit date
parent directory ..
dataset		dataset
peft		peft
scripts_for_CR		scripts_for_CR
scripts_for_MR		scripts_for_MR
.DS_Store		.DS_Store
README.md		README.md
commonsense_170k.json		commonsense_170k.json
commonsense_evaluate.py		commonsense_evaluate.py
export_hf_checkpoint.py		export_hf_checkpoint.py
export_state_dict_checkpoint.py		export_state_dict_checkpoint.py
finetune.py		finetune.py
generate.py		generate.py
lengths.ipynb		lengths.ipynb
math_10k.json		math_10k.json
math_evaluate.py		math_evaluate.py
mathqa.py		mathqa.py
multi_dataset_eval.py		multi_dataset_eval.py
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt

README.md

Adapting Pre-trained Models for Commonsense Reasoning and Math Reasoning Tasks

This folder contains the implementations for commonsense reasoning and math reasoning tasks.

Setup Environment

Create and Activate the Conda Environment

conda create -n Reasoning python=3.10
conda activate Reasoning

Install the Pre-requisites

pip install -r requirements.txt

Code Structure

tuners/ contains the PEFT methods used for this task. For more PEFT methods, you can use the implementations in loralib/ to modify the corresponding codes.
dataset/ contains the datasets for testing.
peft/ contains all the codes related to PEFT.
scripts_for_../ contains the training and evaluation scripts for different tasks.
finetune is the training code.
commonsense_evaluate is the evaluation code for commonsense reasoning tasks.
math_evaluate is the evaluation code for math reasoning tasks.

Adapting LLAMA / LLAMA2 / LLAMA3 for CR Tasks

Example: Fine-tuning and Evaluating LLAMA with LoRA

# Fine-tuning
# llama.sh

CUDA_VISIBLE_DEVICES=$4 python finetune.py \
    --base_model 'yahma/llama-7b-hf' \
    --data_path 'commonsense_170k.json' \
    --output_dir $3 \
    --batch_size 16  --micro_batch_size 16 --num_epochs 3 \
    --learning_rate 2e-4 --cutoff_len 256 --val_set_size 120 \
    --eval_step 80 --save_step 80  --adapter_name lora \
    --target_modules '["q_proj", "k_proj", "v_proj", "up_proj", "down_proj"]' \
    --lora_r $1 --lora_alpha $2 --use_gradient_checkpointing

sh llama.sh 32 64 ./finetune/lora_r=32/ 0

Hyper-parameter Setup

$1: the rank of LoRA.
$2: the corresponding alpha of LoRA.
$3: where to save the fine-tuned model.
$4: GPU number.
--adapter_name: the method used for fine-tuning.
--target_modules: which modules for fine-tuning.

# Evaluating
# part of llama_eval.sh

CUDA_VISIBLE_DEVICES=$2 python commonsense_evaluate.py \
    --model LLaMA-7B \
    --adapter LoRA \
    --dataset boolq \
    --base_model 'yahma/llama-7b-hf' \
    --batch_size 1 \
    --lora_weights $1|tee -a $1/boolq.txt

    ...
    ...

sh llama_eval.sh ./finetune/lora_r=32/ 0

Hyper-parameter Setup

$1: the location of fine-tuned weights
$2: GPU number

Example: Fine-tuning and Evaluating LLAMA2 with DoRA

sh llama2.sh 32 64 ./finetune/dora_r=32/ 0

sh llama2_eval.sh ./finetune/dora_r=32/ 0

Hyper-parameter Setup

See first example for details on hyper-parameters.

Example: Fine-tuning and Evaluating LLAMA3 with LoRA-Dash

First, please replace the codes in lora.py with those in lora-dash.py. Then, run the scripts as follows:

sh llama3.sh 32 64 ./finetune/lora-dash_r=32/ 0

sh llama3_eval.sh ./finetune/lora-dash_r=32/ 0

For other methods, please follow a similar pipeline.

Hyper-parameter Setup

See first example for details on hyper-parameters.

Adapting LLAMA / BLOOMz / GPT for MR Tasks

The setup for the MR task is basically the same as for the CR task. You can directly run the corresponding script.

GPU Memory

Considering that you may have different computing resources, we provide a simple table of the GPU resources required for the different models.

Model Name	GPU Requirement
LLAMA	A100 (80G)
LLAMA2	A100 (80G)
LLAMA3	A100 (80G)
BLOOMz	A100 (80G)
GPT	A100 (80G)

Q&A

If you encounter any issues, please refer to these two links: LLM-Adapter and DoRA. These resources cover the majority of problems you may encounter. Additionally, you are also welcome to contact us by submitting an issue or via email.

Acknowledgement

This directory is modified based on LLM-Adapter and DoRA. We greatly appreciate their remarkable works.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CR_MR

CR_MR

README.md

Adapting Pre-trained Models for Commonsense Reasoning and Math Reasoning Tasks

Setup Environment

Create and Activate the Conda Environment

Install the Pre-requisites

Code Structure

Adapting LLAMA / LLAMA2 / LLAMA3 for CR Tasks

Adapting LLAMA / BLOOMz / GPT for MR Tasks

GPU Memory

Q&A

Acknowledgement

Files

CR_MR

Directory actions

More options

Directory actions

More options

Latest commit

History

CR_MR

Folders and files

parent directory

README.md

Adapting Pre-trained Models for Commonsense Reasoning and Math Reasoning Tasks

Setup Environment

Create and Activate the Conda Environment

Install the Pre-requisites

Code Structure

Adapting LLAMA / LLAMA2 / LLAMA3 for CR Tasks

Adapting LLAMA / BLOOMz / GPT for MR Tasks

GPU Memory

Q&A

Acknowledgement