Update sampler and use LoRA to finetune LLM #444

Open
chengxin0913 opened this issue Sep 25, 2024 · 0 comments
chengxin0913 commented Sep 25, 2024

I'm trying to use finetune_lora.sh to finetune InternLM-XComposer2. Besides finetuning the LLM part with LoRA, I would also like to update the parameters of the sampler. Is changing `--fix_sampler` to `False` the only thing I have to do, or do I need to change something else?
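For reference, this is the kind of sanity check I plan to run after the model is wrapped with LoRA inside finetune.py (the helper below is my own sketch, not code from the repo):

```python
def report_trainable(model):
    """Print which parameters will receive gradients, split into LoRA vs. non-LoRA.
    With --fix_sampler False I would expect the sampler / projection parameters
    to show up in the non-LoRA group alongside the lora_ adapter weights."""
    lora, other = [], []
    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue
        (lora if "lora_" in name else other).append(name)
    print(f"LoRA trainable tensors: {len(lora)}")
    print(f"non-LoRA trainable tensors: {len(other)}")
    for name in other:
        print("  ", name)
```

If the non-LoRA group comes out empty even with `--fix_sampler False`, I assume something else in finetune.py is freezing the sampler again.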
Regarding saving the updated model, I notice that only the LoRA weights are saved. Do I also need to change safe_save_model_for_hf_trainer or get_peft_state_maybe_zero_3 so that the sampler weights get saved as well?
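What I had in mind for saving is roughly the following (my own sketch, not the repo's code; `save_lora_and_sampler` and the file name `non_lora_trainables.bin` are just names I made up):

```python
import os
import torch

def get_non_lora_trainable_state(named_params):
    """Collect trainable weights that are not LoRA adapters (e.g. the sampler),
    so they can be written next to the PEFT checkpoint."""
    return {
        name: param.detach().cpu().clone()
        for name, param in named_params
        if param.requires_grad and "lora_" not in name
    }

def save_lora_and_sampler(trainer, output_dir):
    # LoRA adapter is saved as before via the PEFT model
    trainer.model.save_pretrained(output_dir)
    # additionally dump the non-LoRA trainables (sampler / projection layer)
    non_lora_state = get_non_lora_trainable_state(trainer.model.named_parameters())
    if trainer.args.local_rank in (0, -1):
        torch.save(non_lora_state, os.path.join(output_dir, "non_lora_trainables.bin"))
```

Since I'm running ZeRO-2 (ds_config_zero2.json), the parameters themselves are not sharded, so a plain `.cpu()` copy should be fine; under ZeRO-3 I guess this would need the same gathering that get_peft_state_maybe_zero_3 does. At load time I would restore the extra file with `model.load_state_dict(torch.load(...), strict=False)`.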

Note: the goal is to LoRA-finetune the LLM and also update the sampler (projection layer); the only change from the stock finetune_lora.sh is --fix_sampler False.

If anyone can give some advice, I would be very grateful.

#!/bin/bash
export CUDA_DEVICE_MAX_CONNECTIONS=1
DIR=$(pwd)

export MODEL="internlm/internlm-xcomposer2d5-7b"
export DATA="data.txt"

GPUS_PER_NODE=8
NNODES=1
NODE_RANK=0
MASTER_ADDR=localhost
MASTER_PORT=6001

DISTRIBUTED_ARGS="
    --nproc_per_node $GPUS_PER_NODE
    --nnodes $NNODES
    --node_rank $NODE_RANK
    --master_addr $MASTER_ADDR
    --master_port $MASTER_PORT
"

torchrun $DISTRIBUTED_ARGS finetune.py \
    --model_name_or_path $MODEL \
    --data_path $DATA \
    --given_num True \
    --bf16 True \
    --fix_vit True \
    --fix_sampler False \
    --use_lora True \
    --hd_num 18 \
    --output_dir output/finetune_lora \
    --num_train_epochs 1 \
    --batch_size 2 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 8 \
    --evaluation_strategy "no" \
    --save_strategy "epoch" \
    --save_total_limit 1 \
    --learning_rate 5e-5 \
    --weight_decay 0.1 \
    --adam_beta2 0.95 \
    --warmup_ratio 0.01 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --report_to "none" \
    --max_length 16384 \
    --deepspeed ds_config_zero2.json \
    --gradient_checkpointing True
