I'm trying to use `finetune_lora.sh` to finetune InternLM-XComposer2. Besides finetuning the LLM part with LoRA, I also want to update the parameters of the sampler. Is changing `--fix_sampler` to `False` the only thing I have to do, or do I need to change something else?

Regarding saving the updated model, I notice that only the LoRA weights are saved. Maybe I need to change `safe_save_model_for_hf_trainer` or `get_peft_state_maybe_zero_3`?

Note: the goal is to finetune the LLM with LoRA and also update the sampler (projection layer).

If anyone can give some advice I would be very grateful.
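To check whether `--fix_sampler False` actually unfreezes the sampler, I am thinking of printing the trainable parameters after the model is built in `finetune.py`, roughly like the sketch below. The assumption that all LoRA weights contain `lora_` in their names is mine and should be checked against the model code:

```python
# Sketch: print which parameters will actually be trained. I plan to call this in
# finetune.py after LoRA is applied and after fix_vit / fix_sampler take effect.
def print_trainable_parameters(model):
    trainable, total = 0, 0
    for name, param in model.named_parameters():
        total += param.numel()
        if param.requires_grad:
            trainable += param.numel()
            if "lora_" not in name:
                # Anything printed here is a full (non-LoRA) weight being updated,
                # e.g. the sampler / projection layer if --fix_sampler False took effect.
                print("non-LoRA trainable:", name)
    print(f"trainable params: {trainable:,} / {total:,} ({100 * trainable / total:.2f}%)")
```

If the sampler parameters do not show up as trainable, I suspect I would also have to set `requires_grad = True` on them manually after the LoRA wrapping, since wrapping with PEFT normally freezes all non-LoRA weights.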
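For saving, my current idea, borrowed from similar finetuning scripts, is to collect the trainable non-LoRA weights and write them next to the adapter. This is only a sketch assuming ZeRO-2 (as in `ds_config_zero2.json`, where parameters are not partitioned); with ZeRO-3 the parameters would have to be gathered first, which is what the `maybe_zero_3` helpers seem to be for. The function name and the file name `non_lora_trainables.bin` are my own, not part of the repo:

```python
import os

import torch


def get_non_lora_trainable_state(named_params):
    # Collect weights that are trainable but are not LoRA adapter weights
    # (e.g. the sampler / projection layer), moved to CPU for saving.
    return {
        name: param.detach().cpu().clone()
        for name, param in named_params
        if param.requires_grad and "lora_" not in name
    }


def save_lora_and_extras(trainer, output_dir):
    # Save the LoRA adapter the usual way, plus any extra trainable weights
    # in a separate file so the standard adapter files stay untouched.
    trainer.model.save_pretrained(output_dir)
    extras = get_non_lora_trainable_state(trainer.model.named_parameters())
    if trainer.args.local_rank in (-1, 0) and extras:
        torch.save(extras, os.path.join(output_dir, "non_lora_trainables.bin"))
```

Here is my current `finetune_lora.sh`: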
```bash
#!/bin/bash
export CUDA_DEVICE_MAX_CONNECTIONS=1
DIR=`pwd`

export MODEL="internlm/internlm-xcomposer2d5-7b"
export DATA="data.txt"

GPUS_PER_NODE=8
NNODES=1
NODE_RANK=0
MASTER_ADDR=localhost
MASTER_PORT=6001

DISTRIBUTED_ARGS="
    --nproc_per_node $GPUS_PER_NODE
    --nnodes $NNODES
    --node_rank $NODE_RANK
    --master_addr $MASTER_ADDR
    --master_port $MASTER_PORT
"

# --fix_sampler is set to False so that the sampler (projection layer) is trained as well
torchrun $DISTRIBUTED_ARGS finetune.py \
    --model_name_or_path $MODEL \
    --data_path $DATA \
    --given_num True \
    --bf16 True \
    --fix_vit True \
    --fix_sampler False \
    --use_lora True \
    --hd_num 18 \
    --output_dir output/finetune_lora \
    --num_train_epochs 1 \
    --batch_size 2 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 8 \
    --evaluation_strategy "no" \
    --save_strategy "epoch" \
    --save_total_limit 1 \
    --learning_rate 5e-5 \
    --weight_decay 0.1 \
    --adam_beta2 0.95 \
    --warmup_ratio 0.01 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --report_to "none" \
    --max_length 16384 \
    --deepspeed ds_config_zero2.json \
    --gradient_checkpointing True
```
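At inference time I assume the extra weights saved this way would need to be loaded back on top of the base model and the LoRA adapter, roughly like this (again, `non_lora_trainables.bin` is my hypothetical file name and the loading flow is only a guess):

```python
import os

import torch
from peft import PeftModel
from transformers import AutoModel

ckpt_dir = "output/finetune_lora"
base = AutoModel.from_pretrained(
    "internlm/internlm-xcomposer2d5-7b",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base, ckpt_dir)  # loads the saved LoRA adapter

extras_path = os.path.join(ckpt_dir, "non_lora_trainables.bin")
if os.path.exists(extras_path):
    extras = torch.load(extras_path, map_location="cpu")
    # The names were recorded from the PEFT-wrapped training model, so loading
    # into the PeftModel with strict=False should only touch the saved weights.
    model.load_state_dict(extras, strict=False)
```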