Fine-tuning (sft) a model after Pretraining (pt) #4118
-
Is it possible to first pretrain a model on plain text (similar to c4_demo.json) and then fine-tune it on question-answer pairs (similar to alpaca_en_demo.json)? When I reuse the same output directory for fine-tuning and disable overwriting of the output directory, the run fails because the output directory is not empty, instead of loading the pretrained model and continuing with fine-tuning. What should my approach be, given that I have a small SFT dataset and a large unlabelled dataset that is well suited to pretraining?
Replies: 1 comment
-
Do not use the same output directory to continue training; load the pretrained weights through adapter_name_or_path instead, and write the SFT results to a new output directory.
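
For reference, a minimal sketch of what this two-stage LoRA workflow could look like with LLaMA-Factory YAML configs. The stage names (pt, sft), the demo dataset names, and adapter_name_or_path come from the project and this thread; the model name, file names, output paths, and hyperparameter values below are placeholders chosen for illustration, not a recommended recipe.

```yaml
### pretrain_lora.yaml (hypothetical file name)
# Stage 1: continued pretraining (stage: pt) on the large unlabelled corpus.
model_name_or_path: meta-llama/Meta-Llama-3-8B   # placeholder base model
stage: pt
do_train: true
finetuning_type: lora
dataset: c4_demo
cutoff_len: 1024
output_dir: saves/llama3-8b/lora/pretrain        # placeholder path
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0

### sft_lora.yaml (hypothetical file name)
# Stage 2: SFT on the small labelled set. The adapter produced by stage 1 is
# loaded via adapter_name_or_path; output_dir points to a different directory.
model_name_or_path: meta-llama/Meta-Llama-3-8B
adapter_name_or_path: saves/llama3-8b/lora/pretrain
stage: sft
do_train: true
finetuning_type: lora
dataset: alpaca_en_demo
template: llama3
cutoff_len: 1024
output_dir: saves/llama3-8b/lora/sft             # new, empty directory
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
```

Assuming the llamafactory-cli entry point is installed, each stage would then be launched in order, e.g. `llamafactory-cli train pretrain_lora.yaml` followed by `llamafactory-cli train sft_lora.yaml`, and the final SFT adapter can be merged or used for inference afterwards.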