Fine-tuning (sft) a model after Pretraining (pt) #4118
-
Is it possible to first pretrain a model on plain text (similar to c4_demo.json) and then fine-tune it on question-answer pairs (similar to alpaca_en_demo.json)? When I reuse the same output directory for fine-tuning and disable overwriting of the output directory, the run fails because the output directory is not empty, instead of loading the pretrained model and continuing with fine-tuning. What should my approach be, given that I have a small SFT dataset and a large unlabelled dataset that is well suited to pretraining?
Replies: 1 comment
-
Do not use the same output directory to continue training; load the pretrained weights through adapter_name_or_path instead, and write the SFT results to a new output directory.
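
For reference, a minimal sketch of what this two-stage LoRA workflow could look like with LLaMA-Factory YAML configs. The stage names (pt, sft), the demo dataset names, and adapter_name_or_path come from the project and this thread; the model name, file names, output paths, and hyperparameter values below are placeholders chosen for illustration, not a recommended recipe.

```yaml
### pretrain_lora.yaml (hypothetical file name)
# Stage 1: continued pretraining (stage: pt) on the large unlabelled corpus.
model_name_or_path: meta-llama/Meta-Llama-3-8B   # placeholder base model
stage: pt
do_train: true
finetuning_type: lora
dataset: c4_demo
cutoff_len: 1024
output_dir: saves/llama3-8b/lora/pretrain        # placeholder path
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0

### sft_lora.yaml (hypothetical file name)
# Stage 2: SFT on the small labelled set. The adapter produced by stage 1 is
# loaded via adapter_name_or_path; output_dir points to a different directory.
model_name_or_path: meta-llama/Meta-Llama-3-8B
adapter_name_or_path: saves/llama3-8b/lora/pretrain
stage: sft
do_train: true
finetuning_type: lora
dataset: alpaca_en_demo
template: llama3
cutoff_len: 1024
output_dir: saves/llama3-8b/lora/sft             # new, empty directory
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
```

Assuming the llamafactory-cli entry point is installed, each stage would then be launched in order, e.g. `llamafactory-cli train pretrain_lora.yaml` followed by `llamafactory-cli train sft_lora.yaml`, and the final SFT adapter can be merged or used for inference afterwards.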