
Train a model for a new language #200

Open
boyu9 opened this issue Dec 4, 2024 · 6 comments

boyu9 commented Dec 4, 2024

I want to train a model on a new programming language. Without fine-tuning, the model cannot produce usable output at all, because the language belongs to an internal front-end framework and the open-source model has no corresponding corpus. I would like to fine-tune Qwen-2.5-Coder-32B so that the generated component code complies with the specifications in our internal framework documentation and can be used for real code writing. How should Qwen-2.5-Coder-32B be trained for this: do we need to pretrain, or is it enough to just fine-tune the base model?

cyente (Collaborator) commented Dec 5, 2024

https://github.com/QwenLM/Qwen2.5-Coder/tree/main/finetuning

Here are our fine-tuning scripts; you can try them.

Whether to do pretraining depends on your needs and resources. We advise you to try fine-tuning first. Hoping to hear about your successful implementation on Qwen-Coder :)
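For the SFT stage, data preparation is usually the main work. Below is a minimal sketch of building chat-style training examples for an internal framework and sanity-checking them with the model's chat template; the jsonl schema, file names, and example content are assumptions for illustration, so check the finetuning directory's README for the exact format its scripts expect.

```python
# Sketch only: build chat-style SFT examples for the internal framework.
# The jsonl schema, file names, and example content are assumptions for
# illustration; the repo's finetuning scripts may expect a different format.
import json
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-32B-Instruct")

examples = [
    {
        "messages": [
            {"role": "system", "content": "You write components for our internal front-end framework."},
            {"role": "user", "content": "Create a basic button component that follows the framework spec."},
            {"role": "assistant", "content": "<component code that follows the internal spec>"},
        ]
    },
]

# Write one example per line so the file can be streamed during training.
with open("sft_train.jsonl", "w", encoding="utf-8") as f:
    for ex in examples:
        f.write(json.dumps(ex, ensure_ascii=False) + "\n")

# Sanity check: render the conversation with the model's chat template
# to confirm the roles and formatting look right before training.
print(tokenizer.apply_chat_template(examples[0]["messages"], tokenize=False))
```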

boyu9 (Author) commented Dec 5, 2024

Thank you for your reply. If I want to try fine-tuning Qwen-Coder, can I do it in two steps? The first step would teach the model the basic grammar of the new language (grammar-knowledge fine-tuning), and the second step would be instruction fine-tuning. Could you give me some training suggestions for this approach? Thanks.

CSJianYang (Collaborator) commented:
You can try using the lower-quality data in the first stage and the high-quality data in the second SFT stage. It may bring more improvement (https://arxiv.org/abs/2412.05210).
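If it helps, here is a minimal sketch of that two-stage split, assuming each example carries some quality signal; the "quality" field and the 0.8 threshold below are hypothetical placeholders, and any signal such as manual review or linting against the framework spec would work just as well.

```python
# Sketch only: partition a jsonl corpus into a broad first-stage set and a
# curated second-stage SFT set. The "quality" field and 0.8 threshold are
# hypothetical; substitute whatever quality signal you actually have.
import json

with open("all_examples.jsonl", encoding="utf-8") as src, \
     open("stage1_broad.jsonl", "w", encoding="utf-8") as stage1, \
     open("stage2_curated.jsonl", "w", encoding="utf-8") as stage2:
    for line in src:
        example = json.loads(line)
        target = stage2 if example.get("quality", 0.0) >= 0.8 else stage1
        target.write(json.dumps(example, ensure_ascii=False) + "\n")
```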

boyu9 (Author) commented Dec 11, 2024

Okay, thank you for the suggestion. I also have a question: should we use full-parameter fine-tuning or LoRA fine-tuning? Our GPU resources are currently limited, so I plan to use LoRA, but I'm not sure how the performance will compare.

cyente (Collaborator) commented Dec 16, 2024

Both ways are OK; I'm not sure either :(

Waiting for your feedback~
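For reference, a LoRA run with peft trains only small adapter matrices, so the memory cost is far below full-parameter fine-tuning of a 32B model. A minimal sketch is below; the rank, alpha, and target modules are common defaults rather than values taken from the repo's scripts, so adjust them to your GPU budget.

```python
# Sketch only: attach LoRA adapters to the Qwen2.5-Coder base model with peft.
# r, lora_alpha, and target_modules are common defaults, not values from the
# repo's finetuning scripts; tune them to fit your hardware.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-32B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```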

boyu9 (Author) commented Dec 16, 2024

OK, thank you. If I do pre-training, is the SFT finetune script suitable for it? I see that the repository only provides the finetune script. Can this script be used for pre-training?
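For what it's worth, continued pretraining is just next-token prediction on raw text, so even without a dedicated script a standard causal-LM training loop covers it. A minimal sketch with the Hugging Face Trainer is below; the corpus file name, sequence length, and hyperparameters are placeholders, not values from the repo.

```python
# Sketch only: continued pretraining as plain causal-LM training on raw text.
# File names and hyperparameters are placeholders; this is not the repo's script.
import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "Qwen/Qwen2.5-Coder-32B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# Raw framework docs and code, one document per line.
dataset = load_dataset("text", data_files={"train": "framework_corpus.txt"})
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=2048),
    batched=True,
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="pretrain_out",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        num_train_epochs=1,
        bf16=True,
    ),
    train_dataset=dataset["train"],
    # mlm=False gives the standard next-token (causal LM) objective.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```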
