I am fine-tuning a conversational model on a domain-specific dataset. The dataset consists of structured dialog messages, and each conversation is labeled with a KTO tag (true/false). Until now I only had positive examples, but now I want to introduce negative samples.
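For concreteness, here is a rough sketch of what one of my records could look like, assuming a TRL-style KTO format with "prompt", "completion", and a boolean "label" field (true = desirable, false = undesirable); the exact field names and string contents are illustrative, not my real data:

```python
# One KTO-style record: a prompt, the model's completion (here, a function
# call), and a binary label saying whether that completion is desirable.
# Field names follow the TRL KTOTrainer convention; contents are made up.
positive_example = {
    "prompt": "User: Find me a pharmacy near Main Street that's open right now.",
    "completion": '<function_call>{"name": "search_address", '
                  '"arguments": {"query": "pharmacy near Main Street", '
                  '"open_at": "now"}}</function_call>',
    "label": True,  # correct: a pharmacy has opening hours, so open_at makes sense
}
```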
The LLM can use a couple of different functions, and one of those function calls is search_address, which has some required arguments the model has already learned from the structured dialog dataset. For some addresses there is an additional optional argument, open_at, similar to how Google lets you search for locations that are currently open.
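To make the setup clearer, here is a hypothetical schema for the tool, written in the common OpenAI-style function-calling format; everything except the search_address and open_at names is my assumption for illustration:

```python
# Hypothetical tool definition for search_address. Only "query" is required;
# "open_at" is optional and only meaningful for places with opening hours.
search_address_tool = {
    "name": "search_address",
    "description": "Search for an address or place.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "Free-text address or place query.",
            },
            "open_at": {
                "type": "string",
                "description": "Optional; only meaningful for businesses with "
                               "opening hours, e.g. 'now' or a timestamp.",
            },
        },
        "required": ["query"],
    },
}
```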
For some locations, though, the argument doesn't make sense, e.g. streets or public locations in general. So I tried to introduce negative samples to train the model not to add open_at for those. But I can't seem to get the model to learn when to include it and when not to. My idea is that if I take a positive example and place its opposite (the negative sample) right after it, the model will learn the difference between right and wrong from the counterexamples (see the sketch below). I am not sure whether this works, though, and it also doubles my dataset, because every example gets a paired negative sample.
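Here is a minimal sketch of the paired construction I mean: the same prompt once with the desirable completion (label true) and once with the undesirable variant that wrongly adds open_at to a street query (label false). Again, the field names and contents are illustrative:

```python
# The same prompt appears twice: once labeled desirable, once undesirable.
# The only difference between the two completions is the open_at argument.
prompt = "User: Where is Baker Street?"

good = {
    "prompt": prompt,
    "completion": '<function_call>{"name": "search_address", '
                  '"arguments": {"query": "Baker Street"}}</function_call>',
    "label": True,   # a street has no opening hours, so open_at is omitted
}

bad = {
    "prompt": prompt,
    "completion": '<function_call>{"name": "search_address", '
                  '"arguments": {"query": "Baker Street", "open_at": "now"}}'
                  '</function_call>',
    "label": False,  # same call, but with the nonsensical open_at argument
}

paired_dataset = [good, bad]  # doubles the sample count per example
```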
So my question is: do you think this will work, or is there a better way to teach the model when to use this argument?