Feat: Add ability to compare models #34
Comments
hi @zippeurfou, thanks so much for putting this together - I'm still grappling with how to handle OS stuff after going deep in the new company. In particular, I have a few meetings lined up to see if we can establish governance for EvalRS and RecList in 2024 - I will update you personally when I have some more clarity around it, hopefully by the end of the month.
Thanks for the update. I'm happy to help either way.
@zippeurfou @jacopotagliabue I have an implemented abstract method which stores the scores outside the new folder & kendall tau implemented. (For same dataset (can be synthetic as well) & different models). I was hoping to make interface for extrenal libraries such as evaluate . Details of kendall tau are similar to this pull request . Let me know if you want me to push these changes or if you want to collaborate , |
@unna97 if you have an example snippet of the method to showcase the developer experience, please open a PR with it and I'll be happy to take a look when I have some time ;-)
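For the sake of discussion, here is a purely hypothetical sketch of what that developer experience could look like: compare two models' scores on the same dataset with a Kendall tau and persist the result to a file outside the run folder. The function name, output path, and persistence scheme are illustrative assumptions, not an existing RecList API.

```python
import json
from scipy.stats import kendalltau

def compare_model_scores(scores_a, scores_b, out_path="kendall_tau_comparison.json"):
    """Kendall tau between two models' scores on the same (possibly synthetic) dataset.

    The result is written to a JSON file outside the run folder so it can be
    reused across runs (hypothetical convention, not a RecList feature).
    """
    tau, p_value = kendalltau(scores_a, scores_b)
    result = {"kendall_tau": float(tau), "p_value": float(p_value)}
    with open(out_path, "w") as f:
        json.dump(result, f)
    return result

# e.g. predicted relevance scores from two models for the same items
print(compare_model_scores([0.9, 0.4, 0.7, 0.1], [0.8, 0.5, 0.3, 0.2]))
```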
This is a placeholder with no clear deliverable :).
Mostly for open convos to get some thoughts on how helpful it would be.
I think reclist would benefit from the ability to compare models in the context of list output.
The reasoning is that reclist currently only lets you compare models at a "pointwise" level (i.e. if you want to use NDCG for two models, you compute it on each model independently and then compare the results), which gives only partial information.
For example, we don't know how different the recommendations actually are in terms of content.
A naive approach, for example, is to compute the Jaccard similarity between the two models' recommendation lists. Other metrics could be weighted Kendall tau correlation or RBO (https://github.com/changyaochen/rbo); see the sketch below.
At a high level, the goal is to answer the question of how different these two models are in terms of content and ranking, rather than comparing metrics computed on each model independently.
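To make this concrete, here is a minimal sketch (not an existing reclist API) of list-level comparison between two models' top-k recommendations for a single user, using plain Python sets for Jaccard and scipy's weightedtau for the weighted Kendall tau. The model names and item ids are made up for illustration; RBO could be plugged in the same way via the linked package.

```python
from scipy.stats import weightedtau

def jaccard(list_a, list_b):
    """Set overlap between two recommendation lists (order ignored)."""
    set_a, set_b = set(list_a), set(list_b)
    return len(set_a & set_b) / len(set_a | set_b)

def weighted_kendall_tau(list_a, list_b):
    """Weighted Kendall tau computed on the items both models recommend."""
    shared = [item for item in list_a if item in set(list_b)]
    if len(shared) < 2:
        return float("nan")
    # Negate rank positions so items near the top of each list get more
    # weight under scipy's default hyperbolic weigher.
    ranks_a = [-list_a.index(item) for item in shared]
    ranks_b = [-list_b.index(item) for item in shared]
    tau, _ = weightedtau(ranks_a, ranks_b)
    return tau

# Top-5 recommendations from two hypothetical models for the same user.
model_a = ["i1", "i2", "i3", "i4", "i5"]
model_b = ["i2", "i1", "i6", "i3", "i7"]

print(jaccard(model_a, model_b))               # ~0.43 -> moderate content overlap
print(weighted_kendall_tau(model_a, model_b))  # rank agreement on shared items
```

These per-user scores would then be aggregated across users, just like the existing pointwise metrics.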
Let me know if people have any thoughts and whether this is an approach we should consider.
If so, I am happy to take a stab at it, but I would welcome some guidance so it follows whatever framework you'd like.