Probability for a document after transforming unseen data with the model does not make sense #2104

Answered by MaartenGr
Smit1400 asked this question in Q&A

You are not doing anything wrong; that's just the nature of the underlying embedding model, which tends to produce relatively high similarity scores. Its distribution of similarity scores can be centered towards higher values, so using something like a softmax would be helpful to get a more fine-grained perspective.
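As a rough illustration of that suggestion, the sketch below applies a softmax to a set of mock similarity scores. The `topic_model` and `unseen_docs` names, the example scores, and the temperature value are assumptions for the sketch, not something stated in this thread; with real output you would apply the same transformation to the scores returned by `.transform()`.

```python
import numpy as np
from scipy.special import softmax

# Hypothetical setup (names are assumptions, not from the thread):
# `topic_model` is a fitted topic model and `unseen_docs` are new documents.
# topics, scores = topic_model.transform(unseen_docs)

# Stand-in similarity scores for a single document; note how they all sit
# near the high end of the scale, as described in the answer above.
scores = np.array([0.86, 0.84, 0.90, 0.85])

# A plain softmax turns the scores into a proper probability distribution,
# but because the raw values are so close together it stays near-uniform.
# Dividing by a small temperature (an illustrative choice, not from the
# thread) sharpens the contrast between topics.
temperature = 0.05
probs = softmax(scores / temperature)

print(probs.round(3))  # the topic with score 0.90 now clearly dominates
```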
