Representation Model on full Zero-Shot Topic Modeling #2095
marcomarinodev
started this conversation in General
-
You always generate keywords for topics using c-TF-IDF, regardless of the method you use. These keywords can still be optimized with any representation model.
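A small sketch of that point, assuming a recent BERTopic version; `docs` is a placeholder for your own corpus. The base keywords come from c-TF-IDF, and a representation model only refines them, which is also why it can be swapped in after fitting:

```python
from bertopic import BERTopic
from bertopic.representation import KeyBERTInspired

docs = ["..."]  # placeholder: replace with your own list of documents

# Base keywords are always extracted with c-TF-IDF, whatever assignment
# method (clustering or zero-shot matching) produced the topics.
topic_model = BERTopic()
topics, _ = topic_model.fit_transform(docs)

# Optionally fine-tune those c-TF-IDF keywords with any representation model.
topic_model.update_topics(docs, representation_model=KeyBERTInspired())
```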
-
I don't know if I should create a new discussion, but I have another question: does it make sense to use this (https://maartengr.github.io/BERTopic/getting_started/representation/representation.html#zero-shot-classification) if I'm using "full" zero-shot topic modeling? I think this is helpful for topics generated with clustering, right?
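For context, a minimal sketch of the zero-shot-classification representation the linked page describes; the candidate labels and classifier name are illustrative placeholders. It is a representation model that relabels topics after they have been created, which is a separate concept from zero-shot topic modeling itself:

```python
from bertopic import BERTopic
from bertopic.representation import ZeroShotClassification

# Candidate labels the zero-shot classifier can assign to existing topics
candidate_topics = ["space and nasa", "bicycles", "sports"]

# The classifier refines the representation of topics that already exist;
# it does not change how documents are assigned to topics.
representation_model = ZeroShotClassification(
    candidate_topics, model="facebook/bart-large-mnli"
)
topic_model = BERTopic(representation_model=representation_model)
```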
-
Hello everyone, I use BERTopic with a specific `zeroshot_min_similarity` value so that no new topics are generated and therefore no clustering algorithm is run. I thought that the `representation_model` was then "useless", because in the end what BERTopic does is compute the label and document embeddings and compare them with cosine similarity (am I right?). Therefore I tried to set `representation_model=None`, but the quality of the keywords dropped a lot.

So the question is: why is the `representation_model` useful when we are in full zero-shot mode? What does the score for each keyword represent then, if we don't compute any c-TF-IDF score? Thank you.
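A minimal sketch of the setup described above, with illustrative topic labels and threshold; the key difference discussed is `representation_model=None` versus a representation model such as `KeyBERTInspired`:

```python
from bertopic import BERTopic
from bertopic.representation import KeyBERTInspired

docs = ["..."]  # placeholder: your own corpus
zeroshot_topic_list = ["topic A", "topic B"]  # predefined labels

# With a low zeroshot_min_similarity, (nearly) every document matches one of
# the predefined topics, so no clustering is run for leftover documents.
# Per the reply above, keywords are still extracted with c-TF-IDF and can be
# refined by the representation model.
topic_model = BERTopic(
    zeroshot_topic_list=zeroshot_topic_list,
    zeroshot_min_similarity=0.05,
    representation_model=KeyBERTInspired(),  # vs. representation_model=None
)
topics, _ = topic_model.fit_transform(docs)
```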