Representation Model on full Zero-Shot Topic Modeling #2095
marcomarinodev
started this conversation in General
-
You always generate keywords for topics using c-TF-IDF, regardless of the method you use. These keywords can still be optimized with any representation model.
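A small sketch of that point, assuming a recent BERTopic version; `docs` is a placeholder for your own corpus. The base keywords come from c-TF-IDF, and a representation model only refines them, which is also why it can be swapped in after fitting:

```python
from bertopic import BERTopic
from bertopic.representation import KeyBERTInspired

docs = ["..."]  # placeholder: replace with your own list of documents

# Base keywords are always extracted with c-TF-IDF, whatever assignment
# method (clustering or zero-shot matching) produced the topics.
topic_model = BERTopic()
topics, _ = topic_model.fit_transform(docs)

# Optionally fine-tune those c-TF-IDF keywords with any representation model.
topic_model.update_topics(docs, representation_model=KeyBERTInspired())
```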
-
I don't know if I should create a new discussion, but I have another question: does it make sense to use this (https://maartengr.github.io/BERTopic/getting_started/representation/representation.html#zero-shot-classification) if I'm using "full" zero-shot topic modeling? I think this is helpful for topics generated with clustering, right?
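For context, a minimal sketch of the zero-shot-classification representation the linked page describes; the candidate labels and classifier name are illustrative placeholders. It is a representation model that relabels topics after they have been created, which is a separate concept from zero-shot topic modeling itself:

```python
from bertopic import BERTopic
from bertopic.representation import ZeroShotClassification

# Candidate labels the zero-shot classifier can assign to existing topics
candidate_topics = ["space and nasa", "bicycles", "sports"]

# The classifier refines the representation of topics that already exist;
# it does not change how documents are assigned to topics.
representation_model = ZeroShotClassification(
    candidate_topics, model="facebook/bart-large-mnli"
)
topic_model = BERTopic(representation_model=representation_model)
```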
-
Hello everyone, I use BERTopic with a specific `zeroshot_min_similarity` value so that no new topics are generated and therefore no clustering algorithm is run. I thought that the `representation_model` was then "useless", because in the end what BERTopic does is compute the label and document embeddings and compare them with cosine similarity (am I right?). Therefore I tried to set `representation_model=None`, but the quality of the keywords dropped a lot.

So the question is: why is the `representation_model` useful when we are in full zero-shot mode? What does the score for each keyword represent then, if we don't compute any c-TF-IDF score? Thank you.
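A minimal sketch of the setup described above, with illustrative topic labels and threshold; the key difference discussed is `representation_model=None` versus a representation model such as `KeyBERTInspired`:

```python
from bertopic import BERTopic
from bertopic.representation import KeyBERTInspired

docs = ["..."]  # placeholder: your own corpus
zeroshot_topic_list = ["topic A", "topic B"]  # predefined labels

# With a low zeroshot_min_similarity, (nearly) every document matches one of
# the predefined topics, so no clustering is run for leftover documents.
# Per the reply above, keywords are still extracted with c-TF-IDF and can be
# refined by the representation model.
topic_model = BERTopic(
    zeroshot_topic_list=zeroshot_topic_list,
    zeroshot_min_similarity=0.05,
    representation_model=KeyBERTInspired(),  # vs. representation_model=None
)
topics, _ = topic_model.fit_transform(docs)
```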