Is this library still useful? #2195

Closed Answered by MaartenGr
damaon asked this question in Q&A
Oct 22, 2024 · 2 comments · 4 replies
No problem! Glad you found the issue.

Though it seems to adjust these clusters after initial clustering still, right?

It doesn't change the clusters themselves, merely their IDs, to make sure that topic 0 is a larger topic than topic 1, etc.
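To illustrate what "only the IDs change" means, here is a minimal sketch of relabeling clusters by size. It is not BERTopic's actual code; the function name `relabel_by_size` and the sample labels are made up for illustration, and treating `-1` as an outlier label that stays untouched is an assumption borrowed from the HDBSCAN convention.

```python
from collections import Counter


def relabel_by_size(labels, outlier=-1):
    """Renumber cluster labels so 0 is the largest cluster, 1 the next, etc.

    Hypothetical helper: cluster membership is untouched, only IDs change.
    """
    # Count cluster sizes, ignoring the (assumed) outlier label.
    counts = Counter(label for label in labels if label != outlier)
    # most_common() orders by descending size; ties keep first-seen order.
    mapping = {old: new for new, (old, _) in enumerate(counts.most_common())}
    mapping[outlier] = outlier  # outliers keep their label
    return [mapping[label] for label in labels]


labels = [2, 2, 2, 0, 1, 1, -1, 2]   # raw labels from some clustering step
relabeled = relabel_by_size(labels)
print(relabeled)  # [0, 0, 0, 2, 1, 1, -1, 0]
```

Documents that shared a cluster before still share one afterwards; the largest group (formerly `2`) simply becomes topic `0`.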

What do vectorizer_model and c-TF-IDF do exactly after clustering?

The default is the CountVectorizer from sklearn, which you can indeed change.

I was also thinking of using a TF-IDF representation somehow to make the clusters more word-based (as opposed to based on semantic similarity), but just normalizing it and appending it to the embeddings before reduction doesn't seem to work, so I hoped that BERTopic had solved this better.

BERTopic doesn't use TF-IDF but a variant, c…

Answer selected by damaon
Category
Q&A
Labels
None yet
2 participants