During topic learning, one needs to supply `W: int`, the size of the vocabulary.
I tried to work out the meaning of W from Algorithm 1 (the Gibbs sampling algorithm for BTM) in the paper "BTM: Topic Modeling over Short Texts", but W is not listed as an input there. It appears to be data-dependent, so am I correct in assuming that W means the number of unique terms in the cleaned and preprocessed corpus? If so, is there a reason W is not computed automatically from the corpus file docs_pt? I'm afraid I am missing something, hence my question.
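For illustration, here is a minimal sketch of how W could be derived from the corpus file, assuming docs_pt contains one document per line with whitespace-separated tokens (the function name and file layout are my assumptions, not part of the library):

```python
def vocab_size(path):
    """Count unique terms in a docs_pt-style corpus file.

    Assumes one document per line, tokens separated by whitespace.
    """
    vocab = set()
    with open(path, encoding="utf-8") as f:
        for line in f:
            vocab.update(line.split())  # collect every distinct token
    return len(vocab)
```

If this assumption holds, W could simply be set to `vocab_size("docs_pt")` rather than supplied by hand.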
Thank you.