cwenhaw/topic-models

NMF topic models

Topic modeling with non-negative matrix factorization (NMF)

Minimizes the following loss function using multiplicative updates:

Let D be the number of documents and V the vocabulary size. X is the (D x V) data matrix: each row is a document and each column is a feature, e.g. a textual or visual word. W is the document-topic matrix of dimension (D x K), where K is the number of topics. H is the topic-word matrix of dimension (K x V).
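To make the shapes and the update style concrete, here is a minimal sketch of multiplicative updates. It assumes the common squared Frobenius loss ||X - WH||^2 with the standard Lee-Seung updates, which may differ from the loss this repository minimizes. The function name `nmf_multiplicative` and its parameters are hypothetical and are not part of this repository. The sketch also assumes a dense X.

```python
import numpy as np

def nmf_multiplicative(X, K, n_iter=200, eps=1e-10, seed=0):
    """Hypothetical helper: factorize X (D x V) into W (D x K) and H (K x V).

    Assumes the squared Frobenius loss ||X - W H||^2; this is NOT
    necessarily the loss used by JAL_NMF in this repository.
    """
    rng = np.random.default_rng(seed)
    D, V = X.shape
    # Strictly positive random initialization keeps the factors non-negative.
    W = rng.random((D, K)) + eps
    H = rng.random((K, V)) + eps
    for _ in range(n_iter):
        # Lee-Seung multiplicative updates; eps guards against division by zero.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ (H @ H.T) + eps)
    return W, H

# Toy usage: D = 6 documents, V = 8 features, K = 3 topics.
rng = np.random.default_rng(1)
X_toy = rng.random((6, 3)) @ rng.random((3, 8))
W_toy, H_toy = nmf_multiplicative(X_toy, K=3)
print(W_toy.shape, H_toy.shape)
```

Because each update only multiplies by a non-negative ratio, W and H stay non-negative without any explicit projection step.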

The NMF itself is performed by the function JAL_NMF. See toy_demo.py for example usage on synthetic toy data with a dense X. For real data, X should be a sparse matrix, e.g. scipy.sparse.csr_matrix.
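As an illustration of the sparse format, the sketch below builds a small scipy.sparse.csr_matrix from (row, column, count) triplets and multiplies it by a dense topic-word matrix. The counts and shapes are invented for the example, and JAL_NMF is deliberately not called because its signature is not documented here.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Made-up word counts: (document index, word index, count).
rows = np.array([0, 0, 1, 2, 2])
cols = np.array([0, 3, 1, 2, 3])
counts = np.array([2.0, 1.0, 3.0, 1.0, 4.0])

# D = 3 documents, V = 5 vocabulary entries; only 5 of 15 cells are stored.
X = csr_matrix((counts, (rows, cols)), shape=(3, 5))

# A sparse-times-dense product yields a dense (D x K) array.
H = np.ones((2, 5))
XHt = X @ H.T
print(X.nnz, XHt.shape)
```

Storing only the non-zero counts matters for text data, where each document typically uses a small fraction of the vocabulary.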

See topics.py for an example that loads a small text dataset ('text.txt'), builds the sparse matrix X, and infers the topics. The top words for each topic are printed.
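The pipeline described above can be sketched roughly as follows. This is not the code in topics.py: the helpers `build_doc_term` and `top_words` are hypothetical, tokenization is a plain whitespace split, and the documents and the matrix H are hard-coded rather than read from text.txt or inferred by JAL_NMF.

```python
import numpy as np
from scipy.sparse import csr_matrix

def build_doc_term(docs):
    """Hypothetical helper: build a sparse (D x V) count matrix and the vocabulary."""
    vocab = {}
    rows, cols = [], []
    for d, doc in enumerate(docs):
        for token in doc.lower().split():
            # Assign the next free column index to each unseen word.
            v = vocab.setdefault(token, len(vocab))
            rows.append(d)
            cols.append(v)
    # Duplicate (row, col) pairs are summed by the constructor, giving counts.
    X = csr_matrix((np.ones(len(rows)), (rows, cols)),
                   shape=(len(docs), len(vocab)))
    return X, sorted(vocab, key=vocab.get)

def top_words(H, vocab, n=3):
    """Hypothetical helper: return the n highest-weight words for each topic row of H."""
    order = np.argsort(-H, axis=1)[:, :n]
    return [[vocab[j] for j in row] for row in order]

docs = ["apple banana apple", "cat dog"]
X, vocab = build_doc_term(docs)

# Stand-in topic-word matrix (K = 2); in practice H would come from the NMF.
H = np.array([[3.0, 2.0, 0.0, 0.1],
              [0.0, 0.1, 5.0, 4.0]])
for k, words in enumerate(top_words(H, vocab, n=2)):
    print(f"topic {k}: {' '.join(words)}")
```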
