
How to get the distribution of a doc over topics (and topic over words) #3

@RamtinYazdanian

Hello,

First of all, thanks for developing this for Python!

I have been looking at the code and cannot find a way to infer two things: the distribution of a document over the topics on its path from the root to the leaf (the parameter theta in the "Hierarchical Topic Models and the Nested Chinese Restaurant Process" paper), and the distribution of a topic over the words (the betas in the same paper).

For the second case, dividing the word counts at a node by their sum should give that topic's probabilities over the words. Is that the best approximation of those values, or is there a way to get a more accurate one?
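
To make the count-normalization idea concrete, here is a minimal sketch. It is not this library's API: the function name `topic_word_distribution`, the `word_counts` array (one count per vocabulary word at a node), and the smoothing parameter `eta` are all assumptions for illustration. With `eta = 0` it is the plain normalized-count estimate. With `eta > 0` it is the posterior mean under a symmetric Dirichlet prior on the topic, which avoids zero probabilities for words not yet seen at the node.

```python
import numpy as np

def topic_word_distribution(word_counts, eta=0.0):
    """Estimate a topic's distribution over the vocabulary from the
    word counts at one node (hypothetical helper, not a library call).

    eta = 0.0 -> plain normalized counts
    eta > 0.0 -> posterior mean under a symmetric Dirichlet(eta) prior:
                 (n_w + eta) / (n_total + V * eta)
    """
    counts = np.asarray(word_counts, dtype=float)
    smoothed = counts + eta
    total = smoothed.sum()
    if total == 0:
        raise ValueError("all counts are zero and eta is 0; nothing to normalize")
    return smoothed / total

# Toy vocabulary of three words with counts 3, 0 and 1 at a node.
counts = [3, 0, 1]
plain = topic_word_distribution(counts)              # normalized counts
smoothed = topic_word_distribution(counts, eta=1.0)  # Dirichlet-smoothed
```

Either estimate is computed from a single state of the sampler, so it reflects only the assignments held at the moment the counts are read.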
