sklearn ValueError on small datasets #6

@karim-sharkawy

Description

For small Reddit datasets, the default BERTopic, HDBSCAN, and UMAP settings may not be appropriate: they can end with no topics being created, which raises this sklearn error:

ValueError: After pruning, no terms remain. Try a lower min_df or a higher max_df.
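
One possible workaround is to scale the settings to the corpus size before building the model. The sketch below is only an illustration: `small_data_params` is a hypothetical helper (not part of BERTopic, UMAP, HDBSCAN, or sklearn), and the thresholds in it are assumed heuristics, not tested recommendations. It caps UMAP's `n_neighbors` below the document count, shrinks HDBSCAN's `min_cluster_size` for small corpora, and keeps the vectorizer's `min_df`/`max_df` at their most permissive values so that pruning is less likely to remove every term.

```python
def small_data_params(n_docs: int) -> dict:
    """Hypothetical heuristic: pick gentler settings for small corpora.

    All thresholds here are assumptions chosen for illustration.
    """
    if n_docs < 3:
        raise ValueError("need at least 3 documents to attempt topic modeling")

    # UMAP builds a neighbor graph, so n_neighbors should stay below the
    # number of documents; 15 is used as the upper bound here.
    n_neighbors = max(2, min(15, n_docs - 1))

    # HDBSCAN needs min_cluster_size >= 2; use roughly 10% of the corpus,
    # capped at 10, so small datasets can still form clusters.
    min_cluster_size = max(2, min(10, n_docs // 10))

    return {
        "umap": {"n_neighbors": n_neighbors},
        "hdbscan": {"min_cluster_size": min_cluster_size},
        # min_df=1 and max_df=1.0 disable document-frequency pruning,
        # which is the step the error message refers to.
        "vectorizer": {"min_df": 1, "max_df": 1.0},
    }


if __name__ == "__main__":
    print(small_data_params(30))
```

Assuming the usual constructors, the returned values could then be passed along as `UMAP(**params["umap"])`, `HDBSCAN(**params["hdbscan"])`, and `CountVectorizer(**params["vectorizer"])`, and those objects handed to `BERTopic` through its `umap_model`, `hdbscan_model`, and `vectorizer_model` arguments. Whether this avoids the error on a given dataset would still need to be checked.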

Labels

bug: Something isn't working
