Add benchmark suite #4

@ahans

Description

To guide performance improvements, a comprehensive benchmark suite would be helpful. The datasets used in the sklearn examples could serve as a starting point. One could begin with the Python frontend, similar to the current pytest-based test, but a setup where we first generate a fixed set of test data (say, via a Python script that saves the data to file) and then consume it from a C++-only benchmark would be preferable.

When saving the sample data, we could also save the ground-truth labels and use the datasets alongside the benchmark as additional correctness test cases. Depending on the noise level and algorithm settings such as eps, we cannot expect 100% correct labels, so we will have to settle on a reasonable threshold.
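A minimal sketch of such a generation script, assuming sklearn's synthetic dataset generators and NumPy's `.npy` files as the on-disk format (the dataset names, sizes, noise levels, and output directory are all placeholders):

```python
"""Generate fixed benchmark datasets together with ground-truth labels."""
import os

import numpy as np
from sklearn.datasets import make_blobs, make_moons

# Placeholder dataset definitions; a fixed random_state keeps them reproducible.
DATASETS = {
    "blobs_100k": lambda: make_blobs(
        n_samples=100_000, centers=10, cluster_std=0.5, random_state=42
    ),
    "moons_50k": lambda: make_moons(n_samples=50_000, noise=0.05, random_state=42),
}


def main(out_dir: str = "benchmark_data") -> None:
    os.makedirs(out_dir, exist_ok=True)
    for name, factory in DATASETS.items():
        X, y = factory()
        # float32 points / int32 labels keep the files small; the .npy format
        # is simple enough to read from a C++-only benchmark as well.
        np.save(os.path.join(out_dir, f"{name}_points.npy"), X.astype(np.float32))
        np.save(os.path.join(out_dir, f"{name}_labels.npy"), y.astype(np.int32))


if __name__ == "__main__":
    main()
```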
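For the correctness side, the adjusted Rand index could be a reasonable metric: it is invariant to permuted cluster ids, which matters because the implementation's label numbering will not match the generator's. A sketch, with the threshold as an explicit placeholder:

```python
"""Check predicted labels against the saved ground truth (sketch)."""
import numpy as np
from sklearn.metrics import adjusted_rand_score


def check_labels(dataset: str, y_pred: np.ndarray, threshold: float = 0.9) -> None:
    """Fail if the predicted labels deviate too far from the ground truth."""
    y_true = np.load(f"benchmark_data/{dataset}_labels.npy")
    # ARI is 1.0 for a perfect clustering and close to 0.0 for a random one.
    # The default threshold here is an arbitrary starting point, to be tuned
    # per dataset and per eps setting.
    score = adjusted_rand_score(y_true, y_pred)
    assert score >= threshold, f"{dataset}: ARI {score:.3f} below {threshold}"
```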

Edit: Comparing performance against other implementations, such as the one from sklearn, would also be interesting. And since performance from Python is what we mostly care about, a Python-driven benchmark suite would be reasonable as well.
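A Python-driven comparison could stay quite small. A sketch along the following lines, where `our_dbscan` is a placeholder for this project's actual Python entry point (the eps/min_samples values are arbitrary, and sklearn's `DBSCAN` serves as the reference implementation, as suggested above):

```python
"""Time the Python frontend against sklearn on the saved datasets (sketch)."""
import time

import numpy as np
from sklearn.cluster import DBSCAN

# from our_package import dbscan as our_dbscan  # placeholder: the real module
# and function names of this project's Python frontend will differ.


def best_of(fn, repeats: int = 5) -> float:
    """Best-of-N wall-clock time, to dampen system noise."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        times.append(time.perf_counter() - start)
    return min(times)


X = np.load("benchmark_data/blobs_100k_points.npy")

t_sklearn = best_of(lambda: DBSCAN(eps=0.3, min_samples=10).fit(X))
print(f"sklearn DBSCAN: {t_sklearn:.3f} s")
# t_ours = best_of(lambda: our_dbscan(X, eps=0.3, min_samples=10))
# print(f"ours:           {t_ours:.3f} s")
```

Since the existing tests are already pytest-based, pytest-benchmark might also be a natural fit for driving this.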
