To guide performance improvements, a comprehensive benchmark suite would be helpful. A good starting point could be the datasets used in the sklearn clustering examples.

One could start with the Python frontend, similar to the current pytest-based tests. Preferably, though, we would first generate a fixed set of test data (e.g., by running a Python script that saves the data to file) and then consume it from a C++-only benchmark. When saving the sample data, we could also save the labels and, alongside the benchmark, use the datasets as additional correctness test cases. Depending on the noise level and algorithm settings such as eps, we cannot expect the labels to match 100%, so we would have to settle on some reasonable threshold.
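A minimal sketch of such a generation script, assuming datasets similar to the sklearn clustering examples; the dataset choices, sizes, file names, and output format (.npy, which a C++ benchmark could read via a small reader or a library like cnpy) are all placeholders:

```python
# Hypothetical data-generation script; dataset parameters and file
# names are placeholders, not a fixed design.
import numpy as np
from sklearn import datasets

RNG = 42
CASES = {
    "blobs": datasets.make_blobs(n_samples=10_000, random_state=RNG),
    "moons": datasets.make_moons(n_samples=10_000, noise=0.05, random_state=RNG),
    "circles": datasets.make_circles(n_samples=10_000, factor=0.5, noise=0.05, random_state=RNG),
}

for name, (X, y) in CASES.items():
    # Save samples and reference labels side by side so the C++
    # benchmark can reuse the same files as correctness test cases.
    np.save(f"bench_{name}_data.npy", X.astype(np.float64))
    np.save(f"bench_{name}_labels.npy", y.astype(np.int64))
```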
Edit: Comparing performance to other implementations, such as the one in sklearn, would also be interesting. Moreover, performance from Python is what we mostly care about, so a Python-driven benchmark suite would also be reasonable.
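A sketch of what such a Python-driven comparison could look like, assuming this package exposes a sklearn-style estimator (the commented-out import and constructor are placeholders). The adjusted Rand index is one candidate for the correctness threshold mentioned above, since it tolerates label permutations:

```python
# Hypothetical Python-driven benchmark; this package's estimator is a
# placeholder, and eps/min_samples values are arbitrary.
import time

import numpy as np
from sklearn.cluster import DBSCAN as SklearnDBSCAN
from sklearn.metrics import adjusted_rand_score

X = np.load("bench_blobs_data.npy")
y_true = np.load("bench_blobs_labels.npy")

for name, estimator in [
    ("sklearn", SklearnDBSCAN(eps=0.3, min_samples=10)),
    # ("ours", OurDBSCAN(eps=0.3, min_samples=10)),  # placeholder for this package
]:
    start = time.perf_counter()
    labels = estimator.fit_predict(X)
    elapsed = time.perf_counter() - start
    # With noise and a given eps we only require the ARI to exceed
    # some agreed threshold, not to be exactly 1.0.
    ari = adjusted_rand_score(y_true, labels)
    print(f"{name}: {elapsed:.3f}s, ARI={ari:.3f}")
```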