-
Notifications
You must be signed in to change notification settings - Fork 0
Home
Welcome on wiki page of DBSCAN clustering algorithm. Main target is prepareing our own implementation of DBSCAN clustering algorithm in object oriented script language like Python.
DBscan (Density-based spatial clustering of applications with noise) is a data clustering algorithm proposed by Martin Ester, Hans-Peter Kriegel, Jörg Sander and Xiaowei Xu in 1996. It is a density-based clustering algorithm: given a set of points in some space, it groups together points that are closely packed together (points with many nearby neighbors), marking as outliers points that lie alone in low-density regions (whose nearest neighbors are too far away). DBSCAN is one of the most common clustering algorithms and also most cited in scientific literature.
- Python 3.5
- Matplotlib
- Scipy
- Numpy
- Scikit
The only thing that is necessary to run comparison test is to start script scikitresults_test.py included in test directory. Effects of our approache and scikit solution show pictures below.


Number of samples: 100
- Scikit average time of computing = 0.00025
- Our version average time of computing = 0.00308
- Our algorithm is 12.42 slower than algorithm from scikit
Number of samples: 500
- Scikit average time of computing = 0.00143
- Our version average time of computing = 0.08387
- Our algorithm is 58.70 slower than algorithm from scikit
Number of samples: 1000
- Scikit average time of computing = 0.00317
- Our version average time of computing = 0.33529
- Our algorithm is 105.76 slower than algorithm from scikit
Number of samples: 2000
- Scikit average time of computing = 0.01239
- Our version average time of computing = 1.33483
- Our algorithm is 107.69 slower than algorithm from scikit
Number of samples: 3000
- Scikit average time of computing = 0.01406
- Our version average time of computing = 3.01206
- Our algorithm is 214.27 slower than algorithm from scikit
Number of samples: 4000
- Scikit average time of computing = 0.02061
- Our version average time of computing = 5.69492
- Our algorithm is 276.27 slower than algorithm from scikit
Number of samples: 5000
- Scikit average time of computing = 0.01736
- Our version average time of computing = 8.03304
- Our algorithm is 462.74 slower than algorithm from scikit
Number of samples: 10000
- Scikit average time of computing = 0.08937
- Our version average time of computing = 35.53449
- Our algorithm is 397.63 slower than algorithm from scikit
Plots based on data include above:
