Our PPoPP 2025 paper introduces Popcorn, a new sparse-matrix formulation of Kernel K-means that enables an efficient, high-performance GPU implementation with minimal manual kernel engineering effort. Popcorn achieves up to 123.8× speedup over a CPU version and 2.6× over a GPU implementation that does not use sparse linear algebra.
Needed for python utils like data_generator and scatter_plot.
pip install -r py_utils/requirements.txtUnit testing for C++.
git clone https://github.com/catchorg/Catch2.gitThis script can be used to generate random datasets. If attribute -k is specified the dataset contains clusterized points
python3 py_utils/data_generator.py -n 1000 -d 3 -k 4 -min 0 -max 10 -o datasets/3Dpoints.csvThis script can be used to display results (reading csv output files).
python3 py_utils/scatter_plot.py -f 3Dpoints.csv -d 3These are the commands that can be used to compile GPU Kmeans.
Notice: cmake is required.
mkdir -p build
cd build
cmake ..
cmake --build .This will build the target gpukmeans to build/src/bin/gpukmeans.
This code has is provided in scripts/build.sh so that (from the root of the project) you can run ./scripts/build.sh to build gpukmeans.
The target unit_kernels is excluded from ALL_TARGETS therefore it has to be compiled explicitly.
cmake --build . --target unit_kernelsUse the script scripts/run_tests.sh to build and run unit tests.
The executable gpukmeans reads inputs from stdin or from csv file. The directory datasets contains some examples of data input.
./build/src/bin/gpukmeans --help
gpukmeans is an implementation of the K-means algorithm that uses a GPU
Usage:
gpukmeans [OPTION...]
-h, --help Print usage
-d, --n-dimensions arg Number of dimensions of a point
-n, --n-samples arg Number of points
-k, --n-clusters arg Number of clusters
-m, --maxiter arg Maximum number of iterations
-o, --out-file arg Output filename
-i, --in-file arg Input filename
-r, --runs arg Number of k-means runs (default: 1)
-s, --seed arg Seed for centroids generator
-t, --tolerance arg Tolerance to declare convergenceExample:
./build/src/bin/gpukmeans -d 64 -n 1797 -k 10 -m 300 -o res.csv -i ./datasets/N1797_D64_digits-sklearn.csv -t 0.0001Runs gpukmeans on the dataset N1797_D64_digits-sklearn.csv considering: 64 dimensions, 1797 points, 10 clusters, 300 maximum iterations, tolerance 1e-4, and output the results to res.csv.
If you find this repo helpful to your work, please cite our article:
@inproceedings{bellavita2025popcorn,
title={Popcorn: Accelerating Kernel K-means on GPUs through Sparse Linear Algebra},
author={Bellavita, Julian and Pasquali, Thomas and Del Rio Martin, Laura and Vella, Flavio and Guidi, Giulia},
booktitle={Proceedings of the 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming},
pages={426--440},
year={2025}
}
This work was a collaboration between the HiCrest Laboratory at the University of Trento (Italy) and the Cornell HPC Group at Cornell University (USA). The first author was supported by DOE CSGF. The authors acknowledge financial support from ICSC – Centro Nazionale di Ricerca in High-Performance Computing, Big Data and Quantum Computing, funded by European Union – NextGenerationEU. This work has received funding from the European High-Performance Computing Joint Undertaking (JU) under grant agreement No 101175702 and the NationalInstitute of Higher Mathematics Francesco Severi. This research used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231 using NERSC award ASCR-ERCAP0030076. This project received support from the Center for Research on Programmable Plant Systems under National Science Foundation Grant No. DBI-2019674.