CL k-Means

CL k-Means is an efficient and portable OpenCL implementation of Lloyd's k-Means algorithm. It introduces a more efficient execution strategy that requires only a single pass over the data. This single-pass optimization is based on a new centroid update algorithm that features a reduced cache footprint.
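
To illustrate the single-pass idea, here is a minimal sequential sketch in plain Python. It is not the project's OpenCL code and does not reproduce its centroid update algorithm. It assumes that "single pass" means each point is labeled and accumulated into its cluster's running sum in the same loop iteration, so the data is read once per k-Means iteration. The function name `lloyd_single_pass` and its parameters are hypothetical.

```python
def lloyd_single_pass(points, centroids, iterations=10):
    """Lloyd's k-Means in which every iteration reads each point exactly once.

    Illustrative sketch only; names and structure are assumptions, not the
    project's API.
    """
    k = len(centroids)
    dims = len(centroids[0])
    for _ in range(iterations):
        sums = [[0.0] * dims for _ in range(k)]
        counts = [0] * k
        for p in points:
            # Labeling step: nearest centroid by squared Euclidean distance.
            best = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            # Centroid update step, fused into the same pass: accumulate the
            # point right away instead of re-reading the data later.
            for d in range(dims):
                sums[best][d] += p[d]
            counts[best] += 1
        # New centroid is the mean of its members; an empty cluster keeps
        # its previous centroid.
        centroids = [
            [s / counts[c] for s in sums[c]] if counts[c] else list(centroids[c])
            for c in range(k)
        ]
    return centroids
```

For example, `lloyd_single_pass([(0, 0), (0, 2), (10, 10), (10, 12)], [[0, 0], [10, 10]])` converges to one centroid per pair of nearby points.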

Build Dependencies

A working OpenCL installation. Refer to Andreas Klöckner's wiki.
Clang 3.8 or greater
Boost version 1.61 or greater

Build Instructions

git clone --recursive https://github.com/TU-Berlin-DIMA/CL-kmeans.git
mkdir CL-kmeans/build
cd CL-kmeans/build
cmake -DCMAKE_BUILD_TYPE=Debug ..
make -j`grep processor /proc/cpuinfo | wc -l`

Usage

First, generate the input data as files in a simple binary format. Then run the benchmark. Finally, view the total OpenCL kernel runtimes stored in the output CSV file.

./scripts/generate_features.py
./bench --csv runtime.csv --config ../configurations/intel_core_i7-6700K_three_stage.conf ../data/cluster_data_4f_10c_2048mb.bin
grep TotalTime *runtime_mnts.csv # Total runtime in microseconds in last column
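
To process the results in a script rather than read them off the terminal, the `grep` step can be approximated in Python. This is a rough sketch under two assumptions taken from the command above: the files match `*runtime_mnts.csv`, and every row that mentions `TotalTime` holds a numeric runtime in microseconds in its last column. The function name `total_times` is hypothetical, and the rest of the CSV layout is not assumed.

```python
import csv
import glob


def total_times(pattern="*runtime_mnts.csv"):
    """Collect the last-column value of every CSV row that mentions TotalTime."""
    values = []
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as f:
            for row in csv.reader(f):
                # Same filter as `grep TotalTime`: keep rows containing the string.
                if row and any("TotalTime" in cell for cell in row):
                    # Assumption: the last column is the runtime in microseconds.
                    values.append(float(row[-1]))
    return values
```

Calling `total_times()` from the build directory returns one value per matching row. If the last column of a matching row is not numeric, `float` raises a `ValueError`, which signals that the layout assumption does not hold.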

Configurations

CL k-Means can be tuned to different types of processors using simple configuration files. Each file consists of key-value pairs that define the execution strategy and hardware tuning options.

See example configurations for Intel Core i7-6700K and Nvidia GeForce GTX 1080 processors in the '/configurations' directory.

Publications

This project has resulted in the following academic publications:

C. Lutz et al., "Efficient and Scalable k-Means on GPUs", in Datenbank-Spektrum 2018
C. Lutz et al., "Efficient k-Means on GPUs", in DaMoN 2018

To cite these works, add these BibTeX snippets to your bibliography:

@article{lutz:dbspektrum:2018,
  author    = {Clemens Lutz and
               Sebastian Bre{\ss} and
               Tilmann Rabl and
               Steffen Zeuch and
               Volker Markl},
  title     = {Efficient and Scalable k-Means on GPUs},
  journal   = {Datenbank-Spektrum},
  volume    = {18},
  number    = {3},
  pages     = {157--169},
  year      = {2018},
  url       = {https://doi.org/10.1007/s13222-018-0293-x},
  doi       = {10.1007/s13222-018-0293-x}
}

@inproceedings{lutz:damon:2018,
  author    = {Clemens Lutz and
               Sebastian Bre{\ss} and
               Tilmann Rabl and
               Steffen Zeuch and
               Volker Markl},
  title     = {Efficient k-Means on GPUs},
  booktitle = {Proceedings of the 14th International Workshop on Data Management
               on New Hardware, Houston, TX, USA, June 11, 2018},
  pages     = {3:1--3:3},
  year      = {2018},
  url       = {https://doi.org/10.1145/3211922.3211925},
  doi       = {10.1145/3211922.3211925}
}
