Examples/Tests | Benchmarks | APIs | Reproduce |
---|
A fully concurrent GPU B-Tree that supports versioning (snapshots) and linearizable multipoint queries. Using our data structure and the tools we provide, you can launch one (or more) kernels where inside each kernel, you concurrently perform queries (e.g., point or range queries) and mutations (e.g., insert or update).
For more information, please check our PACT 2022 paper:
A GPU Multiversion B-Tree
Muhammad A. Awad, Serban D. Porumbescu, and John D. Owens
The repository also contains:
- An implementation of our epoch-based memory reclamation strategy
- SlabAlloc memory allocator redesigned to allow building more complex allocators via composition
- Improved implementation of our B-Tree (reference B-Tree that doesn't support snapshots)1.
GPU data structures such as the multiversion GPU B-Tree and other data structures we developed12 should facilitate using them in the following concise and elegant manner:
#include<gpu_versioned_blink_tree.hpp>
#include<thrust/device_vector.hpp>
#include<thrust/for_each.hpp>
int main(){
using key_t = uint32_t; using value_t = uint32_t;
using tree_t = GpuBTree::gpu_versioned_blink_tree<key_t, value_t>;
tree_t vtree(....); // call the data structure constructor
thrust::device_vector<key_t> keys(....); // initialize keys
// solve a problem and do concurrent operations in a fully concurrent manner
thrust::for_each(keys.begin(), keys.end(), [vtree](auto key){
// perform operations in a tile-synchronous way
auto block = cooperative_groups::this_thread_block();
auto tile = cooperative_groups::tiled_partition<tree_t::branching_factor>(block);
// ... problem-specific code
auto value = ...;
vtree.cooperative_insert(key, value, tile, ...); // insert
// ... maybe more problem-specific application code
auto snapshot_id = vtree.take_snapshot(); // take snapshot
// ... maybe even more problem-specific code
auto found_value = vtree.cooperative_find(key, tile, snapshot_id, ...); // query
// ... maybe even more problem-specific code
});
}
The previous example illustrates our vision for using GPU data structures. To a large extent, we can do most of these operations using current CUDA/C++ abstractions and compilers; however, some of the APIs, such as memory allocators and reclaimers (especially on-device ones), still lack adequate support and standardization. BGHT2 provides the same device-side APIs and will require almost zero modifications to run the example snippet above.
Please create an issue if you face challenges with any of the following limitations and requirements.
- C++17/CUDA C++17
- NVIDIA Volta GPU or later microarchitectures
- CMake 3.18 or later
- CUDA 12.0 or later
- GPU with 20 GiBs or higher to run the benchmarks
- Keys and values must have a type of unsigned 32-bit
- Snapshot are limited to a maximum of 2^32 - 1 (can be extended to 2^64-1 easily)
To reproduce the results, follow the following steps. Our PACT 2022 paper was awarded the Results Reproduced v1.1 badge. If you find any mismatch (either faster or slower) between the results in the paper, please create an issue, and we will investigate the performance changes.
Please create an issue. We will welcome any contributions that improve the usability and quality of our repository.
The code in this repository is based on our Multiversion GPU B-Tree and GPU B-Tree publications:
@InProceedings{ Awad:2022:AGM,
author = {Muhammad A. Awad and Serban D. Porumbescu and John D.
Owens},
title = {A {GPU} Multiversion {B}-Tree},
booktitle = {Proceedings of the International Conference on Parallel
Architectures and Compilation Techniques},
series = {PACT 2022},
year = 2022,
month = oct,
code = {https://github.com/owensgroup/MVGpuBTree},
doi = {10.1145/3559009.3569681},
url = {https://escholarship.org/uc/item/4mz5t5b7},
ucdcite = {a146}
}
@InProceedings{ Awad:2019:EAH,
author = {Muhammad A. Awad and Saman Ashkiani and Rob Johnson and
Mart\'{\i}n Farach-Colton and John D. Owens},
title = {Engineering a High-Performance {GPU} {B}-Tree},
booktitle = {Proceedings of the 24th ACM SIGPLAN Symposium on
Principles and Practice of Parallel Programming},
series = {PPoPP 2019},
year = 2019,
month = feb,
pages = {145--157},
acceptance = {29 of 152 submissions, 19.1\%},
doi = {10.1145/3293883.3295706},
acmauthorize = {https://dl.acm.org/doi/10.1145/3293883.3295706?cid=81100458295},
url = {https://escholarship.org/uc/item/1ph2x5td},
code = {https://github.com/owensgroup/GpuBTree},
ucdcite = {a127}
}