kdtree

Demonstrates how to implement and use a C++ template of a k-d tree.

Run the demo

This demo can be build for Windows and Linus.

Windows

Just open the solution KD.sln with Visual Studio.

Linux

g++ main.cpp -o kdtree

Introduction

The k-d tree is a variant of binary tree in which every node contains a k-dimensional point and has a splitting hyperplane that divides the space into two parts. Hypeplane direction is selected to rotate along the d-dimensions. For full explanation see .

The k-d tree can be used for many problems with low dimensional data (d < 10), it allows some balancing techniques and effectibe Nearest-Neighbours search, which makes it useful even for 3D and 2D (for other spatial trees like OcTree balancing and nearest-neighbours search is even more challenging). Here I have demonstrated how a Nearest-Neighbours search with can be efficiently organized.

Implementation of KdTree template in C++

The building block of the k-d tree is the KdNode , like any Binary Tree it must have pointers to a Left and Right subtrees. It also has axis which contains the rotating split dimension of the node:

class KDNode
{
    friend class KDTree;
    public:
        KDNode(const T* x, int n, int axis);
        const T* data() const { return x.data(); }
        unsigned int getId() const { return id;  }

    protected:
        KDNode* parent ;
        KDNode* left ;
        KDNode* right ;        
        
        int axis ;
        std::vector<T> x;
        unsigned int id ;
        bool checked ;
};

Here is a brief explation of each of the KdNode members:

left, right - pointers to the left and right subtrees
parent - pointer to parent node
axis - index of the split axis (can be in the range 0-d)
x - array with d-dimensional data point
id - id of the point used during the tests
checked - used during the nearest neighbours search

This is the full definition of the k-d tree template:

template <typename T>
class KDTree
{
public:
    KDNode*  root ;
    KDTree(unsigned int K, unsigned int D);
    ~KDTree();
    bool            add(const T* x);
    KDNode*         insert(const T* x) ;
    KDNode*         find_exact(const T* x) ;
    KDNode*         find_nearest(const T* x);
    KDNode*         find_nearest_brute(const T* x) ;
    KDNode*         find_exact_brute(const T* x);
    KDNode*         find_parent(const T* x) ;
    void            clear();
    size_t          size() const { return NodesList.size();  };
    size_t          numChecked() const { return NodesChecked.size();  };
    
public:
    T               d_min ;   
    unsigned int    n_dim;

protected:  
    void                    check_subtree(KDNode* node, const T* x);
    void                    set_bounding_cube(KDNode* node, const T* x);
    KDNode*                 search_parent(KDNode* parent, const T* x);
    void                    uncheck();

    KDNode*                 nearest_neighbour ;
    int                     KD_id  ;
    std::vector<KDNode*>    NodesList ;
    std::vector<KDNode*>    NodesChecked ;
    std::vector<T>          x_min, x_max; 
    std::vector<bool>       max_boundary, min_boundary;
    int                     n_boundary ;
};

See:
KDTree.h

Nearest Neighbours Search

Typical Output of the benchmark application is:

KD Tree Created!
Points number:          1000000
Tree Nodes:             1000000
Space Dimension:        10
Creation Time [ms]:     931.431030
Test points:            100

Nearest Neighbours Search
------------------------------------------------
avr. kd NN time [ms]               2.086340
avr. kd exact time [ms]            0.001490
avr. brute NN time [ms]            9.103150
avr. brute exact time [ms]         5.639280
kd/brute NN ratio:                 4.363215
kd/brute exact ratio:              3784.751465
Errors:                            0

Benchmark comparison

The figure below shows the comparison of the NN search with k-d tree and the brute NN search:

These results shows that as can be expected the Brute NN search performance increases linearly with the data dimension O(d), while the k-d NN search performanceis is deteriorating exponentially and after d=10 it is closing to the results obtained with Brute NN. This is another effect of the Curse of dimensionality,, valid for many algorithms working with high dimensional data.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

kdtree

Introduction

Implementation of KdTree template in C++

Nearest Neighbours Search

Benchmark comparison

Files

README.md

Latest commit

History

README.md

File metadata and controls

kdtree

Introduction

Implementation of KdTree template in C++

Nearest Neighbours Search

Benchmark comparison