We implement COPAC (Correlation Partition Clustering), which
- computes the local correlation dimensionality of each point from the largest eigenvalues of its neighborhood's covariance matrix
- partitions the data set based on this dimension
- calculates a Euclidean distance variant weighted with the correlation dimension, called correlation distance
- further clusters the objects within each partition using Generalized DBSCAN, where each core point must have a minimum number of objects within its eps range.
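The first two steps above can be sketched in plain Python. This is a minimal illustration and not the package's actual code: it assumes the eigenvalues of each point's local covariance matrix are already available, and it assumes the correlation dimension is the smallest number of leading eigenvalues whose sum reaches a fraction `alpha` of the total variance. The helper names `correlation_dimension` and `partition_by_dimension` are hypothetical.

```python
from collections import defaultdict


def correlation_dimension(eigenvalues, alpha=0.85):
    """Return the smallest number of leading eigenvalues that explain
    at least `alpha` of the total variance (assumed definition)."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        return 0  # degenerate neighborhood without any variance
    explained = 0.0
    for r, value in enumerate(ordered, start=1):
        explained += value
        if explained / total >= alpha:
            return r
    return len(ordered)


def partition_by_dimension(dimensions):
    """Group point indices by their local correlation dimension."""
    partitions = defaultdict(list)
    for index, dim in enumerate(dimensions):
        partitions[dim].append(index)
    return dict(partitions)


# Hypothetical local eigenvalues for four points in 3-d space.
local_eigenvalues = [
    [9.0, 0.5, 0.5],  # one dominant direction
    [5.0, 4.0, 1.0],  # two dominant directions
    [8.8, 0.7, 0.5],  # one dominant direction
    [4.0, 3.0, 3.0],  # no dominant direction
]
dims = [correlation_dimension(ev) for ev in local_eigenvalues]
partitions = partition_by_dimension(dims)
print(dims)
print(partitions)
```

Each partition would then be clustered separately, so points lying on a line are never mixed with points lying on a plane.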
Clone this repository and install it with pip:
git clone https://github.com/VarIr/copac.git
cd copac
python3 -m pip install .
COPAC is then available through the copac package.
COPAC usage follows scikit-learn's cluster API.
from copac import COPAC
# load some X here ...
copac = COPAC(k=10, mu=5, eps=.5, alpha=.85)
y_pred = copac.fit_predict(X)
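The predicted labels can be summarized with a small helper. The sketch below assumes that `y_pred` follows scikit-learn's DBSCAN convention, in which noise points carry the label `-1`; this README does not confirm that convention for COPAC. The helper name `summarize_labels` and the example labels are hypothetical.

```python
from collections import Counter


def summarize_labels(labels, noise_label=-1):
    """Count objects per cluster and count noise points separately.

    Assumes noise is marked with `noise_label` (scikit-learn's DBSCAN
    convention); adjust it if the labels are encoded differently.
    """
    counts = Counter(int(label) for label in labels)
    n_noise = counts.pop(noise_label, 0)
    return dict(counts), n_noise


# Hypothetical labels standing in for the output of fit_predict.
example_labels = [0, 0, 1, -1, 1, 1, -1]
cluster_sizes, n_noise = summarize_labels(example_labels)
print(cluster_sizes, n_noise)
```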
Published on GitHub: https://github.com/VarIr/copac
The original publication of COPAC:
@article{Achtert2007,
author = {Achtert, E and B{\"o}hm, C and Kriegel, H P and Kr{\"o}ger, P and Zimek, A},
title = {{Robust, Complete, and Efficient Correlation Clustering}},
journal = {Proceedings of the Seventh {SIAM} International Conference on Data Mining},
year = {2007},
pages = {413--418}
}
This work is free, open-source software licensed under the GPLv3.