YanxuanLiu released this 21 Nov 07:24
Release notes as follows:
- Migrated cuML-based ivf-flat and ivf-pq to cuVS and added support for cosine distance.
- Added support for sparse data in UMAP.
- Added support for NNDescent-based k-NN graph building for UMAP.
- Updated AWS EMR examples to EMR version 7.3.
- Updated RAPIDS dependencies to 24.10.
- Dropped support for Python 3.9 (transitive from RAPIDS).
- Multiple bug and documentation fixes for data generation, CrossValidator, UMAP, DBSCAN, KMeans, and approximate k-NN implementations.
- Known issues:
  - LogisticRegression hangs when fitting sparse data with all-zero features on a GPU.
  - Various CUDA errors occur when `spark.rapids.ml.uvm.enabled` or `spark.python.worker.reuse` are set to `true` with multiple GPUs per executor. The workaround is to set either of those configs to `false` on clusters with multiple GPUs per executor.
  - Multi-class RandomForest fit fails with an error when one GPU does not see all class label values.
  - A CUDA error occurs when there are fewer probes than `k` in the `ivflat-pq` ANN algorithm.
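
The workaround for the multi-GPU CUDA errors listed under known issues can be sketched as a small helper that produces the config override. This is a minimal illustration, not part of the spark-rapids-ml package: the function names `multi_gpu_workaround_conf` and `to_spark_submit_args` are hypothetical, and only the two config keys come from the notes above. The choice of which key to disable by default is arbitrary.

```python
# Config keys named in the known issue above.
UVM_KEY = "spark.rapids.ml.uvm.enabled"
WORKER_REUSE_KEY = "spark.python.worker.reuse"


def multi_gpu_workaround_conf(gpus_per_executor, disable=WORKER_REUSE_KEY):
    """Return Spark conf overrides for the multi-GPU known issue.

    Hypothetical helper: sets one of the two configs to "false" when an
    executor has more than one GPU, and returns no overrides otherwise.
    """
    if disable not in (UVM_KEY, WORKER_REUSE_KEY):
        raise ValueError(f"unexpected config key: {disable}")
    # The issue is only reported with multiple GPUs per executor.
    if gpus_per_executor <= 1:
        return {}
    return {disable: "false"}


def to_spark_submit_args(conf):
    """Render a conf dict as a list of spark-submit --conf flags."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


conf = multi_gpu_workaround_conf(gpus_per_executor=2)
print(" ".join(to_spark_submit_args(conf)))
```

The same key/value pairs could instead be applied when building a session, for example through repeated `SparkSession.builder.config(key, value)` calls.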
pip package available at https://pypi.org/project/spark-rapids-ml/24.10.0/