Commit `9d52edf` by oneformer3d, Mar 21, 2024 (0 parents): 51 changed files with 13,423 additions and 0 deletions.
**`.gitignore`** (new file, +7 lines)

```
data
work_dirs
.vscode
__pycache__/
*.py[cod]
*$py.class
*.ipynb
```
**`Dockerfile`** (new file, +89 lines)

```dockerfile
FROM pytorch/pytorch:1.13.1-cuda11.6-cudnn8-devel

RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/3bf863cc.pub \
    && apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/7fa2af80.pub \
    && apt-get update \
    && apt-get install -y ffmpeg libsm6 libxext6 git ninja-build libglib2.0-0 libsm6 libxrender-dev libxext6

# Install OpenMMLab projects
RUN pip install --no-deps \
        mmengine==0.7.3 \
        mmdet==3.0.0 \
        mmsegmentation==1.0.0 \
        git+https://github.com/open-mmlab/mmdetection3d.git@22aaa47fdb53ce1870ff92cb7e3f96ae38d17f61
RUN pip install mmcv==2.0.0 -f https://download.openmmlab.com/mmcv/dist/cu116/torch1.13.0/index.html --no-deps

# Install MinkowskiEngine
# Feel free to skip nvidia-cuda-dev if minkowski installation is fine
RUN apt-get update \
    && apt-get -y install libopenblas-dev nvidia-cuda-dev
RUN TORCH_CUDA_ARCH_LIST="6.1 7.0 8.6" \
    pip install git+https://github.com/NVIDIA/MinkowskiEngine.git@02fc608bea4c0549b0a7b00ca1bf15dee4a0b228 -v --no-deps \
        --install-option="--blas=openblas" \
        --install-option="--force_cuda"

# Install torch-scatter
RUN pip install torch-scatter==2.1.2 -f https://data.pyg.org/whl/torch-1.13.0+cu116.html --no-deps

# Install ScanNet superpoint segmentator
RUN git clone https://github.com/Karbo123/segmentator.git \
    && cd segmentator/csrc \
    && git reset --hard 76efe46d03dd27afa78df972b17d07f2c6cfb696 \
    && mkdir build \
    && cd build \
    && cmake .. \
        -DCMAKE_PREFIX_PATH=`python -c 'import torch;print(torch.utils.cmake_prefix_path)'` \
        -DPYTHON_INCLUDE_DIR=$(python -c "from distutils.sysconfig import get_python_inc; print(get_python_inc())") \
        -DPYTHON_LIBRARY=$(python -c "import distutils.sysconfig as sysconfig; print(sysconfig.get_config_var('LIBDIR'))") \
        -DCMAKE_INSTALL_PREFIX=`python -c 'from distutils.sysconfig import get_python_lib; print(get_python_lib())'` \
    && make \
    && make install \
    && cd ../../..

# Install remaining python packages
RUN pip install --no-deps \
        spconv-cu116==2.3.6 \
        addict==2.4.0 \
        yapf==0.33.0 \
        termcolor==2.3.0 \
        packaging==23.1 \
        numpy==1.24.1 \
        rich==13.3.5 \
        opencv-python==4.7.0.72 \
        pycocotools==2.0.6 \
        Shapely==1.8.5 \
        scipy==1.10.1 \
        terminaltables==3.1.10 \
        numba==0.57.0 \
        llvmlite==0.40.0 \
        pccm==0.4.7 \
        ccimport==0.4.2 \
        pybind11==2.10.4 \
        ninja==1.11.1 \
        lark==1.1.5 \
        cumm-cu116==0.4.9 \
        pyquaternion==0.9.9 \
        lyft-dataset-sdk==0.0.8 \
        pandas==2.0.1 \
        python-dateutil==2.8.2 \
        matplotlib==3.5.2 \
        pyparsing==3.0.9 \
        cycler==0.11.0 \
        kiwisolver==1.4.4 \
        scikit-learn==1.2.2 \
        joblib==1.2.0 \
        threadpoolctl==3.1.0 \
        cachetools==5.3.0 \
        nuscenes-devkit==1.1.10 \
        trimesh==3.21.6 \
        open3d==0.17.0 \
        plotly==5.18.0 \
        dash==2.14.2 \
        plyfile==1.0.2 \
        flask==3.0.0 \
        werkzeug==3.0.1 \
        click==8.1.7 \
        blinker==1.7.0 \
        itsdangerous==2.1.2 \
        importlib_metadata==2.1.2 \
        zipp==3.17.0
```
**`README.md`** (new file, +124 lines)
## OneFormer3D: One Transformer for Unified Point Cloud Segmentation

**News**:
* :fire: February, 2024. OneFormer3D is accepted at CVPR 2024.
* :fire: November, 2023. OneFormer3D achieves state-of-the-art in
  * 3D instance segmentation on ScanNet ([hidden test](https://kaldir.vc.in.tum.de/scannet_benchmark/semantic_instance_3d))
  [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/oneformer3d-one-transformer-for-unified-point/3d-instance-segmentation-on-scannetv2)](https://paperswithcode.com/sota/3d-instance-segmentation-on-scannetv2?p=oneformer3d-one-transformer-for-unified-point)
  <details>
  <summary>leaderboard screenshot</summary>
  <img src="https://github.com/filaPro/oneformer3d/assets/6030962/e8890fd9-336d-4851-85cb-06fbbb60abe3" alt="ScanNet leaderboard"/>
  </details>
  * 3D instance segmentation on S3DIS (6-Fold)
  [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/oneformer3d-one-transformer-for-unified-point/3d-instance-segmentation-on-s3dis)](https://paperswithcode.com/sota/3d-instance-segmentation-on-s3dis?p=oneformer3d-one-transformer-for-unified-point)
  * 3D panoptic segmentation on ScanNet
  [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/oneformer3d-one-transformer-for-unified-point/panoptic-segmentation-on-scannet)](https://paperswithcode.com/sota/panoptic-segmentation-on-scannet?p=oneformer3d-one-transformer-for-unified-point)
  * 3D object detection on ScanNet (w/o TTA)
  [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/oneformer3d-one-transformer-for-unified-point/3d-object-detection-on-scannetv2)](https://paperswithcode.com/sota/3d-object-detection-on-scannetv2?p=oneformer3d-one-transformer-for-unified-point)
  * 3D semantic segmentation on ScanNet (val, w/o extra training data) [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/oneformer3d-one-transformer-for-unified-point/semantic-segmentation-on-scannet)](https://paperswithcode.com/sota/semantic-segmentation-on-scannet?p=oneformer3d-one-transformer-for-unified-point)

This repository contains an implementation of OneFormer3D, a 3D (instance, semantic, and panoptic) segmentation method introduced in our paper:

> **OneFormer3D: One Transformer for Unified Point Cloud Segmentation**<br>
> [Maksim Kolodiazhnyi](https://github.com/col14m),
> [Anna Vorontsova](https://github.com/highrut),
> [Anton Konushin](https://scholar.google.com/citations?user=ZT_k-wMAAAAJ),
> [Danila Rukhovich](https://github.com/filaPro)
> <br>
> Samsung Research<br>
> https://arxiv.org/abs/2311.14405
### Installation

For convenience, we provide a [Dockerfile](Dockerfile).
This implementation is based on the [mmdetection3d](https://github.com/open-mmlab/mmdetection3d) framework `v1.1.0`. If installing without Docker, please follow its [getting_started.md](https://github.com/open-mmlab/mmdetection3d/blob/22aaa47fdb53ce1870ff92cb7e3f96ae38d17f61/docs/en/get_started.md).
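A typical way to build the image and enter a container might look as follows. The image tag, mount point, and working directory are placeholders of our choosing (the Dockerfile does not copy the repository, so we mount it from the host), and GPU access assumes the NVIDIA Container Toolkit is installed:

```shell
# build the image from the repository root (the tag name is arbitrary)
docker build -t oneformer3d .
# start a container with GPU access and the repository mounted inside
docker run --gpus all -it --rm \
    -v "$(pwd)":/workspace/oneformer3d \
    -w /workspace/oneformer3d \
    oneformer3d
```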

### Getting Started

Please see [train_test.md](https://github.com/open-mmlab/mmdetection3d/blob/22aaa47fdb53ce1870ff92cb7e3f96ae38d17f61/docs/en/user_guides/train_test.md) for basic usage examples.
For preprocessing the ScanNet and ScanNet200 datasets, please follow our [instructions](data/scannet); they differ from the original mmdetection3d preprocessing only in the addition of superpoint clustering. For S3DIS preprocessing, we follow the original [instructions](https://github.com/open-mmlab/mmdetection3d/tree/22aaa47fdb53ce1870ff92cb7e3f96ae38d17f61/data/s3dis) from mmdetection3d. We also [support](data/structured3d) the Structured3D dataset for pre-training.
Important notes:
* The metrics from our paper can be achieved in several ways; we simply chose the most stable one for each dataset in this repository.
* If you are interested in only one of the three segmentation tasks, it is possible to achieve slightly better metrics than reported in our paper. Specifically, increasing `model.criterion.sem_criterion.loss_weight` in the config file leads to better semantic metrics, while decreasing it improves instance metrics (see the example after this list).
* All models can be trained on a single GPU with 32 GB of memory (or even 24 GB for the ScanNet dataset). If you run out of RAM during instance segmentation evaluation at the validation or test stage, feel free to decrease `model.test_cfg.topk_insts` in the config file.
* Due to a bug in SpConv, we [reshape](tools/fix_spconv_checkpoint.py) backbone weights between the train and test stages.
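Both parameters can also be overridden from the command line through the standard mmengine `--cfg-options` flag instead of editing the config; the numeric values below are purely illustrative:

```shell
# illustrative values only; tune per task and dataset
python tools/train.py configs/oneformer3d_1xb4_scannet.py \
    --cfg-options model.criterion.sem_criterion.loss_weight=0.5
python tools/test.py configs/oneformer3d_1xb4_scannet.py \
    work_dirs/oneformer3d_1xb4_scannet/epoch_512.pth \
    --cfg-options model.test_cfg.topk_insts=100
```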

#### ScanNet

For ScanNet we present the model with a [SpConv](https://github.com/traveller59/spconv) backbone, superpoint pooling, selection of all queries, and semantics predicted directly from instance queries. The backbone is initialized from an [SSTNet](https://github.com/Gorilla-Lab-SCUT/SSTNet) checkpoint, which should be [downloaded](https://github.com/oneformer3d/oneformer3d/releases/download/v1.0/sstnet_scannet.pth) and put into `work_dirs/tmp` before training.
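For example, with `wget` (any download tool works):

```shell
mkdir -p work_dirs/tmp
wget -P work_dirs/tmp \
    https://github.com/oneformer3d/oneformer3d/releases/download/v1.0/sstnet_scannet.pth
```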

```shell
# train (with validation)
python tools/train.py configs/oneformer3d_1xb4_scannet.py
# test
python tools/fix_spconv_checkpoint.py \
    --in-path work_dirs/oneformer3d_1xb4_scannet/epoch_512.pth \
    --out-path work_dirs/oneformer3d_1xb4_scannet/epoch_512.pth
python tools/test.py configs/oneformer3d_1xb4_scannet.py \
    work_dirs/oneformer3d_1xb4_scannet/epoch_512.pth
```

#### ScanNet200

For ScanNet200 we present the model with a [MinkowskiEngine](https://github.com/NVIDIA/MinkowskiEngine) backbone, superpoint pooling, selection of all queries, and semantics predicted directly from instance queries. The backbone is initialized from a [Mask3D](https://github.com/JonasSchult/Mask3D) checkpoint, which should be [downloaded](https://github.com/oneformer3d/oneformer3d/releases/download/v1.0/mask3d_scannet200.pth) and put into `work_dirs/tmp` before training.
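For example:

```shell
mkdir -p work_dirs/tmp
wget -P work_dirs/tmp \
    https://github.com/oneformer3d/oneformer3d/releases/download/v1.0/mask3d_scannet200.pth
```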

```shell
# train (with validation)
python tools/train.py configs/oneformer3d_1xb4_scannet200.py
# test
python tools/test.py configs/oneformer3d_1xb4_scannet200.py \
    work_dirs/oneformer3d_1xb4_scannet200/epoch_512.pth
```

#### S3DIS

For S3DIS we present the model with a [SpConv](https://github.com/traveller59/spconv) backbone, w/o superpoint pooling, w/o query selection, and with separate semantic queries. The backbone is pre-trained on Structured3D and ScanNet; it can either be [downloaded](https://github.com/oneformer3d/oneformer3d/releases/download/v1.0/instance-only-oneformer3d_1xb2_scannet-and-structured3d.pth) and put into `work_dirs/tmp` before training, or trained with our code. We train the model on Areas 1, 2, 3, 4, and 6, and test on Area 5. To change this split, feel free to modify the `train_area` and `test_area` parameters in the config.
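To skip the pre-training step below, fetch the released backbone checkpoint instead, e.g.:

```shell
mkdir -p work_dirs/tmp
wget -P work_dirs/tmp \
    https://github.com/oneformer3d/oneformer3d/releases/download/v1.0/instance-only-oneformer3d_1xb2_scannet-and-structured3d.pth
```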

```shell
# pre-train
python tools/train.py configs/instance-only-oneformer3d_1xb2_scannet-and-structured3d.py
python tools/fix_spconv_checkpoint.py \
    --in-path work_dirs/instance-only-oneformer3d_1xb2_scannet-and-structured3d/iter_600000.pth \
    --out-path work_dirs/tmp/instance-only-oneformer3d_1xb2_scannet-and-structured3d.pth
# train (with validation)
python tools/train.py configs/oneformer3d_1xb2_s3dis-area-5.py
# test
python tools/fix_spconv_checkpoint.py \
    --in-path work_dirs/oneformer3d_1xb2_s3dis-area-5/epoch_512.pth \
    --out-path work_dirs/oneformer3d_1xb2_s3dis-area-5/epoch_512.pth
python tools/test.py configs/oneformer3d_1xb2_s3dis-area-5.py \
    work_dirs/oneformer3d_1xb2_s3dis-area-5/epoch_512.pth
```

### Models

Metric values in the table are given for the provided checkpoints and may differ slightly from those in our paper. Due to randomness, training may need to be run several times with the same config to achieve the best metrics.

| Dataset | mAP<sub>25</sub> | mAP<sub>50</sub> | mAP | mIoU | PQ | Download |
|:-------:|:----------------:|:----------------:|:---:|:----:|:--:|:--------:|
| ScanNet | 86.7 | 78.8 | 59.3 | 76.4 | 70.7 | [model](https://github.com/oneformer3d/oneformer3d/releases/download/v1.0/oneformer3d_1xb4_scannet.pth) &#124; [log](https://github.com/oneformer3d/oneformer3d/releases/download/v1.0/oneformer3d_1xb4_scannet.log) &#124; [config](configs/oneformer3d_1xb4_scannet.py) |
| ScanNet200 | 44.6 | 40.9 | 30.2 | 29.4 | 29.7 | [model](https://github.com/oneformer3d/oneformer3d/releases/download/v1.0/oneformer3d_1xb4_scannet200.pth) &#124; [log](https://github.com/oneformer3d/oneformer3d/releases/download/v1.0/oneformer3d_1xb4_scannet200.log) &#124; [config](configs/oneformer3d_1xb4_scannet200.py) |
| S3DIS | 80.6 | 72.7 | 58.0 | 71.9 | 64.6 | [model](https://github.com/oneformer3d/oneformer3d/releases/download/v1.0/oneformer3d_1xb2_s3dis-area-5.pth) &#124; [log](https://github.com/oneformer3d/oneformer3d/releases/download/v1.0/oneformer3d_1xb2_s3dis-area-5.log) &#124; [config](configs/oneformer3d_1xb2_s3dis-area-5.py) |
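As a usage sketch, a released checkpoint can be downloaded and evaluated with its config. We assume here that the released files are already in test-ready form; if evaluation fails for the SpConv-based models, apply the checkpoint fix from the notes above first:

```shell
mkdir -p work_dirs/tmp
wget -P work_dirs/tmp \
    https://github.com/oneformer3d/oneformer3d/releases/download/v1.0/oneformer3d_1xb4_scannet.pth
python tools/test.py configs/oneformer3d_1xb4_scannet.py \
    work_dirs/tmp/oneformer3d_1xb4_scannet.pth
```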

### Example Predictions

<p align="center">
  <img src="https://github.com/filaPro/oneformer3d/assets/6030962/12809615-7ed5-46a0-9321-747451862295" alt="ScanNet predictions"/>
</p>

### Citation

If you find this work useful for your research, please cite our paper:

```
@misc{kolodiazhnyi2023oneformer3d,
  url = {https://arxiv.org/abs/2311.14405},
  author = {Kolodiazhnyi, Maxim and Vorontsova, Anna and Konushin, Anton and Rukhovich, Danila},
  title = {OneFormer3D: One Transformer for Unified Point Cloud Segmentation},
  publisher = {arXiv},
  year = {2023}
}
```