OnePose++: Keypoint-Free One-Shot Object Pose Estimation without CAD Models
Xingyi He*, Jiaming Sun*, Yu'ang Wang, Di Huang, Hujun Bao, Xiaowei Zhou
NeurIPS 2022
- Training, inference and demo code.
- Pipeline to reproduce the evaluation results on the OnePose dataset and proposed OnePose_LowTexture dataset.
- OnePose Cap app is available at the App Store (iOS only) to capture your own training and test data.
conda env create -f environment.yaml
conda activate oneposeplus
LoFTR and DeepLM are used in this project. We thank their authors for the great work and for their contribution to the community. Please follow their installation instructions and licenses:
git submodule update --init --recursive
# Install DeepLM
cd submodules/DeepLM
sh example.sh
cp ${REPO_ROOT}/backup/deeplm_init_backup.py ${REPO_ROOT}/submodules/DeepLM/__init__.py
Note that the efficient optimizer DeepLM is used in our SfM refinement phase. If the installation fails, you can still run the code with our first-order optimizer, which is a little slower.
COLMAP is also used in this project for Structure-from-Motion. Please refer to the official instructions for the installation.
Download the pretrained models, including our 2D-3D matching and LoFTR models. Then move them to ${REPO_ROOT}/weights.
[Optional] You can try our web-based 3D visualization tool Wis3D for convenient, interactive visualization of feature matches and point clouds. Wis3D also provides many other visualization features.
# Work in progress; should be ready very soon. Only available on test-pypi for now.
pip install -i https://test.pypi.org/simple/ wis3d
After the installation, you can refer to this page to run the demo with your custom data.
- Download OnePose dataset from here and OnePose_LowTexture dataset from here, and extract them into $/your/path/to/onepose_datasets. If you want to evaluate on LINEMOD dataset, download the real training data, test data and 3D object models from CDPN, and detection results by YOLOv5 from here. Then extract them into $/your/path/to/onepose_datasets/LINEMOD.
The directory should be organized in the following structure:
|--- /your/path/to/datasets
|       |--- train_data
|       |--- val_data
|       |--- test_data
|       |--- lowtexture_test_data
|       |--- LINEMOD
|       |      |--- real_train
|       |      |--- real_test
|       |      |--- models
|       |      |--- yolo_detection
You can refer to the dataset document for more information about the OnePose_LowTexture dataset.
- Build the dataset symlinks
REPO_ROOT=/path/to/OnePose_Plus_Plus
ln -s /your/path/to/datasets $REPO_ROOT/data/datasets
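Before linking, it can help to confirm that the extracted folders match the layout above. The following is a minimal Python sketch and is not part of the repository: `check_layout` and `link_datasets` are hypothetical helper names, and the folder names are copied from the directory structure shown above.

```python
from pathlib import Path

# Sub-directories expected under the dataset root, taken from the layout above.
EXPECTED_DIRS = [
    "train_data",
    "val_data",
    "test_data",
    "lowtexture_test_data",
]
# Only needed when evaluating on LINEMOD.
LINEMOD_DIRS = [
    "LINEMOD/real_train",
    "LINEMOD/real_test",
    "LINEMOD/models",
    "LINEMOD/yolo_detection",
]


def check_layout(dataset_root, with_linemod=False):
    """Return the expected sub-directories that are missing under dataset_root."""
    root = Path(dataset_root)
    wanted = EXPECTED_DIRS + (LINEMOD_DIRS if with_linemod else [])
    return [d for d in wanted if not (root / d).is_dir()]


def link_datasets(dataset_root, repo_root):
    """Create repo_root/data/datasets as a symlink to dataset_root (like `ln -s`)."""
    link = Path(repo_root) / "data" / "datasets"
    link.parent.mkdir(parents=True, exist_ok=True)
    # Leave an existing link or directory untouched instead of overwriting it.
    if not link.is_symlink() and not link.exists():
        link.symlink_to(Path(dataset_root).resolve(), target_is_directory=True)
    return link
```

A possible usage is to call `check_layout("/your/path/to/datasets", with_linemod=True)` first and only call `link_datasets(...)` once the returned list is empty.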
Reconstructed semi-dense object point clouds and 2D-3D correspondences are needed for both training and test objects:
python run.py +preprocess=sfm_train_data.yaml use_local_ray=True # for train data
python run.py +preprocess=sfm_inference_onepose_val.yaml use_local_ray=True # for val data
python run.py +preprocess=sfm_inference_onepose.yaml use_local_ray=True # for test data
python run.py +preprocess=sfm_inference_lowtexture.yaml use_local_ray=True # for lowtexture test data
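The four preprocessing runs above differ only in the config name, so they can be generated in a loop. The sketch below is an illustration under that assumption: `preprocess_commands` is a hypothetical helper, not a script shipped with the repository, and it only builds the argument lists rather than launching anything.

```python
import shlex

# Preprocess configs listed above, in the order they are run.
PREPROCESS_CONFIGS = [
    "sfm_train_data.yaml",            # train data
    "sfm_inference_onepose_val.yaml", # val data
    "sfm_inference_onepose.yaml",     # test data
    "sfm_inference_lowtexture.yaml",  # lowtexture test data
]


def preprocess_commands(use_local_ray=True):
    """Build one `python run.py ...` argument list per preprocess config."""
    return [
        ["python", "run.py", f"+preprocess={cfg}", f"use_local_ray={use_local_ray}"]
        for cfg in PREPROCESS_CONFIGS
    ]


if __name__ == "__main__":
    # Print the commands; to execute them, pass each list to
    # subprocess.run(cmd, check=True) from the repository root.
    for cmd in preprocess_commands():
        print(shlex.join(cmd))
```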
# Eval OnePose dataset:
python inference.py +experiment=inference_onepose.yaml use_local_ray=True verbose=True
# Eval OnePose_LowTexture dataset:
python inference.py +experiment=inference_onepose_lowtexture.yaml use_local_ray=True verbose=True
Note that by default we run the evaluation in parallel on a single GPU with two workers. If your GPU has less than 6GB of memory, add use_local_ray=False to turn off the parallelization.
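The memory rule above can be written as a small helper. This is a sketch under stated assumptions: `use_local_ray_flag` and `inference_command` are hypothetical names, the 6GB threshold comes from the note above, and the GPU memory is passed in by the caller (with PyTorch installed it could be read from `torch.cuda.get_device_properties(0).total_memory`, which reports bytes).

```python
def use_local_ray_flag(gpu_mem_gb, min_mem_gb=6.0):
    """Return the override that enables or disables parallel evaluation.

    Parallel evaluation stays on unless the GPU has less than min_mem_gb.
    """
    enabled = gpu_mem_gb >= min_mem_gb
    return f"use_local_ray={enabled}"


def inference_command(experiment, gpu_mem_gb):
    """Build the `python inference.py ...` argument list for one experiment config."""
    return [
        "python",
        "inference.py",
        f"+experiment={experiment}",
        use_local_ray_flag(gpu_mem_gb),
        "verbose=True",
    ]
```

For example, `inference_command("inference_onepose.yaml", gpu_mem_gb=4)` builds the OnePose evaluation command with parallelization turned off.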
# Parse LINEMOD Dataset to OnePose Dataset format:
sh scripts/parse_linemod_objs.sh
# Reconstruct SfM model on real training data:
python run.py +preprocess=sfm_inference_LINEMOD.yaml use_local_ray=True
# Eval LINEMOD dataset:
python inference.py +experiment=inference_LINEMOD.yaml use_local_ray=True verbose=True
- Prepare ground-truth annotations. Merge annotations of training/val data:
python merge.py +preprocess=merge_annotation_train.yaml
python merge.py +preprocess=merge_annotation_val.yaml
- Begin training
python train_onepose_plus.py +experiment=train.yaml exp_name=onepose_plus_train
Note that the default config for training uses 8 GPUs with around 23GB VRAM for each GPU. You can set the GPU number or ID in trainer.gpus and reduce the batch size in datamodule.batch_size to reduce the GPU VRAM footprint.
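As an illustration of a smaller training setup, the sketch below builds the training command with both settings changed. It assumes, without confirmation from the repository, that `trainer.gpus` and `datamodule.batch_size` can be overridden on the command line in the same `key=value` form as `exp_name`; if they cannot, edit the values in the training config instead. `train_command` is a hypothetical helper name.

```python
def train_command(exp_name, gpus=None, batch_size=None):
    """Build the `python train_onepose_plus.py ...` argument list.

    gpus and batch_size are appended only when given, so the defaults
    from the training config apply otherwise.
    """
    cmd = [
        "python",
        "train_onepose_plus.py",
        "+experiment=train.yaml",
        f"exp_name={exp_name}",
    ]
    if gpus is not None:
        # Assumed command-line override of the trainer.gpus config key.
        cmd.append(f"trainer.gpus={gpus}")
    if batch_size is not None:
        # Assumed command-line override of the datamodule.batch_size config key.
        cmd.append(f"datamodule.batch_size={batch_size}")
    return cmd
```

For instance, `train_command("onepose_plus_train", gpus=2, batch_size=1)` yields a command for two GPUs with a reduced batch size; the specific values here are placeholders, not recommended settings.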
All model weights will be saved under ${REPO_ROOT}/models/checkpoints/${exp_name} and logs will be saved under ${REPO_ROOT}/logs/${exp_name}.
You can visualize the training process with TensorBoard:
tensorboard --logdir logs --bind_all --port your_port_number
If you find this code useful for your research, please use the following BibTeX entry.
@inproceedings{
he2022oneposeplusplus,
title={OnePose++: Keypoint-Free One-Shot Object Pose Estimation without {CAD} Models},
author={Xingyi He and Jiaming Sun and Yuang Wang and Di Huang and Hujun Bao and Xiaowei Zhou},
booktitle={Advances in Neural Information Processing Systems},
year={2022}
}
Part of our code is borrowed from hloc and LoFTR. Thanks to their authors for their great work.