Paper | Video (CVPR) | Video (Reconstruction) | Project Page
This repository is the official implementation of the paper:
MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera
Felix Wimbauer*, Nan Yang*, Lukas von Stumberg, Niclas Zeller and Daniel Cremers
If you find our work useful, please consider citing our paper:
@InProceedings{wimbauer2020monorec,
title = {{MonoRec}: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera},
author = {Wimbauer, Felix and Yang, Nan and von Stumberg, Lukas and Zeller, Niclas and Cremers, Daniel},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2021},
}
The conda environment for this project can be set up by running the following command:
conda env create -f environment.yml
We provide a sample from the KITTI Odometry test set and a script to run MonoRec on it in example/.
To download the pretrained model and put it into the right place, run download_model.sh.
Alternatively, you can do this manually by downloading the weights from here and unpacking the file to saved/checkpoints/monorec_depth_ref.pth.
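Before running the example, it can help to confirm that the weights ended up where the instructions above expect them. The following is a minimal sketch, not part of the repository: the helper name checkpoint_ready is hypothetical, and it assumes it is called from the repository root.

```python
from pathlib import Path

# Path taken from the instructions above, relative to the repository root.
CHECKPOINT = Path("saved/checkpoints/monorec_depth_ref.pth")


def checkpoint_ready(repo_root="."):
    """Return True if the pretrained weights exist as a non-empty file."""
    path = Path(repo_root) / CHECKPOINT
    return path.is_file() and path.stat().st_size > 0


if __name__ == "__main__":
    if checkpoint_ready():
        print("Checkpoint found")
    else:
        print(f"Missing {CHECKPOINT}; run download_model.sh first")
```

This only checks that the file is present and non-empty; it does not verify that the weights are valid.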
The example script will plot the keyframe, depth prediction and mask prediction.
cd example
python test_monorec.py
In all of our experiments we used the KITTI Odometry dataset for training. For additional evaluations, we used the KITTI, Oxford RobotCar, TUM Mono-VO and TUM RGB-D datasets. All data paths can be specified in the respective configuration files. In our experiments, we put all datasets into a separate folder ../data.
To set up KITTI Odometry, download the color images and calibration files from the official website (around 145 GB). Instead of the given Velodyne laser data files, we use the improved ground truth depth for evaluation, which can be downloaded from here.
Unzip the color images and calibration files into ../data. The lidar depth maps can be extracted into the given folder structure by running data_loader/scripts/preprocess_kitti_extract_annotated_depth.py.
For training and evaluation, we use the poses estimated by Deep Virtual Stereo Odometry (DVSO). They can be downloaded from here and should be placed under ../data/{kitti_path}/poses_dso. This folder structure is ensured when unpacking the zip file in the {kitti_path} directory.
To supplement the self-supervised training, we use sparse depth maps generated by Deep Virtual Stereo Odometry (DVSO) during the pose estimation. They can be downloaded from here and should be placed under ../data/{kitti_path}/sequences/{seq_num}/image_depth_sparse. This folder structure is ensured when unpacking the zip file in the {kitti_path} directory.
The auxiliary moving object masks can be downloaded from here. They should be placed under ../data/{kitti_path}/sequences/{seq_num}/mvobj_mask. This folder structure is again ensured when unpacking the zip file in the {kitti_path} directory.
Finally, for mask training, we also use index masks for the training data, which can be downloaded from here. They should be placed under ../data/{kitti_path}/sequences/{seq_num}/. This folder structure is again ensured when unpacking the zip file in the {kitti_path} directory.
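The KITTI layout described above can be checked with a few lines of Python before training. This is a minimal sketch under stated assumptions: the function name missing_kitti_paths is hypothetical, the folder names are copied from the paths above, and the sequence identifiers passed in (for example "00") follow the usual KITTI Odometry naming. The index masks are not checked because their file names are not stated here.

```python
from pathlib import Path

# Per-sequence folders named in the instructions above.
PER_SEQUENCE_DIRS = ("image_depth_sparse", "mvobj_mask")


def missing_kitti_paths(kitti_root, seq_nums):
    """Return the expected paths that do not exist under kitti_root."""
    root = Path(kitti_root)
    expected = [root / "poses_dso"]
    for seq in seq_nums:
        for name in PER_SEQUENCE_DIRS:
            expected.append(root / "sequences" / seq / name)
    return [p for p in expected if not p.exists()]
```

For example, missing_kitti_paths("../data/dataset", ["00", "04"]) returns an empty list when the pose, sparse depth and mask folders are in place for those two sequences.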
To set up Oxford RobotCar, download the camera model files and the large sample from the official website. The code and the camera extrinsics need to be downloaded from the official GitHub repository.
Please move the content of the python folder to data_loaders/oxford_robotcar/.
extrinsics/, models/ and sample/ need to be moved to ../data/oxford_robotcar/. Note that for poses we use the official visual odometry poses, which are not provided in the large sample. They need to be downloaded manually from the raw dataset and unpacked into the sample folder.
Unfortunately, TUM Mono-VO images are provided only in the original, distorted form. Therefore, they need to be undistorted before being fed into MonoRec. To obtain poses for the sequences, we run the publicly available version of Direct Sparse Odometry.
The official sequences can be downloaded from the official website and need to be unpacked under ../data/tumrgbd/{sequence_name}. Note that our provided dataset implementation assumes intrinsics from fr3 sequences. Note also that the data loader for this dataset relies on the code from the Oxford RobotCar dataset.
This repository provides training and evaluation configurations to reproduce the results from the paper.
To train a model from scratch, first set the dataset_dir fields to the directory in which KITTI Odometry is located (default ../data/dataset).
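If several configuration files need the same change, the dataset_dir fields can be updated programmatically. The sketch below is an illustration, not part of the repository: the helper names set_dataset_dir and update_config are hypothetical, and because the nesting of the JSON files is not described here, the code simply walks the whole tree and replaces every key named dataset_dir.

```python
import json


def set_dataset_dir(node, new_dir):
    """Recursively replace every "dataset_dir" value in a parsed JSON tree."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "dataset_dir":
                node[key] = new_dir
            else:
                set_dataset_dir(value, new_dir)
    elif isinstance(node, list):
        for item in node:
            set_dataset_dir(item, new_dir)
    return node


def update_config(path, new_dir):
    """Rewrite one JSON config file in place with the new dataset directory."""
    with open(path) as f:
        config = json.load(f)
    set_dataset_dir(config, new_dir)
    with open(path, "w") as f:
        json.dump(config, f, indent=2)
```

Note that rewriting a file this way normalizes its indentation, so editing the fields by hand may be preferable if the configs are under version control.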
Then run the following commands in the given order:
python train.py --config configs/train/monorec/monorec_depth.json --options stereo # Depth Bootstrap
python train_monorec.py --config configs/train/monorec/monorec_mask.json --options stereo # Mask Bootstrap
python train_monorec.py --config configs/train/monorec/monorec_mask_ref.json --options mask_loss # Mask Refinement
python train_monorec.py --config configs/train/monorec/monorec_depth_ref.json --options stereo stereo_repr # Depth Refinement
The final model will be stored under saved/models/monorec_depth_ref/00/checkpoint.pth.
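Since the four training stages must run in the given order, they can be chained so that a failure in one stage stops the rest. The following is a minimal sketch assuming it is launched from the repository root inside the conda environment; the names STAGES, build_command and run_all_stages are hypothetical, while the scripts, configs and options are copied from the commands above.

```python
import subprocess
import sys

# (script, config, options) for each stage, copied from the commands above.
STAGES = [
    ("train.py", "configs/train/monorec/monorec_depth.json", ["stereo"]),
    ("train_monorec.py", "configs/train/monorec/monorec_mask.json", ["stereo"]),
    ("train_monorec.py", "configs/train/monorec/monorec_mask_ref.json", ["mask_loss"]),
    ("train_monorec.py", "configs/train/monorec/monorec_depth_ref.json",
     ["stereo", "stereo_repr"]),
]


def build_command(script, config, options):
    """Build the argument list for one training stage."""
    return [sys.executable, script, "--config", config, "--options", *options]


def run_all_stages():
    """Run the stages in order; a non-zero exit code aborts the sequence."""
    for script, config, options in STAGES:
        # check=True raises CalledProcessError, so later stages are skipped.
        subprocess.run(build_command(script, config, options), check=True)
```

Calling run_all_stages() is equivalent to running the four commands one after another by hand.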
We also provide checkpoints for each training stage:
| Training stage | Download |
|---|---|
| Depth Bootstrap | Link |
| Mask Bootstrap | Link |
| Mask Refinement | Link |
| Depth Refinement (final model) | Link |
Run download_model.sh to download the final model. It is automatically moved to saved/checkpoints.
To reproduce the evaluation results on different datasets, run the following commands:
python evaluate.py --config configs/evaluate/eval_monorec.json # KITTI Odometry
python evaluate.py --config configs/evaluate/eval_monorec_oxrc.json # Oxford RobotCar
To reproduce the point clouds depicted in the paper and video, use the following commands:
python create_pointcloud.py --config configs/test/pointcloud_monorec.json # KITTI Odometry
python create_pointcloud.py --config configs/test/pointcloud_monorec_oxrc.json # Oxford RobotCar
python create_pointcloud.py --config configs/test/pointcloud_monorec_tmvo.json # TUM Mono-VO