Unsupervised Learning of Depth, Optical Flow and Pose with Occlusion from 3D Geometry

Code for the papers:

G. Wang, H. Wang, Y. Liu, and W. Chen, Unsupervised Learning of Monocular Depth and Ego-Motion Using Multiple Masks, in International Conference on Robotics and Automation, pp. 4724-4730, 2019. [Poster]

G. Wang, C. Zhang, H. Wang, J. Wang, Y. Wang, and X. Wang, Unsupervised Learning of Depth, Optical Flow and Pose with Occlusion from 3D Geometry, in IEEE Transactions on Intelligent Transportation Systems, doi: 10.1109/TITS.2020.3010418.

Prerequisites

Python3 and pytorch are required. Besides, other libraries need to be installed by runing:

pip3 install -r requirements.txt

Preparing training data

KITTI

For KITTI, first download the dataset using this script provided on the official website, and then run the following command.

python3 data/prepare_train_data.py /path/to/raw/kitti/dataset/ --dataset-format 'kitti' --dump-root /path/to/resulting/formatted/data/ --width 832 --height 256 --num-threads 1 --static-frames data/static_frames.txt --with-gt

For testing optical flow ground truths on KITTI, download KITTI2015 dataset. You need to download 1) stereo 2015/flow 2015/scene flow 2015 data set (2 GB), 2) multi-view extension (14 GB), and 3) calibration files (1 MB) . You should have the following directory structure:

kitti2015
  | data_scene_flow  
  | data_scene_flow_calib
  | data_scene_flow_multiview

Cityscapes

For Cityscapes, download the following packages: 1) leftImg8bit_sequence_trainvaltest.zip, 2) camera_trainvaltest.zip. You will probably need to contact the administrators to be able to get it.

python3 data/prepare_train_data.py /path/to/cityscapes/dataset/ --dataset-format 'cityscapes' --dump-root /path/to/resulting/formatted/data/ --width 832 --height 342 --num-threads 1

Notice that for Cityscapes the img_height is set to 342 because we crop out the bottom part of the image that contains the car logo, and the resulting image will have height 256.

Training

python3 train.py /path/to/prepared/data \
--dispnet DispResNetS6 --posenet PoseNetB6 --flownet Back2Future \
-b 4 -pc 1.0 -pf 0.0 -m 0.0 -c 0.0 -s 0.2 \
--epoch-size 100 --log-output -f 30 --nlevels 6 --lr 1e-4 -wssim 0.85 --epochs 4000 \
--smoothness-type edgeaware --fix-masknet --fix-flownet --with-depth-gt --log-terminal \
--spatial-normalize-max --workers 8 --kitti-dir /data/to/kitti --add-less-than-mean-mask \
--add-maskp01 --using-none-mask --name demo \
--pretrained-disp /path/to/disp/model \
--pretrained-pose /path/to/pose/model

Tensorboard can be open with the command:

tensorboard --logdir=./

and visualize the training progress by opening https://localhost:6006 on your browser.

Evaluation

Disparity

python3 test_disp.py --dispnet DispResNetS6 --pretrained-dispnet /path/to/dispnet --pretrained-posent /path/to/posenet --dataset-dir /path/to/KITTI_raw --dataset-list /path/to/test_files_list

Pose

python test_pose.py pretrained/pose_model_best.pth.tar --img-width 832 --img-height 256 --dataset-dir /path/to/kitti/odometry/ --sequences 09 --posenet PoseNetB6

Optical Flow

python test_flow.py --pretrained-disp /path/to/dispnet --pretrained-pose /path/to/posenet --pretrained-mask /path/to/masknet --pretrained-flow /path/to/flownet --kitti-dir /path/to/kitti2015/dataset

Downloads

Pretrained Models

DispNet, PoseNet, and FlowNet in joint unsupervised learning of depth, pose and optical flow.

Acknowlegements

We are grateful to Anurag Ranjan for his github repository. Our code is based on theirs.

References

G. Wang, H. Wang, Y. Liu, and W. Chen, Unsupervised Learning of Monocular Depth and Ego-Motion Using Multiple Masks, in International Conference on Robotics and Automation, pp. 4724-4730, 2019.

G. Wang, C. Zhang, H. Wang, J. Wang, Y. Wang, and X. Wang, Unsupervised Learning of Depth, Optical Flow and Pose with Occlusion from 3D Geometry, in IEEE Transactions on Intelligent Transportation Systems, doi: 10.1109/TITS.2020.3010418.

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
data		data
datasets		datasets
flowutils		flowutils
kitti_eval		kitti_eval
models		models
stillbox_eval		stillbox_eval
LICENSE		LICENSE
Poster_ICRA2019.pdf		Poster_ICRA2019.pdf
README.md		README.md
custom_transforms.py		custom_transforms.py
inverse_warp_summary.py		inverse_warp_summary.py
logger.py		logger.py
loss_functions_summary.py		loss_functions_summary.py
requirements.txt		requirements.txt
ssim.py		ssim.py
test_disp.py		test_disp.py
test_flow.py		test_flow.py
test_pose.py		test_pose.py
train.py		train.py
utils.py		utils.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Unsupervised Learning of Depth, Optical Flow and Pose with Occlusion from 3D Geometry

Prerequisites

Preparing training data

KITTI

Cityscapes

Training

Evaluation

Disparity

Pose

Optical Flow

Downloads

Pretrained Models

Acknowlegements

References

About

Releases

Packages

Languages

License

guangmingw/DOPlearning

Folders and files

Latest commit

History

Repository files navigation

Unsupervised Learning of Depth, Optical Flow and Pose with Occlusion from 3D Geometry

Prerequisites

Preparing training data

KITTI

Cityscapes

Training

Evaluation

Disparity

Pose

Optical Flow

Downloads

Pretrained Models

Acknowlegements

References

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages