Ziyang Song, Jinxi Li, Bo Yang
We propose the first framework to represent dynamic 3D scenes in infinitely many ways from a monocular RGB video.
Our method enables sampling infinitely many different 3D scenes, all of which match the input monocular video in the observed views.
Please first install a GPU-supported PyTorch version that fits your machine. We have tested with PyTorch 1.13.0.
Then install PyTorch3D following the official guide. We have tested with PyTorch3D 0.7.5.
Install other dependencies:
pip install -r requirements.txt
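As a concrete sketch, assuming a Linux machine with CUDA 11.7 (the index URL and version pins below are our assumptions; follow the official PyTorch and PyTorch3D guides for other setups):

# Assumption: Linux + CUDA 11.7; adjust the index URL and versions for your machine.
pip install torch==1.13.0+cu117 torchvision==0.14.0+cu117 --extra-index-url https://download.pytorch.org/whl/cu117
# Install PyTorch3D 0.7.5 from source, one of the methods in its official install guide.
pip install "git+https://github.com/facebookresearch/pytorch3d.git@v0.7.5"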
Our processed datasets can be downloaded from Google Drive.
If you want to work on your own dataset, please refer to the data preparation guide.
You can download all our pre-trained models from Google Drive.
python train.py config/indoor/chessboard.yaml --use_wandb
Specify --use_wandb to log the training with WandB.
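For example, a full training run with logging enabled (the one-time wandb login step is standard WandB CLI setup, not specific to this repo):

# Authenticate with WandB once, if you have not already.
wandb login
# Train on the chessboard scene and log metrics to WandB.
python train.py config/indoor/chessboard.yaml --use_wandb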
python sample.py config/indoor/chessboard.yaml --checkpoint ${CHECKPOINT}
${CHECKPOINT} is the iteration of the checkpoint to load, e.g., 30000.
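For instance, to sample scenes from the checkpoint saved at iteration 30000:

python sample.py config/indoor/chessboard.yaml --checkpoint 30000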
python test.py config/indoor/chessboard.yaml --checkpoint ${CHECKPOINT} --n_sample_scale_test 1000 --scale_id ${SCALE_ID} --render_test
Specify --render_test to render testing views; otherwise, training views are rendered.
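Since the framework can sample many plausible scenes, you may want to render several of them in one go. A hypothetical loop over the first three scale IDs (assuming scale IDs are zero-indexed integers) could look like:

# Render test views for three sampled scenes; scale IDs assumed zero-indexed.
CHECKPOINT=30000
for SCALE_ID in 0 1 2; do
    python test.py config/indoor/chessboard.yaml --checkpoint ${CHECKPOINT} --n_sample_scale_test 1000 --scale_id ${SCALE_ID} --render_test
done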
python evaluate.py --dataset_path ${DATASET_PATH} --render_path ${RENDER_PATH} --split test --eval_depth --eval_segm --mask
Specify --eval_depth to evaluate depth, --eval_segm to evaluate segmentation, and --mask to apply the co-visibility mask as in DyCheck.
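A full invocation might look like the following; the dataset and render paths are hypothetical placeholders, so substitute your own directories:

# Paths below are illustrative only; point them at your data and rendered outputs.
python evaluate.py --dataset_path data/indoor/chessboard --render_path renders/chessboard/test --split test --eval_depth --eval_segm --mask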
If you find our work useful in your research, please consider citing:
@article{song2024,
  title={{OSN: Infinite Representations of Dynamic 3D Scenes from Monocular Videos}},
  author={Song, Ziyang and Li, Jinxi and Yang, Bo},
  journal={ICML},
  year={2024}
}