SIGGRAPH Asia 2023
SAILOR is a generalizable method for human free-view rendering and reconstruction from very sparse (e.g., 4) RGBD streams, achieving near real-time performance under acceleration.
Our free-view rendering results and bullet-time effects on our real-captured dataset (unseen performers).
Please install the Python dependencies listed in `requirements.txt`:

```
conda create -n SAILOR python=3.8
conda activate SAILOR
pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt
```
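Before building the extensions below, it can help to confirm that the CUDA build of PyTorch installed correctly. This is a minimal check, not part of the original instructions:

```python
# Verify that PyTorch 1.8.0 with CUDA 11.1 is importable and sees the GPU.
import torch

print(torch.__version__)             # expected: 1.8.0+cu111
print(torch.cuda.is_available())     # should print True on a CUDA-capable machine
print(torch.cuda.get_device_name(0)) # e.g. "NVIDIA GeForce RTX 3090"
```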
Install ImplicitSeg, the surface localization library provided by MonoPort; we use it for our fast post-merging operation.
Our code has been tested with the following configuration:
- Ubuntu 18.04, 20.04 or 22.04
- Python 3.8 and PyTorch 1.8.0
- GCC/G++ 9.5.0
- Nvidia GPU (RTX 3090), CUDA 11.1, cuDNN
Build the C++ and CUDA libraries:

- VoxelEncoding, FastNerf and Mesh-RenderUtil:
  ```
  cd c_lib/* && python setup.py install
  ```
  VoxelEncoding provides CUDA-accelerated versions of TSDF-Fusion, two-layer tree construction, ray-voxel intersection, adaptive point sampling, etc. FastNerf provides a fully-fused version of the MLPs and the Hydra-attention for our SRONet. A quick import check is sketched after this list.
- AugDepth and Depth2Color [optional] (Eigen3, OpenCV, OpenMP and pybind11 are required):
  ```
  cd c_lib/*
  mkdir build && cd build
  cmake .. && make
  ```
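After the setup scripts finish, a quick sanity check is to import the built extensions from Python. The module names below are assumptions based on the library names above; adjust them if your build registers different names:

```python
# Minimal sketch: confirm the compiled extensions can be imported.
# Module names are assumed from the library names; they may differ in your build.
import importlib

for name in ["VoxelEncoding", "FastNerf"]:
    try:
        importlib.import_module(name)
        print(f"{name}: OK")
    except ImportError as err:
        print(f"{name}: not found ({err})")
```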
- Clone or download this repo.
- Download our pretrained depth denoising model (`latest_model_BodyDRM2.pth`) and our rendering model (`latest_model_BasicRenNet.pth`) here.
- Move the downloaded models to the `./checkpoints_rend/SAILOR` folder.
The example static test data is provided in the `./test_data` folder. The data structure for static (or dynamic) data is listed below (a minimal loading sketch follows the tree):

```
<dataset_name>
|-- COLOR
|   |-- FRAMExxxx
|       |-- 0.jpg   # input RGB image (1024x1024) for each view
|       |-- 1.jpg
|       ...
|-- DEPTH
|   |-- FRAMExxxx
|       |-- 0.png   # input depth image (1024x1024, uint16; unit is m after dividing by 10000) for each view
|       |-- 1.png
|       ...
|-- MASK
|   |-- FRAMExxxx
|       |-- 0.png   # input human-region mask (1024x1024) for each view
|       |-- 1.png
|       ...
|-- PARAM
|   |-- FRAMExxxx
|       |-- 0.npy   # camera intrinsic ('K': 3x3) and pose ('RT': 3x4) matrices for each view
|       |-- 1.npy
|       ...
```
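As a reference for the formats above, the sketch below loads one view's inputs. The helper name, frame folder name, and the assumption that each `.npy` stores a Python dict are illustrative only:

```python
# Minimal sketch (illustrative): read one view of the per-frame data described above.
import cv2
import numpy as np

def load_view(dataset_root, frame="FRAME0000", view=0):
    color = cv2.imread(f"{dataset_root}/COLOR/{frame}/{view}.jpg")                     # 1024x1024 RGB
    mask  = cv2.imread(f"{dataset_root}/MASK/{frame}/{view}.png", cv2.IMREAD_GRAYSCALE)
    depth = cv2.imread(f"{dataset_root}/DEPTH/{frame}/{view}.png", cv2.IMREAD_UNCHANGED)
    depth_m = depth.astype(np.float32) / 10000.0                                       # uint16 -> meters
    # Assumption: the .npy stores a dict with keys 'K' (3x3) and 'RT' (3x4).
    param = np.load(f"{dataset_root}/PARAM/{frame}/{view}.npy", allow_pickle=True).item()
    K, RT = param["K"], param["RT"]
    return color, mask, depth_m, K, RT
```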
Depth denoising:
- Run `python -m depth_denoising.inference`.
- The original and denoised point clouds are written to the `./depth_denoising/results` folder. Use MeshLab to visualize the 3D results (a minimal back-projection sketch follows this list).
- Modify `basic_path`, `frame_idx` and `view_id` in the file `inference.py` to obtain the results of other examples.
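For reference, a metric depth map can be lifted to a point cloud manually as sketched below. This assumes the pixel model x_cam = depth * K^-1 [u, v, 1]^T and that 'RT' maps world to camera coordinates; if the stored convention is camera-to-world, drop the inversion:

```python
# Minimal sketch (assumptions noted above): back-project a metric depth map
# to a world-space point cloud using the 'K'/'RT' matrices from PARAM.
import numpy as np

def depth_to_points(depth_m, K, RT):
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(np.float32)
    cam = (np.linalg.inv(K) @ pix.T) * depth_m.reshape(1, -1)   # camera-space points (3xN)
    R, t = RT[:, :3], RT[:, 3:]                                 # assumed world-to-camera
    world = R.T @ (cam - t)                                     # invert the rigid transform
    valid = depth_m.reshape(-1) > 0                             # keep pixels with valid depth
    return world.T[valid]                                       # Nx3 point cloud
```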
SRONet and SRONetUp:
- For the provided static data, run `python -m upsampling.inference_static --name SAILOR` (in 1K resolution) or `python -m SRONet.inference_static --name SAILOR` (in 512 resolution) to obtain the reconstructed 3D mesh and free-view rendering results (an illustrative camera-orbit sketch follows this list).
- The reconstructed 3D meshes are in the `./checkpoints_rend/SAILOR/val_results` folder. To render a 3D mesh, run `python -m utils_render.render_mesh` to obtain the free-view mesh rendering results. Modify `opts.ren_data_root`, `obj_path` and `obj_name` in the file `render_mesh.py` to get new results.
- For dynamic data, first download our real-captured data here, unzip the data, and put it in the `./test_data` folder.
- For dynamic data, then run `python -m upsampling.inference_dynamic --name SAILOR` or `python -m SRONet.inference_dynamic --name SAILOR` to obtain the rendering results.
- Modify `opts.ren_data_root` and `opts.data_name` in `inference_static.py` and `inference_dynamic.py` to obtain new rendering results.
- The rendered images and videos are in the `./SRONet (or upsampling)/results` folder.
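For context on how free-view or bullet-time viewpoints can be parameterized in the same 'K'/'RT' format as the PARAM files, here is a purely illustrative orbit generator. It is not part of the repo's pipeline; the look-at convention, radius, and up-axis are assumptions:

```python
# Illustrative only: generate a circular orbit of camera poses ('RT', 3x4, world-to-camera)
# around a subject centered at `center`, e.g. for bullet-time style viewpoints.
import numpy as np

def orbit_extrinsics(center, radius=2.0, height=0.0, num_views=60):
    poses = []
    for theta in np.linspace(0.0, 2.0 * np.pi, num_views, endpoint=False):
        eye = center + np.array([radius * np.cos(theta), height, radius * np.sin(theta)])
        forward = center - eye                      # camera looks at the subject
        forward /= np.linalg.norm(forward)
        right = np.cross(np.array([0.0, 1.0, 0.0]), forward)   # assumed y-up world
        right /= np.linalg.norm(right)
        up = np.cross(forward, right)
        R = np.stack([right, up, forward])          # world-to-camera rotation
        t = -R @ eye
        poses.append(np.concatenate([R, t[:, None]], axis=1))  # 3x4 'RT'
    return poses
```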
Interactive rendering:
We release our interactive rendering GUI for our real-captured dataset.
- TensorRT is required to accelerate our depth denoising network and the encoders in SRONet (upsampling). Please refer to the TensorRT installation guide and then install torch2trt. Our TensorRT version is 7.2.
- Run `python -m depth_denoising.toTensorRT`, `python -m SRONet.toTensorRT` and `python -m upsampling.toTensorRT` to obtain the TRTModules (the parameter `opts.num_gpus` in `toTensorRT.py` controls the number of GPUs). The final pth models are in the `./SAILOR/accelerated_models` folder. A general torch2trt conversion sketch follows this list.
- Run `python -m gui.gui_render`. Modify `opts.ren_data_root` in `gui_render.py` to test other data, and modify `opts.num_gpus` to use 1 GPU (slow) or 2 GPUs. The GIF below shows the rendering result using two Nvidia RTX 3090 GPUs, an Intel i9-13900K CPU, and an MSI Z790 GODLIKE motherboard.
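For reference, the toTensorRT scripts build on torch2trt, whose general conversion pattern looks like the sketch below. The stand-in module, input shape, and output path are placeholders, not the repo's actual settings:

```python
# General torch2trt usage sketch (placeholder model, shapes, and file names).
import torch
import torch.nn as nn
from torch2trt import torch2trt, TRTModule

# Stand-in module: replace with the network you actually want to accelerate.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU()).cuda().eval()
example = torch.randn(1, 3, 512, 512).cuda()              # a representative input tensor

model_trt = torch2trt(model, [example], fp16_mode=True)   # build the TensorRT engine
torch.save(model_trt.state_dict(), "encoder_trt.pth")

# Later, reload the accelerated module and run inference with it.
trt_module = TRTModule()
trt_module.load_state_dict(torch.load("encoder_trt.pth"))
out = trt_module(example)
```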
The code, models, and GUI demos in this repository are released under the GPL-3.0 license.
If you find our work helpful to your research, please cite our paper.
```bibtex
@article{dong2023sailor,
  author    = {Zheng Dong and Ke Xu and Yaoan Gao and Qilin Sun and Hujun Bao and Weiwei Xu and Rynson W.H. Lau},
  title     = {SAILOR: Synergizing Radiance and Occupancy Fields for Live Human Performance Capture},
  year      = {2023},
  journal   = {ACM Transactions on Graphics (TOG)},
  volume    = {42},
  number    = {6},
  doi       = {10.1145/3618370},
  publisher = {ACM}
}
```