Version: 0.0.1
Date: 2024-09-14
Author: Zongyao Yi
Contact: zongyao.yi@dfki.de
This repo is a third-party implementation of FIGNet, from "Learning Rigid Dynamics with Face Interaction Graph Networks" [1], that aims to reproduce the results of the original paper. The package also includes the improvements introduced in the follow-up work (FIGNet*) [2].
This project is still in an early experimental stage and is not guaranteed to produce the same results as in the paper. Contributions are welcome if you find errors in the implementation.
The dataset is similar to the Kubric MOVi-A dataset, but uses Mujoco as the simulator and robosuite objects.
For each episode, 5 to 10 objects are sampled. Their attributes are randomized,
including initial poses and velocities, as well as their static
properties such as mass, friction and restitution. Floor static properties are
also randomized.
The dataset contains 10k episodes of 100 steps each, i.e. 1M steps in total.
Each step in the dataset corresponds to 10 simulation steps, so dt is calculated as dt=10*0.002=0.02, where 0.002 is the step length in Mujoco.
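The timing arithmetic above can be written out as a minimal sketch. The constant and function names below are illustrative and not taken from the repository; only the numeric values come from the text.

```python
import math

MUJOCO_TIMESTEP = 0.002  # simulator step length in seconds (value from the text)
INTERNAL_STEPS = 10      # simulation steps per dataset step (value from the text)
EP_LEN = 100             # dataset steps per episode (value from the text)


def dataset_dt(timestep: float = MUJOCO_TIMESTEP,
               internal_steps: int = INTERNAL_STEPS) -> float:
    """Time between two consecutive dataset steps, in seconds."""
    return internal_steps * timestep


def episode_duration(ep_len: int = EP_LEN) -> float:
    """Simulated time covered by one episode, in seconds."""
    return ep_len * dataset_dt()


if __name__ == "__main__":
    print(f"dt = {dataset_dt():.3f} s")
    print(f"episode duration = {episode_duration():.1f} s")
```

With the values above, one episode covers about two seconds of simulated time.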
Dataset format
The dataset is stored as a .npz file. Each trajectory is a dictionary:
{
"pos": (traj_len, n_obj, 3), # xyz
"quat": (traj_len, n_obj, 4), # xyzw
"obj_ids": {"obj_name": obj_id},
"meta_data": {}, # describes the scene and properties of objects
"mujoco_xml": str, # xml string to initialize mujoco simulation
}
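As a rough illustration of the layout above, the sketch below builds a synthetic trajectory with the documented keys and checks its shapes. How trajectories are keyed inside the .npz file is not stated here, so the example does not load a real file. The helper names (`make_dummy_trajectory`, `check_trajectory`, `xyzw_to_wxyz`) are hypothetical and not part of the package.

```python
import numpy as np


def make_dummy_trajectory(traj_len: int = 100, n_obj: int = 5) -> dict:
    """Build a synthetic trajectory with the documented keys and shapes."""
    quat = np.zeros((traj_len, n_obj, 4))
    quat[..., 3] = 1.0  # identity rotation in xyzw order
    return {
        "pos": np.zeros((traj_len, n_obj, 3)),            # xyz
        "quat": quat,                                     # xyzw
        "obj_ids": {f"obj_{i}": i for i in range(n_obj)}, # placeholder names
        "meta_data": {},
        "mujoco_xml": "<mujoco/>",                        # placeholder XML string
    }


def check_trajectory(traj: dict) -> tuple:
    """Validate the array shapes and return (traj_len, n_obj)."""
    pos = np.asarray(traj["pos"])
    quat = np.asarray(traj["quat"])
    if pos.ndim != 3 or pos.shape[-1] != 3:
        raise ValueError(f"pos must be (traj_len, n_obj, 3), got {pos.shape}")
    if quat.shape != pos.shape[:2] + (4,):
        raise ValueError(f"quat must be (traj_len, n_obj, 4), got {quat.shape}")
    if not np.allclose(np.linalg.norm(quat, axis=-1), 1.0, atol=1e-4):
        raise ValueError("quaternions are expected to be unit length")
    return pos.shape[0], pos.shape[1]


def xyzw_to_wxyz(quat: np.ndarray) -> np.ndarray:
    """Reorder quaternions from the dataset's xyzw layout to wxyz."""
    return quat[..., [3, 0, 1, 2]]


if __name__ == "__main__":
    traj = make_dummy_trajectory(traj_len=10, n_obj=3)
    print(check_trajectory(traj))
    print(xyzw_to_wxyz(traj["quat"])[0, 0])
```

The reordering helper is included because Mujoco itself stores quaternions scalar-first (wxyz), so data written back into a simulation would need this conversion.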
Node features: [node velocities ((seq_len-1)*3), inverse of mass (1), friction (3), restitution (1), object kinematic (1)]
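To make the feature layout concrete, here is a minimal sketch that assembles the vector for a single node in the order listed above. It assumes velocities are plain finite differences of consecutive positions (whether the actual code scales them by dt or normalizes them is not stated here), and the function names are hypothetical.

```python
from typing import List, Sequence


def node_feature_dim(seq_len: int) -> int:
    """Velocities (seq_len-1)*3 + inverse mass 1 + friction 3 + restitution 1 + kinematic flag 1."""
    return (seq_len - 1) * 3 + 1 + 3 + 1 + 1


def build_node_features(positions: Sequence[Sequence[float]],
                        mass: float,
                        friction: Sequence[float],
                        restitution: float,
                        is_kinematic: bool) -> List[float]:
    """Assemble one node's feature vector from seq_len xyz positions."""
    velocities: List[float] = []
    for prev, curr in zip(positions[:-1], positions[1:]):
        # Assumption: velocity is the raw position difference between steps.
        velocities.extend(c - p for p, c in zip(prev, curr))
    features = (velocities + [1.0 / mass] + list(friction)
                + [restitution] + [float(is_kinematic)])
    assert len(features) == node_feature_dim(len(positions))
    return features


if __name__ == "__main__":
    history = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [3.0, 0.0, 0.0]]
    print(node_feature_dim(len(history)))
    print(build_node_features(history, 2.0, [1.0, 0.005, 0.0001], 0.5, False))
```

With a history of three positions, as in `--input_seq_len=3`, the vector has 12 entries.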
Collision detection is implemented with the hpp-fcl library [3]. Face-face edge features are computed from the detection results.
Because of the object-mesh edges and the novel face-face edges, the graph consists of two sets of nodes (mesh and object nodes) and four sets of edges (mesh-mesh, mesh-object, object-mesh, face-face). According to the follow-up work on FIGNet [2], omitting the mesh-mesh edges helps the model scale without affecting accuracy, so the implementation also provides an option to omit them. Finally, the message passing layer is extended to handle face-face message passing.
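The node and edge sets described above can be summarized in a small sketch. The constants and the `active_edge_types` helper are illustrative only; the package may represent the graph quite differently.

```python
from typing import Tuple

NODE_TYPES: Tuple[str, ...] = ("mesh", "object")
EDGE_TYPES: Tuple[str, ...] = ("mesh-mesh", "mesh-object", "object-mesh", "face-face")


def active_edge_types(leave_out_mm: bool = False) -> Tuple[str, ...]:
    """Return the edge sets used in message passing.

    When leave_out_mm is True, the mesh-mesh edges are dropped, which is
    the option described in the text for better scalability.
    """
    if leave_out_mm:
        return tuple(e for e in EDGE_TYPES if e != "mesh-mesh")
    return EDGE_TYPES


if __name__ == "__main__":
    print(active_edge_types(leave_out_mm=False))
    print(active_edge_types(leave_out_mm=True))
```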
Install opengl related libraries
apt update && apt install ffmpeg libsm6 libxext6 -y
apt install libglfw3 libglfw3-dev -y
Install PyTorch and torchvision
# adapt to your CUDA version; this installs torch 2.4.0 with CUDA 12.1 and torchvision 0.19.0
pip3 install torch==2.4.0 torchvision==0.19.0 --extra-index-url https://download.pytorch.org/whl/cu121
Install PyTorch3D following its installation instructions
pip install 'git+https://github.com/facebookresearch/pytorch3d.git@stable' -v
Install torch-scatter
# adapt to your CUDA and Python versions, check https://data.pyg.org/whl
pip install https://data.pyg.org/whl/torch-2.4.0%2Bcu121/torch_scatter-2.1.2%2Bpt24cu121-cp38-cp38-linux_x86_64.whl
Install hpp-fcl. Some hpp-fcl features used here are not yet released (see issue #590), so the library has to be built from source (commit 7e3f33b) together with the latest eigenpy (v3.8.0). Alternatively, install the pre-built wheels for Python 3.8 as follows:
# Install the pre-compiled binaries through pip if you are using Python 3.8; try upgrading pip first
# pip install --upgrade pip
pip install https://cloud.dfki.de/owncloud/index.php/s/F9EwmwWkSW8pzfL/download/eigenpy-3.8.0-0-cp38-cp38-manylinux_2_31_x86_64.whl
pip install https://cloud.dfki.de/owncloud/index.php/s/Tb4baydBiRP6iN2/download/hpp_fcl-2.4.5-3-cp38-cp38-manylinux_2_31_x86_64.whl
git clone https://github.com/jongyaoY/fignet
cd fignet
pip install -r requirements.txt
pip install .
# Setup robosuite
python -m robosuite.scripts.setup_macros
python scripts/generate_data.py --ep_len=100 --internal_steps=10 --total_steps=1000000 --data_path=datasets # Generate 1M steps for training
python scripts/generate_data.py --ep_len=100 --internal_steps=10 --total_steps=100000 --data_path=datasets # Generate 100k steps for validation
You can pre-compute the graphs from the raw dataset beforehand so that training runs faster (training dataset only).
python scripts/preprocess_data.py --data_path=[path_to_dataset/train_dataset_name.npz] --num_workers=[default to 1] --config_file=config/train.yaml
This process takes around 8 hours with num_workers=8 and creates a folder path_to_dataset/train_dataset_name with all the pre-computed graphs stored inside. The dataset with 1M steps produces 960k graphs and takes around 335GB of disk space (uncompressed). Alternatively, the pre-computed graphs for training can be downloaded here; they need to be uncompressed after download.
For training you need to pass in a config file; a template can be found in config/train.yaml. Set data_path and test_data_path to the train and test datasets respectively. The train dataset can be either the raw dataset (npz file) or the folder containing pre-computed graphs, while the test dataset should be an npz file. Also adapt batch_size and num_workers to your hardware. As mentioned above, omitting the mesh-mesh edges improves memory efficiency [2]; set leave_out_mm=True for better scalability.
python scripts/train.py --config_file=config/train.yaml
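Before launching a long training run, the path rules above can be checked with a small helper. This is only a sketch: it assumes the config is available as a flat dictionary with the keys named in the text, whereas the real config/train.yaml may nest or name them differently, and `check_config` is a hypothetical function that is not part of the package.

```python
from pathlib import Path
from typing import List


def check_config(cfg: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: List[str] = []
    train = Path(cfg["data_path"])
    test = Path(cfg["test_data_path"])
    # Train data: raw .npz file or a folder of pre-computed graphs.
    if train.suffix != ".npz" and not train.is_dir():
        problems.append("data_path must be an .npz file or an existing folder")
    # Test data: raw .npz file only.
    if test.suffix != ".npz":
        problems.append("test_data_path must be an .npz file")
    if not isinstance(cfg.get("batch_size"), int) or cfg["batch_size"] < 1:
        problems.append("batch_size must be a positive integer")
    if not isinstance(cfg.get("num_workers"), int) or cfg["num_workers"] < 0:
        problems.append("num_workers must be a non-negative integer")
    return problems


if __name__ == "__main__":
    example = {
        "data_path": "datasets/train.npz",      # placeholder path
        "test_data_path": "datasets/test.npz",  # placeholder path
        "batch_size": 128,
        "num_workers": 8,
        "leave_out_mm": True,
    }
    print(check_config(example))
```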
The render_model.py script samples several rollouts with the learned simulator and generates animations of the ground truth and predicted trajectories.
python scripts/render_model.py --model_path=[model path] --num_ep=[number of episodes] --off_screen --video_path=[video path] --input_seq_len=3 --height=480 --width=640
# or, if leave_out_mm was set to true during training
python scripts/render_model.py --model_path=[model path] --leave_out_mm --num_ep=[number of episodes] --off_screen --video_path=[video path] --input_seq_len=3 --height=480 --width=640
The FIGNet model was trained for 1M steps with batch size 128, and FIGNet* for 500K steps with batch size 256. However, the accuracy did not improve significantly after around 400K steps.
Compared with the results from the papers [1][2], the rotational errors are slightly higher while the translational errors are about half as large. Presumably this is because the dataset used here has a smaller time step (dt=0.02), which accounts for the lower translational errors, and randomized object properties (shape, size, friction, restitution), which account for the higher rotational errors.
Model | Rotational Error (rad) | Translational Error (m) |
---|---|---|
FIGNet step 1M | 0.34 | 0.04 |
FIGNet* step 500K | 0.38 | 0.05 |
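For reference, one common way to compute such errors for a single object at a single step is sketched below: the rotational error is the angle of the relative rotation between two unit quaternions, and the translational error is the Euclidean distance between positions. This is an assumption about the metric, not a description of the evaluation code; the papers and this repo may average or define the errors differently, and the function names are hypothetical.

```python
import math
from typing import Sequence


def rotation_error(q_pred: Sequence[float], q_true: Sequence[float]) -> float:
    """Angle in radians between two unit quaternions (any consistent order)."""
    dot = abs(sum(a * b for a, b in zip(q_pred, q_true)))
    # abs() treats q and -q as the same rotation; clamp guards against
    # values slightly above 1 caused by rounding.
    return 2.0 * math.acos(min(1.0, dot))


def translation_error(p_pred: Sequence[float], p_true: Sequence[float]) -> float:
    """Euclidean distance in meters between two xyz positions."""
    return math.dist(p_pred, p_true)


if __name__ == "__main__":
    identity = (0.0, 0.0, 0.0, 1.0)  # xyzw
    quarter_turn_z = (0.0, 0.0, math.sin(math.pi / 4), math.cos(math.pi / 4))
    print(rotation_error(quarter_turn_z, identity))
    print(translation_error((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))
```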
The following images show a qualitative comparison between ground truth and model predictions over 100 steps:
Transfer to new objects and environment
The weights can be downloaded here:
- FIGNet_weights_itr_1M
- FIGNet_weights_itr_750k
- FIGNet_weights_itr_662k
- FIGNet*_weights_itr_470k
- FIGNet*_weights_itr_500k
The FIGNet implementation is heavily inspired by the PyTorch version of Graph Network Simulator and Mesh Graph Network Simulator: https://github.com/geoelements/gns. The following files are directly copied from gns (MIT License):
The following files are partially copied from gns:
This work was carried out as part of the ChargePal project, funded by a grant from the German Federal Ministry for Economic Affairs and Climate Action (BMWK) under grant number 01ME19003D.
[1] Allen, Kelsey R., et al. "Learning rigid dynamics with face interaction graph networks." arXiv preprint arXiv:2212.03574 (2022).
[2] Lopez-Guevara, Tatiana, et al. "Scaling Face Interaction Graph Networks to Real World Scenes." arXiv preprint arXiv:2401.11985 (2024).
[3] Pan, J., Chitta, S., Pan, J., Manocha, D., Mirabel, J., Carpentier, J., & Montaut, L. (2024). HPP-FCL - An extension of the Flexible Collision Library (Version 2.4.4) [Computer software]. https://github.com/humanoid-path-planner/hpp-fcl
RuntimeError: received 0 items of ancdata
Adding the following line to preprocess_data.py should solve the problem (see here).
torch.multiprocessing.set_sharing_strategy('file_system')