This repo holds the code for the L-GCN framework presented at AAAI 2020.
Location-aware Graph Convolutional Networks for Video Question Answering. Deng Huang, Peihao Chen, Runhao Zeng, Qing Du, Mingkui Tan, Chuang Gan. AAAI 2020, New York.
[Paper]
Code Preparation
Clone this repo with git
git clone https://github.com/SunDoge/L-GCN.git
cd L-GCN
Module Preparation
This repo requires PyTorch >= 1.2.
Other modules can be installed by running:
pip install -r requirements.txt
python -m spacy download en
Data Preparation
Extract frames by following the instructions in tgif-qa.
./save-frames.sh data/tgif/{gifs,frames}
Some GIFs cannot be read by ffmpeg. For those, you can use ImageMagick to save the frames.
convert img.gif img/%d.jpg
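The ImageMagick fallback above can be scripted for the GIFs that ffmpeg fails on. The following is a minimal sketch, not part of this repo: the helper names are hypothetical, and it assumes each GIF's frames go into a directory named after the GIF, as the `img.gif` to `img/%d.jpg` example suggests.

```python
import subprocess
from pathlib import Path


def imagemagick_command(gif_path, frames_root):
    """Build the `convert` command that writes <frames_root>/<gif name>/<n>.jpg."""
    gif_path = Path(gif_path)
    out_dir = Path(frames_root) / gif_path.stem
    return ["convert", str(gif_path), str(out_dir / "%d.jpg")]


def extract_with_imagemagick(gif_path, frames_root):
    """Create the output directory, then run ImageMagick on one GIF."""
    cmd = imagemagick_command(gif_path, frames_root)
    Path(cmd[-1]).parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure


# Show the command that would be run for one GIF (nothing is executed here).
print(imagemagick_command("data/tgif/gifs/img.gif", "data/tgif/frames"))
```

Call `extract_with_imagemagick` only for the GIFs whose frame directories are missing or empty after the ffmpeg pass.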
Since there are too many frames to process, we split them into N parts.
python -m scripts.split_n_parts -o data/tgif/frame_splits/
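To illustrate the idea of splitting the work into N parts, here is a minimal sketch of a contiguous, near-equal split. It is not the implementation of `scripts.split_n_parts`, whose internals and pickle format are not shown here; the function name and the sample frame paths are hypothetical.

```python
def split_into_parts(items, n):
    """Split items into n contiguous parts whose sizes differ by at most one."""
    if n <= 0:
        raise ValueError("n must be positive")
    base, extra = divmod(len(items), n)
    parts, start = [], 0
    for i in range(n):
        # The first `extra` parts each take one additional item.
        size = base + (1 if i < extra else 0)
        parts.append(items[start:start + size])
        start += size
    return parts


frames = [f"gif{i}/0.jpg" for i in range(10)]  # made-up frame paths
parts = split_into_parts(frames, 3)
print([len(p) for p in parts])  # [4, 3, 3]
```

Each part can then be processed by a separate job, which is what the per-split commands in this section do.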
Extract bounding boxes (bboxes) using Mask R-CNN. Check the script for more arguments.
python -m scripts.extract_bboxes_with_maskrcnn \
-f data/tgif/frame_splits/split0.pkl \
-o data/tgif/bboxes_splits/split0.pt \
-c /path/to/e2e_mask_rcnn_R_101_FPN_1x_caffe2.yaml
python -m scripts.merge_box_scores_and_labels \
--bboxes data/tgif/bboxes_splits \
-o data/tgif/bboxes
python -m scripts.extract_resnet152_features_with_bboxes \
-i data/tgif/frames \
-f data/tgif/frame_splits/split0.pkl \
-p data/tgif/bboxes_splits/split0.pt \
-o data/tgif/bbox_features_splits/split0layer4
python -m scripts.merge_bboxes \
--bboxes data/tgif/bbox_features_splits \
-o data/tgif/resnet152_bbox_features
python -m scripts.extract_resnet152_features \
-i data/tgif/frames
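The per-split commands above are written for a single split (split0). The sketch below shows how the bbox extraction command could be generated for every split. It assumes the splits are named split0.pkl, split1.pkl, and so on (only split0 appears above); `NUM_SPLITS` and the helper name are hypothetical.

```python
import shlex

NUM_SPLITS = 4  # assumption: set this to the N used when splitting the frames
CONFIG = "/path/to/e2e_mask_rcnn_R_101_FPN_1x_caffe2.yaml"


def bbox_extraction_command(split_index, config_path=CONFIG):
    """Build the Mask R-CNN bbox extraction command for one split."""
    return [
        "python", "-m", "scripts.extract_bboxes_with_maskrcnn",
        "-f", f"data/tgif/frame_splits/split{split_index}.pkl",
        "-o", f"data/tgif/bboxes_splits/split{split_index}.pt",
        "-c", config_path,
    ]


# Print one command per split; run them sequentially or on separate GPUs.
for i in range(NUM_SPLITS):
    print(shlex.join(bbox_extraction_command(i)))
```

The same pattern applies to `scripts.extract_resnet152_features_with_bboxes`. The merge scripts should be run only after all splits have finished.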
Training
Use the following command to train L-GCN:
python train.py -c config/resnet152-bbox/$TASK_CONFIG -e $PATH_TO_SAVE_RESULT
- $TASK_CONFIG denotes the task config. There are four choices: action.conf, transition.conf, frameqa.conf, count.conf
- $PATH_TO_SAVE_RESULT denotes the path where the results are saved
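To train all four tasks in turn, the training command can be generated per config. This is a minimal sketch: the `results/<task>` output layout and the helper name are assumptions, not something the repo prescribes.

```python
import shlex

TASK_CONFIGS = ["action.conf", "transition.conf", "frameqa.conf", "count.conf"]


def train_command(task_config, result_root="results"):
    """Build the train.py command for one task config.

    result_root is a hypothetical output directory; pick any path you like.
    """
    task = task_config.rsplit(".", 1)[0]  # "action.conf" -> "action"
    return [
        "python", "train.py",
        "-c", f"config/resnet152-bbox/{task_config}",
        "-e", f"{result_root}/{task}",
    ]


# Print the four commands; pass each list to subprocess.run to execute it.
for config in TASK_CONFIGS:
    print(shlex.join(train_command(config)))
```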
Citation
Please cite the following paper if you find L-GCN useful in your research:
@inproceedings{L-GCN2020AAAI,
author = {Deng Huang and
Peihao Chen and
Runhao Zeng and
Qing Du and
Mingkui Tan and
Chuang Gan},
title = {Location-aware Graph Convolutional Networks for Video Question Answering},
booktitle = {AAAI},
year = {2020},
}
Contact
For any questions, please file an issue or contact:
im.huangdeng@gmail.com
phchencs@gmail.com