MTUT (CVPR'2019)
@InProceedings{Abavisani_2019_CVPR,
  author = {Abavisani, Mahdi and Joze, Hamid Reza Vaezi and Patel, Vishal M.},
  title = {Improving the Performance of Unimodal Dynamic Hand-Gesture Recognition With Multimodal Training},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  year = {2019}
}
I3D (CVPR'2017)
@InProceedings{Carreira_2017_CVPR,
  author = {Carreira, Joao and Zisserman, Andrew},
  title = {Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset},
  booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {July},
  year = {2017}
}
NVGesture (CVPR'2016)
@InProceedings{Molchanov_2016_CVPR,
  author = {Molchanov, Pavlo and Yang, Xiaodong and Gupta, Shalini and Kim, Kihwan and Tyree, Stephen and Kautz, Jan},
  title = {Online Detection and Classification of Dynamic Hand Gestures With Recurrent 3D Convolutional Neural Network},
  booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  year = {2016}
}

Results on NVGesture test set

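| Arch | Input Size | fps | bbox | AP_rgb | AP_depth | ckpt | log |
| --- | --- | --- | --- | --- | --- | --- | --- |
| I3D+MTUT* | 112x112 | 15 | $\surd$ | 0.725 | 0.730 | ckpt | log |
| I3D+MTUT | 224x224 | 30 | $\surd$ | 0.782 | 0.811 | ckpt | log |
| I3D+MTUT | 224x224 | 30 | $\times$ | 0.739 | 0.809 | ckpt | log |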

*: MTUT supports multi-modal training and uni-modal testing. A model trained with this config can recognize gestures in RGB videos when run with the inference config.
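
To make "multi-modal training, uni-modal testing" concrete, here is a minimal sketch of how an RGB stream's training loss might borrow from a depth stream while inference uses RGB alone. It is not taken from this repository: the function names, the correlation-based alignment term, and the `beta` and `lam` values are all assumptions chosen for illustration.

```python
import math


def correlation_matrix(features):
    """Pairwise dot products between L2-normalised feature vectors (rows)."""
    normed = []
    for row in features:
        norm = math.sqrt(sum(v * v for v in row)) or 1.0  # guard all-zero rows
        normed.append([v / norm for v in row])
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in normed] for r1 in normed]


def alignment_loss(feat_a, feat_b):
    """Squared Frobenius distance between the two correlation matrices."""
    corr_a, corr_b = correlation_matrix(feat_a), correlation_matrix(feat_b)
    return sum(
        (x - y) ** 2
        for row_a, row_b in zip(corr_a, corr_b)
        for x, y in zip(row_a, row_b)
    )


def focal_weight(cls_loss_self, cls_loss_other, beta=2.0):
    """Non-zero only when the other modality currently classifies better."""
    gap = cls_loss_self - cls_loss_other
    return math.exp(beta * gap) - 1.0 if gap > 0 else 0.0


def rgb_training_loss(cls_loss_rgb, cls_loss_depth, feat_rgb, feat_depth,
                      lam=0.05, beta=2.0):
    """Training-time loss for the RGB stream (hypothetical form).

    The depth stream is only needed here; at test time the RGB network
    runs on its own and this function is never called.
    """
    weight = focal_weight(cls_loss_rgb, cls_loss_depth, beta)
    return cls_loss_rgb + lam * weight * alignment_loss(feat_rgb, feat_depth)
```

In this sketch the alignment term vanishes whenever the RGB stream already has the lower classification loss, so a weaker depth stream cannot pull the RGB features toward a worse representation.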