MTUT (CVPR'2019)
@InProceedings{Abavisani_2019_CVPR,
author = {Abavisani, Mahdi and Joze, Hamid Reza Vaezi and Patel, Vishal M.},
title = {Improving the Performance of Unimodal Dynamic Hand-Gesture Recognition With Multimodal Training},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2019}
}
I3D (CVPR'2017)
@InProceedings{Carreira_2017_CVPR,
author = {Carreira, Joao and Zisserman, Andrew},
title = {Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {July},
year = {2017}
}
NVGesture (CVPR'2016)
@InProceedings{Molchanov_2016_CVPR,
author = {Molchanov, Pavlo and Yang, Xiaodong and Gupta, Shalini and Kim, Kihwan and Tyree, Stephen and Kautz, Jan},
title = {Online Detection and Classification of Dynamic Hand Gestures With Recurrent 3D Convolutional Neural Network},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2016}
}
Results on NVGesture test set
Arch | Input Size | fps | bbox | AP_rgb | AP_depth | ckpt | log |
---|---|---|---|---|---|---|---|
I3D+MTUT* | 112x112 | 15 | | 0.725 | 0.730 | ckpt | log |
I3D+MTUT | 224x224 | 30 | | 0.782 | 0.811 | ckpt | log |
I3D+MTUT | 224x224 | 30 | | 0.739 | 0.809 | ckpt | log |
*: MTUT supports multi-modal training and uni-modal testing. A model trained with this config can recognize gestures in RGB videos when used with the inference config.