no code implementations • CVPR 2020 • Boxiao Pan, Haoye Cai, De-An Huang, Kuan-Hui Lee, Adrien Gaidon, Ehsan Adeli, Juan Carlos Niebles
In this paper, we propose a novel spatio-temporal graph model for video captioning that exploits object interactions in space and time.
no code implementations • ECCV 2018 • Haoye Cai, Chunyan Bai, Yu-Wing Tai, Chi-Keung Tang
In the second stage, a skeleton-to-image network is trained to generate a human action video from the complete human pose sequence produced in the first stage.
Ranked #5 on Human action generation on NTU RGB+D 2D