1 code implementation • 30 Dec 2021 • Ziyu Wang, Dejing Xu, Gus Xia, Ying Shan
This is the audio-to-symbolic arrangement problem we tackle in this paper.
no code implementations • 23 Sep 2020 • Binjie Zhang, Yu Li, Chun Yuan, Dejing Xu, Pin Jiang, Ying Shan
The task of language-guided video temporal grounding is to localize the particular video clip corresponding to a query sentence in an untrimmed video.
1 code implementation • 6 Jun 2019 • Zhou Yu, Dejing Xu, Jun Yu, Ting Yu, Zhou Zhao, Yueting Zhuang, DaCheng Tao
It is both crucial and natural to extend this research direction to the video domain for video question answering (VideoQA).
Ranked #29 on Video Question Answering on ActivityNet-QA
Visual Question Answering (VQA) • Zero-Shot Video Question Answer
no code implementations • CVPR 2019 • Dejing Xu, Jun Xiao, Zhou Zhao, Jian Shao, Di Xie, Yueting Zhuang
Our method can learn the spatiotemporal representation of the video by predicting the order of shuffled clips from the video.
Ranked #43 on Self-Supervised Action Recognition on UCF101
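
The clip order prediction idea in the entry above can be illustrated with a minimal sketch of how one self-supervised training sample might be built: take a few non-overlapping clips from a video, shuffle them, and use the index of the applied permutation as the classification label. The function name `make_clip_order_sample` and the parameters `num_clips`, `clip_len`, and `interval` (with their default values) are assumptions made for illustration; the listing only states that the method predicts the order of shuffled clips.

```python
import itertools
import random


def make_clip_order_sample(num_frames, num_clips=3, clip_len=16, interval=8, rng=None):
    """Build one clip-order sample for a video with `num_frames` frames.

    Returns (shuffled_clips, label). Each clip is a list of frame indices.
    `label` is the index of the applied permutation in the lexicographically
    ordered list of all num_clips! permutations. All defaults are hypothetical.
    """
    rng = rng or random.Random()
    # Total span covered by the clips plus the gaps between them.
    span = num_clips * clip_len + (num_clips - 1) * interval
    if num_frames < span:
        raise ValueError("video too short for the requested clips")

    # Pick a random starting frame, then cut evenly spaced, non-overlapping clips.
    start = rng.randint(0, num_frames - span)
    stride = clip_len + interval
    clips = [
        list(range(start + i * stride, start + i * stride + clip_len))
        for i in range(num_clips)
    ]

    # Choose a permutation; its index is the target a classifier would predict.
    permutations = list(itertools.permutations(range(num_clips)))
    label = rng.randrange(len(permutations))
    shuffled_clips = [clips[i] for i in permutations[label]]
    return shuffled_clips, label


if __name__ == "__main__":
    clips, label = make_clip_order_sample(num_frames=100, rng=random.Random(0))
    print("label:", label)
    print("first frame of each shuffled clip:", [c[0] for c in clips])
```

In a full pipeline the frame indices would be used to load the actual clips, which a video encoder and a small classification head would then map to one of the `num_clips!` order classes. That part is omitted here because the listing gives no detail about it.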