no code implementations • CVPR 2023 • Li Xu, Mark He Huang, Xindi Shang, Zehuan Yuan, Ying Sun, Jun Liu
Then, by following a novel meta-optimization scheme that optimizes the model to obtain good testing performance on the virtual testing sets after training on the virtual training set, our framework can effectively drive the model to better capture the semantics and visual representations of individual concepts, and thus obtain robust generalization performance even when handling novel compositions.
no code implementations • CVPR 2023 • Tianjiao Li, Lin Geng Foo, Ping Hu, Xindi Shang, Hossein Rahmani, Zehuan Yuan, Jun Liu
Pre-training VTs on such corrupted data can be challenging, especially when we pre-train via the masked autoencoding approach, where both the inputs and the masked "ground truth" targets can potentially be unreliable.
1 code implementation • CVPR 2021 • Junbin Xiao, Xindi Shang, Angela Yao, Tat-Seng Chua
We introduce NExT-QA, a rigorously designed video question answering (VideoQA) benchmark to advance video understanding from describing to explaining the temporal actions.
2 code implementations • 18 May 2021 • Junbin Xiao, Xindi Shang, Angela Yao, Tat-Seng Chua
We introduce NExT-QA, a rigorously designed video question answering (VideoQA) benchmark to advance video understanding from describing to explaining the temporal actions.
1 code implementation • ECCV 2020 • Junbin Xiao, Xindi Shang, Xun Yang, Sheng Tang, Tat-Seng Chua
In this paper, we explore a novel task named visual Relation Grounding in Videos (vRGV).
no code implementations • CVPR 2016 • Hanwang Zhang, Xindi Shang, Wenzhuo Yang, Huan Xu, Huanbo Luan, Tat-Seng Chua
Leveraging the structure of the proposed collaborative learning formulation, we develop an efficient online algorithm that can jointly learn the label embeddings and visual classifiers.