Search Results for author: Xindi Shang

Found 6 papers, 3 papers with code

Meta Compositional Referring Expression Segmentation

no code implementations • CVPR 2023 • Li Xu, Mark He Huang, Xindi Shang, Zehuan Yuan, Ying Sun, Jun Liu

Then, following a novel meta optimization scheme to optimize the model to obtain good testing performance on the virtual testing sets after training on the virtual training set, our framework can effectively drive the model to better capture semantics and visual representations of individual concepts, and thus obtain robust generalization performance even when handling novel compositions.

Meta-Learning Referring Expression +2

Token Boosting for Robust Self-Supervised Visual Transformer Pre-training

no code implementations • CVPR 2023 • Tianjiao Li, Lin Geng Foo, Ping Hu, Xindi Shang, Hossein Rahmani, Zehuan Yuan, Jun Liu

Pre-training VTs on such corrupted data can be challenging, especially when we pre-train via the masked autoencoding approach, where both the inputs and masked "ground truth" targets can potentially be unreliable in this case.

NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions

1 code implementation • CVPR 2021 • Junbin Xiao, Xindi Shang, Angela Yao, Tat-Seng Chua

We introduce NExT-QA, a rigorously designed video question answering (VideoQA) benchmark to advance video understanding from describing to explaining the temporal actions.

Question Answering Video Question Answering +2

NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions

2 code implementations • 18 May 2021 • Junbin Xiao, Xindi Shang, Angela Yao, Tat-Seng Chua

We introduce NExT-QA, a rigorously designed video question answering (VideoQA) benchmark to advance video understanding from describing to explaining the temporal actions.

Question Answering Video Question Answering +2

Visual Relation Grounding in Videos

1 code implementation • ECCV 2020 • Junbin Xiao, Xindi Shang, Xun Yang, Sheng Tang, Tat-Seng Chua

In this paper, we explore a novel task named visual Relation Grounding in Videos (vRGV).

Question Answering Relation +2

Online Collaborative Learning for Open-Vocabulary Visual Classifiers

no code implementations • CVPR 2016 • Hanwang Zhang, Xindi Shang, Wenzhuo Yang, Huan Xu, Huanbo Luan, Tat-Seng Chua

Leveraging on the structure of the proposed collaborative learning formulation, we develop an efficient online algorithm that can jointly learn the label embeddings and visual classifiers.
