Search Results for author: Junbin Xiao

Found 9 papers, 9 papers with code

Contrastive Video Question Answering via Video Graph Transformer

1 code implementation · 27 Feb 2023 · Junbin Xiao, Pan Zhou, Angela Yao, Yicong Li, Richang Hong, Shuicheng Yan, Tat-Seng Chua

CoVGT's uniqueness and superiority are three-fold: 1) it proposes a dynamic graph transformer module which encodes video by explicitly capturing the visual objects, their relations, and dynamics for complex spatio-temporal reasoning.

Contrastive Learning · Question Answering +1
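
A minimal PyTorch sketch of the kind of dynamic graph transformer module the excerpt above describes: per-frame object features are linked by a similarity-based soft adjacency, messages are passed along that graph, and a small temporal Transformer models the dynamics across frames. The class name, shapes, and layer sizes are illustrative assumptions, not the released CoVGT code.

```python
# Hedged sketch of a "dynamic graph transformer" over detected object features.
import torch
import torch.nn as nn

class DynamicGraphTransformer(nn.Module):
    def __init__(self, dim=256, heads=4, layers=2):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        enc_layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.temporal = nn.TransformerEncoder(enc_layer, layers)

    def forward(self, obj_feats):
        # obj_feats: (batch, frames, objects, dim) -- region features per frame
        b, t, n, d = obj_feats.shape
        x = obj_feats.reshape(b * t, n, d)
        # frame-wise relation graph: soft adjacency from pairwise similarity
        adj = torch.softmax(self.q_proj(x) @ self.k_proj(x).transpose(1, 2) / d ** 0.5, dim=-1)
        x = x + adj @ self.v_proj(x)                    # message passing over object relations
        frame_tokens = x.mean(dim=1).reshape(b, t, d)   # pool objects -> frame tokens
        return self.temporal(frame_tokens)              # model temporal dynamics across frames

video = torch.randn(2, 8, 5, 256)                       # 2 clips, 8 frames, 5 objects each
print(DynamicGraphTransformer()(video).shape)           # torch.Size([2, 8, 256])
```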

Equivariant and Invariant Grounding for Video Question Answering

2 code implementations · 26 Jul 2022 · Yicong Li, Xiang Wang, Junbin Xiao, Tat-Seng Chua

Specifically, equivariant grounding encourages the answer to be sensitive to semantic changes in the causal scene and the question; in contrast, invariant grounding forces the answer to be insensitive to changes in the environment scene.

Question Answering · Video Question Answering
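
A much-simplified sketch of how the two grounding signals in the excerpt above could be written as losses, assuming a grounding module has already split each video into a "causal" part and an "environment" part and produced answer logits for both. The function name and the uniform-target choice are assumptions for illustration, not the released code.

```python
import torch
import torch.nn.functional as F

def grounding_losses(causal_logits, env_logits, answer):
    # Equivariant grounding: answering from the causal scene (+ question) must
    # track the ground-truth answer, so it stays sensitive to their semantics.
    equivariant = F.cross_entropy(causal_logits, answer)
    # Invariant grounding: answering from the environment scene alone should be
    # uninformative, pushed toward a uniform distribution over candidates.
    uniform = torch.full_like(env_logits, 1.0 / env_logits.size(-1))
    invariant = F.kl_div(F.log_softmax(env_logits, dim=-1), uniform, reduction="batchmean")
    return equivariant + invariant

logits_c = torch.randn(4, 5)     # 4 questions, 5 candidate answers (causal scene)
logits_e = torch.randn(4, 5)     # same questions, environment scene only
gt = torch.tensor([0, 2, 1, 4])
print(grounding_losses(logits_c, logits_e, gt))
```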

Video Graph Transformer for Video Question Answering

2 code implementations · 12 Jul 2022 · Junbin Xiao, Pan Zhou, Tat-Seng Chua, Shuicheng Yan

VGT's uniqueness is two-fold: 1) it designs a dynamic graph transformer module which encodes video by explicitly capturing the visual objects, their relations, and dynamics for complex spatio-temporal reasoning; and 2) it exploits disentangled video and text Transformers for relevance comparison between the video and text to perform QA, instead of an entangled cross-modal Transformer for answer classification.

Ranked #2 on Video Question Answering on NExT-QA (using extra training data)

Question Answering · Video Question Answering +1
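
A hedged sketch of point 2) above: two disentangled Transformers encode the video and each candidate answer separately, and QA is cast as relevance (dot-product) comparison over candidates rather than cross-modal answer classification. The class name, shapes, and pooling choices are assumptions, not the released VGT implementation.

```python
import torch
import torch.nn as nn

class RelevanceQA(nn.Module):
    def __init__(self, dim=256, heads=4, layers=2):
        super().__init__()
        # disentangled video and text streams (no cross-modal fusion layers)
        self.video_enc = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True), layers)
        self.text_enc = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True), layers)

    def forward(self, video_tokens, answer_tokens):
        # video_tokens: (batch, frames, dim); answer_tokens: (batch, candidates, length, dim)
        v = self.video_enc(video_tokens).mean(dim=1)                        # (batch, dim)
        b, c, l, d = answer_tokens.shape
        a = self.text_enc(answer_tokens.reshape(b * c, l, d)).mean(dim=1)   # (batch*c, dim)
        return torch.einsum("bd,bcd->bc", v, a.reshape(b, c, d))            # relevance scores

model = RelevanceQA()
scores = model(torch.randn(2, 8, 256), torch.randn(2, 5, 6, 256))           # 5 candidates
print(scores.argmax(dim=-1))    # index of the most relevant candidate per example
```

Training would then apply a cross-entropy (or contrastive) loss over the per-candidate relevance scores rather than over a fused classification head.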

Invariant Grounding for Video Question Answering

1 code implementation · CVPR 2022 · Yicong Li, Xiang Wang, Junbin Xiao, Wei Ji, Tat-Seng Chua

At its core is understanding the alignment between visual scenes in the video and the linguistic semantics of the question to yield the answer.

Question Answering · Video Question Answering

Video Question Answering: Datasets, Algorithms and Challenges

1 code implementation · 2 Mar 2022 · Yaoyao Zhong, Junbin Xiao, Wei Ji, Yicong Li, Weihong Deng, Tat-Seng Chua

Video Question Answering (VideoQA) aims to answer natural language questions about the given videos.

Question Answering · Video Question Answering
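
For reference, the task this survey covers is usually formalized as picking the answer that maximizes a learned scoring function conditioned on the video and the question; the notation below is the common convention, not lifted from the paper.

```latex
% v: video, q: question, \mathcal{A}: candidate answer set, \mathcal{F}_\theta: VideoQA model
a^{*} = \arg\max_{a \in \mathcal{A}} \; \mathcal{F}_{\theta}(a \mid q, v)
```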

Video as Conditional Graph Hierarchy for Multi-Granular Question Answering

1 code implementation · 12 Dec 2021 · Junbin Xiao, Angela Yao, Zhiyuan Liu, Yicong Li, Wei Ji, Tat-Seng Chua

To align with the multi-granular essence of linguistic concepts in language queries, we propose to model video as a conditional graph hierarchy which weaves together visual facts of different granularity in a level-wise manner, with the guidance of corresponding textual cues.

Question Answering · Video Question Answering +1
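
A minimal, hypothetical sketch of the level-wise aggregation idea described above: visual facts are pooled from objects to frames to the whole video, with the pooling attention conditioned on the question at every level. Module names, shapes, and the two-level depth are assumptions, not the paper's released code.

```python
import torch
import torch.nn as nn

class QuestionGuidedPool(nn.Module):
    """Pool a set of visual nodes into one node, with attention conditioned on the question."""
    def __init__(self, dim=256):
        super().__init__()
        self.attn = nn.Linear(2 * dim, 1)

    def forward(self, nodes, question):
        # nodes: (batch, num_nodes, dim); question: (batch, dim)
        q = question.unsqueeze(1).expand(-1, nodes.size(1), -1)
        w = torch.softmax(self.attn(torch.cat([nodes, q], dim=-1)), dim=1)
        return (w * nodes).sum(dim=1)                     # (batch, dim)

class ConditionalHierarchy(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.obj_to_frame = QuestionGuidedPool(dim)       # objects -> frame-level facts
        self.frame_to_video = QuestionGuidedPool(dim)     # frames  -> video-level answer cue

    def forward(self, obj_feats, question):
        # obj_feats: (batch, frames, objects, dim); question: (batch, dim)
        b, t, n, d = obj_feats.shape
        q_rep = question.unsqueeze(1).expand(-1, t, -1).reshape(b * t, d)
        frames = self.obj_to_frame(obj_feats.reshape(b * t, n, d), q_rep).reshape(b, t, d)
        return self.frame_to_video(frames, question)      # (batch, dim), fed to the answerer

out = ConditionalHierarchy()(torch.randn(2, 8, 5, 256), torch.randn(2, 256))
print(out.shape)    # torch.Size([2, 256])
```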

NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions

1 code implementation · CVPR 2021 · Junbin Xiao, Xindi Shang, Angela Yao, Tat-Seng Chua

We introduce NExT-QA, a rigorously designed video question answering (VideoQA) benchmark to advance video understanding from describing to explaining the temporal actions.

Question Answering · Video Question Answering +2
