no code implementations • 2 Mar 2023 • Bo Wan, Yongfei Liu, Desen Zhou, Tinne Tuytelaars, Xuming He
Human-object interaction (HOI) detection plays a crucial role in human-centric scene understanding and serves as a fundamental building block for many vision tasks.
Human-Object Interaction Detection, Knowledge Distillation
no code implementations • ICLR 2022 • Bo Wan, Wenjuan Han, Zilong Zheng, Tinne Tuytelaars
We introduce a new task, unsupervised vision-language (VL) grammar induction.
1 code implementation • 9 Sep 2021 • Qian He, Desen Zhou, Bo Wan, Xuming He
To address those challenges, we adopt a primitive-based representation for 3D objects and propose a two-stage graph network for primitive-based 3D object estimation, which consists of a sequential proposal module and a graph reasoning module.
2 code implementations • CVPR 2021 • Rongjie Li, Songyang Zhang, Bo Wan, Xuming He
Scene graph generation is an important visual understanding task with a broad range of vision applications.
1 code implementation • CVPR 2021 • Yongfei Liu, Bo Wan, Lin Ma, Xuming He
Visual grounding, which aims to build a correspondence between visual objects and their language entities, plays a key role in cross-modal scene understanding.
2 code implementations • 20 Nov 2019 • Yongfei Liu, Bo Wan, Xiaodan Zhu, Xuming He
To address their limitations, this paper proposes a language-guided graph representation to capture the global context of grounding entities and their relations, and develops a cross-modal graph matching strategy for the multiple-phrase visual grounding task.
1 code implementation • ICCV 2019 • Bo Wan, Desen Zhou, Yongfei Liu, Rongjie Li, Xuming He
Reasoning about human-object interactions is a core problem in human-centric scene understanding, and detecting such relations poses a unique challenge to vision systems due to large variations in human-object configurations, multiple co-occurring relation instances, and subtle visual differences between relation categories.