no code implementations • ECCV 2020 • Xiaojie Li, Jianlong Wu, Hongyu Fang, Yue Liao, Fei Wang, Chen Qian
Sufficient knowledge extraction from the teacher network plays a critical role in the knowledge distillation task to improve the performance of the student network.
1 code implementation • CVPR 2023 • Luting Wang, Yi Liu, Penghui Du, Zihan Ding, Yue Liao, Qiaosong Qi, Biaolong Chen, Si Liu
When extracting object knowledge from PVLMs, the former adaptively transforms object proposals and adopts object-aware mask attention to obtain precise and complete knowledge of objects.
Ranked #6 on Open Vocabulary Object Detection on MSCOCO
1 code implementation • 2 Dec 2022 • Fangxun Shu, Biaolong Chen, Yue Liao, Shuwen Xiao, Wenyu Sun, Xiaobo Li, Yousong Zhu, Jinqiao Wang, Si Liu
Our MAC aims to reduce video representation's spatial and temporal redundancy in the VidLP model by a mask sampling mechanism to improve pre-training efficiency.
Ranked #29 on Video Retrieval on MSR-VTT-1kA (using extra training data)
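The mask sampling idea in the MAC entry above can be illustrated with a minimal sketch. The entry does not say how the mask is drawn, so uniform random sampling over (frame, patch) tokens is an assumption here, and `sample_visible_tokens` and its parameters are hypothetical names, not the authors' API.

```python
import random


def sample_visible_tokens(num_frames, patches_per_frame, mask_ratio, seed=None):
    """Return the (frame, patch) tokens that stay visible after masking.

    Assumption: tokens are masked uniformly at random across space and time.
    """
    rng = random.Random(seed)
    # Enumerate every spatio-temporal token of the clip.
    all_tokens = [(t, p) for t in range(num_frames) for p in range(patches_per_frame)]
    # Keep at least one token so the encoder always has input.
    num_keep = max(1, round(len(all_tokens) * (1.0 - mask_ratio)))
    return sorted(rng.sample(all_tokens, num_keep))


visible = sample_visible_tokens(num_frames=4, patches_per_frame=10, mask_ratio=0.75, seed=0)
```

With a mask ratio of 0.75, only a quarter of the tokens reach the encoder, which is where the efficiency gain would come from under this assumption.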
no code implementations • 21 Nov 2022 • Le Zhuo, Zhaokai Wang, Baisen Wang, Yue Liao, Stanley Peng, Chenxi Bao, Miao Lu, Xiaobo Li, Si Liu
To close this gap, we introduce a dataset, benchmark model, and evaluation metric for video background music generation.
1 code implementation • 12 Jul 2022 • Luting Wang, Xiaojie Li, Yue Liao, Zeren Jiang, Jianlong Wu, Fei Wang, Chen Qian, Si Liu
We observe that the core difficulty for heterogeneous KD (hetero-KD) is the significant semantic gap between the backbone features of heterogeneous detectors due to the different optimization manners.
no code implementations • 30 Mar 2022 • Mingfei Chen, Yue Liao, Si Liu, Fei Wang, Jenq-Neng Hwang
RS takes previously detected results as references to aggregate the corresponding features from the combined features of the adjacent frames, and makes a one-to-one track state prediction for each reference in parallel.
1 code implementation • CVPR 2022 • Yue Liao, Aixi Zhang, Miao Lu, Yongliang Wang, Xiaobo Li, Si Liu
In this paper, we reveal and address the disadvantages of conventional query-driven HOI detectors from two aspects.
Ranked #6 on Human-Object Interaction Detection on HICO-DET
1 code implementation • NeurIPS 2021 • Aixi Zhang, Yue Liao, Si Liu, Miao Lu, Yongliang Wang, Chen Gao, Xiaobo Li
To this end, we propose a novel one-stage framework that disentangles human-object detection and interaction classification in a cascade manner.
Ranked #5 on Human-Object Interaction Detection on V-COCO
no code implementations • 24 May 2021 • Si Liu, Zitian Wang, Yulu Gao, Lejian Ren, Yue Liao, Guanghui Ren, Bo Li, Shuicheng Yan
For the above exemplar case, our HRS task produces results in the form of relation triplets <girl [left hand], hold, book> and exact segmentation masks of the book, with which the robot can easily accomplish the grabbing task.
1 code implementation • CVPR 2021 • Mingfei Chen, Yue Liao, Si Liu, ZhiYuan Chen, Fei Wang, Chen Qian
To attain this, we map a trainable interaction query set to an interaction prediction set with a transformer.
Ranked #21 on Human-Object Interaction Detection on HICO-DET (using extra training data)
1 code implementation • 10 Nov 2020 • Zongheng Tang, Yue Liao, Si Liu, Guanbin Li, Xiaojie Jin, Hongxu Jiang, Qian Yu, Dong Xu
HC-STVG is a video grounding task that requires both spatial (where) and temporal (when) localization.
2 code implementations • CVPR 2020 • Zhiwei Dong, Guoxuan Li, Yue Liao, Fei Wang, Pengju Ren, Chen Qian
CentripetalNet predicts the position and the centripetal shift of the corner points and matches corners whose shifted results are aligned.
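The matching rule described for CentripetalNet can be sketched as follows. This is an illustration only: a plain Euclidean distance threshold (`tol`) stands in for the alignment criterion, which the entry does not specify, and `match_corners` is a hypothetical helper name.

```python
import math


def shifted(corner, shift):
    """Apply a predicted centripetal shift (dx, dy) to a corner (x, y)."""
    return (corner[0] + shift[0], corner[1] + shift[1])


def match_corners(top_lefts, bottom_rights, tol=1.0):
    """Pair corners whose shifted positions land close to each other.

    Each item is ((x, y), (dx, dy)). The threshold `tol` is an assumption.
    """
    matches = []
    for i, (tl, tl_shift) in enumerate(top_lefts):
        for j, (br, br_shift) in enumerate(bottom_rights):
            # A valid box needs the bottom-right corner below and right of the top-left.
            if br[0] <= tl[0] or br[1] <= tl[1]:
                continue
            if math.dist(shifted(tl, tl_shift), shifted(br, br_shift)) <= tol:
                matches.append((i, j))
    return matches


pairs = match_corners(
    top_lefts=[((0, 0), (5, 5)), ((20, 20), (5, 5))],
    bottom_rights=[((10, 10), (-5, -5)), ((30, 30), (-5, -5))],
)
```

Both corners of a box shift toward the same interior point, so corners of different boxes rarely align even when they are geometrically compatible.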
1 code implementation • CVPR 2020 • Yue Liao, Si Liu, Fei Wang, Yanjie Chen, Chen Qian, Jiashi Feng
Human and object points are the centers of the detection boxes, and the interaction point is the midpoint of the human and object points.
Ranked #22 on Human-Object Interaction Detection on V-COCO
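The point definitions in the entry above reduce to simple arithmetic. A minimal sketch, assuming boxes are given as `(x1, y1, x2, y2)` tuples; the function names are illustrative and not taken from the paper's code.

```python
def box_center(box):
    """Center of an axis-aligned box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def interaction_point(human_box, object_box):
    """Midpoint between the human point and the object point."""
    hx, hy = box_center(human_box)
    ox, oy = box_center(object_box)
    return ((hx + ox) / 2.0, (hy + oy) / 2.0)


# Human center is (1, 1), object center is (5, 5), so the midpoint is (3, 3).
point = interaction_point((0, 0, 2, 2), (4, 4, 6, 6))
```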
no code implementations • CVPR 2020 • Yue Liao, Si Liu, Guanbin Li, Fei Wang, Yanjie Chen, Chen Qian, Bo Li
RCCF reformulates referring expression comprehension as a correlation filtering process.