1 code implementation • ECCV 2020 • Mingfei Han, Yali Wang, Xiaojun Chang, Yu Qiao
Recent studies have shown that, context aggregating information from proposals in different frames can clearly enhance the performance of video object detection.
Ranked #1 on
Video Object Detection
on ImageNet VID
1 code implementation • 3 May 2022 • Mengzhe He, Yali Wang, Jiaxi Wu, Yiru Wang, Hanqing Li, Bo Li, Weihao Gan, Wei Wu, Yu Qiao
It can adaptively enhance source detector to perceive objects in a target image, by leveraging target proposal contexts from iterative cross-attention.
no code implementations • 17 Apr 2022 • Jiaxi Wu, Jiaxin Chen, Mengzhe He, Yiru Wang, Bo Li, Bingqi Ma, Weihao Gan, Wei Wu, Yali Wang, Di Huang
Specifically, TRKP adopts the teacher-student framework, where the multi-head teacher network is built to extract knowledge from labeled source domains and guide the student network to learn detectors in unlabeled target domain.
no code implementations • 5 Apr 2022 • Mingfei Han, David Junhao Zhang, Yali Wang, Rui Yan, Lina Yao, Xiaojun Chang, Yu Qiao
Learning spatial-temporal relation among multiple actors is crucial for group activity recognition.
3 code implementations • 24 Jan 2022 • Kunchang Li, Yali Wang, Junhao Zhang, Peng Gao, Guanglu Song, Yu Liu, Hongsheng Li, Yu Qiao
Different from the typical transformer blocks, the relation aggregators in our UniFormer block are equipped with local and global token affinity respectively in shallow and deep layers, allowing to tackle both redundancy and dependency for efficient and effective representation learning.
Ranked #72 on
Image Classification
on ImageNet
no code implementations • 20 Jan 2022 • Mingye Xu, Yali Wang, Zhipeng Zhou, Hongbin Xu, Yu Qiao
To fill this gap, we propose a generic Contour-Perturbed Reconstruction Network (CP-Net), which can effectively guide self-supervised reconstruction to learn semantic content in the point cloud, and thus promote discriminative power of point cloud representation.
1 code implementation • 12 Jan 2022 • Kunchang Li, Yali Wang, Peng Gao, Guanglu Song, Yu Liu, Hongsheng Li, Yu Qiao
For Something-Something V1 and V2, our UniFormer achieves new state-of-the-art performances of 60. 9% and 71. 2% top-1 accuracy respectively.
2 code implementations • 22 Dec 2021 • Liang Pan, Tong Wu, Zhongang Cai, Ziwei Liu, Xumin Yu, Yongming Rao, Jiwen Lu, Jie zhou, Mingye Xu, Xiaoyuan Luo, Kexue Fu, Peng Gao, Manning Wang, Yali Wang, Yu Qiao, Junsheng Zhou, Xin Wen, Peng Xiang, Yu-Shen Liu, Zhizhong Han, Yuanjie Yan, Junyi An, Lifa Zhu, Changwei Lin, Dongrui Liu, Xin Li, Francisco Gómez-Fernández, Qinlong Wang, Yang Yang
Based on the MVP dataset, this paper reports methods and results in the Multi-View Partial Point Cloud Challenge 2021 on Completion and Registration.
1 code implementation • 24 Nov 2021 • David Junhao Zhang, Kunchang Li, Yunpeng Chen, Yali Wang, Shashwat Chandra, Yu Qiao, Luoqi Liu, Mike Zheng Shou
Self-attention has become an integral component of the recent network architectures, e. g., Transformer, that dominate major image and video benchmarks.
Ranked #11 on
Action Recognition
on Something-Something V2
(using extra training data)
no code implementations • 24 Nov 2021 • Zhuofan Zong, Kunchang Li, Guanglu Song, Yali Wang, Yu Qiao, Biao Leng, Yu Liu
Vision transformers (ViTs) have become the popular structures and outperformed convolutional neural networks (CNNs) on various vision tasks.
no code implementations • 29 Sep 2021 • Zhuofan Zong, Kunchang Li, Guanglu Song, Yali Wang, Yu Qiao, Biao Leng, Yu Liu
Specifically, we first design a novel Token Slimming Module (TSM), which can boost the inference efficiency of ViTs by dynamic token aggregation.
2 code implementations • ICLR 2022 • Kunchang Li, Yali Wang, Gao Peng, Guanglu Song, Yu Liu, Hongsheng Li, Yu Qiao
For Something-Something V1 and V2, our UniFormer achieves new state-of-the-art performances of 60. 8% and 71. 4% top-1 accuracy respectively.
Ranked #1 on
Action Recognition
on Something-Something V1
no code implementations • 15 Sep 2021 • Junhao Zhang, Yali Wang, Zhipeng Zhou, Tianyu Luan, Zhe Wang, Yu Qiao
Graph Convolution Network (GCN) has been successfully used for 3D human pose estimation in videos.
1 code implementation • ICCV 2021 • Hongbin Xu, Zhipeng Zhou, Yali Wang, Wenxiong Kang, Baigui Sun, Hao Li, Yu Qiao
Specially, the limitations can be categorized into two types: ambiguious supervision in foreground and invalid supervision in background.
no code implementations • 27 Aug 2021 • Fanxin Xu, Xiangkui Li, Hang Yang, Yali Wang, Wei Xiang
In this work, an object detector based on YOLOF has been proposed to detect blood cell objects such as red blood cells, white blood cells and platelets.
1 code implementation • ICLR 2021 • Kunchang Li, Xianhang Li, Yali Wang, Jun Wang, Yu Qiao
It can learn to exploit spatial, temporal and channel attention in a high-dimensional manner, to improve the cooperative power of all the feature dimensions in our CT-Module.
Ranked #6 on
Action Recognition
on Something-Something V1
no code implementations • 24 May 2021 • Yi Liu, LiMin Wang, Xiao Ma, Yali Wang, Yu Qiao
Second, the coarse action classes often lead to the ambiguous annotations of temporal boundaries, which are inappropriate for temporal action localization.
no code implementations • 16 Mar 2021 • Tianyu Luan, Yali Wang, Junhao Zhang, Zhe Wang, Zhipeng Zhou, Yu Qiao
By coupling advanced 3D pose estimators and HMR in a serial or parallel manner, these two frameworks can effectively correct human mesh with guidance of a concise pose calibration module.
no code implementations • 23 Jan 2021 • Yali Wang, Steffen Limmer, Markus Olhofer, Michael Emmerich, Thomas Baeck
A preference based multi-objective evolutionary algorithm is proposed for generating solutions in an automatically detected knee point region.
1 code implementation • CVPR 2020 • Xianhang Li, Yali Wang, Zhipeng Zhou, Yu Qiao
Our SmallBig network outperforms a number of recent state-of-the-art approaches, in terms of accuracy and/or efficiency.
no code implementations • 15 Apr 2020 • Yali Wang, André Deutz, Thomas Bäck, Michael Emmerich
Given a point in $m$-dimensional objective space, any $\varepsilon$-ball of a point can be partitioned into the incomparable, the dominated and dominating region.
no code implementations • 14 Apr 2020 • Yali Wang, Bas van Stein, Michael T. M. Emmerich, Thomas Bäck
A customized multi-objective evolutionary algorithm (MOEA) is proposed for the multi-objective flexible job shop scheduling problem (FJSP).
1 code implementation • 16 Mar 2020 • Ze Yang, Yali Wang, Xianyu Chen, Jianzhuang Liu, Yu Qiao
Few-shot object detection is a challenging but realistic scenario, where only a few annotated training images are available for training detectors.
1 code implementation • 24 Feb 2020 • Peiqin Zhuang, Yali Wang, Yu Qiao
These distinct gate vectors inherit mutual context on semantic differences, which allow API-Net to attentively capture contrastive clues by pairwise interaction between two images.
Ranked #3 on
Fine-Grained Image Classification
on Stanford Dogs
no code implementations • 12 Feb 2020 • Hao Chen, Yali Wang, Guoyou Wang, Xiang Bai, Yu Qiao
Inspired by this procedure of learning to detect, we propose a novel Progressive Object Transfer Detection (POTD) framework.
no code implementations • 9 Feb 2019 • Chu Qin, Ying Tan, Shang Ying Chen, Xian Zeng, Xingxing Qi, Tian Jin, Huan Shi, Yiwei Wan, Yu Chen, Jingfeng Li, Weidong He, Yali Wang, Peng Zhang, Feng Zhu, Hongping Zhao, Yuyang Jiang, Yuzong Chen
We ex-plored the superior learning capability of deep autoencoders for unsupervised clustering of 1. 39 mil-lion bioactive molecules into band-clusters in a 3-dimensional latent chemical space.
no code implementations • CVPR 2018 • Yali Wang, Lei Zhou, Yu Qiao
To mimic this capacity, we propose a novel Hybrid Video Memory (HVM) machine, which can hallucinate temporal features of still images from video memory, in order to boost action recognition with few still images.
1 code implementation • 5 Mar 2018 • Hao Chen, Yali Wang, Guoyou Wang, Yu Qiao
Second, we introduce a novel regularized transfer learning framework for low-shot detection, where the transfer knowledge (TK) and background depression (BD) regularizations are proposed to leverage object knowledge respectively from source and target domains, in order to further enhance fine-tuning with a few target images.
Ranked #14 on
Few-Shot Object Detection
on MS-COCO (30-shot)
1 code implementation • 2017 IEEE International Conference on Computer Vision (ICCV) 2017 • Wenbin Du, Yali Wang, Yu Qiao
Firstly, unlike previous works on pose-related action recognition, our RPAN is an end-to-end recurrent network which can exploit important spatial-temporal evolutions of human pose to assist action recognition in a unified framework.
Ranked #5 on
Skeleton Based Action Recognition
on J-HMDB
1 code implementation • 1 Sep 2016 • Zhe Wang, Li-Min Wang, Yali Wang, Bo-Wen Zhang, Yu Qiao
In this paper, we propose a hybrid representation, which leverages the discriminative capacity of CNNs and the simplicity of descriptor encoding schema for image recognition, with a focus on scene recognition.
no code implementations • NeurIPS 2012 • Yali Wang, Brahim Chaib-Draa
We present a novel marginalized particle Gaussian process (MPGP) regression, which provides a fast, accurate online Bayesian filtering framework to model the latent function.