no code implementations • 23 Mar 2023 • Mengyao Lyu, Jundong Zhou, Hui Chen, YiJie Huang, Dongdong Yu, Yaqian Li, Yandong Guo, Yuchen Guo, Liuyu Xiang, Guiguang Ding
Active learning selects informative samples for annotation within a budget, and has recently proven efficient for object detection.
1 code implementation • 20 Mar 2023 • Weixuan Sun, Jiayi Zhang, Jianyuan Wang, Zheyuan Liu, Yiran Zhong, Tianpeng Feng, Yandong Guo, Yanhao Zhang, Nick Barnes
Based on this observation, we propose a new learning strategy named False Negative Aware Contrastive (FNAC) to mitigate the problem of misleading the training with such false negative samples.
1 code implementation • 14 Mar 2023 • Anthony Chen, Kevin Zhang, Renrui Zhang, Zihan Wang, Yuheng Lu, Yandong Guo, Shanghang Zhang
Masked Autoencoders learn strong visual representations and achieve state-of-the-art results in several independent modalities, yet very few works have addressed their capabilities in multi-modality settings.
no code implementations • 10 Mar 2023 • Xinyu Huang, Youcai Zhang, Jinyu Ma, Weiwei Tian, Rui Feng, Yuejie Zhang, Yaqian Li, Yandong Guo, Lei Zhang
This paper presents Tag2Text, a vision language pre-training (VLP) framework, which introduces image tagging into vision-language models to guide the learning of visual-linguistic features.
no code implementations • 26 Dec 2022 • Xinyi Wang, Jianteng Peng, Sufang Zhang, Bihui Chen, Yi Wang, Yandong Guo
Recent years have witnessed the breakthrough of face recognition with deep convolutional neural networks.
no code implementations • 2 Dec 2022 • Xiaowei Chi, Jiaming Liu, Ming Lu, Rongyu Zhang, Zhaoqing Wang, Yandong Guo, Shanghang Zhang
In order to find them, we further propose a LiDAR-guided sampling strategy to leverage the statistical distribution of LiDAR to determine the heights of local slices.
no code implementations • 1 Dec 2022 • Jianing Li, Ming Lu, Jiaming Liu, Yandong Guo, Li Du, Shanghang Zhang
In this paper, we propose a unified framework named BEV-LGKD to transfer knowledge in a teacher-student manner.
no code implementations • 30 Nov 2022 • Jiaming Liu, Rongyu Zhang, Xiaowei Chi, Xiaoqi Li, Ming Lu, Yandong Guo, Shanghang Zhang
Vision-Centric Bird-Eye-View (BEV) perception has shown promising potential and attracted increasing attention in autonomous driving.
no code implementations • 18 Oct 2022 • Yuchen Yang, Xudong Zhang, Shuang Gao, Jixiang Wan, Yishan Ping, Yuyue Liu, Jijunnan Li, Yandong Guo
In this paper, we present an efficient client-server visual localization architecture that fuses global and local pose estimations to realize promising precision and efficiency.
7 code implementations • 5 Oct 2022 • Silvio Giancola, Anthony Cioppa, Adrien Deliège, Floriane Magera, Vladimir Somers, Le Kang, Xin Zhou, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdulrahman Darwish, Adrien Maglo, Albert Clapés, Andreas Luyts, Andrei Boiarov, Artur Xarles, Astrid Orcesi, Avijit Shah, Baoyu Fan, Bharath Comandur, Chen Chen, Chen Zhang, Chen Zhao, Chengzhi Lin, Cheuk-Yiu Chan, Chun Chuen Hui, Dengjie Li, Fan Yang, Fan Liang, Fang Da, Feng Yan, Fufu Yu, Guanshuo Wang, H. Anthony Chan, He Zhu, Hongwei Kan, Jiaming Chu, Jianming Hu, Jianyang Gu, Jin Chen, João V. B. Soares, Jonas Theiner, Jorge De Corte, José Henrique Brito, Jun Zhang, Junjie Li, Junwei Liang, Leqi Shen, Lin Ma, Lingchi Chen, Miguel Santos Marques, Mike Azatov, Nikita Kasatkin, Ning Wang, Qiong Jia, Quoc Cuong Pham, Ralph Ewerth, Ran Song, RenGang Li, Rikke Gade, Ruben Debien, Runze Zhang, Sangrok Lee, Sergio Escalera, Shan Jiang, Shigeyuki Odashima, Shimin Chen, Shoichi Masui, Shouhong Ding, Sin-wai Chan, Siyu Chen, Tallal El-Shabrawy, Tao He, Thomas B. Moeslund, Wan-Chi Siu, Wei zhang, Wei Li, Xiangwei Wang, Xiao Tan, Xiaochuan Li, Xiaolin Wei, Xiaoqing Ye, Xing Liu, Xinying Wang, Yandong Guo, YaQian Zhao, Yi Yu, YingYing Li, Yue He, Yujie Zhong, Zhenhua Guo, Zhiheng Li
The SoccerNet 2022 challenges were the second annual video understanding challenges organized by the SoccerNet team.
1 code implementation • 20 Jul 2022 • Xiaoqi Li, Jiaming Liu, Shizun Wang, Cheng Lyu, Ming Lu, Yurong Chen, Anbang Yao, Yandong Guo, Shanghang Zhang
Our method significantly reduces the computational cost and achieves even better performance, paving the way for applying neural video delivery techniques to practical applications.
no code implementations • 20 Jul 2022 • Liliang Chen, Jiaqi Li, Han Huang, Yandong Guo
We propose CrossHuman, a novel method that learns cross-guidance from a parametric human model and multi-frame RGB images to achieve high-quality 3D human reconstruction.
1 code implementation • 12 Jul 2022 • Xinyu Huang, Youcai Zhang, Ying Cheng, Weiwei Tian, RuiWei Zhao, Rui Feng, Yuejie Zhang, Yaqian Li, Yandong Guo, Xiaobo Zhang
However, the image-text pairs co-occurrent on the Internet typically lack explicit alignment information, which is suboptimal for VLP.
no code implementations • 24 Jun 2022 • Yiqing Shen, Liwu Xu, Yuzhe Yang, Yaqian Li, Yandong Guo
Mixed Sample Regularization (MSR), such as MixUp or CutMix, is a powerful data augmentation strategy to generalize convolutional neural networks.
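As an illustration of the MixUp idea mentioned above, here is a minimal sketch that blends two samples and their one-hot labels with a weight drawn from a Beta distribution. The function name `mixup`, the default `alpha=0.2`, and the list-based inputs are assumptions made for this example; it is not the method proposed in the paper listed here.

```python
import random


def mixup(x_a, y_a, x_b, y_b, alpha=0.2, rng=None):
    """Blend two samples and their one-hot labels (generic MixUp sketch).

    `alpha` is an illustrative default for the Beta(alpha, alpha) distribution.
    """
    rng = rng or random.Random()
    # Mixing weight in [0, 1], drawn from Beta(alpha, alpha).
    lam = rng.betavariate(alpha, alpha)
    # Convex combination of the inputs and of the labels with the same weight.
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam
```

Because the same weight is applied to inputs and labels, the mixed label remains a valid distribution over classes.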
no code implementations • 16 Jun 2022 • Chen Zhang, Honglin Sun, Chen Chen, Yandong Guo
We propose BANet (Boundary-Aware Network), a motion forecasting model that is a variant of LaneGCN.
no code implementations • 20 Apr 2022 • Bo Xu, Jiake Xie, Han Huang, Ziwen Li, Cheng Lu, Yong Tang, Yandong Guo
In this paper, we propose a Situational Perception Guided Image Matting (SPG-IM) method that mitigates subjective bias of matting annotations and captures sufficient situational perception information for better global saliency distilled from the visual-to-textual task.
no code implementations • 6 Apr 2022 • Shimin Chen, Chen Chen, Wei Li, Xunqiang Tao, Yandong Guo
In this paper, we propose a unified network for TAD, termed Faster-TAD, by re-purposing a Faster-RCNN like architecture.
no code implementations • 6 Apr 2022 • Shimin Chen, Wei Li, Chen Chen, Jianyang Gu, Jiaming Chu, Xunqiang Tao, Yandong Guo
SEAL consists of two kinds of annotations, SEAL Tubes and SEAL Clips.
no code implementations • CVPR 2022 • Yuzhe Yang, Liwu Xu, Leida Li, Nan Qie, Yaqian Li, Peng Zhang, Yandong Guo
To resolve this dilemma, we conduct the most comprehensive subjective study of personalized image aesthetics to date and introduce a new Personalized image Aesthetics database with Rich Attributes (PARA), which consists of 31,220 images annotated by 438 subjects.
1 code implementation • CVPR 2022 • Yiqing Shen, Liwu Xu, Yuzhe Yang, Yaqian Li, Yandong Guo
Afterwards, the former half mini-batch distills on-the-fly soft targets generated in the previous iteration.
no code implementations • CVPR 2022 • Zerong Zheng, Han Huang, Tao Yu, Hongwen Zhang, Yandong Guo, Yebin Liu
These local radiance fields not only leverage the flexibility of implicit representation in shape and appearance modeling, but also factorize cloth deformations into skeleton motions, node residual translations and the dynamic detail variations inside each individual radiance field.
1 code implementation • 22 Mar 2022 • Shizun Wang, Jiaming Liu, Kaixin Chen, Xiaoqi Li, Ming Lu, Yandong Guo
Once the incremental capacity is below the threshold, the patch can exit at the specific layer.
no code implementations • 8 Mar 2022 • Bo Xu, Guanze Liu, Han Huang, Cheng Lu, Yandong Guo
Most existing CNN-based salient object detection methods can identify local segmentation details like hair and animal fur, but often misinterpret the real saliency due to the lack of global contextual information caused by the subjectiveness of the SOD task and the locality of convolution layers.
no code implementations • CVPR 2022 • Lei Jin, Chenyang Xu, Xiaojuan Wang, Yabo Xiao, Yandong Guo, Xuecheng Nie, Jian Zhao
Existing multi-person absolute 3D pose estimation methods are mainly based on a two-stage paradigm, i.e., top-down or bottom-up, leading to redundant pipelines with high computation cost.
1 code implementation • 13 Dec 2021 • Youcai Zhang, Yuhao Cheng, Xinyu Huang, Fei Wen, Rui Feng, Yaqian Li, Yandong Guo
Multi-label learning in the presence of missing labels (MLML) is a challenging problem.
1 code implementation • CVPR 2022 • Zhaoqing Wang, Yu Lu, Qiang Li, Xunqiang Tao, Yandong Guo, Mingming Gong, Tongliang Liu
In addition, we present text-to-pixel contrastive learning to explicitly enforce the text feature to be similar to the related pixel-level features and dissimilar to irrelevant ones.
Ranked #5 on Referring Expression Segmentation on RefCOCO+ val
no code implementations • 22 Oct 2021 • Ziwen Li, Bo Xu, Han Huang, Cheng Lu, Yandong Guo
In this paper, we propose a new framework Deep Two-Stream Video Inference for Human Body Pose and Shape Estimation (DTS-VIBE), to generate 3D human pose and mesh from RGB videos.
Ranked #3 on 3D Human Pose Estimation on MPI-INF-3DHP (PA-MPJPE metric)
no code implementations • 8 Oct 2021 • Shuang Gao, Jixiang Wan, Yishan Ping, Xudong Zhang, Shuzhou Dong, Yuchen Yang, Haikuan Ning, Jijunnan Li, Yandong Guo
High-precision camera re-localization technology in a pre-established 3D environment map is the basis for many tasks, such as Augmented Reality, Robotics and Autonomous Driving.
1 code implementation • ICCV 2021 • Bo Xu, Han Huang, Cheng Lu, Ziwen Li, Yandong Guo
In this paper, we propose a Virtual Multi-modality Foreground Matting (VMFM) method to learn the human-object interactive foreground (a human and the objects they interact with) from a raw RGB image.
no code implementations • 29 Sep 2021 • Haizhou Shi, Youcai Zhang, Zijin Shen, Siliang Tang, Yaqian Li, Yandong Guo, Yueting Zhuang
This paper investigates the feasibility of federated representation learning under the constraints of communication cost and privacy protection.
no code implementations • 13 Sep 2021 • Wenzhao Xiang, Hang Su, Chang Liu, Yandong Guo, Shibao Zheng
As designers of artificial intelligence try to outwit hackers, both sides continue to home in on AI's inherent vulnerabilities.
no code implementations • 23 Aug 2021 • Jian Zhao, Gang Wang, Jianan Li, Lei Jin, Nana Fan, Min Wang, Xiaojuan Wang, Ting Yong, Yafeng Deng, Yandong Guo, Shiming Ge, Guodong Guo
The 2nd Anti-UAV Workshop & Challenge aims to encourage research in developing novel and accurate methods for multi-scale object tracking.
no code implementations • 19 Aug 2021 • Yuhao Zhou, Huanhuan Fan, Shuang Gao, Yuchen Yang, Xudong Zhang, Jijunnan Li, Yandong Guo
The localization pipeline is designed as a coarse-to-fine paradigm.
no code implementations • 30 Jul 2021 • Haizhou Shi, Youcai Zhang, Siliang Tang, Wenjie Zhu, Yaqian Li, Yandong Guo, Yueting Zhuang
It is a consensus that small models perform quite poorly under the paradigm of self-supervised contrastive learning.
no code implementations • CVPR 2021 • Xuancheng Zhang, Yutong Feng, Siqi Li, Changqing Zou, Hai Wan, Xibin Zhao, Yandong Guo, Yue Gao
This paper presents a view-guided solution for the task of point cloud completion.
Ranked #3 on Point Cloud Completion on ShapeNet-ViPC
no code implementations • 4 Dec 2020 • Leilei Cao, Tong Yang, Yixu Wang, Bo Yan, Yandong Guo
Thus, our model consists of a pyramid of fully convolutional GANs, wherein the content GAN is responsible for completing contents in the lowest-resolution masked image, and each texture GAN is responsible for synthesizing textures in a higher-resolution image.
1 code implementation • 26 May 2020 • Taizhang Shang, Qiuju Dai, Shengchen Zhu, Tong Yang, Yandong Guo
Third, we alternate between different upsampling methods in the upsampling stage to reduce the high computational complexity while retaining satisfactory performance.
Ranked #1 on Image Super-Resolution on DIV8K test - 16x upscaling
no code implementations • 25 May 2020 • Huanhuan Fan, Yuhao Zhou, Ang Li, Shuang Gao, Jijunnan Li, Yandong Guo
In this paper, we propose a monocular visual localization pipeline leveraging semantic and depth cues.
2 code implementations • CVPR 2020 • Bo Xu, Cheng Lu, Yandong Guo, Jacob Wang
Vision is often used as a complementary modality for audio speech recognition (ASR), especially in noisy environments where the performance of the audio-only modality deteriorates significantly.
Ranked #6 on Audio-Visual Speech Recognition on LRS3-TED (using extra training data)
no code implementations • 3 May 2020 • Kai Zhang, Shuhang Gu, Radu Timofte, Taizhang Shang, Qiuju Dai, Shengchen Zhu, Tong Yang, Yandong Guo, Younghyun Jo, Sejong Yang, Seon Joo Kim, Lin Zha, Jiande Jiang, Xinbo Gao, Wen Lu, Jing Liu, Kwangjin Yoon, Taegyun Jeon, Kazutoshi Akita, Takeru Ooba, Norimichi Ukita, Zhipeng Luo, Yuehan Yao, Zhenyu Xu, Dongliang He, Wenhao Wu, Yukang Ding, Chao Li, Fu Li, Shilei Wen, Jianwei Li, Fuzhi Yang, Huan Yang, Jianlong Fu, Byung-Hoon Kim, JaeHyun Baek, Jong Chul Ye, Yuchen Fan, Thomas S. Huang, Junyeop Lee, Bokyeung Lee, Jungki Min, Gwantae Kim, Kanghyu Lee, Jaihyun Park, Mykola Mykhailych, Haoyu Zhong, Yukai Shi, Xiaojun Yang, Zhijing Yang, Liang Lin, Tongtong Zhao, Jinjia Peng, Huibing Wang, Zhi Jin, Jiahao Wu, Yifu Chen, Chenming Shang, Huanrong Zhang, Jeongki Min, Hrishikesh P. S, Densen Puthussery, Jiji C. V
This paper reviews the NTIRE 2020 challenge on perceptual extreme super-resolution with focus on proposed solutions and results.
no code implementations • 7 Apr 2020 • Zhecan Wang, Jian Zhao, Cheng Lu, Han Huang, Fan Yang, Lianji Li, Yandong Guo
To better demonstrate the advantage of our methods, we further propose a new benchmark dataset with the richest distribution of head-gaze combinations, reflecting real-world scenarios.
no code implementations • 12 Dec 2019 • Zhenfeng Zhu, Yingying Meng, Deqiang Kong, Xingxing Zhang, Yandong Guo, Yao Zhao
Owing to insufficient and uneven illumination, nighttime images have lower contrast and higher noise than their daytime counterparts of the same scene, which seriously limits the performance of conventional background modeling methods.
1 code implementation • 8 Dec 2019 • Fan Yang, Cheng Lu, Yandong Guo, Longin Jan Latecki, Haibin Ling
The feature pyramid architecture has been broadly adopted in object detection and segmentation to deal with the multi-scale problem.
no code implementations • 28 Sep 2019 • Zhengming Ding, Yandong Guo, Lei Zhang, Yun Fu
Specifically, we aim to build a more effective general face classifier for both normal persons and one-shot persons.
no code implementations • 11 Jul 2019 • Shuai Zheng, Zhenfeng Zhu, Jian Cheng, Yandong Guo, Yao Zhao
Non-uniform blur, mainly caused by camera shake and motions of multiple objects, is one of the most common causes of image quality degradation.
3 code implementations • CVPR 2019 • Yue Wu, Yinpeng Chen, Lijuan Wang, Yuancheng Ye, Zicheng Liu, Yandong Guo, Yun Fu
We believe this is because of the combination of two factors: (a) the data imbalance between the old and new classes, and (b) the increasing number of visually similar classes.
Ranked #2 on Incremental Learning on ImageNet100 - 10 steps (# M Params metric)
no code implementations • 20 May 2019 • Jianfeng Wang, Rong Xiao, Yandong Guo, Lei Zhang
In this paper, we study the problem of object counting with incomplete annotations.
no code implementations • 17 Sep 2018 • Bowen Cheng, Rong Xiao, Yandong Guo, Yuxiao Hu, Jian-Feng Wang, Lei Zhang
In this paper, we study how to initialize the parameters of multinomial logistic regression (a fully connected layer followed by softmax and cross-entropy loss), which is widely used in deep neural network (DNN) models for classification problems.
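To make the parenthetical definition concrete, the sketch below computes the loss of a multinomial logistic regression head: a fully connected layer, then softmax, then cross-entropy against an integer class index. The function name `mlr_loss` and the nested-list weight layout are assumptions for this example, and the sketch does not implement the initialization scheme studied in the paper.

```python
import math


def mlr_loss(weights, bias, features, target):
    """Cross-entropy of softmax(W x + b) against the class index `target`.

    `weights` holds one row of coefficients per class (illustrative layout).
    """
    # Fully connected layer: one logit per class.
    logits = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # -log softmax(logits)[target]
    return log_sum_exp - logits[target]
```

With all-zero parameters every class receives the same probability, so the loss equals log K for K classes, which is a common sanity check when choosing an initialization.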
no code implementations • 2 Feb 2018 • Yue Wu, Yinpeng Chen, Lijuan Wang, Yuancheng Ye, Zicheng Liu, Yandong Guo, Zhengyou Zhang, Yun Fu
To address these problems, we propose (a) a new loss function to combine the cross-entropy loss and distillation loss, (b) a simple way to estimate and remove the imbalance between the old and new classes, and (c) using Generative Adversarial Networks (GANs) to generate historical data and select representative exemplars during generation.
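One generic way to combine a cross-entropy loss with a distillation loss is a weighted sum, sketched below. The mixing weight `weight`, the softmax `temperature`, and the function names are assumptions chosen for illustration; the snippet above does not give the paper's exact formulation, so this should not be read as the authors' loss.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    lse = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - lse for z in scaled]


def combined_loss(student_logits, teacher_logits, target,
                  weight=0.5, temperature=2.0):
    """Weighted sum of hard-label cross-entropy and a soft-target distillation term.

    `weight` and `temperature` are hypothetical hyperparameters.
    """
    # Cross-entropy against the ground-truth class index.
    ce = -log_softmax(student_logits)[target]
    # Soft targets from the teacher (e.g. the model trained on old classes).
    teacher_probs = [math.exp(v) for v in log_softmax(teacher_logits, temperature)]
    student_log_probs = log_softmax(student_logits, temperature)
    # Cross-entropy between the teacher's and the student's softened outputs.
    distill = -sum(p * q for p, q in zip(teacher_probs, student_log_probs))
    return weight * distill + (1.0 - weight) * ce
```

Setting `weight=0.0` recovers plain cross-entropy, while `weight=1.0` trains only against the teacher's soft targets.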
1 code implementation • 18 Jul 2017 • Yandong Guo, Lei Zhang
First, we build a face feature extraction model and improve its performance, especially for persons with very limited training samples, by introducing a regularizer to the cross-entropy loss for multinomial logistic regression (MLR) learning.
no code implementations • CVPR 2017 • Yandong Guo, Cheng Lu, Jan P. Allebach, Charles A. Bouman
Experimental results with a variety of document images demonstrate that our method improves the image quality compared with the observed image, and simultaneously improves the compression ratio.
10 code implementations • 27 Jul 2016 • Yandong Guo, Lei Zhang, Yuxiao Hu, Xiaodong He, Jianfeng Gao
In this paper, we design a benchmark task and provide the associated datasets for recognizing face images and linking them to corresponding entity keys in a knowledge base.