no code implementations • 31 Aug 2023 • Lei Bai, Dongang Wang, Michael Barnett, Mariano Cabezas, Weidong Cai, Fernando Calamante, Kain Kyle, Dongnan Liu, Linda Ly, Aria Nguyen, Chun-Chien Shieh, Ryan Sullivan, Hengrui Wang, Geng Zhan, Wanli Ouyang, Chenyu Wang
Our approach enables collaboration among multiple clinical sites without compromising data privacy under a federated learning paradigm that incorporates a noise-robust training strategy based on label correction.
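The privacy-preserving collaboration can be illustrated with a generic FedAvg-style aggregation, a minimal sketch only: the paper's noise-robust, label-correcting variant is more involved, and `fedavg` and its arguments are hypothetical names.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Aggregate client model parameters weighted by local dataset size.

    client_weights: list of 1-D numpy arrays (flattened model parameters)
    client_sizes:   number of training samples held by each client
    """
    sizes = np.asarray(client_sizes, dtype=float)
    coeffs = sizes / sizes.sum()                    # data-proportional weights
    stacked = np.stack(client_weights)              # (n_clients, n_params)
    return (coeffs[:, None] * stacked).sum(axis=0)  # weighted average

# Two clinical sites: raw data never leaves a site, only parameters are shared.
w = fedavg([np.array([1.0, 3.0]), np.array([3.0, 7.0])], client_sizes=[1, 3])
print(w)  # → [2.5 6. ]
```

Each site trains locally and ships only its updated parameters to the server, which is what keeps patient data on-premises.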
1 code implementation • 29 Aug 2023 • Xinqi Lin, Jingwen He, Ziyan Chen, Zhaoyang Lyu, Ben Fei, Bo Dai, Wanli Ouyang, Yu Qiao, Chao Dong
We present DiffBIR, which leverages pretrained text-to-image diffusion models for the blind image restoration problem.

Ranked #1 on Blind Face Restoration on LFW
no code implementations • 26 Aug 2023 • Shengji Tang, Peng Ye, Baopu Li, Weihao Lin, Tao Chen, Tong He, Chong Yu, Wanli Ouyang
Specifically, we implicitly divide all subnets into hierarchical groups by subnet-in-subnet sampling, aggregate the knowledge of different subnets in each group during training, and exploit upper-level group knowledge to supervise lower-level subnet groups.
1 code implementation • 21 Aug 2023 • Tao Han, Lei Bai, Lingbo Liu, Wanli Ouyang
Scale variation is a deep-rooted problem in object counting, which has not been effectively addressed by existing scale-aware algorithms.
1 code implementation • 14 Aug 2023 • Yunyao Mao, Jiajun Deng, Wengang Zhou, Yao Fang, Wanli Ouyang, Houqiang Li
To be specific, the proposed MAMP takes as input the masked spatio-temporal skeleton sequence and predicts the corresponding temporal motion of the masked human joints.
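The masked-input / motion-target construction can be sketched in a few lines of NumPy. This is an illustrative assumption, not the authors' implementation: MAMP uses a transformer over the masked sequence, while here "motion" is simply the frame-to-frame displacement and masked joints are zeroed in the input.

```python
import numpy as np

rng = np.random.default_rng(0)

def masked_motion_targets(seq, mask_ratio=0.5):
    """Build a masked skeleton sequence and its motion targets.

    seq: (T, J, C) array of T frames, J joints, C coordinates.
    Motion is the frame-to-frame displacement; a model would be trained to
    predict it at the masked (frame, joint) positions only.
    """
    T, J, C = seq.shape
    motion = np.diff(seq, axis=0)               # (T-1, J, C) displacements
    mask = rng.random((T - 1, J)) < mask_ratio  # which slots are hidden
    masked_seq = seq.copy()
    masked_seq[1:][mask] = 0.0                  # zero out masked joints in the input
    return masked_seq, motion, mask

seq = rng.normal(size=(8, 25, 3))               # 8 frames, 25 joints, xyz
x, y, m = masked_motion_targets(seq)
```

Predicting displacements rather than raw coordinates is what steers the pretext task toward temporal dynamics instead of static pose.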
no code implementations • 11 Aug 2023 • Yongqi Huang, Peng Ye, Xiaoshui Huang, Sheng Li, Tao Chen, Tong He, Wanli Ouyang
As Vision Transformers (ViTs) gradually surpass CNNs in various visual tasks, one may ask: does a training scheme tailored to ViTs exist that can improve performance without increasing inference cost?
1 code implementation • 6 Aug 2023 • Lian Xu, Mohammed Bennamoun, Farid Boussaid, Hamid Laga, Wanli Ouyang, Dan Xu
Building upon the observation that the attended regions of the one-class token in the standard vision transformer can contribute to a class-agnostic localization map, we explore the potential of the transformer model to capture class-specific attention for class-discriminative object localization by learning multiple class tokens.
Object Localization • Weakly supervised Semantic Segmentation • +1
no code implementations • 24 Jul 2023 • Pan Tan, Mingchen Li, Yuanxi Yu, Fan Jiang, Lirong Zheng, Banghao Wu, Xinyu Sun, Liqi Kang, Jie Song, Liang Zhang, Yi Xiong, Wanli Ouyang, Zhiqiang Hu, Guisheng Fan, Yufeng Pei, Liang Hong
Designing protein mutants with high stability and activity is a critical yet challenging task in protein engineering.
no code implementations • 24 Jul 2023 • Chuming Li, Ruonan Jia, Jie Liu, Yinmin Zhang, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang
Model-based reinforcement learning (RL) has demonstrated remarkable successes on a range of continuous control tasks due to its high sample efficiency.
1 code implementation • 20 Jul 2023 • Yiyuan Zhang, Kaixiong Gong, Kaipeng Zhang, Hongsheng Li, Yu Qiao, Wanli Ouyang, Xiangyu Yue
Multimodal learning aims to build models that can process and relate information from multiple modalities.
1 code implementation • 18 Jul 2023 • Wenhao Wu, Yuxin Song, Zhun Sun, Jingdong Wang, Chang Xu, Wanli Ouyang
We conduct comprehensive ablation studies on the instantiation of ATMs and demonstrate that this module provides powerful temporal modeling capability at a low computational cost.
Ranked #3 on Action Recognition on Something-Something V1
no code implementations • 19 Jun 2023 • Yaqi Zhang, Di Huang, Bin Liu, Shixiang Tang, Yan Lu, Lu Chen, Lei Bai, Qi Chu, Nenghai Yu, Wanli Ouyang
Generating realistic human motion from given action descriptions has experienced significant advancements because of the emerging requirement of digital humans.
no code implementations • 19 Jun 2023 • Qinghong Sun, Yangguang Li, Zexiang Liu, Xiaoshui Huang, Fenggang Liu, Xihui Liu, Wanli Ouyang, Jing Shao
However, the quality and diversity of existing 3D object generation methods are constrained by the inadequacies of existing 3D object datasets, including poor text quality, incomplete multi-modal data representation (2D rendered images and 3D assets), and limited dataset size.
no code implementations • 15 Jun 2023 • YiRong Chen, Ziyue Li, Wanli Ouyang, Michael Lepech
In this work, we propose an Adaptive Hierarchical SpatioTemporal Network (AHSTN) to promote traffic forecasting by exploiting the spatial hierarchy and modeling multi-scale spatial correlations.
no code implementations • 13 Jun 2023 • Weizhen He, Shixiang Tang, Yiheng Deng, Qihao Chen, Qingsong Xie, Yizhou Wang, Lei Bai, Feng Zhu, Rui Zhao, Wanli Ouyang, Donglian Qi, Yunfeng Yan
This paper strives to resolve this problem by proposing a new instruct-ReID task that requires the model to retrieve images according to the given image or language instructions. Our instruct-ReID is a more general ReID setting, where existing ReID tasks can be viewed as special cases by designing different instructions.
1 code implementation • 11 Jun 2023 • Zhenfei Yin, Jiong Wang, JianJian Cao, Zhelun Shi, Dingning Liu, Mukai Li, Lu Sheng, Lei Bai, Xiaoshui Huang, Zhiyong Wang, Jing Shao, Wanli Ouyang
2) We demonstrate the detailed methods of constructing instruction-tuning datasets and benchmarks for MLLMs, which will enable future research on MLLMs to scale up and extend to other domains, tasks, and modalities faster.
1 code implementation • CVPR 2023 • Yingjie Wang, Jiajun Deng, Yao Li, Jinshui Hu, Cong Liu, Yu Zhang, Jianmin Ji, Wanli Ouyang, Yanyong Zhang
LiDAR and Radar are two complementary sensing approaches in that LiDAR specializes in capturing an object's 3D shape while Radar provides longer detection ranges as well as velocity hints.
1 code implementation • CVPR 2023 • Honghui Yang, Wenxiao Wang, Minghao Chen, Binbin Lin, Tong He, Hua Chen, Xiaofei He, Wanli Ouyang
The key to associating the two different representations is our introduced input-dependent Query Initialization module, which could efficiently generate reference points and content queries.
no code implementations • 10 May 2023 • Xulin Li, Yan Lu, Bin Liu, Yuenan Hou, Yating Liu, Qi Chu, Wanli Ouyang, Nenghai Yu
Clothes-invariant feature extraction is critical to the clothes-changing person re-identification (CC-ReID).
Clothes Changing Person Re-Identification • Person Re-Identification
no code implementations • 4 May 2023 • Peng Ye, Tong He, Shengji Tang, Baopu Li, Tao Chen, Lei Bai, Wanli Ouyang
In this work, we aim to re-investigate the training process of residual networks from a novel social psychology perspective of loafing, and further propose a new training scheme as well as three improved strategies for boosting residual networks beyond their performance limits.
1 code implementation • 25 Apr 2023 • Zeyu Lu, Di Huang, Lei Bai, Jingjing Qu, Chengyue Wu, Xihui Liu, Wanli Ouyang
Along with this, we evaluate model capability for AI-generated image detection on MPBench; the top-performing model on MPBench achieves a 13% failure rate under the same setting used in the human evaluation.
no code implementations • 6 Apr 2023 • Kang Chen, Tao Han, Junchao Gong, Lei Bai, Fenghua Ling, Jing-Jia Luo, Xi Chen, Leiming Ma, Tianning Zhang, Rui Su, Yuanzheng Ci, Bin Li, Xiaokang Yang, Wanli Ouyang
We present FengWu, an advanced data-driven global medium-range weather forecast system based on Artificial Intelligence (AI).
no code implementations • 22 Mar 2023 • Zhilong Liang, Zhenzhi Tan, Ruixin Hong, Wanli Ouyang, Jinying Yuan, ChangShui Zhang
Computer image recognition with machine learning methods can compensate for the shortcomings of manual judgment, providing accurate and quantitative assessment.
1 code implementation • CVPR 2023 • Shixiang Tang, Cheng Chen, Qingsong Xie, Meilin Chen, Yizhou Wang, Yuanzheng Ci, Lei Bai, Feng Zhu, Haiyang Yang, Li Yi, Rui Zhao, Wanli Ouyang
Specifically, we propose HumanBench, built on existing datasets, to comprehensively evaluate on common ground the generalization abilities of different pretraining methods across 19 datasets from 6 diverse downstream tasks, including person ReID, pose estimation, human parsing, pedestrian attribute recognition, pedestrian detection, and crowd counting.
Ranked #1 on Pedestrian Attribute Recognition on PA-100K
1 code implementation • CVPR 2023 • Yuanzheng Ci, Yizhou Wang, Meilin Chen, Shixiang Tang, Lei Bai, Feng Zhu, Rui Zhao, Fengwei Yu, Donglian Qi, Wanli Ouyang
When adapted to a specific task, UniHCP achieves new SOTAs on a wide range of human-centric tasks, e.g., 69.8 mIoU on CIHP for human parsing, 86.18 mA on PA-100K for attribute prediction, 90.3 mAP on Market1501 for ReID, and 85.8 JI on CrowdHuman for pedestrian detection, performing better than specialized models tailored for each task.
Ranked #1 on Pedestrian Attribute Recognition on PETA
no code implementations • 3 Mar 2023 • Lintao Wang, Kun Hu, Lei Bai, Yu Ding, Wanli Ouyang, Zhiyong Wang
As past poses often contain useful auxiliary hints, in this paper, we propose a task-agnostic deep learning method, namely Multi-scale Control Signal-aware Transformer (MCS-T), with an attention based encoder-decoder architecture to discover the auxiliary information implicitly for synthesizing controllable motion without explicitly requiring auxiliary information such as phase.
no code implementations • 22 Feb 2023 • Meilin Chen, Yizhou Wang, Shixiang Tang, Feng Zhu, Haiyang Yang, Lei Bai, Rui Zhao, Donglian Qi, Wanli Ouyang
Although feasible, recent works have largely overlooked discovering the most discriminative regions for contrastive learning of object representations in scene images.
no code implementations • 8 Feb 2023 • Geng Zhan, Dongang Wang, Mariano Cabezas, Lei Bai, Kain Kyle, Wanli Ouyang, Michael Barnett, Chenyu Wang
An accurate and robust quantitative measurement of brain volume change is paramount for translational research and clinical applications.
1 code implementation • 29 Jan 2023 • Yangguang Li, Bin Huang, Zeren Chen, Yufeng Cui, Feng Liang, Mingzhu Shen, Fenggang Liu, Enze Xie, Lu Sheng, Wanli Ouyang, Jing Shao
Our Fast-BEV consists of five parts; among them we propose (1) a lightweight, deployment-friendly view transformation that quickly transfers 2D image features to 3D voxel space, (2) a multi-scale image encoder that leverages multi-scale information for better performance, and (3) an efficient BEV encoder specifically designed to speed up on-vehicle inference.
1 code implementation • 16 Jan 2023 • Peng Ye, Tong He, Baopu Li, Tao Chen, Lei Bai, Wanli Ouyang
To address the robustness problem, we first benchmark different NAS methods under a wide range of proxy data, proxy channels, proxy layers and proxy epochs, since the robustness of NAS under different kinds of proxies has not been explored before.
no code implementations • CVPR 2023 • Yuchen Ren, Zhendong Mao, Shancheng Fang, Yan Lu, Tong He, Hao Du, Yongdong Zhang, Wanli Ouyang
In this paper, we introduce a new setting called Domain Generalization for Image Captioning (DGIC), where the data from the target domain is unseen in the learning process.
no code implementations • CVPR 2023 • Lian Xu, Wanli Ouyang, Mohammed Bennamoun, Farid Boussaid, Dan Xu
Weakly supervised dense object localization (WSDOL) relies generally on Class Activation Mapping (CAM), which exploits the correlation between the class weights of the image classifier and the pixel-level features.
no code implementations • CVPR 2023 • Shijie Wang, Jianlong Chang, Haojie Li, Zhihui Wang, Wanli Ouyang, Qi Tian
PLEor leverages the pre-trained CLIP model to infer the discrepancies encompassing both pre-defined and unknown subcategories, called category-specific discrepancies, and transfers them to the backbone network trained in the close-set scenarios.
3 code implementations • CVPR 2023 • Wenhao Wu, Haipeng Luo, Bo Fang, Jingdong Wang, Wanli Ouyang
Most existing text-video retrieval methods focus on cross-modal matching between the visual content of videos and textual query sentences.
Ranked #4 on Video Retrieval on VATEX
no code implementations • 31 Dec 2022 • Di Huang, Sida Peng, Tong He, Xiaowei Zhou, Wanli Ouyang
We propose a novel approach to self-supervised learning of point cloud representations by differentiable neural rendering.
3 code implementations • CVPR 2023 • Wenhao Wu, Xiaohan Wang, Haipeng Luo, Jingdong Wang, Yi Yang, Wanli Ouyang
In this paper, we propose a novel framework called BIKE, which utilizes the cross-modal bridge to explore bidirectional knowledge: i) We introduce the Video Attribute Association mechanism, which leverages the Video-to-Text knowledge to generate textual auxiliary attributes for complementing video recognition.
Ranked #1 on Zero-Shot Action Recognition on ActivityNet
no code implementations • CVPR 2023 • Mingye Xu, Mutian Xu, Tong He, Wanli Ouyang, Yali Wang, Xiaoguang Han, Yu Qiao
Besides, such scenes with progressive masking ratios can also serve to self-distill their intrinsic spatial consistency, requiring the model to learn consistent representations from unmasked areas.
no code implementations • 17 Dec 2022 • Yuan YAO, Yuanhan Zhang, Zhenfei Yin, Jiebo Luo, Wanli Ouyang, Xiaoshui Huang
The recent success of pre-trained 2D vision models is mostly attributable to learning from large-scale datasets.
1 code implementation • 8 Dec 2022 • Xiaoshui Huang, Sheng Li, Wentao Qu, Tong He, Yifan Zuo, Wanli Ouyang
This paper introduces Efficient Point Cloud Learning (EPCL), an effective and efficient point cloud learner for directly training high-quality point cloud models with a frozen CLIP model.
1 code implementation • CVPR 2023 • Honghui Yang, Tong He, Jiaheng Liu, Hua Chen, Boxi Wu, Binbin Lin, Xiaofei He, Wanli Ouyang
In contrast to previous 3D MAE frameworks, which either design a complex decoder to infer masked information from maintained regions or adopt sophisticated masking strategies, we instead propose a much simpler paradigm.
no code implementations • 30 Nov 2022 • Di Huang, Xiaopeng Ji, Xingyi He, Jiaming Sun, Tong He, Qing Shuai, Wanli Ouyang, Xiaowei Zhou
The key idea is that the hand motion naturally provides multiple views of the object and the motion can be reliably estimated by a hand pose tracker.
1 code implementation • 29 Nov 2022 • Chuming Li, Jie Liu, Yinmin Zhang, Yuhong Wei, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang
In the learning phase, each agent minimizes the TD error that is dependent on how the subsequent agents have reacted to their chosen action.
Ranked #1 on SMAC on SMAC 3s5z_vs_3s6z
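The quantity each agent minimizes rests on the standard one-step temporal-difference error. A minimal scalar sketch follows; this is generic TD, an assumption for illustration rather than the paper's exact multi-agent target, where the bootstrapped value reflects how the subsequent agents have reacted.

```python
def td_error(reward, gamma, q_current, q_next):
    """One-step TD error: delta = r + gamma * Q(s', a') - Q(s, a).

    In a sequential multi-agent scheme, q_next would be the value of the
    state reached after the subsequent agents act; here it is just a scalar.
    """
    return reward + gamma * q_next - q_current

# Minimizing |delta| pulls Q(s, a) toward the bootstrapped target.
delta = td_error(reward=1.0, gamma=0.9, q_current=2.0, q_next=1.5)
print(delta)  # ≈ 0.35
```

Driving this error toward zero makes the current value estimate consistent with the reward plus the discounted value of what follows.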
no code implementations • 17 Nov 2022 • Jiaheng Liu, Tong He, Honghui Yang, Rui Su, Jiayi Tian, Junran Wu, Hongcheng Guo, Ke Xu, Wanli Ouyang
Previous top-performing methods for 3D instance segmentation often maintain inter-task dependencies and the tendency towards a lack of robustness.
no code implementations • 14 Nov 2022 • Xiaopei Wu, Yang Zhao, Liang Peng, Hua Chen, Xiaoshui Huang, Binbin Lin, Haifeng Liu, Deng Cai, Wanli Ouyang
When training a teacher-student semi-supervised framework, we randomly add ground-truth samples and pseudo samples to both labeled and unlabeled frames, providing strong data augmentation for them.
1 code implementation • 11 Oct 2022 • Jingru Tan, Bo Li, Xin Lu, Yongqiang Yao, Fengwei Yu, Tong He, Wanli Ouyang
Long-tail distribution is widely spread in real-world applications.
1 code implementation • 9 Oct 2022 • Peng Ye, Shengji Tang, Baopu Li, Tao Chen, Wanli Ouyang
In this work, we aim to re-investigate the training process of residual networks from a novel social psychology perspective of loafing, and further propose a new training strategy to strengthen the performance of residual networks.
1 code implementation • 3 Oct 2022 • Tianyu Huang, Bowen Dong, Yunhan Yang, Xiaoshui Huang, Rynson W. H. Lau, Wanli Ouyang, WangMeng Zuo
To address this issue, we propose CLIP2Point, an image-depth pre-training method by contrastive learning to transfer CLIP to the 3D domain, and adapt it to point cloud classification.
Ranked #3 on Training-free 3D Point Cloud Classification on ScanObjectNN (using extra training data)
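The image-depth contrastive objective behind this kind of CLIP-style pre-training can be sketched as a symmetric InfoNCE loss. This is a textbook sketch under assumptions, not the authors' exact loss; the function name and temperature value are illustrative.

```python
import numpy as np

def info_nce(img_emb, depth_emb, temperature=0.07):
    """Symmetric InfoNCE over matched image/depth embedding pairs.

    Row i of each matrix is one pair; matched pairs are pulled together
    and mismatched pairs pushed apart, as in CLIP-style pre-training.
    """
    # L2-normalize so the dot product is a cosine similarity.
    a = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    b = depth_emb / np.linalg.norm(depth_emb, axis=1, keepdims=True)
    logits = a @ b.T / temperature
    n = len(logits)
    idx = np.arange(n)
    # Cross-entropy with the diagonal (matched pair) as the positive class,
    # computed in both directions and averaged.
    ls_i2d = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    ls_d2i = logits.T - np.log(np.exp(logits.T).sum(axis=1, keepdims=True))
    return (-ls_i2d[idx, idx].mean() - ls_d2i[idx, idx].mean()) / 2
```

Sanity check: perfectly aligned embeddings should score a lower loss than deliberately mismatched ones.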
1 code implementation • 23 Sep 2022 • Weitao Feng, Lei Bai, Yongqiang Yao, Fengwei Yu, Wanli Ouyang
In this paper, we propose a Frame Rate Agnostic MOT framework with a Periodic training Scheme (FAPS) to tackle the FraMOT problem for the first time.
1 code implementation • 23 Aug 2022 • Lumin Xu, Sheng Jin, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo, Xiaogang Wang
We propose a single-network approach, termed ZoomNet, to take into account the hierarchical structure of the full human body and solve the scale variation of different body parts.
Ranked #2 on 2D Human Pose Estimation on COCO-WholeBody
no code implementations • 15 Aug 2022 • Xinzhu Ma, Yuan Meng, Yinmin Zhang, Lei Bai, Jun Hou, Shuai Yi, Wanli Ouyang
We hope this work can provide insights for the image-based 3D detection community under a semi-supervised setting.
no code implementations • 29 Jul 2022 • Shijie Wang, Jianlong Chang, Zhihui Wang, Haojie Li, Wanli Ouyang, Qi Tian
In this paper, we develop Fine-grained Retrieval Prompt Tuning (FRPT), which steers a frozen pre-trained model to perform the fine-grained retrieval task from the perspectives of sample prompting and feature adaptation.
1 code implementation • 22 Jul 2022 • Hao Meng, Sheng Jin, Wentao Liu, Chen Qian, Mengxiang Lin, Wanli Ouyang, Ping Luo
Unlike most previous works that directly predict the 3D poses of two interacting hands simultaneously, we propose to decompose the challenging interacting hand pose estimation task and estimate the pose of each hand separately.
1 code implementation • 21 Jul 2022 • Lumin Xu, Sheng Jin, Wang Zeng, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo, Xiaogang Wang
In this paper, we introduce the task of Category-Agnostic Pose Estimation (CAPE), which aims to create a pose estimation model capable of detecting the pose of any class of object given only a few samples with keypoint definition.
no code implementations • 21 Jul 2022 • Boyang xia, Wenhao Wu, Haoran Wang, Rui Su, Dongliang He, Haosen Yang, Xiaoran Fan, Wanli Ouyang
On the video level, a temporal attention module is learned under dual video-level supervisions on both the salient and the non-salient representations.
Ranked #3 on Action Recognition on ActivityNet
1 code implementation • 17 Jul 2022 • Yuanzheng Ci, Chen Lin, Lei Bai, Wanli Ouyang
Contrastive-based self-supervised learning methods achieved great success in recent years.
no code implementations • TIP 2022 • Peiqin Zhuang, Yu Guo, Zhipeng Yu, Luping Zhou, Lei Bai, Ding Liang, Zhiyong Wang, Yali Wang, Wanli Ouyang
To address this issue, we introduce a Motion Diversification and Selection (MoDS) module to generate diversified spatio-temporal motion features and then select the suitable motion representation dynamically for categorizing the input video.
Ranked #16 on Action Recognition on Something-Something V1
3 code implementations • 4 Jul 2022 • Wenhao Wu, Zhun Sun, Wanli Ouyang
In this study, we focus on transferring knowledge for video classification tasks.
Ranked #1 on Action Recognition on ActivityNet
1 code implementation • 14 Jun 2022 • Jiajun Deng, Zhengyuan Yang, Daqing Liu, Tianlang Chen, Wengang Zhou, Yanyong Zhang, Houqiang Li, Wanli Ouyang
For another, we devise Language Conditioned Vision Transformer that removes external fusion modules and reuses the uni-modal ViT for vision-language fusion at the intermediate layers.
no code implementations • 13 Jun 2022 • Zengyu Qiu, Xinzhu Ma, Kunlin Yang, Chunya Liu, Jun Hou, Shuai Yi, Wanli Ouyang
Besides, our DPK makes the performance of the student model positively correlated with that of the teacher model, which means that we can further boost the accuracy of students by applying larger teachers.
no code implementations • 10 May 2022 • Haiyang Yang, Meilin Chen, Yizhou Wang, Shixiang Tang, Feng Zhu, Lei Bai, Rui Zhao, Wanli Ouyang
While recent self-supervised learning methods have achieved good performances with evaluation set on the same domain as the training set, they will have an undesirable performance decrease when tested on a different domain.
no code implementations • 3 May 2022 • Dongnan Liu, Mariano Cabezas, Dongang Wang, Zihao Tang, Lei Bai, Geng Zhan, Yuling Luo, Kain Kyle, Linda Ly, James Yu, Chun-Chien Shieh, Aria Nguyen, Ettikan Kandasamy Karuppiah, Ryan Sullivan, Fernando Calamante, Michael Barnett, Wanli Ouyang, Weidong Cai, Chenyu Wang
In addition, the segmentation loss function in each client is also re-weighted according to the lesion volume for the data during training.
1 code implementation • CVPR 2022 • Wang Zeng, Sheng Jin, Wentao Liu, Chen Qian, Ping Luo, Wanli Ouyang, Xiaogang Wang
Vision transformers have achieved great successes in many computer vision tasks.
Ranked #4 on 2D Human Pose Estimation on COCO-WholeBody
1 code implementation • CVPR 2022 • Qiuhong Shen, Lei Qiao, Jinyang Guo, Peixia Li, Xin Li, Bo Li, Weitao Feng, Weihao Gan, Wei Wu, Wanli Ouyang
As unlimited self-supervision signals can be obtained by tracking a video along a cycle in time, we investigate evolving a Siamese tracker by tracking videos forward-backward.
no code implementations • 25 Mar 2022 • Xinchi Zhou, Dongzhan Zhou, Wanli Ouyang, Hang Zhou, Ziwei Liu, Di Hu
Recent years have witnessed the success of deep learning on the visual sound separation task.
2 code implementations • CVPR 2022 • Tao Han, Lei Bai, Junyu Gao, Qi Wang, Wanli Ouyang
Instead of relying on the Multiple Object Tracking (MOT) techniques, we propose to solve the problem by decomposing all pedestrians into the initial pedestrians who existed in the first frame and the new pedestrians with separate identities in each following frame.
1 code implementation • 10 Mar 2022 • BoYu Chen, Peixia Li, Lei Bai, Lei Qiao, Qiuhong Shen, Bo Li, Weihao Gan, Wei Wu, Wanli Ouyang
Exploiting a general-purpose neural architecture to replace hand-wired designs or inductive biases has recently drawn extensive interest.
1 code implementation • CVPR 2022 • Lian Xu, Wanli Ouyang, Mohammed Bennamoun, Farid Boussaid, Dan Xu
To this end, we propose a Multi-class Token Transformer, termed as MCTformer, which uses multiple class tokens to learn interactions between the class tokens and the patch tokens.
Object Localization • Weakly supervised Semantic Segmentation • +1
1 code implementation • 3 Mar 2022 • Peng Ye, Baopu Li, Yikang Li, Tao Chen, Jiayuan Fan, Wanli Ouyang
Neural Architecture Search~(NAS) has attracted increasingly more attention in recent years because of its capability to design deep neural networks automatically.
1 code implementation • 7 Feb 2022 • Xinzhu Ma, Wanli Ouyang, Andrea Simonelli, Elisa Ricci
3D object detection from images, one of the fundamental and challenging problems in autonomous driving, has received increasing attention from both industry and academia in recent years.
no code implementations • 3 Feb 2022 • Pu Zhang, Lei Bai, Jianru Xue, Jianwu Fang, Nanning Zheng, Wanli Ouyang
Trajectories obtained from object detection and tracking are inevitably noisy, which could cause serious forecasting errors to predictors built on ground truth trajectories.
1 code implementation • ICLR 2022 • Zhiyu Chong, Xinzhu Ma, Hong Zhang, Yuxin Yue, Haojie Li, Zhihui Wang, Wanli Ouyang
Finally, this LiDAR Net can serve as the teacher to transfer the learned knowledge to the baseline model.
no code implementations • ICLR 2022 • Can Wang, Sheng Jin, Yingda Guan, Wentao Liu, Chen Qian, Ping Luo, Wanli Ouyang
PL approaches apply pseudo-labels to unlabeled data, and then train the model with a combination of the labeled and pseudo-labeled data iteratively.
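One round of the pseudo-labeling (PL) loop described here can be sketched as follows. This is a generic confidence-thresholded PL round under stated assumptions, not the paper's method; `model_predict` and the threshold are illustrative.

```python
import numpy as np

def pseudo_label_round(model_predict, unlabeled, threshold=0.9):
    """Select confident predictions on unlabeled data as pseudo-labels.

    model_predict: callable returning class probabilities, shape (N, K).
    Returns the kept samples and their hard pseudo-labels; a full PL loop
    would retrain on labeled + pseudo-labeled data and repeat.
    """
    probs = model_predict(unlabeled)
    conf = probs.max(axis=1)
    keep = conf >= threshold  # only trust high-confidence predictions
    return unlabeled[keep], probs[keep].argmax(axis=1)

# Toy "model": a fixed probability table for 3 samples, 2 classes.
probs = np.array([[0.95, 0.05], [0.6, 0.4], [0.08, 0.92]])
x = np.arange(3)
xs, ys = pseudo_label_round(lambda u: probs[u], x, threshold=0.9)
print(xs, ys)  # → [0 2] [0 1]
```

The threshold trades coverage against label noise: lowering it admits more pseudo-labeled samples but more wrong ones, which is why PL pipelines iterate rather than label everything at once.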
no code implementations • 18 Jan 2022 • Luya Wang, Feng Liang, Yangguang Li, Honggang Zhang, Wanli Ouyang, Jing Shao
Recently, self-supervised vision transformers have attracted unprecedented attention for their impressive representation learning ability.
no code implementations • CVPR 2022 • Jiahao Wang, Baoyuan Wu, Rui Su, Mingdeng Cao, Shuwei Shi, Wanli Ouyang, Yujiu Yang
We conduct experiments both from a control theory lens through a phase locus verification and from a network training lens on several models, including CNNs, Transformers, MLPs, and on benchmark datasets.
1 code implementation • CVPR 2022 • Peng Ye, Baopu Li, Yikang Li, Tao Chen, Jiayuan Fan, Wanli Ouyang
Neural Architecture Search (NAS) has attracted increasingly more attention in recent years because of its capability to design deep neural network automatically.
no code implementations • NeurIPS 2021 • Keyu Tian, Chen Lin, Ser Nam Lim, Wanli Ouyang, Puneet Dokania, Philip Torr
Automated data augmentation (ADA) techniques have played an important role in boosting the performance of deep models.
no code implementations • CVPR 2022 • Yizhou Wang, Shixiang Tang, Feng Zhu, Lei Bai, Rui Zhao, Donglian Qi, Wanli Ouyang
The pretrain-finetune paradigm is a classical pipeline in visual learning.
no code implementations • 13 Oct 2021 • Jincen Jiang, Xuequan Lu, Wanli Ouyang, Meili Wang
Though a number of point cloud learning methods have been proposed to handle unordered points, most of them are supervised and require labels for training.
3 code implementations • ICLR 2022 • Yangguang Li, Feng Liang, Lichen Zhao, Yufeng Cui, Wanli Ouyang, Jing Shao, Fengwei Yu, Junjie Yan
Recently, large-scale Contrastive Language-Image Pre-training (CLIP) has attracted unprecedented attention for its impressive zero-shot recognition ability and excellent transferability to downstream tasks.
no code implementations • 5 Oct 2021 • Jianan Liu, Weiyi Xiong, Liping Bai, Yuxuan Xia, Tao Huang, Wanli Ouyang, Bing Zhu
Automotive radar provides reliable environmental perception in all-weather conditions with affordable cost, but it hardly supplies semantic and geometry information due to the sparsity of radar detection points.
no code implementations • 29 Sep 2021 • Yizhou Wang, Shixiang Tang, Feng Zhu, Lei Bai, Rui Zhao, Donglian Qi, Wanli Ouyang
The pretrain-finetune paradigm is a classical pipeline in visual learning.
1 code implementation • ICCV 2021 • Size Wu, Sheng Jin, Wentao Liu, Lei Bai, Chen Qian, Dong Liu, Wanli Ouyang
Following the top-down paradigm, we decompose the task into two stages, i.e., person localization and pose estimation.
Ranked #2 on 3D Multi-Person Pose Estimation on Panoptic (using extra training data)
no code implementations • 23 Aug 2021 • Jiangmiao Pang, Kai Chen, Qi Li, Zhihai Xu, Huajun Feng, Jianping Shi, Wanli Ouyang, Dahua Lin
In this work, we carefully revisit the standard training practice of detectors, and find that the detection performance is often limited by the imbalance during the training process, which generally consists in three levels - sample level, feature level, and objective level.
1 code implementation • ICCV 2021 • BoYu Chen, Peixia Li, Baopu Li, Chen Lin, Chuming Li, Ming Sun, Junjie Yan, Wanli Ouyang
We present BN-NAS, Neural Architecture Search with Batch Normalization, to accelerate neural architecture search (NAS).
no code implementations • 7 Aug 2021 • BoYu Chen, Peixia Li, Baopu Li, Chuming Li, Lei Bai, Chen Lin, Ming Sun, Junjie Yan, Wanli Ouyang
Then, a compact set of the possible combinations for different token pooling and attention sharing mechanisms is constructed.
1 code implementation • 29 Jul 2021 • Yinmin Zhang, Xinzhu Ma, Shuai Yi, Jun Hou, Zhihui Wang, Wanli Ouyang, Dan Xu
In this paper, we propose to learn geometry-guided depth estimation with projective modeling to advance monocular 3D object detection.
Ranked #7 on Monocular 3D Object Detection on KITTI Cars Moderate
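The projective modeling behind geometry-guided monocular depth rests on the pinhole relation: an object of 3D height H projects to h = f·H/z pixels, so depth can be recovered as z = f·H/h. The sketch below is the deterministic textbook core, an assumption for illustration; GUP-style methods additionally model the uncertainty of H and h.

```python
def depth_from_height(focal_px, height_3d_m, height_2d_px):
    """Pinhole relation: h = f * H / z  =>  z = f * H / h.

    focal_px:     camera focal length in pixels
    height_3d_m:  estimated physical object height (metres)
    height_2d_px: projected 2D box height (pixels)
    """
    return focal_px * height_3d_m / height_2d_px

# A 1.5 m tall car imaged at 100 px with a 700 px focal length:
z = depth_from_height(focal_px=700.0, height_3d_m=1.5, height_2d_px=100.0)
print(z)  # → 10.5
```

Because depth here is a ratio of two noisy estimates, small errors in H or h amplify in z, which is the error-amplification issue such methods target.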
1 code implementation • ICCV 2021 • Yan Lu, Xinzhu Ma, Lei Yang, Tianzhu Zhang, Yating Liu, Qi Chu, Junjie Yan, Wanli Ouyang
In this paper, we propose a Geometry Uncertainty Projection Network (GUP Net) to tackle the error amplification problem at both inference and training stages.
3D Object Detection From Monocular Images • Depth Estimation • +2
1 code implementation • ICCV 2021 • Lian Xu, Wanli Ouyang, Mohammed Bennamoun, Farid Boussaid, Ferdous Sohel, Dan Xu
Motivated by the significant inter-task correlation, we propose a novel weakly supervised multi-task framework termed as AuxSegNet, to leverage saliency detection and multi-label image classification as auxiliary tasks to improve the primary task of semantic segmentation using only image-level ground-truth labels.
2 code implementations • ICCV 2021 • BoYu Chen, Peixia Li, Chuming Li, Baopu Li, Lei Bai, Chen Lin, Ming Sun, Junjie Yan, Wanli Ouyang
We introduce the first Neural Architecture Search (NAS) method to find a better transformer architecture for image recognition.
Ranked #459 on Image Classification on ImageNet
no code implementations • CVPR 2021 • Shixiang Tang, Dapeng Chen, Lei Bai, Kaijian Liu, Yixiao Ge, Wanli Ouyang
In this MCGN, the labels and features of support data are used by the CRF for inferring GNN affinities in a principled and probabilistic way.
no code implementations • 28 May 2021 • Ming Sun, Haoxuan Dou, Baopu Li, Lei Cui, Junjie Yan, Wanli Ouyang
Data sampling acts as a pivotal role in training deep learning models.
4 code implementations • CVPR 2021 • Lumin Xu, Yingda Guan, Sheng Jin, Wentao Liu, Chen Qian, Ping Luo, Wanli Ouyang, Xiaogang Wang
Human pose estimation has achieved significant progress in recent years.
Ranked #26 on Pose Estimation on COCO test-dev (using extra training data)
no code implementations • CVPR 2021 • Shixiang Tang, Dapeng Chen, Jinguo Zhu, Shijie Yu, Wanli Ouyang
The gradient for update should be close to the gradient of the new task, consistent with the gradients shared by all old tasks, and orthogonal to the space spanned by the gradients specific to the old tasks.
2 code implementations • ICCV 2021 • Hongwen Zhang, Yating Tian, Xinchi Zhou, Wanli Ouyang, Yebin Liu, LiMin Wang, Zhenan Sun
Regression-based methods have recently shown promising results in reconstructing human meshes from monocular images.
Ranked #53 on 3D Human Pose Estimation on 3DPW (using extra training data)
3D human pose and shape estimation • 3D Human Reconstruction • +2
1 code implementation • CVPR 2021 • Xinzhu Ma, Yinmin Zhang, Dan Xu, Dongzhan Zhou, Shuai Yi, Haojie Li, Wanli Ouyang
Estimating 3D bounding boxes from monocular images is an essential component in autonomous driving, while accurate 3D object detection from this kind of data is very challenging.
Ranked #15 on Monocular 3D Object Detection on KITTI Cars Moderate
no code implementations • 23 Mar 2021 • Shixiang Tang, Peng Su, Dapeng Chen, Wanli Ouyang
To better understand this issue, we study the problem of continual domain adaptation, where the model is presented with a labelled source domain and a sequence of unlabelled target domains.
no code implementations • 18 Mar 2021 • Jinghao Zhou, Bo Li, Lei Qiao, Peng Wang, Weihao Gan, Wei Wu, Junjie Yan, Wanli Ouyang
Visual Object Tracking (VOT) has synchronous needs for both robustness and accuracy.
no code implementations • 18 Mar 2021 • Jinghao Zhou, Bo Li, Peng Wang, Peixia Li, Weihao Gan, Wei Wu, Junjie Yan, Wanli Ouyang
Visual Object Tracking (VOT) can be seen as an extended task of Few-Shot Learning (FSL).
no code implementations • 8 Jan 2021 • Dan Xu, Xavier Alameda-Pineda, Wanli Ouyang, Elisa Ricci, Xiaogang Wang, Nicu Sebe
In contrast to previous works directly considering multi-scale feature maps obtained from the inner layers of a primary CNN architecture, and simply fusing the features with weighted averaging or concatenation, we propose a probabilistic graph attention network structure based on a novel Attention-Gated Conditional Random Fields (AG-CRFs) model for learning and fusing multi-scale representations in a principled manner.
no code implementations • ICCV 2021 • Shuyang Sun, Xiaoyu Yue, Xiaojuan Qi, Wanli Ouyang, Victor Adrian Prisacariu, Philip H.S. Torr
Aggregating features from different depths of a network is widely adopted to improve the network capability.
1 code implementation • CVPR 2021 • Jie Liu, Chuming Li, Feng Liang, Chen Lin, Ming Sun, Junjie Yan, Wanli Ouyang, Dong Xu
To develop a practical method for learning complex inception convolution based on the data, a simple but effective search algorithm, referred to as efficient dilation optimization (EDO), is developed.
1 code implementation • 12 Dec 2020 • Matthieu Lin, Chuming Li, Xingyuan Bu, Ming Sun, Chen Lin, Junjie Yan, Wanli Ouyang, Zhidong Deng
Furthermore, the bipartite matching of ED harms training efficiency due to the large number of ground-truth boxes in crowd scenes.
no code implementations • 10 Dec 2020 • Hong Zhang, Shenglun Chen, Zhihui Wang, Haojie Li, Wanli Ouyang
To this end, we first propose to decompose the full matching task into multiple stages of the cost aggregation module.
no code implementations • 10 Dec 2020 • Hong Zhang, Haojie Li, Shenglun Chen, Tiantian Yan, Zhihui Wang, Guo Lu, Wanli Ouyang
To make the Adaptive-Grained Depth Refinement stage robust to the coarse depth and adaptive to the depth range of the points, Granularity Uncertainty is introduced into this stage.
no code implementations • 27 Nov 2020 • Zhenxun Yuan, Xiao Song, Lei Bai, Wengang Zhou, Zhe Wang, Wanli Ouyang
As a special design of this transformer, the information encoded in the encoder is different from that in the decoder, i.e., the encoder encodes temporal-channel information of multiple frames while the decoder decodes the spatial-channel information for the current frame in a voxel-wise manner.
1 code implementation • ICCV 2021 • Yuanzheng Ci, Chen Lin, Ming Sun, BoYu Chen, Hongwen Zhang, Wanli Ouyang
The automation of neural architecture design has been a coveted alternative to human experts.
no code implementations • 21 Oct 2020 • Jie Liu, Chen Lin, Chuming Li, Lu Sheng, Ming Sun, Junjie Yan, Wanli Ouyang
Several variants of stochastic gradient descent (SGD) have been proposed to improve the learning effectiveness and efficiency when training deep neural networks, among which some recent influential attempts would like to adaptively control the parameter-wise learning rate (e.g., Adam and RMSProp).
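The parameter-wise adaptive learning rate mentioned above can be illustrated with a plain-NumPy RMSProp step (a generic textbook sketch, not the method studied in this paper): each parameter is scaled by the running average of its own squared gradient, so steep directions are damped.

```python
import numpy as np

def rmsprop_step(w, g, cache, lr=1e-3, decay=0.9, eps=1e-8):
    """One RMSProp update: each parameter gets its own effective
    learning rate lr / (sqrt(E[g^2]) + eps)."""
    cache = decay * cache + (1 - decay) * g**2
    w = w - lr * g / (np.sqrt(cache) + eps)
    return w, cache

# Minimize f(w) = ||w||^2 from a toy starting point.
w = np.array([1.0, -2.0])
cache = np.zeros_like(w)
for _ in range(100):
    g = 2 * w                     # gradient of f
    w, cache = rmsprop_step(w, g, cache, lr=0.1)
print(np.abs(w).max())
```

Note the characteristic behavior: because the step is normalized by the gradient magnitude, the iterate oscillates near the minimum with an amplitude on the order of `lr` rather than converging exactly.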
no code implementations • 12 Oct 2020 • Shijie Wang, Zhihui Wang, Haojie Li, Wanli Ouyang
Existing deep learning based weakly supervised fine-grained image recognition (WFGIR) methods usually pick out the discriminative regions from the high-level feature (HLF) maps directly.
1 code implementation • ICCV 2021 • Mingzhu Shen, Feng Liang, Ruihao Gong, Yuhang Li, Chuming Li, Chen Lin, Fengwei Yu, Junjie Yan, Wanli Ouyang
Therefore, we propose to combine Network Architecture Search methods with quantization to enjoy the merits of the two sides.
1 code implementation • NeurIPS 2020 • Keyu Tian, Chen Lin, Ming Sun, Luping Zhou, Junjie Yan, Wanli Ouyang
On CIFAR-10, we achieve a top-1 error rate of 1.24%, which is currently the best performing single model without extra training data.
no code implementations • 28 Sep 2020 • Mingzhu Shen, Feng Liang, Chuming Li, Chen Lin, Ming Sun, Junjie Yan, Wanli Ouyang
Automatic search of Quantized Neural Networks (QNN) has attracted a lot of attention.
no code implementations • 22 Sep 2020 • Weitao Feng, Zhihao Hu, Baopu Li, Weihao Gan, Wei Wu, Wanli Ouyang
Besides, we propose a new MOT evaluation measure, Still Another IDF score (SAIDF), aiming to focus more on identity issues. This new measure may overcome some problems of the previous measures and provide a better insight for identity issues in MOT.
no code implementations • ECCV 2020 • Zhihao Hu, Zhenghao Chen, Dong Xu, Guo Lu, Wanli Ouyang, Shuhang Gu
In this work, we propose a new framework called Resolution-adaptive Flow Coding (RaFC) to effectively compress the flow maps globally and locally, in which we use multi-resolution representations instead of single-resolution representations for both the input flow maps and the output motion features of the MV encoder.
no code implementations • 12 Sep 2020 • Yi Zhou, Shuyang Sun, Chao Zhang, Yikang Li, Wanli Ouyang
By assigning each relationship a single label, current approaches formulate the relationship detection as a classification problem.
1 code implementation • 14 Aug 2020 • Xianghui Yang, Bairun Wang, Kaige Chen, Xinchi Zhou, Shuai Yi, Wanli Ouyang, Luping Zhou
(2) The object categories at the training and inference stages have no overlap, leaving the inter-class gap.
1 code implementation • ECCV 2020 • Xinzhu Ma, Shinan Liu, Zhiyi Xia, Hongwen Zhang, Xingyu Zeng, Wanli Ouyang
Based on this observation, we design an image based CNN detector named Patch-Net, which is more generalized and can be instantiated as pseudo-LiDAR based 3D detectors.
2 code implementations • ECCV 2020 • Sheng Jin, Lumin Xu, Jin Xu, Can Wang, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo
This paper investigates the task of 2D human whole-body pose estimation, which aims to localize dense landmarks on the entire human body including face, hands, body, and feet.
Ranked #8 on 2D Human Pose Estimation on COCO-WholeBody
no code implementations • ECCV 2020 • Sheng Jin, Wentao Liu, Enze Xie, Wenhai Wang, Chen Qian, Wanli Ouyang, Ping Luo
The modules of HGG can be trained end-to-end with the keypoint detection network and is able to supervise the grouping process in a hierarchical manner.
Ranked #3 on Keypoint Detection on OCHuman
3 code implementations • CVPR 2020 • Wang Zeng, Wanli Ouyang, Ping Luo, Wentao Liu, Xiaogang Wang
This paper proposes a model-free 3D human mesh estimation framework, named DecoMR, which explicitly establishes the dense correspondence between the mesh and the local image features in the UV space (i.e., a 2D space used for texture mapping of 3D mesh).
Ranked #1 on 3D Human Reconstruction on Surreal
no code implementations • 11 May 2020 • Geng Zhan, Dan Xu, Guo Lu, Wei Wu, Chunhua Shen, Wanli Ouyang
Existing anchor-based and anchor-free object detectors in multi-stage or one-stage pipelines have achieved very promising detection performance.
no code implementations • ECCV 2020 • Dongzhan Zhou, Xinchi Zhou, Hongwen Zhang, Shuai Yi, Wanli Ouyang
In this paper, we propose a general and efficient pre-training paradigm, Montage pre-training, for object detection.
no code implementations • 23 Apr 2020 • Zengyuan Guo, Zilin Wang, Zhihui Wang, Wanli Ouyang, Haojie Li, Wen Gao
However, they are behind in accuracy comparing with recent segmentation-based text detectors.
5 code implementations • CVPR 2020 • Ziyu Liu, Hongwen Zhang, Zhenghao Chen, Zhiyong Wang, Wanli Ouyang
Spatial-temporal graphs have been widely used by skeleton-based action recognition algorithms to model human action dynamics.
Ranked #1 on 3D Action Recognition on Assembly101
no code implementations • ECCV 2020 • Guo Lu, Chunlei Cai, Xiaoyun Zhang, Li Chen, Wanli Ouyang, Dong Xu, Zhiyong Gao
Therefore, the encoder is adaptive to different video contents and achieves better compression performance by reducing the domain gap between the training and testing datasets.
no code implementations • 15 Mar 2020 • Jinyang Guo, Wanli Ouyang, Dong Xu
To this end, we propose a new strategy to suppress the influence of unimportant features (i.e., the features that will be removed at the next pruning stage).
1 code implementation • CVPR 2020 • Jingru Tan, Changbao Wang, Buyu Li, Quanquan Li, Wanli Ouyang, Changqing Yin, Junjie Yan
Based on it, we propose a simple but effective loss, named equalization loss, to tackle the problem of long-tailed rare categories by simply ignoring those gradients for rare categories.
Ranked #15 on Long-tail Learning on CIFAR-10-LT (ρ=10)
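The "simply ignoring those gradients" idea behind the equalization loss can be sketched as a weight mask on a sigmoid cross-entropy: a rare class contributes no loss on samples where it is a negative, so abundant foreground classes cannot suppress it. This is a simplified sketch — the function name, the fixed `is_rare` mask, and the normalization are assumptions, not the paper's exact loss:

```python
import numpy as np

def equalization_bce(logits, labels, is_rare):
    """Sigmoid BCE where negative-sample gradients for rare categories
    are masked out.  logits, labels: (N, C); is_rare: (C,) boolean."""
    p = 1.0 / (1.0 + np.exp(-logits))
    bce = -(labels * np.log(p + 1e-12)
            + (1 - labels) * np.log(1 - p + 1e-12))
    # weight 0 for (negative sample, rare class) pairs, 1 otherwise
    w = 1.0 - is_rare[None, :] * (1.0 - labels)
    return (w * bce).sum() / max(w.sum(), 1.0)

# Toy example: class 0 is rare and a negative here, so only class 1
# contributes to the loss.
logits = np.array([[2.0, -1.0]])
labels = np.zeros((1, 2))
rare = np.array([True, False])
loss = equalization_bce(logits, labels, rare)
```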
no code implementations • CVPR 2020 • Dongzhan Zhou, Xinchi Zhou, Wenwei Zhang, Chen Change Loy, Shuai Yi, Xuesen Zhang, Wanli Ouyang
While many methods have been proposed to improve the efficiency of NAS, the search progress is still laborious because training and evaluating plausible architectures over large search space is time-consuming.
1 code implementation • 31 Dec 2019 • Hongwen Zhang, Jie Cao, Guo Lu, Wanli Ouyang, Zhenan Sun
Reconstructing 3D human shape and pose from monocular images is challenging despite the promising results achieved by the most recent learning-based methods.
Ranked #57 on 3D Human Pose Estimation on 3DPW (MPJPE metric)
no code implementations • ICLR 2020 • Feng Liang, Chen Lin, Ronghao Guo, Ming Sun, Wei Wu, Junjie Yan, Wanli Ouyang
However, the allocation pattern designed for classification is usually adopted directly in object detectors, which has been proved to be sub-optimal.
no code implementations • 15 Dec 2019 • Zhe Chen, Wanli Ouyang, Tongliang Liu, DaCheng Tao
Alternatively, to access much more natural-looking pedestrians, we propose to augment pedestrian detection datasets by transforming real pedestrians from the same dataset into different shapes.
2 code implementations • ICCV 2019 • Haodong Duan, Kwan-Yee Lin, Sheng Jin, Wentao Liu, Chen Qian, Wanli Ouyang
In this paper, we propose the Triplet Representation for Body (TRB) -- a compact 2D human body representation, with skeleton keypoints capturing human pose information and contour keypoints containing human shape information.
no code implementations • CVPR 2020 • Xiang Li, Chen Lin, Chuming Li, Ming Sun, Wei Wu, Junjie Yan, Wanli Ouyang
In this paper, we analyse existing weight sharing one-shot NAS approaches from a Bayesian point of view and identify the posterior fading problem, which compromises the effectiveness of shared weights.
no code implementations • 21 Sep 2019 • Zehui Yao, Boyan Zhang, Zhiyong Wang, Wanli Ouyang, Dong Xu, Dagan Feng
For example, given two image domains $X_1$ and $X_2$ with certain attributes, the intersection $X_1 \cap X_2$ denotes a new domain where images possess the attributes from both $X_1$ and $X_2$ domains.
2 code implementations • ICCV 2019 • Peixia Li, Bo-Yu Chen, Wanli Ouyang, Dong Wang, Xiaoyun Yang, Huchuan Lu
In this work, we propose a novel gradient-guided network to exploit the discriminative information in gradients and update the template in the siamese network through feed-forward and backward operations.
Ranked #3 on Visual Object Tracking on OTB-2015 (Precision metric)
1 code implementation • ICCV 2019 • Yingyue Xu, Dan Xu, Xiaopeng Hong, Wanli Ouyang, Rongrong Ji, Min Xu, Guoying Zhao
We formulate the CRF graphical model that involves message-passing of feature-feature, feature-prediction, and prediction-prediction, from the coarse scale to the finer scale, to update the features and the corresponding predictions.
no code implementations • ICCV 2019 • Lingbo Liu, Zhilin Qiu, Guanbin Li, Shufan Liu, Wanli Ouyang, Liang Lin
Automatic estimation of the number of people in unconstrained crowded scenes is a challenging task and one major difficulty stems from the huge scale variation of people.
no code implementations • 23 Jun 2019 • Kai Niu, Yan Huang, Wanli Ouyang, Liang Wang
Firstly, the global-global alignment in the Global Contrast (GC) module is for matching the global contexts of images and descriptions.
Ranked #16 on Text based Person Retrieval on CUHK-PEDES
143 code implementations • 17 Jun 2019 • Kai Chen, Jiaqi Wang, Jiangmiao Pang, Yuhang Cao, Yu Xiong, Xiaoxiao Li, Shuyang Sun, Wansen Feng, Ziwei Liu, Jiarui Xu, Zheng Zhang, Dazhi Cheng, Chenchen Zhu, Tianheng Cheng, Qijie Zhao, Buyu Li, Xin Lu, Rui Zhu, Yue Wu, Jifeng Dai, Jingdong Wang, Jianping Shi, Wanli Ouyang, Chen Change Loy, Dahua Lin
In this paper, we introduce the various features of this toolbox.
no code implementations • CVPR 2019 • Rui Su, Wanli Ouyang, Luping Zhou, Dong Xu
Specifically, we first generate a larger set of region proposals by combining the latest region proposals from both streams, from which we can readily obtain a larger set of labelled training samples to help learn better action detection models.
1 code implementation • ICCV 2019 • Chen Lin, Minghao Guo, Chuming Li, Yuan Xin, Wei Wu, Dahua Lin, Wanli Ouyang, Junjie Yan
Data augmentation is critical to the success of modern deep learning techniques.
1 code implementation • ICCV 2019 • Chuming Li, Yuan Xin, Chen Lin, Minghao Guo, Wei Wu, Wanli Ouyang, Junjie Yan
The key contribution of this work is the design of search space which can guarantee the generalization and transferability on different vision tasks by including a bunch of existing prevailing loss functions in a unified formulation.
no code implementations • 15 May 2019 • Lingbo Liu, Zhilin Qiu, Guanbin Li, Qing Wang, Wanli Ouyang, Liang Lin
Finally, a GCC module is applied to model the correlation between all regions by computing a global correlation feature as a weighted sum of all regional features, with the weights being calculated as the similarity between the corresponding region pairs.
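The weighted sum described above — global context per region, with weights given by pairwise similarity — is structurally an attention operation. A minimal sketch under that reading (dot-product similarity and row-wise softmax are illustrative assumptions; the paper's GCC module may compute similarity differently):

```python
import numpy as np

def global_correlation(regions):
    """regions: (R, D) regional features.  Each region's global
    correlation feature is a weighted sum of all regions, with weights
    given by a softmax over pairwise dot-product similarities."""
    sim = regions @ regions.T                       # (R, R) similarities
    w = np.exp(sim - sim.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)               # row-wise softmax
    return w @ regions                              # (R, D)

R = np.random.rand(5, 16)
out = global_correlation(R)
print(out.shape)  # (5, 16)
```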
no code implementations • ICLR 2019 • Wei Gao, Yi Wei, Quanquan Li, Hongwei Qin, Wanli Ouyang, Junjie Yan
Hints can improve the performance of student model by transferring knowledge from teacher model.
1 code implementation • CVPR 2019 • Chunfeng Song, Yan Huang, Wanli Ouyang, Liang Wang
To address this problem, it is a good choice to learn to segment with weak supervision from bounding boxes.
5 code implementations • CVPR 2019 • Jiangmiao Pang, Kai Chen, Jianping Shi, Huajun Feng, Wanli Ouyang, Dahua Lin
In this work, we carefully revisit the standard training practice of detectors and find that detection performance is often limited by imbalance during the training process, which generally exists at three levels: sample level, feature level, and objective level.
Ranked #163 on Object Detection on COCO test-dev
2 code implementations • ICLR 2019 • Hongyang Li, Bo Dai, Shaoshuai Shi, Wanli Ouyang, Xiaogang Wang
We argue that the reliable set could guide the feature learning of the less reliable set during training - in spirit of student mimicking teacher behavior and thus pushing towards a more compact class centroid in the feature space.
Ranked #148 on Object Detection on COCO test-dev
no code implementations • 27 Mar 2019 • Xinzhu Ma, Zhihui Wang, Haojie Li, Peng-Bo Zhang, Xin Fan, Wanli Ouyang
To this end, we first leverage a stand-alone module to transform the input data from the 2D image plane to the 3D point cloud space for a better input representation, then we perform 3D detection using a PointNet backbone network to obtain the objects' 3D locations, dimensions, and orientations.
no code implementations • CVPR 2019 • Buyu Li, Wanli Ouyang, Lu Sheng, Xingyu Zeng, Xiaogang Wang
We present an efficient 3D object detection framework based on a single RGB image in the scenario of autonomous driving.
Ranked #18 on Vehicle Pose Estimation on KITTI Cars Hard
no code implementations • CVPR 2019 • Sheng Jin, Wentao Liu, Wanli Ouyang, Chen Qian
Our framework consists of two main components, i.e., SpatialNet and TemporalNet.
1 code implementation • CVPR 2019 • Pu Zhang, Wanli Ouyang, Pengfei Zhang, Jianru Xue, Nanning Zheng
In order to address this issue, we propose a data-driven state refinement module for LSTM network (SR-LSTM), which activates the utilization of the current intention of neighbors, and jointly and iteratively refines the current states of all participants in the crowd through a message passing mechanism.
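The iterative refinement through message passing described above can be sketched as repeated neighbor aggregation over a social graph. This is a bare-bones illustration with a fixed blending gate — the actual SR-LSTM learns its gates and weights, so all names and the 0.5 blend are assumptions:

```python
import numpy as np

def refine_states(h, adj, steps=2):
    """Message-passing sketch: each pedestrian's state h_i is
    iteratively refined with a weighted average of its neighbours'
    states.  h: (N, D) hidden states; adj: (N, N) nonnegative social
    weights (e.g., from spatial proximity)."""
    for _ in range(steps):
        w = adj / np.maximum(adj.sum(axis=1, keepdims=True), 1e-8)
        msg = w @ h                 # aggregate neighbours' intentions
        h = 0.5 * h + 0.5 * msg     # fixed gate; SR-LSTM learns this
    return h

h = np.random.rand(3, 4)
adj = np.ones((3, 3))               # fully connected toy social graph
print(refine_states(h, adj).shape)  # (3, 4)
```

With an identity adjacency (no neighbors), each state is blended only with itself and is left unchanged, which is a quick sanity check on the aggregation.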
no code implementations • 19 Feb 2019 • Chen Change Loy, Dahua Lin, Wanli Ouyang, Yuanjun Xiong, Shuo Yang, Qingqiu Huang, Dongzhan Zhou, Wei Xia, Quanquan Li, Ping Luo, Junjie Yan, Jian-Feng Wang, Zuoxin Li, Ye Yuan, Boxun Li, Shuai Shao, Gang Yu, Fangyun Wei, Xiang Ming, Dong Chen, Shifeng Zhang, Cheng Chi, Zhen Lei, Stan Z. Li, Hongkai Zhang, Bingpeng Ma, Hong Chang, Shiguang Shan, Xilin Chen, Wu Liu, Boyan Zhou, Huaxiong Li, Peng Cheng, Tao Mei, Artem Kukharenko, Artem Vasenin, Nikolay Sergievskiy, Hua Yang, Liangqi Li, Qiling Xu, Yuan Hong, Lin Chen,