Search Results for author: Yongming Rao

Found 31 papers, 21 papers with code

Temporal Coherence or Temporal Motion: Which is More Critical for Video-based Person Re-identification?

no code implementations ECCV 2020 Guangyi Chen, Yongming Rao, Jiwen Lu, Jie Zhou

Specifically, we disentangle the video representation into the temporal coherence and motion parts and randomly change the scale of the temporal motion features as the adversarial noise.

Frame Video-Based Person Re-Identification

FineDiving: A Fine-grained Dataset for Procedure-aware Action Quality Assessment

1 code implementation 7 Apr 2022 Jinglin Xu, Yongming Rao, Xumin Yu, Guangyi Chen, Jie Zhou, Jiwen Lu

Most existing action quality assessment methods rely on the deep features of an entire video to predict the score, which is less reliable due to the non-transparent inference process and poor interpretability.

Action Quality Assessment

SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation

1 code implementation 7 Apr 2022 Yi Wei, Linqing Zhao, Wenzhao Zheng, Zheng Zhu, Yongming Rao, Guan Huang, Jiwen Lu, Jie Zhou

In this paper, we propose a SurroundDepth method to incorporate the information from multiple surrounding views to predict depth maps across cameras.

Autonomous Driving Monocular Depth Estimation

LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection

1 code implementation 28 Mar 2022 Yi Wei, Zibu Wei, Yongming Rao, Jiaxin Li, Jie Zhou, Jiwen Lu

In this paper, we propose the LiDAR Distillation to bridge the domain gap induced by different LiDAR beams for 3D object detection.

3D Object Detection

Stochastic Trajectory Prediction via Motion Indeterminacy Diffusion

1 code implementation 25 Mar 2022 Tianpei Gu, Guangyi Chen, Junlong Li, Chunze Lin, Yongming Rao, Jie Zhou, Jiwen Lu

Human behavior has the nature of indeterminacy, which requires the pedestrian trajectory prediction system to model the multi-modality of future motion states.

Pedestrian Trajectory Prediction Trajectory Prediction

Back to Reality: Weakly-supervised 3D Object Detection with Shape-guided Label Enhancement

2 code implementations 10 Mar 2022 Xiuwei Xu, Yifan Wang, Yu Zheng, Yongming Rao, Jie Zhou, Jiwen Lu

In this paper, we propose a weakly-supervised approach for 3D object detection, which makes it possible to train a strong 3D detector with position-level annotations (i.e., annotations of object centers).

3D Object Detection Domain Adaptation

DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting

1 code implementation 2 Dec 2021 Yongming Rao, Wenliang Zhao, Guangyi Chen, Yansong Tang, Zheng Zhu, Guan Huang, Jie Zhou, Jiwen Lu

In this work, we present a new framework for dense prediction by implicitly and explicitly leveraging the pre-trained knowledge from CLIP.

Instance Segmentation Language Modelling +4

Structure-Preserving Image Super-Resolution

1 code implementation 26 Sep 2021 Cheng Ma, Yongming Rao, Jiwen Lu, Jie Zhou

Firstly, we propose SPSR with gradient guidance (SPSR-G) by exploiting gradient maps of images to guide the recovery in two aspects.

Image Super-Resolution SSIM

NerfingMVS: Guided Optimization of Neural Radiance Fields for Indoor Multi-view Stereo

1 code implementation ICCV 2021 Yi Wei, Shaohui Liu, Yongming Rao, Wang Zhao, Jiwen Lu, Jie Zhou

In this work, we present a new multi-view depth estimation method that utilizes both conventional reconstruction and learning-based priors over the recently proposed neural radiance fields (NeRF).

Depth Estimation

PoinTr: Diverse Point Cloud Completion with Geometry-Aware Transformers

1 code implementation ICCV 2021 Xumin Yu, Yongming Rao, Ziyi Wang, Zuyan Liu, Jiwen Lu, Jie Zhou

In this paper, we present a new method that reformulates point cloud completion as a set-to-set translation problem and design a new model, called PoinTr, that adopts a transformer encoder-decoder architecture for point cloud completion.

Point Cloud Completion

Counterfactual Attention Learning for Fine-Grained Visual Categorization and Re-identification

1 code implementation ICCV 2021 Yongming Rao, Guangyi Chen, Jiwen Lu, Jie Zhou

Unlike most existing methods that learn visual attention based on conventional likelihood, we propose to learn the attention with counterfactual causality, which provides a tool to measure the attention quality and a powerful supervisory signal to guide the learning process.

Causal Inference Fine-Grained Image Classification +5

RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection

2 code implementations ICCV 2021 Yongming Rao, Benlin Liu, Yi Wei, Jiwen Lu, Cho-Jui Hsieh, Jie Zhou

In particular, we propose to generate random layouts of a scene by making use of the objects in the synthetic CAD dataset and learn the 3D scene representation by applying object-level contrastive learning on two random scenes generated from the same set of synthetic objects.

2D object detection 3D Object Detection +2

Towards Interpretable Deep Metric Learning with Structural Matching

1 code implementation ICCV 2021 Wenliang Zhao, Yongming Rao, Ziyi Wang, Jiwen Lu, Jie Zhou

Our method is model-agnostic and can be applied to off-the-shelf backbone networks and metric learning methods.

Metric Learning

Global Filter Networks for Image Classification

3 code implementations NeurIPS 2021 Yongming Rao, Wenliang Zhao, Zheng Zhu, Jiwen Lu, Jie Zhou

Recent advances in self-attention and pure multi-layer perceptron (MLP) models for vision have shown great potential in achieving promising performance with fewer inductive biases.

Ranked #8 on Image Classification on Stanford Cars (using extra training data)

Classification Domain Generalization +1

DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification

1 code implementation NeurIPS 2021 Yongming Rao, Wenliang Zhao, Benlin Liu, Jiwen Lu, Jie Zhou, Cho-Jui Hsieh

Based on this observation, we propose a dynamic token sparsification framework to prune redundant tokens progressively and dynamically based on the input.

Image Classification

PV-RAFT: Point-Voxel Correlation Fields for Scene Flow Estimation of Point Clouds

1 code implementation CVPR 2021 Yi Wei, Ziyi Wang, Yongming Rao, Jiwen Lu, Jie Zhou

In this paper, we propose a Point-Voxel Recurrent All-Pairs Field Transforms (PV-RAFT) method to estimate scene flow from point clouds.

Scene Flow Estimation

Structure-Preserving Super Resolution with Gradient Guidance

2 code implementations CVPR 2020 Cheng Ma, Yongming Rao, Yean Cheng, Ce Chen, Jiwen Lu, Jie Zhou

In this paper, we propose a structure-preserving super resolution method to alleviate the above issue while maintaining the merits of GAN-based methods to generate perceptually pleasant details.

Image Super-Resolution SSIM

Global-Local Bidirectional Reasoning for Unsupervised Representation Learning of 3D Point Clouds

1 code implementation CVPR 2020 Yongming Rao, Jiwen Lu, Jie Zhou

Based on this hypothesis, we propose to learn point cloud representation by bidirectional reasoning between the local structures at different abstraction hierarchies and the global shape without human supervision.

3D Object Classification General Classification +1

Deep Face Super-Resolution with Iterative Collaboration between Attentive Recovery and Landmark Estimation

1 code implementation CVPR 2020 Cheng Ma, Zhenyu Jiang, Yongming Rao, Jiwen Lu, Jie Zhou

In this paper, we propose a deep face super-resolution (FSR) method with iterative collaboration between two recurrent networks which focus on facial image recovery and landmark estimation respectively.

Super-Resolution

P$^2$GNet: Pose-Guided Point Cloud Generating Networks for 6-DoF Object Pose Estimation

no code implementations 19 Dec 2019 Peiyu Yu, Yongming Rao, Jiwen Lu, Jie Zhou

Humans are able to perform fast and accurate object pose estimation even under severe occlusion by exploiting learned object model priors from everyday life.

6D Pose Estimation 6D Pose Estimation using RGB

COIN: A Large-scale Dataset for Comprehensive Instructional Video Analysis

no code implementations CVPR 2019 Yansong Tang, Dajun Ding, Yongming Rao, Yu Zheng, Danyang Zhang, Lili Zhao, Jiwen Lu, Jie Zhou

There are a substantial number of instructional videos on the Internet, which enable us to acquire knowledge for completing various tasks.

Action Detection

Learning Globally Optimized Object Detector via Policy Gradient

no code implementations CVPR 2018 Yongming Rao, Dahua Lin, Jiwen Lu, Jie Zhou

In this paper, we propose a simple yet effective method to learn a globally optimized detector for object detection, which is a simple modification to the standard cross-entropy gradient inspired by the REINFORCE algorithm.

Object Detection

Runtime Neural Pruning

no code implementations NeurIPS 2017 Ji Lin, Yongming Rao, Jiwen Lu, Jie Zhou

In this paper, we propose a Runtime Neural Pruning (RNP) framework which prunes the deep neural network dynamically at runtime.

Attention-Aware Deep Reinforcement Learning for Video Face Recognition

no code implementations ICCV 2017 Yongming Rao, Jiwen Lu, Jie Zhou

In this paper, we propose an attention-aware deep reinforcement learning (ADRL) method for video face recognition, which aims to discard the misleading and confounding frames and find the focuses of attention in face videos for person recognition.

Face Recognition Person Recognition +1

Learning Discriminative Aggregation Network for Video-Based Face Recognition

no code implementations ICCV 2017 Yongming Rao, Ji Lin, Jiwen Lu, Jie Zhou

In this paper, we propose a discriminative aggregation network (DAN) for video face recognition, which aims to integrate information from video frames effectively and efficiently.

Face Recognition Metric Learning
