Search Results for author: Lu Sheng

Found 36 papers, 17 papers with code

Fast-BEV: A Fast and Strong Bird's-Eye View Perception Baseline

1 code implementation29 Jan 2023 Yangguang Li, Bin Huang, Zeren Chen, Yufeng Cui, Feng Liang, Mingzhu Shen, Fenggang Liu, Enze Xie, Lu Sheng, Wanli Ouyang, Jing Shao

Our Fast-BEV consists of five parts, We novelly propose (1) a lightweight deployment-friendly view transformation which fast transfers 2D image feature to 3D voxel space, (2) an multi-scale image encoder which leverages multi-scale information for better performance, (3) an efficient BEV encoder which is particularly designed to speed up on-vehicle inference.

Data Augmentation

Improving RGB-D Point Cloud Registration by Learning Multi-scale Local Linear Transformation

1 code implementation31 Aug 2022 ZiMing Wang, Xiaoliang Huo, Zhenghao Chen, Jing Zhang, Lu Sheng, Dong Xu

In addition to previous methods that seek correspondences by hand-crafted or learnt geometric features, recent point cloud registration methods have tried to apply RGB-D data to achieve more accurate correspondence.

Point Cloud Registration

SketchSampler: Sketch-based 3D Reconstruction via View-dependent Depth Sampling

1 code implementation14 Aug 2022 Chenjian Gao, Qian Yu, Lu Sheng, Yi-Zhe Song, Dong Xu

Reconstructing a 3D shape based on a single sketch image is challenging due to the large domain gap between a sparse, irregular sketch and a regular, dense 3D shape.

3D Reconstruction

X-Learner: Learning Cross Sources and Tasks for Universal Visual Representation

no code implementations16 Mar 2022 Yinan He, Gengshi Huang, Siyu Chen, Jianing Teng, Wang Kun, Zhenfei Yin, Lu Sheng, Ziwei Liu, Yu Qiao, Jing Shao

2) Squeeze Stage: X-Learner condenses the model to a reasonable size and learns the universal and generalizable representation for various tasks transferring.

object-detection Object Detection +2

3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds

no code implementations CVPR 2022 Daigang Cai, Lichen Zhao, Jing Zhang, Lu Sheng, Dong Xu

Observing that the 3D captioning task and the 3D grounding task contain both shared and complementary information in nature, in this work, we propose a unified framework to jointly solve these two distinct but closely related tasks in a synergistic fashion, which consists of both shared task-agnostic modules and lightweight task-specific modules.

Dense Captioning Visual Grounding

VoteHMR: Occlusion-Aware Voting Network for Robust 3D Human Mesh Recovery from Partial Point Clouds

no code implementations17 Oct 2021 Guanze Liu, Yu Rong, Lu Sheng

3D human mesh recovery from point clouds is essential for various tasks, including AR/VR and human behavior understanding.

Human Mesh Recovery

Back-tracing Representative Points for Voting-based 3D Object Detection in Point Clouds

1 code implementation CVPR 2021 Bowen Cheng, Lu Sheng, Shaoshuai Shi, Ming Yang, Dong Xu

Inspired by the back-tracing strategy in the conventional Hough voting methods, in this work, we introduce a new 3D object detection method, named as Back-tracing Representative Points Network (BRNet), which generatively back-traces the representative points from the vote centers and also revisits complementary seed points around these generated points, so as to better capture the fine local structural features surrounding the potential objects from the raw point clouds.

3D Object Detection object-detection

DanceFormer: Music Conditioned 3D Dance Generation with Parametric Motion Transformer

1 code implementation18 Mar 2021 Buyu Li, Yongchi Zhao, Zhelun Shi, Lu Sheng

In this paper, we reformulate it by a two-stage process, ie, a key pose generation and then an in-between parametric motion curve prediction, where the key poses are easier to be synchronized with the music beats and the parametric curves can be efficiently regressed to render fluent rhythm-aligned movements.

ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis

1 code implementation CVPR 2021 Yinan He, Bei Gan, Siyu Chen, Yichun Zhou, Guojun Yin, Luchuan Song, Lu Sheng, Jing Shao, Ziwei Liu

To counter this emerging threat, we construct the ForgeryNet dataset, an extremely large face forgery dataset with unified annotations in image- and video-level data across four tasks: 1) Image Forgery Classification, including two-way (real / fake), three-way (real / fake with identity-replaced forgery approaches / fake with identity-remained forgery approaches), and n-way (real and 15 respective forgery approaches) classification.

Classification General Classification +1

3DVG-Transformer: Relation Modeling for Visual Grounding on Point Clouds

no code implementations ICCV 2021 Lichen Zhao, Daigang Cai, Lu Sheng, Dong Xu

Visual grounding on 3D point clouds is an emerging vision and language task that benefits various applications in understanding the 3D visual world.

Object Proposal Generation Visual Grounding

StyleFormer: Real-Time Arbitrary Style Transfer via Parametric Style Composition

1 code implementation ICCV 2021 Xiaolei Wu, Zhihao Hu, Lu Sheng, Dong Xu

In this work, we propose a new feed-forward arbitrary style transfer method, referred to as StyleFormer, which can simultaneously fulfill fine-grained style diversity and semantic content coherency.

Style Transfer

PV-NAS: Practical Neural Architecture Search for Video Recognition

no code implementations2 Nov 2020 ZiHao Wang, Chen Lin, Lu Sheng, Junjie Yan, Jing Shao

Recently, deep learning has been utilized to solve video recognition problem due to its prominent representation ability.

Neural Architecture Search Video Recognition

Adaptive Gradient Method with Resilience and Momentum

no code implementations21 Oct 2020 Jie Liu, Chen Lin, Chuming Li, Lu Sheng, Ming Sun, Junjie Yan, Wanli Ouyang

Several variants of stochastic gradient descent (SGD) have been proposed to improve the learning effectiveness and efficiency when training deep neural networks, among which some recent influential attempts would like to adaptively control the parameter-wise learning rate (e. g., Adam and RMSProp).

Thinking in Frequency: Face Forgery Detection by Mining Frequency-aware Clues

2 code implementations ECCV 2020 Yuyang Qian, Guojun Yin, Lu Sheng, Zixuan Chen, Jing Shao

As realistic facial manipulation technologies have achieved remarkable progress, social concerns about potential malicious abuse of these technologies bring out an emerging research topic of face forgery detection.

Unsupervised Domain Expansion from Multiple Sources

no code implementations26 May 2020 Jing Zhang, Wanqing Li, Lu Sheng, Chang Tang, Philip Ogunbona

Given an existing system learned from previous source domains, it is desirable to adapt the system to new domains without accessing and forgetting all the previous domains in some applications.

Domain Adaptation Unsupervised Domain Expansion

Powering One-shot Topological NAS with Stabilized Share-parameter Proxy

no code implementations ECCV 2020 Ronghao Guo, Chen Lin, Chuming Li, Keyu Tian, Ming Sun, Lu Sheng, Junjie Yan

Specifically, the difficulties for architecture searching in such a complex space has been eliminated by the proposed stabilized share-parameter proxy, which employs Stochastic Gradient Langevin Dynamics to enable fast shared parameter sampling, so as to achieve stabilized measurement of architecture performance even in search space with complex topological structures.

Neural Architecture Search

Morphing and Sampling Network for Dense Point Cloud Completion

2 code implementations30 Nov 2019 Minghua Liu, Lu Sheng, Sheng Yang, Jing Shao, Shi-Min Hu

3D point cloud completion, the task of inferring the complete geometric shape from a partial point cloud, has been attracting attention in the community.

Point Cloud Completion

Visibility Constrained Generative Model for Depth-based 3D Facial Pose Tracking

no code implementations6 May 2019 Lu Sheng, Jianfei Cai, Tat-Jen Cham, Vladimir Pavlovic, King Ngi Ngan

In this paper, we propose a generative framework that unifies depth-based 3D facial pose tracking and face model adaptation on-the-fly, in the unconstrained scenarios with heavy occlusions and arbitrary facial expression variations.

Face Model Pose Estimation +1

Context and Attribute Grounded Dense Captioning

no code implementations CVPR 2019 Guojun Yin, Lu Sheng, Bin Liu, Nenghai Yu, Xiaogang Wang, Jing Shao

Dense captioning aims at simultaneously localizing semantic regions and describing these regions-of-interest (ROIs) with short phrases or sentences in natural language.

Dense Captioning

Video Generation from Single Semantic Label Map

2 code implementations CVPR 2019 Junting Pan, Chengyu Wang, Xu Jia, Jing Shao, Lu Sheng, Junjie Yan, Xiaogang Wang

This paper proposes the novel task of video generation conditioned on a SINGLE semantic label map, which provides a good balance between flexibility and quality in the generation process.

Image Generation Optical Flow Estimation +1

Unsupervised Bi-directional Flow-based Video Generation from one Snapshot

no code implementations3 Mar 2019 Lu Sheng, Junting Pan, Jiaming Guo, Jing Shao, Xiaogang Wang, Chen Change Loy

Imagining multiple consecutive frames given one single snapshot is challenging, since it is difficult to simultaneously predict diverse motions from a single image and faithfully generate novel frames without visual distortions.

Video Generation

Multi-Label Image Classification via Knowledge Distillation from Weakly-Supervised Detection

1 code implementation16 Sep 2018 Yongcheng Liu, Lu Sheng, Jing Shao, Junjie Yan, Shiming Xiang, Chunhong Pan

Specifically, given the image-level annotations, (1) we first develop a weakly-supervised detection (WSD) model, and then (2) construct an end-to-end multi-label image classification framework augmented by a knowledge distillation module that guides the classification model by the WSD model according to the class-level predictions for the whole image and the object-level visual features for object RoIs.

Classification General Classification +3

Zoom-Net: Mining Deep Feature Interactions for Visual Relationship Recognition

no code implementations ECCV 2018 Guojun Yin, Lu Sheng, Bin Liu, Nenghai Yu, Xiaogang Wang, Jing Shao, Chen Change Loy

We show that by encouraging deep message propagation and interactions between local object features and global predicate features, one can achieve compelling performance in recognizing complex relationships without using any linguistic priors.

Avatar-Net: Multi-scale Zero-shot Style Transfer by Feature Decoration

3 code implementations CVPR 2018 Lu Sheng, Ziyi Lin, Jing Shao, Xiaogang Wang

Zero-shot artistic style transfer is an important image synthesis problem aiming at transferring arbitrary style into content images.

Image Generation Image Reconstruction +1

Optical Flow Guided Feature: A Fast and Robust Motion Representation for Video Action Recognition

1 code implementation CVPR 2018 Shuyang Sun, Zhanghui Kuang, Wanli Ouyang, Lu Sheng, Wei zhang

In this study, we introduce a novel compact motion representation for video action recognition, named Optical Flow guided Feature (OFF), which enables the network to distill temporal information through a fast and robust approach.

Action Recognition Action Recognition In Videos +2

A Generative Model for Depth-Based Robust 3D Facial Pose Tracking

no code implementations CVPR 2017 Lu Sheng, Jianfei Cai, Tat-Jen Cham, Vladimir Pavlovic, King Ngi Ngan

We consider the problem of depth-based robust 3D facial pose tracking under unconstrained scenarios with heavy occlusions and arbitrary facial expression variations.

Face Model Pose Estimation +1

Cannot find the paper you are looking for? You can Submit a new open access paper.