Search Results for author: Yunpeng Chen

Found 36 papers, 16 papers with code

Distribution-Aware Single-Stage Models for Multi-Person 3D Pose Estimation

no code implementations 15 Mar 2022 Zitian Wang, Xuecheng Nie, Xiaochao Qu, Yunpeng Chen, Si Liu

In this paper, we present a novel Distribution-Aware Single-stage (DAS) model for tackling the challenging multi-person 3D pose estimation problem.

3D Pose Estimation

SODAR: Segmenting Objects by Dynamically Aggregating Neighboring Mask Representations

no code implementations 15 Feb 2022 Tao Wang, Jun Hao Liew, Yu Li, Yunpeng Chen, Jiashi Feng

Unlike the original per grid cell object masks, SODAR is implicitly supervised to learn mask representations that encode geometric structure of nearby objects and complement adjacent representations with context.

Instance Segmentation Semantic Segmentation

MorphMLP: A Self-Attention Free, MLP-Like Backbone for Image and Video

1 code implementation 24 Nov 2021 David Junhao Zhang, Kunchang Li, Yunpeng Chen, Yali Wang, Shashwat Chandra, Yu Qiao, Luoqi Liu, Mike Zheng Shou

Self-attention has become an integral component of the recent network architectures, e.g., Transformer, that dominate major image and video benchmarks.

Ranked #11 on Action Recognition on Something-Something V2 (using extra training data)

Action Recognition Image Classification +1

Improved Pillar with Fine-grained Feature for 3D Object Detection

no code implementations 12 Oct 2021 Jiahui Fu, Guanghui Ren, Yunpeng Chen, Si Liu

In contrast, the 2D grid-based methods, such as PointPillar, can easily achieve a stable and efficient speed based on simple 2D convolution, but it is hard to get the competitive accuracy limited by the coarse-grained point clouds representation.

3D Object Detection Autonomous Driving

Dense Contrastive Visual-Linguistic Pretraining

no code implementations 24 Sep 2021 Lei Shi, Kai Shuang, Shijie Geng, Peng Gao, Zuohui Fu, Gerard de Melo, Yunpeng Chen, Sen Su

To overcome these issues, we propose unbiased Dense Contrastive Visual-Linguistic Pretraining (DCVLP), which replaces the region regression and classification with cross-modality region contrastive learning that requires no annotations.

Contrastive Learning Data Augmentation +1

PnP-DETR: Towards Efficient Visual Analysis with Transformers

1 code implementation ICCV 2021 Tao Wang, Li Yuan, Yunpeng Chen, Jiashi Feng, Shuicheng Yan

Recently, DETR pioneered the solution of vision tasks with transformers; it directly translates the image feature map into the object detection result.

Object Detection Panoptic Segmentation

Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

10 code implementations ICCV 2021 Li Yuan, Yunpeng Chen, Tao Wang, Weihao Yu, Yujun Shi, Zihang Jiang, Francis EH Tay, Jiashi Feng, Shuicheng Yan

To overcome such limitations, we propose a new Tokens-To-Token Vision Transformer (T2T-ViT), which incorporates 1) a layer-wise Tokens-to-Token (T2T) transformation to progressively structurize the image to tokens by recursively aggregating neighboring Tokens into one Token (Tokens-to-Token), such that local structure represented by surrounding tokens can be modeled and tokens length can be reduced; 2) an efficient backbone with a deep-narrow structure for vision transformer motivated by CNN architecture design after empirical study.

Image Classification Language Modelling

AggMask: Exploring locally aggregated learning of mask representations for instance segmentation

1 code implementation 1 Jan 2021 Tao Wang, Jun Hao Liew, Yu Li, Yunpeng Chen, Jiashi Feng

Recently proposed one-stage instance segmentation models (e.g., SOLO) learn to directly predict location-specific object mask with fully-convolutional networks.

Instance Segmentation Semantic Segmentation

ProxylessKD: Direct Knowledge Distillation with Inherited Classifier for Face Recognition

no code implementations 31 Oct 2020 Weidong Shi, Guanghui Ren, Yunpeng Chen, Shuicheng Yan

We observe that existing knowledge distillation models optimize the proxy tasks that force the student to mimic the teacher's behavior, instead of directly optimizing the face recognition accuracy.

Face Recognition Knowledge Distillation

A Simple Baseline for Pose Tracking in Videos of Crowded Scenes

no code implementations 16 Oct 2020 Li Yuan, Shuning Chang, Ziyuan Huang, Yichen Zhou, Yunpeng Chen, Xuecheng Nie, Francis E. H. Tay, Jiashi Feng, Shuicheng Yan

This paper presents our solution to ACM MM challenge: Large-scale Human-centric Video Analysis in Complex Events \cite{lin2020human}; specifically, here we focus on Track3: Crowd Pose Tracking in Complex Events.

Multi-Object Tracking Optical Flow Estimation +1

Towards Accurate Human Pose Estimation in Videos of Crowded Scenes

no code implementations 16 Oct 2020 Li Yuan, Shuning Chang, Xuecheng Nie, Ziyuan Huang, Yichen Zhou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan

In this paper, we focus on improving human pose estimation in videos of crowded scenes from the perspectives of exploiting temporal context and collecting new data.

Frame Optical Flow Estimation +1

Toward Accurate Person-level Action Recognition in Videos of Crowded Scenes

no code implementations 16 Oct 2020 Li Yuan, Yichen Zhou, Shuning Chang, Ziyuan Huang, Yunpeng Chen, Xuecheng Nie, Tao Wang, Jiashi Feng, Shuicheng Yan

Prior works always fail to deal with this problem in two aspects: (1) lacking utilizing information of the scenes; (2) lacking training data in the crowd and complex scenes.

Action Recognition Action Recognition In Videos +3

ConvBERT: Improving BERT with Span-based Dynamic Convolution

7 code implementations NeurIPS 2020 Zi-Hang Jiang, Weihao Yu, Daquan Zhou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan

The novel convolution heads, together with the remaining self-attention heads, form a new mixed attention block that is more efficient at both global and local context learning.

Natural Language Understanding

Rethinking Bottleneck Structure for Efficient Mobile Network Design

4 code implementations ECCV 2020 Zhou Daquan, Qibin Hou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan

In this paper, we rethink the necessity of such design changes and find it may bring risks of information loss and gradient confusion.

General Classification Neural Architecture Search +1

Semantic Domain Adversarial Networks for Unsupervised Domain Adaptation

1 code implementation 30 Mar 2020 Dapeng Hu, Jian Liang, Qibin Hou, Hanshu Yan, Yunpeng Chen, Shuicheng Yan, Jiashi Feng

To successfully align the multi-modal data structures across domains, the following works exploit discriminative information in the adversarial training process, e.g., using multiple class-wise discriminators and introducing conditional information in input or output of the domain discriminator.

Object Recognition Semantic Segmentation +1

Highly Efficient Salient Object Detection with 100K Parameters

1 code implementation ECCV 2020 Shang-Hua Gao, Yong-Qiang Tan, Ming-Ming Cheng, Chengze Lu, Yunpeng Chen, Shuicheng Yan

Salient object detection models often demand a considerable amount of computation cost to make precise prediction for each pixel, making them hardly applicable on low-power devices.

RGB Salient Object Detection Salient Object Detection

AdversarialNAS: Adversarial Neural Architecture Search for GANs

1 code implementation CVPR 2020 Chen Gao, Yunpeng Chen, Si Liu, Zhenxiong Tan, Shuicheng Yan

In this paper, we propose an AdversarialNAS method specially tailored for Generative Adversarial Networks (GANs) to search for a superior generative model on the task of unconditional image generation.

Image Generation Neural Architecture Search +1

Dynamic Feature Fusion for Semantic Edge Detection

1 code implementation 25 Feb 2019 Yuan Hu, Yunpeng Chen, Xiang Li, Jiashi Feng

In this work, we propose a novel dynamic feature fusion strategy that assigns different fusion weights for different input images and locations adaptively.

Edge Detection

Deep Reasoning with Multi-Scale Context for Salient Object Detection

no code implementations 24 Jan 2019 Zun Li, Congyan Lang, Yunpeng Chen, Junhao Liew, Jiashi Feng

However, the saliency inference module that performs saliency prediction from the fused features receives much less attention on its architecture design and typically adopts only a few fully convolutional layers.

RGB Salient Object Detection Saliency Prediction +1

Graph-Based Global Reasoning Networks

5 code implementations CVPR 2019 Yunpeng Chen, Marcus Rohrbach, Zhicheng Yan, Shuicheng Yan, Jiashi Feng, Yannis Kalantidis

In this work, we propose a new approach for reasoning globally in which a set of features are globally aggregated over the coordinate space and then projected to an interaction space where relational reasoning can be efficiently computed.

Action Classification Action Recognition +3

Multi-Fiber Networks for Video Recognition

no code implementations ECCV 2018 Yunpeng Chen, Yannis Kalantidis, Jianshu Li, Shuicheng Yan, Jiashi Feng

In this paper, we aim to reduce the computational cost of spatio-temporal deep neural networks, making them run as fast as their 2D counterparts while preserving state-of-the-art accuracy on video recognition benchmarks.

Ranked #32 on Action Recognition on UCF101 (using extra training data)

Action Classification Action Recognition +1

Weaving Multi-scale Context for Single Shot Detector

no code implementations 8 Dec 2017 Yunpeng Chen, Jianshu Li, Bin Zhou, Jiashi Feng, Shuicheng Yan

For 320x320 input of batch size = 8, WeaveNet reaches 79.5% mAP on PASCAL VOC 2007 test in 101 fps with only 4 fps extra cost, and further improves to 79.7% mAP with more iterations.

Object Detection

Predicting Scene Parsing and Motion Dynamics in the Future

no code implementations NeurIPS 2017 Xiaojie Jin, Huaxin Xiao, Xiaohui Shen, Jimei Yang, Zhe Lin, Yunpeng Chen, Zequn Jie, Jiashi Feng, Shuicheng Yan

The ability to predict the future is important for intelligent systems, e.g., autonomous vehicles and robots, to plan early and make decisions accordingly.

Autonomous Vehicles Motion Prediction +2

Learning to Segment Human by Watching YouTube

no code implementations 4 Oct 2017 Xiaodan Liang, Yunchao Wei, Liang Lin, Yunpeng Chen, Xiaohui Shen, Jianchao Yang, Shuicheng Yan

An intuition on human segmentation is that when a human is moving in a video, the video-context (e.g., appearance and motion clues) may potentially infer reasonable mask information for the whole human body.

Human Detection Semantic Segmentation +3

Dual Path Networks

15 code implementations NeurIPS 2017 Yunpeng Chen, Jianan Li, Huaxin Xiao, Xiaojie Jin, Shuicheng Yan, Jiashi Feng

In this work, we present a simple, highly efficient and modularized Dual Path Network (DPN) for image classification which presents a new topology of connection paths internally.

Image Classification

Training Group Orthogonal Neural Networks with Privileged Information

no code implementations 24 Jan 2017 Yunpeng Chen, Xiaojie Jin, Jiashi Feng, Shuicheng Yan

Learning rich and diverse representations is critical for the performance of deep convolutional neural networks (CNNs).

Image Classification Semantic Segmentation

Video Scene Parsing with Predictive Feature Learning

no code implementations ICCV 2017 Xiaojie Jin, Xin Li, Huaxin Xiao, Xiaohui Shen, Zhe Lin, Jimei Yang, Yunpeng Chen, Jian Dong, Luoqi Liu, Zequn Jie, Jiashi Feng, Shuicheng Yan

In this way, the network can effectively learn to capture video dynamics and temporal context, which are critical clues for video scene parsing, without requiring extra manual annotations.

Frame Representation Learning +1

Multi-Path Feedback Recurrent Neural Network for Scene Parsing

no code implementations 27 Aug 2016 Xiaojie Jin, Yunpeng Chen, Jiashi Feng, Zequn Jie, Shuicheng Yan

In this paper, we consider the scene parsing problem and propose a novel Multi-Path Feedback recurrent neural network (MPF-RNN) for parsing scene images.

Scene Parsing

Collaborative Layer-wise Discriminative Learning in Deep Neural Networks

no code implementations 19 Jul 2016 Xiaojie Jin, Yunpeng Chen, Jian Dong, Jiashi Feng, Shuicheng Yan

In this paper, we propose a layer-wise discriminative learning method to enhance the discriminative capability of a deep network by allowing its layers to work collaboratively for classification.

General Classification Scene Classification

STC: A Simple to Complex Framework for Weakly-supervised Semantic Segmentation

1 code implementation 10 Sep 2015 Yunchao Wei, Xiaodan Liang, Yunpeng Chen, Xiaohui Shen, Ming-Ming Cheng, Jiashi Feng, Yao Zhao, Shuicheng Yan

Then, a better network called Enhanced-DCNN is learned with supervision from the predicted segmentation masks of simple images based on the Initial-DCNN as well as the image-level annotations.

RGB Salient Object Detection Salient Object Detection +1
