Search Results for author: Xiaohui Shen

Found 66 papers, 24 papers with code

Learning Progressive Joint Propagation for Human Motion Prediction

no code implementations ECCV 2020 Yujun Cai, Lin Huang, Yiwei Wang, Tat-Jen Cham, Jianfei Cai, Junsong Yuan, Jun Liu, Xu Yang, Yiheng Zhu, Xiaohui Shen, Ding Liu, Jing Liu, Nadia Magnenat Thalmann

Last, in order to incorporate a general motion space for high-quality prediction, we build a memory-based dictionary, which aims to preserve the global motion patterns in training data to guide the predictions.

Human motion prediction motion prediction

Video Object Detection via Object-level Temporal Aggregation

no code implementations ECCV 2020 Chun-Han Yao, Chen Fang, Xiaohui Shen, Yangyue Wan, Ming-Hsuan Yang

While single-image object detectors can be naively applied to videos in a frame-by-frame fashion, the prediction is often temporally inconsistent.

Video Object Detection

Adversarial Open Domain Adaption for Sketch-to-Photo Synthesis

2 code implementations 12 Apr 2021 Xiaoyu Xiang, Ding Liu, Xiao Yang, Yiheng Zhu, Xiaohui Shen, Jan P. Allebach

In this paper, we explore the open-domain sketch-to-photo translation, which aims to synthesize a realistic photo from a freehand sketch with its class label, even if the sketches of that class are missing in the training data.

Domain Adaptation Image-to-Image Translation +1

DynOcc: Learning Single-View Depth from Dynamic Occlusion Cues

no code implementations 30 Mar 2021 Yifan Wang, Linjie Luo, Xiaohui Shen, Xing Mei

Recently, significant progress has been made in single-view depth estimation thanks to increasingly large and diverse depth datasets.

3D Reconstruction Autonomous Driving +2

Unsupervised Real-world Low-light Image Enhancement with Decoupled Networks

no code implementations 6 May 2020 Wei Xiong, Ding Liu, Xiaohui Shen, Chen Fang, Jiebo Luo

Conventional learning-based approaches to low-light image enhancement typically require a large amount of paired training data, which are difficult to acquire in real-world scenarios.

Low-Light Image Enhancement

Human Motion Transfer from Poses in the Wild

no code implementations 7 Apr 2020 Jian Ren, Menglei Chai, Sergey Tulyakov, Chen Fang, Xiaohui Shen, Jianchao Yang

In this paper, we tackle the problem of human motion transfer, where we synthesize novel motion video for a target person that imitates the movement from a reference video.

EnlightenGAN: Deep Light Enhancement without Paired Supervision

7 code implementations 17 Jun 2019 Yifan Jiang, Xinyu Gong, Ding Liu, Yu Cheng, Chen Fang, Xiaohui Shen, Jianchao Yang, Pan Zhou, Zhangyang Wang

Deep learning-based methods have achieved remarkable success in image restoration and enhancement, but are they still competitive when there is a lack of paired training data?

Image Restoration Low-Light Image Enhancement

Fashion Editing with Adversarial Parsing Learning

no code implementations CVPR 2020 Haoye Dong, Xiaodan Liang, Yixuan Zhang, Xujie Zhang, Zhenyu Xie, Bowen Wu, Ziqi Zhang, Xiaohui Shen, Jian Yin

Interactive fashion image manipulation, which enables users to edit images with sketches and color strokes, is an interesting research problem with great application value.

Human Parsing Image Manipulation

Graphonomy: Universal Human Parsing via Graph Transfer Learning

1 code implementation CVPR 2019 Ke Gong, Yiming Gao, Xiaodan Liang, Xiaohui Shen, Meng Wang, Liang Lin

By distilling universal semantic graph representation to each specific task, Graphonomy is able to predict all levels of parsing labels in one system without piling up the complexity.

Human Parsing Transfer Learning

Regional Homogeneity: Towards Learning Transferable Universal Adversarial Perturbations Against Defenses

1 code implementation ECCV 2020 Yingwei Li, Song Bai, Cihang Xie, Zhenyu Liao, Xiaohui Shen, Alan L. Yuille

We observe the property of regional homogeneity in adversarial perturbations and suggest that the defenses are less robust to regionally homogeneous perturbations.

Object Detection Semantic Segmentation

Sequence-to-Segment Networks for Segment Detection

no code implementations NeurIPS 2018 Zijun Wei, Boyu Wang, Minh Hoai Nguyen, Jianming Zhang, Zhe Lin, Xiaohui Shen, Radomir Mech, Dimitris Samaras

Detecting segments of interest from an input sequence is a challenging problem which often requires not only good knowledge of individual target segments, but also contextual understanding of the entire input sequence and the relationships between the target segments.

Temporal Action Proposal Generation Video Summarization

DeepLens: Shallow Depth Of Field From A Single Image

no code implementations 18 Oct 2018 Lijun Wang, Xiaohui Shen, Jianming Zhang, Oliver Wang, Zhe Lin, Chih-Yao Hsieh, Sarah Kong, Huchuan Lu

To achieve this, we propose a novel neural network model comprised of a depth prediction module, a lens blur module, and a guided upsampling module.

Depth Estimation

Learning to Blend Photos

1 code implementation ECCV 2018 Wei-Chih Hung, Jianming Zhang, Xiaohui Shen, Zhe Lin, Joon-Young Lee, Ming-Hsuan Yang

Specifically, given a foreground image and a background image, our proposed method automatically generates a set of blending photos with scores that indicate the aesthetics quality with the proposed quality network and policy network.

Compositing-aware Image Search

no code implementations ECCV 2018 Hengshuang Zhao, Xiaohui Shen, Zhe Lin, Kalyan Sunkavalli, Brian Price, Jiaya Jia

We present a new image search technique that, given a background image, returns compatible foreground objects for image compositing tasks.

Image Retrieval

Concept Mask: Large-Scale Segmentation from Semantic Concepts

no code implementations ECCV 2018 Yufei Wang, Zhe Lin, Xiaohui Shen, Jianming Zhang, Scott Cohen

Then, we refine and extend the embedding network to predict an attention map, using a curated dataset with bounding box annotations on 750 concepts.

Semantic Segmentation

Learning to Understand Image Blur

no code implementations CVPR 2018 Shanghang Zhang, Xiaohui Shen, Zhe Lin, Radomír Měch, João P. Costeira, José M. F. Moura

In this paper, we propose a unified framework to estimate a spatially-varying blur map and understand its desirability in terms of image quality at the same time.

Look into Person: Joint Body Parsing & Pose Estimation Network and A New Benchmark

3 code implementations 5 Apr 2018 Xiaodan Liang, Ke Gong, Xiaohui Shen, Liang Lin

To further explore and take advantage of the semantic correlation of these two tasks, we propose a novel joint human parsing and pose estimation network to explore efficient context modeling, which can simultaneously predict parsing and pose with extremely high quality.

Human Parsing Pose Estimation +1

Generative Image Inpainting with Contextual Attention

25 code implementations CVPR 2018 Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, Thomas S. Huang

Motivated by these observations, we propose a new deep generative model-based approach which can not only synthesize novel image structures but also explicitly utilize surrounding image features as references during network training to make better predictions.

Image Inpainting

Predicting Scene Parsing and Motion Dynamics in the Future

no code implementations NeurIPS 2017 Xiaojie Jin, Huaxin Xiao, Xiaohui Shen, Jimei Yang, Zhe Lin, Yunpeng Chen, Zequn Jie, Jiashi Feng, Shuicheng Yan

The ability to predict the future is important for intelligent systems, e.g., autonomous vehicles and robots, to plan early and make decisions accordingly.

Autonomous Vehicles motion prediction +2

Scene Parsing with Global Context Embedding

1 code implementation ICCV 2017 Wei-Chih Hung, Yi-Hsuan Tsai, Xiaohui Shen, Zhe Lin, Kalyan Sunkavalli, Xin Lu, Ming-Hsuan Yang

We present a scene parsing method that utilizes global context information based on both the parametric and non-parametric models.

Scene Parsing

Learning to Segment Human by Watching YouTube

no code implementations 4 Oct 2017 Xiaodan Liang, Yunchao Wei, Liang Lin, Yunpeng Chen, Xiaohui Shen, Jianchao Yang, Shuicheng Yan

An intuition on human segmentation is that when a human is moving in a video, the video-context (e.g., appearance and motion clues) may potentially infer reasonable mask information for the whole human body.

Human Detection Semantic Segmentation +2

Personalized Image Aesthetics

no code implementations ICCV 2017 Jian Ren, Xiaohui Shen, Zhe Lin, Radomir Mech, David J. Foran

To accommodate our study, we first collect two distinct datasets, a large image dataset from Flickr and annotated by Amazon Mechanical Turk, and a small dataset of real personal albums rated by owners.

Active Learning

FoveaNet: Perspective-aware Urban Scene Parsing

no code implementations ICCV 2017 Xin Li, Zequn Jie, Wei Wang, Changsong Liu, Jimei Yang, Xiaohui Shen, Zhe Lin, Qiang Chen, Shuicheng Yan, Jiashi Feng

Thus, they suffer from heterogeneous object scales caused by perspective projection of cameras on actual scenes and inevitably encounter parsing failures on distant objects as well as other boundary and recognition errors.

Scene Parsing

Recognizing and Curating Photo Albums via Event-Specific Image Importance

no code implementations 19 Jul 2017 Yufei Wang, Zhe Lin, Xiaohui Shen, Radomir Mech, Gavin Miller, Garrison W. Cottrell

Automatic organization of personal photos is a problem with many real-world applications, and can be divided into two main tasks: recognizing the event type of the photo collection, and selecting interesting images from the collection.

Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition

no code implementations CVPR 2017 Yufei Wang, Zhe Lin, Xiaohui Shen, Scott Cohen, Garrison W. Cottrell

Furthermore, our algorithm can generate descriptions with varied length, benefiting from the separate control of the skeleton and attributes.

Image Captioning Language Modelling

Learning to Predict Indoor Illumination from a Single Image

no code implementations 1 Apr 2017 Marc-André Gardner, Kalyan Sunkavalli, Ersin Yumer, Xiaohui Shen, Emiliano Gambaretto, Christian Gagné, Jean-François Lalonde

We propose an automatic method to infer high dynamic range illumination from a single, limited field-of-view, low dynamic range photograph of an indoor scene.

Recurrent Multimodal Interaction for Referring Image Segmentation

1 code implementation ICCV 2017 Chenxi Liu, Zhe Lin, Xiaohui Shen, Jimei Yang, Xin Lu, Alan Yuille

In this paper we are interested in the problem of image segmentation given natural language descriptions, i.e., referring expressions.

Semantic Segmentation

Interpretable Structure-Evolving LSTM

no code implementations CVPR 2017 Xiaodan Liang, Liang Lin, Xiaohui Shen, Jiashi Feng, Shuicheng Yan, Eric P. Xing

Instead of learning LSTM models over the pre-fixed structures, we propose to further learn the intermediate interpretable multi-level graph structures in a progressive and stochastic way from data during the LSTM network optimization.

Small Data Image Classification

Learning to Detect Multiple Photographic Defects

1 code implementation 6 Dec 2016 Ning Yu, Xiaohui Shen, Zhe Lin, Radomir Mech, Connelly Barnes

Our new dataset enables us to formulate the problem as a multi-task learning problem and train a multi-column deep convolutional neural network (CNN) to simultaneously predict the severity of all the defects.

Defect Detection Multi-Task Learning

SURGE: Surface Regularized Geometry Estimation from a Single Image

no code implementations NeurIPS 2016 Peng Wang, Xiaohui Shen, Bryan Russell, Scott Cohen, Brian Price, Alan L. Yuille

This paper introduces an approach to regularize 2.5D surface normal and depth predictions at each pixel given a single input image.

Video Scene Parsing with Predictive Feature Learning

no code implementations ICCV 2017 Xiaojie Jin, Xin Li, Huaxin Xiao, Xiaohui Shen, Zhe Lin, Jimei Yang, Yunpeng Chen, Jian Dong, Luoqi Liu, Zequn Jie, Jiashi Feng, Shuicheng Yan

In this way, the network can effectively learn to capture video dynamics and temporal context, which are critical clues for video scene parsing, without requiring extra manual annotations.

Representation Learning Scene Parsing

Top-down Neural Attention by Excitation Backprop

2 code implementations 1 Aug 2016 Jianming Zhang, Zhe Lin, Jonathan Brandt, Xiaohui Shen, Stan Sclaroff

We aim to model the top-down attention of a Convolutional Neural Network (CNN) classifier for generating task-specific attention maps.

Salient Object Subitizing

no code implementations CVPR 2015 Jianming Zhang, Shugao Ma, Mehrnoosh Sameki, Stan Sclaroff, Margrit Betke, Zhe Lin, Xiaohui Shen, Brian Price, Radomir Mech

We study the problem of Salient Object Subitizing, i.e., predicting the existence and the number of salient objects in an image using holistic cues.

Image Retrieval RGB Salient Object Detection +1

Progressive Attention Networks for Visual Attribute Prediction

1 code implementation 8 Jun 2016 Paul Hongsuck Seo, Zhe Lin, Scott Cohen, Xiaohui Shen, Bohyung Han

We propose a novel attention model that can accurately attend to target objects of various scales and shapes in images.

Photo Aesthetics Ranking Network with Attributes and Content Adaptation

2 code implementations 6 Jun 2016 Shu Kong, Xiaohui Shen, Zhe Lin, Radomir Mech, Charless Fowlkes

In this work, we propose to learn a deep convolutional neural network to rank photo aesthetics in which the relative ranking of photo aesthetics is directly modeled in the loss function.

Aesthetics Quality Assessment

Event-Specific Image Importance

no code implementations CVPR 2016 Yufei Wang, Zhe Lin, Xiaohui Shen, Radomir Mech, Gavin Miller, Garrison W. Cottrell

In this paper, we show that the selection of important images is consistent among different viewers, and that this selection process is related to the event type of the album.

A Multi-Level Contextual Model For Person Recognition in Photo Albums

no code implementations CVPR 2016 Haoxiang Li, Jonathan Brandt, Zhe Lin, Xiaohui Shen, Gang Hua

Our new framework enables efficient use of these complementary multi-level contextual cues to improve overall recognition rates on the photo album person recognition task, as demonstrated through state-of-the-art results on a challenging public dataset.

Person Recognition

Shortlist Selection With Residual-Aware Distance Estimator for K-Nearest Neighbor Search

no code implementations CVPR 2016 Jae-Pil Heo, Zhe Lin, Xiaohui Shen, Jonathan Brandt, Sung-Eui Yoon

We have tested the proposed method with the inverted index and multi-index on a diverse set of benchmarks including up to one billion data points with varying dimensions, and found that our method robustly improves the accuracy of shortlists (up to 127% relatively higher) over the state-of-the-art techniques with a comparable or even faster computational cost.


Semantic Object Parsing with Graph LSTM

no code implementations 23 Mar 2016 Xiaodan Liang, Xiaohui Shen, Jiashi Feng, Liang Lin, Shuicheng Yan

By taking the semantic object parsing task as an exemplar application scenario, we propose the Graph Long Short-Term Memory (Graph LSTM) network, which is the generalization of LSTM from sequential data or multi-dimensional data to general graph-structured data.

Minimum Barrier Salient Object Detection at 80 FPS

no code implementations ICCV 2015 Jianming Zhang, Stan Sclaroff, Zhe Lin, Xiaohui Shen, Brian Price, Radomir Mech

Powered by this fast MBD transform algorithm, the proposed salient object detection method runs at 80 FPS, and significantly outperforms previous methods with similar speed on four large benchmark datasets, and achieves comparable or better performance than state-of-the-art methods.

Ranked #6 on Video Salient Object Detection on DAVSOD-easy35 (using extra training data)

Salient Object Detection Video Salient Object Detection

Human Parsing With Contextualized Convolutional Neural Network

no code implementations ICCV 2015 Xiaodan Liang, Chunyan Xu, Xiaohui Shen, Jianchao Yang, Si Liu, Jinhui Tang, Liang Lin, Shuicheng Yan

In this work, we address the human parsing task with a novel Contextualized Convolutional Neural Network (Co-CNN) architecture, which well integrates the cross-layer context, global image-level context, within-super-pixel context and cross-super-pixel neighborhood context into a unified network.

Human Parsing

Semantic Object Parsing with Local-Global Long Short-Term Memory

no code implementations CVPR 2016 Xiaodan Liang, Xiaohui Shen, Donglai Xiang, Jiashi Feng, Liang Lin, Shuicheng Yan

The long chains of sequential computation by stacked LG-LSTM layers also enable each pixel to sense a much larger region for inference benefiting from the memorization of previous dependencies in all positions along all dimensions.

Reversible Recursive Instance-level Object Segmentation

no code implementations CVPR 2016 Xiaodan Liang, Yunchao Wei, Xiaohui Shen, Zequn Jie, Jiashi Feng, Liang Lin, Shuicheng Yan

By being reversible, the proposal refinement sub-network adaptively determines an optimal number of refinement iterations required for each proposal during both training and testing.

Denoising Semantic Segmentation

Automatic Content-Aware Color and Tone Stylization

no code implementations CVPR 2016 Joon-Young Lee, Kalyan Sunkavalli, Zhe Lin, Xiaohui Shen, In So Kweon

We introduce a new technique that automatically generates diverse, visually compelling stylizations for a photograph in an unsupervised manner.

Style Transfer

STC: A Simple to Complex Framework for Weakly-supervised Semantic Segmentation

1 code implementation 10 Sep 2015 Yunchao Wei, Xiaodan Liang, Yunpeng Chen, Xiaohui Shen, Ming-Ming Cheng, Jiashi Feng, Yao Zhao, Shuicheng Yan

Then, a better network called Enhanced-DCNN is learned with supervision from the predicted segmentation masks of simple images based on the Initial-DCNN as well as the image-level annotations.

RGB Salient Object Detection Salient Object Detection +1

LCNN: Low-level Feature Embedded CNN for Salient Object Detection

no code implementations 17 Aug 2015 Hongyang Li, Huchuan Lu, Zhe Lin, Xiaohui Shen, Brian Price

In this paper, we propose a novel deep neural network framework embedded with low-level features (LCNN) for salient object detection in complex images.

RGB Salient Object Detection Salient Object Detection

Towards Unified Depth and Semantic Prediction From a Single Image

no code implementations CVPR 2015 Peng Wang, Xiaohui Shen, Zhe Lin, Scott Cohen, Brian Price, Alan L. Yuille

By allowing for interactions between the depth and semantic information, the joint network provides more accurate depth prediction than a state-of-the-art CNN trained solely for depth prediction [5].

Depth Estimation Semantic Segmentation

A Convolutional Neural Network Cascade for Face Detection

no code implementations CVPR 2015 Haoxiang Li, Zhe Lin, Xiaohui Shen, Jonathan Brandt, Gang Hua

To improve localization effectiveness, and reduce the number of candidates at later stages, we introduce a CNN-based calibration stage after each of the detection stages in the cascade.

Face Detection

Inner and Inter Label Propagation: Salient Object Detection in the Wild

2 code implementations 27 May 2015 Hongyang Li, Huchuan Lu, Zhe Lin, Xiaohui Shen, Brian Price

For most natural images, some boundary superpixels serve as the background labels and the saliency of other superpixels is determined by ranking their similarities to the boundary labels based on an inner propagation scheme.

RGB Salient Object Detection Saliency Detection +1

Joint Object and Part Segmentation using Deep Learned Potentials

no code implementations ICCV 2015 Peng Wang, Xiaohui Shen, Zhe Lin, Scott Cohen, Brian Price, Alan Yuille

Segmenting semantic objects from images and parsing them into their respective semantic parts are fundamental steps towards detailed object understanding in computer vision.

Semantic Segmentation

Matching-CNN Meets KNN: Quasi-Parametric Human Parsing

no code implementations CVPR 2015 Si Liu, Xiaodan Liang, Luoqi Liu, Xiaohui Shen, Jianchao Yang, Changsheng Xu, Liang Lin, Xiaochun Cao, Shuicheng Yan

Under the classic K Nearest Neighbor (KNN)-based nonparametric framework, the parametric Matching Convolutional Neural Network (M-CNN) is proposed to predict the matching confidence and displacements of the best matched region in the testing image for a particular semantic region in one KNN image.

Human Parsing

Deep Human Parsing with Active Template Regression

1 code implementation 9 Mar 2015 Xiaodan Liang, Si Liu, Xiaohui Shen, Jianchao Yang, Luoqi Liu, Jian Dong, Liang Lin, Shuicheng Yan

The first CNN network is with max-pooling, and designed to predict the template coefficients for each label mask, while the second CNN network is without max-pooling to preserve sensitivity to label mask position and accurately predict the active shape parameters.

Human Parsing

Efficient Boosted Exemplar-based Face Detection

no code implementations CVPR 2014 Haoxiang Li, Zhe Lin, Jonathan Brandt, Xiaohui Shen, Gang Hua

Despite the fact that face detection has been studied intensively over the past several decades, the problem is still not completely solved.

Face Detection

Towards Unified Human Parsing and Pose Estimation

no code implementations CVPR 2014 Jian Dong, Qiang Chen, Xiaohui Shen, Jianchao Yang, Shuicheng Yan

We study the problem of human body configuration analysis, more specifically, human parsing and human pose estimation.

Human Parsing Pose Estimation

Detecting and Aligning Faces by Image Retrieval

no code implementations CVPR 2013 Xiaohui Shen, Zhe Lin, Jonathan Brandt, Ying Wu

In order to overcome these challenges, we present a novel and robust exemplar-based face detector that integrates image retrieval and discriminative learning.

Face Alignment Face Detection +2
