Search Results for author: Bolei Zhou

Found 80 papers, 42 papers with code

Learning to Simulate Self-Driven Particles System with Coordinated Policy Optimization

no code implementations 26 Oct 2021 Zhenghao Peng, Quanyi Li, Ka Ming Hui, Chunxiao Liu, Bolei Zhou

Self-Driven Particles (SDP) describe a category of multi-agent systems common in everyday life, such as flocking birds and traffic flows.

STransGAN: An Empirical Study on Transformer in GANs

no code implementations 25 Oct 2021 Rui Xu, Xiangyu Xu, Kai Chen, Bolei Zhou, Chen Change Loy

In this paper, we conduct a comprehensive empirical study to investigate the intrinsic properties of Transformer in GAN for high-fidelity image synthesis.

Image Generation

Safe Driving via Expert Guided Policy Optimization

no code implementations 13 Oct 2021 Zhenghao Peng, Quanyi Li, Chunxiao Liu, Bolei Zhou

Offline RL technique is further used to learn from the partial demonstration generated by the expert.

Offline RL

MetaDrive: Composing Diverse Driving Scenarios for Generalizable Reinforcement Learning

1 code implementation 26 Sep 2021 Quanyi Li, Zhenghao Peng, Zhenghai Xue, Qihang Zhang, Bolei Zhou

Driving safely requires multiple capabilities from human and intelligent agents, such as the generalizability to unseen environments, the decision making in complex multi-agent settings, and the safety awareness of the surrounding traffic.

Decision Making Safe Exploration

Interpreting Generative Adversarial Networks for Interactive Image Generation

no code implementations 10 Aug 2021 Bolei Zhou

Great progress has been made by the advances in Generative Adversarial Networks (GANs) for image generation.

Image Generation

Safe Exploration by Solving Early Terminated MDP

no code implementations 9 Jul 2021 Hao Sun, Ziping Xu, Meng Fang, Zhenghao Peng, Jiadong Guo, Bo Dai, Bolei Zhou

Safe exploration is crucial for the real-world application of reinforcement learning (RL).

Safe Exploration

Data-Efficient Instance Generation from Instance Discrimination

1 code implementation 8 Jun 2021 Ceyuan Yang, Yujun Shen, Yinghao Xu, Bolei Zhou

Meanwhile, the learned instance discrimination capability from the discriminator is in turn exploited to encourage the generator for diverse generation.

Data Augmentation Image Generation

Multimodal Motion Prediction with Stacked Transformers

no code implementations CVPR 2021 Yicheng Liu, Jinghuai Zhang, Liangji Fang, Qinhong Jiang, Bolei Zhou

Predicting multiple plausible future trajectories of the nearby vehicles is crucial for the safety of autonomous driving.

Autonomous Driving motion prediction

Unsupervised Image Transformation Learning via Generative Adversarial Networks

no code implementations 13 Mar 2021 Kaiwen Zha, Yujun Shen, Bolei Zhou

In this work, we study the image transformation problem by learning the underlying transformations from a collection of images using Generative Adversarial Networks (GANs).

Image Generation

Instance Localization for Self-supervised Detection Pretraining

1 code implementation CVPR 2021 Ceyuan Yang, Zhirong Wu, Bolei Zhou, Stephen Lin

The pretext task is to predict the instance category given the composited images as well as the foreground bounding boxes.

Classification General Classification +4

Deep Learning for Scene Classification: A Survey

no code implementations 26 Jan 2021 Delu Zeng, Minyu Liao, Mohammad Tavakolian, Yulan Guo, Bolei Zhou, Dewen Hu, Matti Pietikäinen, Li Liu

Scene classification, aiming at classifying a scene image to one of the predefined scene categories by comprehending the entire image, is a longstanding, fundamental and challenging problem in computer vision.

Classification General Classification +1

GAN Inversion: A Survey

1 code implementation 14 Jan 2021 Weihao Xia, Yulun Zhang, Yujiu Yang, Jing-Hao Xue, Bolei Zhou, Ming-Hsuan Yang

GAN inversion aims to invert a given image back into the latent space of a pretrained GAN model, for the image to be faithfully reconstructed from the inverted code by the generator.

GAN inversion Image Manipulation +1

Self-Supervised Continuous Control without Policy Gradient

no code implementations 1 Jan 2021 Hao Sun, Ziping Xu, Meng Fang, Yuhang Song, Jiechao Xiong, Bo Dai, Zhengyou Zhang, Bolei Zhou

Despite the remarkable progress made by the policy gradient algorithms in reinforcement learning (RL), sub-optimal policies usually result from the local exploration property of the policy gradient update.

Continuous Control Policy Gradient Methods +1

Improving the Generalization of End-to-End Driving through Procedural Generation

2 code implementations 26 Dec 2020 Quanyi Li, Zhenghao Peng, Qihang Zhang, Chunxiao Liu, Bolei Zhou

We validate that training with the increasing number of procedurally generated scenes significantly improves the generalization of the agent across scenarios of different traffic densities and road networks.

Autonomous Driving

Improving the Fairness of Deep Generative Models without Retraining

1 code implementation 9 Dec 2020 Shuhan Tan, Yujun Shen, Bolei Zhou

Generative Adversarial Networks (GANs) advance face synthesis through learning the underlying distribution of observed data.

Face Generation Face Recognition +2

Positional Encoding as Spatial Inductive Bias in GANs

no code implementations CVPR 2021 Rui Xu, Xintao Wang, Kai Chen, Bolei Zhou, Chen Change Loy

In this work, taking SinGAN and StyleGAN2 as examples, we show that such capability, to a large extent, is brought by the implicit positional encoding when using zero padding in the generators.

Image Manipulation Translation

Texture Memory-Augmented Deep Patch-Based Image Inpainting

1 code implementation 28 Sep 2020 Rui Xu, Minghao Guo, Jiaqi Wang, Xiaoxiao Li, Bolei Zhou, Chen Change Loy

By bringing together the best of both paradigms, we propose a new deep inpainting framework where texture generation is guided by a texture memory of patch samples extracted from unmasked regions.

Image Inpainting Texture Synthesis

Understanding the Role of Individual Units in a Deep Neural Network

2 code implementations 10 Sep 2020 David Bau, Jun-Yan Zhu, Hendrik Strobelt, Agata Lapedriza, Bolei Zhou, Antonio Torralba

Second, we use a similar analytic method to analyze a generative adversarial network (GAN) model trained to generate scenes.

Image Classification Image Generation +1

A Unified Framework for Shot Type Classification Based on Subject Centric Lens

no code implementations ECCV 2020 Anyi Rao, Jiaze Wang, Linning Xu, Xuekun Jiang, Qingqiu Huang, Bolei Zhou, Dahua Lin

Shots are key narrative elements of various videos, e.g., movies, TV series, and user-generated videos that are thriving over the Internet.

General Classification

Generative Hierarchical Features from Synthesizing Images

1 code implementation CVPR 2021 Yinghao Xu, Yujun Shen, Jiapeng Zhu, Ceyuan Yang, Bolei Zhou

Generative Adversarial Networks (GANs) have recently advanced image synthesis by learning the underlying distribution of the observed data.

Face Verification Image Classification +2

Closed-Form Factorization of Latent Semantics in GANs

8 code implementations CVPR 2021 Yujun Shen, Bolei Zhou

A rich set of interpretable dimensions has been shown to emerge in the latent space of the Generative Adversarial Networks (GANs) trained for synthesizing images.

Image Generation Image Manipulation

Unsupervised Landmark Learning from Unpaired Data

1 code implementation 29 Jun 2020 Yinghao Xu, Ceyuan Yang, Ziwei Liu, Bo Dai, Bolei Zhou

Recent attempts for unsupervised landmark learning leverage synthesized image pairs that are similar in appearance but different in poses.

Video Representation Learning with Visual Tempo Consistency

1 code implementation 28 Jun 2020 Ceyuan Yang, Yinghao Xu, Bo Dai, Bolei Zhou

Visual tempo, which describes how fast an action goes, has shown its potential in supervised action recognition.

Action Anticipation Action Detection +3

Non-local Policy Optimization via Diversity-regularized Collaborative Exploration

no code implementations 14 Jun 2020 Zhenghao Peng, Hao Sun, Bolei Zhou

Conventional Reinforcement Learning (RL) algorithms usually have one single agent learning to solve the task independently.

Zeroth-Order Supervised Policy Improvement

no code implementations 11 Jun 2020 Hao Sun, Ziping Xu, Yuhang Song, Meng Fang, Jiechao Xiong, Bo Dai, Bolei Zhou

However, PG algorithms rely on exploiting the value function being learned with the first-order update locally, which results in limited sample efficiency.

Continuous Control Policy Gradient Methods +1

Novel Policy Seeking with Constrained Optimization

no code implementations 21 May 2020 Hao Sun, Zhenghao Peng, Bo Dai, Jian Guo, Dahua Lin, Bolei Zhou

In problem-solving, we humans can come up with multiple novel solutions to the same problem.

InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs

2 code implementations 18 May 2020 Yujun Shen, Ceyuan Yang, Xiaoou Tang, Bolei Zhou

In this work, we propose a framework called InterFaceGAN to interpret the disentangled face representation learned by the state-of-the-art GAN models and study the properties of the facial semantics encoded in the latent space.

Face Generation GAN inversion

Evolutionary Stochastic Policy Distillation

2 code implementations 27 Apr 2020 Hao Sun, Xinyu Pan, Bo Dai, Dahua Lin, Bolei Zhou

Solving the Goal-Conditioned Reward Sparse (GCRS) task is a challenging reinforcement learning problem due to the sparsity of reward signals.

TPNet: Trajectory Proposal Network for Motion Prediction

no code implementations CVPR 2020 Liangji Fang, Qinhong Jiang, Jianping Shi, Bolei Zhou

However, it remains difficult for these methods to provide multimodal predictions as well as integrate physical constraints such as traffic rules and movable areas.

Autonomous Driving motion prediction +1

Temporal Pyramid Network for Action Recognition

3 code implementations CVPR 2020 Ceyuan Yang, Yinghao Xu, Jianping Shi, Bo Dai, Bolei Zhou

Previous works often capture the visual tempo through sampling raw videos at multiple rates and constructing an input-level frame pyramid, which usually requires a costly multi-branch network to handle.

Action Recognition

A Local-to-Global Approach to Multi-modal Movie Scene Segmentation

2 code implementations CVPR 2020 Anyi Rao, Linning Xu, Yu Xiong, Guodong Xu, Qingqiu Huang, Bolei Zhou, Dahua Lin

Scene, as the crucial unit of storytelling in movies, contains complex activities of actors and their interactions in a physical environment.

Action Recognition Scene Segmentation

TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting

no code implementations CVPR 2020 Zhuoqian Yang, Wentao Zhu, Wayne Wu, Chen Qian, Qiang Zhou, Bolei Zhou, Chen Change Loy

We present a lightweight video motion retargeting approach TransMoMo that is capable of transferring motion of a person in a source video realistically to another video of a target person.

motion retargeting

In-Domain GAN Inversion for Real Image Editing

3 code implementations ECCV 2020 Jiapeng Zhu, Yujun Shen, Deli Zhao, Bolei Zhou

A common practice of feeding a real image to a trained GAN generator is to invert it back to a latent code.

GAN inversion Image Interpolation +1

Image Processing Using Multi-Code GAN Prior

1 code implementation CVPR 2020 Jinjin Gu, Yujun Shen, Bolei Zhou

Such an over-parameterization of the latent space significantly improves the image reconstruction quality, outperforming existing competitors.

Blind Face Restoration Colorization +5

Learning a Decision Module by Imitating Driver's Control Behaviors

no code implementations 30 Nov 2019 Junning Huang, Sirui Xie, Jiankai Sun, Qiurui Ma, Chunxiao Liu, Jianping Shi, Dahua Lin, Bolei Zhou

In this work, we propose a hybrid framework to learn neural decisions in the classical modular pipeline through end-to-end imitation learning.

Autonomous Driving Imitation Learning

Every Frame Counts: Joint Learning of Video Segmentation and Optical Flow

no code implementations 28 Nov 2019 Mingyu Ding, Zhe Wang, Bolei Zhou, Jianping Shi, Zhiwu Lu, Ping Luo

Moreover, our framework is able to utilize both labeled and unlabeled frames in the video through joint training, while no additional calculation is required in inference.

Optical Flow Estimation Semantic Segmentation +2

Semantic Hierarchy Emerges in Deep Generative Representations for Scene Synthesis

2 code implementations 21 Nov 2019 Ceyuan Yang, Yujun Shen, Bolei Zhou

Despite the success of Generative Adversarial Networks (GANs) in image synthesis, there is little understanding of what generative models have learned inside their deep generative representations and how photo-realistic images can be composed from the layer-wise stochasticity introduced in recent GANs.

Image Generation

Policy Continuation with Hindsight Inverse Dynamics

1 code implementation NeurIPS 2019 Hao Sun, Zhizhong Li, Xiaotong Liu, Dahua Lin, Bolei Zhou

This approach learns from Hindsight Inverse Dynamics based on Hindsight Experience Replay, enabling the learning process in a self-imitated manner and thus can be trained with supervised learning.

A Graph-Based Framework to Bridge Movies and Synopses

no code implementations ICCV 2019 Yu Xiong, Qingqiu Huang, Lingfeng Guo, Hang Zhou, Bolei Zhou, Dahua Lin

On top of this dataset, we develop a framework to perform matching between movie segments and synopsis paragraphs.

Reasoning About Human-Object Interactions Through Dual Attention Networks

no code implementations ICCV 2019 Tete Xiao, Quanfu Fan, Dan Gutfreund, Mathew Monfort, Aude Oliva, Bolei Zhou

The model not only finds when an action is happening and which object is being manipulated, but also identifies which part of the object is being interacted with.

Human-Object Interaction Detection

Interpreting the Latent Space of GANs for Semantic Face Editing

4 code implementations CVPR 2020 Yujun Shen, Jinjin Gu, Xiaoou Tang, Bolei Zhou

In this work, we propose a novel framework, called InterFaceGAN, for semantic face editing by interpreting the latent semantics learned by GANs.

Face Generation GAN inversion +1

Disentangled Inference for GANs with Latently Invertible Autoencoder

3 code implementations 19 Jun 2019 Jiapeng Zhu, Deli Zhao, Bo Zhang, Bolei Zhou

In this paper, we show that the entanglement of the latent space for the VAE/GAN framework poses the main challenge for encoder learning.

Cross-view Semantic Segmentation for Sensing Surroundings

no code implementations 9 Jun 2019 Bowen Pan, Jiankai Sun, Ho Yin Tiga Leung, Alex Andonian, Bolei Zhou

Our further experiment on a LoCoBot robot shows that our model enables the surrounding sensing capability from 2D image input.

Domain Adaptation Semantic Segmentation

Deep Flow-Guided Video Inpainting

2 code implementations CVPR 2019 Rui Xu, Xiaoxiao Li, Bolei Zhou, Chen Change Loy

Then the synthesized flow field is used to guide the propagation of pixels to fill up the missing regions in the video.

One-shot visual object segmentation Optical Flow Estimation +2

On the Units of GANs (Extended Abstract)

no code implementations 29 Jan 2019 David Bau, Jun-Yan Zhu, Hendrik Strobelt, Bolei Zhou, Joshua B. Tenenbaum, William T. Freeman, Antonio Torralba

We quantify the causal effect of interpretable units by measuring the ability of interventions to control objects in the output.

Proceedings of AAAI 2019 Workshop on Network Interpretability for Deep Learning

no code implementations 25 Jan 2019 Quanshi Zhang, Lixin Fan, Bolei Zhou

This is the Proceedings of the AAAI 2019 Workshop on Network Interpretability for Deep Learning.

FaceFeat-GAN: a Two-Stage Approach for Identity-Preserving Face Synthesis

no code implementations 4 Dec 2018 Yujun Shen, Bolei Zhou, Ping Luo, Xiaoou Tang

In the second stage, they compete in the image domain to render photo-realistic images that contain high diversity but preserve identity.

Face Generation

Single Image Intrinsic Decomposition without a Single Intrinsic Image

no code implementations ECCV 2018 Wei-Chiu Ma, Hang Chu, Bolei Zhou, Raquel Urtasun, Antonio Torralba

At inference time, our model can be easily reduced to a single stream module that performs intrinsic decomposition on a single input image.

Intrinsic Image Decomposition

Interpretable Basis Decomposition for Visual Explanation

1 code implementation ECCV 2018 Bolei Zhou, Yiyou Sun, David Bau, Antonio Torralba

Explanations of the decisions made by a deep neural network are important for human end-users to be able to understand and diagnose the trustworthiness of the system.

Unified Perceptual Parsing for Scene Understanding

15 code implementations ECCV 2018 Tete Xiao, Yingcheng Liu, Bolei Zhou, Yuning Jiang, Jian Sun

In this paper, we study a new task called Unified Perceptual Parsing, which requires the machine vision systems to recognize as many visual concepts as possible from a given image.

Scene Understanding Semantic Segmentation

Revisiting the Importance of Individual Units in CNNs via Ablation

no code implementations 7 Jun 2018 Bolei Zhou, Yiyou Sun, David Bau, Antonio Torralba

We confirm that unit attributes such as class selectivity are a poor predictor for impact on overall accuracy, as found previously in recent work (Morcos et al., 2018).

General Classification

DeepMiner: Discovering Interpretable Representations for Mammogram Classification and Explanation

no code implementations 31 May 2018 Jimmy Wu, Bolei Zhou, Diondra Peck, Scott Hsieh, Vandana Dialani, Lester Mackey, Genevieve Patterson

We propose DeepMiner, a framework to discover interpretable representations in deep neural networks and to build explanations for medical predictions.

Classification General Classification +1

Recurrent Residual Module for Fast Inference in Videos

no code implementations CVPR 2018 Bowen Pan, Wuwei Lin, Xiaolin Fang, Chaoqin Huang, Bolei Zhou, Cewu Lu

Deep convolutional neural networks (CNNs) have made impressive progress in many video recognition tasks such as video pose estimation and video object detection.

Pose Estimation Video Object Detection +1

Temporal Relational Reasoning in Videos

3 code implementations ECCV 2018 Bolei Zhou, Alex Andonian, Aude Oliva, Antonio Torralba

Temporal relational reasoning, the ability to link meaningful transformations of objects or entities over time, is a fundamental property of intelligent species.

Action Classification Action Recognition +3

Interpreting Deep Visual Representations via Network Dissection

1 code implementation 15 Nov 2017 Bolei Zhou, David Bau, Aude Oliva, Antonio Torralba

In this work, we describe Network Dissection, a method that interprets networks by providing labels for the units of their deep visual representations.

Visual Question Generation as Dual Task of Visual Question Answering

no code implementations CVPR 2018 Yikang Li, Nan Duan, Bolei Zhou, Xiao Chu, Wanli Ouyang, Xiaogang Wang

Visual question answering (VQA) and visual question generation (VQG) are two trending topics in computer vision that have so far been explored separately.

Question Answering Question Generation +1

Scene Graph Generation from Objects, Phrases and Region Captions

1 code implementation ICCV 2017 Yikang Li, Wanli Ouyang, Bolei Zhou, Kun Wang, Xiaogang Wang

Object detection, scene graph generation and region captioning, which are three scene understanding tasks at different semantic levels, are tied together: scene graphs are generated on top of objects detected in an image with their pairwise relationship predicted, while region captioning gives a language description of the objects, their attributes, relations, and other context information.

Graph Generation Object Detection +2

Scene Parsing Through ADE20K Dataset

no code implementations CVPR 2017 Bolei Zhou, Hang Zhao, Xavier Puig, Sanja Fidler, Adela Barriuso, Antonio Torralba

A novel network design called Cascade Segmentation Module is proposed to parse a scene into stuff, objects, and object parts in a cascade and improve over the baselines.

Scene Parsing

Network Dissection: Quantifying Interpretability of Deep Visual Representations

no code implementations CVPR 2017 David Bau, Bolei Zhou, Aditya Khosla, Aude Oliva, Antonio Torralba

Given any CNN model, the proposed method draws on a broad data set of visual concepts to score the semantics of hidden units at each intermediate convolutional layer.

Open Vocabulary Scene Parsing

no code implementations ICCV 2017 Hang Zhao, Xavier Puig, Bolei Zhou, Sanja Fidler, Antonio Torralba

Recognizing arbitrary objects in the wild has been a challenging problem due to the limitations of existing classification models and datasets.

General Classification Scene Parsing

SegICP: Integrated Deep Semantic Segmentation and Pose Estimation

2 code implementations 5 Mar 2017 Jay M. Wong, Vincent Kee, Tiffany Le, Syler Wagner, Gian-Luca Mariottini, Abraham Schneider, Lei Hamilton, Rahul Chipalkatty, Mitchell Hebert, David M. S. Johnson, Jimmy Wu, Bolei Zhou, Antonio Torralba

Recent robotic manipulation competitions have highlighted that sophisticated robots still struggle to achieve fast and reliable perception of task-relevant objects in complex, realistic scenarios.

Motion Capture Object Recognition +3

Person Search with Natural Language Description

1 code implementation CVPR 2017 Shuang Li, Tong Xiao, Hongsheng Li, Bolei Zhou, Dayu Yue, Xiaogang Wang

Searching persons in large-scale image databases with the query of natural language description has important applications in video surveillance.

Person Search Text based Person Retrieval

Places: An Image Database for Deep Scene Understanding

no code implementations 6 Oct 2016 Bolei Zhou, Aditya Khosla, Agata Lapedriza, Antonio Torralba, Aude Oliva

The rise of multi-million-item dataset initiatives has enabled data-hungry machine learning algorithms to reach near-human semantic classification at tasks such as object and scene recognition.

Classification General Classification +3

Semantic Understanding of Scenes through the ADE20K Dataset

20 code implementations 18 Aug 2016 Bolei Zhou, Hang Zhao, Xavier Puig, Tete Xiao, Sanja Fidler, Adela Barriuso, Antonio Torralba

Scene parsing, or recognizing and segmenting objects and stuff in an image, is one of the key problems in computer vision.

Scene Parsing Semantic Segmentation

Learning Deep Features for Discriminative Localization

30 code implementations CVPR 2016 Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, Antonio Torralba

In this work, we revisit the global average pooling layer proposed in [13], and shed light on how it explicitly enables the convolutional neural network to have remarkable localization ability despite being trained on image-level labels.

Weakly-Supervised Object Localization

Optimization as Estimation with Gaussian Processes in Bandit Settings

1 code implementation 21 Oct 2015 Zi Wang, Bolei Zhou, Stefanie Jegelka

Recently, there has been rising interest in Bayesian optimization -- the optimization of an unknown function with assumptions usually expressed by a Gaussian Process (GP) prior.

Gaussian Processes

Understanding Intra-Class Knowledge Inside CNN

no code implementations 9 Jul 2015 Donglai Wei, Bolei Zhou, Antonio Torralba, William Freeman

Convolutional Neural Networks (CNNs) have been successful in image recognition tasks, and recent work sheds light on how a CNN separates different classes with the learned inter-class knowledge through visualization.

Image Retrieval

Object Detectors Emerge in Deep Scene CNNs

1 code implementation 22 Dec 2014 Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, Antonio Torralba

With the success of new computational architectures for visual processing, such as convolutional neural networks (CNNs), and access to image databases with millions of labeled examples (e.g., ImageNet, Places), the state of the art in computer vision is advancing rapidly.

Classification General Classification +3

Learning Deep Features for Scene Recognition using Places Database

no code implementations NeurIPS 2014 Bolei Zhou, Agata Lapedriza, Jianxiong Xiao, Antonio Torralba, Aude Oliva

Whereas the tremendous recent progress in object recognition tasks is due to the availability of large datasets like ImageNet and the rise of Convolutional Neural Networks (CNNs) for learning high-level features, performance at scene recognition has not attained the same level of success.

Object Recognition Scene Recognition

ConceptLearner: Discovering Visual Concepts from Weakly Labeled Image Collections

no code implementations CVPR 2015 Bolei Zhou, Vignesh Jagadeesh, Robinson Piramuthu

Discovering visual knowledge from weakly labeled data is crucial for scaling up computer vision recognition systems, since it is expensive to obtain fully labeled data for a large number of concept categories.

Object Detection Scene Recognition

Measuring Crowd Collectiveness

no code implementations CVPR 2013 Bolei Zhou, Xiaoou Tang, Xiaogang Wang

Collective motions are common in crowd systems and have attracted a great deal of attention in a variety of multidisciplinary fields.
