Search Results for author: Bolei Zhou

Found 123 papers, 64 papers with code

Urban Scene Diffusion through Semantic Occupancy Map

no code implementations • 18 Mar 2024 • Junge Zhang, Qihang Zhang, Li Zhang, Ramana Rao Kompella, Gaowen Liu, Bolei Zhou

Generating unbounded 3D scenes is crucial for large-scale scene understanding and simulation.

Image Generation Scene Understanding

A Holistic Framework Towards Vision-based Traffic Signal Control with Microscopic Simulation

no code implementations • 11 Mar 2024 • Pan He, Quanyi Li, Xiaoyong Yuan, Bolei Zhou

Traffic signal control (TSC) is crucial for reducing traffic congestion that leads to smoother traffic flow, reduced idling time, and mitigated CO2 emissions.

Benchmarking

Unsupervised Discovery of Steerable Factors When Graph Deep Generative Models Are Entangled

1 code implementation • 29 Jan 2024 • Shengchao Liu, Chengpeng Wang, Jiarui Lu, Weili Nie, Hanchen Wang, Zhuoxinran Li, Bolei Zhou, Jian Tang

Deep generative models (DGMs) have been widely developed for graph data.

Disentanglement

SceneWiz3D: Towards Text-guided 3D Scene Composition

no code implementations • 13 Dec 2023 • Qihang Zhang, Chaoyang Wang, Aliaksandr Siarohin, Peiye Zhuang, Yinghao Xu, Ceyuan Yang, Dahua Lin, Bolei Zhou, Sergey Tulyakov, Hsin-Ying Lee

We are witnessing significant breakthroughs in the technology for generating 3D objects from text.

Text to 3D

FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition

no code implementations • 12 Dec 2023 • Sicheng Mo, Fangzhou Mu, Kuan Heng Lin, Yanli Liu, Bochen Guan, Yin Li, Bolei Zhou

Recent approaches such as ControlNet offer users fine-grained spatial control over text-to-image (T2I) diffusion models.

BerfScene: Bev-conditioned Equivariant Radiance Fields for Infinite 3D Scene Generation

no code implementations • 4 Dec 2023 • Qihang Zhang, Yinghao Xu, Yujun Shen, Bo Dai, Bolei Zhou, Ceyuan Yang

Generating large-scale 3D scenes cannot simply apply existing 3D object synthesis techniques, since 3D scenes usually hold complex spatial configurations and consist of a number of objects at varying scales.

Scene Generation

CAT: Closed-loop Adversarial Training for Safe End-to-End Driving

no code implementations • 19 Oct 2023 • Linrui Zhang, Zhenghao Peng, Quanyi Li, Bolei Zhou

Driving safety is a top priority for autonomous vehicles.

Autonomous Vehicles motion prediction

In-Domain GAN Inversion for Faithful Reconstruction and Editability

no code implementations • 25 Sep 2023 • Jiapeng Zhu, Yujun Shen, Yinghao Xu, Deli Zhao, Qifeng Chen, Bolei Zhou

This work fills in this gap by proposing in-domain GAN inversion, which consists of a domain-guided encoder and a domain-regularized optimizer, to regularize the inverted code in the native latent space of the pre-trained GAN model.

Image Generation Image Reconstruction

Improving Out-of-Distribution Robustness of Classifiers via Generative Interpolation

no code implementations • 23 Jul 2023 • Haoyue Bai, Ceyuan Yang, Yinghao Xu, S.-H. Gary Chan, Bolei Zhou

data.

Data Augmentation

Efficient 3D Articulated Human Generation with Layered Surface Volumes

no code implementations • 11 Jul 2023 • Yinghao Xu, Wang Yifan, Alexander W. Bergman, Menglei Chai, Bolei Zhou, Gordon Wetzstein

These layers are rendered using alpha compositing with fast differentiable rasterization, and they can be interpreted as a volumetric representation that allocates its capacity to a manifold of finite thickness around the template.

Next Steps for Human-Centered Generative AI: A Technical Perspective

no code implementations • 27 Jun 2023 • Xiang 'Anthony' Chen, Jeff Burke, Ruofei Du, Matthew K. Hong, Jennifer Jacobs, Philippe Laban, Dingzeyu Li, Nanyun Peng, Karl D. D. Willis, Chien-Sheng Wu, Bolei Zhou

Through iterative, cross-disciplinary discussions, we define and propose next-steps for Human-centered Generative AI (HGAI).

V2V4Real: A Real-world Large-scale Dataset for Vehicle-to-Vehicle Cooperative Perception

1 code implementation • CVPR 2023 • Runsheng Xu, Xin Xia, Jinlong Li, Hanzhao Li, Shuo Zhang, Zhengzhong Tu, Zonglin Meng, Hao Xiang, Xiaoyu Dong, Rui Song, Hongkai Yu, Bolei Zhou, Jiaqi Ma

To facilitate the development of cooperative perception, we present V2V4Real, the first large-scale real-world multi-modal dataset for V2V perception.

3D Object Detection 3D Object Tracking +4

167

Guarded Policy Optimization with Imperfect Online Demonstrations

no code implementations • 3 Mar 2023 • Zhenghai Xue, Zhenghao Peng, Quanyi Li, Zhihan Liu, Bolei Zhou

Assuming optimal, the teacher policy has the perfect timing and capability to intervene in the learning process of the student agent, providing safety guarantee and exploration guidance.

Continuous Control Efficient Exploration +2

Spatial Steerability of GANs via Self-Supervision from Discriminator

no code implementations • 20 Jan 2023 • Jianyuan Wang, Lalit Bhagat, Ceyuan Yang, Yinghao Xu, Yujun Shen, Hongdong Li, Bolei Zhou

In this work, we propose a self-supervised approach to improve the spatial steerability of GANs without searching for steerable directions in the latent space or requiring extra annotations.

Image Generation Inductive Bias +1

GH-Feat: Learning Versatile Generative Hierarchical Features from GANs

no code implementations • 12 Jan 2023 • Yinghao Xu, Yujun Shen, Jiapeng Zhu, Ceyuan Yang, Bolei Zhou

In this work we investigate that such a generative feature learned from image synthesis exhibits great potentials in solving a wide range of computer vision tasks, including both generative ones and more importantly discriminative ones.

Face Verification Image Harmonization +3

Street-View Image Generation from a Bird's-Eye View Layout

1 code implementation • 11 Jan 2023 • Alexander Swerdlow, Runsheng Xu, Bolei Zhou

Instead of using perception data from real-life scenarios, an ideal model for simulation would generate realistic street-view images that align with a given HD map and traffic layout, a task that is critical for visualizing complex traffic scenarios and developing robust perception models for autonomous driving.

Autonomous Driving Image Generation

DisCoScene: Spatially Disentangled Generative Radiance Fields for Controllable 3D-aware Scene Synthesis

no code implementations • CVPR 2023 • Yinghao Xu, Menglei Chai, Zifan Shi, Sida Peng, Ivan Skorokhodov, Aliaksandr Siarohin, Ceyuan Yang, Yujun Shen, Hsin-Ying Lee, Bolei Zhou, Sergey Tulyakov

Existing 3D-aware image synthesis approaches mainly focus on generating a single canonical object and show limited capacity in composing a complex scene containing a variety of objects.

3D-Aware Image Synthesis Object

Towards Smooth Video Composition

1 code implementation • 14 Dec 2022 • Qihang Zhang, Ceyuan Yang, Yujun Shen, Yinghao Xu, Bolei Zhou

Video generation requires synthesizing consistent and persistent frames with dynamic content over time.

Ranked #1 on Video Generation on YouTube Driving

Image Generation single-image-generation +2

V2XP-ASG: Generating Adversarial Scenes for Vehicle-to-Everything Perception

1 code implementation • 27 Sep 2022 • Hao Xiang, Runsheng Xu, Xin Xia, Zhaoliang Zheng, Bolei Zhou, Jiaqi Ma

Recent advancements in Vehicle-to-Everything communication technology have enabled autonomous vehicles to share sensory information to obtain better perception performance.

Autonomous Vehicles

Improving GANs with A Dynamic Discriminator

no code implementations • 20 Sep 2022 • Ceyuan Yang, Yujun Shen, Yinghao Xu, Deli Zhao, Bo Dai, Bolei Zhou

Two capacity adjusting schemes are developed for training GANs under different data regimes: i) given a sufficient amount of training data, the discriminator benefits from a progressively increased learning capacity, and ii) when the training data is limited, gradually decreasing the layer width mitigates the over-fitting issue of the discriminator.

3D-Aware Image Synthesis Data Augmentation

Optimistic Curiosity Exploration and Conservative Exploitation with Linear Reward Shaping

1 code implementation • 15 Sep 2022 • Hao Sun, Lei Han, Rui Yang, Xiaoteng Ma, Jian Guo, Bolei Zhou

We validate our insight on a range of RL tasks and show its improvement over baselines: (1) In offline RL, the conservative exploitation leads to improved performance based on off-the-shelf algorithms; (2) In online continuous control, multiple value functions with different shifting constants can be used to tackle the exploration-exploitation dilemma for better sample efficiency; (3) In discrete control tasks, a negative reward shifting yields an improvement over the curiosity-based exploration method.

Continuous Control Offline RL

CoBEVT: Cooperative Bird's Eye View Semantic Segmentation with Sparse Transformers

2 code implementations • 5 Jul 2022 • Runsheng Xu, Zhengzhong Tu, Hao Xiang, Wei Shao, Bolei Zhou, Jiaqi Ma

The extensive experiments on the V2V perception dataset, OPV2V, demonstrate that CoBEVT achieves state-of-the-art performance for cooperative BEV semantic segmentation.

3D Object Detection Autonomous Driving +2

188

Human-AI Shared Control via Policy Dissection

1 code implementation • 31 May 2022 • Quanyi Li, Zhenghao Peng, Haibin Wu, Lan Feng, Bolei Zhou

Inspired by the neuroscience approach to investigate the motor cortex in primates, we develop a simple yet effective frequency-based approach called Policy Dissection to align the intermediate representation of the learned neural controller with the kinematic attributes of the agent behavior.

Autonomous Driving Reinforcement Learning (RL)

194

Learning to Drive by Watching YouTube Videos: Action-Conditioned Contrastive Policy Pretraining

1 code implementation • 5 Apr 2022 • Qihang Zhang, Zhenghao Peng, Bolei Zhou

Specifically, we train an inverse dynamic model with a small amount of labeled data and use it to predict action labels for all the YouTube video frames.

Autonomous Driving Imitation Learning

Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation

1 code implementation • CVPR 2022 • Xian Liu, Qianyi Wu, Hang Zhou, Yinghao Xu, Rui Qian, Xinyi Lin, Xiaowei Zhou, Wayne Wu, Bo Dai, Bolei Zhou

To enhance the quality of synthesized gestures, we develop a contrastive learning strategy based on audio-text alignment for better audio representations.

Ranked #3 on Gesture Generation on TED Gesture Dataset

Contrastive Learning Gesture Generation

117

LocATe: End-to-end Localization of Actions in 3D with Transformers

no code implementations • 21 Mar 2022 • Jiankai Sun, Bolei Zhou, Michael J. Black, Arjun Chandrasekaran

An important component of this problem is 3D Temporal Action Localization (3D-TAL), which involves recognizing what actions a person is performing, and when.

Action Recognition object-detection +2

Efficient Learning of Safe Driving Policy via Human-AI Copilot Optimization

no code implementations • ICLR 2022 • Quanyi Li, Zhenghao Peng, Bolei Zhou

HACO can train agents to drive in unseen traffic scenarios with a handful of human intervention budget and achieve high safety and generalizability, outperforming both reinforcement learning and imitation learning baselines with a large margin.

Imitation Learning reinforcement-learning +1

Visual Sound Localization in the Wild by Cross-Modal Interference Erasing

1 code implementation • 13 Feb 2022 • Xian Liu, Rui Qian, Hang Zhou, Di Hu, Weiyao Lin, Ziwei Liu, Bolei Zhou, Xiaowei Zhou

Specifically, we observe that the previous practice of learning only a single audio representation is insufficient due to the additive nature of audio signals.

Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation

no code implementations • 19 Jan 2022 • Xian Liu, Yinghao Xu, Qianyi Wu, Hang Zhou, Wayne Wu, Bolei Zhou

Moreover, to enable portrait rendering in one unified neural radiance field, a Torso Deformation module is designed to stabilize the large-scale non-rigid torso motions.

AutoAlign: Pixel-Instance Feature Aggregation for Multi-Modal 3D Object Detection

no code implementations • 17 Jan 2022 • Zehui Chen, Zhenyu Li, Shiquan Zhang, Liangji Fang, Qinghong Jiang, Feng Zhao, Bolei Zhou, Hang Zhao

This map enables our model to automate the alignment of non-homogenous features in a dynamic and data-driven manner.

3D Object Detection Autonomous Driving +1

3D-aware Image Synthesis via Learning Structural and Textural Representations

1 code implementation • CVPR 2022 • Yinghao Xu, Sida Peng, Ceyuan Yang, Yujun Shen, Bolei Zhou

The feature field is further accumulated into a 2D feature map as the textural representation, followed by a neural renderer for appearance synthesis.

3D-Aware Image Synthesis Generative Adversarial Network

126

Cross-Model Pseudo-Labeling for Semi-Supervised Action Recognition

no code implementations • CVPR 2022 • Yinghao Xu, Fangyun Wei, Xiao Sun, Ceyuan Yang, Yujun Shen, Bo Dai, Bolei Zhou, Stephen Lin

Typically in recent work, the pseudo-labels are obtained by training a model on the labeled data, and then using confident predictions from the model to teach itself.

Action Recognition

SimIPU: Simple 2D Image and 3D Point Cloud Unsupervised Pre-Training for Spatial-Aware Visual Representations

1 code implementation • 9 Dec 2021 • Zhenyu Li, Zehui Chen, Ang Li, Liangji Fang, Qinhong Jiang, Xianming Liu, Junjun Jiang, Bolei Zhou, Hang Zhao

To bridge this gap, we aim to learn a spatial-aware visual representation that can describe the three-dimensional space and is more suitable and effective for these tasks.

Contrastive Learning Unsupervised Pre-training

Improving GAN Equilibrium by Raising Spatial Awareness

1 code implementation • CVPR 2022 • Jianyuan Wang, Ceyuan Yang, Yinghao Xu, Yujun Shen, Hongdong Li, Bolei Zhou

We further propose to align the spatial awareness of G with the attention map induced from D. Through this way we effectively lessen the information gap between D and G. Extensive results show that our method pushes the two-player game in GANs closer to the equilibrium, leading to a better synthesis performance.

Attribute Inductive Bias

157

One-Shot Generative Domain Adaptation

no code implementations • ICCV 2023 • Ceyuan Yang, Yujun Shen, Zhiyi Zhang, Yinghao Xu, Jiapeng Zhu, Zhirong Wu, Bolei Zhou

We then equip the well-learned discriminator backbone with an attribute classifier to ensure that the generator captures the appropriate characters from the reference.

Attribute Domain Adaptation +1

Learning to Simulate Self-Driven Particles System with Coordinated Policy Optimization

2 code implementations • NeurIPS 2021 • Zhenghao Peng, Quanyi Li, Ka Ming Hui, Chunxiao Liu, Bolei Zhou

Self-Driven Particles (SDP) describe a category of multi-agent systems common in everyday life, such as flocking birds and traffic flows.

Multi-agent Reinforcement Learning reinforcement-learning +1

602

The Nuts and Bolts of Adopting Transformer in GANs

no code implementations • 25 Oct 2021 • Rui Xu, Xiangyu Xu, Kai Chen, Bolei Zhou, Chen Change Loy

Transformer becomes prevalent in computer vision, especially for high-level vision tasks.

Generative Adversarial Network Image Generation

Safe Driving via Expert Guided Policy Optimization

1 code implementation • 13 Oct 2021 • Zhenghao Peng, Quanyi Li, Chunxiao Liu, Bolei Zhou

Offline RL technique is further used to learn from the partial demonstration generated by the expert.

Offline RL reinforcement-learning +1

Improving Out-of-Distribution Robustness of Classifiers Through Interpolated Generative Models

no code implementations • 29 Sep 2021 • Haoyue Bai, Ceyuan Yang, Yinghao Xu, S.-H. Gary Chan, Bolei Zhou

In this paper, we employ interpolated generative models to generate OoD samples at training time via data augmentation.

Data Augmentation

SPLID: Self-Imitation Policy Learning through Iterative Distillation

no code implementations • 29 Sep 2021 • Zhihan Liu, Hao Sun, Bolei Zhou

To this end, we propose a novel meta-algorithm Self-Imitation Policy Learning through Iterative Distillation (SPLID) which relies on the concept of $\delta$-distilled policy to iteratively level up the quality of the target data and agent mimics from the relabeled target data.

Continuous Control

Reward Shifting for Optimistic Exploration and Conservative Exploitation

no code implementations • 29 Sep 2021 • Hao Sun, Lei Han, Jian Guo, Bolei Zhou

We verify our insight on a range of tasks: (1) In offline RL, the conservative exploitation leads to improved learning performance based on off-the-shelf algorithms; (2) In online continuous control, multiple value functions with different shifting constants can be used to trade-off between exploration and exploitation thus improving learning efficiency; (3) In online RL with discrete action space, a negative reward shifting brings an improvement over the previous curiosity-based exploration method.

Continuous Control Offline RL

Interpreting Molecule Generative Models for Interactive Molecule Discovery

no code implementations • 29 Sep 2021 • Yuanqi Du, Xian Liu, Shengchao Liu, Bolei Zhou

In this work, we develop a simple yet effective method to interpret the latent space of the learned generative models with various molecular properties for more interactive molecule generation and discovery.

Drug Discovery

MetaDrive: Composing Diverse Driving Scenarios for Generalizable Reinforcement Learning

2 code implementations • 26 Sep 2021 • Quanyi Li, Zhenghao Peng, Lan Feng, Qihang Zhang, Zhenghai Xue, Bolei Zhou

Based on MetaDrive, we construct a variety of RL tasks and baselines in both single-agent and multi-agent settings, including benchmarking generalizability across unseen scenes, safe exploration, and learning multi-agent traffic.

Benchmarking Decision Making +5

602

Interpreting Generative Adversarial Networks for Interactive Image Generation

no code implementations • 10 Aug 2021 • Bolei Zhou

Significant progress has been made by the advances in Generative Adversarial Networks (GANs) for image generation.

Image Generation

Safe Exploration by Solving Early Terminated MDP

no code implementations • 9 Jul 2021 • Hao Sun, Ziping Xu, Meng Fang, Zhenghao Peng, Jiadong Guo, Bo Dai, Bolei Zhou

Safe exploration is crucial for the real-world application of reinforcement learning (RL).

Reinforcement Learning (RL) Safe Exploration

Data-Efficient Instance Generation from Instance Discrimination

1 code implementation • NeurIPS 2021 • Ceyuan Yang, Yujun Shen, Yinghao Xu, Bolei Zhou

Meanwhile, the learned instance discrimination capability from the discriminator is in turn exploited to encourage the generator for diverse generation.

Ranked #6 on Image Generation on FFHQ 256 x 256 (FD metric)

2k Data Augmentation +1

100

TS-CAM: Token Semantic Coupled Attention Map for Weakly Supervised Object Localization

2 code implementations • ICCV 2021 • Wei Gao, Fang Wan, Xingjia Pan, Zhiliang Peng, Qi Tian, Zhenjun Han, Bolei Zhou, Qixiang Ye

TS-CAM finally couples the patch tokens with the semantic-agnostic attention map to achieve semantic-aware localization.

Object Weakly-Supervised Object Localization

130

Multimodal Motion Prediction with Stacked Transformers

1 code implementation • CVPR 2021 • Yicheng Liu, Jinghuai Zhang, Liangji Fang, Qinhong Jiang, Bolei Zhou

Predicting multiple plausible future trajectories of the nearby vehicles is crucial for the safety of autonomous driving.

Autonomous Driving motion prediction

330

Unsupervised Image Transformation Learning via Generative Adversarial Networks

no code implementations • 13 Mar 2021 • Kaiwen Zha, Yujun Shen, Bolei Zhou

In this work, we study the image transformation problem, which aims at learning the underlying transformations (e.g., the transition of seasons) from a collection of unlabeled images.

Image Generation valid

Instance Localization for Self-supervised Detection Pretraining

1 code implementation • CVPR 2021 • Ceyuan Yang, Zhirong Wu, Bolei Zhou, Stephen Lin

The pretext task is to predict the instance category given the composited images as well as the foreground bounding boxes.

Classification General Classification +6

144

Deep Learning for Scene Classification: A Survey

no code implementations • 26 Jan 2021 • Delu Zeng, Minyu Liao, Mohammad Tavakolian, Yulan Guo, Bolei Zhou, Dewen Hu, Matti Pietikäinen, Li Liu

Scene classification, aiming at classifying a scene image to one of the predefined scene categories by comprehending the entire image, is a longstanding, fundamental and challenging problem in computer vision.

Classification General Classification +1

GAN Inversion: A Survey

1 code implementation • 14 Jan 2021 • Weihao Xia, Yulun Zhang, Yujiu Yang, Jing-Hao Xue, Bolei Zhou, Ming-Hsuan Yang

GAN inversion aims to invert a given image back into the latent space of a pretrained GAN model, for the image to be faithfully reconstructed from the inverted code by the generator.

Image Manipulation Image Restoration

1,083

Self-Supervised Continuous Control without Policy Gradient

no code implementations • 1 Jan 2021 • Hao Sun, Ziping Xu, Meng Fang, Yuhang Song, Jiechao Xiong, Bo Dai, Zhengyou Zhang, Bolei Zhou

Despite the remarkable progress made by the policy gradient algorithms in reinforcement learning (RL), sub-optimal policies usually result from the local exploration property of the policy gradient update.

Continuous Control Policy Gradient Methods +3

Improving the Generalization of End-to-End Driving through Procedural Generation

2 code implementations • 26 Dec 2020 • Quanyi Li, Zhenghao Peng, Qihang Zhang, Chunxiao Liu, Bolei Zhou

We validate that training with the increasing number of procedurally generated scenes significantly improves the generalization of the agent across scenarios of different traffic densities and road networks.

Autonomous Driving

125

Positional Encoding as Spatial Inductive Bias in GANs

no code implementations • CVPR 2021 • Rui Xu, Xintao Wang, Kai Chen, Bolei Zhou, Chen Change Loy

In this work, taking SinGAN and StyleGAN2 as examples, we show that such capability, to a large extent, is brought by the implicit positional encoding when using zero padding in the generators.

Image Manipulation Inductive Bias +1

Improving the Fairness of Deep Generative Models without Retraining

1 code implementation • 9 Dec 2020 • Shuhan Tan, Yujun Shen, Bolei Zhou

Generative Adversarial Networks (GANs) advance face synthesis through learning the underlying distribution of observed data.

Attribute Face Generation +3

Texture Memory-Augmented Deep Patch-Based Image Inpainting

1 code implementation • 28 Sep 2020 • Rui Xu, Minghao Guo, Jiaqi Wang, Xiaoxiao Li, Bolei Zhou, Chen Change Loy

By bringing together the best of both paradigms, we propose a new deep inpainting framework where texture generation is guided by a texture memory of patch samples extracted from unmasked regions.

Image Inpainting Retrieval +1

6,559

Understanding the Role of Individual Units in a Deep Neural Network

2 code implementations • 10 Sep 2020 • David Bau, Jun-Yan Zhu, Hendrik Strobelt, Agata Lapedriza, Bolei Zhou, Antonio Torralba

Second, we use a similar analytic method to analyze a generative adversarial network (GAN) model trained to generate scenes.

Generative Adversarial Network Image Classification +2

298

A Unified Framework for Shot Type Classification Based on Subject Centric Lens

no code implementations • ECCV 2020 • Anyi Rao, Jiaze Wang, Linning Xu, Xuekun Jiang, Qingqiu Huang, Bolei Zhou, Dahua Lin

Shots are key narrative elements of various videos, e.g., movies, TV series, and user-generated videos that are thriving over the Internet.

General Classification Vocal Bursts Type Prediction

Generative Hierarchical Features from Synthesizing Images

1 code implementation • CVPR 2021 • Yinghao Xu, Yujun Shen, Jiapeng Zhu, Ceyuan Yang, Bolei Zhou

Generative Adversarial Networks (GANs) have recently advanced image synthesis by learning the underlying distribution of the observed data.

Face Verification Image Classification +2

157

Closed-Form Factorization of Latent Semantics in GANs

11 code implementations • CVPR 2021 • Yujun Shen, Bolei Zhou

A rich set of interpretable dimensions has been shown to emerge in the latent space of the Generative Adversarial Networks (GANs) trained for synthesizing images.

Attribute Image Generation +1

2,652

Unsupervised Landmark Learning from Unpaired Data

1 code implementation • 29 Jun 2020 • Yinghao Xu, Ceyuan Yang, Ziwei Liu, Bo Dai, Bolei Zhou

Recent attempts for unsupervised landmark learning leverage synthesized image pairs that are similar in appearance but different in poses.

Video Representation Learning with Visual Tempo Consistency

1 code implementation • 28 Jun 2020 • Ceyuan Yang, Yinghao Xu, Bo Dai, Bolei Zhou

Visual tempo, which describes how fast an action goes, has shown its potential in supervised action recognition.

Action Anticipation Action Detection +3

Non-local Policy Optimization via Diversity-regularized Collaborative Exploration

no code implementations • 14 Jun 2020 • Zhenghao Peng, Hao Sun, Bolei Zhou

Conventional Reinforcement Learning (RL) algorithms usually have one single agent learning to solve the task independently.

Reinforcement Learning (RL)

Zeroth-Order Supervised Policy Improvement

no code implementations • 11 Jun 2020 • Hao Sun, Ziping Xu, Yuhang Song, Meng Fang, Jiechao Xiong, Bo Dai, Bolei Zhou

However, PG algorithms rely on exploiting the value function being learned with the first-order update locally, which results in limited sample efficiency.

Continuous Control Policy Gradient Methods +2

Novel Policy Seeking with Constrained Optimization

1 code implementation • 21 May 2020 • Hao Sun, Zhenghao Peng, Bo Dai, Jian Guo, Dahua Lin, Bolei Zhou

In problem-solving, we humans can come up with multiple novel solutions to the same problem.

reinforcement-learning Reinforcement Learning (RL)

InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs

2 code implementations • 18 May 2020 • Yujun Shen, Ceyuan Yang, Xiaoou Tang, Bolei Zhou

In this work, we propose a framework called InterFaceGAN to interpret the disentangled face representation learned by the state-of-the-art GAN models and study the properties of the facial semantics encoded in the latent space.

Attribute Face Generation

1,466

Semantic Photo Manipulation with a Generative Image Prior

1 code implementation • 15 May 2020 • David Bau, Hendrik Strobelt, William Peebles, Jonas Wulff, Bolei Zhou, Jun-Yan Zhu, Antonio Torralba

First, it is hard for GANs to precisely reproduce an input image.

992

Evolutionary Stochastic Policy Distillation

1 code implementation • 27 Apr 2020 • Hao Sun, Xinyu Pan, Bo Dai, Dahua Lin, Bolei Zhou

Solving the Goal-Conditioned Reward Sparse (GCRS) task is a challenging reinforcement learning problem due to the sparsity of reward signals.

TPNet: Trajectory Proposal Network for Motion Prediction

no code implementations • CVPR 2020 • Liangji Fang, Qinhong Jiang, Jianping Shi, Bolei Zhou

However, it remains difficult for these methods to provide multimodal predictions as well as integrate physical constraints such as traffic rules and movable areas.

Autonomous Driving motion prediction +1

Temporal Pyramid Network for Action Recognition

3 code implementations • CVPR 2020 • Ceyuan Yang, Yinghao Xu, Jianping Shi, Bo Dai, Bolei Zhou

Previous works often capture the visual tempo through sampling raw videos at multiple rates and constructing an input-level frame pyramid, which usually requires a costly multi-branch network to handle.

Ranked #105 on Action Recognition on Something-Something V2

Action Recognition

3,876

A Local-to-Global Approach to Multi-modal Movie Scene Segmentation

4 code implementations • CVPR 2020 • Anyi Rao, Linning Xu, Yu Xiong, Guodong Xu, Qingqiu Huang, Bolei Zhou, Dahua Lin

Scene, as the crucial unit of storytelling in movies, contains complex activities of actors and their interactions in a physical environment.

Action Recognition Scene Segmentation +1

212

In-Domain GAN Inversion for Real Image Editing

2 code implementations • ECCV 2020 • Jiapeng Zhu, Yujun Shen, Deli Zhao, Bolei Zhou

A common practice of feeding a real image to a trained GAN generator is to invert it back to a latent code.

Image Reconstruction

457

TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting

no code implementations • CVPR 2020 • Zhuoqian Yang, Wentao Zhu, Wayne Wu, Chen Qian, Qiang Zhou, Bolei Zhou, Chen Change Loy

We present a lightweight video motion retargeting approach TransMoMo that is capable of transferring motion of a person in a source video realistically to another video of a target person.

motion retargeting

Image Processing Using Multi-Code GAN Prior

1 code implementation • CVPR 2020 • Jinjin Gu, Yujun Shen, Bolei Zhou

Such an over-parameterization of the latent space significantly improves the image reconstruction quality, outperforming existing competitors.

Ranked #7 on Blind Face Restoration on CelebA-Test

Blind Face Restoration Colorization +6

288

Learning a Decision Module by Imitating Driver's Control Behaviors

no code implementations • 30 Nov 2019 • Junning Huang, Sirui Xie, Jiankai Sun, Qiurui Ma, Chunxiao Liu, Jianping Shi, Dahua Lin, Bolei Zhou

In this work, we propose a hybrid framework to learn neural decisions in the classical modular pipeline through end-to-end imitation learning.

Autonomous Driving Imitation Learning

Every Frame Counts: Joint Learning of Video Segmentation and Optical Flow

no code implementations • 28 Nov 2019 • Mingyu Ding, Zhe Wang, Bolei Zhou, Jianping Shi, Zhiwu Lu, Ping Luo

Moreover, our framework is able to utilize both labeled and unlabeled frames in the video through joint training, while no additional calculation is required in inference.

Optical Flow Estimation Segmentation +3

Semantic Hierarchy Emerges in Deep Generative Representations for Scene Synthesis

2 code implementations • 21 Nov 2019 • Ceyuan Yang, Yujun Shen, Bolei Zhou

Despite the success of Generative Adversarial Networks (GANs) in image synthesis, there lacks enough understanding on what generative models have learned inside the deep generative representations and how photo-realistic images are able to be composed of the layer-wise stochasticity introduced in recent GANs.

Image Generation

157

Policy Continuation with Hindsight Inverse Dynamics

1 code implementation • NeurIPS 2019 • Hao Sun, Zhizhong Li, Xiaotong Liu, Dahua Lin, Bolei Zhou

This approach learns from Hindsight Inverse Dynamics based on Hindsight Experience Replay, enabling the learning process in a self-imitated manner and thus can be trained with supervised learning.

Reinforcement Learning (RL)

Seeing What a GAN Cannot Generate

1 code implementation • ICCV 2019 • David Bau, Jun-Yan Zhu, Jonas Wulff, William Peebles, Hendrik Strobelt, Bolei Zhou, Antonio Torralba

Differences in statistics reveal object classes that are omitted by a GAN.

Semantic Segmentation

184

Paper
Code

A Graph-Based Framework to Bridge Movies and Synopses

no code implementations • ICCV 2019 • Yu Xiong, Qingqiu Huang, Lingfeng Guo, Hang Zhou, Bolei Zhou, Dahua Lin

On top of this dataset, we develop a framework to perform matching between movie segments and synopsis paragraphs.

Paper
Add Code

Semantic Hierarchy Emerges in the Deep Generative Representations for Scene Synthesis

no code implementations • 25 Sep 2019 • Ceyuan Yang, Yujun Shen, Bolei Zhou

Despite the success of Generative Adversarial Networks (GANs) in image synthesis, there lacks enough understanding on what networks have learned inside the deep generative representations and how photo-realistic images are able to be composed from random noises.

Image Generation

Paper
Add Code

Learning with Social Influence through Interior Policy Differentiation

no code implementations • 25 Sep 2019 • Hao Sun, Bo Dai, Jiankai Sun, Zhenghao Peng, Guodong Xu, Dahua Lin, Bolei Zhou

In this work we model the social influence into the scheme of reinforcement learning, enabling the agents to learn both from the environment and from their peers.

Reinforcement Learning (RL)

Paper
Add Code

LIA: Latently Invertible Autoencoder with Adversarial Learning

no code implementations • 25 Sep 2019 • Jiapeng Zhu, Deli Zhao, Bolei Zhou, Bo Zhang

A two-stage stochasticity-free training scheme is designed to train LIA via adversarial learning, in the sense that the decoder of LIA is first trained as a standard GAN with the invertible network and then the partial encoder is learned from an autoencoder by detaching the invertible network from LIA.

Generative Adversarial Network Variational Inference

Paper
Add Code

Reasoning About Human-Object Interactions Through Dual Attention Networks

no code implementations • ICCV 2019 • Tete Xiao, Quanfu Fan, Dan Gutfreund, Mathew Monfort, Aude Oliva, Bolei Zhou

The model not only finds when an action is happening and which object is being manipulated, but also identifies which part of the object is being interacted with.

Human-Object Interaction Detection Object

Paper
Add Code

Interpreting the Latent Space of GANs for Semantic Face Editing

4 code implementations • CVPR 2020 • Yujun Shen, Jinjin Gu, Xiaoou Tang, Bolei Zhou

In this work, we propose a novel framework, called InterFaceGAN, for semantic face editing by interpreting the latent semantics learned by GANs.

Attribute Disentanglement +2

1,466

Paper
Code

Disentangled Inference for GANs with Latently Invertible Autoencoder

3 code implementations • 19 Jun 2019 • Jiapeng Zhu, Deli Zhao, Bo Zhang, Bolei Zhou

In this paper, we show that the entanglement of the latent space for the VAE/GAN framework poses the main challenge for encoder learning.

Paper
Code

Cross-view Semantic Segmentation for Sensing Surroundings

1 code implementation • 9 Jun 2019 • Bowen Pan, Jiankai Sun, Ho Yin Tiga Leung, Alex Andonian, Bolei Zhou

Our further experiment on a LoCoBot robot shows that our model enables the surrounding sensing capability from 2D image input.

Domain Adaptation Semantic Segmentation

146

Paper
Code

Deep Flow-Guided Video Inpainting

2 code implementations • CVPR 2019 • Rui Xu, Xiaoxiao Li, Bolei Zhou, Chen Change Loy

Then the synthesized flow field is used to guide the propagation of pixels to fill up the missing regions in the video.

Ranked #8 on Video Inpainting on DAVIS

One-shot visual object segmentation Optical Flow Estimation +2

2,316

Paper
Code

Visualizing and Understanding GANs

no code implementations • ICLR Workshop DeepGenStruct 2019 • David Bau, Jun-Yan Zhu, Hendrik Strobelt, Bolei Zhou, Joshua B. Tenenbaum, William T. Freeman, Antonio Torralba

We present an analytic framework to visualize and understand GANs at the unit-, object-, and scene-level.

Object

Paper
Add Code

On the Units of GANs (Extended Abstract)

no code implementations • 29 Jan 2019 • David Bau, Jun-Yan Zhu, Hendrik Strobelt, Bolei Zhou, Joshua B. Tenenbaum, William T. Freeman, Antonio Torralba

We quantify the causal effect of interpretable units by measuring the ability of interventions to control objects in the output.

Paper
Add Code

Proceedings of AAAI 2019 Workshop on Network Interpretability for Deep Learning

no code implementations • 25 Jan 2019 • Quanshi Zhang, Lixin Fan, Bolei Zhou

This is the Proceedings of AAAI 2019 Workshop on Network Interpretability for Deep Learning

Paper
Add Code

FaceFeat-GAN: a Two-Stage Approach for Identity-Preserving Face Synthesis

no code implementations • 4 Dec 2018 • Yujun Shen, Bolei Zhou, Ping Luo, Xiaoou Tang

In the second stage, they compete in the image domain to render photo-realistic images that contain high diversity but preserve identity.

Face Generation Vocal Bursts Valence Prediction

Paper
Add Code

GAN Dissection: Visualizing and Understanding Generative Adversarial Networks

8 code implementations • ICLR 2019 • David Bau, Jun-Yan Zhu, Hendrik Strobelt, Bolei Zhou, Joshua B. Tenenbaum, William T. Freeman, Antonio Torralba

Then, we quantify the causal effect of interpretable units by measuring the ability of interventions to control objects in the output.

Image Generation Object

1,770

Paper
Code

Single Image Intrinsic Decomposition without a Single Intrinsic Image

no code implementations • ECCV 2018 • Wei-Chiu Ma, Hang Chu, Bolei Zhou, Raquel Urtasun, Antonio Torralba

At inference time, our model can be easily reduced to a single stream module that performs intrinsic decomposition on a single input image.

Intrinsic Image Decomposition

Paper
Add Code

Interpretable Basis Decomposition for Visual Explanation

1 code implementation • ECCV 2018 • Bolei Zhou, Yiyou Sun, David Bau, Antonio Torralba

Explanations of the decisions made by a deep neural network are important for human end-users to be able to understand and diagnose the trustworthiness of the system.

Paper
Code

Real-Time Object Pose Estimation with Pose Interpreter Networks

2 code implementations • 3 Aug 2018 • Jimmy Wu, Bolei Zhou, Rebecca Russell, Vincent Kee, Syler Wagner, Mitchell Hebert, Antonio Torralba, David M. S. Johnson

In this work, we introduce pose interpreter networks for 6-DoF object pose estimation.

Object Pose Estimation

121

Paper
Code

Unified Perceptual Parsing for Scene Understanding

18 code implementations • ECCV 2018 • Tete Xiao, Yingcheng Liu, Bolei Zhou, Yuning Jiang, Jian Sun

In this paper, we study a new task called Unified Perceptual Parsing, which requires the machine vision systems to recognize as many visual concepts as possible from a given image.

Ranked #88 on Semantic Segmentation on ADE20K val

Scene Understanding Semantic Segmentation

7,374

Paper
Code

Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation

1 code implementation • ECCV 2018 • Yikang Li, Wanli Ouyang, Bolei Zhou, Jianping Shi, Chao Zhang, Xiaogang Wang

Generating scene graphs to describe all the relations inside an image has gained increasing interest in recent years.

Ranked #1 on Scene Graph Generation on VRD

Clustering Graph Generation +3

213

Paper
Code

Revisiting the Importance of Individual Units in CNNs via Ablation

no code implementations • 7 Jun 2018 • Bolei Zhou, Yiyou Sun, David Bau, Antonio Torralba

We confirm that unit attributes such as class selectivity are a poor predictor for impact on overall accuracy as found previously in recent work (Morcos et al., 2018).

General Classification

Paper
Add Code

DeepMiner: Discovering Interpretable Representations for Mammogram Classification and Explanation

1 code implementation • 31 May 2018 • Jimmy Wu, Bolei Zhou, Diondra Peck, Scott Hsieh, Vandana Dialani, Lester Mackey, Genevieve Patterson

We propose DeepMiner, a framework to discover interpretable representations in deep neural networks and to build explanations for medical predictions.

Classification General Classification +1

Paper
Code

Expert identification of visual primitives used by CNNs during mammogram classification

1 code implementation • 13 Mar 2018 • Jimmy Wu, Diondra Peck, Scott Hsieh, Vandana Dialani, Constance D. Lehman, Bolei Zhou, Vasilis Syrgkanis, Lester Mackey, Genevieve Patterson

This work interprets the internal representations of deep neural networks trained for classification of diseased tissue in 2D mammograms.

Classification General Classification

Paper
Code

Recurrent Residual Module for Fast Inference in Videos

no code implementations • CVPR 2018 • Bowen Pan, Wuwei Lin, Xiaolin Fang, Chaoqin Huang, Bolei Zhou, Cewu Lu

Deep convolutional neural networks (CNNs) have made impressive progress in many video recognition tasks such as video pose estimation and video object detection.

object-detection Pose Estimation +2

Paper
Add Code

Moments in Time Dataset: one million videos for event understanding

4 code implementations • 9 Jan 2018 • Mathew Monfort, Alex Andonian, Bolei Zhou, Kandan Ramakrishnan, Sarah Adel Bargal, Tom Yan, Lisa Brown, Quanfu Fan, Dan Gutfreund, Carl Vondrick, Aude Oliva

We present the Moments in Time Dataset, a large-scale human-annotated collection of one million short videos corresponding to dynamic events unfolding within three seconds.

Ranked #2 on Multimodal Activity Recognition on Moments in Time Dataset

Action Recognition Multimodal Activity Recognition +1

354

Paper
Code

Temporal Relational Reasoning in Videos

5 code implementations • ECCV 2018 • Bolei Zhou, Alex Andonian, Aude Oliva, Antonio Torralba

Temporal relational reasoning, the ability to link meaningful transformations of objects or entities over time, is a fundamental property of intelligent species.

Ranked #2 on Hand Gesture Recognition on Jester test

Action Classification Action Recognition In Videos +4

782

Paper
Code

Interpreting Deep Visual Representations via Network Dissection

2 code implementations • 15 Nov 2017 • Bolei Zhou, David Bau, Aude Oliva, Antonio Torralba

In this work, we describe Network Dissection, a method that interprets networks by providing labels for the units of their deep visual representations.

211

Paper
Code

Visual Question Generation as Dual Task of Visual Question Answering

no code implementations • CVPR 2018 • Yikang Li, Nan Duan, Bolei Zhou, Xiao Chu, Wanli Ouyang, Xiaogang Wang

Recently visual question answering (VQA) and visual question generation (VQG) are two trending topics in the computer vision, which have been explored separately.

Question Answering Question Generation +2

Paper
Add Code

Scene Graph Generation from Objects, Phrases and Region Captions

1 code implementation • ICCV 2017 • Yikang Li, Wanli Ouyang, Bolei Zhou, Kun Wang, Xiaogang Wang

Object detection, scene graph generation and region captioning, which are three scene understanding tasks at different semantic levels, are tied together: scene graphs are generated on top of objects detected in an image with their pairwise relationship predicted, while region captioning gives a language description of the objects, their attributes, relations, and other context information.

Ranked #2 on Object Detection on Visual Genome

Graph Generation object-detection +3

226

Paper
Code

Scene Parsing Through ADE20K Dataset

no code implementations • CVPR 2017 • Bolei Zhou, Hang Zhao, Xavier Puig, Sanja Fidler, Adela Barriuso, Antonio Torralba

A novel network design called Cascade Segmentation Module is proposed to parse a scene into stuff, objects, and object parts in a cascade and improve over the baselines.

Object Scene Parsing +1

Paper
Add Code

Network Dissection: Quantifying Interpretability of Deep Visual Representations

1 code implementation • CVPR 2017 • David Bau, Bolei Zhou, Aditya Khosla, Aude Oliva, Antonio Torralba

Given any CNN model, the proposed method draws on a broad data set of visual concepts to score the semantics of hidden units at each intermediate convolutional layer.

211

Paper
Code

Open Vocabulary Scene Parsing

no code implementations • ICCV 2017 • Hang Zhao, Xavier Puig, Bolei Zhou, Sanja Fidler, Antonio Torralba

Recognizing arbitrary objects in the wild has been a challenging problem due to the limitations of existing classification models and datasets.

General Classification Scene Parsing

Paper
Add Code

SegICP: Integrated Deep Semantic Segmentation and Pose Estimation

2 code implementations • 5 Mar 2017 • Jay M. Wong, Vincent Kee, Tiffany Le, Syler Wagner, Gian-Luca Mariottini, Abraham Schneider, Lei Hamilton, Rahul Chipalkatty, Mitchell Hebert, David M. S. Johnson, Jimmy Wu, Bolei Zhou, Antonio Torralba

Recent robotic manipulation competitions have highlighted that sophisticated robots still struggle to achieve fast and reliable perception of task-relevant objects in complex, realistic scenarios.

Object Recognition Point Cloud Registration +3

Paper
Code

Person Search with Natural Language Description

1 code implementation • CVPR 2017 • Shuang Li, Tong Xiao, Hongsheng Li, Bolei Zhou, Dayu Yue, Xiaogang Wang

Searching persons in large-scale image databases with the query of natural language description has important applications in video surveillance.

Attribute Person Search +1

143

Paper
Code

Places: An Image Database for Deep Scene Understanding

no code implementations • 6 Oct 2016 • Bolei Zhou, Aditya Khosla, Agata Lapedriza, Antonio Torralba, Aude Oliva

The rise of multi-million-item dataset initiatives has enabled data-hungry machine learning algorithms to reach near-human semantic classification at tasks such as object and scene recognition.

BIG-bench Machine Learning Classification +4

Paper
Add Code

Semantic Understanding of Scenes through the ADE20K Dataset

21 code implementations • 18 Aug 2016 • Bolei Zhou, Hang Zhao, Xavier Puig, Tete Xiao, Sanja Fidler, Adela Barriuso, Antonio Torralba

Scene parsing, or recognizing and segmenting objects and stuff in an image, is one of the key problems in computer vision.

Scene Parsing Segmentation +1

4,833

Paper
Code

Learning Deep Features for Discriminative Localization

33 code implementations • CVPR 2016 • Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, Antonio Torralba

In this work, we revisit the global average pooling layer proposed in [13], and shed light on how it explicitly enables the convolutional neural network to have remarkable localization ability despite being trained on image-level labels.

Ranked #2 on Weakly-Supervised Object Localization on Tiny ImageNet

Weakly-Supervised Object Localization

6,299

Paper
Code

Simple Baseline for Visual Question Answering

7 code implementations • 7 Dec 2015 • Bolei Zhou, Yuandong Tian, Sainbayar Sukhbaatar, Arthur Szlam, Rob Fergus

We describe a very simple bag-of-words baseline for visual question answering.

Ranked #10 on Visual Question Answering (VQA) on COCO Visual Question Answering (VQA) real images 1.0 multiple choice

Visual Question Answering

186

Paper
Code

Optimization as Estimation with Gaussian Processes in Bandit Settings

1 code implementation • 21 Oct 2015 • Zi Wang, Bolei Zhou, Stefanie Jegelka

Recently, there has been rising interest in Bayesian optimization -- the optimization of an unknown function with assumptions usually expressed by a Gaussian Process (GP) prior.

Bayesian Optimization Gaussian Processes

Paper
Code

Understanding Intra-Class Knowledge Inside CNN

no code implementations • 9 Jul 2015 • Donglai Wei, Bolei Zhou, Antonio Torralba, William Freeman

Convolutional Neural Networks (CNNs) have been successful in image recognition tasks, and recent work sheds light on how a CNN separates different classes with the learned inter-class knowledge through visualization.

Image Retrieval Object +1

Paper
Add Code

Object Detectors Emerge in Deep Scene CNNs

1 code implementation • 22 Dec 2014 • Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, Antonio Torralba

With the success of new computational architectures for visual processing, such as convolutional neural networks (CNN) and access to image databases with millions of labeled examples (e.g., ImageNet, Places), the state of the art in computer vision is advancing rapidly.

General Classification Object +3

Paper
Code

Learning Deep Features for Scene Recognition using Places Database

no code implementations • NeurIPS 2014 • Bolei Zhou, Agata Lapedriza, Jianxiong Xiao, Antonio Torralba, Aude Oliva

Whereas the tremendous recent progress in object recognition tasks is due to the availability of large datasets like ImageNet and the rise of Convolutional Neural Networks (CNNs) for learning high-level features, performance at scene recognition has not attained the same level of success.

Object Object Recognition +1

Paper
Add Code

ConceptLearner: Discovering Visual Concepts from Weakly Labeled Image Collections

no code implementations • CVPR 2015 • Bolei Zhou, Vignesh Jagadeesh, Robinson Piramuthu

Discovering visual knowledge from weakly labeled data is crucial to scale up computer vision recognition system, since it is expensive to obtain fully labeled data for a large number of concept categories.

object-detection Object Detection +1

Paper
Add Code

Measuring Crowd Collectiveness

no code implementations • CVPR 2013 • Bolei Zhou, Xiaoou Tang, Xiaogang Wang

Collective motions are common in crowd systems and have attracted a great deal of attention in a variety of multidisciplinary fields.

Paper
Add Code
