Search Results for author: Dahua Lin

Found 137 papers, 72 papers with code

Guided Diffusion Model for Adversarial Purification

no code implementations 30 May 2022 Jinyi Wang, Zhaoyang Lyu, Dahua Lin, Bo Dai, Hongfei Fu

In this paper, we propose a novel purification approach, referred to as guided diffusion model for purification (GDMP), to help protect classifiers from adversarial attacks.

Denoising

Accelerating Diffusion Models via Early Stop of the Diffusion Process

1 code implementation 25 May 2022 Zhaoyang Lyu, Xudong Xu, Ceyuan Yang, Dahua Lin, Bo Dai

By modeling the reverse process of gradually diffusing the data distribution into a Gaussian distribution, generating a sample in DDPMs can be regarded as iteratively denoising a randomly sampled Gaussian noise.

Denoising Image Generation

Towards Diverse and Natural Scene-aware 3D Human Motion Synthesis

no code implementations CVPR 2022 Jingbo Wang, Yu Rong, Jingyuan Liu, Sijie Yan, Dahua Lin, Bo Dai

The ability to synthesize long-term human motion sequences in real-world scenes can facilitate numerous applications.

motion synthesis

PYSKL: Towards Good Practices for Skeleton Action Recognition

1 code implementation 19 May 2022 Haodong Duan, Jiaqi Wang, Kai Chen, Dahua Lin

The toolbox supports a wide variety of skeleton action recognition algorithms, including approaches based on GCN and CNN.

Action Recognition Skeleton Based Action Recognition

MINI: Mining Implicit Novel Instances for Few-Shot Object Detection

no code implementations 6 May 2022 Yuhang Cao, Jiaqi Wang, Yiqi Lin, Dahua Lin

The offline mining mechanism leverages a self-supervised discriminative model to collaboratively mine implicit novel instances with a trained FSOD network.

Few-Shot Object Detection object-detection

SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition

1 code implementation CVPR 2022 Mingxin Huang, Yuliang Liu, Zhenghao Peng, Chongyu Liu, Dahua Lin, Shenggao Zhu, Nicholas Yuan, Kai Ding, Lianwen Jin

End-to-end scene text spotting has attracted great attention in recent years due to the success of excavating the intrinsic synergy of the scene text detection and recognition.

Scene Text Detection Text Spotting

OCSampler: Compressing Videos to One Clip with Single-step Sampling

no code implementations CVPR 2022 Jintao Lin, Haodong Duan, Kai Chen, Dahua Lin, LiMin Wang

Recent works prefer to formulate frame sampling as a sequential decision task by selecting frames one by one according to their importance, while we present a new paradigm of learning instance-specific video condensation policies to select informative frames for representing the entire video only in a single step.

Video Recognition

SPTS: Single-Point Text Spotting

no code implementations 15 Dec 2021 Dezhi Peng, Xinyu Wang, Yuliang Liu, Jiaxin Zhang, Mingxin Huang, Songxuan Lai, Shenggao Zhu, Jing Li, Dahua Lin, Chunhua Shen, Xiang Bai, Lianwen Jin

For the first time, we demonstrate that training scene text spotting models can be achieved with an extremely low-cost annotation of a single point for each instance.

Language Modelling Text Spotting

CityNeRF: Building NeRF at City Scale

no code implementations 10 Dec 2021 Yuanbo Xiangli, Linning Xu, Xingang Pan, Nanxuan Zhao, Anyi Rao, Christian Theobalt, Bo Dai, Dahua Lin

Neural Radiance Field (NeRF) has achieved outstanding performance in modeling 3D objects and controlled scenes, usually under a single scale.

Balanced Chamfer Distance as a Comprehensive Metric for Point Cloud Completion

1 code implementation NeurIPS 2021 Tong Wu, Liang Pan, Junzhe Zhang, Tai Wang, Ziwei Liu, Dahua Lin

We adopt DCD to evaluate the point cloud completion task, where experimental results show that DCD pays attention to both the overall structure and local geometric details and provides a more reliable evaluation even when CD and EMD contradict each other.

Point Cloud Completion

Density-aware Chamfer Distance as a Comprehensive Metric for Point Cloud Completion

1 code implementation 24 Nov 2021 Tong Wu, Liang Pan, Junzhe Zhang, Tai Wang, Ziwei Liu, Dahua Lin

We adopt DCD to evaluate the point cloud completion task, where experimental results show that DCD pays attention to both the overall structure and local geometric details and provides a more reliable evaluation even when CD and EMD contradict each other.

Point Cloud Completion
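
For readers unfamiliar with the baseline metric named in the snippet above, the following is a minimal sketch of the plain Chamfer Distance (CD) between two point sets. It is not the density-aware variant (DCD) proposed in the paper. CD has several conventions; this sketch assumes unsquared Euclidean distances averaged per set, and the function name `chamfer_distance` is illustrative only.

```python
import math


def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer Distance between two non-empty point sets.

    Assumed convention: for each point, take the Euclidean distance to its
    nearest neighbour in the other set, average per set, and sum both directions.
    """
    def mean_nearest(src, dst):
        # Average distance from each point in src to its closest point in dst.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return mean_nearest(points_a, points_b) + mean_nearest(points_b, points_a)


cloud_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
cloud_b = [(0.0, 0.0, 0.0)]
print(chamfer_distance(cloud_a, cloud_b))
```

This brute-force version is quadratic in the number of points, so it is only suitable for small examples.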

Few-Shot Object Detection via Association and DIscrimination

1 code implementation NeurIPS 2021 Yuhang Cao, Jiaqi Wang, Ying Jin, Tong Wu, Kai Chen, Ziwei Liu, Dahua Lin

1) In the association step, in contrast to implicitly leveraging multiple base classes, we construct a compact novel class feature space via explicitly imitating a specific base class feature space.

Few-Shot Object Detection object-detection +2

INTERN: A New Learning Paradigm Towards General Vision

no code implementations 16 Nov 2021 Jing Shao, Siyu Chen, Yangguang Li, Kun Wang, Zhenfei Yin, Yinan He, Jianing Teng, Qinghong Sun, Mengya Gao, Jihao Liu, Gengshi Huang, Guanglu Song, Yichao Wu, Yuming Huang, Fenggang Liu, Huan Peng, Shuo Qin, Chengyu Wang, Yujie Wang, Conghui He, Ding Liang, Yu Liu, Fengwei Yu, Junjie Yan, Dahua Lin, Xiaogang Wang, Yu Qiao

Enormous waves of technological innovations over the past several years, marked by the advances in AI technologies, are profoundly reshaping the industry and the society.

Generative Occupancy Fields for 3D Surface-Aware Image Synthesis

1 code implementation NeurIPS 2021 Xudong Xu, Xingang Pan, Dahua Lin, Bo Dai

In this paper, we propose Generative Occupancy Fields (GOF), a novel model based on generative radiance fields that can learn compact object surfaces without impeding its training convergence.

3D-Aware Image Synthesis

Temporal RoI Align for Video Object Recognition

1 code implementation 8 Sep 2021 Tao Gong, Kai Chen, Xinjiang Wang, Qi Chu, Feng Zhu, Dahua Lin, Nenghai Yu, Huamin Feng

In this work, considering that the features of the same object instance are highly similar among frames in a video, a novel Temporal RoI Align operator is proposed to extract features from other frames' feature maps for current-frame proposals by utilizing feature similarity.

Instance Segmentation object-detection +4

Towards Balanced Learning for Instance Recognition

no code implementations 23 Aug 2021 Jiangmiao Pang, Kai Chen, Qi Li, Zhihai Xu, Huajun Feng, Jianping Shi, Wanli Ouyang, Dahua Lin

In this work, we carefully revisit the standard training practice of detectors, and find that the detection performance is often limited by the imbalance during the training process, which generally occurs at three levels: sample level, feature level, and objective level.

MMOCR: A Comprehensive Toolbox for Text Detection, Recognition and Understanding

1 code implementation 14 Aug 2021 Zhanghui Kuang, Hongbin Sun, Zhizhong Li, Xiaoyu Yue, Tsui Hin Lin, Jianyong Chen, Huaqiang Wei, Yiqin Zhu, Tong Gao, Wenwei Zhang, Kai Chen, Wayne Zhang, Dahua Lin

We present MMOCR, an open-source toolbox that provides a comprehensive pipeline for text detection and recognition, as well as their downstream tasks such as named entity recognition and key information extraction.

Key information extraction named-entity-recognition +2

Vision Transformer with Progressive Sampling

1 code implementation ICCV 2021 Xiaoyu Yue, Shuyang Sun, Zhanghui Kuang, Meng Wei, Philip Torr, Wayne Zhang, Dahua Lin

As a typical example, the Vision Transformer (ViT) directly applies a pure transformer architecture on image classification, by simply splitting images into tokens with a fixed length, and employing transformers to learn relations between these tokens.

Image Classification
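
The fixed-length tokenization that the snippet above attributes to ViT can be sketched as follows. This is a minimal illustration on a single-channel image stored as nested lists; it omits the linear projection and position embeddings a real model applies, and the helper name `split_into_patches` is hypothetical.

```python
def split_into_patches(image, patch_size):
    """Split a 2D image (list of rows) into flattened, non-overlapping patches.

    Each patch becomes one token of length patch_size * patch_size,
    emitted in row-major order over the patch grid.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    tokens = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten the patch row by row into a single token.
            patch = [image[r][c]
                     for r in range(top, top + patch_size)
                     for c in range(left, left + patch_size)]
            tokens.append(patch)
    return tokens


# A 4x4 image with pixel values 0..15, split into four 2x2 patches.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
tokens = split_into_patches(image, 2)
print(tokens)
```

Because the grid is fixed, every image of the same size yields the same number of tokens regardless of content, which is the rigidity the progressive sampling paper sets out to address.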

Probabilistic and Geometric Depth: Detecting Objects in Perspective

1 code implementation 29 Jul 2021 Tai Wang, Xinge Zhu, Jiangmiao Pang, Dahua Lin

As the preliminary depth estimation of each instance is usually inaccurate in this ill-posed setting, we incorporate a probabilistic representation to capture the uncertainty.

Depth Estimation Monocular 3D Object Detection +1

Transcript to Video: Efficient Clip Sequencing from Texts

no code implementations 25 Jul 2021 Yu Xiong, Fabian Caba Heilbron, Dahua Lin

To meet the demands of non-experts, we present Transcript-to-Video, a weakly-supervised framework that uses texts as input to automatically create video sequences from an extensive collection of shots.

Scene-aware Generative Network for Human Motion Synthesis

no code implementations CVPR 2021 Jingbo Wang, Sijie Yan, Bo Dai, Dahua Lin

In this paper, we revisit human motion synthesis, a task useful in various real-world applications.

motion synthesis

WSSOD: A New Pipeline for Weakly- and Semi-Supervised Object Detection

no code implementations 21 May 2021 Shijie Fang, Yuhang Cao, Xinjiang Wang, Kai Chen, Dahua Lin, Wayne Zhang

The performance of object detection, to a great extent, depends on the availability of large annotated datasets.

object-detection Object Detection +1

Revisiting Skeleton-based Action Recognition

3 code implementations CVPR 2022 Haodong Duan, Yue Zhao, Kai Chen, Dahua Lin, Bo Dai

In this work, we propose PoseC3D, a new approach to skeleton-based action recognition, which relies on a 3D heatmap stack instead of a graph sequence as the base representation of human skeletons.

Action Recognition Group Activity Recognition +2

FCOS3D: Fully Convolutional One-Stage Monocular 3D Object Detection

4 code implementations 22 Apr 2021 Tai Wang, Xinge Zhu, Jiangmiao Pang, Dahua Lin

In this paper, we study this problem with a practice built on a fully convolutional single-stage detector and propose a general framework FCOS3D.

Autonomous Driving Monocular 3D Object Detection +1

Visually Informed Binaural Audio Generation without Binaural Audios

no code implementations CVPR 2021 Xudong Xu, Hang Zhou, Ziwei Liu, Bo Dai, Xiaogang Wang, Dahua Lin

Moreover, combined with binaural recordings, our method is able to further boost the performance of binaural audio generation under supervised settings.

Audio Generation

Adversarial Robustness under Long-Tailed Distribution

1 code implementation CVPR 2021 Tong Wu, Ziwei Liu, Qingqiu Huang, Yu Wang, Dahua Lin

We then perform a systematic study on existing long-tailed recognition methods in conjunction with the adversarial training framework.

Adversarial Robustness

Towards Evaluating and Training Verifiably Robust Neural Networks

1 code implementation CVPR 2021 Zhaoyang Lyu, Minghao Guo, Tong Wu, Guodong Xu, Kehuan Zhang, Dahua Lin

Recent works have shown that interval bound propagation (IBP) can be used to train verifiably robust neural networks.
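
As background for the snippet above, the core step of interval bound propagation can be sketched for one linear layer followed by a ReLU. This is a generic textbook-style illustration, not the training procedure studied in the paper; the function names `linear_bounds` and `relu_bounds` and the example weights are assumptions made for the sketch.

```python
def linear_bounds(lower, upper, weights, bias):
    """Propagate elementwise input bounds through y = W x + b.

    For each weight, a positive coefficient pairs the lower input bound with
    the lower output bound; a negative coefficient swaps the two.
    """
    out_lower, out_upper = [], []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, l, u in zip(row, lower, upper):
            if w >= 0:
                lo += w * l
                hi += w * u
            else:
                lo += w * u
                hi += w * l
        out_lower.append(lo)
        out_upper.append(hi)
    return out_lower, out_upper


def relu_bounds(lower, upper):
    """ReLU is monotone, so it can be applied to each bound directly."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]


# Input box: x0 in [-1, 1], x1 in [0, 2].
pre_lo, pre_hi = linear_bounds([-1.0, 0.0], [1.0, 2.0],
                               [[1.0, -1.0], [2.0, 1.0]], [0.0, -1.0])
post_lo, post_hi = relu_bounds(pre_lo, pre_hi)
print(pre_lo, pre_hi, post_lo, post_hi)
```

Stacking these two steps layer by layer yields output bounds that hold for every input in the box, which is what makes the resulting robustness claim verifiable.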

3D Building Reconstruction From Monocular Remote Sensing Images

no code implementations ICCV 2021 Weijia Li, Lingxuan Meng, Jinwang Wang, Conghui He, Gui-Song Xia, Dahua Lin

3D building reconstruction from monocular remote sensing imagery is an important research problem and an economic solution to large-scale city modeling, compared with reconstruction from LiDAR data and multi-view imagery.

3D Reconstruction

CARAFE++: Unified Content-Aware ReAssembly of FEatures

no code implementations 7 Dec 2020 Jiaqi Wang, Kai Chen, Rui Xu, Ziwei Liu, Chen Change Loy, Dahua Lin

Feature reassembly, i.e., feature downsampling and upsampling, is a key operation in a number of modern convolutional network architectures, e.g., residual networks and feature pyramids.

Image Inpainting Instance Segmentation +3

FLAVA: Find, Localize, Adjust and Verify to Annotate LiDAR-Based Point Clouds

no code implementations 20 Nov 2020 Tai Wang, Conghui He, Zhe Wang, Jianping Shi, Dahua Lin

Recent years have witnessed the rapid progress of perception algorithms on top of LiDAR, a widely adopted sensor for autonomous driving systems.

Autonomous Driving

Understanding the wiring evolution in differentiable neural architecture search

1 code implementation 2 Sep 2020 Sirui Xie, Shoukang Hu, Xinjiang Wang, Chunxiao Liu, Jianping Shi, Xunying Liu, Dahua Lin

To this end, we pose questions that future differentiable methods for neural wiring discovery need to confront, hoping to evoke a discussion and rethinking on how much bias has been enforced implicitly in existing NAS methods.

Neural Architecture Search

Online Multi-modal Person Search in Videos

no code implementations ECCV 2020 Jiangyue Xia, Anyi Rao, Qingqiu Huang, Linning Xu, Jiangtao Wen, Dahua Lin

The task of searching for certain people in videos has seen increasing potential in real-world applications, such as video organization and editing.

Person Recognition Person Search +1

A Unified Framework for Shot Type Classification Based on Subject Centric Lens

no code implementations ECCV 2020 Anyi Rao, Jiaze Wang, Linning Xu, Xuekun Jiang, Qingqiu Huang, Bolei Zhou, Dahua Lin

Shots are key narrative elements of various videos, e.g., movies, TV series, and user-generated videos that are thriving over the Internet.

General Classification

Cylinder3D: An Effective 3D Framework for Driving-scene LiDAR Semantic Segmentation

2 code implementations 4 Aug 2020 Hui Zhou, Xinge Zhu, Xiao Song, Yuexin Ma, Zhe Wang, Hongsheng Li, Dahua Lin

A straightforward solution to tackle the issue of 3D-to-2D projection is to keep the 3D representation and process the points in the 3D space.

3D Semantic Segmentation LIDAR Semantic Segmentation

MovieNet: A Holistic Dataset for Movie Understanding

no code implementations ECCV 2020 Qingqiu Huang, Yu Xiong, Anyi Rao, Jiaze Wang, Dahua Lin

We believe that such a holistic dataset would promote research on story-based long video understanding and beyond.

Video Understanding

Distribution-Balanced Loss for Multi-Label Classification in Long-Tailed Datasets

1 code implementation ECCV 2020 Tong Wu, Qingqiu Huang, Ziwei Liu, Yu Wang, Dahua Lin

We present a new loss function called Distribution-Balanced Loss for the multi-label recognition problems that exhibit long-tailed class distributions.

General Classification Multi-Label Classification

Learn to Propagate Reliably on Noisy Affinity Graphs

no code implementations ECCV 2020 Lei Yang, Qingqiu Huang, Huaiyi Huang, Linning Xu, Dahua Lin

Recent works have shown that exploiting unlabeled data through label propagation can substantially reduce the labeling cost, which has been a critical issue in developing visual recognition models.

Novel Policy Seeking with Constrained Optimization

no code implementations 21 May 2020 Hao Sun, Zhenghao Peng, Bo Dai, Jian Guo, Dahua Lin, Bolei Zhou

In problem-solving, we humans can come up with multiple novel solutions to the same problem.

reinforcement-learning

Intra- and Inter-Action Understanding via Temporal Action Parsing

no code implementations CVPR 2020 Dian Shao, Yue Zhao, Bo Dai, Dahua Lin

Current methods for action recognition primarily rely on deep convolutional networks to derive feature embeddings of visual and motion features.

Action Parsing Action Recognition +1

Evolutionary Stochastic Policy Distillation

2 code implementations 27 Apr 2020 Hao Sun, Xinyu Pan, Bo Dai, Dahua Lin, Bolei Zhou

Solving the Goal-Conditioned Reward Sparse (GCRS) task is a challenging reinforcement learning problem due to the sparsity of reward signals.

Feature Pyramid Grids

1 code implementation 7 Apr 2020 Kai Chen, Yuhang Cao, Chen Change Loy, Dahua Lin, Christoph Feichtenhofer

Feature pyramid networks have been widely adopted in the object detection literature to improve feature representations for better handling of variations in scale.

Neural Architecture Search object-detection +2

Self-Supervised Scene De-occlusion

2 code implementations CVPR 2020 Xiaohang Zhan, Xingang Pan, Bo Dai, Ziwei Liu, Dahua Lin, Chen Change Loy

This is achieved via Partial Completion Network (PCNet)-mask (M) and -content (C), that learn to recover fractions of object masks and contents, respectively, in a self-supervised manner.

Image Manipulation Scene Understanding

A Local-to-Global Approach to Multi-modal Movie Scene Segmentation

3 code implementations CVPR 2020 Anyi Rao, Linning Xu, Yu Xiong, Guodong Xu, Qingqiu Huang, Bolei Zhou, Dahua Lin

Scene, as the crucial unit of storytelling in movies, contains complex activities of actors and their interactions in a physical environment.

Action Recognition Scene Segmentation

Reconfigurable Voxels: A New Representation for LiDAR-Based Point Clouds

no code implementations 6 Apr 2020 Tai Wang, Xinge Zhu, Dahua Lin

LiDAR is an important method for autonomous driving systems to sense the environment.

Autonomous Driving

SSN: Shape Signature Networks for Multi-class Object Detection from Point Clouds

1 code implementation 6 Apr 2020 Xinge Zhu, Yuexin Ma, Tai Wang, Yan Xu, Jianping Shi, Dahua Lin

Multi-class 3D object detection aims to localize and classify objects of multiple categories from point clouds.

3D Object Detection object-detection

Learning to Cluster Faces via Confidence and Connectivity Estimation

2 code implementations CVPR 2020 Lei Yang, Dapeng Chen, Xiaohang Zhan, Rui Zhao, Chen Change Loy, Dahua Lin

With the vertex confidence and edge connectivity, we can naturally organize more relevant vertices on the affinity graph and group them into clusters.

Connectivity Estimation Face Clustering

Omni-sourced Webly-supervised Learning for Video Recognition

3 code implementations ECCV 2020 Haodong Duan, Yue Zhao, Yuanjun Xiong, Wentao Liu, Dahua Lin

Then a joint-training strategy is proposed to deal with the domain gaps between multiple data sources and formats in webly-supervised learning.

Ranked #2 on Action Recognition on UCF101 (using extra training data)

Action Classification Action Recognition +1

Learning Diverse Fashion Collocation by Neural Graph Filtering

no code implementations 11 Mar 2020 Xin Liu, Yongbin Sun, Ziwei Liu, Dahua Lin

To facilitate a comprehensive study on diverse fashion collocation, we reorganize Amazon Fashion dataset with carefully designed evaluation protocols.

Recommendation Systems

DSNAS: Direct Neural Architecture Search without Parameter Retraining

1 code implementation CVPR 2020 Shoukang Hu, Sirui Xie, Hehui Zheng, Chunxiao Liu, Jianping Shi, Xunying Liu, Dahua Lin

We argue that given a computer vision task for which a NAS method is expected, this definition can reduce the vaguely-defined NAS evaluation to i) accuracy of this task and ii) the total computation consumed to finally obtain a model with satisfying accuracy.

Neural Architecture Search

Real or Not Real, that is the Question

2 code implementations ICLR 2020 Yuanbo Xiangli, Yubin Deng, Bo Dai, Chen Change Loy, Dahua Lin

While generative adversarial networks (GAN) have been widely adopted in various topics, in this paper we generalize the standard GAN to a new perspective by treating realness as a random variable that can be estimated from multiple angles.

Regularizing Reasons for Outfit Evaluation with Gradient Penalty

no code implementations 2 Feb 2020 Xingxing Zou, Zhizhong Li, Ke Bai, Dahua Lin, Waikeung Wong

In this paper, we build an outfit evaluation system which provides feedback consisting of a judgment with a convincing explanation.

Side-Aware Boundary Localization for More Precise Object Detection

1 code implementation ECCV 2020 Jiaqi Wang, Wenwei Zhang, Yuhang Cao, Kai Chen, Jiangmiao Pang, Tao Gong, Jianping Shi, Chen Change Loy, Dahua Lin

To tackle the difficulty of precise localization in the presence of displacements with large variance, we further propose a two-step localization scheme, which first predicts a range of movement through bucket prediction and then pinpoints the precise position within the predicted bucket.

object-detection Object Detection
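
The two-step idea in the snippet above (first choose a coarse bucket, then refine the position inside it) can be sketched as an encode/decode pair for a single boundary coordinate. This is an illustrative simplification under assumed conventions (equal-width buckets, offset measured from the bucket centre in units of bucket width), not the paper's exact formulation; all names are hypothetical.

```python
def encode_boundary(position, range_start, bucket_width, num_buckets):
    """Return (bucket index, normalized offset from the bucket centre)."""
    index = int((position - range_start) // bucket_width)
    # Clamp so positions outside the range map to the nearest edge bucket.
    index = min(max(index, 0), num_buckets - 1)
    center = range_start + (index + 0.5) * bucket_width
    offset = (position - center) / bucket_width
    return index, offset


def decode_boundary(index, offset, range_start, bucket_width):
    """Invert encode_boundary: coarse bucket centre plus fine offset."""
    center = range_start + (index + 0.5) * bucket_width
    return center + offset * bucket_width


# A boundary at x = 21 within a search range starting at 0, with 8-pixel buckets.
index, offset = encode_boundary(21.0, 0.0, 8.0, 7)
recovered = decode_boundary(index, offset, 0.0, 8.0)
print(index, offset, recovered)
```

In a detector, the bucket index would be a classification target and the offset a regression target; splitting the problem this way keeps the regression range small even when the overall displacement varies widely.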

Fastened CROWN: Tightened Neural Network Robustness Certificates

1 code implementation 2 Dec 2019 Zhaoyang Lyu, Ching-Yun Ko, Zhifeng Kong, Ngai Wong, Dahua Lin, Luca Daniel

We draw inspiration from such work and further demonstrate the optimality of deterministic CROWN (Zhang et al. 2018) solutions in a given linear programming problem under mild constraints.

Learning a Decision Module by Imitating Driver's Control Behaviors

no code implementations 30 Nov 2019 Junning Huang, Sirui Xie, Jiankai Sun, Qiurui Ma, Chunxiao Liu, Jianping Shi, Dahua Lin, Bolei Zhou

In this work, we propose a hybrid framework to learn neural decisions in the classical modular pipeline through end-to-end imitation learning.

Autonomous Driving Imitation Learning

Learning to Synthesize Fashion Textures

no code implementations 18 Nov 2019 Wu Shi, Tak-Wai Hui, Ziwei Liu, Dahua Lin, Chen Change Loy

Another important observation is that fashion textures are multi-modal.

Policy Continuation with Hindsight Inverse Dynamics

1 code implementation NeurIPS 2019 Hao Sun, Zhizhong Li, Xiaotong Liu, Dahua Lin, Bolei Zhou

This approach learns from Hindsight Inverse Dynamics based on Hindsight Experience Replay, enabling the learning process in a self-imitated manner and thus can be trained with supervised learning.

reinforcement-learning

A Graph-Based Framework to Bridge Movies and Synopses

no code implementations ICCV 2019 Yu Xiong, Qingqiu Huang, Lingfeng Guo, Hang Zhou, Bolei Zhou, Dahua Lin

On top of this dataset, we develop a framework to perform matching between movie segments and synopsis paragraphs.

Learning with Social Influence through Interior Policy Differentiation

no code implementations 25 Sep 2019 Hao Sun, Bo Dai, Jiankai Sun, Zhenghao Peng, Guodong Xu, Dahua Lin, Bolei Zhou

In this work we model the social influence into the scheme of reinforcement learning, enabling the agents to learn both from the environment and from their peers.

reinforcement-learning

Regulatory Focus: Promotion and Prevention Inclinations in Policy Search

no code implementations 25 Sep 2019 Lanxin Lei, Zhizhong Li, Xiaoyang Li, Cong Qiu, Dahua Lin

The estimation of advantage is crucial for a number of reinforcement learning algorithms, as it directly influences the choices of future paths.

Atari Games Continuous Control +1

Biased Estimates of Advantages over Path Ensembles

no code implementations 15 Sep 2019 Lanxin Lei, Zhizhong Li, Dahua Lin

The estimation of advantage is crucial for a number of reinforcement learning algorithms, as it directly influences the choices of future paths.

Atari Games Continuous Control +1

Open Compound Domain Adaptation

no code implementations CVPR 2020 Ziwei Liu, Zhongqi Miao, Xingang Pan, Xiaohang Zhan, Dahua Lin, Stella X. Yu, Boqing Gong

A typical domain adaptation approach is to adapt models trained on the annotated data in a source domain (e.g., sunny weather) for achieving high performance on the test data in a target domain (e.g., rainy weather).

Domain Adaptation Facial Expression Recognition +1

Recursive Visual Sound Separation Using Minus-Plus Net

no code implementations ICCV 2019 Xudong Xu, Bo Dai, Dahua Lin

Sounds provide rich semantics, complementary to visual data, for many tasks.

POPQORN: Quantifying Robustness of Recurrent Neural Networks

2 code implementations 17 May 2019 Ching-Yun Ko, Zhaoyang Lyu, Tsui-Wei Weng, Luca Daniel, Ngai Wong, Dahua Lin

The vulnerability to adversarial attacks has been a critical issue for deep neural networks.

CARAFE: Content-Aware ReAssembly of FEatures

2 code implementations ICCV 2019 Jiaqi Wang, Kai Chen, Rui Xu, Ziwei Liu, Chen Change Loy, Dahua Lin

CARAFE introduces little computational overhead and can be readily integrated into modern network architectures.

Instance Segmentation object-detection +2

Prime Sample Attention in Object Detection

1 code implementation CVPR 2020 Yuhang Cao, Kai Chen, Chen Change Loy, Dahua Lin

Our experiments demonstrate that it is often more effective to focus on prime samples than hard samples when training a detector.

object-detection Object Detection

Libra R-CNN: Towards Balanced Learning for Object Detection

5 code implementations CVPR 2019 Jiangmiao Pang, Kai Chen, Jianping Shi, Huajun Feng, Wanli Ouyang, Dahua Lin

In this work, we carefully revisit the standard training practice of detectors, and find that the detection performance is often limited by the imbalance during the training process, which generally occurs at three levels: sample level, feature level, and objective level.

object-detection Object Detection

Learning to Cluster Faces on an Affinity Graph

2 code implementations CVPR 2019 Lei Yang, Xiaohang Zhan, Dapeng Chen, Junjie Yan, Chen Change Loy, Dahua Lin

Face recognition has seen remarkable progress in recent years, and its performance has reached a very high level.

Face Recognition

Self-Supervised Learning via Conditional Motion Propagation

1 code implementation CVPR 2019 Xiaohang Zhan, Xingang Pan, Ziwei Liu, Dahua Lin, Chen Change Loy

Instead of explicitly modeling the motion probabilities, we design the pretext task as a conditional motion propagation problem.

Human Parsing Instance Segmentation +2

Hybrid Task Cascade for Instance Segmentation

5 code implementations CVPR 2019 Kai Chen, Jiangmiao Pang, Jiaqi Wang, Yu Xiong, Xiaoxiao Li, Shuyang Sun, Wansen Feng, Ziwei Liu, Jianping Shi, Wanli Ouyang, Chen Change Loy, Dahua Lin

In exploring a more effective approach, we find that the key to a successful instance segmentation cascade is to fully leverage the reciprocal relationship between detection and segmentation.

Instance Segmentation object-detection +2

Region Proposal by Guided Anchoring

2 code implementations CVPR 2019 Jiaqi Wang, Kai Chen, Shuo Yang, Chen Change Loy, Dahua Lin

State-of-the-art detectors mostly rely on a dense anchoring scheme, where anchors are sampled uniformly over the spatial domain with a predefined set of scales and aspect ratios.

object-detection Object Detection +1

Monocular 3D Pose Recovery via Nonconvex Sparsity with Theoretical Analysis

no code implementations 29 Dec 2018 Jianqiao Wangni, Dahua Lin, Ji Liu, Kostas Daniilidis, Jianbo Shi

For recovering 3D object poses from 2D images, a prevalent method is to pre-train an over-complete dictionary $\mathcal D=\{B_i\}_i^D$ of 3D basis poses.

IRLAS: Inverse Reinforcement Learning for Architecture Search

1 code implementation CVPR 2019 Minghao Guo, Zhao Zhong, Wei Wu, Dahua Lin, Junjie Yan

Motivated by the fact that human-designed networks are elegant in topology with a fast inference speed, we propose a mirror stimuli function inspired by biological cognition theory to extract the abstract topological knowledge of an expert human-design network (ResNeXt).

Neural Architecture Search reinforcement-learning

An Embarrassingly Simple Approach for Knowledge Distillation

1 code implementation 5 Dec 2018 Mengya Gao, Yujun Shen, Quanquan Li, Junjie Yan, Liang Wan, Dahua Lin, Chen Change Loy, Xiaoou Tang

Knowledge Distillation (KD) aims at improving the performance of a low-capacity student model by inheriting knowledge from a high-capacity teacher model.

Face Recognition Knowledge Distillation +3

A Neural Compositional Paradigm for Image Captioning

1 code implementation NeurIPS 2018 Bo Dai, Sanja Fidler, Dahua Lin

Mainstream captioning models often follow a sequential structure to generate captions, leading to issues such as introduction of irrelevant semantics, lack of diversity in the generated captions, and inadequate generalization performance.

Image Captioning

Improving On-policy Learning with Statistical Reward Accumulation

no code implementations 7 Sep 2018 Yubin Deng, Ke Yu, Dahua Lin, Xiaoou Tang, Chen Change Loy

Most methods in deep-RL achieve good results via the maximization of the reward signal provided by the environment, typically in the form of discounted cumulative returns.

Atari Games
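
The "discounted cumulative returns" mentioned in the snippet above are the standard quantity G_t = r_t + gamma * G_{t+1}. A minimal sketch of computing them for one episode follows; it shows only this conventional objective, not the statistical reward accumulation the paper proposes, and the function name is illustrative.

```python
def discounted_returns(rewards, gamma):
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    running = 0.0
    # Walk backwards so each return reuses the one computed for the next step.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


episode_returns = discounted_returns([1.0, 0.0, 2.0], gamma=0.5)
print(episode_returns)
```

The backward pass keeps the computation linear in episode length instead of re-summing the tail of the reward sequence at every step.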

Consensus-Driven Propagation in Massive Unlabeled Data for Face Recognition

1 code implementation ECCV 2018 Xiaohang Zhan, Ziwei Liu, Junjie Yan, Dahua Lin, Chen Change Loy

Face recognition has witnessed great progress in recent years, mainly attributed to the high-capacity model designed and the abundant labeled data collected.

Face Recognition

Penalizing Top Performers: Conservative Loss for Semantic Segmentation Adaptation

no code implementations ECCV 2018 Xinge Zhu, Hui Zhou, Ceyuan Yang, Jianping Shi, Dahua Lin

Due to the expensive and time-consuming annotations (e.g., segmentation) for real-world images, recent works in computer vision resort to synthetic data.

Domain Adaptation Semantic Segmentation

PSANet: Point-wise Spatial Attention Network for Scene Parsing

4 code implementations ECCV 2018 Hengshuang Zhao, Yi Zhang, Shu Liu, Jianping Shi, Chen Change Loy, Dahua Lin, Jiaya Jia

We notice information flow in convolutional neural networks is restricted inside local neighborhood regions due to the physical design of convolutional filters, which limits the overall understanding of complex scenes.

Scene Parsing Semantic Segmentation

Find and Focus: Retrieve and Localize Video Events with Natural Language Queries

no code implementations ECCV 2018 Dian Shao, Yu Xiong, Yue Zhao, Qingqiu Huang, Yu Qiao, Dahua Lin

The thriving of video sharing services brings new challenges to video retrieval, e.g., the rapid growth in video duration and content diversity.

Video Retrieval

Generative Adversarial Frontal View to Bird View Synthesis

no code implementations 1 Aug 2018 Xinge Zhu, Zhichao Yin, Jianping Shi, Hongsheng Li, Dahua Lin

Due to the large gap and severe deformation between the frontal view and bird view, generating a bird view image from a single frontal view is challenging.

Bird View Synthesis Homography Estimation +1

Pose Guided Human Video Generation

no code implementations ECCV 2018 Ceyuan Yang, Zhe Wang, Xinge Zhu, Chen Huang, Jianping Shi, Dahua Lin

Human pose, on the other hand, can represent motion patterns intrinsically and interpretably, and impose the geometric constraints regardless of appearance.

motion prediction Video Generation

Person Search in Videos with One Portrait Through Visual and Temporal Links

2 code implementations ECCV 2018 Qingqiu Huang, Wentao Liu, Dahua Lin

In real-world applications, e.g., law enforcement and video retrieval, one often needs to search for a certain person in long videos with just one portrait.

Person Re-Identification Person Search +1

Move Forward and Tell: A Progressive Generator of Video Descriptions

no code implementations ECCV 2018 Yilei Xiong, Bo Dai, Dahua Lin

We present an efficient framework that can generate a coherent paragraph to describe a given video.

Video Captioning

Rethinking the Form of Latent States in Image Captioning

no code implementations ECCV 2018 Bo Dai, Deming Ye, Dahua Lin

Taking advantage of this, we visually reveal the internal dynamics in the process of caption generation, as well as the connections between input visual domain and output linguistic domain.

Image Captioning

Probabilistic Ensemble of Collaborative Filters

no code implementations 26 Jun 2018 Zhiyu Min, Dahua Lin

Collaborative filtering is an important technique for recommendation.

Collaborative Filtering

From Trailers to Storylines: An Efficient Way to Learn from Movies

1 code implementation 14 Jun 2018 Qingqiu Huang, Yuanjun Xiong, Yu Xiong, Yuqi Zhang, Dahua Lin

Experiments on this dataset showed that the proposed method can substantially reduce the training time while obtaining highly effective features and coherent temporal structures.

Unifying Identification and Context Learning for Person Recognition

1 code implementation CVPR 2018 Qingqiu Huang, Yu Xiong, Dahua Lin

In this work, we aim to move beyond such limitations and propose a new framework to leverage context for person recognition.

Face Recognition Person Recognition

Unsupervised Feature Learning via Non-Parametric Instance Discrimination

2 code implementations CVPR 2018 Zhirong Wu, Yuanjun Xiong, Stella X. Yu, Dahua Lin

Neural net classifiers trained on data with annotated class labels can also capture apparent visual similarity among categories without being directed to do so.

General Classification object-detection +3

Recognize Actions by Disentangling Components of Dynamics

no code implementations CVPR 2018 Yue Zhao, Yuanjun Xiong, Dahua Lin

Despite the remarkable progress in action recognition over the past several years, existing methods remain limited in efficiency and effectiveness.

Action Recognition Optical Flow Estimation +1

Learning Globally Optimized Object Detector via Policy Gradient

no code implementations CVPR 2018 Yongming Rao, Dahua Lin, Jiwen Lu, Jie Zhou

In this paper, we propose a simple yet effective method to learn a globally optimized detector for object detection, which is a simple modification to the standard cross-entropy gradient inspired by the REINFORCE algorithm.

Object Detection

Unsupervised Feature Learning via Non-Parametric Instance-level Discrimination

14 code implementations 5 May 2018 Zhirong Wu, Yuanjun Xiong, Stella Yu, Dahua Lin

Neural net classifiers trained on data with annotated class labels can also capture apparent visual similarity among categories without being directed to do so.

General Classification object-detection +1

Optimizing Video Object Detection via a Scale-Time Lattice

1 code implementation CVPR 2018 Kai Chen, Jiaqi Wang, Shuo Yang, Xingcheng Zhang, Yuanjun Xiong, Chen Change Loy, Dahua Lin

High-performance object detection relies on expensive convolutional networks to compute features, often leading to significant challenges in applications, e.g., those that require detecting objects from video streams in real time.

object-detection Video Object Detection

Accelerated Training for Massive Classification via Dynamic Class Selection

no code implementations 5 Jan 2018 Xingcheng Zhang, Lei Yang, Junjie Yan, Dahua Lin

Massive classification, a classification task defined over a vast number of classes (hundreds of thousands or even millions), has become an essential part of many real-world systems, such as face recognition.

Classification Face Recognition +1

Peephole: Predicting Network Performance Before Training

1 code implementation 9 Dec 2017 Boyang Deng, Junjie Yan, Dahua Lin

The quest for performant networks has been a significant force that drives the advancements of deep learning in recent years.

Learning Sparse Visual Representations with Leaky Capped Norm Regularizers

no code implementations 8 Nov 2017 Jianqiao Wangni, Dahua Lin

To the best of our knowledge, this is the first convergence analysis of the 3D recovery problem.

Be Your Own Prada: Fashion Synthesis with Structural Coherence

no code implementations ICCV 2017 Shizhan Zhu, Sanja Fidler, Raquel Urtasun, Dahua Lin, Chen Change Loy

In the second stage, a generative model with a newly proposed compositional mapping layer is used to render the final image with precise regions and textures conditioned on this map.

Fashion Synthesis Semantic Segmentation

Contrastive Learning for Image Captioning

no code implementations NeurIPS 2017 Bo Dai, Dahua Lin

Specifically, via two constraints formulated on top of a reference model, the proposed method can encourage distinctiveness, while maintaining the overall quality of the generated captions.

Contrastive Learning Image Captioning

Scalable Estimation of Dirichlet Process Mixture Models on Distributed Data

no code implementations 19 Sep 2017 Ruohui Wang, Dahua Lin

We consider the estimation of Dirichlet Process Mixture Models (DPMMs) in distributed environments, where data are distributed across multiple computing nodes.

Integrating Specialized Classifiers Based on Continuous Time Markov Chain

no code implementations 7 Sep 2017 Zhizhong Li, Dahua Lin

Specialized classifiers, namely those dedicated to a subset of classes, are often adopted in real-world recognition systems.

Discover and Learn New Objects from Documentaries

1 code implementation CVPR 2017 Kai Chen, Hang Song, Chen Change Loy, Dahua Lin

Despite the remarkable progress in recent years, detecting objects in a new context remains a challenging task.

Temporal Segment Networks for Action Recognition in Videos

8 code implementations 8 May 2017 Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, Luc van Gool

Furthermore, based on the temporal segment networks, we won the video classification track at the ActivityNet challenge 2016 among 24 teams, which demonstrates the effectiveness of TSN and the proposed good practices.

Ranked #17 on Action Classification on Moments in Time (Top 5 Accuracy metric)

Action Classification Action Recognition +2

Towards Diverse and Natural Image Descriptions via a Conditional GAN

1 code implementation ICCV 2017 Bo Dai, Sanja Fidler, Raquel Urtasun, Dahua Lin

Despite the substantial progress in recent years, the image captioning techniques are still far from being perfect. Sentences produced by existing methods, e.g., those based on RNNs, are often overly rigid and lacking in variability.

Image Captioning

UntrimmedNets for Weakly Supervised Action Recognition and Detection

2 code implementations CVPR 2017 Limin Wang, Yuanjun Xiong, Dahua Lin, Luc van Gool

We exploit the learned models for action recognition (WSR) and detection (WSD) on the untrimmed video datasets of THUMOS14 and ActivityNet.

Weakly Supervised Action Localization Weakly-Supervised Action Recognition

PolyNet: A Pursuit of Structural Diversity in Very Deep Networks

3 code implementations CVPR 2017 Xingcheng Zhang, Zhizhong Li, Chen Change Loy, Dahua Lin

A number of studies have shown that increasing the depth or width of convolutional networks is a rewarding approach to improve the performance of image recognition.

Image Classification

Deep Markov Random Field for Image Modeling

1 code implementation 7 Sep 2016 Zhirong Wu, Dahua Lin, Xiaoou Tang

Markov Random Fields (MRFs), a formulation widely used in generative image modeling, have long been plagued by the lack of expressive power.

Temporal Segment Networks: Towards Good Practices for Deep Action Recognition

21 code implementations 2 Aug 2016 Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, Luc van Gool

The other contribution is our study on a series of good practices in learning ConvNets on video data with the help of temporal segment network.

Action Classification Action Recognition +3

CUHK & ETHZ & SIAT Submission to ActivityNet Challenge 2016

1 code implementation 2 Aug 2016 Yuanjun Xiong, Li-Min Wang, Zhe Wang, Bo-Wen Zhang, Hang Song, Wei Li, Dahua Lin, Yu Qiao, Luc van Gool, Xiaoou Tang

This paper presents the method that underlies our submission to the untrimmed video classification task of ActivityNet Challenge 2016.

General Classification Video Classification

Adjustable Bounded Rectifiers: Towards Deep Binary Representations

no code implementations 19 Nov 2015 Zhirong Wu, Dahua Lin, Xiaoou Tang

This suggests that the semantic structure of a neural network may be manifested through a guided binarization process.

Binarization

Recognize Complex Events From Static Images by Fusing Deep Channels

no code implementations CVPR 2015 Yuanjun Xiong, Kai Zhu, Dahua Lin, Xiaoou Tang

A considerable portion of web images capture events that occur in our personal lives or social activities.

Generating Multi-Sentence Lingual Descriptions of Indoor Scenes

no code implementations 28 Feb 2015 Dahua Lin, Chen Kong, Sanja Fidler, Raquel Urtasun

This paper proposes a novel framework for generating lingual descriptions of indoor scenes.

Text Generation
