Search Results for author: Liang Lin

Found 285 papers, 114 papers with code

Larger Norm More Transferable: An Adaptive Feature Norm Approach for Unsupervised Domain Adaptation

3 code implementations • ICCV 2019 • Ruijia Xu, Guanbin Li, Jihan Yang, Liang Lin

Domain adaptation enables the learner to safely generalize into novel environments by mitigating domain shifts across distributions.

Ranked #6 on Domain Adaptation on ImageCLEF-DA

Partial Domain Adaptation Transfer Learning +1

3,120

Joint Detection and Identification Feature Learning for Person Search

2 code implementations • CVPR 2017 • Tong Xiao, Shuang Li, Bochao Wang, Liang Lin, Xiaogang Wang

Existing person re-identification benchmarks and methods mainly focus on matching cropped pedestrian images between queries and candidates.

Ranked #9 on Person Re-Identification on CUHK03

Pedestrian Detection Person Re-Identification +1

732

Identity-Preserving Talking Face Generation with Landmark and Appearance Priors

1 code implementation • CVPR 2023 • Weizhi Zhong, Chaowei Fang, Yinqi Cai, Pengxu Wei, Gangming Zhao, Liang Lin, Guanbin Li

Prior landmark characteristics of the speaker's face are employed to make the generated landmarks coincide with the facial outline of the speaker.

Talking Face Generation

543

Toward Characteristic-Preserving Image-based Virtual Try-On Network

5 code implementations • ECCV 2018 • Bochao Wang, Huabin Zheng, Xiaodan Liang, Yimin Chen, Liang Lin, Meng Yang

Second, to alleviate boundary artifacts of warped clothes and make the results more realistic, we employ a Try-On Module that learns a composition mask to integrate the warped clothes and the rendered image to ensure smoothness.

Geometric Matching Virtual Try-on

460

3D Human Pose Machines with Self-supervised Learning

2 code implementations • arXiv.org 2019 • Keze Wang, Liang Lin, Chenhan Jiang, Chen Qian, Pengxu Wei

Driven by recent computer vision and robotic applications, recovering 3D human poses has become increasingly important and has attracted growing interest.

Ranked #263 on 3D Human Pose Estimation on Human3.6M

3D Human Pose Estimation Self-Supervised Learning

409

Instance-level Human Parsing via Part Grouping Network

1 code implementation • ECCV 2018 • Ke Gong, Xiaodan Liang, Yicheng Li, Yimin Chen, Ming Yang, Liang Lin

Instance-level human parsing towards real-world human analysis scenarios is still under-explored due to the absence of sufficient data resources and technical difficulty in parsing multiple instances in a single pass.

Ranked #6 on Human Part Segmentation on CIHP

Edge Detection Human Parsing +2

408

Look into Person: Joint Body Parsing & Pose Estimation Network and A New Benchmark

3 code implementations • 5 Apr 2018 • Xiaodan Liang, Ke Gong, Xiaohui Shen, Liang Lin

To further explore and take advantage of the semantic correlation of these two tasks, we propose a novel joint human parsing and pose estimation network to explore efficient context modeling, which can simultaneously predict parsing and pose with extremely high quality.

Ranked #10 on Semantic Segmentation on LIP val

Human Parsing Pose Estimation +1

369

Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models

1 code implementation • 23 May 2023 • Weifeng Chen, Yatai Ji, Jie Wu, Hefeng Wu, Pan Xie, Jiashi Li, Xin Xia, Xuefeng Xiao, Liang Lin

Based on a pre-trained conditional text-to-image (T2I) diffusion model, our model aims to generate videos conditioned on a sequence of control signals, such as edge or depth maps.

Optical Flow Estimation Style Transfer +4

335

EagleEye: Fast Sub-net Evaluation for Efficient Neural Network Pruning

1 code implementation • ECCV 2020 • Bailin Li, Bowen Wu, Jiang Su, Guangrun Wang, Liang Lin

Many algorithms try to predict model performance of the pruned sub-nets by introducing various evaluation methods.

Ranked #5 on Network Pruning on ImageNet

Efficient Neural Network Network Pruning

301

Graphonomy: Universal Human Parsing via Graph Transfer Learning

1 code implementation • CVPR 2019 • Ke Gong, Yiming Gao, Xiaodan Liang, Xiaohui Shen, Meng Wang, Liang Lin

By distilling universal semantic graph representation to each specific task, Graphonomy is able to predict all levels of parsing labels in one system without piling up the complexity.

Human Parsing Transfer Learning

287

Graphonomy: Universal Image Parsing via Graph Reasoning and Transfer

2 code implementations • 26 Jan 2021 • Liang Lin, Yiming Gao, Ke Gong, Meng Wang, Xiaodan Liang

Prior highly-tuned image parsing models are usually studied in a certain domain with a specific set of semantic labels and can hardly be adapted into other scenarios (e.g., sharing discrepant label granularity) without extensive re-training.

Graph Representation Learning Human Parsing +2

287

Single View Stereo Matching

1 code implementation • CVPR 2018 • Yue Luo, Jimmy Ren, Mude Lin, Jiahao Pang, Wenxiu Sun, Hongsheng Li, Liang Lin

The resulting model outperforms all the previous monocular depth estimation methods as well as the stereo block matching method on the challenging KITTI dataset while using only a small amount of real training data.

Ranked #42 on Monocular Depth Estimation on KITTI Eigen split

Monocular Depth Estimation Stereo Matching +1

280

LSTM Pose Machines

1 code implementation • CVPR 2018 • Yue Luo, Jimmy Ren, Zhouxia Wang, Wenxiu Sun, Jinshan Pan, Jianbo Liu, Jiahao Pang, Liang Lin

Such suboptimal results are mainly attributed to the inability to impose sequential geometric consistency and to handle severe image quality degradation (e.g., motion blur and occlusion), as well as the inability to capture the temporal correlation among video frames.

Ranked #3 on Pose Estimation on J-HMDB

2D Human Pose Estimation Pose Estimation

274

Blockwisely Supervised Neural Architecture Search with Knowledge Distillation

1 code implementation • 29 Nov 2019 • Changlin Li, Jiefeng Peng, Liuchun Yuan, Guangrun Wang, Xiaodan Liang, Liang Lin, Xiaojun Chang

Moreover, we find that the knowledge of a network model lies not only in the network parameters but also in the network architecture.

Ranked #1 on Neural Architecture Search on CIFAR-100

Knowledge Distillation Neural Architecture Search

230

DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions

1 code implementation • 2 Mar 2024 • Guangrun Wang, Changlin Li, Liuchun Yuan, Jiefeng Peng, Xiaoyu Xian, Xiaodan Liang, Xiaojun Chang, Liang Lin

Addressing this problem, we modularize a large search space into blocks with small search spaces and develop a family of models with the distilling neural architecture (DNA) techniques.

Neural Architecture Search

230

Look into Person: Self-supervised Structure-sensitive Learning and A New Benchmark for Human Parsing

1 code implementation • CVPR 2017 • Ke Gong, Xiaodan Liang, Dongyu Zhang, Xiaohui Shen, Liang Lin

Human parsing has recently attracted a lot of research interest due to its huge application potential.

Ranked #13 on Semantic Segmentation on LIP val

Human Parsing Self-Supervised Learning +1

227

Multi-level Wavelet-CNN for Image Restoration

5 code implementations • 18 May 2018 • Pengju Liu, Hongzhi Zhang, Kai Zhang, Liang Lin, WangMeng Zuo

With the modified U-Net architecture, wavelet transform is introduced to reduce the size of feature maps in the contracting subnetwork.

Ranked #2 on Grayscale Image Denoising on Set12 sigma25

Computational Efficiency Image Denoising +2

220

Crafting a Toolchain for Image Restoration by Deep Reinforcement Learning

2 code implementations • CVPR 2018 • Ke Yu, Chao Dong, Liang Lin, Chen Change Loy

We investigate a novel approach for image restoration by reinforcement learning.

Image Restoration reinforcement-learning +1

208

Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation

1 code implementation • 5 Dec 2023 • Shanshan Zhong, Zhongzhan Huang, ShangHua Gao, Wushao Wen, Liang Lin, Marinka Zitnik, Pan Zhou

To this end, we study LLMs on the popular Oogiri game, which requires participants to have good creativity and strong associative thinking to respond unexpectedly and humorously to a given image, text, or both, and is thus suitable for LoT study.

Logical Reasoning

190

Component Divide-and-Conquer for Real-World Image Super-Resolution

1 code implementation • ECCV 2020 • Pengxu Wei, Ziwei Xie, Hannan Lu, Zongyuan Zhan, Qixiang Ye, WangMeng Zuo, Liang Lin

An SR model learned with a conventional pixel-wise loss is usually dominated by flat regions and edges, and fails to infer realistic details of complex textures.

Image Super-Resolution

176

Weakly Supervised Person Re-ID: Differentiable Graphical Learning and A New Benchmark

1 code implementation • 8 Apr 2019 • Guangrun Wang, Guangcong Wang, Xujie Zhang, Jian-Huang Lai, Zhengtao Yu, Liang Lin

Learning a Re-ID model with bag-level annotation is called the weakly supervised Re-ID problem.

Ranked #2 on Person Re-Identification on SYSU-30k

Person Re-Identification Pseudo Label

162

Learning Semantic-Specific Graph Representation for Multi-Label Image Recognition

2 code implementations • ICCV 2019 • Tianshui Chen, Muxin Xu, Xiaolu Hui, Hefeng Wu, Liang Lin

Recognizing multiple labels of images is a practical and challenging task, and significant progress has been made by searching semantic-aware regions and modeling label dependency.

Ranked #8 on Multi-Label Classification on PASCAL VOC 2007

Graph Representation Learning Multi-Label Classification +1

158

Deep Human Parsing with Active Template Regression

1 code implementation • 9 Mar 2015 • Xiaodan Liang, Si Liu, Xiaohui Shen, Jianchao Yang, Luoqi Liu, Jian Dong, Liang Lin, Shuicheng Yan

The first CNN network is with max-pooling, and designed to predict the template coefficients for each label mask, while the second CNN network is without max-pooling to preserve sensitivity to label mask position and accurately predict the active shape parameters.

Human Parsing Position +1

152

SNAS: Stochastic Neural Architecture Search

2 code implementations • ICLR 2019 • Sirui Xie, Hehui Zheng, Chunxiao Liu, Liang Lin

In experiments on CIFAR-10, SNAS takes fewer epochs to find a cell architecture with state-of-the-art accuracy than non-differentiable evolution-based and reinforcement-learning-based NAS, which is also transferable to ImageNet.

Ranked #25 on Neural Architecture Search on NAS-Bench-201, CIFAR-10

Neural Architecture Search reinforcement-learning +1

144

Adaptively Connected Neural Networks

1 code implementation • CVPR 2019 • Guangrun Wang, Keze Wang, Liang Lin

This paper presents a novel adaptively connected neural network (ACNet) to improve the traditional convolutional neural networks (CNNs) in two aspects.

Ranked #1 on Document Classification on Cora

Document Classification Image Classification +1

144

Knowledge-Embedded Routing Network for Scene Graph Generation

3 code implementations • CVPR 2019 • Tianshui Chen, Weihao Yu, Riquan Chen, Liang Lin

More specifically, we show that the statistical correlations between objects appearing in images and their relationships can be explicitly represented by a structured knowledge graph, and a routing mechanism is learned to propagate messages through the graph to explore their interactions.

Ranked #9 on Scene Graph Generation on Visual Genome

Graph Generation Scene Graph Generation

117

Learning Warped Guidance for Blind Face Restoration

1 code implementation • ECCV 2018 • Xiaoming Li, Ming Liu, Yuting Ye, WangMeng Zuo, Liang Lin, Ruigang Yang

For better recovery of fine facial details, we modify the problem setting by taking both the degraded observation and a high-quality guided image of the same identity as input to our guided face restoration network (GFRNet).

Ranked #1 on Image Super-Resolution on WebFace - 8x upscaling

Blind Face Restoration

109

Cross-Modal Causal Intervention for Medical Report Generation

2 code implementations • 16 Mar 2023 • Weixing Chen, Yang Liu, Ce Wang, Jiarui Zhu, Shen Zhao, Guanbin Li, Cheng-Lin Liu, Liang Lin

Medical report generation (MRG) is essential for computer-aided diagnosis and medication guidance, which can relieve the heavy burden of radiologists by automatically generating the corresponding medical reports according to the given radiology image.

Medical Report Generation object-detection +1

108

Visual Causal Scene Refinement for Video Question Answering

2 code implementations • 7 May 2023 • Yushen Wei, Yang Liu, Hong Yan, Guanbin Li, Liang Lin

Our VCSR involves two essential modules: i) the Question-Guided Refiner (QGR) module, which refines consecutive video frames guided by the question semantics to obtain more representative segment features for causal front-door intervention; ii) the Causal Scene Separator (CSS) module, which discovers a collection of visual causal and non-causal scenes based on the visual-linguistic causal relevance and estimates the causal effect of the scene-separating intervention in a contrastive learning manner.

Contrastive Learning Question Answering +2

108

CausalVLR: A Toolbox and Benchmark for Visual-Linguistic Causal Reasoning

2 code implementations • 30 Jun 2023 • Yang Liu, Weixing Chen, Guanbin Li, Liang Lin

We present CausalVLR (Causal Visual-Linguistic Reasoning), an open-source toolbox containing a rich set of state-of-the-art causal relation discovery and causal inference methods for various visual-linguistic reasoning tasks, such as VQA, image/video captioning, medical report generation, model generalization and robustness, etc.

Causal Inference Medical Report Generation +2

108

Towards CausalGPT: A Multi-Agent Approach for Faithful Knowledge Reasoning via Promoting Causal Consistency in LLMs

2 code implementations • 23 Aug 2023 • Ziyi Tang, Ruilin Wang, Weixing Chen, Keze Wang, Yang Liu, Tianshui Chen, Liang Lin

Despite advancements in LLMs, knowledge-based reasoning remains a longstanding issue due to the fragility of knowledge recall and inference.

counterfactual Science Question Answering

108

DreamEditor: Text-Driven 3D Scene Editing with Neural Fields

1 code implementation • 23 Jun 2023 • Jingyu Zhuang, Chen Wang, Lingjie Liu, Liang Lin, Guanbin Li

Neural fields have achieved impressive advancements in view synthesis and scene reconstruction.

3D scene Editing

105

Cross-Domain Facial Expression Recognition: A Unified Evaluation Benchmark and Adversarial Graph Learning

1 code implementation • 3 Aug 2020 • Tianshui Chen, Tao Pu, Hefeng Wu, Yuan Xie, Lingbo Liu, Liang Lin

Although each claims to achieve superior performance, fair comparisons are lacking due to the inconsistent choices of the source/target datasets and feature extractors.

Ranked #1 on Cross-Domain Facial Expression Recognition on Source: AFE, Target: CK+, JAFFE, SFEW2.0, FER2013, ExpW

Cross-Domain Facial Expression Recognition Domain Adaptation +3

104

Adversarial Graph Representation Adaptation for Cross-Domain Facial Expression Recognition

1 code implementation • 3 Aug 2020 • Yuan Xie, Tianshui Chen, Tao Pu, Hefeng Wu, Liang Lin

However, most of these works focus on holistic feature adaptation, and they ignore local features that are more transferable across different datasets.

Cross-Domain Facial Expression Recognition Facial Expression Recognition (FER)

104

SUR-adapter: Enhancing Text-to-Image Pre-trained Diffusion Models with Large Language Models

1 code implementation • 9 May 2023 • Shanshan Zhong, Zhongzhan Huang, Wushao Wen, Jinghui Qin, Liang Lin

Our approach can make text-to-image diffusion models easier to use with better user experience, which demonstrates our approach has the potential for further advancing the development of user-friendly text-to-image generation models by bridging the semantic gap between simple narrative prompts and complex keyword-based prompts.

Knowledge Distillation Text-to-Image Generation

103

Hybrid Knowledge Routed Modules for Large-scale Object Detection

1 code implementation • NeurIPS 2018 • Chenhan Jiang, Hang Xu, Xiangdan Liang, Liang Lin

The dominant object detection approaches treat the recognition of each region separately and overlook crucial semantic correlations between objects in one scene.

Object object-detection +1

102

Transferable, Controllable, and Inconspicuous Adversarial Attacks on Person Re-identification With Deep Mis-Ranking

1 code implementation • CVPR 2020 • Hongjun Wang, Guangrun Wang, Ya Li, Dongyu Zhang, Liang Lin

Examining the robustness of ReID systems is rather important because the insecurity of ReID systems may cause severe losses, e.g., criminals may use adversarial perturbations to cheat CCTV systems.

Adversarial Attack Person Re-Identification

Towards Real-World Burst Image Super-Resolution: Benchmark and Method

1 code implementation • ICCV 2023 • Pengxu Wei, Yujing Sun, Xingbei Guo, Chang Liu, Jie Chen, Xiangyang Ji, Liang Lin

Despite substantial advances, single-image super-resolution (SISR) is always in a dilemma to reconstruct high-quality images with limited information from one input image, especially in realistic scenarios.

Burst Image Super-Resolution

SkeletonMAE: Graph-based Masked Autoencoder for Skeleton Sequence Pre-training

1 code implementation • ICCV 2023 • Hong Yan, Yang Liu, Yushen Wei, Zhen Li, Guanbin Li, Liang Lin

Moreover, these methods ignore how to utilize the fine-grained dependencies among different skeleton joints to pre-train an efficient skeleton sequence learning model that can generalize well across different datasets.

Action Recognition Representation Learning +1

Fine-Grained Representation Learning and Recognition by Exploiting Hierarchical Semantic Embedding

1 code implementation • 14 Aug 2018 • Tianshui Chen, Wenxi Wu, Yuefang Gao, Le Dong, Xiaonan Luo, Liang Lin

In this work, we investigate simultaneously predicting categories of different levels in the hierarchy and integrating this structured correlation information into the deep neural network by developing a novel Hierarchical Semantic Embedding (HSE) framework.

Ranked #51 on Fine-Grained Image Classification on CUB-200-2011

Fine-Grained Image Classification Fine-Grained Image Recognition +1

Knowledge-Routed Visual Question Reasoning: Challenges for Deep Representation Embedding

1 code implementation • 14 Dec 2020 • Qingxing Cao, Bailin Li, Xiaodan Liang, Keze Wang, Liang Lin

Specifically, we generate the question-answer pair based on both the Visual Genome scene graph and an external knowledge base with controlled programs to disentangle the knowledge from other biases.

Question Answering Visual Question Answering

Adaptive Temporal Encoding Network for Video Instance-level Human Parsing

1 code implementation • 2 Aug 2018 • Qixian Zhou, Xiaodan Liang, Ke Gong, Liang Lin

Beyond the existing single-person and multiple-person human parsing tasks in static images, this paper makes the first attempt to investigate a more realistic video instance-level human parsing that simultaneously segments out each person instance and parses each instance into more fine-grained parts (e.g., head, leg, dress).

Human Parsing Segmentation +4

Semi-Supervised Video Salient Object Detection Using Pseudo-Labels

1 code implementation • ICCV 2019 • Pengxiang Yan, Guanbin Li, Yuan Xie, Zhen Li, Chuan Wang, Tianshui Chen, Liang Lin

Specifically, we present an effective video saliency detector that consists of a spatial refinement network and a spatiotemporal module.

Ranked #1 on Video Salient Object Detection on VOS-T (using extra training data)

object-detection Salient Object Detection +2

Cross-Modal Causal Relational Reasoning for Event-Level Visual Question Answering

2 code implementations • 26 Jul 2022 • Yang Liu, Guanbin Li, Liang Lin

Existing visual question answering methods often suffer from cross-modal spurious correlations and oversimplified event-level reasoning processes that fail to capture event temporality, causality, and dynamics spanning over the video.

Causal Inference Question Answering +2

Learning a Wavelet-like Auto-Encoder to Accelerate Deep Neural Networks

2 code implementations • 20 Dec 2017 • Tianshui Chen, Liang Lin, WangMeng Zuo, Xiaonan Luo, Lei Zhang

In this work, aiming at a general and comprehensive way for neural network acceleration, we develop a Wavelet-like Auto-Encoder (WAE) that decomposes the original input image into two low-resolution channels (sub-images) and incorporate the WAE into the classification neural networks for joint training.

Classification General Classification +1

Towards Quantifiable Dialogue Coherence Evaluation

1 code implementation • ACL 2021 • Zheng Ye, Liucun Lu, Lishan Huang, Liang Lin, Xiaodan Liang

To address these limitations, we propose Quantifiable Dialogue Coherence Evaluation (QuantiDCE), a novel framework aiming to train a quantifiable dialogue coherence metric that can reflect the actual human rating standards.

Coherence Evaluation Dialogue Evaluation +1

End-to-End Knowledge-Routed Relational Dialogue System for Automatic Diagnosis

1 code implementation • 30 Jan 2019 • Lin Xu, Qixian Zhou, Ke Gong, Xiaodan Liang, Jianheng Tang, Liang Lin

Besides the challenges for conversational dialogue systems (e.g., topic transition coherency and question understanding), automatic medical diagnosis poses more critical requirements for dialogue rationality in the context of medical knowledge and symptom-disease relations.

Decision Making Dialogue Management +5

GRADE: Automatic Graph-Enhanced Coherence Metric for Evaluating Open-Domain Dialogue Systems

1 code implementation • EMNLP 2020 • Lishan Huang, Zheng Ye, Jinghui Qin, Liang Lin, Xiaodan Liang

Capitalizing on the topic-level dialogue graph, we propose a new evaluation metric GRADE, which stands for Graph-enhanced Representations for Automatic Dialogue Evaluation.

Dialogue Evaluation

Dynamic Spatial-Temporal Representation Learning for Traffic Flow Prediction

2 code implementations • 2 Sep 2019 • Lingbo Liu, Jiajie Zhen, Guanbin Li, Geng Zhan, Zhaocheng He, Bowen Du, Liang Lin

Specifically, the first ConvLSTM unit takes normal traffic flow features as input and generates a hidden state at each time-step, which is further fed into the connected convolutional layer for spatial attention map inference.

Representation Learning Traffic Prediction

Adversarially-Aware Robust Object Detector

1 code implementation • 13 Jul 2022 • Ziyi Dong, Pengxu Wei, Liang Lin

In this work, we empirically explore the model training for adversarial robustness in object detection, which greatly attributes to the conflict between learning clean images and adversarial images.

Adversarial Robustness Object +2

OccluMix: Towards De-Occlusion Virtual Try-on by Semantically-Guided Mixup

2 code implementations • 3 Jan 2023 • Zhijing Yang, Junyang Chen, Yukai Shi, Hao Li, Tianshui Chen, Liang Lin

Image Virtual try-on aims at replacing the cloth on a personal image with a garment image (in-shop clothes), which has attracted increasing attention from the multimedia and computer vision communities.

Semantic Parsing Virtual Try-on

Real-World Image Super-Resolution by Exclusionary Dual-Learning

1 code implementation • 6 Jun 2022 • Hao Li, Jinghui Qin, Zhijing Yang, Pengxu Wei, Jinshan Pan, Liang Lin, Yukai Shi

Real-world image super-resolution is a practical image restoration problem that aims to obtain high-quality images from in-the-wild input and has recently received considerable attention owing to its tremendous application potential.

Image Restoration Image Super-Resolution

Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting

1 code implementation • CVPR 2021 • Lingbo Liu, Jiaqi Chen, Hefeng Wu, Guanbin Li, Chenglong Li, Liang Lin

Extensive experiments conducted on the RGBT-CC benchmark demonstrate the effectiveness of our framework for RGBT crowd counting.

Crowd Counting Representation Learning

Deep Reasoning with Knowledge Graph for Social Relationship Understanding

1 code implementation • 2 Jul 2018 • Zhouxia Wang, Tianshui Chen, Jimmy Ren, Weihao Yu, Hui Cheng, Liang Lin

And this structured knowledge can be efficiently integrated into the deep neural network architecture to promote social relationship understanding by an end-to-end trainable Graph Reasoning Model (GRM), in which a propagation mechanism is learned to propagate node message through the graph to explore the interaction between persons of interest and the contextual objects.

Ranked #2 on Visual Social Relationship Recognition on PIPA

Visual Social Relationship Recognition

Physical-Virtual Collaboration Modeling for Intra-and Inter-Station Metro Ridership Prediction

2 code implementations • 14 Jan 2020 • Lingbo Liu, Jingwen Chen, Hefeng Wu, Jiajie Zhen, Guanbin Li, Liang Lin

To address this problem, we model a metro system as graphs with various topologies and propose a unified Physical-Virtual Collaboration Graph Network (PVCGN), which can effectively learn the complex ridership patterns from the tailor-designed graphs.

Representation Learning

Efficient Crowd Counting via Structured Knowledge Transfer

2 code implementations • 23 Mar 2020 • Lingbo Liu, Jiaqi Chen, Hefeng Wu, Tianshui Chen, Guanbin Li, Liang Lin

Crowd counting is an application-oriented task and its inference efficiency is crucial for real-world applications.

Crowd Counting Transfer Learning

ScaleLong: Towards More Stable Training of Diffusion Model via Scaling Network Long Skip Connection

2 code implementations • NeurIPS 2023 • Zhongzhan Huang, Pan Zhou, Shuicheng Yan, Liang Lin

In addition, we observe theoretical benefits of scaling the long skip connection (LSC) coefficients of UNet for the stability of hidden features and gradients, as well as for robustness.

Tree-Structured Policy based Progressive Reinforcement Learning for Temporally Language Grounding in Video

1 code implementation • 18 Jan 2020 • Jie Wu, Guanbin Li, Si Liu, Liang Lin

Temporally language grounding in untrimmed videos is a newly-raised task in video understanding.

Decision Making reinforcement-learning +2

Blending-target Domain Adaptation by Adversarial Meta-Adaptation Networks

1 code implementation • CVPR 2019 • Ziliang Chen, Jingyu Zhuang, Xiaodan Liang, Liang Lin

(Unsupervised) Domain Adaptation (DA) seeks to classify target instances when provided solely with labeled source examples and unlabeled target examples for training.

Ranked #3 on Multi-target Domain Adaptation on Office-Home

Multi-target Domain Adaptation Transfer Learning +1

GeoQA: A Geometric Question Answering Benchmark Towards Multimodal Numerical Reasoning

1 code implementation • Findings (ACL) 2021 • Jiaqi Chen, Jianheng Tang, Jinghui Qin, Xiaodan Liang, Lingbo Liu, Eric P. Xing, Liang Lin

Therefore, we propose a Geometric Question Answering dataset GeoQA, containing 4,998 geometric problems with corresponding annotated programs, which illustrate the solving process of the given problems.

Ranked #4 on Mathematical Reasoning on PGPS9K

Math Mathematical Reasoning +1

Divide and Contrast: Source-free Domain Adaptation via Adaptive Contrastive Learning

2 code implementations • 12 Nov 2022 • Ziyi Zhang, Weikai Chen, Hui Cheng, Zhen Li, Siyuan Li, Liang Lin, Guanbin Li

We investigate a practical domain adaptation task, called source-free domain adaptation (SFUDA), where the source-pretrained model is adapted to the target domain without access to the source data.

Ranked #4 on Source-Free Domain Adaptation on VisDA-2017

Contrastive Learning Source-Free Domain Adaptation

Solving Inefficiency of Self-supervised Representation Learning

1 code implementation • ICCV 2021 • Guangrun Wang, Keze Wang, Guangcong Wang, Philip H. S. Torr, Liang Lin

In this paper, we reveal two contradictory phenomena in contrastive learning that we call under-clustering and over-clustering problems, which are major obstacles to learning efficiency.

Ranked #1 on Self-Supervised Person Re-Identification on SYSU-30k

Clustering Contrastive Learning +4

Cost-Effective Active Learning for Deep Image Classification

3 code implementations • 13 Jan 2017 • Keze Wang, Dongyu Zhang, Ya Li, Ruimao Zhang, Liang Lin

In this paper, we propose a novel active learning framework, which is capable of building a competitive classifier with optimal feature representation via a limited amount of labeled training instances in an incremental learning manner.

Active Learning Classification +5

Graph-Evolving Meta-Learning for Low-Resource Medical Dialogue Generation

1 code implementation • 22 Dec 2020 • Shuai Lin, Pan Zhou, Xiaodan Liang, Jianheng Tang, Ruihui Zhao, Ziliang Chen, Liang Lin

Besides, we develop a Graph-Evolving Meta-Learning (GEML) framework that learns to evolve the commonsense graph for reasoning disease-symptom correlations in a new disease, which effectively alleviates the need for a large number of dialogues.

Dialogue Generation Meta-Learning

Deep Cocktail Network: Multi-source Unsupervised Domain Adaptation with Category Shift

1 code implementation • CVPR 2018 • Ruijia Xu, Ziliang Chen, WangMeng Zuo, Junjie Yan, Liang Lin

Motivated by the theoretical results in \cite{mansour2009domain}, the target distribution can be represented as the weighted combination of source distributions, and, the multi-source unsupervised domain adaptation via DCTN is then performed as two alternating steps: i) It deploys multi-way adversarial learning to minimize the discrepancy between the target and each of the multiple source domains, which also obtains the source-specific perplexity scores to denote the possibilities that a target sample belongs to different source domains.

Ranked #5 on Multi-Source Unsupervised Domain Adaptation on Office-31

Multi-Source Unsupervised Domain Adaptation Unsupervised Domain Adaptation

Structured Semantic Transfer for Multi-Label Recognition with Partial Labels

1 code implementation • 21 Dec 2021 • Tianshui Chen, Tao Pu, Hefeng Wu, Yuan Xie, Liang Lin

To reduce the annotation cost, we propose a structured semantic transfer (SST) framework that enables training multi-label recognition models with partial labels, i.e., only some labels are known while other labels are missing (also called unknown labels) per image.

Ranked #6 on Multi-label Image Recognition with Partial Labels on PASCAL VOC 2007

Multi-label Image Recognition with Partial Labels

Unsupervised Domain Adaptive Salient Object Detection Through Uncertainty-Aware Pseudo-Label Learning

1 code implementation • 26 Feb 2022 • Pengxiang Yan, Ziyi Wu, Mengmeng Liu, Kun Zeng, Liang Lin, Guanbin Li

To relieve the burden of labor-intensive labeling, deep unsupervised SOD methods have been proposed to exploit noisy labels generated by handcrafted saliency methods.

object-detection Object Detection +2

Semantic-Aware Representation Blending for Multi-Label Image Recognition with Partial Labels

1 code implementation • 4 Mar 2022 • Tao Pu, Tianshui Chen, Hefeng Wu, Liang Lin

However, these algorithms depend on sufficient multi-label annotations to train the models, leading to poor performance especially with low known label proportion.

Ranked #2 on Multi-label Image Recognition with Partial Labels on Visual Genome

Multi-label Image Recognition with Partial Labels

Paper
Code

Heterogeneous Semantic Transfer for Multi-label Recognition with Partial Labels

1 code implementation • 23 May 2022 • Tianshui Chen, Tao Pu, Lingbo Liu, Yukai Shi, Zhijing Yang, Liang Lin

Multi-label image recognition with partial labels (MLR-PL), in which some labels are known while others are unknown for each image, may greatly reduce the cost of annotation and thus facilitate large-scale MLR.

Ranked #4 on Multi-label Image Recognition with Partial Labels on PASCAL VOC 2007

Multi-label Image Recognition with Partial Labels

Paper
Code

Dual-Perspective Semantic-Aware Representation Blending for Multi-Label Image Recognition with Partial Labels

1 code implementation • 26 May 2022 • Tao Pu, Tianshui Chen, Hefeng Wu, Yukai Shi, Zhijing Yang, Liang Lin

Specifically, an instance-perspective representation blending (IPRB) module is designed to blend the representations of the known labels in an image with the representations of the corresponding unknown labels in another image to complement these unknown labels.

Ranked #3 on Multi-label Image Recognition with Partial Labels on PASCAL VOC 2007

Image Classification Multi-label Image Recognition with Partial Labels

Paper
Code

DDet: Dual-path Dynamic Enhancement Network for Real-World Image Super-Resolution

1 code implementation • 25 Feb 2020 • Yukai Shi, Haoyu Zhong, Zhijing Yang, Xiaojun Yang, Liang Lin

Previous image SR methods fail to exhibit similar performance on Real-SR because the image data is not inherently aligned.

Image Super-Resolution

Paper
Code

TCGL: Temporal Contrastive Graph for Self-supervised Video Representation Learning

2 code implementations • 7 Dec 2021 • Yang Liu, Keze Wang, Lingbo Liu, Haoyuan Lan, Liang Lin

To overcome these limitations, we take advantage of the multi-scale temporal dependencies within videos and propose a novel video self-supervised learning framework named Temporal Contrastive Graph Learning (TCGL), which jointly models the inter-snippet and intra-snippet temporal dependencies for temporal representation learning with a hybrid graph contrastive learning strategy.

Action Recognition Contrastive Learning +5

Paper
Code

Unsupervised Image Super-Resolution using Cycle-in-Cycle Generative Adversarial Networks

1 code implementation • 3 Sep 2018 • Yuan Yuan, Siyuan Liu, Jiawei Zhang, Yongbing Zhang, Chao Dong, Liang Lin

We consider the single image super-resolution problem in the more general case where the low-/high-resolution pairs and the down-sampling process are unavailable.

Image Super-Resolution Image-to-Image Translation +1

Paper
Code

Zoom and Learn: Generalizing Deep Stereo Matching to Novel Domains

1 code implementation • CVPR 2018 • Jiahao Pang, Wenxiu Sun, Chengxi Yang, Jimmy Ren, Ruichao Xiao, Jin Zeng, Liang Lin

By feeding real stereo pairs of different domains to stereo models pre-trained with synthetic data, we see that: i) a pre-trained model does not generalize well to the new domain, producing artifacts at boundaries and ill-posed regions; however, ii) feeding an up-sampled stereo pair leads to a disparity map with extra details.

Stereo Matching Stereo Matching Hand

Paper
Code

Prototypical Graph Contrastive Learning

1 code implementation • 17 Jun 2021 • Shuai Lin, Pan Zhou, Zi-Yuan Hu, Shuojia Wang, Ruihui Zhao, Yefeng Zheng, Liang Lin, Eric Xing, Xiaodan Liang

However, since the negatives for a query are uniformly sampled from all graphs, existing methods suffer from a critical sampling bias issue, i.e., the negatives are likely to have the same semantic structure as the query, leading to performance degradation.

Clustering Contrastive Learning +1

Paper
Code

Semantics-aware Adaptive Knowledge Distillation for Sensor-to-Vision Action Recognition

1 code implementation • 1 Sep 2020 • Yang Liu, Keze Wang, Guanbin Li, Liang Lin

In this paper, we propose a novel framework, named Semantics-aware Adaptive Knowledge Distillation Networks (SAKDN), to enhance action recognition in vision-sensor modality (videos) by adaptively transferring and distilling the knowledge from multiple wearable sensors.

Action Recognition Image Generation +3

Paper
Code

Collaborative Training between Region Proposal Localization and Classification for Domain Adaptive Object Detection

1 code implementation • ECCV 2020 • Ganlong Zhao, Guanbin Li, Ruijia Xu, Liang Lin

Domain adaptation for object detection tries to adapt the detector from labeled datasets to unlabeled ones for better performance.

Domain Adaptation General Classification +4

Paper
Code

Symbolic Graph Reasoning Meets Convolutions

1 code implementation • NeurIPS 2018 • Xiaodan Liang, Zhiting Hu, Hao Zhang, Liang Lin, Eric P. Xing

To cooperate with local convolutions, each SGR is constituted by three modules: a) a primal local-to-semantic voting module where the features of all symbolic nodes are generated by voting from local representations; b) a graph reasoning module that propagates information over a knowledge graph to achieve global semantic coherency; c) a dual semantic-to-local mapping module that learns new associations of the evolved symbolic nodes with local representations, and accordingly enhances local features.

Ranked #81 on Semantic Segmentation on ADE20K val

Image Classification Semantic Segmentation

Paper
Code

Auto-Panoptic: Cooperative Multi-Component Architecture Search for Panoptic Segmentation

2 code implementations • NeurIPS 2020 • Yangxin Wu, Gengwei Zhang, Hang Xu, Xiaodan Liang, Liang Lin

In this work, we propose an efficient, cooperative and highly automated framework to simultaneously search for all main components including backbone, segmentation branches, and feature fusion module in a unified panoptic segmentation pipeline based on the prevailing one-shot Network Architecture Search (NAS) paradigm.

Instance Segmentation Panoptic Segmentation +2

Paper
Code

Neural-Symbolic Solver for Math Word Problems with Auxiliary Tasks

1 code implementation • ACL 2021 • Jinghui Qin, Xiaodan Liang, Yining Hong, Jianheng Tang, Liang Lin

Previous math word problem solvers following the encoder-decoder paradigm fail to explicitly incorporate essential math symbolic constraints, leading to unexplainable and unreasonable predictions.

Math

Paper
Code

Pi-NAS: Improving Neural Architecture Search by Reducing Supernet Training Consistency Shift

1 code implementation • ICCV 2021 • Jiefeng Peng, Jiqi Zhang, Changlin Li, Guangrun Wang, Xiaodan Liang, Liang Lin

We attribute this ranking correlation problem to the supernet training consistency shift, including feature shift and parameter shift.

Attribute Neural Architecture Search

Paper
Code

On Fast Simulation of Dynamical System with Neural Vector Enhanced Numerical Solver

1 code implementation • 7 Aug 2022 • Zhongzhan Huang, Senwei Liang, Hong Zhang, Haizhao Yang, Liang Lin

The large-scale simulation of dynamical systems is critical in numerous scientific and engineering disciplines.

Computational Efficiency

Paper
Code

UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression

1 code implementation • 6 Dec 2022 • Jiaqi Chen, Tong Li, Jinghui Qin, Pan Lu, Liang Lin, Chongyu Chen, Xiaodan Liang

Naturally, we also present a unified multi-task Geometric Transformer framework, Geoformer, to tackle calculation and proving problems simultaneously in the form of sequence generation, which shows that reasoning ability on both tasks can be improved by a unified formulation.

Ranked #3 on Mathematical Reasoning on PGPS9K

Geometry Problem Solving Logical Reasoning +1

Paper
Code

Dual Adversarial Adaptation for Cross-Device Real-World Image Super-Resolution

1 code implementation • CVPR 2022 • Xiaoqian Xu, Pengxu Wei, Weikai Chen, Mingzhi Mao, Liang Lin, Guanbin Li

To address this issue, we propose an unsupervised domain adaptation mechanism for real-world SR, named Dual ADversarial Adaptation (DADA), which only requires LR images in the target domain with available real paired data from a source camera.

Image Super-Resolution Unsupervised Domain Adaptation

Paper
Code

LSTM-CF: Unifying Context Modeling and Fusion with LSTMs for RGB-D Scene Labeling

1 code implementation • 18 Apr 2016 • Zhen Li, Yukang Gan, Xiaodan Liang, Yizhou Yu, Hui Cheng, Liang Lin

Another long short-term memorized fusion layer is set up to integrate the contexts along the vertical direction from different channels, and perform bi-directional propagation of the fused vertical contexts along the horizontal direction to obtain true 2D global contexts.

Scene Labeling

Paper
Code

Semantically-Aligned Universal Tree-Structured Solver for Math Word Problems

1 code implementation • EMNLP 2020 • Jinghui Qin, Lihui Lin, Xiaodan Liang, Rumin Zhang, Liang Lin

A practical automatic solver for textual math word problems (MWPs) should be able to solve various textual MWPs, whereas most existing works focus only on one-unknown linear MWPs.

Ranked #10 on Math Word Problem Solving on ALG514

Math Math Word Problem Solving

Paper
Code

SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised Learning for Robust Infrared Small Target Detection

1 code implementation • 8 Mar 2024 • Yahao Lu, Yupei Lin, Han Wu, Xiaoyu Xian, Yukai Shi, Liang Lin

The quality, quantity, and diversity of the infrared dataset are critical to the detection of small targets.

object-detection Object Detection +1

Paper
Code

Recognizing Focal Liver Lesions in Contrast-Enhanced Ultrasound with Discriminatively Trained Spatio-Temporal Model

1 code implementation • 3 Feb 2015 • Xiaodan Liang, Qingxing Cao, Rui Huang, Liang Lin

The aim of this study is to provide an automatic computational framework to assist clinicians in diagnosing Focal Liver Lesions (FLLs) in Contrast-Enhanced Ultrasound (CEUS).

Paper
Code

AU-Expression Knowledge Constrained Representation Learning for Facial Expression Recognition

1 code implementation • 29 Dec 2020 • Tao Pu, Tianshui Chen, Yuan Xie, Hefeng Wu, Liang Lin

In this work, we explore the correlations among the action units and facial expressions, and devise an AU-Expression Knowledge Constrained Representation Learning (AUE-CRL) framework to learn the AU representations without AU annotations and adaptively use representations to facilitate facial expression recognition.

Facial Expression Recognition Facial Expression Recognition (FER) +1

Paper
Code

Online Metro Origin-Destination Prediction via Heterogeneous Information Aggregation

1 code implementation • 2 Jul 2021 • Lingbo Liu, Yuying Zhu, Guanbin Li, Ziyi Wu, Lei Bai, Liang Lin

In this work, we propose a novel neural network module termed Heterogeneous Information Aggregation Machine (HIAM), which fully exploits heterogeneous information of historical data (e.g., incomplete OD matrices, unfinished order vectors, and DO matrices) to jointly learn the evolutionary patterns of OD and DO ridership.

Time Series Analysis

Paper
Code

Being Comes from Not-being: Open-vocabulary Text-to-Motion Generation with Wordless Training

1 code implementation • CVPR 2023 • Junfan Lin, Jianlong Chang, Lingbo Liu, Guanbin Li, Liang Lin, Qi Tian, Chang Wen Chen

During inference, instead of changing the motion generator, our method reformulates the input text into a masked motion as the prompt for the motion generator to "reconstruct" the motion.

Language Modelling Zero-Shot Learning

Paper
Code

AlignMiF: Geometry-Aligned Multimodal Implicit Field for LiDAR-Camera Joint Synthesis

1 code implementation • 27 Feb 2024 • Tao Tang, Guangrun Wang, Yixing Lao, Peng Chen, Jie Liu, Liang Lin, Kaicheng Yu, Xiaodan Liang

Through extensive experiments across various datasets and scenes, we demonstrate the effectiveness of our approach in facilitating better interaction between LiDAR and camera modalities within a unified neural field.

Novel View Synthesis

Paper
Code

Adversarial Reinforced Instruction Attacker for Robust Vision-Language Navigation

1 code implementation • 23 Jul 2021 • Bingqian Lin, Yi Zhu, Yanxin Long, Xiaodan Liang, Qixiang Ye, Liang Lin

Specifically, we propose a Dynamic Reinforced Instruction Attacker (DR-Attacker), which learns to mislead the navigator to move to the wrong target by destroying the most instructive information in instructions at different timesteps.

Vision and Language Navigation Vision-Language Navigation

Paper
Code

Robust Real-World Image Super-Resolution against Adversarial Attacks

1 code implementation • 31 Jul 2022 • Jiutao Yue, Haofeng Li, Pengxu Wei, Guanbin Li, Liang Lin

Since the frequency masking may not only destroy the adversarial perturbations but also affect the sharp details in a clean image, we further develop an adversarial sample classifier based on the frequency domain of images to determine whether to apply the proposed mask module.

Image Super-Resolution

Paper
Code

Masked Images Are Counterfactual Samples for Robust Fine-tuning

1 code implementation • CVPR 2023 • Yao Xiao, Ziyi Tang, Pengxu Wei, Cong Liu, Liang Lin

In this paper, based on causal analysis of the aforementioned problems, we propose a novel fine-tuning method, which uses masked images as counterfactual samples that help improve the robustness of the fine-tuning model.

counterfactual

Paper
Code

Knowledge Graph Transfer Network for Few-Shot Recognition

1 code implementation • 21 Nov 2019 • Riquan Chen, Tianshui Chen, Xiaolu Hui, Hefeng Wu, Guanbin Li, Liang Lin

In this work, we represent the semantic correlations in the form of a structured knowledge graph and integrate this graph into deep neural networks to promote few-shot learning by a novel Knowledge Graph Transfer Network (KGTN).

Ranked #1 on Few-Shot Image Classification on ImageNet-FS (10-shot, all)

Few-Shot Image Classification Few-Shot Learning +2

Paper
Code

ADASR: An Adversarial Auto-Augmentation Framework for Hyperspectral and Multispectral Data Fusion

1 code implementation • 11 Oct 2023 • Jinghui Qin, Lihuang Fang, Ruitao Lu, Liang Lin, Yukai Shi

Deep learning-based hyperspectral image (HSI) super-resolution, which aims to generate a high spatial resolution HSI (HR-HSI) by fusing an HSI and a multispectral image (MSI) with deep neural networks (DNNs), has attracted a lot of attention.

Data Augmentation Super-Resolution

Paper
Code

Cost-effective Object Detection: Active Sample Mining with Switchable Selection Criteria

1 code implementation • 30 Jun 2018 • Keze Wang, Liang Lin, Xiaopeng Yan, Ziliang Chen, Dongyu Zhang, Lei Zhang

The proposed process can be compatible with mini-batch based training (i.e., using a batch of unlabeled or partially labeled data as a one-time input) for object detection.

Active Learning object-detection +2

Paper
Code

Continual Object Detection via Prototypical Task Correlation Guided Gating Mechanism

1 code implementation • CVPR 2022 • BinBin Yang, Xinchi Deng, Han Shi, Changlin Li, Gengwei Zhang, Hang Xu, Shen Zhao, Liang Lin, Xiaodan Liang

To make ROSETTA automatically determine which experience is available and useful, a prototypical task correlation guided Gating Diversity Controller (GDC) is introduced to adaptively adjust the diversity of gates for the new task based on class-specific prototypes.

Continual Learning Object +2

Paper
Code

Spatial-Temporal Knowledge-Embedded Transformer for Video Scene Graph Generation

1 code implementation • 23 Sep 2023 • Tao Pu, Tianshui Chen, Hefeng Wu, Yongyi Lu, Liang Lin

In this work, we propose a spatial-temporal knowledge-embedded transformer (STKET) that incorporates the prior spatial-temporal knowledge into the multi-head cross-attention mechanism to learn more representative relationship representations.

Graph Generation Object +2

Paper
Code

LogicSolver: Towards Interpretable Math Word Problem Solving with Logical Prompt-enhanced Learning

2 code implementations • 17 May 2022 • Zhicheng Yang, Jinghui Qin, Jiaqi Chen, Liang Lin, Xiaodan Liang

To address this issue and take a step towards interpretable MWP solving, we first construct a high-quality MWP dataset named InterMWP, which consists of 11,495 MWPs and annotates interpretable logical formulas based on algebraic knowledge as the grounded linguistic logic of each solution equation.

Math Math Word Problem Solving

Paper
Code

Semantic-Aware Auto-Encoders for Self-Supervised Representation Learning

1 code implementation • CVPR 2022 • Guangrun Wang, Yansong Tang, Liang Lin, Philip H.S. Torr

Inspired by perceptual learning that could use cross-view learning to perceive concepts and semantics, we propose a novel AE that could learn semantic-aware representation via cross-view image reconstruction.

Image Reconstruction Representation Learning +1

Paper
Code

Mirror Gradient: Towards Robust Multimodal Recommender Systems via Exploring Flat Local Minima

1 code implementation • 17 Feb 2024 • Shanshan Zhong, Zhongzhan Huang, Daifeng Li, Wushao Wen, Jinghui Qin, Liang Lin

This strategy can implicitly enhance the model's robustness during the optimization process, mitigating instability risks arising from multimodal information inputs.

Multimodal Recommendation

Paper
Code

Multivariate-Information Adversarial Ensemble for Scalable Joint Distribution Matching

1 code implementation • 8 Jul 2019 • Ziliang Chen, Zhanfu Yang, Xiaoxi Wang, Xiaodan Liang, Xiaopeng Yan, Guanbin Li, Liang Lin

A broad range of research on cross-$m$-domain generation boils down to matching a joint distribution by deep generative models (DGMs).

Paper
Code

Fine-Grained Image Captioning with Global-Local Discriminative Objective

1 code implementation • 21 Jul 2020 • Jie Wu, Tianshui Chen, Hefeng Wu, Zhi Yang, Guangchun Luo, Liang Lin

This is primarily due to (i) the conservative characteristic of traditional training objectives that drives the model to generate correct but hardly discriminative captions for similar images and (ii) the uneven word distribution of the ground-truth captions, which encourages generating highly frequent words/phrases while suppressing the less frequent but more concrete ones.

Descriptive Image Captioning +2

Paper
Code

Continuous Transition: Improving Sample Efficiency for Continuous Control Problems via MixUp

1 code implementation • 30 Nov 2020 • Junfan Lin, Zhongzhan Huang, Keze Wang, Xiaodan Liang, Weiwei Chen, Liang Lin

Although deep reinforcement learning (RL) has been successfully applied to a variety of robotic control tasks, it's still challenging to apply it to real-world tasks, due to the poor sample efficiency.

Continuous Control Reinforcement Learning (RL)

Paper
Code

NiteDR: Nighttime Image De-Raining with Cross-View Sensor Cooperative Learning for Dynamic Driving Scenes

1 code implementation • 28 Feb 2024 • Cidan Shi, Lihuang Fang, Han Wu, Xiaoyu Xian, Yukai Shi, Liang Lin

Specifically, we introduce cooperative learning between visible and infrared images captured by different sensors.

Autonomous Driving Rain Removal

Paper
Code

Contrastive Transformer Learning with Proximity Data Generation for Text-Based Person Search

1 code implementation • 15 Nov 2023 • Hefeng Wu, Weifeng Chen, Zhibin Liu, Tianshui Chen, Zhiguang Chen, Liang Lin

Moreover, we propose a proximity data generation (PDG) module to automatically produce more diverse data for cross-modal training.

Contrastive Learning Cross-Modal Retrieval +4

Paper
Code

Adaptive Global-Local Representation Learning and Selection for Cross-Domain Facial Expression Recognition

1 code implementation • 20 Jan 2024 • Yuefang Gao, Yuhao Xie, Zeke Zexi Hu, Tianshui Chen, Liang Lin

Specifically, the framework consists of separate global-local adversarial learning modules that learn domain-invariant global and local features independently.

Cross-Domain Facial Expression Recognition Model Optimization +2

Paper
Code

IDF-CR: Iterative Diffusion Process for Divide-and-Conquer Cloud Removal in Remote-sensing Images

1 code implementation • 18 Mar 2024 • Meilin Wang, Yexing Song, Pengxu Wei, Xiaoyu Xian, Yukai Shi, Liang Lin

IDF-CR consists of a pixel space cloud removal module (Pixel-CR) and a latent space iterative noise diffusion network (IND).

Cloud Removal Image Generation +1

Paper
Code

Road Network Guided Fine-Grained Urban Traffic Flow Inference

1 code implementation • 29 Sep 2021 • Lingbo Liu, Mengmeng Liu, Guanbin Li, Ziyi Wu, Junfan Lin, Liang Lin

Furthermore, we take the road network feature as a query to capture the long-range spatial distribution of traffic flow with a transformer architecture.

Paper
Code

DenseLight: Efficient Control for Large-scale Traffic Signals with Dense Feedback

1 code implementation • 13 Jun 2023 • Junfan Lin, Yuying Zhu, Lingbo Liu, Yang Liu, Guanbin Li, Liang Lin

1) The travel time of a vehicle is delayed feedback on the effectiveness of TSC policy at each traffic intersection since it is obtained after the vehicle has left the road network.

Reinforcement Learning (RL)

Paper
Code

Towards Human-Machine Cooperation: Self-supervised Sample Mining for Object Detection

no code implementations • CVPR 2018 • Keze Wang, Xiaopeng Yan, Dongyu Zhang, Lei Zhang, Liang Lin

Though quite challenging, leveraging large-scale unlabeled or partially labeled images in a cost-effective way has attracted increasing interest because of its great importance to computer vision.

Active Learning Object +2

Paper
Add Code

DRPose3D: Depth Ranking in 3D Human Pose Estimation

no code implementations • 23 May 2018 • Min Wang, Xipeng Chen, Wentao Liu, Chen Qian, Liang Lin, Lizhuang Ma

In this paper, we propose a two-stage depth ranking based method (DRPose3D) to tackle the problem of 3D human pose estimation.

3D Human Pose Estimation 3D Pose Estimation

Paper
Add Code

Visual Tracking via Dynamic Graph Learning

no code implementations • 4 Oct 2017 • Chenglong Li, Liang Lin, WangMeng Zuo, Jin Tang, Ming-Hsuan Yang

First, the graph is initialized by assigning binary weights to some image patches to indicate the object and background patches according to the predicted bounding box.

Graph Learning Object +2

Paper
Add Code

Visual Question Reasoning on General Dependency Tree

no code implementations • CVPR 2018 • Qingxing Cao, Xiaodan Liang, Bailing Li, Guanbin Li, Liang Lin

This network comprises two collaborative modules: i) an adversarial attention module to exploit the local visual evidence for each word parsed from the question; ii) a residual composition module to compose the previously mined evidence.

Question Answering Visual Question Answering

Paper
Add Code

Weakly Supervised Salient Object Detection Using Image Labels

no code implementations • 17 Mar 2018 • Guanbin Li, Yuan Xie, Liang Lin

Our algorithm is based on alternately exploiting a graphical model and training a fully convolutional network for model updating.

Object object-detection +3

Paper
Add Code

Deep Structured Scene Parsing by Learning with Image Descriptions

no code implementations • CVPR 2016 • Liang Lin, Guangrun Wang, Rui Zhang, Ruimao Zhang, Xiaodan Liang, WangMeng Zuo

This paper addresses a fundamental problem of scene understanding: how to parse the scene image into a structured configuration (i.e., a semantic object hierarchy with object interaction relations) that finely accords with human perception.

Descriptive Object +3

Paper
Add Code

Learning to Segment Human by Watching YouTube

no code implementations • 4 Oct 2017 • Xiaodan Liang, Yunchao Wei, Liang Lin, Yunpeng Chen, Xiaohui Shen, Jianchao Yang, Shuicheng Yan

An intuition on human segmentation is that when a human is moving in a video, the video-context (e.g., appearance and motion clues) may potentially infer reasonable mask information for the whole human body.

Human Detection Segmentation +5

Paper
Add Code

Batch Kalman Normalization: Towards Training Deep Neural Networks with Micro-Batches

no code implementations • 9 Feb 2018 • Guangrun Wang, Jiefeng Peng, Ping Luo, Xinjiang Wang, Liang Lin

As an indispensable component, Batch Normalization (BN) has successfully improved the training of deep neural networks (DNNs) with mini-batches, by normalizing the distribution of the internal representation for each hidden layer.

Image Classification

Paper
Add Code

Hierarchical Scene Parsing by Weakly Supervised Learning with Image Descriptions

no code implementations • 27 Sep 2017 • Ruimao Zhang, Liang Lin, Guangrun Wang, Meng Wang, WangMeng Zuo

Rather than relying on elaborate annotations (e.g., manually labeled semantic maps and relations), we train our deep model in a weakly-supervised learning manner by leveraging the descriptive sentences of the training images.

Descriptive Object +4

Paper
Add Code

Structured Inhomogeneous Density Map Learning for Crowd Counting

no code implementations • 20 Jan 2018 • Hanhui Li, Xiangjian He, Hefeng Wu, Saeed Amirgholipour Kasmani, Ruomei Wang, Xiaonan Luo, Liang Lin

In this paper, we aim at tackling the problem of crowd counting in extremely high-density scenes, which contain hundreds, or even thousands of people.

Crowd Counting

Paper
Add Code

Context-Aware Semantic Inpainting

no code implementations • 21 Dec 2017 • Haofeng Li, Guanbin Li, Liang Lin, Yizhou Yu

Our proposed GAN-based framework consists of a fully convolutional design for the generator which helps to better preserve spatial structures and a joint loss function with a revised perceptual loss to capture high-level semantics in the context.

Generative Adversarial Network Image Inpainting

Paper
Add Code

Recurrent Attentional Reinforcement Learning for Multi-label Image Recognition

no code implementations • 20 Dec 2017 • Tianshui Chen, Zhouxia Wang, Guanbin Li, Liang Lin

Recognizing multiple labels of images is a fundamental but challenging task in computer vision, and remarkable progress has been attained by localizing semantic-aware image regions and predicting their labels with deep convolutional neural networks.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Multi-label Image Recognition by Recurrently Discovering Attentional Regions

no code implementations • ICCV 2017 • Zhouxia Wang, Tianshui Chen, Guanbin Li, Ruijia Xu, Liang Lin

This paper proposes a novel deep architecture to address multi-label image recognition, a fundamental and practical task towards general visual understanding.

General Classification Multi-Label Image Classification +1

Paper
Add Code

Content-Adaptive Sketch Portrait Generation by Decompositional Representation Learning

no code implementations • 4 Oct 2017 • Dongyu Zhang, Liang Lin, Tianshui Chen, Xian Wu, Wenwei Tan, Ebroul Izquierdo

Sketch portrait generation benefits a wide range of applications such as digital entertainment and law enforcement.

Representation Learning

Paper
Add Code

Attention-Aware Face Hallucination via Deep Reinforcement Learning

no code implementations • CVPR 2017 • Qingxing Cao, Liang Lin, Yukai Shi, Xiaodan Liang, Guanbin Li

Face hallucination is a domain-specific super-resolution problem with the goal to generate high-resolution (HR) faces from low-resolution (LR) input images.

Face Hallucination Hallucination +3

Paper
Add Code

Recurrent 3D Pose Sequence Machines

no code implementations • CVPR 2017 • Mude Lin, Liang Lin, Xiaodan Liang, Keze Wang, Hui Cheng

Recovering articulated 3D human poses from monocular image sequences is very challenging because of diverse appearances, viewpoints, and occlusions, and because 3D human pose is inherently ambiguous in monocular imagery.

Ranked #20 on 3D Human Pose Estimation on HumanEva-I

3D Human Pose Estimation 3D Pose Estimation

Paper
Add Code

Deep Co-Space: Sample Mining Across Feature Transformation for Semi-Supervised Learning

no code implementations • 28 Jul 2017 • Ziliang Chen, Keze Wang, Xiao Wang, Pai Peng, Ebroul Izquierdo, Liang Lin

Aiming at improving performance of visual classification in a cost-effective manner, this paper proposes an incremental semi-supervised learning paradigm called Deep Co-Space (DCS).

Classification General Classification +1

Paper
Add Code

Structure-Preserving Image Super-resolution via Contextualized Multi-task Learning

no code implementations • 26 Jul 2017 • Yukai Shi, Keze Wang, Chongyu Chen, Li Xu, Liang Lin

Single image super resolution (SR), which refers to reconstructing a higher-resolution (HR) image from the observed low-resolution (LR) image, has received substantial attention due to its tremendous application potential.

Computational Efficiency Image Restoration +2

Paper
Add Code

Knowledge-Guided Recurrent Neural Network Learning for Task-Oriented Action Prediction

no code implementations • 15 Jul 2017 • Liang Lin, Lili Huang, Tianshui Chen, Yukang Gan, Hui Cheng

This paper aims at task-oriented action prediction, i.e., predicting a sequence of actions towards accomplishing a specific task under a certain scene, which is a new problem in computer vision research.

Common Sense Reasoning valid

Paper
Add Code

Active Self-Paced Learning for Cost-Effective and Progressive Face Identification

no code implementations • 13 Jan 2017 • Liang Lin, Keze Wang, Deyu Meng, WangMeng Zuo, Lei Zhang

By naturally combining two recently rising techniques: active learning (AL) and self-paced learning (SPL), our framework is capable of automatically annotating new instances and incorporating them into training under weak expert re-certification.

Active Learning Face Identification

Paper
Add Code

Instance-Level Salient Object Segmentation

no code implementations • CVPR 2017 • Guanbin Li, Yuan Xie, Liang Lin, Yizhou Yu

Image saliency detection has recently witnessed rapid progress due to deep convolutional neural networks.

Ranked #15 on RGB Salient Object Detection on DUTS-TE (max F-measure metric)

Instance Segmentation Object +3

Paper
Add Code

Interpretable Structure-Evolving LSTM

no code implementations • CVPR 2017 • Xiaodan Liang, Liang Lin, Xiaohui Shen, Jiashi Feng, Shuicheng Yan, Eric P. Xing

Instead of learning LSTM models over the pre-fixed structures, we propose to further learn the intermediate interpretable multi-level graph structures in a progressive and stochastic way from data during the LSTM network optimization.

Small Data Image Classification

Paper
Add Code

Progressively Diffused Networks for Semantic Image Segmentation

no code implementations • 20 Feb 2017 • Ruimao Zhang, Wei Yang, Zhanglin Peng, Xiaogang Wang, Liang Lin

This paper introduces Progressively Diffused Networks (PDNs) for unifying multi-scale context modeling with deep feature learning, by taking semantic image segmentation as an exemplar application.

Image Segmentation Segmentation +1

Paper
Add Code

Learning to Segment Object Candidates via Recursive Neural Networks

no code implementations • 4 Dec 2016 • Tianshui Chen, Liang Lin, Xian Wu, Nong Xiao, Xiaonan Luo

To avoid the exhaustive search over locations and scales, current state-of-the-art object detection systems usually involve a crucial component generating a batch of candidate object proposals from images.

Object object-detection +1

Paper
Add Code

Human Pose Estimation from Depth Images via Inference Embedded Multi-task Learning

no code implementations • 13 Aug 2016 • Keze Wang, Shengfu Zhai, Hui Cheng, Xiaodan Liang, Liang Lin

In this paper, we propose a novel inference-embedded multi-task learning framework for predicting human pose from still depth images, which is implemented with a deep architecture of neural networks.

Multi-Task Learning Pose Estimation +1

Paper
Add Code

Is Faster R-CNN Doing Well for Pedestrian Detection?

no code implementations • 24 Jul 2016 • Liliang Zhang, Liang Lin, Xiaodan Liang, Kaiming He

Detecting pedestrians has arguably been addressed as a special topic beyond general object detection.

Ranked #19 on Pedestrian Detection on Caltech

Object object-detection +3

Local- and Holistic- Structure Preserving Image Super Resolution via Deep Joint Component Learning

no code implementations • 25 Jul 2016 • Yukai Shi, Keze Wang, Li Xu, Liang Lin

Recently, machine learning based single image super resolution (SR) approaches focus on jointly learning representations for high-resolution (HR) and low-resolution (LR) image patch pairs to improve the quality of the super-resolved images.

Image Super-Resolution Representation Learning

Cross-Domain Visual Matching via Generalized Similarity Measure and Feature Learning

no code implementations • 13 May 2016 • Liang Lin, Guangrun Wang, WangMeng Zuo, Xiangchu Feng, Lei Zhang

Cross-domain visual data matching is one of the fundamental problems in many real-world vision tasks, e.g., matching persons across ID photos and surveillance videos.

Face Verification Model Optimization +2

DARI: Distance metric And Representation Integration for Person Verification

no code implementations • 15 Apr 2016 • Guangrun Wang, Liang Lin, Shengyong Ding, Ya Li, Qing Wang

The past decade has witnessed the rapid development of feature representation learning and distance metric learning, whereas the two steps are often discussed separately.

Ranked #7 on Person Re-Identification on SYSU-30k (using extra training data)

Metric Learning Person Re-Identification +1

Unconstrained Facial Landmark Localization with Backbone-Branches Fully-Convolutional Networks

no code implementations • 13 Jul 2015 • Zhujin Liang, Shengyong Ding, Liang Lin

This paper investigates how to rapidly and accurately localize facial landmarks in unconstrained, cluttered environments rather than in the well segmented face images.

Face Alignment

Geometric Scene Parsing with Hierarchical LSTM

no code implementations • 7 Apr 2016 • Zhanglin Peng, Ruimao Zhang, Xiaodan Liang, Xiaobai Liu, Liang Lin

This paper addresses the problem of geometric scene parsing, i.e. simultaneously labeling geometric surfaces (e.g. sky, ground and vertical plane) and determining the interaction relations (e.g. layering, supporting, siding and affinity) between main regions.

3D Reconstruction Scene Labeling

Semantic Object Parsing with Graph LSTM

no code implementations • 23 Mar 2016 • Xiaodan Liang, Xiaohui Shen, Jiashi Feng, Liang Lin, Shuicheng Yan

By taking the semantic object parsing task as an exemplar application scenario, we propose the Graph Long Short-Term Memory (Graph LSTM) network, which is the generalization of LSTM from sequential data or multi-dimensional data to general graph-structured data.

Object Superpixels

Character Proposal Network for Robust Text Extraction

no code implementations • 13 Feb 2016 • Shuye Zhang, Mude Lin, Tianshui Chen, Lianwen Jin, Liang Lin

Maximally stable extremal regions (MSER), which is a popular method to generate character proposals/candidates, has shown superior performance in scene text detection.

Scene Text Detection Text Detection

Learning Support Correlation Filters for Visual Tracking

no code implementations • 22 Jan 2016 • Wangmeng Zuo, Xiaohe Wu, Liang Lin, Lei Zhang, Ming-Hsuan Yang

Sampling and budgeting training examples are two essential factors in tracking algorithms based on support vector machines (SVMs) as a trade-off between accuracy and efficiency.

Visual Tracking

Deep Feature Learning with Relative Distance Comparison for Person Re-identification

no code implementations • 11 Dec 2015 • Shengyong Ding, Liang Lin, Guangrun Wang, Hongyang Chao

Identifying the same individual across different scenes is an important yet difficult task in intelligent video surveillance.

Ranked #9 on Person Re-Identification on SYSU-30k (using extra training data)

Person Re-Identification

DISC: Deep Image Saliency Computing via Progressive Representation Learning

no code implementations • 13 Nov 2015 • Tianshui Chen, Liang Lin, Lingbo Liu, Xiaonan Luo, Xuelong Li

Our DISC framework is capable of uniformly highlighting the objects of interest against complex backgrounds while preserving object details well.

object-detection Representation Learning +2

A Deep Structured Model with Radius-Margin Bound for 3D Human Activity Recognition

no code implementations • 5 Dec 2015 • Liang Lin, Keze Wang, WangMeng Zuo, Meng Wang, Jiebo Luo, Lei Zhang

Understanding human activity is very challenging even with the recently developed 3D/depth sensors.

Human Activity Recognition

Reversible Recursive Instance-level Object Segmentation

no code implementations • CVPR 2016 • Xiaodan Liang, Yunchao Wei, Xiaohui Shen, Zequn Jie, Jiashi Feng, Liang Lin, Shuicheng Yan

By being reversible, the proposal refinement sub-network adaptively determines an optimal number of refinement iterations required for each proposal during both training and testing.

Denoising Object +2

Semantic Object Parsing with Local-Global Long Short-Term Memory

no code implementations • CVPR 2016 • Xiaodan Liang, Xiaohui Shen, Donglai Xiang, Jiashi Feng, Liang Lin, Shuicheng Yan

The long chains of sequential computation by stacked LG-LSTM layers also enable each pixel to sense a much larger region for inference benefiting from the memorization of previous dependencies in all positions along all dimensions.

Memorization Position

Proposal-free Network for Instance-level Object Segmentation

no code implementations • 9 Sep 2015 • Xiaodan Liang, Yunchao Wei, Xiaohui Shen, Jianchao Yang, Liang Lin, Shuicheng Yan

Instance-level object segmentation is an important yet under-explored task.

Clustering Object +3

Bit-Scalable Deep Hashing with Regularized Similarity Learning for Image Retrieval and Person Re-identification

no code implementations • 19 Aug 2015 • Ruimao Zhang, Liang Lin, Rui Zhang, WangMeng Zuo, Lei Zhang

Furthermore, each bit of our hashing codes is unequally weighted so that we can manipulate the code lengths by truncating the insignificant bits.

Deep Hashing Image Retrieval +1

Deep Boosting: Joint Feature Selection and Analysis Dictionary Learning in Hierarchy

no code implementations • 8 Aug 2015 • Zhanglin Peng, Ya Li, Zhaoquan Cai, Liang Lin

In each layer, we construct a dictionary of filters by combining the filters from the lower layer, and iteratively optimize the image representation with a joint discriminative-generative formulation, i.e. minimization of empirical classification error plus regularization of analysis image generation over training images.

Classification Dictionary Learning +4

PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures with Edge-Preserving Coherence

no code implementations • CVPR 2013 • Keze Wang, Liang Lin, Jiangbo Lu, Chenglong Li, Keyang Shi

In this paper, we propose a unified framework called PISA, which stands for Pixelwise Image Saliency Aggregating various bottom-up cues and priors.

Image Segmentation Object Recognition +2

Computational Baby Learning

no code implementations • 11 Nov 2014 • Xiaodan Liang, Si Liu, Yunchao Wei, Luoqi Liu, Liang Lin, Shuicheng Yan

Then the concept detector can be fine-tuned based on these new instances.

object-detection Object Detection

F-SVM: Combination of Feature Transformation and SVM Learning via Convex Relaxation

no code implementations • 20 Apr 2015 • Xiaohe Wu, WangMeng Zuo, Yuanyuan Zhu, Liang Lin

The generalization error bound of support vector machine (SVM) depends on the ratio of radius and margin, while standard SVM only considers the maximization of the margin but ignores the minimization of the radius.

End-to-End Photo-Sketch Generation via Fully Convolutional Representation Learning

no code implementations • 28 Jan 2015 • Liliang Zhang, Liang Lin, Xian Wu, Shengyong Ding, Lei Zhang

Sketch-based face recognition is an interesting task in vision and multimedia research, yet it is quite challenging due to the great difference between face photos and sketches.

Face Recognition Representation Learning

Matching-CNN Meets KNN: Quasi-Parametric Human Parsing

no code implementations • CVPR 2015 • Si Liu, Xiaodan Liang, Luoqi Liu, Xiaohui Shen, Jianchao Yang, Changsheng Xu, Liang Lin, Xiaochun Cao, Shuicheng Yan

Under the classic K Nearest Neighbor (KNN)-based nonparametric framework, the parametric Matching Convolutional Neural Network (M-CNN) is proposed to predict the matching confidence and displacements of the best matched region in the testing image for a particular semantic region in one KNN image.

Human Parsing

Data-Driven Scene Understanding with Adaptively Retrieved Exemplars

no code implementations • 3 Feb 2015 • Xionghao Liu, Wei Yang, Liang Lin, Qing Wang, Zhaoquan Cai, Jian-Huang Lai

In the first step, the references are selected by jointly matching their appearances with the target as well as the semantics (i.e. the assigned labels of the target and the references).

Scene Understanding Semantic Segmentation +1

Incorporating Structural Alternatives and Sharing into Hierarchy for Multiclass Object Recognition and Detection

no code implementations • CVPR 2013 • Xiaolong Wang, Liang Lin, Lichao Huang, Shuicheng Yan

This paper proposes a reconfigurable model to recognize and detect multiclass (or multiview) objects with large variation in appearance.

Object Recognition valid

Deep Joint Task Learning for Generic Object Extraction

no code implementations • NeurIPS 2014 • Xiaolong Wang, Liliang Zhang, Liang Lin, Zhujin Liang, WangMeng Zuo

We present a general joint task learning framework, in which each task (either object localization or object segmentation) is tackled via a multi-layer convolutional neural network, and the two networks work collaboratively to boost performance.

Object Object Localization +1

Dynamical And-Or Graph Learning for Object Shape Modeling and Detection

no code implementations • NeurIPS 2012 • Xiaolong Wang, Liang Lin

A discriminative learning algorithm, extended from the CCCP [23], is proposed to train the model in a dynamical manner: the model structure (e.g., the configuration of the leaf-nodes associated with the or-nodes) is automatically determined with optimizing the multi-layer parameters during the iteration.

Graph Learning

Clothing Co-Parsing by Joint Image Segmentation and Labeling

no code implementations • CVPR 2014 • Wei Yang, Ping Luo, Liang Lin

This paper aims at developing an integrated system of clothing co-parsing, in order to jointly parse a set of clothing images (unsegmented but annotated with tags) into semantic configurations.

Image Segmentation Semantic Segmentation

Learning Contour-Fragment-based Shape Model with And-Or Tree Representation

no code implementations • 3 Feb 2015 • Liang Lin, Xiaolong Wang, Wei Yang, Jian-Huang Lai

This paper proposes a simple yet effective method to learn the hierarchical object shape model consisting of local contour fragments, which represents a category of shapes in the form of an And-Or tree.

Clustering Edge Detection +1

Deep Boosting: Layered Feature Mining for General Image Classification

no code implementations • 3 Feb 2015 • Zhanglin Peng, Liang Lin, Ruimao Zhang, Jing Xu

Constructing effective representations is a critical but challenging problem in multimedia understanding.

Classification General Classification +1

An Expressive Deep Model for Human Action Parsing from A Single Image

no code implementations • 2 Feb 2015 • Zhujin Liang, Xiaolong Wang, Rui Huang, Liang Lin

This paper addresses a newly emerging task in vision and multimedia research: recognizing human actions from still images.

Action Parsing Action Understanding +2

Towards a solid solution of real-time fire and flame detection

no code implementations • 2 Feb 2015 • Bo Jiang, Yongyi Lu, Xiying Li, Liang Lin

Although object detection and recognition have received growing attention for decades, robust fire and flame detection methods have rarely been explored.

object-detection Object Detection +1

Integrating Graph Partitioning and Matching for Trajectory Analysis in Video Surveillance

no code implementations • 2 Feb 2015 • Liang Lin, Yongyi Lu, Yan Pan, Xiaowu Chen

With this graph representation, we pose trajectory analysis as a joint task of spatial graph partitioning and temporal graph matching.

Attribute Graph Matching +1

Adaptive Scene Category Discovery with Generative Learning and Compositional Sampling

no code implementations • 2 Feb 2015 • Liang Lin, Ruimao Zhang, Xiaohua Duan

During the iterations of inference, the model of each category is analytically updated by a generative learning algorithm.

Image Categorization

Iterated Support Vector Machines for Distance Metric Learning

no code implementations • 2 Feb 2015 • Wangmeng Zuo, Faqiang Wang, David Zhang, Liang Lin, Yuchi Huang, Deyu Meng, Lei Zhang

Distance metric learning aims to learn from the given training data a valid distance metric, with which the similarity between data samples can be more effectively evaluated for classification.

Classification Face Verification +5

Complex Background Subtraction by Pursuing Dynamic Spatio-Temporal Models

no code implementations • 2 Feb 2015 • Liang Lin, Yuanlu Xu, Xiaodan Liang, Jian-Huang Lai

Although it has been widely discussed in video surveillance, background subtraction is still an open problem in the context of complex scenarios, e.g., dynamic backgrounds, illumination variations, and indistinct foreground objects.

Discriminatively Trained And-Or Graph Models for Object Shape Detection

no code implementations • 2 Feb 2015 • Liang Lin, Xiaolong Wang, Wei Yang, Jian-Huang Lai

In this paper, we investigate a novel reconfigurable part-based model, namely And-Or graph model, to recognize object shapes in images.

object-detection Object Detection

3D Human Activity Recognition with Reconfigurable Convolutional Neural Networks

no code implementations • 26 Jan 2015 • Keze Wang, Xiaolong Wang, Liang Lin, Meng Wang, WangMeng Zuo

Our model thus advances existing approaches in two aspects: (i) it acts directly on the raw inputs (grayscale-depth data) to conduct recognition instead of relying on hand-crafted features, and (ii) the model structure can be dynamically adjusted accounting for the temporal variations of human activities, i.e. the network configuration is allowed to be partially activated during inference.

Human Activity Recognition

Learning Latent Spatio-Temporal Compositional Model for Human Action Recognition

no code implementations • 1 Feb 2015 • Xiaodan Liang, Liang Lin, Liangliang Cao

Action recognition is an important problem in multimedia understanding.

Action Recognition Temporal Action Localization +1

Human Re-identification by Matching Compositional Template with Cluster Sampling

no code implementations • 1 Feb 2015 • Yuanlu Xu, Liang Lin, Wei-Shi Zheng, Xiaobai Liu

This paper addresses a newly emerging task in visual surveillance: re-identifying people at a distance by matching body information, given several reference examples.

Person Re-Identification

Correntropy Induced L2 Graph for Robust Subspace Clustering

no code implementations • 18 Jan 2015 • Canyi Lu, Jinhui Tang, Min Lin, Liang Lin, Shuicheng Yan, Zhouchen Lin

In this paper, we study the robust subspace clustering problem, which aims to cluster the given possibly noisy data points into their underlying subspaces.

Clustering graph construction

Crowd Counting using Deep Recurrent Spatial-Aware Network

no code implementations • 2 Jul 2018 • Lingbo Liu, Hongjun Wang, Guanbin Li, Wanli Ouyang, Liang Lin

Crowd counting from unconstrained scene images is a crucial task in many real-world applications like urban surveillance and management, but it is greatly challenged by the camera's perspective that causes huge appearance variations in people's scales and rotations.

Crowd Counting Management

Knowledge-Embedded Representation Learning for Fine-Grained Image Recognition

no code implementations • 2 Jul 2018 • Tianshui Chen, Liang Lin, Riquan Chen, Yang Wu, Xiaonan Luo

Humans can naturally understand an image in depth with the aid of rich knowledge accumulated from daily lives or professions.

Fine-Grained Image Classification Fine-Grained Image Recognition +1

SCAN: Self-and-Collaborative Attention Network for Video Person Re-identification

no code implementations • 16 Jul 2018 • Ruimao Zhang, Hongbin Sun, Jingyu Li, Yuying Ge, Liang Lin, Ping Luo, Xiaogang Wang

To address the above issues, we present a novel and practical deep architecture for video person re-identification termed Self-and-Collaborative Attention Network (SCAN).

Video-Based Person Re-Identification

Non-locally Enhanced Encoder-Decoder Network for Single Image De-raining

no code implementations • 4 Aug 2018 • Guanbin Li, Xiang He, Wei Zhang, Huiyou Chang, Le Dong, Liang Lin

Single image rain streaks removal has recently witnessed substantial progress due to the development of deep convolutional neural networks.

Neural Task Planning with And-Or Graph Representations

no code implementations • 25 Aug 2018 • Tianshui Chen, Riquan Chen, Lin Nie, Xiaonan Luo, Xiaobai Liu, Liang Lin

This paper focuses on semantic task planning, i.e., predicting a sequence of actions toward accomplishing a specific task under a certain scene, which is a new problem in computer vision research.

Common Sense Reasoning valid

Attentive Crowd Flow Machines

no code implementations • 1 Sep 2018 • Lingbo Liu, Ruimao Zhang, Jiefeng Peng, Guanbin Li, Bowen Du, Liang Lin

Traffic flow prediction is crucial for urban traffic management and public safety.

Management

Interpretable Visual Question Answering by Reasoning on Dependency Trees

no code implementations • 6 Sep 2018 • Qingxing Cao, Bailin Li, Xiaodan Liang, Liang Lin

Collaborative reasoning for understanding image-question pairs is a very critical but underexplored topic in interpretable visual question answering systems.

Question Answering valid +1

PIRM Challenge on Perceptual Image Enhancement on Smartphones: Report

no code implementations • 3 Oct 2018 • Andrey Ignatov, Radu Timofte, Thang Van Vu, Tung Minh Luu, Trung X. Pham, Cao Van Nguyen, Yongwoo Kim, Jae-Seok Choi, Munchurl Kim, Jie Huang, Jiewen Ran, Chen Xing, Xingguang Zhou, Pengfei Zhu, Mingrui Geng, Yawei Li, Eirikur Agustsson, Shuhang Gu, Luc van Gool, Etienne de Stoutz, Nikolay Kobyshev, Kehui Nie, Yan Zhao, Gen Li, Tong Tong, Qinquan Gao, Liu Hanwen, Pablo Navarrete Michelini, Zhu Dan, Hu Fengshuo, Zheng Hui, Xiumei Wang, Lirui Deng, Rang Meng, Jinghui Qin, Yukai Shi, Wushao Wen, Liang Lin, Ruicheng Feng, Shixiang Wu, Chao Dong, Yu Qiao, Subeesh Vasu, Nimisha Thekke Madam, Praveen Kandula, A. N. Rajagopalan, Jie Liu, Cheolkon Jung

This paper reviews the first challenge on efficient perceptual image enhancement with the focus on deploying deep learning models on smartphones.

Image Enhancement Image Super-Resolution

Learning Deep Representations for Semantic Image Parsing: a Comprehensive Overview

no code implementations • 10 Oct 2018 • Lili Huang, Jiefeng Peng, Ruimao Zhang, Guanbin Li, Liang Lin

Semantic image parsing, which refers to the process of decomposing images into semantic regions and constructing the structure representation of the input, has recently aroused widespread interest in the field of computer vision.

Representation Learning Segmentation +1

Cross-Modal Attentional Context Learning for RGB-D Object Detection

no code implementations • 30 Oct 2018 • Guanbin Li, Yukang Gan, Hejun Wu, Nong Xiao, Liang Lin

In this paper, we address this problem by developing a Cross-Modal Attentional Context (CMAC) learning framework, which enables the full exploitation of the context information from both RGB and depth data.

Autonomous Driving Object +2

FRAME Revisited: An Interpretation View Based on Particle Evolution

no code implementations • 4 Dec 2018 • Xu Cai, Yang Wu, Guanbin Li, Ziliang Chen, Liang Lin

FRAME (Filters, Random fields, And Maximum Entropy) is an energy-based descriptive model that synthesizes visual realism by capturing mutual patterns from structural input signals.

Descriptive

Facial Landmark Machines: A Backbone-Branches Architecture with Progressive Representation Learning

no code implementations • 10 Dec 2018 • Lingbo Liu, Guanbin Li, Yuan Xie, Yizhou Yu, Qing Wang, Liang Lin

In this paper, we propose a novel cascaded backbone-branches fully convolutional neural network (BB-FCN) for rapidly and accurately localizing facial landmarks in unconstrained and cluttered settings.

Face Alignment Face Detection +2

NADPEx: An on-policy temporally consistent exploration method for deep reinforcement learning

no code implementations • ICLR 2019 • Sirui Xie, Junning Huang, Lanxin Lei, Chunxiao Liu, Zheng Ma, Wei Zhang, Liang Lin

Reinforcement learning agents need exploratory behaviors to escape from local optima.

Continuous Control reinforcement-learning +1

Kalman Normalization: Normalizing Internal Representations Across Network Layers

no code implementations • NeurIPS 2018 • Guangrun Wang, Jiefeng Peng, Ping Luo, Xinjiang Wang, Liang Lin

In this paper, we present a novel normalization method, called Kalman Normalization (KN), for improving and accelerating the training of DNNs, particularly under the context of micro-batches.

object-detection Object Detection

Flow Guided Recurrent Neural Encoder for Video Salient Object Detection

no code implementations • CVPR 2018 • Guanbin Li, Yuan Xie, Tianhao Wei, Keze Wang, Liang Lin

Image saliency detection has recently witnessed significant progress due to deep convolutional neural networks.

Ranked #2 on Video Salient Object Detection on DAVSOD-Difficult20 (using extra training data)

Object object-detection +4

Interpretable Video Captioning via Trajectory Structured Localization

no code implementations • CVPR 2018 • Xian Wu, Guanbin Li, Qingxing Cao, Qingge Ji, Liang Lin

Automatically describing open-domain videos with natural language is attracting increasing interest in the field of artificial intelligence.

Image Captioning Sentence +2

Monocular Depth Estimation with Affinity, Vertical Pooling, and Label Enhancement

no code implementations • ECCV 2018 • Yukang Gan, Xiangyu Xu, Wenxiu Sun, Liang Lin

While significant progress has been made in monocular depth estimation with Convolutional Neural Networks (CNNs) extracting absolute features, such as edges and textures, the depth constraint of neighboring pixels, namely relative features, has been mostly ignored by recent methods.

Monocular Depth Estimation Stereo Matching +1

Generative Semantic Manipulation with Mask-Contrasting GAN

no code implementations • ECCV 2018 • Xiaodan Liang, Hao Zhang, Liang Lin, Eric Xing

Despite the promising results on paired/unpaired image-to-image translation achieved by Generative Adversarial Networks (GANs), prior works often only transfer the low-level information (e.g. color or texture changes), but fail to manipulate high-level semantic meanings (e.g., geometric structure or content) of different object regions.

Image-to-Image Translation

Robust Region Grouping via Internal Patch Statistics

no code implementations • CVPR 2013 • Xiaobai Liu, Liang Lin, Alan L. Yuille

In this work, we present an efficient multi-scale low-rank representation for image segmentation.

Image Segmentation Semantic Segmentation +1

PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures with Spatial Priors

no code implementations • CVPR 2013 • Keyang Shi, Keze Wang, Jiangbo Lu, Liang Lin

By fusing complementary contrast measures in such a pixelwise adaptive manner, the detection effectiveness is significantly boosted.

Image Segmentation Object Recognition +2

Discriminative Learning of Iteration-Wise Priors for Blind Deconvolution

no code implementations • CVPR 2015 • Wangmeng Zuo, Dongwei Ren, Shuhang Gu, Liang Lin, Lei Zhang

The maximum a posteriori (MAP)-based blind deconvolution framework generally involves two stages: blur kernel estimation and non-blind restoration.

Deblurring

SOLD: Sub-Optimal Low-rank Decomposition for Efficient Video Segmentation

no code implementations • CVPR 2015 • Chenglong Li, Liang Lin, WangMeng Zuo, Shuicheng Yan, Jin Tang

In particular, the affinity matrix with the rank fixed can be decomposed into two sub-matrices of low rank, and then we iteratively optimize them with closed-form solutions.

Video Segmentation Video Semantic Segmentation

Joint Learning of Single-Image and Cross-Image Representations for Person Re-Identification

no code implementations • CVPR 2016 • Faqiang Wang, WangMeng Zuo, Liang Lin, David Zhang, Lei Zhang

Person re-identification has been usually solved as either the matching of single-image representation (SIR) or the classification of cross-image representation (CIR).

Person Re-Identification
