Search Results for author: Guanbin Li

Found 102 papers, 42 papers with code

Propagating Over Phrase Relations for One-Stage Visual Grounding

no code implementations ECCV 2020 Sibei Yang, Guanbin Li, Yizhou Yu

Phrase level visual grounding aims to locate in an image the corresponding visual regions referred to by multiple noun phrases in a given sentence.

Phrase Grounding Relational Reasoning +1

Self-Supervised Correction Learning for Semi-Supervised Biomedical Image Segmentation

1 code implementation 12 Jan 2023 Ruifei Zhang, Sishuo Liu, Yizhou Yu, Guanbin Li

Since the two tasks rely on similar feature information, the unlabeled data effectively enhances the representation of the network to the lesion regions and further improves the segmentation performance.

Image Segmentation Medical Image Segmentation +2

Lesion-aware Dynamic Kernel for Polyp Segmentation

1 code implementation 12 Jan 2023 Ruifei Zhang, Peiwen Lai, Xiang Wan, De-Jun Fan, Feng Gao, Xiao-Jian Wu, Guanbin Li

Automatic and accurate polyp segmentation plays an essential role in early colorectal cancer diagnosis.

Adaptive Context Selection for Polyp Segmentation

1 code implementation 12 Jan 2023 Ruifei Zhang, Guanbin Li, Zhen Li, Shuguang Cui, Dahong Qian, Yizhou Yu

To tackle these issues, we propose an adaptive context selection based encoder-decoder framework composed of a Local Context Attention (LCA) module, a Global Context Module (GCM), and an Adaptive Selection Module (ASM).

Which Pixel to Annotate: a Label-Efficient Nuclei Segmentation Framework

1 code implementation 20 Dec 2022 Wei Lou, Haofeng Li, Guanbin Li, Xiaoguang Han, Xiang Wan

Recently, deep neural networks, which require a large amount of annotated samples, have been widely applied in nuclei instance segmentation of H&E stained pathology images.

Instance Segmentation Semantic Segmentation

BoxPolyp: Boost Generalized Polyp Segmentation Using Extra Coarse Bounding Box Annotations

no code implementations 7 Dec 2022 Jun Wei, Yiwen Hu, Guanbin Li, Shuguang Cui, S Kevin Zhou, Zhen Li

In practice, box annotations are applied to alleviate the over-fitting issue of previous polyp segmentation models, which generate fine-grained polyp area through the iterative boosted segmentation model.

Divide and Contrast: Source-free Domain Adaptation via Adaptive Contrastive Learning

1 code implementation 12 Nov 2022 Ziyi Zhang, Weikai Chen, Hui Cheng, Zhen Li, Siyuan Li, Liang Lin, Guanbin Li

We investigate a practical domain adaptation task, called source-free domain adaptation (SFUDA), where the source-pretrained model is adapted to the target domain without access to the source data.

Contrastive Learning Source-Free Domain Adaptation

OhMG: Zero-shot Open-vocabulary Human Motion Generation

no code implementations 28 Oct 2022 Junfan Lin, Jianlong Chang, Lingbo Liu, Guanbin Li, Liang Lin, Qi Tian, Chang-Wen Chen

After that, by formulating the generated poses from the text2pose stage as prompts, the motion generator can generate motions referring to the poses in a controllable and flexible manner.

Language Modelling

View-Disentangled Transformer for Brain Lesion Detection

1 code implementation 20 Sep 2022 Haofeng Li, Junjia Huang, Guanbin Li, Zhou Liu, Yihong Zhong, Yingying Chen, Yunfei Wang, Xiang Wan

Deep neural networks (DNNs) have been widely adopted in brain lesion detection and segmentation.

Lesion Detection

Attentive Symmetric Autoencoder for Brain MRI Segmentation

1 code implementation 19 Sep 2022 Junjia Huang, Haofeng Li, Guanbin Li, Xiang Wan

Self-supervised learning methods based on image patch reconstruction have witnessed great success in training auto-encoders, whose pre-trained weights can be transferred to fine-tune other downstream tasks of image understanding.

Image Segmentation MRI segmentation +2

Align, Reason and Learn: Enhancing Medical Vision-and-Language Pre-training with Knowledge

1 code implementation 15 Sep 2022 Zhihong Chen, Guanbin Li, Xiang Wan

Most existing methods mainly contain three elements: uni-modal encoders (i.e., a vision encoder and a language encoder), a multi-modal fusion module, and pretext tasks, with few studies considering the importance of medical domain expert knowledge and explicitly exploiting such knowledge to facilitate Med-VLP.

Multi-Modal Masked Autoencoders for Medical Vision-and-Language Pre-Training

1 code implementation 15 Sep 2022 Zhihong Chen, Yuhao Du, Jinpeng Hu, Yang Liu, Guanbin Li, Xiang Wan, Tsung-Hui Chang

Besides, we conduct further analysis to better verify the effectiveness of different components of our approach and various settings of pre-training.

Self-Supervised Learning

Neighborhood Collective Estimation for Noisy Label Identification and Correction

1 code implementation 5 Aug 2022 Jichang Li, Guanbin Li, Feng Liu, Yizhou Yu

Specifically, our method is divided into two steps: 1) Neighborhood Collective Noise Verification to separate all training samples into a clean or noisy subset, 2) Neighborhood Collective Label Correction to relabel noisy samples, and then auxiliary techniques are used to assist further model optimization.

Learning with noisy labels Model Optimization

Robust Real-World Image Super-Resolution against Adversarial Attacks

1 code implementation 31 Jul 2022 Jiutao Yue, Haofeng Li, Pengxu Wei, Guanbin Li, Liang Lin

Since the frequency masking may not only destroy the adversarial perturbations but also affect the sharp details in a clean image, we further develop an adversarial sample classifier based on the frequency domain of images to determine whether to apply the proposed mask module.

Image Super-Resolution

Centrality and Consistency: Two-Stage Clean Samples Identification for Learning with Instance-Dependent Noisy Labels

1 code implementation 29 Jul 2022 Ganlong Zhao, Guanbin Li, Yipeng Qin, Feng Liu, Yizhou Yu

In this paper, we propose a two-stage clean samples identification method to address the aforementioned challenge.

Ranked #3 on Image Classification on Clothing1M (using extra training data)

Image Classification

Cross-Modal Causal Relational Reasoning for Event-Level Visual Question Answering

no code implementations 26 Jul 2022 Yang Liu, Guanbin Li, Liang Lin

Existing visual question answering methods tend to capture the cross-modal spurious correlations, and fail to discover the true causal mechanism that facilitates reasoning truthfully based on the dominant visual evidence and the question intention.

Causal Inference Question Answering +2

Less is More: Adaptive Curriculum Learning for Thyroid Nodule Diagnosis

1 code implementation 2 Jul 2022 Haifan Gong, Hui Cheng, Yifan Xie, Shuangyi Tan, Guanqi Chen, Fei Chen, Guanbin Li

Thyroid nodule classification aims at determining whether the nodule is benign or malignant based on a given ultrasound image.


BronchusNet: Region and Structure Prior Embedded Representation Learning for Bronchus Segmentation and Classification

no code implementations 14 May 2022 Wenhao Huang, Haifan Gong, Huan Zhang, Yu Wang, Haofeng Li, Guanbin Li, Hong Shen

CT-based bronchial tree analysis plays an important role in the computer-aided diagnosis for respiratory diseases, as it could provide structured information for clinicians.

Classification Graph Learning +2

Multi-level Consistency Learning for Semi-supervised Domain Adaptation

1 code implementation 9 May 2022 Zizheng Yan, Yushuang Wu, Guanbin Li, Yipeng Qin, Xiaoguang Han, Shuguang Cui

Semi-supervised domain adaptation (SSDA) aims to apply knowledge learned from a fully labeled source domain to a scarcely labeled target domain.

Domain Adaptation

Dual Adversarial Adaptation for Cross-Device Real-World Image Super-Resolution

1 code implementation CVPR 2022 Xiaoqian Xu, Pengxu Wei, Weikai Chen, Mingzhi Mao, Liang Lin, Guanbin Li

To address this issue, we propose an unsupervised domain adaptation mechanism for real-world SR, named Dual ADversarial Adaptation (DADA), which only requires LR images in the target domain with available real paired data from a source camera.

Image Super-Resolution Unsupervised Domain Adaptation

Causal Reasoning Meets Visual Representation Learning: A Prospective Study

no code implementations 26 Apr 2022 Yang Liu, Yushen Wei, Hong Yan, Guanbin Li, Liang Lin

Visual representation learning is ubiquitous in various real-world applications, including visual comprehension, video understanding, multi-modal analysis, human-computer interaction, and urban computing.

Out-of-Distribution Generalization Representation Learning +1

Open Set Domain Adaptation By Novel Class Discovery

no code implementations 7 Mar 2022 Jingyu Zhuang, Ziliang Chen, Pengxu Wei, Guanbin Li, Liang Lin

In Open Set Domain Adaptation (OSDA), large amounts of target samples are drawn from the implicit categories that never appear in the source domain.

Domain Adaptation Novel Class Discovery

Unsupervised Domain Adaptive Salient Object Detection Through Uncertainty-Aware Pseudo-Label Learning

1 code implementation 26 Feb 2022 Pengxiang Yan, Ziyi Wu, Mengmeng Liu, Kun Zeng, Liang Lin, Guanbin Li

To relieve the burden of labor-intensive labeling, deep unsupervised SOD methods have been proposed to exploit noisy labels generated by handcrafted saliency methods.

object-detection Object Detection +2

PointMatch: A Consistency Training Framework for Weakly Supervised Semantic Segmentation of 3D Point Clouds

no code implementations 22 Feb 2022 Yushuang Wu, Shengcai Cai, Zizheng Yan, Guanbin Li, Yizhou Yu, Xiaoguang Han, Shuguang Cui

Semantic segmentation of point clouds usually relies on dense annotation that is exhausting and costly, so weakly supervised schemes that require only sparsely annotated points have attracted wide attention.

Representation Learning Weakly supervised Semantic Segmentation +1

Cross-level Contrastive Learning and Consistency Constraint for Semi-supervised Medical Image Segmentation

1 code implementation 8 Feb 2022 Xinkai Zhao, Chaowei Fang, De-Jun Fan, Xutao Lin, Feng Gao, Guanbin Li

Semi-supervised learning (SSL), which aims at leveraging a few labeled images and a large number of unlabeled images for network training, is beneficial for relieving the burden of data annotation in medical image segmentation.

Contrastive Learning Image Segmentation +4

Explore before Moving: A Feasible Path Estimation and Memory Recalling Framework for Embodied Navigation

no code implementations 16 Oct 2021 Yang Wu, Shirui Feng, Guanbin Li, Liang Lin

PEMR includes a "looking ahead" process, i.e., a visual feature extractor module that estimates feasible paths for gathering 3D navigational information, mimicking the human sense of direction.

Common Sense Reasoning Embodied Question Answering +1

Road Network Guided Fine-Grained Urban Traffic Flow Inference

no code implementations 29 Sep 2021 Lingbo Liu, Mengmeng Liu, Guanbin Li, Ziyi Wu, Liang Lin

Subsequently, we incorporate the road network feature and coarse-grained flow feature to regularize the short-range spatial distribution modeling of road-relative traffic flow.

Trash to Treasure: Harvesting OOD Data with Cross-Modal Matching for Open-Set Semi-Supervised Learning

no code implementations ICCV 2021 Junkai Huang, Chaowei Fang, Weikai Chen, Zhenhua Chai, Xiaolin Wei, Pengxu Wei, Liang Lin, Guanbin Li

Open-set semi-supervised learning (open-set SSL) investigates a challenging but practical scenario where out-of-distribution (OOD) samples are contained in the unlabeled data.

Towards Interpretable Deep Networks for Monocular Depth Estimation

1 code implementation ICCV 2021 Zunzhi You, Yi-Hsuan Tsai, Wei-Chen Chiu, Guanbin Li

Based on our observations, we quantify the interpretability of a deep MDE network by the depth selectivity of its hidden units.

Monocular Depth Estimation

Weakly-Supervised Spatio-Temporal Anomaly Detection in Surveillance Video

no code implementations 9 Aug 2021 Jie Wu, Wei Zhang, Guanbin Li, Wenhao Wu, Xiao Tan, YingYing Li, Errui Ding, Liang Lin

In this paper, we introduce a novel task, referred to as Weakly-Supervised Spatio-Temporal Anomaly Detection (WSSTAD) in surveillance video.

Anomaly Detection

Colorectal Polyp Classification from White-light Colonoscopy Images via Domain Alignment

no code implementations 5 Aug 2021 Qin Wang, Hui Che, Weizhen Ding, Li Xiang, Guanbin Li, Zhen Li, Shuguang Cui

Thus, we propose a novel framework based on a teacher-student architecture for the accurate colorectal polyp classification (CPC) through directly using white-light (WL) colonoscopy images in the examination.

Contrastive Learning

Online Metro Origin-Destination Prediction via Heterogeneous Information Aggregation

1 code implementation 2 Jul 2021 Lingbo Liu, Yuying Zhu, Guanbin Li, Ziyi Wu, Lei Bai, Liang Lin

In this work, we propose a novel neural network module termed Heterogeneous Information Aggregation Machine (HIAM), which fully exploits heterogeneous information of historical data (e.g., incomplete OD matrices, unfinished order vectors, and DO matrices) to jointly learn the evolutionary patterns of OD and DO ridership.

Time Series Time Series Analysis

Bottom-Up Shift and Reasoning for Referring Image Segmentation

1 code implementation CVPR 2021 Sibei Yang, Meng Xia, Guanbin Li, Hong-Yu Zhou, Yizhou Yu

In this paper, we tackle the challenge by jointly performing compositional visual reasoning and accurate segmentation in a single stage via the proposed novel Bottom-Up Shift (BUS) and Bidirectional Attentive Refinement (BIAR) modules.

Image Segmentation Semantic Segmentation +1

Cross-Modal Progressive Comprehension for Referring Segmentation

1 code implementation 15 May 2021 Si Liu, Tianrui Hui, Shaofei Huang, Yunchao Wei, Bo Li, Guanbin Li

In this paper, we propose a Cross-Modal Progressive Comprehension (CMPC) scheme to effectively mimic human behaviors and implement it as a CMPC-I (Image) module and a CMPC-V (Video) module to improve referring image and video segmentation models.

Image Segmentation Referring Expression Segmentation +3

Collaborative Spatial-Temporal Modeling for Language-Queried Video Actor Segmentation

no code implementations CVPR 2021 Tianrui Hui, Shaofei Huang, Si Liu, Zihan Ding, Guanbin Li, Wenguan Wang, Jizhong Han, Fei Wang

Though 3D convolutions are amenable to recognizing which actor is performing the queried actions, they also inevitably introduce misaligned spatial information from adjacent frames, which confuses features of the target frame and yields inaccurate segmentation.

Referring Expression Segmentation

Cross-Domain Adaptive Clustering for Semi-Supervised Domain Adaptation

2 code implementations CVPR 2021 Jichang Li, Guanbin Li, Yemin Shi, Yizhou Yu

Pseudo labeling expands the number of "labeled" samples in each class in the target domain, and thus produces a more robust and powerful cluster core for each class to facilitate adversarial learning.

Domain Adaptation

Deep Transformers for Fast Small Intestine Grounding in Capsule Endoscope Video

no code implementations 7 Apr 2021 Xinkai Zhao, Chaowei Fang, Feng Gao, De-Jun Fan, Xutao Lin, Guanbin Li

In this paper, we propose a deep model to ground the shooting range of the small intestine in a capsule endoscope video that lasts tens of hours.

Scene-Intuitive Agent for Remote Embodied Visual Grounding

no code implementations CVPR 2021 Xiangru Lin, Guanbin Li, Yizhou Yu

Intuitively, we comprehend the semantics of the instruction to form an overview of where a bathroom is and what a blue towel is in mind; then, we navigate to the target location by consistently matching the bathroom appearance in mind with the current scene.

Navigate Referring Expression +1

LapsCore: Language-Guided Person Search via Color Reasoning

no code implementations ICCV 2021 Yushuang Wu, Zizheng Yan, Xiaoguang Han, Guanbin Li, Changqing Zou, Shuguang Cui

The key point of language-guided person search is to construct the cross-modal association between visual and textual input.

Association Colorization +3

Adversarial Training using Contrastive Divergence

no code implementations 1 Jan 2021 Hongjun Wang, Guanbin Li, Liang Lin

To protect the security of machine learning models against adversarial examples, adversarial training becomes the most popular and powerful strategy against various adversarial attacks by injecting adversarial examples into training data.

A Hamiltonian Monte Carlo Method for Probabilistic Adversarial Attack and Learning

no code implementations 15 Oct 2020 Hongjun Wang, Guanbin Li, Xiaobai Liu, Liang Lin

Although deep convolutional neural networks (CNNs) have demonstrated remarkable performance on multiple computer vision tasks, research on adversarial learning has shown that deep models are vulnerable to adversarial examples, which are crafted by adding visually imperceptible perturbations to the input images.

Adversarial Attack

Contralaterally Enhanced Networks for Thoracic Disease Detection

no code implementations 9 Oct 2020 Gangming Zhao, Chaowei Fang, Guanbin Li, Licheng Jiao, Yizhou Yu

Aimed at improving the performance of existing detection methods, we propose a deep end-to-end module to exploit the contralateral context information for enhancing feature representations of disease proposals.

Referring Image Segmentation via Cross-Modal Progressive Comprehension

1 code implementation CVPR 2020 Shaofei Huang, Tianrui Hui, Si Liu, Guanbin Li, Yunchao Wei, Jizhong Han, Luoqi Liu, Bo Li

In addition to the CMPC module, we further leverage a simple yet effective TGFE module to integrate the reasoned multimodal features from different levels with the guidance of textual information.

Image Segmentation Referring Expression Segmentation +1

Reinforcement Learning for Weakly Supervised Temporal Grounding of Natural Language in Untrimmed Videos

no code implementations 18 Sep 2020 Jie Wu, Guanbin Li, Xiaoguang Han, Liang Lin

Temporal grounding of natural language in untrimmed videos is a fundamental yet challenging multimedia task facilitating cross-media visual content retrieval.

reinforcement-learning reinforcement Learning +2

Online Alternate Generator against Adversarial Attacks

no code implementations 17 Sep 2020 Haofeng Li, Yirui Zeng, Guanbin Li, Liang Lin, Yizhou Yu

The field of computer vision has witnessed phenomenal progress in recent years partially due to the development of deep convolutional neural networks.

Semantics-aware Adaptive Knowledge Distillation for Sensor-to-Vision Action Recognition

1 code implementation 1 Sep 2020 Yang Liu, Keze Wang, Guanbin Li, Liang Lin

In this paper, we propose a novel framework, named Semantics-aware Adaptive Knowledge Distillation Networks (SAKDN), to enhance action recognition in vision-sensor modality (videos) by adaptively transferring and distilling the knowledge from multiple wearable sensors.

Action Recognition Image Generation +2

Graph-Structured Referring Expression Reasoning in The Wild

1 code implementation CVPR 2020 Sibei Yang, Guanbin Li, Yizhou Yu

The linguistic structure of a referring expression provides a layout of reasoning over the visual contents, and it is often crucial to align and jointly understand the image and the referring expression.

Referring Expression

Efficient Crowd Counting via Structured Knowledge Transfer

2 code implementations 23 Mar 2020 Lingbo Liu, Jiaqi Chen, Hefeng Wu, Tianshui Chen, Guanbin Li, Liang Lin

Crowd counting is an application-oriented task and its inference efficiency is crucial for real-world applications.

Crowd Counting Transfer Learning

Peeking into occluded joints: A novel framework for crowd pose estimation

1 code implementation ECCV 2020 Lingteng Qiu, Xuanye Zhang, Yan-ran Li, Guanbin Li, Xiao-Jun Wu, Zixiang Xiong, Xiaoguang Han, Shuguang Cui

Although occlusion widely exists in nature and remains a fundamental challenge for pose estimation, existing heatmap-based approaches suffer serious degradation on occlusions.

Pose Estimation

Depthwise Non-local Module for Fast Salient Object Detection Using a Single Thread

no code implementations 22 Jan 2020 Haofeng Li, Guanbin Li, BinBin Yang, Guanqi Chen, Liang Lin, Yizhou Yu

The proposed algorithm for the first time achieves competitive accuracy and high inference efficiency simultaneously with a single CPU thread.

Image Classification object-detection +3

Physical-Virtual Collaboration Modeling for Intra-and Inter-Station Metro Ridership Prediction

2 code implementations 14 Jan 2020 Lingbo Liu, Jingwen Chen, Hefeng Wu, Jiajie Zhen, Guanbin Li, Liang Lin

To address this problem, we model a metro system as graphs with various topologies and propose a unified Physical-Virtual Collaboration Graph Network (PVCGN), which can effectively learn the complex ridership patterns from the tailor-designed graphs.

Representation Learning

An Adversarial Perturbation Oriented Domain Adaptation Approach for Semantic Segmentation

no code implementations 18 Dec 2019 Jihan Yang, Ruijia Xu, Ruiyu Li, Xiaojuan Qi, Xiaoyong Shen, Guanbin Li, Liang Lin

In contrast to adversarial alignment, we propose to explicitly train a domain-invariant classifier by generating and defending against pointwise feature space adversarial perturbations.

Semantic Segmentation Unsupervised Domain Adaptation

Globally Guided Progressive Fusion Network for 3D Pancreas Segmentation

no code implementations 23 Nov 2019 Chaowei Fang, Guanbin Li, Chengwei Pan, Yiming Li, Yizhou Yu

Recently, 3D volumetric organ segmentation has attracted much research interest in medical image analysis due to its significance in computer-aided diagnosis.

Pancreas Segmentation

Self-Enhanced Convolutional Network for Facial Video Hallucination

no code implementations 23 Nov 2019 Chaowei Fang, Guanbin Li, Xiaoguang Han, Yizhou Yu

It further recurrently exploits the reconstructed results and intermediate features of a sequence of preceding frames to improve the initial super-resolution of the current frame by modelling the coherence of structural facial features across frames.

Video Super-Resolution

Knowledge Graph Transfer Network for Few-Shot Recognition

1 code implementation 21 Nov 2019 Riquan Chen, Tianshui Chen, Xiaolu Hui, Hefeng Wu, Guanbin Li, Liang Lin

In this work, we represent the semantic correlations in the form of a structured knowledge graph and integrate this graph into deep neural networks to promote few-shot learning by a novel Knowledge Graph Transfer Network (KGTN).

Few-Shot Learning Novel Concepts +1

Generalizing Energy-based Generative ConvNets from Particle Evolution Perspective

no code implementations 31 Oct 2019 Yang Wu, Xu Cai, Pengxu Wei, Guanbin Li, Liang Lin

Compared with Generative Adversarial Networks (GANs), Energy-Based generative Models (EBMs) possess two appealing properties: i) they can be directly optimized without requiring an auxiliary network during learning and synthesis; ii) they can better approximate the underlying distribution of the observed data by explicitly learning potential functions.

Learning to Recognize the Unseen Visual Predicates

no code implementations 25 Sep 2019 Defa Zhu, Si Liu, Wentao Jiang, Guanbin Li, Tianyi Wu, Guodong Guo

Visual relationship recognition models are limited in the ability to generalize from finite seen predicates to unseen ones.

Question Answering Visual Question Answering +1

Dynamic Graph Attention for Referring Expression Comprehension

no code implementations ICCV 2019 Sibei Yang, Guanbin Li, Yizhou Yu

In this paper, we explore the problem of referring expression comprehension from the perspective of language-driven visual reasoning, and propose a dynamic graph attention network to perform multi-step reasoning by modeling both the relationships among the objects in the image and the linguistic structure of the expression.

Graph Attention Referring Expression +2

Motion Guided Attention for Video Salient Object Detection

2 code implementations ICCV 2019 Haofeng Li, Guanqi Chen, Guanbin Li, Yizhou Yu

In this paper, we develop a multi-task motion guided video salient object detection network, which learns to accomplish two sub-tasks using two sub-networks, one sub-network for salient object detection in still images and the other for motion saliency detection in optical flow images.

object-detection Optical Flow Estimation +3

Dynamic Spatial-Temporal Representation Learning for Traffic Flow Prediction

2 code implementations 2 Sep 2019 Lingbo Liu, Jiajie Zhen, Guanbin Li, Geng Zhan, Zhaocheng He, Bowen Du, Liang Lin

Specifically, the first ConvLSTM unit takes normal traffic flow features as input and generates a hidden state at each time-step, which is further fed into the connected convolutional layer for spatial attention map inference.

Representation Learning Traffic Prediction

Fashion Retrieval via Graph Reasoning Networks on a Similarity Pyramid

no code implementations ICCV 2019 Zhanghui Kuang, Yiming Gao, Guanbin Li, Ping Luo, Yimin Chen, Liang Lin, Wayne Zhang

To address this issue, we propose a novel Graph Reasoning Network (GRNet) on a Similarity Pyramid, which learns similarities between a query and a gallery cloth by using both global and local representations in multiple scales.

Image Retrieval Retrieval

Crowd Counting with Deep Structured Scale Integration Network

no code implementations ICCV 2019 Lingbo Liu, Zhilin Qiu, Guanbin Li, Shufan Liu, Wanli Ouyang, Liang Lin

Automatic estimation of the number of people in unconstrained crowded scenes is a challenging task and one major difficulty stems from the huge scale variation of people.

Crowd Counting Representation Learning

Semi-Supervised Video Salient Object Detection Using Pseudo-Labels

1 code implementation ICCV 2019 Pengxiang Yan, Guanbin Li, Yuan Xie, Zhen Li, Chuan Wang, Tianshui Chen, Liang Lin

Specifically, we present an effective video saliency detector that consists of a spatial refinement network and a spatiotemporal module.

Ranked #1 on Video Salient Object Detection on VOS-T (using extra training data)

object-detection Salient Object Detection +2

Semi-supervised Skin Detection by Network with Mutual Guidance

no code implementations ICCV 2019 Yi He, Jiayuan Shi, Chuan Wang, Haibin Huang, Jiaming Liu, Guanbin Li, Risheng Liu, Jue Wang

In this paper we present a new data-driven method for robust skin detection from a single human portrait image.

Multivariate-Information Adversarial Ensemble for Scalable Joint Distribution Matching

1 code implementation 8 Jul 2019 Ziliang Chen, Zhanfu Yang, Xiaoxi Wang, Xiaodan Liang, Xiaopeng Yan, Guanbin Li, Liang Lin

A broad range of cross-$m$-domain generation research boils down to matching a joint distribution by deep generative models (DGMs).

Relationship-Embedded Representation Learning for Grounding Referring Expressions

1 code implementation CVPR 2019 Sibei Yang, Guanbin Li, Yizhou Yu

Unfortunately, existing work on grounding referring expressions fails to accurately extract multi-order relationships from the referring expression and associate them with the objects and their related contexts in the image.

Referring Expression Representation Learning

Contextualized Spatial-Temporal Network for Taxi Origin-Destination Demand Prediction

no code implementations 15 May 2019 Lingbo Liu, Zhilin Qiu, Guanbin Li, Qing Wang, Wanli Ouyang, Liang Lin

Finally, a GCC module is applied to model the correlation between all regions by computing a global correlation feature as a weighted sum of all regional features, with the weights being calculated as the similarity between the corresponding region pairs.
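
The similarity-weighted sum described in this snippet can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `global_correlation`, the dot-product similarity, and the softmax normalisation of the weights are all assumptions made for illustration, since the snippet only states that the weights come from the similarity between region pairs.

```python
import math


def global_correlation(features):
    """Return, for each region, a similarity-weighted sum of all regional features.

    features: list of equal-length lists of floats, one vector per region.
    Assumptions (not from the source): dot-product similarity between region
    pairs, and softmax normalisation so each region's weights sum to 1.
    """
    out = []
    for f_i in features:
        # Similarity between region i and every region j (including itself).
        sims = [sum(a * b for a, b in zip(f_i, f_j)) for f_j in features]
        # Numerically stable softmax over the similarities.
        m = max(sims)
        exps = [math.exp(s - m) for s in sims]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of all regional features, one dimension at a time.
        dim = len(f_i)
        out.append(
            [sum(w * f_j[d] for w, f_j in zip(weights, features)) for d in range(dim)]
        )
    return out


if __name__ == "__main__":
    regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for vec in global_correlation(regions):
        print([round(v, 3) for v in vec])
```

Each output vector has the same dimensionality as a regional feature, so it could be concatenated with or added to the local feature of that region; which of these the paper does is not stated in the snippet.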

ROSA: Robust Salient Object Detection against Adversarial Attacks

no code implementations 9 May 2019 Haofeng Li, Guanbin Li, Yizhou Yu

To our knowledge, this paper is the first one that mounts successful adversarial attacks on salient object detection models and verifies that adversarial samples are effective on a wide range of existing methods.

object-detection RGB Salient Object Detection +1

Face Hallucination by Attentive Sequence Optimization with Reinforcement Learning

no code implementations 4 May 2019 Yukai Shi, Guanbin Li, Qingxing Cao, Keze Wang, Liang Lin

Face hallucination is a domain-specific super-resolution problem that aims to generate a high-resolution (HR) face image from a low-resolution (LR) input.

Face Hallucination reinforcement-learning +2

Semantic Relationships Guided Representation Learning for Facial Action Unit Recognition

no code implementations 22 Apr 2019 Guanbin Li, Xin Zhu, Yirui Zeng, Qing Wang, Liang Lin

Specifically, by analyzing the symbiosis and mutual exclusion of AUs in various facial expressions, we organize the facial AUs in the form of a structured knowledge graph and integrate a Gated Graph Neural Network (GGNN) in a multi-scale CNN framework to propagate node information through the graph for generating enhanced AU representation.

Facial Action Unit Detection Representation Learning

Harvesting Visual Objects from Internet Images via Deep Learning Based Objectness Assessment

no code implementations 1 Apr 2019 Kan Wu, Guanbin Li, Haofeng Li, Jianjun Zhang, Yizhou Yu

As a concrete example, a database of over 1.2 million visual objects has been built using the proposed method, and has been successfully used in various data-driven image applications.

Image Generation Re-Ranking

Deep RBFNet: Point Cloud Feature Learning using Radial Basis Functions

no code implementations 11 Dec 2018 Weikai Chen, Xiaoguang Han, Guanbin Li, Chao Chen, Jun Xing, Yajie Zhao, Hao Li

Three-dimensional object recognition has recently achieved great progress thanks to the development of effective point cloud-based learning frameworks, such as PointNet and its extensions.

3D Object Recognition

Facial Landmark Machines: A Backbone-Branches Architecture with Progressive Representation Learning

no code implementations 10 Dec 2018 Lingbo Liu, Guanbin Li, Yuan Xie, Yizhou Yu, Qing Wang, Liang Lin

In this paper, we propose a novel cascaded backbone-branches fully convolutional neural network (BB-FCN) for rapidly and accurately localizing facial landmarks in unconstrained and cluttered settings.

Face Alignment Face Detection +2

FRAME Revisited: An Interpretation View Based on Particle Evolution

no code implementations 4 Dec 2018 Xu Cai, Yang Wu, Guanbin Li, Ziliang Chen, Liang Lin

FRAME (Filters, Random fields, And Maximum Entropy) is an energy-based descriptive model that synthesizes visual realism by capturing mutual patterns from structural input signals.

Cross-Modal Attentional Context Learning for RGB-D Object Detection

no code implementations 30 Oct 2018 Guanbin Li, Yukang Gan, Hejun Wu, Nong Xiao, Liang Lin

In this paper, we address this problem by developing a Cross-Modal Attentional Context (CMAC) learning framework, which enables the full exploitation of the context information from both RGB and depth data.

Autonomous Driving object-detection +1

Learning Deep Representations for Semantic Image Parsing: a Comprehensive Overview

no code implementations 10 Oct 2018 Lili Huang, Jiefeng Peng, Ruimao Zhang, Guanbin Li, Liang Lin

Semantic image parsing, which refers to the process of decomposing images into semantic regions and constructing the structure representation of the input, has recently aroused widespread interest in the field of computer vision.

Representation Learning Semantic Segmentation

Attentive Crowd Flow Machines

no code implementations 1 Sep 2018 Lingbo Liu, Ruimao Zhang, Jiefeng Peng, Guanbin Li, Bowen Du, Liang Lin

Traffic flow prediction is crucial for urban traffic management and public safety.


Non-locally Enhanced Encoder-Decoder Network for Single Image De-raining

no code implementations 4 Aug 2018 Guanbin Li, Xiang He, Wei Zhang, Huiyou Chang, Le Dong, Liang Lin

Single image rain streaks removal has recently witnessed substantial progress due to the development of deep convolutional neural networks.

Crowd Counting using Deep Recurrent Spatial-Aware Network

no code implementations 2 Jul 2018 Lingbo Liu, Hongjun Wang, Guanbin Li, Wanli Ouyang, Liang Lin

Crowd counting from unconstrained scene images is a crucial task in many real-world applications like urban surveillance and management, but it is greatly challenged by the camera's perspective that causes huge appearance variations in people's scales and rotations.

Crowd Counting Management

Interpretable Video Captioning via Trajectory Structured Localization

no code implementations CVPR 2018 Xian Wu, Guanbin Li, Qingxing Cao, Qingge Ji, Liang Lin

Automatically describing open-domain videos with natural language is attracting increasing interest in the field of artificial intelligence.

Image Captioning Video Captioning +1

Visual Question Reasoning on General Dependency Tree

no code implementations CVPR 2018 Qingxing Cao, Xiaodan Liang, Bailing Li, Guanbin Li, Liang Lin

This network comprises two collaborative modules: i) an adversarial attention module to exploit the local visual evidence for each word parsed from the question; ii) a residual composition module to compose the previously mined evidence.

Question Answering Visual Question Answering

Contrast-Oriented Deep Neural Networks for Salient Object Detection

no code implementations 30 Mar 2018 Guanbin Li, Yizhou Yu

In this paper, we develop hybrid contrast-oriented deep neural networks to overcome the aforementioned limitations.

object-detection RGB Salient Object Detection +1

Weakly Supervised Salient Object Detection Using Image Labels

no code implementations 17 Mar 2018 Guanbin Li, Yuan Xie, Liang Lin

Our algorithm is based on alternately exploiting a graphical model and training a fully convolutional network for model updating.

object-detection RGB Salient Object Detection +3

Context-Aware Semantic Inpainting

no code implementations 21 Dec 2017 Haofeng Li, Guanbin Li, Liang Lin, Yizhou Yu

Our proposed GAN-based framework consists of a fully convolutional design for the generator which helps to better preserve spatial structures and a joint loss function with a revised perceptual loss to capture high-level semantics in the context.

Image Inpainting

Recurrent Attentional Reinforcement Learning for Multi-label Image Recognition

no code implementations 20 Dec 2017 Tianshui Chen, Zhouxia Wang, Guanbin Li, Liang Lin

Recognizing multiple labels of images is a fundamental but challenging task in computer vision, and remarkable progress has been attained by localizing semantic-aware image regions and predicting their labels with deep convolutional neural networks.

reinforcement-learning reinforcement Learning

Multi-label Image Recognition by Recurrently Discovering Attentional Regions

no code implementations ICCV 2017 Zhouxia Wang, Tianshui Chen, Guanbin Li, Ruijia Xu, Liang Lin

This paper proposes a novel deep architecture to address multi-label image recognition, a fundamental and practical task towards general visual understanding.

General Classification Multi-Label Image Classification +1

Attention-Aware Face Hallucination via Deep Reinforcement Learning

no code implementations CVPR 2017 Qingxing Cao, Liang Lin, Yukai Shi, Xiaodan Liang, Guanbin Li

Face hallucination is a domain-specific super-resolution problem whose goal is to generate high-resolution (HR) faces from low-resolution (LR) input images.

Face Hallucination reinforcement-learning +2

Visual Saliency Detection Based on Multiscale Deep CNN Features

2 code implementations 7 Sep 2016 Guanbin Li, Yizhou Yu

The penultimate layer of our neural network has been confirmed to be a discriminative high-level feature vector for saliency detection, which we call deep contrast feature.

Saliency Detection

Deep Contrast Learning for Salient Object Detection

no code implementations CVPR 2016 Guanbin Li, Yizhou Yu

Our deep network consists of two complementary components, a pixel-level fully convolutional stream and a segment-wise spatial pooling stream.

object-detection RGB Salient Object Detection +1

Visual Saliency Based on Multiscale Deep Features

no code implementations CVPR 2015 Guanbin Li, Yizhou Yu

Visual saliency is a fundamental problem in both cognitive and computational sciences, including computer vision.

Image Segmentation Semantic Segmentation
