Search Results for author: Jianbing Shen

Found 94 papers, 58 papers with code

CLNet: A Compact Latent Network for Fast Adjusting Siamese Trackers

1 code implementation ECCV 2020 Xingping Dong, Jianbing Shen, Ling Shao, Fatih Porikli

To make full use of these sequence-specific samples, we propose a compact latent network to quickly adjust the tracking model to adapt to new scenes.

High-Precision Self-Supervised Monocular Depth Estimation with Rich-Resource Prior

no code implementations 1 Aug 2024 Wencheng Han, Jianbing Shen

In the area of self-supervised monocular depth estimation, models that utilize rich-resource inputs, such as high-resolution and multi-frame inputs, typically achieve better performance than models that use ordinary single image input.

Monocular Depth Estimation

RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception

1 code implementation 15 Jul 2024 ChunLiang Li, Wencheng Han, Junbo Yin, Sanyuan Zhao, Jianbing Shen

Concurrent processing of multiple autonomous driving 3D perception tasks within the same spatiotemporal scene poses a significant challenge, in particular due to the computational inefficiencies and feature competition between tasks when using traditional multi-task learning approaches.

3D Lane Detection 3D Object Detection +3

AdaOcc: Adaptive Forward View Transformation and Flow Modeling for 3D Occupancy and Flow Prediction

no code implementations 1 Jul 2024 Dubing Chen, Wencheng Han, Jin Fang, Jianbing Shen

In this technical report, we present our solution for the Vision-Centric 3D Occupancy and Flow Prediction track in the nuScenes Open-Occ Dataset Challenge at CVPR 2024.

Multi-threshold Deep Metric Learning for Facial Expression Recognition

no code implementations 24 Jun 2024 Wenwu Yang, Jinyi Yu, Tuo Chen, Zhenguang Liu, Xun Wang, Jianbing Shen

Each embedding slice corresponds to a sample threshold and is learned by enforcing the corresponding triplet loss, yielding a set of distinct expression features, one for each embedding slice.

Facial Expression Recognition Facial Expression Recognition (FER) +2

Is a 3D-Tokenized LLM the Key to Reliable Autonomous Driving?

no code implementations 28 May 2024 Yifan Bai, Dongming Wu, Yingfei Liu, Fan Jia, Weixin Mao, Ziheng Zhang, Yucheng Zhao, Jianbing Shen, Xing Wei, Tiancai Wang, Xiangyu Zhang

Despite its simplicity, Atlas demonstrates superior performance in both 3D detection and ego planning tasks on nuScenes dataset, proving that 3D-tokenized LLM is the key to reliable autonomous driving.

3D Object Detection Autonomous Driving +4

Visual In-Context Learning for Large Vision-Language Models

no code implementations 18 Feb 2024 Yucheng Zhou, Xiang Li, Qianning Wang, Jianbing Shen

In Large Visual Language Models (LVLMs), the efficacy of In-Context Learning (ICL) remains limited by challenges in cross-modal interactions and representation disparities.

In-Context Learning Position +2

DME-Driver: Integrating Human Decision Logic and 3D Scene Perception in Autonomous Driving

no code implementations 8 Jan 2024 Wencheng Han, Dongqian Guo, Cheng-Zhong Xu, Jianbing Shen

On the other hand, the generation of accurate control signals relies on precise and detailed environmental perception, which is where 3D scene perception models excel.

Autonomous Driving Language Modelling +1

Leveraging Frame Affinity for sRGB-to-RAW Video De-rendering

no code implementations CVPR 2024 Chen Zhang, Wencheng Han, Yang Zhou, Jianbing Shen, Cheng-Zhong Xu, Wentao Liu

These methods utilize both the metadata and the sRGB image to perform sRGB-to-RAW de-rendering and recover high-quality single-frame RAW data.

Image Reconstruction Video Editing +1

DI-V2X: Learning Domain-Invariant Representation for Vehicle-Infrastructure Collaborative 3D Object Detection

1 code implementation 25 Dec 2023 Li Xiang, Junbo Yin, Wei Li, Cheng-Zhong Xu, Ruigang Yang, Jianbing Shen

Specifically, DMA builds a domain-mixing 3D instance bank for the teacher and student models during training, resulting in aligned data representation.

3D Object Detection object-detection +1

Thread of Thought Unraveling Chaotic Contexts

no code implementations 15 Nov 2023 Yucheng Zhou, Xiubo Geng, Tao Shen, Chongyang Tao, Guodong Long, Jian-Guang Lou, Jianbing Shen

Large Language Models (LLMs) have ushered in a transformative era in the field of natural language processing, excelling in tasks related to text comprehension and generation.

Reading Comprehension

Decoupling the Curve Modeling and Pavement Regression for Lane Detection

no code implementations 19 Sep 2023 Wencheng Han, Jianbing Shen

The curve-based lane representation is a popular approach in many lane detection methods, as it allows for the representation of lanes as a whole object and maximizes the use of holistic information about the lanes.

3D Lane Detection regression

Language Prompt for Autonomous Driving

2 code implementations 8 Sep 2023 Dongming Wu, Wencheng Han, Tiancai Wang, Yingfei Liu, Xiangyu Zhang, Jianbing Shen

A new trend in the computer vision community is to capture objects of interest following flexible human command represented by a natural language prompt.

Autonomous Driving Object

TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision

no code implementations 6 Jun 2023 Yukun Zhai, Xiaoqiang Zhang, Xiameng Qin, Sanyuan Zhao, Xingping Dong, Jianbing Shen

End-to-end text spotting is a vital computer vision task that aims to integrate scene text detection and recognition into a unified framework.

Decoder Scene Text Detection +2

Referring Multi-Object Tracking

1 code implementation CVPR 2023 Dongming Wu, Wencheng Han, Tiancai Wang, Xingping Dong, Xiangyu Zhang, Jianbing Shen

In this paper, we propose a new and general referring understanding task, termed referring multi-object tracking (RMOT).

Object Referring Multi-Object Tracking

Generalized Few-Shot 3D Object Detection of LiDAR Point Cloud for Autonomous Driving

no code implementations 8 Feb 2023 Jiawei Liu, Xingping Dong, Sanyuan Zhao, Jianbing Shen

To achieve simultaneous detection for both common and rare objects, we propose a novel task, called generalized few-shot 3D object detection, where we have a large amount of training data for common (base) objects, but only a few data for rare (novel) classes.

3D Object Detection Autonomous Driving +1

Adaptive Siamese Tracking with a Compact Latent Network

no code implementations 2 Feb 2023 Xingping Dong, Jianbing Shen, Fatih Porikli, Jiebo Luo, Ling Shao

Under this viewing, we perform an in-depth analysis for them through visual simulations and real tracking examples, and find that the failure cases in some challenging situations can be regarded as the issue of missing decisive samples in offline training.

LWSIS: LiDAR-guided Weakly Supervised Instance Segmentation for Autonomous Driving

1 code implementation 7 Dec 2022 Xiang Li, Junbo Yin, Botian Shi, Yikang Li, Ruigang Yang, Jianbing Shen

In this paper, we present a more artful framework, LiDAR-guided Weakly Supervised Instance Segmentation (LWSIS), which leverages the off-the-shelf 3D data, i.e., Point Cloud, together with the 3D boxes, as natural weak supervisions for training the 2D image instance segmentation models.

Autonomous Driving Instance Segmentation +5

SSDA3D: Semi-supervised Domain Adaptation for 3D Object Detection from Point Cloud

1 code implementation 6 Dec 2022 Yan Wang, Junbo Yin, Wei Li, Pascal Frossard, Ruigang Yang, Jianbing Shen

However, these UDA solutions just yield unsatisfactory 3D detection results when there is a severe domain shift, e.g., from Waymo (64-beam) to nuScenes (32-beam).

3D Object Detection Autonomous Driving +5

Semi-supervised 3D Object Detection with Proficient Teachers

1 code implementation 26 Jul 2022 Junbo Yin, Jin Fang, Dingfu Zhou, Liangjun Zhang, Cheng-Zhong Xu, Jianbing Shen, Wenguan Wang

To reduce the dependence on large supervision, semi-supervised learning (SSL) based approaches have been proposed.

3D Object Detection Autonomous Driving +3

ProposalContrast: Unsupervised Pre-training for LiDAR-based 3D Object Detection

1 code implementation 26 Jul 2022 Junbo Yin, Dingfu Zhou, Liangjun Zhang, Jin Fang, Cheng-Zhong Xu, Jianbing Shen, Wenguan Wang

Existing approaches for unsupervised point cloud pre-training are constrained to either scene-level or point/voxel-level instance discrimination.

3D Object Detection object-detection +2

Counterfactual Cycle-Consistent Learning for Instruction Following and Generation in Vision-Language Navigation

1 code implementation CVPR 2022 Hanqing Wang, Wei Liang, Jianbing Shen, Luc van Gool, Wenguan Wang

Since the rise of vision-language navigation (VLN), great progress has been made in instruction following -- building a follower to navigate environments under the guidance of instructions.

counterfactual Data Augmentation +3

Tree Energy Loss: Towards Sparsely Annotated Semantic Segmentation

1 code implementation CVPR 2022 Zhiyuan Liang, Tiancai Wang, Xiangyu Zhang, Jian Sun, Jianbing Shen

The tree energy loss is effective and easy to be incorporated into existing frameworks by combining it with a traditional segmentation loss.

Segmentation Semantic Segmentation

Consistency and Diversity induced Human Motion Segmentation

no code implementations 10 Feb 2022 Tao Zhou, Huazhu Fu, Chen Gong, Ling Shao, Fatih Porikli, Haibin Ling, Jianbing Shen

Besides, a novel constraint based on the Hilbert Schmidt Independence Criterion (HSIC) is introduced to ensure the diversity of multi-level subspace representations, which enables the complementarity of multi-level representations to be explored to boost the transfer learning performance.

Diversity Motion Segmentation +2

A Graph Matching Perspective With Transformers on Video Instance Segmentation

no code implementations CVPR 2022 Zheyun Qin, Xiankai Lu, Xiushan Nie, Yilong Yin, Jianbing Shen

Video Instance Segmentation (VIS) needs to automatically track and segment multiple objects in videos that rely on modeling the spatial-temporal interactions of the instances.

Graph Matching Instance Segmentation +2

Multi-Level Representation Learning With Semantic Alignment for Referring Video Object Segmentation

no code implementations CVPR 2022 Dongming Wu, Xingping Dong, Ling Shao, Jianbing Shen

To address this, we propose a novel multi-level representation learning approach, which explores the inherent structure of the video content to provide a set of discriminative visual embedding, enabling more effective vision-language semantic alignment.

Object Referring Expression Segmentation +6

Full-Duplex Strategy for Video Object Segmentation

1 code implementation ICCV 2021 Ge-Peng Ji, Deng-Ping Fan, Keren Fu, Zhe Wu, Jianbing Shen, Ling Shao

Previous video object segmentation approaches mainly focus on using simplex solutions between appearance and motion, limiting feature collaboration efficiency among and across these two cues.

Object Salient Object Detection +6

Video Object Segmentation Using Global and Instance Embedding Learning

no code implementations CVPR 2021 Wenbin Ge, Xiankai Lu, Jianbing Shen

In this paper, we propose a feature embedding based video object segmentation (VOS) method which is simple, fast and effective.

Object Relation +4

Person Re-Identification by Context-aware Part Attention and Multi-Head Collaborative Learning

no code implementations IEEE Transactions on Information Forensics and Security 2021 Dongming Wu, Mang Ye, Gaojie Lin, Xin Gao, Jianbing Shen

In addition, we propose a novel multi-head collaborative training scheme to improve the performance, which is collaboratively supervised by multiple heads with the same structure but different parameters.

Video-Based Person Re-Identification

Face Forensics in the Wild

1 code implementation CVPR 2021 Tianfei Zhou, Wenguan Wang, Zhiyuan Liang, Jianbing Shen

On existing public benchmarks, face forgery detection techniques have achieved great success.

Multiple Instance Learning

Structured Scene Memory for Vision-Language Navigation

1 code implementation CVPR 2021 Hanqing Wang, Wenguan Wang, Wei Liang, Caiming Xiong, Jianbing Shen

Recently, numerous algorithms have been developed to tackle the problem of vision-language navigation (VLN), i.e., entailing an agent to navigate 3D environments through following linguistic instructions.

Decision Making Navigate +1

Learning to Fuse Asymmetric Feature Maps in Siamese Trackers

1 code implementation CVPR 2021 Wencheng Han, Xingping Dong, Fahad Shahbaz Khan, Ling Shao, Jianbing Shen

We propose a learnable module, called the asymmetric convolution module (ACM), which learns to better capture the semantic correlation information in offline training on large-scale data.

Visual Object Tracking Visual Tracking

Siamese Network for RGB-D Salient Object Detection and Beyond

2 code implementations 26 Aug 2020 Keren Fu, Deng-Ping Fan, Ge-Peng Ji, Qijun Zhao, Jianbing Shen, Ce Zhu

Inspired by the observation that RGB and depth modalities actually present certain commonality in distinguishing salient objects, a novel joint learning and densely cooperative fusion (JL-DCF) architecture is designed to learn from both RGB and depth inputs through a shared network backbone, known as the Siamese architecture.

object-detection RGB-D Salient Object Detection +2

RGB-D Salient Object Detection: A Survey

9 code implementations 1 Aug 2020 Tao Zhou, Deng-Ping Fan, Ming-Ming Cheng, Jianbing Shen, Ling Shao

Further, considering that the light field can also provide depth maps, we review SOD models and popular benchmark datasets from this domain as well.

Attribute Object +5

Weakly Supervised 3D Object Detection from Lidar Point Cloud

1 code implementation ECCV 2020 Qinghao Meng, Wenguan Wang, Tianfei Zhou, Jianbing Shen, Luc van Gool, Dengxin Dai

This work proposes a weakly supervised approach for 3D object detection, only requiring a small set of weakly annotated scenes, associated with a few precisely labeled object instances.

3D Object Detection Object +1

Dynamic Dual-Attentive Aggregation Learning for Visible-Infrared Person Re-Identification

5 code implementations ECCV 2020 Mang Ye, Jianbing Shen, David J. Crandall, Ling Shao, Jiebo Luo

In this paper, we propose a novel dynamic dual-attentive aggregation (DDAG) learning method by mining both intra-modality part-level and cross-modality graph-level contextual cues for VI-ReID.

Person Re-Identification Retrieval

Active Visual Information Gathering for Vision-Language Navigation

1 code implementation ECCV 2020 Hanqing Wang, Wenguan Wang, Tianmin Shu, Wei Liang, Jianbing Shen

Vision-language navigation (VLN) is the task of entailing an agent to carry out navigational instructions inside photo-realistic environments.

Vision-Language Navigation

Video Object Segmentation with Episodic Graph Memory Networks

1 code implementation ECCV 2020 Xiankai Lu, Wenguan Wang, Martin Danelljan, Tianfei Zhou, Jianbing Shen, Luc van Gool

How to make a segmentation model efficiently adapt to a specific video and to online target appearance variations are fundamentally crucial issues in the field of video object segmentation.

Object Segmentation +4

Re-thinking Co-Salient Object Detection

2 code implementations 7 Jul 2020 Deng-Ping Fan, Tengpeng Li, Zheng Lin, Ge-Peng Ji, Dingwen Zhang, Ming-Ming Cheng, Huazhu Fu, Jianbing Shen

CoSOD is an emerging and rapidly growing extension of salient object detection (SOD), which aims to detect the co-occurring salient objects in a group of images.

Benchmarking Co-Salient Object Detection +3

M2Net: Multi-modal Multi-channel Network for Overall Survival Time Prediction of Brain Tumor Patients

1 code implementation 1 Jun 2020 Tao Zhou, Huazhu Fu, Yu Zhang, Changqing Zhang, Xiankai Lu, Jianbing Shen, Ling Shao

Then, we use a modality-specific network to extract implicit and high-level features from different MR scans.

Modeling and Enhancing Low-quality Retinal Fundus Images

1 code implementation 12 May 2020 Ziyi Shen, Huazhu Fu, Jianbing Shen, Ling Shao

Retinal fundus images are widely used for the clinical screening and diagnosis of eye diseases.

Image Enhancement Medical Image Analysis +1

Self-Learning with Rectification Strategy for Human Parsing

no code implementations CVPR 2020 Tao Li, Zhiyuan Liang, Sanyuan Zhao, Jiahao Gong, Jianbing Shen

For the global error, we first transform category-wise features into a high-level graph model with coarse-grained structural information, and then decouple the high-level graph to reconstruct the category features.

Human Parsing Self-Learning

A Unified Object Motion and Affinity Model for Online Multi-Object Tracking

1 code implementation CVPR 2020 Junbo Yin, Wenguan Wang, Qinghao Meng, Ruigang Yang, Jianbing Shen

In this paper, we propose a novel MOT framework that unifies object motion and affinity model into a single network, named UMA, in order to learn a compact feature that is discriminative for both object motion and affinity measure.

Metric Learning Multi-Object Tracking +4

Hierarchical Human Parsing with Typed Part-Relation Reasoning

1 code implementation CVPR 2020 Wenguan Wang, Hailong Zhu, Jifeng Dai, Yanwei Pang, Jianbing Shen, Ling Shao

As human bodies are underlying hierarchically structured, how to model human structures is the central theme in this task.

Human Parsing Relation

Learning Video Object Segmentation from Unlabeled Videos

1 code implementation CVPR 2020 Xiankai Lu, Wenguan Wang, Jianbing Shen, Yu-Wing Tai, David Crandall, Steven C. H. Hoi

We propose a new method for video object segmentation (VOS) that addresses object pattern learning from unlabeled videos, unlike most existing methods which rely heavily on extensive annotated data.

Object Representation Learning +6

Cascaded Human-Object Interaction Recognition

1 code implementation CVPR 2020 Tianfei Zhou, Wenguan Wang, Siyuan Qi, Haibin Ling, Jianbing Shen

The interaction recognition network has two crucial parts: a relation ranking module for high-quality HOI proposal selection and a triple-stream classifier for relation prediction.

Human-Object Interaction Detection Object +1

Infinitely Wide Graph Convolutional Networks: Semi-supervised Learning via Gaussian Processes

no code implementations 26 Feb 2020 Jilin Hu, Jianbing Shen, Bin Yang, Ling Shao

Graph convolutional neural networks (GCNs) have recently demonstrated promising results on graph-based semi-supervised classification, but little work has been done to explore their theoretical properties.

Gaussian Processes General Classification

Hi-Net: Hybrid-fusion Network for Multi-modal MR Image Synthesis

2 code implementations 11 Feb 2020 Tao Zhou, Huazhu Fu, Geng Chen, Jianbing Shen, Ling Shao

Medical image synthesis has been proposed as an effective solution to this, where any missing modalities are synthesized from the existing ones.

Image Generation

Learning Compositional Neural Information Fusion for Human Parsing

1 code implementation ICCV 2019 Wenguan Wang, Zhijie Zhang, Siyuan Qi, Jianbing Shen, Yanwei Pang, Ling Shao

The bottom-up and top-down inferences explicitly model the compositional and decompositional relations in human bodies, respectively.

Human Parsing

Zero-Shot Video Object Segmentation via Attentive Graph Neural Networks

1 code implementation ICCV 2019 Wenguan Wang, Xiankai Lu, Jianbing Shen, David Crandall, Ling Shao

Through parametric message passing, AGNN is able to efficiently capture and mine much richer and higher-order relations between video frames, thus enabling a more complete understanding of video content and more accurate foreground estimation.

Graph Neural Network Segmentation +5

Human-Aware Motion Deblurring

1 code implementation ICCV 2019 Ziyi Shen, Wenguan Wang, Xiankai Lu, Jianbing Shen, Haibin Ling, Tingfa Xu, Ling Shao

This paper proposes a human-aware deblurring model that disentangles the motion blur between foreground (FG) humans and background (BG).

Deblurring Decoder +1

NETNet: Neighbor Erasing and Transferring Network for Better Single Shot Object Detection

no code implementations CVPR 2020 Yazhao Li, Yanwei Pang, Jianbing Shen, Jiale Cao, Ling Shao

With this observation, we propose a new Neighbor Erasing and Transferring (NET) mechanism to reconfigure the pyramid features and explore scale-aware features.

Object object-detection +1

Deep Learning for Person Re-identification: A Survey and Outlook

5 code implementations 13 Jan 2020 Mang Ye, Jianbing Shen, Gaojie Lin, Tao Xiang, Ling Shao, Steven C. H. Hoi

The widely studied closed-world setting is usually applied under various research-oriented assumptions, and has achieved inspiring success using deep learning techniques on a number of datasets.

Cross-Modal Person Re-Identification Metric Learning +3

Distilled Siamese Networks for Visual Tracking

no code implementations 24 Jul 2019 Jianbing Shen, Yuanpei Liu, Xingping Dong, Xiankai Lu, Fahad Shahbaz Khan, Steven Hoi

This model is intuitively inspired by the one teacher vs. multiple students learning method typically employed in schools.

Knowledge Distillation Object Tracking +1

Evaluation of Retinal Image Quality Assessment Networks in Different Color-spaces

4 code implementations 10 Jul 2019 Huazhu Fu, Boyang Wang, Jianbing Shen, Shanshan Cui, Yanwu Xu, Jiang Liu, Ling Shao

Retinal image quality assessment (RIQA) is essential for controlling the quality of retinal imaging and guaranteeing the reliability of diagnoses by ophthalmologists or automated analysis systems.

Image Quality Assessment

Understanding More about Human and Machine Attention in Deep Neural Networks

no code implementations 20 Jun 2019 Qiuxia Lai, Salman Khan, Yongwei Nie, Jianbing Shen, Hanqiu Sun, Ling Shao

With three example computer vision tasks, diverse representative backbones, and famous architectures, corresponding real human gaze data, and systematically conducted large-scale quantitative studies, we quantify the consistency between artificial attention and human visual attention and offer novel insights into existing artificial attention mechanisms by giving preliminary answers to several key questions related to human and artificial attention mechanisms.

Fine-Grained Image Classification Semantic Segmentation +1

Extreme Points Derived Confidence Map as a Cue For Class-Agnostic Segmentation Using Deep Neural Network

1 code implementation 6 Jun 2019 Shadab Khan, Ahmed H. Shahin, Javier Villafruela, Jianbing Shen, Ling Shao

To automate the process of segmenting an anatomy of interest, we can learn a model from previously annotated data.

Anatomy

Salient Object Detection in the Deep Learning Era: An In-Depth Survey

1 code implementation 19 Apr 2019 Wenguan Wang, Qiuxia Lai, Huazhu Fu, Jianbing Shen, Haibin Ling, Ruigang Yang

As an essential problem in computer vision, salient object detection (SOD) has attracted an increasing amount of research attention over the years.

Attribute Object +4

Adversarial Defense by Restricting the Hidden Space of Deep Neural Networks

1 code implementation ICCV 2019 Aamir Mustafa, Salman Khan, Munawar Hayat, Roland Goecke, Jianbing Shen, Ling Shao

Deep neural networks are vulnerable to adversarial attacks, which can fool them by adding minuscule perturbations to the input images.

Adversarial Defense

Striking the Right Balance with Uncertainty

no code implementations CVPR 2019 Salman Khan, Munawar Hayat, Waqas Zamir, Jianbing Shen, Ling Shao

Rare classes tend to get a concentrated representation in the classification space which hampers the generalization of learned boundaries to new test examples.

Attribute Classification +3

Image Super-Resolution as a Defense Against Adversarial Attacks

1 code implementation 7 Jan 2019 Aamir Mustafa, Salman H. Khan, Munawar Hayat, Jianbing Shen, Ling Shao

The proposed scheme is simple and has the following advantages: (1) it does not require any model training or parameter optimization, (2) it complements other existing defense mechanisms, (3) it is agnostic to the attacked model and attack type and (4) it provides superior performance across all popular attack algorithms.

Adversarial Defense Image Enhancement +2

Triplet Loss in Siamese Network for Object Tracking

no code implementations ECCV 2018 Xingping Dong, Jianbing Shen

In this paper, a novel triplet loss is proposed to extract expressive deep feature for object tracking by adding it into Siamese network framework instead of pairwise loss for training.

Object Object Tracking +1

Pyramid Dilated Deeper ConvLSTM for Video Salient Object Detection

1 code implementation ECCV 2018 Hongmei Song, Wenguan Wang, Sanyuan Zhao, Jianbing Shen, Kin-Man Lam

This paper proposes a fast video salient object detection model, based on a novel recurrent network architecture, named Pyramid Dilated Bidirectional ConvLSTM (PDB-ConvLSTM).

Ranked #1 on Video Salient Object Detection on UVSD (using extra training data)

Object object-detection +5

Learning Human-Object Interactions by Graph Parsing Neural Networks

1 code implementation ECCV 2018 Siyuan Qi, Wenguan Wang, Baoxiong Jia, Jianbing Shen, Song-Chun Zhu

For a given scene, GPNN infers a parse graph that includes i) the HOI graph structure represented by an adjacency matrix, and ii) the node labels.

Human-Object Interaction Detection Object

Attentive Fashion Grammar Network for Fashion Landmark Detection and Clothing Category Classification

1 code implementation CVPR 2018 Wenguan Wang, Yuanlu Xu, Jianbing Shen, Song-Chun Zhu

This paper proposes a knowledge-guided fashion network to solve the problem of visual fashion analysis, e.g., fashion landmark localization and clothing category classification.

General Classification

Salient Object Detection Driven by Fixation Prediction

1 code implementation CVPR 2018 Wenguan Wang, Jianbing Shen, Xingping Dong, Ali Borji

Salient object detection is then viewed as fine-grained object-level saliency segmentation and is progressively optimized with the guidance of the fixation map in a top-down manner.

Object object-detection +3

Revisiting Video Saliency: A Large-scale Benchmark and a New Model

1 code implementation CVPR 2018 Wenguan Wang, Jianbing Shen, Fang Guo, Ming-Ming Cheng, Ali Borji

Existing video saliency datasets lack variety and generality of common dynamic scenes and fall short in covering challenging situations in unconstrained environments.

Video Saliency Detection

Deep Cropping via Attention Box Prediction and Aesthetics Assessment

no code implementations ICCV 2017 Wenguan Wang, Jianbing Shen

We model the photo cropping problem as a cascade of attention box regression and aesthetic quality classification, based on deep learning.

Improved Face Detection and Alignment using Cascade Deep Convolutional Network

no code implementations 28 Jul 2017 Weilin Cong, Sanyuan Zhao, Hui Tian, Jianbing Shen

Real-world face detection and alignment demand an advanced discriminative model to address challenges by pose, lighting and expression.

Face Detection

Quadruplet Network with One-Shot Learning for Fast Visual Object Tracking

no code implementations 19 May 2017 Xingping Dong, Jianbing Shen, Dongming Wu, Kan Guo, Xiaogang Jin, Fatih Porikli

In this paper, we propose a new quadruplet deep network to examine the potential connections among the training instances, aiming to achieve a more powerful representation.

One-Shot Learning Triplet +1

Deep Visual Attention Prediction

1 code implementation journal 2017 Wenguan Wang, Jianbing Shen

Our model is based on a skip-layer network structure, which predicts human attention from multiple convolutional layers with various reception fields.

Saliency Prediction

Selective Video Object Cutout

no code implementations 28 Feb 2017 Wenguan Wang, Jianbing Shen, Fatih Porikli

Conventional video segmentation approaches rely heavily on appearance models.

Computational Efficiency Object +3

Super-Trajectory for Video Segmentation

no code implementations ICCV 2017 Wenguan Wang, Jianbing Shen, Jianwen Xie, Fatih Porikli

We introduce a novel semi-supervised video segmentation approach based on an efficient video representation, called as "super-trajectory".

Clustering Segmentation +2

Video Salient Object Detection via Fully Convolutional Networks

no code implementations 2 Feb 2017 Wenguan Wang, Jianbing Shen, Ling Shao

This paper proposes a deep learning model to efficiently detect salient regions in videos.

Data Augmentation Object +4

Linearization to Nonlinear Learning for Visual Tracking

no code implementations ICCV 2015 Bo Ma, Hongwei Hu, Jianbing Shen, Yuping Zhang, Fatih Porikli

Building on the theory of globally linear approximations to nonlinear functions, we introduce an elegant method that jointly learns a nonlinear classifier and a visual dictionary for tracking objects in a semi-supervised sparse coding fashion.

Descriptive Dictionary Learning +1

Saliency-Aware Geodesic Video Object Segmentation

1 code implementation CVPR 2015 Wenguan Wang, Jianbing Shen, Fatih Porikli

Building on the observation that foreground areas are surrounded by the regions with high spatiotemporal edge values, geodesic distance provides an initial estimation for foreground and background.

Ranked #5 on Video Salient Object Detection on DAVSOD-Difficult20 (using extra training data)

Object Segmentation +3
