1 code implementation • ECCV 2020 • Xingping Dong, Jianbing Shen, Ling Shao, Fatih Porikli
To make full use of these sequence-specific samples, we propose a compact latent network to quickly adjust the tracking model to adapt to new scenes.
no code implementations • 1 Aug 2024 • Wencheng Han, Jianbing Shen
In the area of self-supervised monocular depth estimation, models that utilize rich-resource inputs, such as high-resolution and multi-frame inputs, typically achieve better performance than models that use ordinary single image input.
1 code implementation • 15 Jul 2024 • ChunLiang Li, Wencheng Han, Junbo Yin, Sanyuan Zhao, Jianbing Shen
Concurrent processing of multiple autonomous driving 3D perception tasks within the same spatiotemporal scene poses a significant challenge, in particular due to the computational inefficiencies and feature competition between tasks when using traditional multi-task learning approaches.
Ranked #4 on 3D Lane Detection on OpenLane
no code implementations • 1 Jul 2024 • Dubing Chen, Wencheng Han, Jin Fang, Jianbing Shen
In this technical report, we present our solution for the Vision-Centric 3D Occupancy and Flow Prediction track in the nuScenes Open-Occ Dataset Challenge at CVPR 2024.
no code implementations • 24 Jun 2024 • Wenwu Yang, Jinyi Yu, Tuo Chen, Zhenguang Liu, Xun Wang, Jianbing Shen
Each embedding slice corresponds to a sample threshold and is learned by enforcing the corresponding triplet loss, yielding a set of distinct expression features, one for each embedding slice.
Facial Expression Recognition • Facial Expression Recognition (FER) +2
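A minimal sketch of the slicing idea in the entry above, assuming a PyTorch setup: one embedding is cut into K slices and each slice is supervised with its own triplet loss. The per-slice sample thresholds are approximated here by per-slice margins; all sizes, margins, and toy tensors are illustrative assumptions, not the authors' code.

```python
# Hedged sketch (not the authors' code): one embedding split into K slices,
# each slice trained with its own triplet loss so it learns a distinct
# expression feature.  Per-slice "sample thresholds" are approximated by
# per-slice margins; sizes and toy data are assumptions.
import torch
import torch.nn as nn

D, K_SLICES = 256, 4                       # embedding size and number of slices
slice_dim = D // K_SLICES
margins = [0.2, 0.4, 0.6, 0.8]             # one (assumed) margin per slice

losses = [nn.TripletMarginLoss(margin=m) for m in margins]

# Toy anchor / positive / negative embeddings from some backbone.
anchor, positive, negative = (torch.randn(8, D) for _ in range(3))

total = 0.0
for k in range(K_SLICES):
    s = slice(k * slice_dim, (k + 1) * slice_dim)
    total = total + losses[k](anchor[:, s], positive[:, s], negative[:, s])

print(float(total))
```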
no code implementations • 28 May 2024 • Yifan Bai, Dongming Wu, Yingfei Liu, Fan Jia, Weixin Mao, Ziheng Zhang, Yucheng Zhao, Jianbing Shen, Xing Wei, Tiancai Wang, Xiangyu Zhang
Despite its simplicity, Atlas demonstrates superior performance in both 3D detection and ego planning tasks on the nuScenes dataset, proving that 3D-tokenized LLM is the key to reliable autonomous driving.
1 code implementation • CVPR 2024 • Junbo Yin, Jianbing Shen, Runnan Chen, Wei Li, Ruigang Yang, Pascal Frossard, Wenguan Wang
HSF applies Point-to-Grid and Grid-to-Region transformers to capture the multimodal scene context at different granularities.
no code implementations • 18 Feb 2024 • Yucheng Zhou, Xiang Li, Qianning Wang, Jianbing Shen
In Large Visual Language Models (LVLMs), the efficacy of In-Context Learning (ICL) remains limited by challenges in cross-modal interactions and representation disparities.
no code implementations • 8 Jan 2024 • Wencheng Han, Dongqian Guo, Cheng-Zhong Xu, Jianbing Shen
On the other hand, the generation of accurate control signals relies on precise and detailed environmental perception, which is where 3D scene perception models excel.
no code implementations • CVPR 2024 • Chen Zhang, Wencheng Han, Yang Zhou, Jianbing Shen, Cheng-Zhong Xu, Wentao Liu
These methods utilize both the metadata and the sRGB image to perform sRGB-to-RAW de-rendering and recover high-quality single-frame RAW data.
1 code implementation • 25 Dec 2023 • Li Xiang, Junbo Yin, Wei Li, Cheng-Zhong Xu, Ruigang Yang, Jianbing Shen
Specifically, DMA builds a domain-mixing 3D instance bank for the teacher and student models during training, resulting in aligned data representation.
no code implementations • 15 Nov 2023 • Yucheng Zhou, Xiubo Geng, Tao Shen, Chongyang Tao, Guodong Long, Jian-Guang Lou, Jianbing Shen
Large Language Models (LLMs) have ushered in a transformative era in the field of natural language processing, excelling in tasks related to text comprehension and generation.
1 code implementation • 10 Oct 2023 • Dongming Wu, Jiahao Chang, Fan Jia, Yingfei Liu, Tiancai Wang, Jianbing Shen
Further, we propose TopoMLP, a simple yet high-performance pipeline for driving topology reasoning.
Ranked #3 on 3D Lane Detection on OpenLane-V2 val
no code implementations • 19 Sep 2023 • Wencheng Han, Jianbing Shen
The curve-based lane representation is a popular approach in many lane detection methods, as it allows for the representation of lanes as a whole object and maximizes the use of holistic information about the lanes.
2 code implementations • 8 Sep 2023 • Dongming Wu, Wencheng Han, Tiancai Wang, Yingfei Liu, Xiangyu Zhang, Jianbing Shen
A new trend in the computer vision community is to capture objects of interest following flexible human command represented by a natural language prompt.
1 code implementation • ICCV 2023 • Wencheng Han, Junbo Yin, Jianbing Shen
To bridge this gap, we propose a new Direction-aware Cumulative Convolution Network (DaCCN), which improves the depth feature representation in two aspects.
Monocular Depth Estimation • Unsupervised Monocular Depth Estimation
1 code implementation • ICCV 2023 • Dongming Wu, Tiancai Wang, Yuang Zhang, Xiangyu Zhang, Jianbing Shen
Referring video object segmentation (RVOS) aims at segmenting an object in a video following human instruction.
Referring Expression Segmentation • Referring Video Object Segmentation +2
no code implementations • 6 Jun 2023 • Yukun Zhai, Xiaoqiang Zhang, Xiameng Qin, Sanyuan Zhao, Xingping Dong, Jianbing Shen
End-to-end text spotting is a vital computer vision task that aims to integrate scene text detection and recognition into a unified framework.
no code implementations • 25 May 2023 • Wenhao Cheng, Junbo Yin, Wei Li, Ruigang Yang, Jianbing Shen
In this work, we propose a new multi-modal visual grounding task, termed LiDAR Grounding.
no code implementations • CVPR 2023 • Runzhou Tao, Wencheng Han, Zhongying Qiu, Cheng-Zhong Xu, Jianbing Shen
When used as a pre-training method, our model can significantly outperform the corresponding fully-supervised baseline with only 1/3 of the 3D labels.
1 code implementation • CVPR 2023 • Dongming Wu, Wencheng Han, Tiancai Wang, Xingping Dong, Xiangyu Zhang, Jianbing Shen
In this paper, we propose a new and general referring understanding task, termed referring multi-object tracking (RMOT).
no code implementations • 8 Feb 2023 • Jiawei Liu, Xingping Dong, Sanyuan Zhao, Jianbing Shen
To achieve simultaneous detection for both common and rare objects, we propose a novel task, called generalized few-shot 3D object detection, where we have a large amount of training data for common (base) objects, but only a few samples for rare (novel) classes.
no code implementations • 2 Feb 2023 • Xingping Dong, Jianbing Shen, Fatih Porikli, Jiebo Luo, Ling Shao
From this viewpoint, we perform an in-depth analysis of them through visual simulations and real tracking examples, and find that the failure cases in some challenging situations can be attributed to missing decisive samples in offline training.
1 code implementation • 7 Dec 2022 • Xiang Li, Junbo Yin, Botian Shi, Yikang Li, Ruigang Yang, Jianbing Shen
In this paper, we present a more artful framework, LiDAR-guided Weakly Supervised Instance Segmentation (LWSIS), which leverages off-the-shelf 3D data, i.e., point clouds together with 3D boxes, as natural weak supervision for training 2D image instance segmentation models.
1 code implementation • 6 Dec 2022 • Yan Wang, Junbo Yin, Wei Li, Pascal Frossard, Ruigang Yang, Jianbing Shen
However, these UDA solutions yield unsatisfactory 3D detection results when there is a severe domain shift, e.g., from Waymo (64-beam) to nuScenes (32-beam).
1 code implementation • 27 Sep 2022 • Xingping Dong, Jianbing Shen, Ling Shao
In this work, we prove that the core reason for this is the lack of a clustering-friendly property in the embedding space.
1 code implementation • 26 Jul 2022 • Junbo Yin, Jin Fang, Dingfu Zhou, Liangjun Zhang, Cheng-Zhong Xu, Jianbing Shen, Wenguan Wang
To reduce the dependence on large amounts of supervision, semi-supervised learning (SSL) based approaches have been proposed.
1 code implementation • 26 Jul 2022 • Junbo Yin, Dingfu Zhou, Liangjun Zhang, Jin Fang, Cheng-Zhong Xu, Jianbing Shen, Wenguan Wang
Existing approaches for unsupervised point cloud pre-training are constrained to either scene-level or point/voxel-level instance discrimination.
no code implementations • 26 Jul 2022 • Junbo Yin, Jianbing Shen, Xin Gao, David Crandall, Ruigang Yang
In this paper, we propose to detect 3D objects by exploiting temporal information in multiple frames, i.e., point cloud videos.
1 code implementation • CVPR 2022 • Hanqing Wang, Wei Liang, Jianbing Shen, Luc van Gool, Wenguan Wang
Since the rise of vision-language navigation (VLN), great progress has been made in instruction following -- building a follower to navigate environments under the guidance of instructions.
1 code implementation • CVPR 2022 • Zhiyuan Liang, Tiancai Wang, Xiangyu Zhang, Jian Sun, Jianbing Shen
The tree energy loss is effective and easy to incorporate into existing frameworks by combining it with a traditional segmentation loss.
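Since the claim above is about ease of integration, the sketch below only shows the combination step: a conventional cross-entropy term plus a weighted unsupervised term. The `tree_energy_loss` function here is a crude pairwise-smoothness stand-in, not the paper's tree filter; the weight `lam` and the toy tensors are assumptions.

```python
# Hedged sketch (assumptions throughout): the point is that the unsupervised
# term can be dropped into an existing pipeline by adding it to a standard
# segmentation loss.  `tree_energy_loss` is NOT the authors' implementation;
# it is a stand-in pairwise-smoothness term used only to show the combination.
import torch
import torch.nn.functional as F

def tree_energy_loss(logits, image):
    # Stand-in unsupervised term: encourage predictions of neighbouring pixels
    # with similar colours to agree (a crude proxy for the real tree filter).
    prob = torch.softmax(logits, dim=1)
    w_x = torch.exp(-(image[..., :, 1:] - image[..., :, :-1]).abs().mean(1, keepdim=True))
    w_y = torch.exp(-(image[..., 1:, :] - image[..., :-1, :]).abs().mean(1, keepdim=True))
    tv_x = (prob[..., :, 1:] - prob[..., :, :-1]).abs() * w_x
    tv_y = (prob[..., 1:, :] - prob[..., :-1, :]).abs() * w_y
    return tv_x.mean() + tv_y.mean()

logits = torch.randn(2, 21, 64, 64, requires_grad=True)   # toy network output
image  = torch.rand(2, 3, 64, 64)                          # toy input image
labels = torch.randint(0, 21, (2, 64, 64))                 # sparse/partial labels

lam = 0.4                                                  # assumed weight
loss = F.cross_entropy(logits, labels) + lam * tree_energy_loss(logits, image)
loss.backward()
```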
no code implementations • 10 Feb 2022 • Tao Zhou, Huazhu Fu, Chen Gong, Ling Shao, Fatih Porikli, Haibin Ling, Jianbing Shen
Besides, a novel constraint based on the Hilbert-Schmidt Independence Criterion (HSIC) is introduced to ensure the diversity of multi-level subspace representations, which enables the complementarity of multi-level representations to be exploited to boost transfer learning performance.
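For reference, a small sketch of how an HSIC-based diversity penalty between representation levels can be computed; the linear kernels and the pairwise-sum penalty are illustrative assumptions rather than the paper's exact formulation.

```python
# Hedged sketch: a biased empirical HSIC estimate used as a diversity penalty
# between levels of subspace representations.  Linear kernels and the
# pairwise-sum penalty are assumptions for illustration.
import numpy as np

def hsic(X, Y):
    """Biased empirical HSIC with linear kernels; X, Y are (n, d) arrays."""
    n = X.shape[0]
    K, L = X @ X.T, Y @ Y.T
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

# Three levels of representations for the same n samples (toy data).
reps = [np.random.randn(32, 16) for _ in range(3)]

# Penalising pairwise HSIC pushes the levels towards mutual independence,
# i.e. towards carrying complementary information.
diversity_penalty = sum(hsic(reps[i], reps[j])
                        for i in range(3) for j in range(i + 1, 3))
print(diversity_penalty)
```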
no code implementations • CVPR 2022 • Zheyun Qin, Xiankai Lu, Xiushan Nie, Yilong Yin, Jianbing Shen
Video Instance Segmentation (VIS) requires automatically tracking and segmenting multiple objects in videos, which relies on modeling the spatial-temporal interactions of the instances.
no code implementations • CVPR 2022 • Dongming Wu, Xingping Dong, Ling Shao, Jianbing Shen
To address this, we propose a novel multi-level representation learning approach, which explores the inherent structure of the video content to provide a set of discriminative visual embeddings, enabling more effective vision-language semantic alignment.
1 code implementation • 14 Dec 2021 • JianJian Cao, Xiameng Qin, Sanyuan Zhao, Jianbing Shen
In this paper, we focus on these two problems and propose a Graph Matching Attention (GMA) network.
1 code implementation • ICCV 2021 • Ge-Peng Ji, Deng-Ping Fan, Keren Fu, Zhe Wu, Jianbing Shen, Ling Shao
Previous video object segmentation approaches mainly focus on using simplex solutions between appearance and motion, limiting feature collaboration efficiency among and across these two cues.
Ranked #9 on Video Polyp Segmentation on SUN-SEG-Hard (Unseen)
no code implementations • CVPR 2021 • Wenbin Ge, Xiankai Lu, Jianbing Shen
In this paper, we propose a feature embedding based video object segmentation (VOS) method which is simple, fast and effective.
no code implementations • IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY 2021 • Dongming Wu, Mang Ye, Gaojie Lin, Xin Gao, Jianbing Shen
In addition, we propose a novel multi-head collaborative training scheme to improve the performance, which is collaboratively supervised by multiple heads with the same structure but different parameters.
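A hedged sketch of the multi-head idea above: heads with identical structure but independent parameters are trained on the same feature, here with per-head cross-entropy plus a mutual-distillation term as one possible instantiation of the collaborative supervision. Layer sizes, the KL term, and its weight are assumptions, not the paper's exact loss.

```python
# Hedged sketch: several heads with identical structure but independent
# parameters, supervised by the same labels and additionally teaching each
# other via mutual distillation (an assumed instantiation of "collaborative").
import torch
import torch.nn as nn
import torch.nn.functional as F

num_heads, feat_dim, num_ids = 3, 512, 100
heads = nn.ModuleList(nn.Linear(feat_dim, num_ids) for _ in range(num_heads))

feat   = torch.randn(16, feat_dim)          # toy backbone features
labels = torch.randint(0, num_ids, (16,))   # toy identity labels

logits = [h(feat) for h in heads]

# Each head is supervised by the same labels...
ce = sum(F.cross_entropy(l, labels) for l in logits) / num_heads

# ...and heads additionally learn from each other's predictions.
kl = 0.0
for i in range(num_heads):
    for j in range(num_heads):
        if i != j:
            kl += F.kl_div(F.log_softmax(logits[i], dim=1),
                           F.softmax(logits[j], dim=1).detach(),
                           reduction="batchmean")
loss = ce + 0.1 * kl / (num_heads * (num_heads - 1))
loss.backward()
```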
1 code implementation • CVPR 2021 • Tianfei Zhou, Wenguan Wang, Zhiyuan Liang, Jianbing Shen
On existing public benchmarks, face forgery detection techniques have achieved great success.
1 code implementation • CVPR 2021 • Hanqing Wang, Wenguan Wang, Wei Liang, Caiming Xiong, Jianbing Shen
Recently, numerous algorithms have been developed to tackle the problem of vision-language navigation (VLN), i.e., requiring an agent to navigate 3D environments by following linguistic instructions.
no code implementations • ICCV 2021 • Xin Hao, Sanyuan Zhao, Mang Ye, Jianbing Shen
Cross-modality person re-identification is a challenging task due to large cross-modality discrepancy and intra-modality variations.
1 code implementation • CVPR 2021 • Wencheng Han, Xingping Dong, Fahad Shahbaz Khan, Ling Shao, Jianbing Shen
We propose a learnable module, called the asymmetric convolution (ACM), which learns to better capture the semantic correlation information in offline training on large-scale data.
Ranked #24 on Visual Object Tracking on TrackingNet
2 code implementations • 26 Aug 2020 • Keren Fu, Deng-Ping Fan, Ge-Peng Ji, Qijun Zhao, Jianbing Shen, Ce Zhu
Inspired by the observation that RGB and depth modalities actually present certain commonality in distinguishing salient objects, a novel joint learning and densely cooperative fusion (JL-DCF) architecture is designed to learn from both RGB and depth inputs through a shared network backbone, known as the Siamese architecture.
Ranked #3 on RGB-D Salient Object Detection on STERE
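A toy sketch of the Siamese (shared-backbone) aspect mentioned in the JL-DCF entry above: the same weights process the RGB image and the channel-replicated depth map before fusion. The tiny backbone and the 1x1-conv fusion are placeholders, not the paper's architecture.

```python
# Hedged sketch: the SAME backbone (shared weights) processes both the RGB
# image and the depth map replicated to three channels; the two feature maps
# are then fused.  Backbone and fusion layer are toy placeholders.
import torch
import torch.nn as nn

backbone = nn.Sequential(               # stand-in for a real shared backbone
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
)
fuse = nn.Conv2d(128, 1, 1)             # placeholder cooperative fusion

rgb   = torch.rand(1, 3, 224, 224)
depth = torch.rand(1, 1, 224, 224).repeat(1, 3, 1, 1)   # depth as 3 channels

f_rgb, f_depth = backbone(rgb), backbone(depth)          # shared weights
saliency = torch.sigmoid(fuse(torch.cat([f_rgb, f_depth], dim=1)))
print(saliency.shape)   # coarse saliency map
```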
9 code implementations • 1 Aug 2020 • Tao Zhou, Deng-Ping Fan, Ming-Ming Cheng, Jianbing Shen, Ling Shao
Further, considering that the light field can also provide depth maps, we review SOD models and popular benchmark datasets from this domain as well.
1 code implementation • ECCV 2020 • Qinghao Meng, Wenguan Wang, Tianfei Zhou, Jianbing Shen, Luc van Gool, Dengxin Dai
This work proposes a weakly supervised approach for 3D object detection, only requiring a small set of weakly annotated scenes, associated with a few precisely labeled object instances.
5 code implementations • ECCV 2020 • Mang Ye, Jianbing Shen, David J. Crandall, Ling Shao, Jiebo Luo
In this paper, we propose a novel dynamic dual-attentive aggregation (DDAG) learning method by mining both intra-modality part-level and cross-modality graph-level contextual cues for VI-ReID.
1 code implementation • ECCV 2020 • Hanqing Wang, Wenguan Wang, Tianmin Shu, Wei Liang, Jianbing Shen
Vision-language navigation (VLN) is the task in which an agent carries out navigational instructions inside photo-realistic environments.
1 code implementation • ECCV 2020 • Xiankai Lu, Wenguan Wang, Martin Danelljan, Tianfei Zhou, Jianbing Shen, Luc van Gool
How to make a segmentation model efficiently adapt to a specific video, and to online variations in target appearance, is a fundamentally crucial issue in the field of video object segmentation.
2 code implementations • 7 Jul 2020 • Deng-Ping Fan, Tengpeng Li, Zheng Lin, Ge-Peng Ji, Dingwen Zhang, Ming-Ming Cheng, Huazhu Fu, Jianbing Shen
CoSOD is an emerging and rapidly growing extension of salient object detection (SOD), which aims to detect the co-occurring salient objects in a group of images.
Ranked #7 on Co-Salient Object Detection on CoCA
4 code implementations • 13 Jun 2020 • Deng-Ping Fan, Ge-Peng Ji, Tao Zhou, Geng Chen, Huazhu Fu, Jianbing Shen, Ling Shao
To address these challenges, we propose a parallel reverse attention network (PraNet) for accurate polyp segmentation in colonoscopy images.
Ranked #3 on Camouflaged Object Segmentation on PCOD_1200
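A small sketch of one reverse-attention refinement step in the spirit of the PraNet entry above: the complement of the current prediction focuses the features on regions not yet explained, and a residual correction is predicted there. Layer sizes and the single refinement step are assumptions, not the paper's configuration.

```python
# Hedged sketch of a reverse-attention refinement step: attend to regions the
# coarse prediction has NOT yet covered and predict a residual correction.
import torch
import torch.nn as nn
import torch.nn.functional as F

refine = nn.Sequential(nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
                       nn.Conv2d(64, 1, 3, padding=1))

feat        = torch.randn(1, 64, 44, 44)     # side feature from the encoder
coarse_pred = torch.randn(1, 1, 11, 11)      # coarse segmentation logits

up = F.interpolate(coarse_pred, size=feat.shape[-2:], mode="bilinear",
                   align_corners=False)
reverse_att = 1.0 - torch.sigmoid(up)        # highlight unexplained regions
residual = refine(feat * reverse_att)        # predict a correction there
refined_pred = up + residual                 # refined logits at this scale
print(refined_pred.shape)
```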
1 code implementation • 1 Jun 2020 • Tao Zhou, Huazhu Fu, Yu Zhang, Changqing Zhang, Xiankai Lu, Jianbing Shen, Ling Shao
Then, we use a modality-specific network to extract implicit and high-level features from different MR scans.
1 code implementation • 12 May 2020 • Ziyi Shen, Huazhu Fu, Jianbing Shen, Ling Shao
Retinal fundus images are widely used for the clinical screening and diagnosis of eye diseases.
3 code implementations • 22 Apr 2020 • Deng-Ping Fan, Tao Zhou, Ge-Peng Ji, Yi Zhou, Geng Chen, Huazhu Fu, Jianbing Shen, Ling Shao
Coronavirus Disease 2019 (COVID-19) spread globally in early 2020, causing the world to face an existential health crisis.
no code implementations • CVPR 2020 • Tao Li, Zhiyuan Liang, Sanyuan Zhao, Jiahao Gong, Jianbing Shen
For the global error, we first transform category-wise features into a high-level graph model with coarse-grained structural information, and then decouple the high-level graph to reconstruct the category features.
1 code implementation • CVPR 2020 • Junbo Yin, Jianbing Shen, Chenye Guan, Dingfu Zhou, Ruigang Yang
In this paper, we propose an end-to-end online 3D video object detector that operates on point cloud sequences.
1 code implementation • CVPR 2020 • Junbo Yin, Wenguan Wang, Qinghao Meng, Ruigang Yang, Jianbing Shen
In this paper, we propose a novel MOT framework that unifies object motion and affinity model into a single network, named UMA, in order to learn a compact feature that is discriminative for both object motion and affinity measure.
1 code implementation • CVPR 2020 • Wenguan Wang, Hailong Zhu, Jifeng Dai, Yanwei Pang, Jianbing Shen, Ling Shao
As human bodies are inherently hierarchically structured, how to model human structures is the central theme of this task.
1 code implementation • CVPR 2020 • Xiankai Lu, Wenguan Wang, Jianbing Shen, Yu-Wing Tai, David Crandall, Steven C. H. Hoi
We propose a new method for video object segmentation (VOS) that addresses object pattern learning from unlabeled videos, unlike most existing methods which rely heavily on extensive annotated data.
1 code implementation • CVPR 2020 • Tianfei Zhou, Wenguan Wang, Siyuan Qi, Haibin Ling, Jianbing Shen
The interaction recognition network has two crucial parts: a relation ranking module for high-quality HOI proposal selection and a triple-stream classifier for relation prediction.
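A minimal sketch of a triple-stream relation classifier as described above: human, object, and union/spatial streams each produce relation logits that are fused, here by summation. Feature sizes and the fusion rule are assumptions for illustration.

```python
# Hedged sketch: three streams score the human appearance, the object
# appearance, and their union/spatial context; per-stream logits are summed.
import torch
import torch.nn as nn

num_relations, dim = 26, 1024
human_stream  = nn.Linear(dim, num_relations)
object_stream = nn.Linear(dim, num_relations)
union_stream  = nn.Linear(dim, num_relations)

# Toy pooled features for one human-object proposal pair.
f_h, f_o, f_u = (torch.randn(1, dim) for _ in range(3))

logits = human_stream(f_h) + object_stream(f_o) + union_stream(f_u)
relation_scores = torch.sigmoid(logits)     # multi-label interaction scores
print(relation_scores.shape)
```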
no code implementations • 26 Feb 2020 • Jilin Hu, Jianbing Shen, Bin Yang, Ling Shao
Graph convolutional neural networks (GCNs) have recently demonstrated promising results on graph-based semi-supervised classification, but little work has been done to explore their theoretical properties.
2 code implementations • 11 Feb 2020 • Tao Zhou, Huazhu Fu, Geng Chen, Jianbing Shen, Ling Shao
Medical image synthesis has been proposed as an effective solution to this, where any missing modalities are synthesized from the existing ones.
1 code implementation • ICCV 2019 • Wenguan Wang, Zhijie Zhang, Siyuan Qi, Jianbing Shen, Yanwei Pang, Ling Shao
The bottom-up and top-down inferences explicitly model the compositional and decompositional relations in human bodies, respectively.
1 code implementation • CVPR 2019 • Xiankai Lu, Wenguan Wang, Chao Ma, Jianbing Shen, Ling Shao, Fatih Porikli
We introduce a novel network, called CO-attention Siamese Network (COSNet), to address the unsupervised video object segmentation task from a holistic view.
Ranked #11 on Video Polyp Segmentation on SUN-SEG-Hard (Unseen)
Semantic Segmentation • Unsupervised Video Object Segmentation +2
1 code implementation • ICCV 2019 • Wenguan Wang, Xiankai Lu, Jianbing Shen, David Crandall, Ling Shao
Through parametric message passing, AGNN is able to efficiently capture and mine much richer and higher-order relations between video frames, thus enabling a more complete understanding of video content and more accurate foreground estimation.
1 code implementation • ICCV 2019 • Ziyi Shen, Wenguan Wang, Xiankai Lu, Jianbing Shen, Haibin Ling, Tingfa Xu, Ling Shao
This paper proposes a human-aware deblurring model that disentangles the motion blur between foreground (FG) humans and background (BG).
no code implementations • CVPR 2020 • Yazhao Li, Yanwei Pang, Jianbing Shen, Jiale Cao, Ling Shao
With this observation, we propose a new Neighbor Erasing and Transferring (NET) mechanism to reconfigure the pyramid features and explore scale-aware features.
5 code implementations • 13 Jan 2020 • Mang Ye, Jianbing Shen, Gaojie Lin, Tao Xiang, Ling Shao, Steven C. H. Hoi
The widely studied closed-world setting is usually applied under various research-oriented assumptions, and has achieved inspiring success using deep learning techniques on a number of datasets.
Ranked #1 on Cross-Modal Person Re-Identification on RegDB-C
1 code implementation • 25 Jul 2019 • Zhijie Zhang, Huazhu Fu, Hang Dai, Jianbing Shen, Yanwei Pang, Ling Shao
Segmentation is a fundamental task in medical image analysis.
Ranked #1 on Optic Disc Segmentation on REFUGE
no code implementations • 24 Jul 2019 • Jianbing Shen, Yuanpei Liu, Xingping Dong, Xiankai Lu, Fahad Shahbaz Khan, Steven Hoi
This model is intuitively inspired by the one teacher vs. multiple students learning method typically employed in schools.
4 code implementations • 10 Jul 2019 • Huazhu Fu, Boyang Wang, Jianbing Shen, Shanshan Cui, Yanwu Xu, Jiang Liu, Ling Shao
Retinal image quality assessment (RIQA) is essential for controlling the quality of retinal imaging and guaranteeing the reliability of diagnoses by ophthalmologists or automated analysis systems.
no code implementations • 20 Jun 2019 • Qiuxia Lai, Salman Khan, Yongwei Nie, Jianbing Shen, Hanqiu Sun, Ling Shao
Using three example computer vision tasks, diverse representative backbones and well-known architectures, corresponding real human gaze data, and systematically conducted large-scale quantitative studies, we quantify the consistency between artificial attention and human visual attention, and offer novel insights into existing artificial attention mechanisms by giving preliminary answers to several key questions about human and artificial attention.
1 code implementation • 6 Jun 2019 • Shadab Khan, Ahmed H. Shahin, Javier Villafruela, Jianbing Shen, Ling Shao
To automate the process of segmenting an anatomy of interest, we can learn a model from previously annotated data.
1 code implementation • 19 Apr 2019 • Wenguan Wang, Qiuxia Lai, Huazhu Fu, Jianbing Shen, Haibin Ling, Ruigang Yang
As an essential problem in computer vision, salient object detection (SOD) has attracted an increasing amount of research attention over the years.
1 code implementation • ICCV 2019 • Aamir Mustafa, Salman Khan, Munawar Hayat, Roland Goecke, Jianbing Shen, Ling Shao
Deep neural networks are vulnerable to adversarial attacks, which can fool them by adding minuscule perturbations to the input images.
Ranked #7 on Adversarial Defense on CIFAR-10
1 code implementation • 23 Jan 2019 • Munawar Hayat, Salman Khan, Waqas Zamir, Jianbing Shen, Ling Shao
Real-world object classes appear in imbalanced ratios.
no code implementations • CVPR 2019 • Salman Khan, Munawar Hayat, Waqas Zamir, Jianbing Shen, Ling Shao
Rare classes tend to get a concentrated representation in the classification space which hampers the generalization of learned boundaries to new test examples.
1 code implementation • 7 Jan 2019 • Aamir Mustafa, Salman H. Khan, Munawar Hayat, Jianbing Shen, Ling Shao
The proposed scheme is simple and has the following advantages: (1) it does not require any model training or parameter optimization, (2) it complements other existing defense mechanisms, (3) it is agnostic to the attacked model and attack type and (4) it provides superior performance across all popular attack algorithms.
no code implementations • ECCV 2018 • Xingping Dong, Jianbing Shen
In this paper, a novel triplet loss is proposed to extract expressive deep features for object tracking, by adding it to the Siamese network framework in place of the pairwise loss used for training.
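A hedged sketch of the substitution described above: instead of a pairwise loss over (exemplar, candidate) pairs, the exemplar embedding acts as the anchor of a triplet with positive and negative candidate embeddings. The toy embeddings and the margin are assumptions.

```python
# Hedged sketch: the exemplar embedding serves as the triplet anchor, with
# positive and negative candidate-patch embeddings, replacing a pairwise loss.
import torch
import torch.nn as nn

triplet = nn.TripletMarginLoss(margin=0.5)

exemplar_emb = torch.randn(8, 128)   # anchor: features of the target exemplar
pos_emb      = torch.randn(8, 128)   # candidate patches containing the target
neg_emb      = torch.randn(8, 128)   # background candidate patches

loss = triplet(exemplar_emb, pos_emb, neg_emb)
print(float(loss))
```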
1 code implementation • ECCV 2018 • Hongmei Song, Wenguan Wang, Sanyuan Zhao, Jianbing Shen, Kin-Man Lam
This paper proposes a fast video salient object detection model, based on a novel recurrent network architecture, named Pyramid Dilated Bidirectional ConvLSTM (PDB-ConvLSTM).
Ranked #1 on Video Salient Object Detection on UVSD (using extra training data)
1 code implementation • ECCV 2018 • Siyuan Qi, Wenguan Wang, Baoxiong Jia, Jianbing Shen, Song-Chun Zhu
For a given scene, GPNN infers a parse graph that includes i) the HOI graph structure represented by an adjacency matrix, and ii) the node labels.
Ranked #32 on Human-Object Interaction Detection on V-COCO
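A hedged sketch of the two parse-graph outputs named in the GPNN entry above, produced here by one simple message-passing round: a soft adjacency matrix over the nodes and per-node labels. The tiny MLPs and the single iteration stand in for, and are not, GPNN's actual architecture.

```python
# Hedged sketch: one message-passing round producing i) a soft adjacency
# matrix over human/object nodes and ii) per-node labels.
import torch
import torch.nn as nn

n_nodes, dim, n_labels = 5, 64, 10
node_feats = torch.randn(n_nodes, dim)            # toy human/object features

edge_mlp   = nn.Sequential(nn.Linear(2 * dim, 32), nn.ReLU(), nn.Linear(32, 1))
node_mlp   = nn.Linear(2 * dim, dim)
label_head = nn.Linear(dim, n_labels)

# i) soft adjacency: score every ordered node pair, then squash to (0, 1).
pairs = torch.cat([node_feats.unsqueeze(1).expand(-1, n_nodes, -1),
                   node_feats.unsqueeze(0).expand(n_nodes, -1, -1)], dim=-1)
adjacency = torch.sigmoid(edge_mlp(pairs)).squeeze(-1)      # (n_nodes, n_nodes)

# One message-passing step: aggregate neighbours weighted by the adjacency.
messages = adjacency @ node_feats
updated  = torch.relu(node_mlp(torch.cat([node_feats, messages], dim=-1)))

# ii) node labels from the updated node states.
node_labels = label_head(updated).argmax(dim=-1)
print(adjacency.shape, node_labels)
```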
1 code implementation • CVPR 2018 • Wenguan Wang, Yuanlu Xu, Jianbing Shen, Song-Chun Zhu
This paper proposes a knowledge-guided fashion network to solve the problem of visual fashion analysis, e.g., fashion landmark localization and clothing category classification.
1 code implementation • CVPR 2018 • Wenguan Wang, Jianbing Shen, Xingping Dong, Ali Borji
Salient object detection is then viewed as fine-grained object-level saliency segmentation and is progressively optimized with the guidance of the fixation map in a top-down manner.
no code implementations • CVPR 2018 • Xingping Dong, Jianbing Shen, Wenguan Wang, Yu Liu, Ling Shao, Fatih Porikli
Hyperparameters are numerical presets whose values are assigned prior to the commencement of the learning process.
1 code implementation • CVPR 2018 • Wenguan Wang, Jianbing Shen, Fang Guo, Ming-Ming Cheng, Ali Borji
Existing video saliency datasets lack variety and generality of common dynamic scenes and fall short in covering challenging situations in unconstrained environments.
no code implementations • ICCV 2017 • Wenguan Wang, Jianbing Shen
We model the photo cropping problem as a cascade of attention box regression and aesthetic quality classification, based on deep learning.
no code implementations • 28 Jul 2017 • Weilin Cong, Sanyuan Zhao, Hui Tian, Jianbing Shen
Real-world face detection and alignment demand an advanced discriminative model to address challenges posed by pose, lighting and expression.
no code implementations • 19 May 2017 • Xingping Dong, Jianbing Shen, Dongming Wu, Kan Guo, Xiaogang Jin, Fatih Porikli
In this paper, we propose a new quadruplet deep network to examine the potential connections among the training instances, aiming to achieve a more powerful representation.
1 code implementation • journal 2017 • Wenguan Wang, Jianbing Shen
Our model is based on a skip-layer network structure, which predicts human attention from multiple convolutional layers with various receptive fields.
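A toy sketch of the skip-layer readout described above: predictions from several convolutional stages with different receptive fields are mapped to single-channel maps, upsampled, and summed. The three-stage toy backbone and the sum fusion are assumptions.

```python
# Hedged sketch: per-stage side predictions from layers with different
# receptive fields are upsampled to a common resolution and fused by summation.
import torch
import torch.nn as nn
import torch.nn.functional as F

stages = nn.ModuleList([
    nn.Conv2d(3,  16, 3, stride=2, padding=1),   # shallow: small receptive field
    nn.Conv2d(16, 32, 3, stride=2, padding=1),
    nn.Conv2d(32, 64, 3, stride=2, padding=1),   # deep: large receptive field
])
readouts = nn.ModuleList([nn.Conv2d(c, 1, 1) for c in (16, 32, 64)])

x = torch.rand(1, 3, 224, 224)
maps = []
for stage, readout in zip(stages, readouts):
    x = torch.relu(stage(x))
    side = readout(x)                                    # per-stage prediction
    maps.append(F.interpolate(side, size=(224, 224),
                              mode="bilinear", align_corners=False))

attention = torch.sigmoid(sum(maps))                     # fused attention map
print(attention.shape)
```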
no code implementations • 28 Feb 2017 • Wenguan Wang, Jianbing Shen, Fatih Porikli
Conventional video segmentation approaches rely heavily on appearance models.
no code implementations • ICCV 2017 • Wenguan Wang, Jianbing Shen, Jianwen Xie, Fatih Porikli
We introduce a novel semi-supervised video segmentation approach based on an efficient video representation, called "super-trajectory".
no code implementations • 2 Feb 2017 • Wenguan Wang, Jianbing Shen, Ling Shao
This paper proposes a deep learning model to efficiently detect salient regions in videos.
no code implementations • ICCV 2015 • Bo Ma, Hongwei Hu, Jianbing Shen, Yuping Zhang, Fatih Porikli
Building on the theory of globally linear approximations to nonlinear functions, we introduce an elegant method that jointly learns a nonlinear classifier and a visual dictionary for tracking objects in a semi-supervised sparse coding fashion.
1 code implementation • CVPR 2015 • Wenguan Wang, Jianbing Shen, Fatih Porikli
Building on the observation that foreground areas are surrounded by the regions with high spatiotemporal edge values, geodesic distance provides an initial estimation for foreground and background.
Ranked #5 on Video Salient Object Detection on DAVSOD-Difficult20 (using extra training data)
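A hedged sketch of the geodesic prior described in the entry above: on a grid graph weighted by a (spatio-)temporal edge map, pixels enclosed by strong edges end up geodesically far from the image boundary and are thus likely foreground. The toy edge map, the 4-connectivity, and the small epsilon added to the weights are assumptions, not the paper's exact construction.

```python
# Hedged sketch: geodesic distance from the image boundary on an edge-weighted
# grid graph; regions enclosed by strong edges are far from the boundary and
# thus likely foreground.
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import dijkstra

H, W = 64, 64
edge_map = np.zeros((H, W))
edge_map[16:48, 16] = edge_map[16:48, 47] = 1.0    # a toy closed contour
edge_map[16, 16:48] = edge_map[47, 16:48] = 1.0

def grid_graph(edges):
    """4-connected grid; moving across a pixel pair costs their mean edge strength."""
    idx = np.arange(H * W).reshape(H, W)
    rows, cols, w = [], [], []
    for du, dv in ((0, 1), (1, 0)):
        a = idx[: H - du, : W - dv].ravel()
        b = idx[du:, dv:].ravel()
        cost = (edges[: H - du, : W - dv].ravel() + edges[du:, dv:].ravel()) / 2 + 1e-6
        rows += [a, b]; cols += [b, a]; w += [cost, cost]
    rows, cols, w = map(np.concatenate, (rows, cols, w))
    return coo_matrix((w, (rows, cols)), shape=(H * W, H * W)).tocsr()

graph = grid_graph(edge_map)
boundary = np.unique(np.concatenate([np.arange(W),                    # top row
                                     np.arange(W) + (H - 1) * W,      # bottom
                                     np.arange(H) * W,                # left
                                     np.arange(H) * W + W - 1]))      # right
dist = dijkstra(graph, directed=False, indices=boundary).min(axis=0)
foreground_prior = dist.reshape(H, W)     # large inside the contour
print(foreground_prior[32, 32], foreground_prior[2, 2])
```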
1 code implementation • IEEE Trans. on Image Processing 2014 • Jianbing Shen, Yunfan Du, Wenguan Wang, Xuelong Li
Then, the boundaries of initial superpixels are obtained according to the probabilities and the commute time.