Search Results for author: Huchuan Lu

Found 177 papers, 103 papers with code

CLIFFNet for Monocular Depth Estimation with Hierarchical Embedding Loss

no code implementations ECCV 2020 Lijun Wang, Jianming Zhang, Yifan Wang, Huchuan Lu, Xiang Ruan

This paper proposes a hierarchical loss for monocular depth estimation, which measures the differences between the prediction and ground truth in hierarchical embedding spaces of depth maps.

Monocular Depth Estimation
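
The entry above only names the loss; as a rough, hedged sketch of the general idea (comparing prediction and ground truth in feature spaces of depth maps rather than in pixel space), one could write something like the following. The tiny DepthEncoder and its weights are illustrative placeholders, not the hierarchical embedding networks used in the paper.

```python
import torch
import torch.nn as nn

class DepthEncoder(nn.Module):
    """Hypothetical frozen encoder that maps a depth map to features at
    several levels; stands in for the hierarchical embedding networks."""
    def __init__(self):
        super().__init__()
        self.level1 = nn.Sequential(nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU())
        self.level2 = nn.Sequential(nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.level3 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())

    def forward(self, depth):
        f1 = self.level1(depth)
        f2 = self.level2(f1)
        f3 = self.level3(f2)
        return [f1, f2, f3]

def hierarchical_embedding_loss(encoder, pred_depth, gt_depth, weights=(1.0, 1.0, 1.0)):
    """Sum of weighted L2 distances between embeddings of the predicted and
    ground-truth depth maps at every level (illustrative loss only)."""
    with torch.no_grad():
        gt_feats = encoder(gt_depth)        # no gradient needed for the target
    pred_feats = encoder(pred_depth)        # gradients flow back to the prediction
    loss = pred_depth.new_zeros(())
    for w, fp, fg in zip(weights, pred_feats, gt_feats):
        loss = loss + w * torch.mean((fp - fg) ** 2)
    return loss
```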

Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters

1 code implementation18 Mar 2024 Jiazuo Yu, Yunzhi Zhuge, Lu Zhang, Dong Wang, Huchuan Lu, You He

Continual learning can empower vision-language models to continuously acquire new knowledge, without the need for access to the entire historical dataset.
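
The snippet above covers only the motivation; as a loose illustration of the mechanism named in the title, a mixture-of-experts adapter could look roughly like the layer below. The dimensions, router, expert count, and residual placement are assumptions for illustration, not the authors' design.

```python
import torch
import torch.nn as nn

class MoEAdapter(nn.Module):
    """Illustrative mixture-of-experts adapter: a router softly combines the
    outputs of several small bottleneck experts and adds them residually."""
    def __init__(self, dim=768, bottleneck=64, num_experts=4):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, bottleneck), nn.GELU(), nn.Linear(bottleneck, dim))
            for _ in range(num_experts)
        ])

    def forward(self, x):                                   # x: (batch, tokens, dim)
        gate = torch.softmax(self.router(x), dim=-1)        # (B, T, num_experts)
        expert_out = torch.stack([e(x) for e in self.experts], dim=-1)  # (B, T, dim, E)
        mixed = (expert_out * gate.unsqueeze(2)).sum(dim=-1)
        return x + mixed                                    # residual adapter output
```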

Magic Tokens: Select Diverse Tokens for Multi-modal Object Re-Identification

1 code implementation15 Mar 2024 Pingping Zhang, Yuhao Wang, Yang Liu, Zhengzheng Tu, Huchuan Lu

To address the above issues, we propose a novel learning framework named EDITOR to select diverse tokens from vision Transformers for multi-modal object ReID.

Object

Multi-modal Instruction Tuned LLMs with Fine-grained Visual Perception

no code implementations5 Mar 2024 Junwen He, Yifan Wang, Lijun Wang, Huchuan Lu, Jun-Yan He, Jin-Peng Lan, Bin Luo, Xuansong Xie

Multimodal Large Language Models (MLLMs) leverage Large Language Models as a cognitive framework for diverse visual-language tasks.

Language Modelling Large Language Model +2

Spectrum-guided Feature Enhancement Network for Event Person Re-Identification

no code implementations2 Feb 2024 Hongchen Tan, Yi Zhang, Xiuping Liu, BaoCai Yin, Nan Ma, Xin Li, Huchuan Lu

This network consists of two innovative components: the Multi-grain Spectrum Attention Mechanism (MSAM) and the Consecutive Patch Dropout Module (CPDM).

Person Re-Identification

StableIdentity: Inserting Anybody into Anywhere at First Sight

1 code implementation29 Jan 2024 Qinghe Wang, Xu Jia, Xiaomin Li, Taiqing Li, Liqian Ma, Yunzhi Zhuge, Huchuan Lu

We believe that the proposed StableIdentity is an important step to unify image, video, and 3D customized generation models.

PsySafe: A Comprehensive Framework for Psychological-based Attack, Defense, and Evaluation of Multi-agent System Safety

1 code implementation22 Jan 2024 Zaibin Zhang, Yongting Zhang, Lijun Li, Hongzhi Gao, Lijun Wang, Huchuan Lu, Feng Zhao, Yu Qiao, Jing Shao

In this paper, we explore these concerns through the innovative lens of agent psychology, revealing that the dark psychological states of agents constitute a significant threat to safety.

Part Representation Learning with Teacher-Student Decoder for Occluded Person Re-identification

1 code implementation15 Dec 2023 Shang Gao, Chenyang Yu, Pingping Zhang, Huchuan Lu

In addition, existing occluded person ReID benchmarks utilize occluded samples as queries, which will amplify the role of alleviating occlusion interference and underestimate the impact of the feature absence issue.

Human Parsing Long-range modeling +2

TF-CLIP: Learning Text-free CLIP for Video-based Person Re-Identification

1 code implementation15 Dec 2023 Chenyang Yu, Xuehu Liu, Yingquan Wang, Pingping Zhang, Huchuan Lu

Technically, TMC allows the frame-level memories in a sequence to communicate with each other, and to extract temporal information based on the relations within the sequence.

Cross-Modal Retrieval Video-Based Person Re-Identification

TOP-ReID: Multi-spectral Object Re-Identification with Token Permutation

1 code implementation15 Dec 2023 Yuhao Wang, Xuehu Liu, Pingping Zhang, Hu Lu, Zhengzheng Tu, Huchuan Lu

In addition, most current Transformer-based ReID methods only utilize the global feature of class tokens to achieve holistic retrieval, ignoring the local discriminative ones.

Towards Automatic Power Battery Detection: New Challenge, Benchmark Dataset and Baseline

1 code implementation5 Dec 2023 Xiaoqi Zhao, Youwei Pang, Zhenyu Chen, Qian Yu, Lihe Zhang, Hanqi Liu, Jiaming Zuo, Huchuan Lu

We conduct a comprehensive study on a new task named power battery detection (PBD), which aims to localize the dense cathode and anode plates endpoints from X-ray images to evaluate the quality of power batteries.

Crowd Counting object-detection +2

TrackDiffusion: Multi-object Tracking Data Generation via Diffusion Models

no code implementations1 Dec 2023 Pengxiang Li, Zhili Liu, Kai Chen, Lanqing Hong, Yunzhi Zhuge, Dit-yan Yeung, Huchuan Lu, Xu Jia

Diffusion models have gained prominence in generating data for perception tasks such as image classification and object detection.

Image Classification Multi-Object Tracking +3

Open-Vocabulary Camouflaged Object Segmentation

no code implementations19 Nov 2023 Youwei Pang, Xiaoqi Zhao, Jiaming Zuo, Lihe Zhang, Huchuan Lu

To fill in the gaps, we introduce a new task, open-vocabulary camouflaged object segmentation (OVCOS), and construct a large-scale complex scene dataset (OVCamo) containing 11,483 hand-selected images with fine annotations and corresponding object classes.

Camouflaged Object Segmentation Image Segmentation +4

ZoomNeXt: A Unified Collaborative Pyramid Network for Camouflaged Object Detection

1 code implementation31 Oct 2023 Youwei Pang, Xiaoqi Zhao, Tian-Zhu Xiang, Lihe Zhang, Huchuan Lu

Apart from the high intrinsic similarity between camouflaged objects and their background, objects are usually diverse in scale, fuzzy in appearance, and even severely occluded.

Camouflaged Object Segmentation

TransY-Net: Learning Fully Transformer Networks for Change Detection of Remote Sensing Images

1 code implementation22 Oct 2023 Tianyu Yan, Zifu Wan, Pingping Zhang, Gong Cheng, Huchuan Lu

To relieve these issues, in this work we propose a novel Transformer-based learning framework named TransY-Net for remote sensing image CD, which improves the feature extraction from a global view and combines multi-level visual features in a pyramid manner.

Change Detection

PixArt-$\alpha$: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

2 code implementations30 Sep 2023 Junsong Chen, Jincheng Yu, Chongjian Ge, Lewei Yao, Enze Xie, Yue Wu, Zhongdao Wang, James Kwok, Ping Luo, Huchuan Lu, Zhenguo Li

We hope PIXART-$\alpha$ will provide new insights to the AIGC community and startups to accelerate building their own high-quality yet low-cost generative models from scratch.

Image Generation Language Modelling

DCPT: Darkness Clue-Prompted Tracking in Nighttime UAVs

1 code implementation19 Sep 2023 Jiawen Zhu, Huayi Tang, Zhi-Qi Cheng, Jun-Yan He, Bin Luo, Shihao Qiu, Shengming Li, Huchuan Lu

To address this, we propose a novel architecture called Darkness Clue-Prompted Tracking (DCPT) that achieves robust UAV tracking at night by efficiently learning to generate darkness clue prompts.

Leveraging the Power of Data Augmentation for Transformer-based Tracking

no code implementations15 Sep 2023 Jie Zhao, Johan Edstedt, Michael Felsberg, Dong Wang, Huchuan Lu

Due to long-distance correlation and powerful pretrained models, transformer-based methods have initiated a breakthrough in visual object tracking performance.

Data Augmentation Visual Object Tracking

UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory

1 code implementation28 Aug 2023 Haiwen Diao, Bo Wan, Ying Zhang, Xu Jia, Huchuan Lu, Long Chen

Parameter-efficient transfer learning (PETL), i.e., fine-tuning a small portion of parameters, is an effective strategy for adapting pre-trained models to downstream domains.

Question Answering Retrieval +5

CiteTracker: Correlating Image and Text for Visual Tracking

1 code implementation ICCV 2023 Xin Li, Yuqing Huang, Zhenyu He, YaoWei Wang, Huchuan Lu, Ming-Hsuan Yang

Existing visual tracking methods typically take an image patch as the reference of the target to perform tracking.

Attribute Descriptive +2

Isomer: Isomerous Transformer for Zero-shot Video Object Segmentation

1 code implementation ICCV 2023 Yichen Yuan, Yifan Wang, Lijun Wang, Xiaoqi Zhao, Huchuan Lu, Yu Wang, Weibo Su, Lei Zhang

Recent leading zero-shot video object segmentation (ZVOS) works are devoted to integrating appearance and motion information by elaborately designing feature fusion modules and identically applying them in multiple feature stages.

Semantic Segmentation Video Object Segmentation +2

Exploring Transformers for Open-world Instance Segmentation

no code implementations ICCV 2023 Jiannan Wu, Yi Jiang, Bin Yan, Huchuan Lu, Zehuan Yuan, Ping Luo

Open-world instance segmentation is a rising task, which aims to segment all objects in the image by learning from a limited number of base-category objects.

Contrastive Learning Open-World Instance Segmentation +1

Recurrent Multi-scale Transformer for High-Resolution Salient Object Detection

1 code implementation7 Aug 2023 Xinhao Deng, Pingping Zhang, Wei Liu, Huchuan Lu

To address the above issues, in this work we first propose a new HRS10K dataset, which contains 10,500 high-quality annotated images at 2K-8K resolution.

object-detection Object Detection +1

Video-based Person Re-identification with Long Short-Term Representation Learning

no code implementations7 Aug 2023 Xuehu Liu, Pingping Zhang, Huchuan Lu

Meanwhile, to extract short-term representations, we propose a Bi-direction Motion Estimator (BME), in which reciprocal motion information is efficiently extracted from consecutive frames.

Representation Learning Video-Based Person Re-Identification

Tracking Anything in High Quality

1 code implementation26 Jul 2023 Jiawen Zhu, Zhenyu Chen, Zeqi Hao, Shijie Chang, Lu Zhang, Dong Wang, Huchuan Lu, Bin Luo, Jun-Yan He, Jin-Peng Lan, Hanyuan Chen, Chenyang Li

To further improve the quality of tracking masks, a pretrained MR model is employed to refine the tracking results.

Object Semantic Segmentation +3

ComPtr: Towards Diverse Bi-source Dense Prediction Tasks via A Simple yet General Complementary Transformer

1 code implementation23 Jul 2023 Youwei Pang, Xiaoqi Zhao, Lihe Zhang, Huchuan Lu

Specifically, unlike existing methods that over-specialize in a single task or a subset of tasks, ComPtr starts from the more general concept of bi-source dense prediction.

Change Detection Crowd Counting +4

BEV-IO: Enhancing Bird's-Eye-View 3D Detection with Instance Occupancy

no code implementations26 May 2023 Zaibin Zhang, Yuanhang Zhang, Lijun Wang, Yifan Wang, Huchuan Lu

At the core of our method is the newly-designed instance occupancy prediction (IOP) module, which aims to infer point-level occupancy status for each instance in the frustum space.

Neural Image Re-Exposure

1 code implementation23 May 2023 Xinyu Zhang, Hefei Huang, Xu Jia, Dong Wang, Huchuan Lu

In this work, we aim to re-expose the captured photo in post-processing to provide a more flexible way of addressing those issues within a unified framework.

Ranked #4 on Deblurring on GoPro (using extra training data)

Deblurring Joint Deblur and Frame Interpolation +5

Deeply-Coupled Convolution-Transformer with Spatial-temporal Complementary Learning for Video-based Person Re-identification

1 code implementation27 Apr 2023 Xuehu Liu, Chenyang Yu, Pingping Zhang, Huchuan Lu

Further, in the spatial dimension, we propose a Complementary Content Attention (CCA) to take advantage of the coupled structure and guide independent features for spatial complementary learning.

Video-Based Person Re-Identification

MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation

1 code implementation19 Apr 2023 Chongjian Ge, Junsong Chen, Enze Xie, Zhongdao Wang, Lanqing Hong, Huchuan Lu, Zhenguo Li, Ping Luo

These queries are then processed iteratively by a BEV-Evolving decoder, which selectively aggregates deep features from either LiDAR, cameras, or both modalities.

3D Object Detection Autonomous Driving +3

GM-NeRF: Learning Generalizable Model-based Neural Radiance Fields from Multi-view Images

no code implementations CVPR 2023 Jianchuan Chen, Wentao Yi, Liqian Ma, Xu Jia, Huchuan Lu

The results demonstrate that our approach outperforms state-of-the-art methods in terms of novel view synthesis and geometric reconstruction.

Neural Rendering Novel View Synthesis

ARKitTrack: A New Diverse Dataset for Tracking Using Mobile RGB-D Data

1 code implementation CVPR 2023 Haojie Zhao, Junsong Chen, Lijun Wang, Huchuan Lu

Compared with traditional RGB-only visual tracking, few datasets have been constructed for RGB-D tracking.

Visual Tracking

Plug-and-Play Regulators for Image-Text Matching

1 code implementation23 Mar 2023 Haiwen Diao, Ying Zhang, Wei Liu, Xiang Ruan, Huchuan Lu

Exploiting fine-grained correspondence and visual-semantic alignments has shown great potential in image-text matching.

Image Retrieval Image-text matching +1

Visual Prompt Multi-Modal Tracking

1 code implementation CVPR 2023 Jiawen Zhu, Simiao Lai, Xin Chen, Dong Wang, Huchuan Lu

To inherit the powerful representations of the foundation model, a natural modus operandi for multi-modal tracking is full fine-tuning on the RGB-based parameters.

Object Tracking

M$^{2}$SNet: Multi-scale in Multi-scale Subtraction Network for Medical Image Segmentation

2 code implementations20 Mar 2023 Xiaoqi Zhao, Hongpeng Jia, Youwei Pang, Long Lv, Feng Tian, Lihe Zhang, Weibing Sun, Huchuan Lu

Next, we expand the single-scale SU to the intra-layer multi-scale SU, which can provide the decoder with both pixel-level and structure-level difference information.

Computed Tomography (CT) Image Segmentation +3
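
As a hedged sketch of the subtraction idea summarized above (not the paper's exact module), a subtraction unit can take the element-wise absolute difference of two feature maps and filter it with kernels of several sizes; the channel counts and kernel sizes below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultiScaleSubtractionUnit(nn.Module):
    """Illustrative subtraction unit: the element-wise difference |Fa - Fb|
    is filtered at several kernel sizes and fused back to one feature map."""
    def __init__(self, channels=64, kernel_sizes=(1, 3, 5)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, k, padding=k // 2) for k in kernel_sizes
        ])
        self.fuse = nn.Conv2d(channels * len(kernel_sizes), channels, 1)

    def forward(self, feat_a, feat_b):
        diff = torch.abs(feat_a - feat_b)                    # pixel-level difference
        multi = [branch(diff) for branch in self.branches]   # structure-level context
        return self.fuse(torch.cat(multi, dim=1))
```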

Towards Diverse Binary Segmentation via A Simple yet General Gated Network

1 code implementation18 Mar 2023 Xiaoqi Zhao, Youwei Pang, Lihe Zhang, Huchuan Lu, Lei Zhang

They ignore two key problems when the encoder exchanges information with the decoder: one is the lack of an interference control mechanism between them, and the other is the failure to consider the disparity of the contributions from different encoder levels.

Segmentation Semantic Segmentation

Dual Memory Aggregation Network for Event-Based Object Detection with Learnable Representation

1 code implementation17 Mar 2023 Dongsheng Wang, Xu Jia, Yang Zhang, Xinyu Zhang, Yaoyuan Wang, Ziyang Zhang, Dong Wang, Huchuan Lu

To fully exploit information with event streams to detect objects, a dual-memory aggregation network (DMANet) is proposed to leverage both long and short memory along event streams to aggregate effective information for object detection.

Object object-detection +1

Universal Instance Perception as Object Discovery and Retrieval

1 code implementation CVPR 2023 Bin Yan, Yi Jiang, Jiannan Wu, Dong Wang, Ping Luo, Zehuan Yuan, Huchuan Lu

All instance perception tasks aim at finding certain objects specified by some queries such as category names, language expressions, and target annotations, but this complete field has been split into multiple independent subtasks.

Ranked #1 on Referring Expression Segmentation on RefCOCO val (using extra training data)

Described Object Detection Generalized Referring Expression Comprehension +15

Segment Every Reference Object in Spatial and Temporal Spaces

no code implementations ICCV 2023 Jiannan Wu, Yi Jiang, Bin Yan, Huchuan Lu, Zehuan Yuan, Ping Luo

In this work, we end the current fragmented situation and propose UniRef to unify the three reference-based object segmentation tasks with a single architecture.

Image Segmentation Object +5

Compression-Aware Video Super-Resolution

1 code implementation CVPR 2023 Yingwei Wang, Xu Jia, Xin Tao, Takashi Isobe, Huchuan Lu, Yu-Wing Tai

Videos stored on mobile devices or delivered on the Internet are usually in compressed format with various unknown compression parameters, but most video super-resolution (VSR) methods assume ideal inputs, resulting in a large performance gap between experimental settings and real-world applications.

Model Compression Video Enhancement +1

MetaBEV: Solving Sensor Failures for 3D Detection and Map Segmentation

no code implementations ICCV 2023 Chongjian Ge, Junsong Chen, Enze Xie, Zhongdao Wang, Lanqing Hong, Huchuan Lu, Zhenguo Li, Ping Luo

These queries are then processed iteratively by a BEV-Evolving decoder, which selectively aggregates deep features from either LiDAR, cameras, or both modalities.

3D Object Detection Autonomous Driving +3

HS-Diffusion: Semantic-Mixing Diffusion for Head Swapping

1 code implementation13 Dec 2022 Qinghe Wang, Lijie Liu, Miao Hua, Pengfei Zhu, WangMeng Zuo, QinGhua Hu, Huchuan Lu, Bing Cao

We blend the semantic layouts of source head and source body, and then inpaint the transition region by the semantic layout generator, achieving a coarse-grained head swapping.

Interactive Feature Embedding for Infrared and Visible Image Fusion

no code implementations9 Nov 2022 Fan Zhao, Wenda Zhao, Huchuan Lu

General deep learning-based methods for infrared and visible image fusion rely on the unsupervised mechanism for vital information retention by utilizing elaborately designed loss functions.

Infrared And Visible Image Fusion Self-Supervised Learning

Towards Grand Unification of Object Tracking

1 code implementation14 Jul 2022 Bin Yan, Yi Jiang, Peize Sun, Dong Wang, Zehuan Yuan, Ping Luo, Huchuan Lu

We present a unified method, termed Unicorn, that can simultaneously solve four tracking problems (SOT, MOT, VOS, MOTS) with a single network using the same model parameters.

Multi-Object Tracking Multi-Object Tracking and Segmentation +3

SRRT: Search Region Regulation Tracking

no code implementations10 Jul 2022 Jiawen Zhu, Xin Chen, Pengyu Zhang, Xinying Wang, Dong Wang, Wenda Zhao, Huchuan Lu

Trackers tend to lose the target object when the search region is too limited, or to be interfered with by distractors when the search region is excessive.

Look Back and Forth: Video Super-Resolution with Explicit Temporal Difference Modeling

1 code implementation CVPR 2022 Takashi Isobe, Xu Jia, Xin Tao, Changlin Li, Ruihuang Li, Yongjie Shi, Jing Mu, Huchuan Lu, Yu-Wing Tai

Instead of directly feeding consecutive frames into a VSR model, we propose to compute the temporal difference between frames and divide those pixels into two subsets according to the level of difference.

Motion Compensation Optical Flow Estimation +1
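
A minimal sketch of the frame-splitting step described above, assuming a simple per-pixel threshold on the absolute frame difference; the threshold value and the exact mask definition are illustrative, not the paper's recipe.

```python
import torch

def split_by_temporal_difference(frame_prev, frame_curr, threshold=0.05):
    """Split the current frame's pixels into low-variance (nearly static) and
    high-variance (fast-changing) subsets via a threshold on the absolute
    frame difference. Frames: float tensors of shape (B, C, H, W) in [0, 1]."""
    diff = torch.abs(frame_curr - frame_prev).mean(dim=1, keepdim=True)  # (B, 1, H, W)
    hv_mask = (diff > threshold).float()
    lv_mask = 1.0 - hv_mask
    return frame_curr * lv_mask, frame_curr * hv_mask
```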

Visible-Thermal UAV Tracking: A Large-Scale Benchmark and New Baseline

no code implementations CVPR 2022 Pengyu Zhang, Jie Zhao, Dong Wang, Huchuan Lu, Xiang Ruan

With the popularity of multi-modal sensors, visible-thermal (RGB-T) object tracking aims to achieve robust performance and wider application scenarios with the guidance of objects' temperature information.

Attribute Object Tracking +1

Deeply Interleaved Two-Stream Encoder for Referring Video Segmentation

no code implementations30 Mar 2022 Guang Feng, Lihe Zhang, Zhiwei Hu, Huchuan Lu

To address this task, we first design a two-stream encoder to extract CNN-based visual features and transformer-based linguistic features hierarchically, and a vision-language mutual guidance (VLMG) module is inserted into the encoder multiple times to promote the hierarchical and progressive fusion of multi-modal features.

Referring Expression Segmentation Video Segmentation +2

High-Performance Transformer Tracking

1 code implementation25 Mar 2022 Xin Chen, Bin Yan, Jiawen Zhu, Huchuan Lu, Xiang Ruan, Dong Wang

First, we present a transformer tracking (named TransT) method based on the Siamese-like feature extraction backbone, the designed attention-based fusion mechanism, and the classification and regression head.

Vocal Bursts Intensity Prediction

Efficient Visual Tracking via Hierarchical Cross-Attention Transformer

1 code implementation25 Mar 2022 Xin Chen, Ben Kang, Dong Wang, Dongdong Li, Huchuan Lu

Most state-of-the-art trackers are satisfied with the real-time speed on powerful GPUs.

Visual Tracking

Joint Learning of Salient Object Detection, Depth Estimation and Contour Extraction

1 code implementation9 Mar 2022 Xiaoqi Zhao, Youwei Pang, Lihe Zhang, Huchuan Lu

In this paper, we propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD).

Depth Estimation object-detection +2

Multi-Object Tracking Meets Moving UAV

no code implementations CVPR 2022 Shuai Liu, Xin Li, Huchuan Lu, You He

Multi-object tracking in unmanned aerial vehicle (UAV) videos is an important vision task and can be applied in a wide range of applications.

Multi-Object Tracking Object

Multi-Source Uncertainty Mining for Deep Unsupervised Saliency Detection

no code implementations CVPR 2022 Yifan Wang, Wenbo Zhang, Lijun Wang, Ting Liu, Huchuan Lu

We design an Uncertainty Mining Network (UMNet) which consists of multiple Merge-and-Split (MS) modules to recursively analyze the commonality and difference among multiple noisy labels and infer pixel-wise uncertainty map for each label.

object-detection Object Detection +3

An Informative Tracking Benchmark

1 code implementation13 Dec 2021 Xin Li, Qiao Liu, Wenjie Pei, Qiuhong Shen, YaoWei Wang, Huchuan Lu, Ming-Hsuan Yang

Along with the rapid progress of visual tracking, existing benchmarks become less informative due to redundancy of samples and weak discrimination between current trackers, making evaluations on all datasets extremely time-consuming.

Visual Tracking

CAVER: Cross-Modal View-Mixed Transformer for Bi-Modal Salient Object Detection

1 code implementation4 Dec 2021 Youwei Pang, Xiaoqi Zhao, Lihe Zhang, Huchuan Lu

Most of the existing bi-modal (RGB-D and RGB-T) salient object detection methods utilize the convolution operation and construct complex interwoven fusion structures to achieve cross-modal information integration.

object-detection RGB-D Salient Object Detection +1

MFNet: Multi-filter Directive Network for Weakly Supervised Salient Object Detection

1 code implementation ICCV 2021 Yongri Piao, Jian Wang, Miao Zhang, Huchuan Lu

The multiple accurate cues from multiple DFs are then simultaneously propagated to the saliency network with a multi-guidance loss.

object-detection Object Detection +2

Transformer-based Network for RGB-D Saliency Detection

no code implementations1 Dec 2021 Yue Wang, Xu Jia, Lu Zhang, Yuke Li, James Elder, Huchuan Lu

TFFM conducts a sufficient feature fusion by integrating features from multiple scales and two modalities over all positions simultaneously.

Saliency Detection

MODNet-V: Improving Portrait Video Matting via Background Restoration

1 code implementation24 Sep 2021 Jiayu Sun, Zhanghan Ke, Lihe Zhang, Huchuan Lu, Rynson W. H. Lau

In this work, we observe that instead of asking the user to explicitly provide a background image, we may recover it from the input video itself.

Image Matting Video Matting

To be Critical: Self-Calibrated Weakly Supervised Learning for Salient Object Detection

no code implementations4 Sep 2021 Yongri Piao, Jian Wang, Miao Zhang, Zhengxuan Ma, Huchuan Lu

Despite the success of previous works, explorations of an effective training strategy for the saliency network and accurate matches between image-level annotations and salient objects are still inadequate.

object-detection Object Detection +2

Multi-Source Fusion and Automatic Predictor Selection for Zero-Shot Video Object Segmentation

1 code implementation11 Aug 2021 Xiaoqi Zhao, Youwei Pang, Jiaxing Yang, Lihe Zhang, Huchuan Lu

In this paper, we propose a novel multi-source fusion network for zero-shot video object segmentation.

 Ranked #1 on Video Object Segmentation on FBMS (Jaccard (Mean) metric)

Depth Estimation Object +3

Automatic Polyp Segmentation via Multi-scale Subtraction Network

2 code implementations11 Aug 2021 Xiaoqi Zhao, Lihe Zhang, Huchuan Lu

Keywords: Colorectal Cancer, Automatic Polyp Segmentation, Subtraction, LossNet.

Segmentation

Video Annotation for Visual Tracking via Selection and Refinement

1 code implementation ICCV 2021 Kenan Dai, Jie Zhao, Lijun Wang, Dong Wang, Jianhua Li, Huchuan Lu, Xuesheng Qian, Xiaoyun Yang

Deep learning based visual trackers entail offline pre-training on large volumes of video datasets with accurate bounding box annotations that are labor-expensive to achieve.

Visual Tracking

HAT: Hierarchical Aggregation Transformers for Person Re-identification

1 code implementation13 Jul 2021 Guowen Zhang, Pingping Zhang, Jinqing Qi, Huchuan Lu

In this work, we take advantage of both CNNs and Transformers, and propose a novel learning framework named Hierarchical Aggregation Transformer (HAT) for image-based person Re-ID with high performance.

Person Re-Identification Person Retrieval +1

Animatable Neural Radiance Fields from Monocular RGB Videos

1 code implementation25 Jun 2021 Jianchuan Chen, Ying Zhang, Di Kang, Xuefei Zhe, Linchao Bao, Xu Jia, Huchuan Lu

We present animatable neural radiance fields (animatable NeRF) for detailed human avatar creation from monocular videos.

3D Human Reconstruction Neural Rendering +2

Self-Supervised Tracking via Target-Aware Data Synthesis

no code implementations21 Jun 2021 Xin Li, Wenjie Pei, YaoWei Wang, Zhenyu He, Huchuan Lu, Ming-Hsuan Yang

While deep-learning based tracking methods have achieved substantial progress, they entail large-scale and high-quality annotated data for sufficient training.

Representation Learning Self-Supervised Learning +1

Self-Generated Defocus Blur Detection via Dual Adversarial Discriminators

1 code implementation CVPR 2021 Wenda Zhao, Cai Shang, Huchuan Lu

The core insight is that a defocus blur region/focused clear area can be arbitrarily pasted to a given realistic full blurred image/full clear image without affecting the judgment of the full blurred image/full clear image.

Defocus Blur Detection

Calibrated RGB-D Salient Object Detection

1 code implementation CVPR 2021 Wei Ji, Jingjing Li, Shuang Yu, Miao Zhang, Yongri Piao, Shunyu Yao, Qi Bi, Kai Ma, Yefeng Zheng, Huchuan Lu, Li Cheng

Complex backgrounds and similar appearances between objects and their surroundings are generally recognized as challenging scenarios in Salient Object Detection (SOD).

Object object-detection +3

Multi-Target Domain Adaptation with Collaborative Consistency Learning

no code implementations CVPR 2021 Takashi Isobe, Xu Jia, Shuaijun Chen, Jianzhong He, Yongjie Shi, Jianzhuang Liu, Huchuan Lu, Shengjin Wang

To obtain a single model that works across multiple target domains, we propose to simultaneously learn a student model which is trained not only to imitate the output of each expert on the corresponding target domain, but also to pull different experts close to each other with regularization on their weights.

Multi-target Domain Adaptation Semantic Segmentation +1

Encoder Fusion Network with Co-Attention Embedding for Referring Image Segmentation

no code implementations CVPR 2021 Guang Feng, Zhiwei Hu, Lihe Zhang, Huchuan Lu

In this work, we propose an encoder fusion network (EFN), which transforms the visual encoder into a multi-modal feature learning network, and uses language to refine the multi-modal features progressively.

Image Segmentation Semantic Segmentation

A Video Is Worth Three Views: Trigeminal Transformers for Video-based Person Re-identification

no code implementations5 Apr 2021 Xuehu Liu, Pingping Zhang, Chenyang Yu, Huchuan Lu, Xuesheng Qian, Xiaoyun Yang

To capture richer perceptions and extract more comprehensive video representations, in this paper we propose a novel framework named Trigeminal Transformers (TMT) for video-based person Re-ID.

Video-Based Person Re-Identification

Transformer Tracking

1 code implementation CVPR 2021 Xin Chen, Bin Yan, Jiawen Zhu, Dong Wang, Xiaoyun Yang, Huchuan Lu

The correlation operation is a simple fusion method for considering the similarity between the template and the search region.

Visual Object Tracking Visual Tracking

Watching You: Global-guided Reciprocal Learning for Video-based Person Re-identification

1 code implementation CVPR 2021 Xuehu Liu, Pingping Zhang, Chenyang Yu, Huchuan Lu, Xiaoyun Yang

Specifically, we first propose a Global-guided Correlation Estimation (GCE) to generate feature correlation maps of local features and global features, which help to localize the high- and low-correlation regions for identifying the same person.

Feature Correlation Video-Based Person Re-Identification

Self-Supervised Pretraining for RGB-D Salient Object Detection

1 code implementation29 Jan 2021 Xiaoqi Zhao, Youwei Pang, Lihe Zhang, Huchuan Lu, Xiang Ruan

Existing CNN-based RGB-D salient object detection (SOD) networks are all required to be pretrained on ImageNet to learn the hierarchical features, which helps provide a good initialization.

Object object-detection +3

Neighbor2Neighbor: Self-Supervised Denoising from Single Noisy Images

12 code implementations CVPR 2021 Tao Huang, Songjiang Li, Xu Jia, Huchuan Lu, Jianzhuang Liu

In this paper, we present a very simple yet effective method named Neighbor2Neighbor to train an effective image denoising model with only noisy images.

Image Denoising Self-Supervised Learning
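
The training trick can be sketched as follows: draw two sub-images from one noisy input by picking two different neighbors in every 2x2 cell, then ask the denoiser to map one sub-image to the other. This is a simplified sketch that omits the method's regularization term, and the sampler details below are assumptions.

```python
import torch
import torch.nn.functional as F

def neighbor_subsample(noisy):
    """Draw two half-resolution sub-images from one noisy image by picking two
    different pixels inside every non-overlapping 2x2 cell (H and W assumed even)."""
    b, c, h, w = noisy.shape
    cells = F.unfold(noisy, kernel_size=2, stride=2).reshape(b, c, 4, -1)  # 4 pixels per cell
    n_cells = cells.shape[-1]
    idx1 = torch.randint(0, 4, (b, 1, 1, n_cells), device=noisy.device)
    idx2 = (idx1 + torch.randint(1, 4, (b, 1, 1, n_cells), device=noisy.device)) % 4
    sub1 = torch.gather(cells, 2, idx1.expand(b, c, 1, n_cells)).reshape(b, c, h // 2, w // 2)
    sub2 = torch.gather(cells, 2, idx2.expand(b, c, 1, n_cells)).reshape(b, c, h // 2, w // 2)
    return sub1, sub2

def n2n_reconstruction_loss(denoiser, noisy):
    """Basic Neighbor2Neighbor-style reconstruction loss (regularizer omitted)."""
    sub1, sub2 = neighbor_subsample(noisy)
    return F.mse_loss(denoiser(sub1), sub2)
```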

Similarity Reasoning and Filtration for Image-Text Matching

1 code implementation5 Jan 2021 Haiwen Diao, Ying Zhang, Lin Ma, Huchuan Lu

Image-text matching plays a critical role in bridging the vision and language, and great progress has been made by exploiting the global alignment between image and sentence, or local alignments between regions and words.

Image Retrieval Image-text matching +2

CR-Fill: Generative Image Inpainting With Auxiliary Contextual Reconstruction

1 code implementation ICCV 2021 Yu Zeng, Zhe Lin, Huchuan Lu, Vishal M. Patel

The auxiliary branch (i.e., CR loss) is required only during training, and only the inpainting generator is required during inference.

Image Inpainting

DUT-LFSaliency: Versatile Dataset and Light Field-to-RGB Saliency Detection

no code implementations30 Dec 2020 Yongri Piao, Zhengkun Rong, Shuang Xu, Miao Zhang, Huchuan Lu

The success of learning-based light field saliency detection is heavily dependent on how a comprehensive dataset can be constructed for higher generalizability of models, how high dimensional light field data can be effectively exploited, and how a flexible model can be designed to achieve versatility for desktop computers and mobile devices.

Saliency Detection

Multi-modal Visual Tracking: Review and Experimental Comparison

2 code implementations8 Dec 2020 Pengyu Zhang, Dong Wang, Huchuan Lu

Visual object tracking, as a fundamental task in computer vision, has drawn much attention in recent years.

Rgb-T Tracking Visual Object Tracking

CR-Fill: Generative Image Inpainting with Auxiliary Contextual Reconstruction

1 code implementation25 Nov 2020 Yu Zeng, Zhe Lin, Huchuan Lu, Vishal M. Patel

Due to the lack of supervision signals for the correspondence between missing regions and known regions, it may fail to find proper reference features, which often leads to artifacts in the results.

Image Inpainting

Coherent Loss: A Generic Framework for Stable Video Segmentation

no code implementations25 Oct 2020 Mingyang Qian, Yi Fu, Xiao Tan, YingYing Li, Jinqing Qi, Huchuan Lu, Shilei Wen, Errui Ding

Video segmentation approaches are of great importance for numerous vision tasks especially in video manipulation for entertainment.

Segmentation Semantic Segmentation +2

Accurate RGB-D Salient Object Detection via Collaborative Learning

2 code implementations ECCV 2020 Wei Ji, Jingjing Li, Miao Zhang, Yongri Piao, Huchuan Lu

The explicitly extracted edge information goes together with saliency to give more emphasis to the salient regions and object boundaries.

Object object-detection +5

Multi-scale Interactive Network for Salient Object Detection

1 code implementation CVPR 2020 Youwei Pang, Xiaoqi Zhao, Lihe Zhang, Huchuan Lu

To obtain more efficient multi-scale features from the integrated features, the self-interaction modules are embedded in each decoder unit.

Object object-detection +2

A Single Stream Network for Robust and Real-time RGB-D Salient Object Detection

1 code implementation ECCV 2020 Xiaoqi Zhao, Lihe Zhang, Youwei Pang, Huchuan Lu, Lei Zhang

In this work, we design a single stream network to directly use the depth map to guide early fusion and middle fusion between RGB and depth, which saves the feature encoder of the depth stream and achieves a lightweight and real-time model.

object-detection RGB-D Salient Object Detection +3

Jointly Modeling Motion and Appearance Cues for Robust RGB-T Tracking

no code implementations4 Jul 2020 Pengyu Zhang, Jie Zhao, Dong Wang, Huchuan Lu, Xiaoyun Yang

In this study, we propose a novel RGB-T tracking framework by jointly modeling both appearance and motion cues.

Rgb-T Tracking

Alpha-Refine: Boosting Tracking Performance by Precise Bounding Box Estimation

1 code implementation4 Jul 2020 Bin Yan, Dong Wang, Huchuan Lu, Xiaoyun Yang

In recent years, the multiple-stage strategy has become a popular trend for visual tracking.

Visual Tracking

Synergistic saliency and depth prediction for RGB-D saliency detection

no code implementations3 Jul 2020 Yue Wang, Yuke Li, James H. Elder, Huchuan Lu, Runmin Wu, Lu Zhang

Evaluation on seven RGB-D datasets demonstrates that, even without saliency ground truth for RGB-D datasets and using only the RGB data of RGB-D datasets at inference, our semi-supervised system performs favorably against state-of-the-art fully-supervised RGB-D saliency detection methods that use saliency ground truth for RGB-D datasets at training and depth data at inference on the two largest testing datasets.

Depth Estimation Depth Prediction +1

Pose-guided Visible Part Matching for Occluded Person ReID

1 code implementation CVPR 2020 Shang Gao, Jingya Wang, Huchuan Lu, Zimo Liu

Occluded person re-identification is a challenging task as the appearance varies substantially with various obstacles, especially in the crowd scenario.

Graph Matching Person Re-Identification

High-Performance Long-Term Tracking with Meta-Updater

2 code implementations CVPR 2020 Kenan Dai, Yunhua Zhang, Dong Wang, Jianhua Li, Huchuan Lu, Xiaoyun Yang

Most top-ranked long-term trackers adopt offline-trained Siamese architectures; thus, they cannot benefit from the great progress of short-term trackers with online update.

Visual Object Tracking Visual Tracking +1

Cooling-Shrinking Attack: Blinding the Tracker with Imperceptible Noises

1 code implementation CVPR 2020 Bin Yan, Dong Wang, Huchuan Lu, Xiaoyun Yang

An effective and efficient perturbation generator is trained with a carefully designed adversarial loss, which can simultaneously cool hot regions where the target exists on the heatmaps and force the predicted bounding box to shrink, making the tracked target invisible to trackers.

Adversarial Attack
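
As an illustration of the two loss terms described above (not the exact objective of the paper), a "cooling" term can suppress heatmap responses inside the target region while a "shrinking" term penalizes the predicted box size; the tensor layouts and weights below are hypothetical.

```python
import torch

def cooling_shrinking_loss(heatmap, target_mask, pred_box_wh, alpha=1.0, beta=1.0):
    """Illustrative adversarial objective for a perturbation generator:
    'cool' heatmap responses where the target lies and encourage the predicted
    box to shrink. heatmap/target_mask: (B, 1, H, W); pred_box_wh: (B, 2)."""
    cooling = (heatmap * target_mask).sum() / target_mask.sum().clamp(min=1.0)
    shrinking = pred_box_wh.mean()
    return alpha * cooling + beta * shrinking
```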

When Relation Networks meet GANs: Relation GANs with Triplet Loss

1 code implementation24 Feb 2020 Runmin Wu, Kunyao Zhang, Lijun Wang, Yue Wang, Pingping Zhang, Huchuan Lu, Yizhou Yu

Though recent research has achieved remarkable progress in generating realistic images with generative adversarial networks (GANs), the lack of training stability is still a lingering concern of most GANs, especially on high-resolution inputs and complex datasets.

Conditional Image Generation Relation +2

Reverse Attention-Based Residual Network for Salient Object Detection

6 code implementations IEEE Transactions on Image Processing 2020 Shuhan Chen, Xiuli Tan, Ben Wang, Huchuan Lu, Xuelong Hu, Yun Fu

Benefiting from the quick development of deep convolutional neural networks, especially fully convolutional neural networks (FCNs), remarkable progress has been achieved on salient object detection recently.

Object object-detection +2
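
The abstract snippet above is general, but the reverse-attention idea named in the title can be sketched as re-weighting side features by the complement of an up-sampled coarse prediction, so later stages focus on regions not yet captured; the shapes and interpolation settings below are assumptions.

```python
import torch
import torch.nn.functional as F

def reverse_attention(side_feature, coarse_pred):
    """Re-weight side features by 1 - sigmoid(coarse prediction) so that the
    network attends to regions the coarser stage has not yet marked salient.
    side_feature: (B, C, H, W); coarse_pred: (B, 1, h, w) logits."""
    pred = F.interpolate(coarse_pred, size=side_feature.shape[2:],
                         mode='bilinear', align_corners=False)
    attn = 1.0 - torch.sigmoid(pred)      # reverse attention weights
    return side_feature * attn
```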

Memory-oriented Decoder for Light Field Salient Object Detection

1 code implementation NeurIPS 2019 Miao Zhang, Jingjing Li, Ji Wei, Yongri Piao, Huchuan Lu

In this paper, we present a deep-learning-based method where a novel memory-oriented decoder is tailored for light field saliency detection.

object-detection RGB Salient Object Detection +2

ROI Pooled Correlation Filters for Visual Tracking

1 code implementation CVPR 2019 Yuxuan Sun, Chong Sun, Dong Wang, You He, Huchuan Lu

The ROI (region-of-interest) based pooling method performs pooling operations on the cropped ROI regions for various samples and has shown great success in object detection methods.

object-detection Object Detection +1

Deep Multiphase Level Set for Scene Parsing

no code implementations8 Oct 2019 Pingping Zhang, Wei Liu, Yinjie Lei, Hongyu Wang, Huchuan Lu

The proposed method consists of three modules, i.e., recurrent FCNs, adaptive multiphase level set, and deeply supervised learning.

Image Segmentation Scene Parsing +1

GradNet: Gradient-Guided Network for Visual Object Tracking

2 code implementations ICCV 2019 Peixia Li, Bo-Yu Chen, Wanli Ouyang, Dong Wang, Xiaoyun Yang, Huchuan Lu

In this work, we propose a novel gradient-guided network to exploit the discriminative information in gradients and update the template in the siamese network through feed-forward and backward operations.

Ranked #3 on Visual Object Tracking on OTB-2015 (Precision metric)

Object Template Matching +2
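
A hedged sketch of the gradient-guided update described above: run one forward pass with the current template, backpropagate a tracking loss, and use the resulting gradient to refine the template feature. The similarity_fn, target_label, and learning rate below are hypothetical stand-ins, not the paper's network.

```python
import torch
import torch.nn.functional as F

def gradient_guided_template_update(template_feat, search_feat, similarity_fn,
                                    target_label, lr=0.01):
    """Hedged sketch: refine the template feature with the gradient of a
    tracking loss from one forward/backward pass on the current frame.
    similarity_fn and target_label are hypothetical stand-ins."""
    template = template_feat.clone().requires_grad_(True)
    response = similarity_fn(template, search_feat)   # e.g. a cross-correlation response map
    loss = F.mse_loss(response, target_label)
    grad, = torch.autograd.grad(loss, template)
    return (template - lr * grad).detach()
```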

'Skimming-Perusal' Tracking: A Framework for Real-Time and Robust Long-term Tracking

1 code implementation ICCV 2019 Bin Yan, Haojie Zhao, Dong Wang, Huchuan Lu, Xiaoyun Yang

In this work, we present a novel robust and real-time long-term tracking framework based on the proposed skimming and perusal modules.

Towards High-Resolution Salient Object Detection

1 code implementation ICCV 2019 Yi Zeng, Pingping Zhang, Jianming Zhang, Zhe Lin, Huchuan Lu

This paper pushes forward high-resolution saliency detection, and contributes a new dataset, named High-Resolution Salient Object Detection (HRSOD).

Ranked #10 on RGB Salient Object Detection on DAVIS-S (using extra training data)

Object object-detection +4

Cascaded Context Pyramid for Full-Resolution 3D Semantic Scene Completion

no code implementations ICCV 2019 Pingping Zhang, Wei Liu, Yinjie Lei, Huchuan Lu, Xiaoyun Yang

To address these issues, in this work we propose a novel deep learning framework, named Cascaded Context Pyramid Network (CCPNet), to jointly infer the occupancy and semantic labels of a volumetric 3D scene from a single depth image.

Ranked #5 on 3D Semantic Scene Completion on NYUv2 (using extra training data)

3D Semantic Scene Completion

Multi-source weak supervision for saliency detection

1 code implementation CVPR 2019 Yu Zeng, Yunzhi Zhuge, Huchuan Lu, Lihe Zhang, Mingyang Qian, Yizhou Yu

To this end, we propose a unified framework to train saliency detection models with diverse weak supervision sources.

Saliency Prediction

Salient Object Detection with Lossless Feature Reflection and Weighted Structural Loss

no code implementations21 Jan 2019 Pingping Zhang, Wei Liu, Huchuan Lu, Chunhua Shen

Inspired by the intrinsic reflection of natural images, in this paper we propose a novel feature learning framework for large-scale salient object detection.

Object object-detection +3

Global and Local Sensitivity Guided Key Salient Object Re-augmentation for Video Saliency Detection

no code implementations19 Nov 2018 Ziqi Zhou, Zheng Wang, Huchuan Lu, Song Wang, Meijun Sun

In this paper, based on the fact that salient areas in videos are relatively small and concentrated, we propose a key salient object re-augmentation method (KSORA) using top-down semantic knowledge and bottom-up feature guidance to improve detection accuracy in video scenes.

Decision Making feature selection +2

DeepLens: Shallow Depth Of Field From A Single Image

no code implementations18 Oct 2018 Lijun Wang, Xiaohui Shen, Jianming Zhang, Oliver Wang, Zhe Lin, Chih-Yao Hsieh, Sarah Kong, Huchuan Lu

To achieve this, we propose a novel neural network model comprised of a depth prediction module, a lens blur module, and a guided upsampling module.

Depth Estimation Depth Prediction

Boundary-guided Feature Aggregation Network for Salient Object Detection

no code implementations28 Sep 2018 Yunzhi Zhuge, Pingping Zhang, Huchuan Lu

Fully convolutional networks (FCNs) have significantly improved the performance of many pixel-labeling tasks, such as semantic segmentation and depth estimation.

Depth Estimation Object +4

Learning regression and verification networks for long-term visual tracking

3 code implementations12 Sep 2018 Yunhua Zhang, Dong Wang, Lijun Wang, Jinqing Qi, Huchuan Lu

Compared with short-term tracking, the long-term tracking task requires determining whether the tracked object is present or absent, and then estimating the accurate bounding box if present or conducting image-wide re-detection if absent.

General Classification Object +3

Real-time 'Actor-Critic' Tracking

no code implementations ECCV 2018 Boyu Chen, Dong Wang, Peixia Li, Shuang Wang, Huchuan Lu

In this work, we propose a novel tracking algorithm with real-time performance based on the ‘Actor-Critic’ framework.

Visual Tracking

Structured Siamese Network for Real-Time Visual Tracking

no code implementations ECCV 2018 Yunhua Zhang, Lijun Wang, Jinqing Qi, Dong Wang, Mengyang Feng, Huchuan Lu

In this paper, we circumvent this issue by proposing a local structure learning method, which simultaneously considers the local patterns of the target and their structural relationships for more accurate target tracking.

Real-Time Visual Tracking

Troy: Give Attention to Saliency and for Saliency

no code implementations4 Aug 2018 Pingping Zhang, Huchuan Lu, Chunhua Shen

In addition, our work has text overlap with arXiv:1804.06242, arXiv:1705.00938 by other authors.

Progressive Attention Guided Recurrent Network for Salient Object Detection

1 code implementation CVPR 2018 Xiaoning Zhang, Tiantian Wang, Jinqing Qi, Huchuan Lu, Gang Wang

In this paper, we propose a novel attention guided network which selectively integrates multi-level contextual information in a progressive manner.

Ranked #11 on RGB Salient Object Detection on DUTS-TE (max F-measure metric)

Object object-detection +3

Learning to Promote Saliency Detectors

1 code implementation CVPR 2018 Yu Zeng, Huchuan Lu, Lihe Zhang, Mengyang Feng, Ali Borji

The categories and appearance of salient objects vary from image to image; therefore, saliency detection is an image-specific task.

Saliency Detection Small Data Image Classification +1

Detect Globally, Refine Locally: A Novel Approach to Saliency Detection

no code implementations CVPR 2018 Tiantian Wang, Lihe Zhang, Shuo Wang, Huchuan Lu, Gang Yang, Xiang Ruan, Ali Borji

Moreover, to effectively recover object boundaries, we propose a local Boundary Refinement Network (BRN) to adaptively learn the local contextual information for each spatial position.

object-detection RGB Salient Object Detection +2

Defocus Blur Detection via Multi-Stream Bottom-Top-Bottom Fully Convolutional Network

no code implementations CVPR 2018 Wenda Zhao, Fan Zhao, Dong Wang, Huchuan Lu

To address these issues, we propose a multi-stream bottom-top-bottom fully convolutional network (BTBNet), which is the first attempt to develop an end-to-end deep network for DBD.

Defocus Blur Detection Defocus Estimation

Correlation Tracking via Joint Discrimination and Reliability Learning

1 code implementation CVPR 2018 Chong Sun, Dong Wang, Huchuan Lu, Ming-Hsuan Yang

To address this issue, we propose a novel CF-based optimization problem to jointly model the discrimination and reliability information.

Visual Tracking

HyperFusion-Net: Densely Reflective Fusion for Salient Object Detection

no code implementations14 Apr 2018 Pingping Zhang, Huchuan Lu, Chunhua Shen

Salient object detection (SOD), which aims to find the most important region of interest and segment the relevant object/item in that area, is an important yet challenging vision task.

object-detection RGB Salient Object Detection +1

Video Person Re-identification by Temporal Residual Learning

no code implementations22 Feb 2018 Ju Dai, Pingping Zhang, Huchuan Lu, Hongyu Wang

In this paper, we propose a novel feature learning framework for video person re-identification (re-ID).

Video-Based Person Re-Identification

Non-rigid Object Tracking via Deep Multi-scale Spatial-temporal Discriminative Saliency Maps

no code implementations22 Feb 2018 Pingping Zhang, Wei Liu, Dong Wang, Yinjie Lei, Hongyu Wang, Chunhua Shen, Huchuan Lu

Extensive experiments demonstrate that the proposed algorithm achieves competitive performance in both saliency detection and visual tracking, especially outperforming other related trackers on the non-rigid object tracking datasets.

Object Object Tracking +2

Unsupervised Band Selection of Hyperspectral Images via Multi-dictionary Sparse Representation

no code implementations20 Feb 2018 Fei Li, Pingping Zhang, Huchuan Lu

Band selection is a direct and effective method to remove redundant information and reduce the spectral dimension for decreasing computational complexity and avoiding the curse of dimensionality.

Dictionary Learning General Classification +1

Salient Object Detection by Lossless Feature Reflection

no code implementations19 Feb 2018 Pingping Zhang, Wei Liu, Huchuan Lu, Chunhua Shen

Inspired by the intrinsic reflection of natural images, in this paper we propose a novel feature learning framework for large-scale salient object detection.

Object object-detection +3

A Stagewise Refinement Model for Detecting Salient Objects in Images

1 code implementation ICCV 2017 Tiantian Wang, Ali Borji, Lihe Zhang, Pingping Zhang, Huchuan Lu

To remedy this problem, here we propose to augment feedforward neural networks with a novel pyramid pooling module and a multi-stage refinement mechanism for saliency detection.

Ranked #13 on RGB Salient Object Detection on DUTS-TE (max F-measure metric)

object-detection RGB Salient Object Detection +2

Stepwise Metric Promotion for Unsupervised Video Person Re-Identification

no code implementations ICCV 2017 Zimo Liu, Dong Wang, Huchuan Lu

The intensive annotation cost and the rich but unlabeled data contained in videos motivate us to propose an unsupervised video-based person re-identification (re-ID) method.

Retrieval Video-Based Person Re-Identification

Statistics of Deep Generated Images

no code implementations9 Aug 2017 Yu Zeng, Huchuan Lu, Ali Borji

Here, we explore the low-level statistics of images generated by state-of-the-art deep generative models.

Generative Adversarial Network

An Unsupervised Game-Theoretic Approach to Saliency Detection

no code implementations8 Aug 2017 Yu Zeng, Huchuan Lu, Ali Borji, Mengyang Feng

Saliency maps are generated according to each region's strategy in the Nash equilibrium of the proposed Saliency Game.

object-detection RGB Salient Object Detection +2

Amulet: Aggregating Multi-level Convolutional Features for Salient Object Detection

1 code implementation ICCV 2017 Pingping Zhang, Dong Wang, Huchuan Lu, Hongyu Wang, Xiang Ruan

In addition, to achieve accurate boundary inference and semantic enhancement, edge-aware feature maps in low-level layers and the predicted results of low resolution features are recursively embedded into the learning framework.

Ranked #19 on RGB Salient Object Detection on DUTS-TE (max F-measure metric)

Object object-detection +2

Learning Spatial-Aware Regressions for Visual Tracking

1 code implementation CVPR 2018 Chong Sun, Dong Wang, Huchuan Lu, Ming-Hsuan Yang

Second, we propose a fully convolutional neural network with spatially regularized kernels, through which the filter kernel corresponding to each output channel is forced to focus on a specific region of the target.

regression Visual Object Tracking +1

Deep Mutual Learning

8 code implementations CVPR 2018 Ying Zhang, Tao Xiang, Timothy M. Hospedales, Huchuan Lu

Model distillation is an effective and widely used technique to transfer knowledge from a teacher to a student network.

Person Re-Identification
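
The mutual-learning objective is simple enough to sketch: each network in the cohort minimizes its own cross-entropy plus a KL term pulling its predictions toward its peer's. The two-network version below follows the commonly cited formulation (temperature and loss weights omitted) and is illustrative rather than a faithful reproduction.

```python
import torch
import torch.nn.functional as F

def mutual_learning_losses(logits_a, logits_b, labels):
    """Deep-mutual-learning style losses for a two-network cohort: each model
    gets cross-entropy on the labels plus a KL term towards its peer."""
    ce_a = F.cross_entropy(logits_a, labels)
    ce_b = F.cross_entropy(logits_b, labels)
    # KL(peer || self); the peer's distribution is treated as a fixed target here.
    kl_a = F.kl_div(F.log_softmax(logits_a, dim=1),
                    F.softmax(logits_b, dim=1).detach(), reduction='batchmean')
    kl_b = F.kl_div(F.log_softmax(logits_b, dim=1),
                    F.softmax(logits_a, dim=1).detach(), reduction='batchmean')
    return ce_a + kl_a, ce_b + kl_b
```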

Hierarchical Cellular Automata for Visual Saliency

1 code implementation26 May 2017 Yao Qin, Mengyang Feng, Huchuan Lu, Garrison W. Cottrell

The CCA can act as an efficient pixel-wise aggregation algorithm that can integrate state-of-the-art methods, resulting in even better results.

Saliency Detection

Pose Invariant Embedding for Deep Person Re-identification

no code implementations26 Jan 2017 Liang Zheng, Yujia Huang, Huchuan Lu, Yi Yang

Second, to reduce the impact of pose estimation errors and information loss during PoseBox construction, we design a PoseBox fusion (PBF) CNN architecture that takes the original image, the PoseBox, and the pose estimation confidence as input.

Person Re-Identification Pose Estimation +1

Dual Deep Network for Visual Tracking

1 code implementation19 Dec 2016 Zhizhen Chi, Hongyang Li, Huchuan Lu, Ming-Hsuan Yang

In this paper, we propose a dual network to better utilize features among layers for visual tracking.

Visual Tracking

Visual Tracking via Shallow and Deep Collaborative Model

no code implementations27 Jul 2016 Bohan Zhuang, Lijun Wang, Huchuan Lu

In the discriminative model, we exploit the advances of deep learning architectures to learn generic features which are robust to both background clutters and foreground appearance variations.

Incremental Learning Visual Tracking

Sample-Specific SVM Learning for Person Re-Identification

no code implementations CVPR 2016 Ying Zhang, Baohua Li, Huchuan Lu, Atsushi Irie, Xiang Ruan

Person re-identification addresses the problem of matching people across disjoint camera views and extensive efforts have been made to seek either the robust feature representation or the discriminative matching metrics.

Dictionary Learning imbalanced classification +1

STCT: Sequentially Training Convolutional Networks for Visual Tracking

no code implementations CVPR 2016 Lijun Wang, Wanli Ouyang, Xiaogang Wang, Huchuan Lu

To further improve the robustness of each base learner, we propose to train the convolutional layers with random binary masks, which serves as a regularization to enforce each base learner to focus on different input features.

Visual Tracking

Fixation prediction with a combined model of bottom-up saliency and vanishing point

no code implementations6 Dec 2015 Mengyang Feng, Ali Borji, Huchuan Lu

By predicting where humans look in natural scenes, we can understand how they perceive complex natural scenes and prioritize information for further high-level visual processing.

Visual Tracking With Fully Convolutional Networks

no code implementations ICCV 2015 Lijun Wang, Wanli Ouyang, Xiaogang Wang, Huchuan Lu

Instead of treating the convolutional neural network (CNN) as a black-box feature extractor, we conduct an in-depth study on the properties of CNN features offline pre-trained on massive image data and the classification task on ImageNet.

Object Tracking Visual Tracking

LCNN: Low-level Feature Embedded CNN for Salient Object Detection

no code implementations17 Aug 2015 Hongyang Li, Huchuan Lu, Zhe Lin, Xiaohui Shen, Brian Price

In this paper, we propose a novel deep neural network framework embedded with low-level features (LCNN) for salient object detection in complex images.

object-detection RGB Salient Object Detection +1

Salient Object Detection via Bootstrap Learning

no code implementations CVPR 2015 Na Tong, Huchuan Lu, Xiang Ruan, Ming-Hsuan Yang

Furthermore, we show that the proposed bootstrap learning approach can be easily applied to other bottom-up saliency models for significant improvement.

Object object-detection +3

Subspace Clustering by Mixture of Gaussian Regression

no code implementations CVPR 2015 Baohua Li, Ying Zhang, Zhouchen Lin, Huchuan Lu

Therefore, we propose Mixture of Gaussian Regression (MoG Regression) for subspace clustering by modeling noise as a Mixture of Gaussians (MoG).

Clustering regression

Saliency Detection via Cellular Automata

no code implementations CVPR 2015 Yao Qin, Huchuan Lu, Yiqun Xu, He Wang

In this paper, we introduce Cellular Automata, a dynamic evolution model, to intuitively detect the salient object.

Saliency Detection

Deep Networks for Saliency Detection via Local Estimation and Global Search

no code implementations CVPR 2015 Lijun Wang, Huchuan Lu, Xiang Ruan, Ming-Hsuan Yang

In the global search stage, the local saliency map together with global contrast and geometric information are used as global features to describe a set of object candidate regions.

Object Saliency Detection

Inner and Inter Label Propagation: Salient Object Detection in the Wild

2 code implementations27 May 2015 Hongyang Li, Huchuan Lu, Zhe Lin, Xiaohui Shen, Brian Price

For most natural images, some boundary superpixels serve as the background labels and the saliency of other superpixels are determined by ranking their similarities to the boundary labels based on an inner propagation scheme.

Computational Efficiency object-detection +4

Visual Tracking via Probability Continuous Outlier Model

no code implementations CVPR 2014 Dong Wang, Huchuan Lu

In this paper, we present a novel online visual tracking method based on linear representation.

Visual Tracking

Least Soft-Threshold Squares Tracking

no code implementations CVPR 2013 Dong Wang, Huchuan Lu, Ming-Hsuan Yang

In this paper, we propose a generative tracking method based on a novel robust linear regression algorithm.
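
The abstract gives no details, so the sketch below shows only a generic "least soft-threshold squares"-style robust regression: alternate a least-squares fit of the coefficients with soft-thresholding of an explicit outlier term. It is a plain illustration under assumed notation (y ≈ Dc + e), not necessarily the paper's exact algorithm.

```python
import torch

def soft_threshold(x, lam):
    """Element-wise soft-thresholding operator."""
    return torch.sign(x) * torch.clamp(torch.abs(x) - lam, min=0.0)

def robust_linear_coding(D, y, lam=0.1, iters=20):
    """Generic robust linear regression with an explicit outlier term,
    y ≈ D c + e: alternate a least-squares fit of c with soft-thresholding of e."""
    e = torch.zeros_like(y)
    for _ in range(iters):
        c = torch.linalg.lstsq(D, (y - e).unsqueeze(1)).solution.squeeze(1)
        e = soft_threshold(y - D @ c, lam)
    return c, e
```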
