Search Results for author: Huchuan Lu

Found 177 papers, 103 papers with code

CLIFFNet for Monocular Depth Estimation with Hierarchical Embedding Loss

no code implementations ECCV 2020 Lijun Wang, Jianming Zhang, Yifan Wang, Huchuan Lu, Xiang Ruan

This paper proposes a hierarchical loss for monocular depth estimation, which measures the differences between the prediction and ground truth in hierarchical embedding spaces of depth maps.

Monocular Depth Estimation
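
The entry above only names the loss; as a rough, hedged sketch of the general idea (comparing prediction and ground truth in feature spaces of depth maps rather than in pixel space), one could write something like the following. The tiny DepthEncoder and its weights are illustrative placeholders, not the hierarchical embedding networks used in the paper.

```python
import torch
import torch.nn as nn

class DepthEncoder(nn.Module):
    """Hypothetical frozen encoder that maps a depth map to features at
    several levels; stands in for the hierarchical embedding networks."""
    def __init__(self):
        super().__init__()
        self.level1 = nn.Sequential(nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU())
        self.level2 = nn.Sequential(nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.level3 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())

    def forward(self, depth):
        f1 = self.level1(depth)
        f2 = self.level2(f1)
        f3 = self.level3(f2)
        return [f1, f2, f3]

def hierarchical_embedding_loss(encoder, pred_depth, gt_depth, weights=(1.0, 1.0, 1.0)):
    """Sum of weighted L2 distances between embeddings of the predicted and
    ground-truth depth maps at every level (illustrative loss only)."""
    with torch.no_grad():
        gt_feats = encoder(gt_depth)        # no gradient needed for the target
    pred_feats = encoder(pred_depth)        # gradients flow back to the prediction
    loss = pred_depth.new_zeros(())
    for w, fp, fg in zip(weights, pred_feats, gt_feats):
        loss = loss + w * torch.mean((fp - fg) ** 2)
    return loss
```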

Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters

1 code implementation18 Mar 2024 Jiazuo Yu, Yunzhi Zhuge, Lu Zhang, Dong Wang, Huchuan Lu, You He

Continual learning can empower vision-language models to continuously acquire new knowledge, without the need for access to the entire historical dataset.
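
The snippet above covers only the motivation; as a loose illustration of the mechanism named in the title, a mixture-of-experts adapter could look roughly like the layer below. The dimensions, router, expert count, and residual placement are assumptions for illustration, not the authors' design.

```python
import torch
import torch.nn as nn

class MoEAdapter(nn.Module):
    """Illustrative mixture-of-experts adapter: a router softly combines the
    outputs of several small bottleneck experts and adds them residually."""
    def __init__(self, dim=768, bottleneck=64, num_experts=4):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, bottleneck), nn.GELU(), nn.Linear(bottleneck, dim))
            for _ in range(num_experts)
        ])

    def forward(self, x):                                   # x: (batch, tokens, dim)
        gate = torch.softmax(self.router(x), dim=-1)        # (B, T, num_experts)
        expert_out = torch.stack([e(x) for e in self.experts], dim=-1)  # (B, T, dim, E)
        mixed = (expert_out * gate.unsqueeze(2)).sum(dim=-1)
        return x + mixed                                    # residual adapter output
```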

Magic Tokens: Select Diverse Tokens for Multi-modal Object Re-Identification

1 code implementation15 Mar 2024 Pingping Zhang, Yuhao Wang, Yang Liu, Zhengzheng Tu, Huchuan Lu

To address the above issues, we propose a novel learning framework named EDITOR to select diverse tokens from vision Transformers for multi-modal object ReID.

Object

Multi-modal Instruction Tuned LLMs with Fine-grained Visual Perception

no code implementations5 Mar 2024 Junwen He, Yifan Wang, Lijun Wang, Huchuan Lu, Jun-Yan He, Jin-Peng Lan, Bin Luo, Xuansong Xie

Multimodal Large Language Models (MLLMs) leverage Large Language Models as a cognitive framework for diverse visual-language tasks.

Language Modelling Large Language Model +2

Spectrum-guided Feature Enhancement Network for Event Person Re-Identification

no code implementations2 Feb 2024 Hongchen Tan, Yi Zhang, Xiuping Liu, BaoCai Yin, Nan Ma, Xin Li, Huchuan Lu

This network consists of two innovative components: the Multi-grain Spectrum Attention Mechanism (MSAM) and the Consecutive Patch Dropout Module (CPDM).

Person Re-Identification

StableIdentity: Inserting Anybody into Anywhere at First Sight

1 code implementation29 Jan 2024 Qinghe Wang, Xu Jia, Xiaomin Li, Taiqing Li, Liqian Ma, Yunzhi Zhuge, Huchuan Lu

We believe that the proposed StableIdentity is an important step to unify image, video, and 3D customized generation models.

PsySafe: A Comprehensive Framework for Psychological-based Attack, Defense, and Evaluation of Multi-agent System Safety

1 code implementation22 Jan 2024 Zaibin Zhang, Yongting Zhang, Lijun Li, Hongzhi Gao, Lijun Wang, Huchuan Lu, Feng Zhao, Yu Qiao, Jing Shao

In this paper, we explore these concerns through the innovative lens of agent psychology, revealing that the dark psychological states of agents constitute a significant threat to safety.

Part Representation Learning with Teacher-Student Decoder for Occluded Person Re-identification

1 code implementation15 Dec 2023 Shang Gao, Chenyang Yu, Pingping Zhang, Huchuan Lu

In addition, existing occluded person ReID benchmarks utilize occluded samples as queries, which will amplify the role of alleviating occlusion interference and underestimate the impact of the feature absence issue.

Human Parsing Long-range modeling +2

TF-CLIP: Learning Text-free CLIP for Video-based Person Re-Identification

1 code implementation15 Dec 2023 Chenyang Yu, Xuehu Liu, Yingquan Wang, Pingping Zhang, Huchuan Lu

Technically, TMC allows the frame-level memories in a sequence to communicate with each other, and to extract temporal information based on the relations within the sequence.

Cross-Modal Retrieval Video-Based Person Re-Identification

TOP-ReID: Multi-spectral Object Re-Identification with Token Permutation

1 code implementation15 Dec 2023 Yuhao Wang, Xuehu Liu, Pingping Zhang, Hu Lu, Zhengzheng Tu, Huchuan Lu

In addition, most current Transformer-based ReID methods only utilize the global feature of class tokens to achieve holistic retrieval, ignoring the local discriminative ones.

Towards Automatic Power Battery Detection: New Challenge, Benchmark Dataset and Baseline

1 code implementation5 Dec 2023 Xiaoqi Zhao, Youwei Pang, Zhenyu Chen, Qian Yu, Lihe Zhang, Hanqi Liu, Jiaming Zuo, Huchuan Lu

We conduct a comprehensive study on a new task named power battery detection (PBD), which aims to localize the dense cathode and anode plates endpoints from X-ray images to evaluate the quality of power batteries.

Crowd Counting object-detection +2

TrackDiffusion: Multi-object Tracking Data Generation via Diffusion Models

no code implementations1 Dec 2023 Pengxiang Li, Zhili Liu, Kai Chen, Lanqing Hong, Yunzhi Zhuge, Dit-yan Yeung, Huchuan Lu, Xu Jia

Diffusion models have gained prominence in generating data for perception tasks such as image classification and object detection.

Image Classification Multi-Object Tracking +3

Open-Vocabulary Camouflaged Object Segmentation

no code implementations19 Nov 2023 Youwei Pang, Xiaoqi Zhao, Jiaming Zuo, Lihe Zhang, Huchuan Lu

To fill in the gaps, we introduce a new task, open-vocabulary camouflaged object segmentation (OVCOS), and construct a large-scale complex scene dataset (OVCamo) containing 11,483 hand-selected images with fine annotations and corresponding object classes.

Camouflaged Object Segmentation Image Segmentation +4

ZoomNeXt: A Unified Collaborative Pyramid Network for Camouflaged Object Detection

1 code implementation31 Oct 2023 Youwei Pang, Xiaoqi Zhao, Tian-Zhu Xiang, Lihe Zhang, Huchuan Lu

Apart from the high intrinsic similarity between camouflaged objects and their background, objects are usually diverse in scale, fuzzy in appearance, and even severely occluded.

Camouflaged Object Segmentation

TransY-Net: Learning Fully Transformer Networks for Change Detection of Remote Sensing Images

1 code implementation22 Oct 2023 Tianyu Yan, Zifu Wan, Pingping Zhang, Gong Cheng, Huchuan Lu

To relieve these issues, in this work we propose a novel Transformer-based learning framework named TransY-Net for remote sensing image CD, which improves the feature extraction from a global view and combines multi-level visual features in a pyramid manner.

Change Detection

PixArt-$\alpha$: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

2 code implementations30 Sep 2023 Junsong Chen, Jincheng Yu, Chongjian Ge, Lewei Yao, Enze Xie, Yue Wu, Zhongdao Wang, James Kwok, Ping Luo, Huchuan Lu, Zhenguo Li

We hope PIXART-$\alpha$ will provide new insights to the AIGC community and startups to accelerate building their own high-quality yet low-cost generative models from scratch.

Image Generation Language Modelling

DCPT: Darkness Clue-Prompted Tracking in Nighttime UAVs

1 code implementation19 Sep 2023 Jiawen Zhu, Huayi Tang, Zhi-Qi Cheng, Jun-Yan He, Bin Luo, Shihao Qiu, Shengming Li, Huchuan Lu

To address this, we propose a novel architecture called Darkness Clue-Prompted Tracking (DCPT) that achieves robust UAV tracking at night by efficiently learning to generate darkness clue prompts.

Leveraging the Power of Data Augmentation for Transformer-based Tracking

no code implementations15 Sep 2023 Jie Zhao, Johan Edstedt, Michael Felsberg, Dong Wang, Huchuan Lu

Due to long-distance correlation and powerful pretrained models, transformer-based methods have initiated a breakthrough in visual object tracking performance.

Data Augmentation Visual Object Tracking

UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory

1 code implementation28 Aug 2023 Haiwen Diao, Bo Wan, Ying Zhang, Xu Jia, Huchuan Lu, Long Chen

Parameter-efficient transfer learning (PETL), i.e., fine-tuning a small portion of parameters, is an effective strategy for adapting pre-trained models to downstream domains.

Question Answering Retrieval +5

CiteTracker: Correlating Image and Text for Visual Tracking

1 code implementation ICCV 2023 Xin Li, Yuqing Huang, Zhenyu He, YaoWei Wang, Huchuan Lu, Ming-Hsuan Yang

Existing visual tracking methods typically take an image patch as the reference of the target to perform tracking.

Attribute Descriptive +2

Isomer: Isomerous Transformer for Zero-shot Video Object Segmentation

1 code implementation ICCV 2023 Yichen Yuan, Yifan Wang, Lijun Wang, Xiaoqi Zhao, Huchuan Lu, Yu Wang, Weibo Su, Lei Zhang

Recent leading zero-shot video object segmentation (ZVOS) works are devoted to integrating appearance and motion information by elaborately designing feature fusion modules and identically applying them in multiple feature stages.

Semantic Segmentation Video Object Segmentation +2

Exploring Transformers for Open-world Instance Segmentation

no code implementations ICCV 2023 Jiannan Wu, Yi Jiang, Bin Yan, Huchuan Lu, Zehuan Yuan, Ping Luo

Open-world instance segmentation is a rising task, which aims to segment all objects in the image by learning from a limited number of base-category objects.

Contrastive Learning Open-World Instance Segmentation +1

Recurrent Multi-scale Transformer for High-Resolution Salient Object Detection

1 code implementation7 Aug 2023 Xinhao Deng, Pingping Zhang, Wei Liu, Huchuan Lu

To address the above issues, in this work we first propose a new HRS10K dataset, which contains 10,500 high-quality annotated images at 2K-8K resolution.

object-detection Object Detection +1

Video-based Person Re-identification with Long Short-Term Representation Learning

no code implementations7 Aug 2023 Xuehu Liu, Pingping Zhang, Huchuan Lu

Meanwhile, to extract short-term representations, we propose a Bi-direction Motion Estimator (BME), in which reciprocal motion information is efficiently extracted from consecutive frames.

Representation Learning Video-Based Person Re-Identification

Tracking Anything in High Quality

1 code implementation26 Jul 2023 Jiawen Zhu, Zhenyu Chen, Zeqi Hao, Shijie Chang, Lu Zhang, Dong Wang, Huchuan Lu, Bin Luo, Jun-Yan He, Jin-Peng Lan, Hanyuan Chen, Chenyang Li

To further improve the quality of tracking masks, a pretrained MR model is employed to refine the tracking results.

Object Semantic Segmentation +3

ComPtr: Towards Diverse Bi-source Dense Prediction Tasks via A Simple yet General Complementary Transformer

1 code implementation23 Jul 2023 Youwei Pang, Xiaoqi Zhao, Lihe Zhang, Huchuan Lu

Specifically, unlike existing methods that over-specialize in a single task or a subset of tasks, ComPtr starts from the more general concept of bi-source dense prediction.

Change Detection Crowd Counting +4

BEV-IO: Enhancing Bird's-Eye-View 3D Detection with Instance Occupancy

no code implementations26 May 2023 Zaibin Zhang, Yuanhang Zhang, Lijun Wang, Yifan Wang, Huchuan Lu

At the core of our method is the newly-designed instance occupancy prediction (IOP) module, which aims to infer point-level occupancy status for each instance in the frustum space.

Neural Image Re-Exposure

1 code implementation23 May 2023 Xinyu Zhang, Hefei Huang, Xu Jia, Dong Wang, Huchuan Lu

In this work, we aim to re-expose the captured photo in post-processing to provide a more flexible way of addressing those issues within a unified framework.

Ranked #4 on Deblurring on GoPro (using extra training data)

Deblurring Joint Deblur and Frame Interpolation +5

Deeply-Coupled Convolution-Transformer with Spatial-temporal Complementary Learning for Video-based Person Re-identification

1 code implementation27 Apr 2023 Xuehu Liu, Chenyang Yu, Pingping Zhang, Huchuan Lu

Further, in the spatial dimension, we propose a Complementary Content Attention (CCA) to take advantage of the coupled structure and guide independent features for spatial complementary learning.

Video-Based Person Re-Identification

MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation

1 code implementation19 Apr 2023 Chongjian Ge, Junsong Chen, Enze Xie, Zhongdao Wang, Lanqing Hong, Huchuan Lu, Zhenguo Li, Ping Luo

These queries are then processed iteratively by a BEV-Evolving decoder, which selectively aggregates deep features from either LiDAR, cameras, or both modalities.

3D Object Detection Autonomous Driving +3

GM-NeRF: Learning Generalizable Model-based Neural Radiance Fields from Multi-view Images

no code implementations CVPR 2023 Jianchuan Chen, Wentao Yi, Liqian Ma, Xu Jia, Huchuan Lu

The results demonstrate that our approach outperforms state-of-the-art methods in terms of novel view synthesis and geometric reconstruction.

Neural Rendering Novel View Synthesis

ARKitTrack: A New Diverse Dataset for Tracking Using Mobile RGB-D Data

1 code implementation CVPR 2023 Haojie Zhao, Junsong Chen, Lijun Wang, Huchuan Lu

Compared with traditional RGB-only visual tracking, few datasets have been constructed for RGB-D tracking.

Visual Tracking

Plug-and-Play Regulators for Image-Text Matching

1 code implementation23 Mar 2023 Haiwen Diao, Ying Zhang, Wei Liu, Xiang Ruan, Huchuan Lu

Exploiting fine-grained correspondence and visual-semantic alignments has shown great potential in image-text matching.

Image Retrieval Image-text matching +1

Visual Prompt Multi-Modal Tracking

1 code implementation CVPR 2023 Jiawen Zhu, Simiao Lai, Xin Chen, Dong Wang, Huchuan Lu

To inherit the powerful representations of the foundation model, a natural modus operandi for multi-modal tracking is full fine-tuning on the RGB-based parameters.

Object Tracking

M$^{2}$SNet: Multi-scale in Multi-scale Subtraction Network for Medical Image Segmentation

2 code implementations20 Mar 2023 Xiaoqi Zhao, Hongpeng Jia, Youwei Pang, Long Lv, Feng Tian, Lihe Zhang, Weibing Sun, Huchuan Lu

Next, we expand the single-scale SU to the intra-layer multi-scale SU, which can provide the decoder with both pixel-level and structure-level difference information.

Computed Tomography (CT) Image Segmentation +3
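
As a hedged sketch of the subtraction idea summarized above (not the paper's exact module), a subtraction unit can take the element-wise absolute difference of two feature maps and filter it with kernels of several sizes; the channel counts and kernel sizes below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultiScaleSubtractionUnit(nn.Module):
    """Illustrative subtraction unit: the element-wise difference |Fa - Fb|
    is filtered at several kernel sizes and fused back to one feature map."""
    def __init__(self, channels=64, kernel_sizes=(1, 3, 5)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, k, padding=k // 2) for k in kernel_sizes
        ])
        self.fuse = nn.Conv2d(channels * len(kernel_sizes), channels, 1)

    def forward(self, feat_a, feat_b):
        diff = torch.abs(feat_a - feat_b)                    # pixel-level difference
        multi = [branch(diff) for branch in self.branches]   # structure-level context
        return self.fuse(torch.cat(multi, dim=1))
```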

Towards Diverse Binary Segmentation via A Simple yet General Gated Network

1 code implementation18 Mar 2023 Xiaoqi Zhao, Youwei Pang, Lihe Zhang, Huchuan Lu, Lei Zhang

They ignore two key problems when the encoder exchanges information with the decoder: one is the lack of an interference control mechanism between them, and the other is the failure to consider the disparity of the contributions from different encoder levels.

Segmentation Semantic Segmentation

Dual Memory Aggregation Network for Event-Based Object Detection with Learnable Representation

1 code implementation17 Mar 2023 Dongsheng Wang, Xu Jia, Yang Zhang, Xinyu Zhang, Yaoyuan Wang, Ziyang Zhang, Dong Wang, Huchuan Lu

To fully exploit information with event streams to detect objects, a dual-memory aggregation network (DMANet) is proposed to leverage both long and short memory along event streams to aggregate effective information for object detection.

Object object-detection +1

Universal Instance Perception as Object Discovery and Retrieval

1 code implementation CVPR 2023 Bin Yan, Yi Jiang, Jiannan Wu, Dong Wang, Ping Luo, Zehuan Yuan, Huchuan Lu

All instance perception tasks aim at finding certain objects specified by some queries such as category names, language expressions, and target annotations, but this complete field has been split into multiple independent subtasks.

Ranked #1 on Referring Expression Segmentation on RefCOCO val (using extra training data)

Described Object Detection Generalized Referring Expression Comprehension +15

Segment Every Reference Object in Spatial and Temporal Spaces

no code implementations ICCV 2023 Jiannan Wu, Yi Jiang, Bin Yan, Huchuan Lu, Zehuan Yuan, Ping Luo

In this work, we end the current fragmented situation and propose UniRef to unify the three reference-based object segmentation tasks with a single architecture.

Image Segmentation Object +5

Compression-Aware Video Super-Resolution

1 code implementation CVPR 2023 Yingwei Wang, Xu Jia, Xin Tao, Takashi Isobe, Huchuan Lu, Yu-Wing Tai

Videos stored on mobile devices or delivered on the Internet are usually in compressed format with various unknown compression parameters, but most video super-resolution (VSR) methods assume ideal inputs, resulting in a large performance gap between experimental settings and real-world applications.

Model Compression Video Enhancement +1

MetaBEV: Solving Sensor Failures for 3D Detection and Map Segmentation

no code implementations ICCV 2023 Chongjian Ge, Junsong Chen, Enze Xie, Zhongdao Wang, Lanqing Hong, Huchuan Lu, Zhenguo Li, Ping Luo

These queries are then processed iteratively by a BEV-Evolving decoder, which selectively aggregates deep features from either LiDAR, cameras, or both modalities.

3D Object Detection Autonomous Driving +3

HS-Diffusion: Semantic-Mixing Diffusion for Head Swapping

1 code implementation13 Dec 2022 Qinghe Wang, Lijie Liu, Miao Hua, Pengfei Zhu, WangMeng Zuo, QinGhua Hu, Huchuan Lu, Bing Cao

We blend the semantic layouts of source head and source body, and then inpaint the transition region by the semantic layout generator, achieving a coarse-grained head swapping.

Interactive Feature Embedding for Infrared and Visible Image Fusion

no code implementations9 Nov 2022 Fan Zhao, Wenda Zhao, Huchuan Lu

General deep learning-based methods for infrared and visible image fusion rely on the unsupervised mechanism for vital information retention by utilizing elaborately designed loss functions.

Infrared And Visible Image Fusion Self-Supervised Learning

Towards Grand Unification of Object Tracking

1 code implementation14 Jul 2022 Bin Yan, Yi Jiang, Peize Sun, Dong Wang, Zehuan Yuan, Ping Luo, Huchuan Lu

We present a unified method, termed Unicorn, that can simultaneously solve four tracking problems (SOT, MOT, VOS, MOTS) with a single network using the same model parameters.

Multi-Object Tracking Multi-Object Tracking and Segmentation +3

SRRT: Search Region Regulation Tracking

no code implementations10 Jul 2022 Jiawen Zhu, Xin Chen, Pengyu Zhang, Xinying Wang, Dong Wang, Wenda Zhao, Huchuan Lu

Trackers tend to lose the target object when the search region is too limited, or to be interfered with by distractors when the search region is excessive.

Look Back and Forth: Video Super-Resolution with Explicit Temporal Difference Modeling

1 code implementation CVPR 2022 Takashi Isobe, Xu Jia, Xin Tao, Changlin Li, Ruihuang Li, Yongjie Shi, Jing Mu, Huchuan Lu, Yu-Wing Tai

Instead of directly feeding consecutive frames into a VSR model, we propose to compute the temporal difference between frames and divide those pixels into two subsets according to the level of difference.

Motion Compensation Optical Flow Estimation +1
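
A minimal sketch of the frame-splitting step described above, assuming a simple per-pixel threshold on the absolute frame difference; the threshold value and the exact mask definition are illustrative, not the paper's recipe.

```python
import torch

def split_by_temporal_difference(frame_prev, frame_curr, threshold=0.05):
    """Split the current frame's pixels into low-variance (nearly static) and
    high-variance (fast-changing) subsets via a threshold on the absolute
    frame difference. Frames: float tensors of shape (B, C, H, W) in [0, 1]."""
    diff = torch.abs(frame_curr - frame_prev).mean(dim=1, keepdim=True)  # (B, 1, H, W)
    hv_mask = (diff > threshold).float()
    lv_mask = 1.0 - hv_mask
    return frame_curr * lv_mask, frame_curr * hv_mask
```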

Visible-Thermal UAV Tracking: A Large-Scale Benchmark and New Baseline

no code implementations CVPR 2022 Pengyu Zhang, Jie Zhao, Dong Wang, Huchuan Lu, Xiang Ruan

With the popularity of multi-modal sensors, visible-thermal (RGB-T) object tracking aims to achieve robust performance and wider application scenarios with the guidance of objects' temperature information.

Attribute Object Tracking +1

Deeply Interleaved Two-Stream Encoder for Referring Video Segmentation

no code implementations30 Mar 2022 Guang Feng, Lihe Zhang, Zhiwei Hu, Huchuan Lu

To address this task, we first design a two-stream encoder to extract CNN-based visual features and transformer-based linguistic features hierarchically, and a vision-language mutual guidance (VLMG) module is inserted into the encoder multiple times to promote the hierarchical and progressive fusion of multi-modal features.

Referring Expression Segmentation Video Segmentation +2

High-Performance Transformer Tracking

1 code implementation25 Mar 2022 Xin Chen, Bin Yan, Jiawen Zhu, Huchuan Lu, Xiang Ruan, Dong Wang

First, we present a transformer tracking (named TransT) method based on the Siamese-like feature extraction backbone, the designed attention-based fusion mechanism, and the classification and regression head.

Vocal Bursts Intensity Prediction

Efficient Visual Tracking via Hierarchical Cross-Attention Transformer

1 code implementation25 Mar 2022 Xin Chen, Ben Kang, Dong Wang, Dongdong Li, Huchuan Lu

Most state-of-the-art trackers are satisfied with the real-time speed on powerful GPUs.

Visual Tracking

Joint Learning of Salient Object Detection, Depth Estimation and Contour Extraction

1 code implementation9 Mar 2022 Xiaoqi Zhao, Youwei Pang, Lihe Zhang, Huchuan Lu

In this paper, we propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD).

Depth Estimation object-detection +2

Multi-Object Tracking Meets Moving UAV

no code implementations CVPR 2022 Shuai Liu, Xin Li, Huchuan Lu, You He

Multi-object tracking in unmanned aerial vehicle (UAV) videos is an important vision task and can be applied in a wide range of applications.

Multi-Object Tracking Object

Multi-Source Uncertainty Mining for Deep Unsupervised Saliency Detection

no code implementations CVPR 2022 Yifan Wang, Wenbo Zhang, Lijun Wang, Ting Liu, Huchuan Lu

We design an Uncertainty Mining Network (UMNet) which consists of multiple Merge-and-Split (MS) modules to recursively analyze the commonality and difference among multiple noisy labels and infer pixel-wise uncertainty map for each label.

object-detection Object Detection +3

An Informative Tracking Benchmark

1 code implementation13 Dec 2021 Xin Li, Qiao Liu, Wenjie Pei, Qiuhong Shen, YaoWei Wang, Huchuan Lu, Ming-Hsuan Yang

Along with the rapid progress of visual tracking, existing benchmarks become less informative due to redundancy of samples and weak discrimination between current trackers, making evaluations on all datasets extremely time-consuming.

Visual Tracking

CAVER: Cross-Modal View-Mixed Transformer for Bi-Modal Salient Object Detection

1 code implementation4 Dec 2021 Youwei Pang, Xiaoqi Zhao, Lihe Zhang, Huchuan Lu

Most of the existing bi-modal (RGB-D and RGB-T) salient object detection methods utilize the convolution operation and construct complex interwoven fusion structures to achieve cross-modal information integration.

object-detection RGB-D Salient Object Detection +1

MFNet: Multi-filter Directive Network for Weakly Supervised Salient Object Detection

1 code implementation ICCV 2021 Yongri Piao, Jian Wang, Miao Zhang, Huchuan Lu

The multiple accurate cues from multiple DFs are then simultaneously propagated to the saliency network with a multi-guidance loss.

object-detection Object Detection +2

Transformer-based Network for RGB-D Saliency Detection

no code implementations1 Dec 2021 Yue Wang, Xu Jia, Lu Zhang, Yuke Li, James Elder, Huchuan Lu

TFFM conducts a sufficient feature fusion by integrating features from multiple scales and two modalities over all positions simultaneously.

Saliency Detection

MODNet-V: Improving Portrait Video Matting via Background Restoration

1 code implementation24 Sep 2021 Jiayu Sun, Zhanghan Ke, Lihe Zhang, Huchuan Lu, Rynson W. H. Lau

In this work, we observe that instead of asking the user to explicitly provide a background image, we may recover it from the input video itself.

Image Matting Video Matting

To be Critical: Self-Calibrated Weakly Supervised Learning for Salient Object Detection

no code implementations4 Sep 2021 Yongri Piao, Jian Wang, Miao Zhang, Zhengxuan Ma, Huchuan Lu

Despite the success of previous works, explorations of an effective training strategy for the saliency network and accurate matches between image-level annotations and salient objects are still inadequate.

object-detection Object Detection +2

Multi-Source Fusion and Automatic Predictor Selection for Zero-Shot Video Object Segmentation

1 code implementation11 Aug 2021 Xiaoqi Zhao, Youwei Pang, Jiaxing Yang, Lihe Zhang, Huchuan Lu

In this paper, we propose a novel multi-source fusion network for zero-shot video object segmentation.

 Ranked #1 on Video Object Segmentation on FBMS (Jaccard (Mean) metric)

Depth Estimation Object +3

Automatic Polyp Segmentation via Multi-scale Subtraction Network

2 code implementations11 Aug 2021 Xiaoqi Zhao, Lihe Zhang, Huchuan Lu

Keywords: Colorectal Cancer, Automatic Polyp Segmentation, Subtraction, LossNet.

Segmentation

Video Annotation for Visual Tracking via Selection and Refinement

1 code implementation ICCV 2021 Kenan Dai, Jie Zhao, Lijun Wang, Dong Wang, Jianhua Li, Huchuan Lu, Xuesheng Qian, Xiaoyun Yang

Deep learning based visual trackers entail offline pre-training on large volumes of video datasets with accurate bounding box annotations that are labor-expensive to achieve.

Visual Tracking

HAT: Hierarchical Aggregation Transformers for Person Re-identification

1 code implementation13 Jul 2021 Guowen Zhang, Pingping Zhang, Jinqing Qi, Huchuan Lu

In this work, we take advantage of both CNNs and Transformers, and propose a novel learning framework named Hierarchical Aggregation Transformer (HAT) for image-based person Re-ID with high performance.

Person Re-Identification Person Retrieval +1

Animatable Neural Radiance Fields from Monocular RGB Videos

1 code implementation25 Jun 2021 Jianchuan Chen, Ying Zhang, Di Kang, Xuefei Zhe, Linchao Bao, Xu Jia, Huchuan Lu

We present animatable neural radiance fields (animatable NeRF) for detailed human avatar creation from monocular videos.

3D Human Reconstruction Neural Rendering +2

Self-Supervised Tracking via Target-Aware Data Synthesis

no code implementations21 Jun 2021 Xin Li, Wenjie Pei, YaoWei Wang, Zhenyu He, Huchuan Lu, Ming-Hsuan Yang

While deep-learning based tracking methods have achieved substantial progress, they entail large-scale and high-quality annotated data for sufficient training.

Representation Learning Self-Supervised Learning +1

Self-Generated Defocus Blur Detection via Dual Adversarial Discriminators

1 code implementation CVPR 2021 Wenda Zhao, Cai Shang, Huchuan Lu

The core insight is that a defocus blur region/focused clear area can be arbitrarily pasted to a given realistic full blurred image/full clear image without affecting the judgment of the full blurred image/full clear image.

Defocus Blur Detection

Calibrated RGB-D Salient Object Detection

1 code implementation CVPR 2021 Wei Ji, Jingjing Li, Shuang Yu, Miao Zhang, Yongri Piao, Shunyu Yao, Qi Bi, Kai Ma, Yefeng Zheng, Huchuan Lu, Li Cheng

Complex backgrounds and similar appearances between objects and their surroundings are generally recognized as challenging scenarios in Salient Object Detection (SOD).

Object object-detection +3

Multi-Target Domain Adaptation with Collaborative Consistency Learning

no code implementations CVPR 2021 Takashi Isobe, Xu Jia, Shuaijun Chen, Jianzhong He, Yongjie Shi, Jianzhuang Liu, Huchuan Lu, Shengjin Wang

To obtain a single model that works across multiple target domains, we propose to simultaneously learn a student model which is trained not only to imitate the output of each expert on the corresponding target domain, but also to pull different experts close to each other with regularization on their weights.

Multi-target Domain Adaptation Semantic Segmentation +1

Encoder Fusion Network with Co-Attention Embedding for Referring Image Segmentation

no code implementations CVPR 2021 Guang Feng, Zhiwei Hu, Lihe Zhang, Huchuan Lu

In this work, we propose an encoder fusion network (EFN), which transforms the visual encoder into a multi-modal feature learning network, and uses language to refine the multi-modal features progressively.

Image Segmentation Semantic Segmentation

A Video Is Worth Three Views: Trigeminal Transformers for Video-based Person Re-identification

no code implementations5 Apr 2021 Xuehu Liu, Pingping Zhang, Chenyang Yu, Huchuan Lu, Xuesheng Qian, Xiaoyun Yang

To capture richer perceptions and extract more comprehensive video representations, in this paper we propose a novel framework named Trigeminal Transformers (TMT) for video-based person Re-ID.

Video-Based Person Re-Identification

Transformer Tracking

1 code implementation CVPR 2021 Xin Chen, Bin Yan, Jiawen Zhu, Dong Wang, Xiaoyun Yang, Huchuan Lu

The correlation operation is a simple fusion method for considering the similarity between the template and the search region.

Visual Object Tracking Visual Tracking

Watching You: Global-guided Reciprocal Learning for Video-based Person Re-identification

1 code implementation CVPR 2021 Xuehu Liu, Pingping Zhang, Chenyang Yu, Huchuan Lu, Xiaoyun Yang

Specifically, we first propose a Global-guided Correlation Estimation (GCE) to generate feature correlation maps of local features and global features, which help to localize the high- and low-correlation regions for identifying the same person.

Feature Correlation Video-Based Person Re-Identification

Self-Supervised Pretraining for RGB-D Salient Object Detection

1 code implementation29 Jan 2021 Xiaoqi Zhao, Youwei Pang, Lihe Zhang, Huchuan Lu, Xiang Ruan

Existing CNN-based RGB-D salient object detection (SOD) networks are all required to be pretrained on ImageNet to learn the hierarchical features, which helps provide a good initialization.

Object object-detection +3

Neighbor2Neighbor: Self-Supervised Denoising from Single Noisy Images

12 code implementations CVPR 2021 Tao Huang, Songjiang Li, Xu Jia, Huchuan Lu, Jianzhuang Liu

In this paper, we present a very simple yet effective method named Neighbor2Neighbor to train an effective image denoising model with only noisy images.

Image Denoising Self-Supervised Learning
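
The training trick can be sketched as follows: draw two sub-images from one noisy input by picking two different neighbors in every 2x2 cell, then ask the denoiser to map one sub-image to the other. This is a simplified sketch that omits the method's regularization term, and the sampler details below are assumptions.

```python
import torch
import torch.nn.functional as F

def neighbor_subsample(noisy):
    """Draw two half-resolution sub-images from one noisy image by picking two
    different pixels inside every non-overlapping 2x2 cell (H and W assumed even)."""
    b, c, h, w = noisy.shape
    cells = F.unfold(noisy, kernel_size=2, stride=2).reshape(b, c, 4, -1)  # 4 pixels per cell
    n_cells = cells.shape[-1]
    idx1 = torch.randint(0, 4, (b, 1, 1, n_cells), device=noisy.device)
    idx2 = (idx1 + torch.randint(1, 4, (b, 1, 1, n_cells), device=noisy.device)) % 4
    sub1 = torch.gather(cells, 2, idx1.expand(b, c, 1, n_cells)).reshape(b, c, h // 2, w // 2)
    sub2 = torch.gather(cells, 2, idx2.expand(b, c, 1, n_cells)).reshape(b, c, h // 2, w // 2)
    return sub1, sub2

def n2n_reconstruction_loss(denoiser, noisy):
    """Basic Neighbor2Neighbor-style reconstruction loss (regularizer omitted)."""
    sub1, sub2 = neighbor_subsample(noisy)
    return F.mse_loss(denoiser(sub1), sub2)
```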

Similarity Reasoning and Filtration for Image-Text Matching

1 code implementation5 Jan 2021 Haiwen Diao, Ying Zhang, Lin Ma, Huchuan Lu

Image-text matching plays a critical role in bridging the vision and language, and great progress has been made by exploiting the global alignment between image and sentence, or local alignments between regions and words.

Image Retrieval Image-text matching +2

CR-Fill: Generative Image Inpainting With Auxiliary Contextual Reconstruction

1 code implementation ICCV 2021 Yu Zeng, Zhe Lin, Huchuan Lu, Vishal M. Patel

The auxiliary branch (i.e., CR loss) is required only during training, and only the inpainting generator is required during inference.

Image Inpainting

DUT-LFSaliency: Versatile Dataset and Light Field-to-RGB Saliency Detection

no code implementations30 Dec 2020 Yongri Piao, Zhengkun Rong, Shuang Xu, Miao Zhang, Huchuan Lu

The success of learning-based light field saliency detection is heavily dependent on how a comprehensive dataset can be constructed for higher generalizability of models, how high dimensional light field data can be effectively exploited, and how a flexible model can be designed to achieve versatility for desktop computers and mobile devices.

Saliency Detection

Multi-modal Visual Tracking: Review and Experimental Comparison

2 code implementations8 Dec 2020 Pengyu Zhang, Dong Wang, Huchuan Lu

Visual object tracking, as a fundamental task in computer vision, has drawn much attention in recent years.

Rgb-T Tracking Visual Object Tracking

CR-Fill: Generative Image Inpainting with Auxiliary Contextual Reconstruction

1 code implementation25 Nov 2020 Yu Zeng, Zhe Lin, Huchuan Lu, Vishal M. Patel

Due to the lack of supervision signals for the correspondence between missing regions and known regions, it may fail to find proper reference features, which often leads to artifacts in the results.

Image Inpainting

Coherent Loss: A Generic Framework for Stable Video Segmentation

no code implementations25 Oct 2020 Mingyang Qian, Yi Fu, Xiao Tan, YingYing Li, Jinqing Qi, Huchuan Lu, Shilei Wen, Errui Ding

Video segmentation approaches are of great importance for numerous vision tasks especially in video manipulation for entertainment.

Segmentation Semantic Segmentation +2

Accurate RGB-D Salient Object Detection via Collaborative Learning

2 code implementations ECCV 2020 Wei Ji, Jingjing Li, Miao Zhang, Yongri Piao, Huchuan Lu

The explicitly extracted edge information goes together with saliency to give more emphasis to the salient regions and object boundaries.

Object object-detection +5

Multi-scale Interactive Network for Salient Object Detection

1 code implementation CVPR 2020 Youwei Pang, Xiaoqi Zhao, Lihe Zhang, Huchuan Lu

To obtain more efficient multi-scale features from the integrated features, the self-interaction modules are embedded in each decoder unit.

Object object-detection +2

A Single Stream Network for Robust and Real-time RGB-D Salient Object Detection

1 code implementation ECCV 2020 Xiaoqi Zhao, Lihe Zhang, Youwei Pang, Huchuan Lu, Lei Zhang

In this work, we design a single stream network to directly use the depth map to guide early fusion and middle fusion between RGB and depth, which saves the feature encoder of the depth stream and achieves a lightweight and real-time model.

object-detection RGB-D Salient Object Detection +3

Jointly Modeling Motion and Appearance Cues for Robust RGB-T Tracking

no code implementations4 Jul 2020 Pengyu Zhang, Jie Zhao, Dong Wang, Huchuan Lu, Xiaoyun Yang

In this study, we propose a novel RGB-T tracking framework by jointly modeling both appearance and motion cues.

Rgb-T Tracking

Alpha-Refine: Boosting Tracking Performance by Precise Bounding Box Estimation

1 code implementation4 Jul 2020 Bin Yan, Dong Wang, Huchuan Lu, Xiaoyun Yang

In recent years, the multiple-stage strategy has become a popular trend for visual tracking.

Visual Tracking

Synergistic saliency and depth prediction for RGB-D saliency detection

no code implementations3 Jul 2020 Yue Wang, Yuke Li, James H. Elder, Huchuan Lu, Runmin Wu, Lu Zhang

Evaluation on seven RGB-D datasets demonstrates that, even without saliency ground truth for RGB-D datasets and using only the RGB data of RGB-D datasets at inference, our semi-supervised system performs favorably against state-of-the-art fully-supervised RGB-D saliency detection methods that use saliency ground truth for RGB-D datasets at training and depth data at inference on the two largest testing datasets.

Depth Estimation Depth Prediction +1

Pose-guided Visible Part Matching for Occluded Person ReID

1 code implementation CVPR 2020 Shang Gao, Jingya Wang, Huchuan Lu, Zimo Liu

Occluded person re-identification is a challenging task as the appearance varies substantially with various obstacles, especially in the crowd scenario.

Graph Matching Person Re-Identification

High-Performance Long-Term Tracking with Meta-Updater

2 code implementations CVPR 2020 Kenan Dai, Yunhua Zhang, Dong Wang, Jianhua Li, Huchuan Lu, Xiaoyun Yang

Most top-ranked long-term trackers adopt offline-trained Siamese architectures; thus, they cannot benefit from the great progress of short-term trackers with online update.

Visual Object Tracking Visual Tracking +1

Cooling-Shrinking Attack: Blinding the Tracker with Imperceptible Noises

1 code implementation CVPR 2020 Bin Yan, Dong Wang, Huchuan Lu, Xiaoyun Yang

An effective and efficient perturbation generator is trained with a carefully designed adversarial loss, which can simultaneously cool hot regions where the target exists on the heatmaps and force the predicted bounding box to shrink, making the tracked target invisible to trackers.

Adversarial Attack
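
As an illustration of the two loss terms described above (not the exact objective of the paper), a "cooling" term can suppress heatmap responses inside the target region while a "shrinking" term penalizes the predicted box size; the tensor layouts and weights below are hypothetical.

```python
import torch

def cooling_shrinking_loss(heatmap, target_mask, pred_box_wh, alpha=1.0, beta=1.0):
    """Illustrative adversarial objective for a perturbation generator:
    'cool' heatmap responses where the target lies and encourage the predicted
    box to shrink. heatmap/target_mask: (B, 1, H, W); pred_box_wh: (B, 2)."""
    cooling = (heatmap * target_mask).sum() / target_mask.sum().clamp(min=1.0)
    shrinking = pred_box_wh.mean()
    return alpha * cooling + beta * shrinking
```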

When Relation Networks meet GANs: Relation GANs with Triplet Loss

1 code implementation24 Feb 2020 Runmin Wu, Kunyao Zhang, Lijun Wang, Yue Wang, Pingping Zhang, Huchuan Lu, Yizhou Yu

Though recent research has achieved remarkable progress in generating realistic images with generative adversarial networks (GANs), the lack of training stability is still a lingering concern of most GANs, especially on high-resolution inputs and complex datasets.

Conditional Image Generation Relation +2

Reverse Attention-Based Residual Network for Salient Object Detection

6 code implementations IEEE Transactions on Image Processing 2020 Shuhan Chen, Xiuli Tan, Ben Wang, Huchuan Lu, Xuelong Hu, Yun Fu

Benefiting from the quick development of deep convolutional neural networks, especially fully convolutional neural networks (FCNs), remarkable progress has been achieved on salient object detection recently.

Object object-detection +2
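
The abstract snippet above is general, but the reverse-attention idea named in the title can be sketched as re-weighting side features by the complement of an up-sampled coarse prediction, so later stages focus on regions not yet captured; the shapes and interpolation settings below are assumptions.

```python
import torch
import torch.nn.functional as F

def reverse_attention(side_feature, coarse_pred):
    """Re-weight side features by 1 - sigmoid(coarse prediction) so that the
    network attends to regions the coarser stage has not yet marked salient.
    side_feature: (B, C, H, W); coarse_pred: (B, 1, h, w) logits."""
    pred = F.interpolate(coarse_pred, size=side_feature.shape[2:],
                         mode='bilinear', align_corners=False)
    attn = 1.0 - torch.sigmoid(pred)      # reverse attention weights
    return side_feature * attn
```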

Memory-oriented Decoder for Light Field Salient Object Detection

1 code implementation NeurIPS 2019 Miao Zhang, Jingjing Li, Ji Wei, Yongri Piao, Huchuan Lu

In this paper, we present a deep-learning-based method where a novel memory-oriented decoder is tailored for light field saliency detection.

object-detection RGB Salient Object Detection +2

ROI Pooled Correlation Filters for Visual Tracking

1 code implementation CVPR 2019 Yuxuan Sun, Chong Sun, Dong Wang, You He, Huchuan Lu

The ROI (region-of-interest) based pooling method performs pooling operations on the cropped ROI regions for various samples and has shown great success in object detection methods.

object-detection Object Detection +1

Deep Multiphase Level Set for Scene Parsing

no code implementations8 Oct 2019 Pingping Zhang, Wei Liu, Yinjie Lei, Hongyu Wang, Huchuan Lu

The proposed method consists of three modules, i.e., recurrent FCNs, adaptive multiphase level set, and deeply supervised learning.

Image Segmentation Scene Parsing +1

GradNet: Gradient-Guided Network for Visual Object Tracking

2 code implementations ICCV 2019 Peixia Li, Bo-Yu Chen, Wanli Ouyang, Dong Wang, Xiaoyun Yang, Huchuan Lu

In this work, we propose a novel gradient-guided network to exploit the discriminative information in gradients and update the template in the siamese network through feed-forward and backward operations.

Ranked #3 on Visual Object Tracking on OTB-2015 (Precision metric)

Object Template Matching +2
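
A hedged sketch of the gradient-guided update described above: run one forward pass with the current template, backpropagate a tracking loss, and use the resulting gradient to refine the template feature. The similarity_fn, target_label, and learning rate below are hypothetical stand-ins, not the paper's network.

```python
import torch
import torch.nn.functional as F

def gradient_guided_template_update(template_feat, search_feat, similarity_fn,
                                    target_label, lr=0.01):
    """Hedged sketch: refine the template feature with the gradient of a
    tracking loss from one forward/backward pass on the current frame.
    similarity_fn and target_label are hypothetical stand-ins."""
    template = template_feat.clone().requires_grad_(True)
    response = similarity_fn(template, search_feat)   # e.g. a cross-correlation response map
    loss = F.mse_loss(response, target_label)
    grad, = torch.autograd.grad(loss, template)
    return (template - lr * grad).detach()
```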

'Skimming-Perusal' Tracking: A Framework for Real-Time and Robust Long-term Tracking

1 code implementation ICCV 2019 Bin Yan, Haojie Zhao, Dong Wang, Huchuan Lu, Xiaoyun Yang

In this work, we present a novel robust and real-time long-term tracking framework based on the proposed skimming and perusal modules.

Towards High-Resolution Salient Object Detection

1 code implementation ICCV 2019 Yi Zeng, Pingping Zhang, Jianming Zhang, Zhe Lin, Huchuan Lu

This paper pushes forward high-resolution saliency detection, and contributes a new dataset, named High-Resolution Salient Object Detection (HRSOD).

Ranked #10 on RGB Salient Object Detection on DAVIS-S (using extra training data)

Object object-detection +4

Cascaded Context Pyramid for Full-Resolution 3D Semantic Scene Completion

no code implementations ICCV 2019 Pingping Zhang, Wei Liu, Yinjie Lei, Huchuan Lu, Xiaoyun Yang

To address these issues, in this work we propose a novel deep learning framework, named Cascaded Context Pyramid Network (CCPNet), to jointly infer the occupancy and semantic labels of a volumetric 3D scene from a single depth image.

Ranked #5 on 3D Semantic Scene Completion on NYUv2 (using extra training data)

3D Semantic Scene Completion

Multi-source weak supervision for saliency detection

1 code implementation CVPR 2019 Yu Zeng, Yunzhi Zhuge, Huchuan Lu, Lihe Zhang, Mingyang Qian, Yizhou Yu

To this end, we propose a unified framework to train saliency detection models with diverse weak supervision sources.

Saliency Prediction

Salient Object Detection with Lossless Feature Reflection and Weighted Structural Loss

no code implementations21 Jan 2019 Pingping Zhang, Wei Liu, Huchuan Lu, Chunhua Shen

Inspired by the intrinsic reflection of natural images, in this paper we propose a novel feature learning framework for large-scale salient object detection.

Object object-detection +3

Global and Local Sensitivity Guided Key Salient Object Re-augmentation for Video Saliency Detection

no code implementations19 Nov 2018 Ziqi Zhou, Zheng Wang, Huchuan Lu, Song Wang, Meijun Sun

In this paper, based on the fact that salient areas in videos are relatively small and concentrated, we propose a key salient object re-augmentation method (KSORA) using top-down semantic knowledge and bottom-up feature guidance to improve detection accuracy in video scenes.

Decision Making feature selection +2

DeepLens: Shallow Depth Of Field From A Single Image

no code implementations18 Oct 2018 Lijun Wang, Xiaohui Shen, Jianming Zhang, Oliver Wang, Zhe Lin, Chih-Yao Hsieh, Sarah Kong, Huchuan Lu

To achieve this, we propose a novel neural network model comprised of a depth prediction module, a lens blur module, and a guided upsampling module.

Depth Estimation Depth Prediction

Boundary-guided Feature Aggregation Network for Salient Object Detection

no code implementations28 Sep 2018 Yunzhi Zhuge, Pingping Zhang, Huchuan Lu

Fully convolutional networks (FCNs) have significantly improved the performance of many pixel-labeling tasks, such as semantic segmentation and depth estimation.

Depth Estimation Object +4

Learning regression and verification networks for long-term visual tracking

3 code implementations12 Sep 2018 Yunhua Zhang, Dong Wang, Lijun Wang, Jinqing Qi, Huchuan Lu

Compared with short-term tracking, the long-term tracking task requires determining whether the tracked object is present or absent, and then estimating the accurate bounding box if present or conducting image-wide re-detection if absent.

General Classification Object +3

Real-time 'Actor-Critic' Tracking

no code implementations ECCV 2018 Boyu Chen, Dong Wang, Peixia Li, Shuang Wang, Huchuan Lu

In this work, we propose a novel tracking algorithm with real-time performance based on the ‘Actor-Critic’ framework.

Visual Tracking

Structured Siamese Network for Real-Time Visual Tracking

no code implementations ECCV 2018 Yunhua Zhang, Lijun Wang, Jinqing Qi, Dong Wang, Mengyang Feng, Huchuan Lu

In this paper, we circumvent this issue by proposing a local structure learning method, which simultaneously considers the local patterns of the target and their structural relationships for more accurate target tracking.

Real-Time Visual Tracking

Troy: Give Attention to Saliency and for Saliency

no code implementations4 Aug 2018 Pingping Zhang, Huchuan Lu, Chunhua Shen

In addition, our work has text overlap with arXiv:1804.06242, arXiv:1705.00938 by other authors.

Progressive Attention Guided Recurrent Network for Salient Object Detection

1 code implementation CVPR 2018 Xiaoning Zhang, Tiantian Wang, Jinqing Qi, Huchuan Lu, Gang Wang

In this paper, we propose a novel attention guided network which selectively integrates multi-level contextual information in a progressive manner.

Ranked #11 on RGB Salient Object Detection on DUTS-TE (max F-measure metric)

Object object-detection +3

Learning to Promote Saliency Detectors

1 code implementation CVPR 2018 Yu Zeng, Huchuan Lu, Lihe Zhang, Mengyang Feng, Ali Borji

The categories and appearance of salient objects vary from image to image; therefore, saliency detection is an image-specific task.

Saliency Detection Small Data Image Classification +1

Detect Globally, Refine Locally: A Novel Approach to Saliency Detection

no code implementations CVPR 2018 Tiantian Wang, Lihe Zhang, Shuo Wang, Huchuan Lu, Gang Yang, Xiang Ruan, Ali Borji

Moreover, to effectively recover object boundaries, we propose a local Boundary Refinement Network (BRN) to adaptively learn the local contextual information for each spatial position.

object-detection RGB Salient Object Detection +2

Defocus Blur Detection via Multi-Stream Bottom-Top-Bottom Fully Convolutional Network

no code implementations CVPR 2018 Wenda Zhao, Fan Zhao, Dong Wang, Huchuan Lu

To address these issues, we propose a multi-stream bottom-top-bottom fully convolutional network (BTBNet), which is the first attempt to develop an end-to-end deep network for DBD.

Defocus Blur Detection Defocus Estimation

Correlation Tracking via Joint Discrimination and Reliability Learning

1 code implementation CVPR 2018 Chong Sun, Dong Wang, Huchuan Lu, Ming-Hsuan Yang

To address this issue, we propose a novel CF-based optimization problem to jointly model the discrimination and reliability information.

Visual Tracking

HyperFusion-Net: Densely Reflective Fusion for Salient Object Detection

no code implementations14 Apr 2018 Pingping Zhang, Huchuan Lu, Chunhua Shen

Salient object detection (SOD), which aims to find the most important region of interest and segment the relevant object/item in that area, is an important yet challenging vision task.

object-detection RGB Salient Object Detection +1

Video Person Re-identification by Temporal Residual Learning

no code implementations22 Feb 2018 Ju Dai, Pingping Zhang, Huchuan Lu, Hongyu Wang

In this paper, we propose a novel feature learning framework for video person re-identification (re-ID).

Video-Based Person Re-Identification

Non-rigid Object Tracking via Deep Multi-scale Spatial-temporal Discriminative Saliency Maps

no code implementations22 Feb 2018 Pingping Zhang, Wei Liu, Dong Wang, Yinjie Lei, Hongyu Wang, Chunhua Shen, Huchuan Lu

Extensive experiments demonstrate that the proposed algorithm achieves competitive performance in both saliency detection and visual tracking, especially outperforming other related trackers on the non-rigid object tracking datasets.

Object Object Tracking +2

Unsupervised Band Selection of Hyperspectral Images via Multi-dictionary Sparse Representation

no code implementations20 Feb 2018 Fei Li, Pingping Zhang, Huchuan Lu

Band selection is a direct and effective method to remove redundant information and reduce the spectral dimension for decreasing computational complexity and avoiding the curse of dimensionality.

Dictionary Learning General Classification +1

Salient Object Detection by Lossless Feature Reflection

no code implementations19 Feb 2018 Pingping Zhang, Wei Liu, Huchuan Lu, Chunhua Shen

Inspired by the intrinsic reflection of natural images, in this paper we propose a novel feature learning framework for large-scale salient object detection.

Object object-detection +3

A Stagewise Refinement Model for Detecting Salient Objects in Images

1 code implementation ICCV 2017 Tiantian Wang, Ali Borji, Lihe Zhang, Pingping Zhang, Huchuan Lu

To remedy this problem, here we propose to augment feedforward neural networks with a novel pyramid pooling module and a multi-stage refinement mechanism for saliency detection.

Ranked #13 on RGB Salient Object Detection on DUTS-TE (max F-measure metric)

object-detection RGB Salient Object Detection +2

Stepwise Metric Promotion for Unsupervised Video Person Re-Identification

no code implementations ICCV 2017 Zimo Liu, Dong Wang, Huchuan Lu

The intensive annotation cost and the rich but unlabeled data contained in videos motivate us to propose an unsupervised video-based person re-identification (re-ID) method.

Retrieval Video-Based Person Re-Identification

Statistics of Deep Generated Images

no code implementations9 Aug 2017 Yu Zeng, Huchuan Lu, Ali Borji

Here, we explore the low-level statistics of images generated by state-of-the-art deep generative models.

Generative Adversarial Network

An Unsupervised Game-Theoretic Approach to Saliency Detection

no code implementations8 Aug 2017 Yu Zeng, Huchuan Lu, Ali Borji, Mengyang Feng

Saliency maps are generated according to each region's strategy in the Nash equilibrium of the proposed Saliency Game.

object-detection RGB Salient Object Detection +2

Amulet: Aggregating Multi-level Convolutional Features for Salient Object Detection

1 code implementation ICCV 2017 Pingping Zhang, Dong Wang, Huchuan Lu, Hongyu Wang, Xiang Ruan

In addition, to achieve accurate boundary inference and semantic enhancement, edge-aware feature maps in low-level layers and the predicted results of low resolution features are recursively embedded into the learning framework.

Ranked #19 on RGB Salient Object Detection on DUTS-TE (max F-measure metric)

Object object-detection +2

Learning Spatial-Aware Regressions for Visual Tracking

1 code implementation CVPR 2018 Chong Sun, Dong Wang, Huchuan Lu, Ming-Hsuan Yang

Second, we propose a fully convolutional neural network with spatially regularized kernels, through which the filter kernel corresponding to each output channel is forced to focus on a specific region of the target.

regression Visual Object Tracking +1

Deep Mutual Learning

8 code implementations CVPR 2018 Ying Zhang, Tao Xiang, Timothy M. Hospedales, Huchuan Lu

Model distillation is an effective and widely used technique to transfer knowledge from a teacher to a student network.

Person Re-Identification
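
The mutual-learning objective is simple enough to sketch: each network in the cohort minimizes its own cross-entropy plus a KL term pulling its predictions toward its peer's. The two-network version below follows the commonly cited formulation (temperature and loss weights omitted) and is illustrative rather than a faithful reproduction.

```python
import torch
import torch.nn.functional as F

def mutual_learning_losses(logits_a, logits_b, labels):
    """Deep-mutual-learning style losses for a two-network cohort: each model
    gets cross-entropy on the labels plus a KL term towards its peer."""
    ce_a = F.cross_entropy(logits_a, labels)
    ce_b = F.cross_entropy(logits_b, labels)
    # KL(peer || self); the peer's distribution is treated as a fixed target here.
    kl_a = F.kl_div(F.log_softmax(logits_a, dim=1),
                    F.softmax(logits_b, dim=1).detach(), reduction='batchmean')
    kl_b = F.kl_div(F.log_softmax(logits_b, dim=1),
                    F.softmax(logits_a, dim=1).detach(), reduction='batchmean')
    return ce_a + kl_a, ce_b + kl_b
```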

Hierarchical Cellular Automata for Visual Saliency

1 code implementation26 May 2017 Yao Qin, Mengyang Feng, Huchuan Lu, Garrison W. Cottrell

The CCA can act as an efficient pixel-wise aggregation algorithm that can integrate state-of-the-art methods, resulting in even better results.

Saliency Detection

Pose Invariant Embedding for Deep Person Re-identification

no code implementations26 Jan 2017 Liang Zheng, Yujia Huang, Huchuan Lu, Yi Yang

Second, to reduce the impact of pose estimation errors and information loss during PoseBox construction, we design a PoseBox fusion (PBF) CNN architecture that takes the original image, the PoseBox, and the pose estimation confidence as input.

Person Re-Identification Pose Estimation +1

Dual Deep Network for Visual Tracking

1 code implementation19 Dec 2016 Zhizhen Chi, Hongyang Li, Huchuan Lu, Ming-Hsuan Yang

In this paper, we propose a dual network to better utilize features among layers for visual tracking.

Visual Tracking

Visual Tracking via Shallow and Deep Collaborative Model

no code implementations27 Jul 2016 Bohan Zhuang, Lijun Wang, Huchuan Lu

In the discriminative model, we exploit the advances of deep learning architectures to learn generic features which are robust to both background clutters and foreground appearance variations.

Incremental Learning Visual Tracking

Sample-Specific SVM Learning for Person Re-Identification

no code implementations CVPR 2016 Ying Zhang, Baohua Li, Huchuan Lu, Atsushi Irie, Xiang Ruan

Person re-identification addresses the problem of matching people across disjoint camera views and extensive efforts have been made to seek either the robust feature representation or the discriminative matching metrics.

Dictionary Learning imbalanced classification +1

STCT: Sequentially Training Convolutional Networks for Visual Tracking

no code implementations CVPR 2016 Lijun Wang, Wanli Ouyang, Xiaogang Wang, Huchuan Lu

To further improve the robustness of each base learner, we propose to train the convolutional layers with random binary masks, which serves as a regularization to enforce each base learner to focus on different input features.

Visual Tracking

Fixation prediction with a combined model of bottom-up saliency and vanishing point

no code implementations6 Dec 2015 Mengyang Feng, Ali Borji, Huchuan Lu

By predicting where humans look in natural scenes, we can understand how they perceive complex natural scenes and prioritize information for further high-level visual processing.

Visual Tracking With Fully Convolutional Networks

no code implementations ICCV 2015 Lijun Wang, Wanli Ouyang, Xiaogang Wang, Huchuan Lu

Instead of treating the convolutional neural network (CNN) as a black-box feature extractor, we conduct an in-depth study on the properties of CNN features offline pre-trained on massive image data and the classification task on ImageNet.

Object Tracking Visual Tracking

LCNN: Low-level Feature Embedded CNN for Salient Object Detection

no code implementations17 Aug 2015 Hongyang Li, Huchuan Lu, Zhe Lin, Xiaohui Shen, Brian Price

In this paper, we propose a novel deep neural network framework embedded with low-level features (LCNN) for salient object detection in complex images.

object-detection RGB Salient Object Detection +1

Salient Object Detection via Bootstrap Learning

no code implementations CVPR 2015 Na Tong, Huchuan Lu, Xiang Ruan, Ming-Hsuan Yang

Furthermore, we show that the proposed bootstrap learning approach can be easily applied to other bottom-up saliency models for significant improvement.

Object object-detection +3

Subspace Clustering by Mixture of Gaussian Regression

no code implementations CVPR 2015 Baohua Li, Ying Zhang, Zhouchen Lin, Huchuan Lu

Therefore, we propose Mixture of Gaussian Regression (MoG Regression) for subspace clustering by modeling noise as a Mixture of Gaussians (MoG).

Clustering regression

Saliency Detection via Cellular Automata

no code implementations CVPR 2015 Yao Qin, Huchuan Lu, Yiqun Xu, He Wang

In this paper, we introduce Cellular Automata, a dynamic evolution model, to intuitively detect the salient object.

Saliency Detection

Deep Networks for Saliency Detection via Local Estimation and Global Search

no code implementations CVPR 2015 Lijun Wang, Huchuan Lu, Xiang Ruan, Ming-Hsuan Yang

In the global search stage, the local saliency map together with global contrast and geometric information are used as global features to describe a set of object candidate regions.

Object Saliency Detection

Inner and Inter Label Propagation: Salient Object Detection in the Wild

2 code implementations27 May 2015 Hongyang Li, Huchuan Lu, Zhe Lin, Xiaohui Shen, Brian Price

For most natural images, some boundary superpixels serve as the background labels and the saliency of other superpixels are determined by ranking their similarities to the boundary labels based on an inner propagation scheme.

Computational Efficiency object-detection +4

Visual Tracking via Probability Continuous Outlier Model

no code implementations CVPR 2014 Dong Wang, Huchuan Lu

In this paper, we present a novel online visual tracking method based on linear representation.

Visual Tracking

Least Soft-Threshold Squares Tracking

no code implementations CVPR 2013 Dong Wang, Huchuan Lu, Ming-Hsuan Yang

In this paper, we propose a generative tracking method based on a novel robust linear regression algorithm.
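
The abstract gives no details, so the sketch below shows only a generic "least soft-threshold squares"-style robust regression: alternate a least-squares fit of the coefficients with soft-thresholding of an explicit outlier term. It is a plain illustration under assumed notation (y ≈ Dc + e), not necessarily the paper's exact algorithm.

```python
import torch

def soft_threshold(x, lam):
    """Element-wise soft-thresholding operator."""
    return torch.sign(x) * torch.clamp(torch.abs(x) - lam, min=0.0)

def robust_linear_coding(D, y, lam=0.1, iters=20):
    """Generic robust linear regression with an explicit outlier term,
    y ≈ D c + e: alternate a least-squares fit of c with soft-thresholding of e."""
    e = torch.zeros_like(y)
    for _ in range(iters):
        c = torch.linalg.lstsq(D, (y - e).unsqueeze(1)).solution.squeeze(1)
        e = soft_threshold(y - D @ c, lam)
    return c, e
```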
