no code implementations • ECCV 2020 • Lu Zhang, Jianming Zhang, Zhe Lin, Radomír Měch, Huchuan Lu, You He
We reformulate the problem of detecting and tracking salient object spots as a new task called object hotspot tracking.
no code implementations • ECCV 2020 • Lijun Wang, Jianming Zhang, Yifan Wang, Huchuan Lu, Xiang Ruan
This paper proposes a hierarchical loss for monocular depth estimation, which measures the differences between the prediction and ground truth in hierarchical embedding spaces of depth maps.
1 code implementation • ECCV 2020 • Miao Zhang, Sun Xiao Fei, Jie Liu, Shuang Xu, Yongri Piao, Huchuan Lu
In this paper, we propose an asymmetric two-stream architecture taking account of the inherent differences between RGB and depth data for saliency detection.
Ranked #19 on Thermal Image Segmentation on RGB-T-Glass-Segmentation
no code implementations • 26 Mar 2024 • Jiawen Zhu, Xin Chen, Haiwen Diao, Shuai Li, Jun-Yan He, Chenyang Li, Bin Luo, Dong Wang, Huchuan Lu
For instance, DyTrack obtains 64.9% AUC on LaSOT with a speed of 256 fps.
1 code implementation • 18 Mar 2024 • Jiazuo Yu, Yunzhi Zhuge, Lu Zhang, Dong Wang, Huchuan Lu, You He
Continual learning can empower vision-language models to continuously acquire new knowledge, without the need for access to the entire historical dataset.
1 code implementation • 15 Mar 2024 • Pingping Zhang, Yuhao Wang, Yang Liu, Zhengzheng Tu, Huchuan Lu
To address the above issues, we propose a novel learning framework named EDITOR to select diverse tokens from vision Transformers for multi-modal object ReID.
no code implementations • 7 Mar 2024 • Junsong Chen, Chongjian Ge, Enze Xie, Yue Wu, Lewei Yao, Xiaozhe Ren, Zhongdao Wang, Ping Luo, Huchuan Lu, Zhenguo Li
In this paper, we introduce PixArt-Σ, a Diffusion Transformer (DiT) model capable of directly generating images at 4K resolution.
no code implementations • 5 Mar 2024 • Junwen He, Yifan Wang, Lijun Wang, Huchuan Lu, Jun-Yan He, Jin-Peng Lan, Bin Luo, Xuansong Xie
Multimodal Large Language Models (MLLMs) leverage Large Language Models as a cognitive framework for diverse visual-language tasks.
no code implementations • 2 Feb 2024 • Hongchen Tan, Yi Zhang, Xiuping Liu, BaoCai Yin, Nan Ma, Xin Li, Huchuan Lu
This network consists of two innovative components: the Multi-grain Spectrum Attention Mechanism (MSAM) and the Consecutive Patch Dropout Module (CPDM).
1 code implementation • 29 Jan 2024 • Qinghe Wang, Xu Jia, Xiaomin Li, Taiqing Li, Liqian Ma, Yunzhi Zhuge, Huchuan Lu
We believe that the proposed StableIdentity is an important step to unify image, video, and 3D customized generation models.
1 code implementation • 22 Jan 2024 • Zaibin Zhang, Yongting Zhang, Lijun Li, Hongzhi Gao, Lijun Wang, Huchuan Lu, Feng Zhao, Yu Qiao, Jing Shao
In this paper, we explore these concerns through the innovative lens of agent psychology, revealing that the dark psychological states of agents constitute a significant threat to safety.
1 code implementation • 29 Dec 2023 • Jiawen Zhu, Zhi-Qi Cheng, Jun-Yan He, Chenyang Li, Bin Luo, Huchuan Lu, Yifeng Geng, Xuansong Xie
The perception component then generates the tracking results based on the embeddings.
2 code implementations • 25 Dec 2023 • Jiannan Wu, Yi Jiang, Bin Yan, Huchuan Lu, Zehuan Yuan, Ping Luo
We evaluate our unified models on various benchmarks.
1 code implementation • 15 Dec 2023 • Yuhao Wang, Xuehu Liu, Pingping Zhang, Hu Lu, Zhengzheng Tu, Huchuan Lu
In addition, most current Transformer-based ReID methods utilize only the global feature of class tokens for holistic retrieval, ignoring the local discriminative ones.
1 code implementation • 15 Dec 2023 • Shang Gao, Chenyang Yu, Pingping Zhang, Huchuan Lu
In addition, existing occluded person ReID benchmarks utilize occluded samples as queries, which will amplify the role of alleviating occlusion interference and underestimate the impact of the feature absence issue.
1 code implementation • 15 Dec 2023 • Chenyang Yu, Xuehu Liu, Yingquan Wang, Pingping Zhang, Huchuan Lu
Technically, TMC allows the frame-level memories in a sequence to communicate with each other, and to extract temporal information based on the relations within the sequence.
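One common way to let frame-level memories in a sequence communicate, as described above, is plain dot-product self-attention over the time axis. The sketch below illustrates that idea only; the function name and details are assumptions, not the paper's exact design.

```python
import numpy as np

def communicate_memories(memories):
    """Let each frame-level memory aggregate the others via dot-product
    self-attention over the sequence axis (illustrative sketch)."""
    scores = memories @ memories.T                 # (T, T) pairwise relations
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)        # softmax over the sequence
    return attn @ memories                         # temporally fused memories

seq = np.random.default_rng(0).normal(size=(4, 8))  # 4 frames, 8-dim memories
fused = communicate_memories(seq)
```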
1 code implementation • 5 Dec 2023 • Xiaoqi Zhao, Youwei Pang, Zhenyu Chen, Qian Yu, Lihe Zhang, Hanqi Liu, Jiaming Zuo, Huchuan Lu
We conduct a comprehensive study on a new task named power battery detection (PBD), which aims to localize the dense cathode and anode plates endpoints from X-ray images to evaluate the quality of power batteries.
no code implementations • 1 Dec 2023 • Pengxiang Li, Kai Chen, Zhili Liu, Ruiyuan Gao, Lanqing Hong, Guo Zhou, Hua Yao, Dit-yan Yeung, Huchuan Lu, Xu Jia
Despite remarkable achievements in video synthesis, granular control over complex dynamics, such as nuanced movement among multiple interacting objects, remains a significant hurdle for dynamic world modeling. The difficulty is compounded by the need to manage object appearance and disappearance, drastic scale changes, and instance consistency across frames.
no code implementations • 19 Nov 2023 • Youwei Pang, Xiaoqi Zhao, Jiaming Zuo, Lihe Zhang, Huchuan Lu
With the proposed dataset and baseline, we hope that this new task with more practical value can further expand the research on open-vocabulary dense prediction tasks.
1 code implementation • 31 Oct 2023 • Youwei Pang, Xiaoqi Zhao, Tian-Zhu Xiang, Lihe Zhang, Huchuan Lu
Apart from the high intrinsic similarity between camouflaged objects and their background, objects are usually diverse in scale, fuzzy in appearance, and even severely occluded.
Ranked #1 on Camouflaged Object Segmentation on Chameleon
1 code implementation • 22 Oct 2023 • Tianyu Yan, Zifu Wan, Pingping Zhang, Gong Cheng, Huchuan Lu
To address these issues, in this work we propose a novel Transformer-based learning framework named TransY-Net for remote sensing image CD, which improves feature extraction from a global view and combines multi-level visual features in a pyramid manner.
2 code implementations • 30 Sep 2023 • Junsong Chen, Jincheng Yu, Chongjian Ge, Lewei Yao, Enze Xie, Yue Wu, Zhongdao Wang, James Kwok, Ping Luo, Huchuan Lu, Zhenguo Li
We hope PixArt-α will provide new insights to the AIGC community and help startups accelerate building their own high-quality yet low-cost generative models from scratch.
1 code implementation • 19 Sep 2023 • Jiawen Zhu, Huayi Tang, Zhi-Qi Cheng, Jun-Yan He, Bin Luo, Shihao Qiu, Shengming Li, Huchuan Lu
To address this, we propose a novel architecture called Darkness Clue-Prompted Tracking (DCPT) that achieves robust UAV tracking at night by efficiently learning to generate darkness clue prompts.
no code implementations • 15 Sep 2023 • Jie Zhao, Johan Edstedt, Michael Felsberg, Dong Wang, Huchuan Lu
Due to long-distance correlation and powerful pretrained models, transformer-based methods have initiated a breakthrough in visual object tracking performance.
1 code implementation • 28 Aug 2023 • Haiwen Diao, Bo Wan, Ying Zhang, Xu Jia, Huchuan Lu, Long Chen
Parameter-efficient transfer learning (PETL), i.e., fine-tuning a small portion of parameters, is an effective strategy for adapting pre-trained models to downstream domains.
1 code implementation • ICCV 2023 • Xin Li, Yuqing Huang, Zhenyu He, YaoWei Wang, Huchuan Lu, Ming-Hsuan Yang
Existing visual tracking methods typically take an image patch as the reference of the target to perform tracking.
no code implementations • ICCV 2023 • Ben Kang, Xin Chen, Dong Wang, Houwen Peng, Huchuan Lu
The Bridge Module incorporates the high-level information of deep features into the shallow large-resolution features.
1 code implementation • ICCV 2023 • Yichen Yuan, Yifan Wang, Lijun Wang, Xiaoqi Zhao, Huchuan Lu, Yu Wang, Weibo Su, Lei Zhang
Recent leading zero-shot video object segmentation (ZVOS) works are devoted to integrating appearance and motion information by elaborately designing feature fusion modules and applying them identically across multiple feature stages.
no code implementations • ICCV 2023 • Jiannan Wu, Yi Jiang, Bin Yan, Huchuan Lu, Zehuan Yuan, Ping Luo
Open-world instance segmentation is a rising task, which aims to segment all objects in the image by learning from a limited number of base-category objects.
1 code implementation • 7 Aug 2023 • Xinhao Deng, Pingping Zhang, Wei Liu, Huchuan Lu
To address the above issues, in this work we first propose a new HRS10K dataset, which contains 10,500 high-quality annotated images at 2K-8K resolution.
no code implementations • 7 Aug 2023 • Xuehu Liu, Pingping Zhang, Huchuan Lu
Meanwhile, to extract short-term representations, we propose a Bi-direction Motion Estimator (BME), in which reciprocal motion information is efficiently extracted from consecutive frames.
Representation Learning • Video-Based Person Re-Identification
2 code implementations • 1 Aug 2023 • Mingzhan Yang, Guangxin Han, Bin Yan, Wenhua Zhang, Jinqing Qi, Huchuan Lu, Dong Wang
Also, our method shows strong generalization for diverse trackers and scenarios in a plug-and-play and training-free manner.
Ranked #8 on Multi-Object Tracking on DanceTrack
1 code implementation • ICCV 2023 • Junwen He, Yifan Wang, Lijun Wang, Huchuan Lu, Jun-Yan He, Jin-Peng Lan, Bin Luo, Yifeng Geng, Xuansong Xie
Our method sets the new state of the art for depth-aware panoptic segmentation on both Cityscapes-DVPS and SemKITTI-DVPS datasets.
1 code implementation • 26 Jul 2023 • Jiawen Zhu, Zhenyu Chen, Zeqi Hao, Shijie Chang, Lu Zhang, Dong Wang, Huchuan Lu, Bin Luo, Jun-Yan He, Jin-Peng Lan, Hanyuan Chen, Chenyang Li
To further improve the quality of tracking masks, a pretrained MR model is employed to refine the tracking results.
Ranked #5 on Semi-Supervised Video Object Segmentation on YouTube-VOS 2019 (using extra training data)
1 code implementation • 23 Jul 2023 • Youwei Pang, Xiaoqi Zhao, Lihe Zhang, Huchuan Lu
Specifically, unlike existing methods that over-specialize in a single task or a subset of tasks, ComPtr starts from the more general concept of bi-source dense prediction.
Ranked #14 on Semantic Segmentation on NYU Depth v2
1 code implementation • 4 Jun 2023 • Shijie Chang, Zeqi Hao, Ben Kang, Xiaoqi Zhao, Jiawen Zhu, Zhenyu Chen, Lihe Zhang, Lu Zhang, Huchuan Lu
In this paper, we introduce 3rd place solution for PVUW2023 VSS track.
no code implementations • 26 May 2023 • Zaibin Zhang, Yuanhang Zhang, Lijun Wang, Yifan Wang, Huchuan Lu
At the core of our method is the newly-designed instance occupancy prediction (IOP) module, which aims to infer point-level occupancy status for each instance in the frustum space.
1 code implementation • 23 May 2023 • Xinyu Zhang, Hefei Huang, Xu Jia, Dong Wang, Huchuan Lu
In this work, we aim to re-expose the captured photo in post-processing to provide a more flexible way of addressing those issues within a unified framework.
Ranked #4 on Deblurring on GoPro (using extra training data)
1 code implementation • CVPR 2023 • Xin Chen, Ben Kang, Jiawen Zhu, Dong Wang, Houwen Peng, Huchuan Lu
In this paper, we introduce a new sequence-to-sequence learning framework for RGB-based and multi-modal object tracking.
Ranked #3 on Visual Object Tracking on NeedForSpeed
1 code implementation • 27 Apr 2023 • Xuehu Liu, Chenyang Yu, Pingping Zhang, Huchuan Lu
Further, in the spatial domain, we propose a Complementary Content Attention (CCA) to take advantage of the coupled structure and guide independent features for spatial complementary learning.
1 code implementation • 19 Apr 2023 • Chongjian Ge, Junsong Chen, Enze Xie, Zhongdao Wang, Lanqing Hong, Huchuan Lu, Zhenguo Li, Ping Luo
These queries are then processed iteratively by a BEV-Evolving decoder, which selectively aggregates deep features from either LiDAR, cameras, or both modalities.
1 code implementation • CVPR 2023 • Haojie Zhao, Junsong Chen, Lijun Wang, Huchuan Lu
Compared with traditional RGB-only visual tracking, few datasets have been constructed for RGB-D tracking.
no code implementations • CVPR 2023 • Jianchuan Chen, Wentao Yi, Liqian Ma, Xu Jia, Huchuan Lu
The results demonstrate that our approach outperforms state-of-the-art methods in terms of novel view synthesis and geometric reconstruction.
1 code implementation • 23 Mar 2023 • Haiwen Diao, Ying Zhang, Wei Liu, Xiang Ruan, Huchuan Lu
Exploiting fine-grained correspondence and visual-semantic alignments has shown great potential in image-text matching.
Ranked #2 on Image Retrieval on Flickr30K 1K test
2 code implementations • 20 Mar 2023 • Xiaoqi Zhao, Hongpeng Jia, Youwei Pang, Long Lv, Feng Tian, Lihe Zhang, Weibing Sun, Huchuan Lu
Next, we expand the single-scale SU to the intra-layer multi-scale SU, which can provide the decoder with both pixel-level and structure-level difference information.
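A single-scale subtraction unit (SU) of the kind described above can be thought of as an element-wise difference between two feature maps, which highlights where they disagree; stacking such units within and across layers then yields both pixel-level and structure-level difference information. The exact formulation below is an assumption for illustration only.

```python
import numpy as np

def subtraction_unit(fa, fb):
    """Sketch of a single-scale subtraction unit: the element-wise
    absolute difference between two feature maps (illustrative, not the
    paper's exact design)."""
    return np.abs(fa - fb)

fa = np.array([[1.0, 2.0], [3.0, 4.0]])
fb = np.array([[1.0, 0.0], [5.0, 4.0]])
diff = subtraction_unit(fa, fb)  # non-zero where the two maps disagree
```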
1 code implementation • CVPR 2023 • Jiawen Zhu, Simiao Lai, Xin Chen, Dong Wang, Huchuan Lu
To inherit the powerful representations of the foundation model, a natural modus operandi for multi-modal tracking is full fine-tuning on the RGB-based parameters.
Ranked #5 on Rgb-T Tracking on LasHeR
1 code implementation • 18 Mar 2023 • Xiaoqi Zhao, Shijie Chang, Youwei Pang, Jiaxing Yang, Lihe Zhang, Huchuan Lu
In the static object predictor, the RGB source is converted to depth and static saliency sources, simultaneously.
Ranked #1 on Unsupervised Video Object Segmentation on FBMS test
1 code implementation • 18 Mar 2023 • Xiaoqi Zhao, Youwei Pang, Lihe Zhang, Huchuan Lu, Lei Zhang
They ignore two key problems when the encoder exchanges information with the decoder: the lack of an interference control mechanism between them, and the disparity in contributions from different encoder levels.
1 code implementation • 17 Mar 2023 • Dongsheng Wang, Xu Jia, Yang Zhang, Xinyu Zhang, Yaoyuan Wang, Ziyang Zhang, Dong Wang, Huchuan Lu
To fully exploit information with event streams to detect objects, a dual-memory aggregation network (DMANet) is proposed to leverage both long and short memory along event streams to aggregate effective information for object detection.
1 code implementation • CVPR 2023 • Bin Yan, Yi Jiang, Jiannan Wu, Dong Wang, Ping Luo, Zehuan Yuan, Huchuan Lu
All instance perception tasks aim at finding certain objects specified by some queries such as category names, language expressions, and target annotations, but this complete field has been split into multiple independent subtasks.
Ranked #1 on Referring Expression Segmentation on RefCoCo val (using extra training data)
Described Object Detection • Generalized Referring Expression Comprehension • +15
1 code implementation • CVPR 2023 • Haojie Zhao, Dong Wang, Huchuan Lu
However, for the template, we make the decoder reconstruct the target appearance within the search region.
1 code implementation • CVPR 2023 • Yingwei Wang, Xu Jia, Xin Tao, Takashi Isobe, Huchuan Lu, Yu-Wing Tai
Videos stored on mobile devices or delivered over the Internet are usually compressed with various unknown compression parameters, yet most video super-resolution (VSR) methods assume ideal inputs, resulting in a large performance gap between experimental settings and real-world applications.
no code implementations • ICCV 2023 • Jiannan Wu, Yi Jiang, Bin Yan, Huchuan Lu, Zehuan Yuan, Ping Luo
In this work, we end the current fragmented situation and propose UniRef to unify the three reference-based object segmentation tasks with a single architecture.
no code implementations • ICCV 2023 • Chongjian Ge, Junsong Chen, Enze Xie, Zhongdao Wang, Lanqing Hong, Huchuan Lu, Zhenguo Li, Ping Luo
These queries are then processed iteratively by a BEV-Evolving decoder, which selectively aggregates deep features from either LiDAR, cameras, or both modalities.
1 code implementation • ICCV 2023 • Jiayu Sun, Ke Xu, Youwei Pang, Lihe Zhang, Huchuan Lu, Gerhard Hancke, Rynson Lau
In this paper, we propose a novel method to detect shadows from raw images.
1 code implementation • CVPR 2023 • Wenda Zhao, Shigeng Xie, Fan Zhao, You He, Huchuan Lu
Conversely, detection task furnishes object semantic information to improve the infrared and visible image fusion.
1 code implementation • 13 Dec 2022 • Qinghe Wang, Lijie Liu, Miao Hua, Pengfei Zhu, Wangmeng Zuo, Qinghua Hu, Huchuan Lu, Bing Cao
We blend the semantic layouts of source head and source body, and then inpaint the transition region by the semantic layout generator, achieving a coarse-grained head swapping.
no code implementations • 6 Dec 2022 • Jianchuan Chen, Wentao Yi, Tiantian Wang, Xing Li, Liqian Ma, Yangyu Fan, Huchuan Lu
The integrated features acting as the latent code are anchored to the SMPLX mesh in the canonical space.
no code implementations • 9 Nov 2022 • Fan Zhao, Wenda Zhao, Huchuan Lu
General deep learning-based methods for infrared and visible image fusion rely on the unsupervised mechanism for vital information retention by utilizing elaborately designed loss functions.
1 code implementation • 14 Jul 2022 • Bin Yan, Yi Jiang, Peize Sun, Dong Wang, Zehuan Yuan, Ping Luo, Huchuan Lu
We present a unified method, termed Unicorn, that can simultaneously solve four tracking problems (SOT, MOT, VOS, MOTS) with a single network using the same model parameters.
Multi-Object Tracking • Multi-Object Tracking and Segmentation • +3
no code implementations • 10 Jul 2022 • Jiawen Zhu, Xin Chen, Pengyu Zhang, Xinying Wang, Dong Wang, Wenda Zhao, Huchuan Lu
Trackers tend to lose the target object when the search region is too limited, or be distracted by distractors when it is excessive.
1 code implementation • CVPR 2022 • Takashi Isobe, Xu Jia, Xin Tao, Changlin Li, Ruihuang Li, Yongjie Shi, Jing Mu, Huchuan Lu, Yu-Wing Tai
Instead of directly feeding consecutive frames into a VSR model, we propose to compute the temporal difference between frames and divide those pixels into two subsets according to the level of difference.
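The temporal-difference split described above can be sketched with a simple per-pixel threshold on the absolute frame difference; the threshold value and function names below are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def split_by_temporal_difference(prev_frame, curr_frame, threshold=0.1):
    """Split pixels of the current frame into two subsets by the level of
    temporal difference: a low-variance (nearly static) subset and a
    high-variance (fast-changing) subset (illustrative sketch)."""
    diff = np.abs(curr_frame.astype(np.float32) - prev_frame.astype(np.float32))
    low_mask = diff <= threshold   # nearly static pixels
    high_mask = ~low_mask          # fast-changing pixels
    return low_mask, high_mask

# toy 2x2 grayscale frames with intensities in [0, 1]
prev = np.array([[0.0, 0.5], [0.2, 0.9]], dtype=np.float32)
curr = np.array([[0.05, 0.5], [0.8, 0.9]], dtype=np.float32)
low, high = split_by_temporal_difference(prev, curr, threshold=0.1)
```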
no code implementations • CVPR 2022 • Pengyu Zhang, Jie Zhao, Dong Wang, Huchuan Lu, Xiang Ruan
With the popularity of multi-modal sensors, visible-thermal (RGB-T) object tracking is to achieve robust performance and wider application scenarios with the guidance of objects' temperature information.
no code implementations • 30 Mar 2022 • Guang Feng, Lihe Zhang, Zhiwei Hu, Huchuan Lu
To address this task, we first design a two-stream encoder to extract CNN-based visual features and transformer-based linguistic features hierarchically, and a vision-language mutual guidance (VLMG) module is inserted into the encoder multiple times to promote the hierarchical and progressive fusion of multi-modal features.
Ranked #3 on Referring Expression Segmentation on J-HMDB
1 code implementation • 25 Mar 2022 • Xin Chen, Ben Kang, Dong Wang, Dongdong Li, Huchuan Lu
Most state-of-the-art trackers achieve real-time speed only on powerful GPUs.
no code implementations • CVPR 2022 • Weihua He, Kaichao You, Zhendong Qiao, Xu Jia, Ziyang Zhang, Wenhui Wang, Huchuan Lu, Yaoyuan Wang, Jianxing Liao
Since the event camera is a novel sensor, its potential has not been fully realized due to the lack of processing algorithms.
1 code implementation • 25 Mar 2022 • Xin Chen, Bin Yan, Jiawen Zhu, Huchuan Lu, Xiang Ruan, Dong Wang
First, we present a transformer tracking (named TransT) method based on the Siamese-like feature extraction backbone, the designed attention-based fusion mechanism, and the classification and regression head.
1 code implementation • 9 Mar 2022 • Xiaoqi Zhao, Youwei Pang, Lihe Zhang, Huchuan Lu
In this paper, we propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD).
no code implementations • 8 Mar 2022 • Jiaxing Yang, Lihe Zhang, Huchuan Lu
In this work, we propose Atrous Transformer (AtrousFormer) to solve the problem.
Ranked #25 on Lane Detection on CULane
no code implementations • CVPR 2022 • Shuai Liu, Xin Li, Huchuan Lu, You He
Multi-object tracking in unmanned aerial vehicle (UAV) videos is an important vision task and can be applied in a wide range of applications.
no code implementations • CVPR 2022 • Yifan Wang, Wenbo Zhang, Lijun Wang, Ting Liu, Huchuan Lu
We design an Uncertainty Mining Network (UMNet) which consists of multiple Merge-and-Split (MS) modules to recursively analyze the commonality and difference among multiple noisy labels and infer pixel-wise uncertainty map for each label.
1 code implementation • 13 Dec 2021 • Xin Li, Qiao Liu, Wenjie Pei, Qiuhong Shen, YaoWei Wang, Huchuan Lu, Ming-Hsuan Yang
Along with the rapid progress of visual tracking, existing benchmarks become less informative due to redundancy of samples and weak discrimination between current trackers, making evaluations on all datasets extremely time-consuming.
1 code implementation • 4 Dec 2021 • Youwei Pang, Xiaoqi Zhao, Lihe Zhang, Huchuan Lu
Most of the existing bi-modal (RGB-D and RGB-T) salient object detection methods utilize the convolution operation and construct complex interweave fusion structures to achieve cross-modal information integration.
1 code implementation • ICCV 2021 • Yongri Piao, Jian Wang, Miao Zhang, Huchuan Lu
The multiple accurate cues from multiple DFs are then simultaneously propagated to the saliency network with a multi-guidance loss.
no code implementations • 1 Dec 2021 • Yue Wang, Xu Jia, Lu Zhang, Yuke Li, James Elder, Huchuan Lu
TFFM conducts a sufficient feature fusion by integrating features from multiple scales and two modalities over all positions simultaneously.
1 code implementation • NeurIPS 2021 • Jingjing Li, Wei Ji, Qi Bi, Cheng Yan, Miao Zhang, Yongri Piao, Huchuan Lu, Li Cheng
As a by-product, a CapS dataset is constructed by augmenting existing benchmark training set with additional image tags and captions.
1 code implementation • 24 Sep 2021 • Jiayu Sun, Zhanghan Ke, Lihe Zhang, Huchuan Lu, Rynson W. H. Lau
In this work, we observe that instead of asking the user to explicitly provide a background image, we may recover it from the input video itself.
no code implementations • 4 Sep 2021 • Yongri Piao, Jian Wang, Miao Zhang, Zhengxuan Ma, Huchuan Lu
Despite the success of previous works, explorations of an effective training strategy for the saliency network and of accurate matches between image-level annotations and salient objects are still inadequate.
2 code implementations • 11 Aug 2021 • Xiaoqi Zhao, Lihe Zhang, Huchuan Lu
Keywords: Colorectal Cancer, Automatic Polyp Segmentation, Subtraction, LossNet.
1 code implementation • 11 Aug 2021 • Xiaoqi Zhao, Youwei Pang, Jiaxing Yang, Lihe Zhang, Huchuan Lu
In this paper, we propose a novel multi-source fusion network for zero-shot video object segmentation.
Ranked #1 on Video Object Segmentation on FBMS (Jaccard (Mean) metric)
1 code implementation • ICCV 2021 • Kenan Dai, Jie Zhao, Lijun Wang, Dong Wang, Jianhua Li, Huchuan Lu, Xuesheng Qian, Xiaoyun Yang
Deep learning based visual trackers entail offline pre-training on large volumes of video datasets with accurate bounding box annotations that are labor-expensive to achieve.
1 code implementation • 13 Jul 2021 • Guowen Zhang, Pingping Zhang, Jinqing Qi, Huchuan Lu
In this work, we take advantages of both CNNs and Transformers, and propose a novel learning framework named Hierarchical Aggregation Transformer (HAT) for image-based person Re-ID with high performance.
1 code implementation • 25 Jun 2021 • Jianchuan Chen, Ying Zhang, Di Kang, Xuefei Zhe, Linchao Bao, Xu Jia, Huchuan Lu
We present animatable neural radiance fields (animatable NeRF) for detailed human avatar creation from monocular videos.
no code implementations • 21 Jun 2021 • Xin Li, Wenjie Pei, YaoWei Wang, Zhenyu He, Huchuan Lu, Ming-Hsuan Yang
While deep-learning based tracking methods have achieved substantial progress, they entail large-scale and high-quality annotated data for sufficient training.
1 code implementation • CVPR 2021 • Wei Ji, Jingjing Li, Shuang Yu, Miao Zhang, Yongri Piao, Shunyu Yao, Qi Bi, Kai Ma, Yefeng Zheng, Huchuan Lu, Li Cheng
Complex backgrounds and similar appearances between objects and their surroundings are generally recognized as challenging scenarios in Salient Object Detection (SOD).
Ranked #13 on Thermal Image Segmentation on RGB-T-Glass-Segmentation
1 code implementation • CVPR 2021 • Wenda Zhao, Cai Shang, Huchuan Lu
The core insight is that a defocus blur region/focused clear area can be arbitrarily pasted to a given realistic full blurred image/full clear image without affecting the judgment of the full blurred image/full clear image.
no code implementations • CVPR 2021 • Takashi Isobe, Xu Jia, Shuaijun Chen, Jianzhong He, Yongjie Shi, Jianzhuang Liu, Huchuan Lu, Shengjin Wang
To obtain a single model that works across multiple target domains, we propose to simultaneously learn a student model trained not only to imitate the output of each expert on the corresponding target domain, but also to pull the different experts close to each other with regularization on their weights.
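The two-part objective described above can be sketched as an imitation term per expert plus a pairwise closeness term on expert weights; the function names and the weighting factor `lam` are assumptions for illustration, not the paper's formulation.

```python
import numpy as np

def student_objective(student_outs, expert_outs, expert_ws, lam=0.1):
    """Sketch: imitate each expert's output on its target domain, and
    regularize the experts' weights toward each other (illustrative)."""
    # imitation: per-domain MSE between student and expert outputs
    imitation = np.mean([np.mean((s - e) ** 2)
                         for s, e in zip(student_outs, expert_outs)])
    # closeness: mean pairwise squared distance between expert weights
    pairs = [(i, j) for i in range(len(expert_ws))
             for j in range(i + 1, len(expert_ws))]
    closeness = np.mean([np.mean((expert_ws[i] - expert_ws[j]) ** 2)
                         for i, j in pairs])
    return imitation + lam * closeness

# identical outputs and identical expert weights give zero loss
outs = [np.zeros(3), np.ones(3)]
ws = [np.ones(2), np.ones(2)]
loss = student_objective(outs, outs, ws)
```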
Ranked #4 on Domain Adaptation on GTAV to Cityscapes+Mapillary
no code implementations • CVPR 2021 • Guang Feng, Zhiwei Hu, Lihe Zhang, Huchuan Lu
In this work, we propose an encoder fusion network (EFN), which transforms the visual encoder into a multi-modal feature learning network, and uses language to refine the multi-modal features progressively.
1 code implementation • CVPR 2021 • Bin Yan, Houwen Peng, Kan Wu, Dong Wang, Jianlong Fu, Huchuan Lu
Object tracking has achieved significant progress over the past few years.
no code implementations • 5 Apr 2021 • Xuehu Liu, Pingping Zhang, Chenyang Yu, Huchuan Lu, Xuesheng Qian, Xiaoyun Yang
To capture richer perceptions and extract more comprehensive video representations, in this paper we propose a novel framework named Trigeminal Transformers (TMT) for video-based person Re-ID.
1 code implementation • ICCV 2021 • Bin Yan, Houwen Peng, Jianlong Fu, Dong Wang, Huchuan Lu
In this paper, we present a new tracking architecture with an encoder-decoder transformer as the key component.
Ranked #17 on Visual Object Tracking on TrackingNet
1 code implementation • CVPR 2021 • Xin Chen, Bin Yan, Jiawen Zhu, Dong Wang, Xiaoyun Yang, Huchuan Lu
The correlation operation is a simple fusion manner to consider the similarity between the template and the search region.
Ranked #5 on Visual Tracking on TNL2K
1 code implementation • CVPR 2021 • Xuehu Liu, Pingping Zhang, Chenyang Yu, Huchuan Lu, Xiaoyun Yang
Specifically, we first propose a Global-guided Correlation Estimation (GCE) to generate feature correlation maps of local features and global features, which help to localize the high- and low-correlation regions for identifying the same person.
1 code implementation • 29 Jan 2021 • Xiaoqi Zhao, Youwei Pang, Lihe Zhang, Huchuan Lu, Xiang Ruan
Existing CNNs-Based RGB-D salient object detection (SOD) networks are all required to be pretrained on the ImageNet to learn the hierarchy features which helps provide a good initialization.
12 code implementations • CVPR 2021 • Tao Huang, Songjiang Li, Xu Jia, Huchuan Lu, Jianzhuang Liu
In this paper, we present a very simple yet effective method named Neighbor2Neighbor to train an effective image denoising model with only noisy images.
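At the core of Neighbor2Neighbor is a neighbor sub-sampler: each 2x2 cell of a noisy image contributes two distinct pixels to two half-resolution images, whose clean content roughly matches while their noise is independent, so one can supervise the other. The sketch below shows that sub-sampler only (names are assumptions); training would minimize the distance between the denoised first sub-image and the second, plus the paper's regularization term.

```python
import numpy as np

rng = np.random.default_rng(0)

def neighbor_subsample(noisy, rng):
    """Neighbor2Neighbor-style sub-sampler (sketch): from each 2x2 cell,
    pick two distinct pixels to build two half-resolution noisy images."""
    h, w = noisy.shape
    out1 = np.empty((h // 2, w // 2), dtype=noisy.dtype)
    out2 = np.empty_like(out1)
    for i in range(h // 2):
        for j in range(w // 2):
            cell = noisy[2 * i:2 * i + 2, 2 * j:2 * j + 2].ravel()
            a, b = rng.choice(4, size=2, replace=False)  # two distinct neighbors
            out1[i, j] = cell[a]
            out2[i, j] = cell[b]
    return out1, out2

noisy = rng.normal(size=(4, 4))
sub1, sub2 = neighbor_subsample(noisy, rng)
# training (not shown) would fit a denoiser f so that f(sub1) matches sub2
```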
1 code implementation • 5 Jan 2021 • Haiwen Diao, Ying Zhang, Lin Ma, Huchuan Lu
Image-text matching plays a critical role in bridging the vision and language, and great progress has been made by exploiting the global alignment between image and sentence, or local alignments between regions and words.
Ranked #3 on Image Retrieval on Flickr30K 1K test
1 code implementation • ICCV 2021 • Yu Zeng, Zhe Lin, Huchuan Lu, Vishal M. Patel
The auxiliary branch (i.e., the CR loss) is required only during training, and only the inpainting generator is required during inference.
Ranked #8 on Image Inpainting on Places2
no code implementations • ICCV 2021 • Lijun Wang, Yifan Wang, Linzhao Wang, Yunlong Zhan, Ying Wang, Huchuan Lu
The integration of SAG loss and two-stream network enables more consistent scale inference and more accurate relative depth estimation.
1 code implementation • ICCV 2021 • Shu Yang, Lu Zhang, Jinqing Qi, Huchuan Lu, Shuo Wang, Xiaoxing Zhang
How to make the appearance and motion information interact effectively to accommodate complex scenarios is a fundamental issue in flow-based zero-shot video object segmentation.
Semantic Segmentation • Unsupervised Video Object Segmentation • +2
1 code implementation • ICCV 2021 • Miao Zhang, Jie Liu, Yifei Wang, Yongri Piao, Shunyu Yao, Wei Ji, Jingjing Li, Huchuan Lu, Zhongxuan Luo
Our bidirectional dynamic fusion strategy encourages the interaction of spatial and temporal information in a dynamic manner.
Ranked #12 on Video Polyp Segmentation on SUN-SEG-Easy (Unseen)
no code implementations • 30 Dec 2020 • Yongri Piao, Zhengkun Rong, Shuang Xu, Miao Zhang, Huchuan Lu
The success of learning-based light field saliency detection is heavily dependent on how a comprehensive dataset can be constructed for higher generalizability of models, how high dimensional light field data can be effectively exploited, and how a flexible model can be designed to achieve versatility for desktop computers and mobile devices.
1 code implementation • CVPR 2021 • Bin Yan, Xinyu Zhang, Dong Wang, Huchuan Lu, Xiaoyun Yang
Many recent trackers adopt the multiple-stage tracking strategy to improve the quality of bounding box estimation.
Ranked #15 on Semi-Supervised Video Object Segmentation on VOT2020
Semi-Supervised Video Object Segmentation • Visual Object Tracking
2 code implementations • 8 Dec 2020 • Pengyu Zhang, Dong Wang, Huchuan Lu
Visual object tracking, as a fundamental task in computer vision, has drawn much attention in recent years.
1 code implementation • 25 Nov 2020 • Yu Zeng, Zhe Lin, Huchuan Lu, Vishal M. Patel
Due to the lack of supervision signals for the correspondence between missing regions and known regions, it may fail to find proper reference features, which often leads to artifacts in the results.
no code implementations • 25 Oct 2020 • Mingyang Qian, Yi Fu, Xiao Tan, YingYing Li, Jinqing Qi, Huchuan Lu, Shilei Wen, Errui Ding
Video segmentation approaches are of great importance for numerous vision tasks especially in video manipulation for entertainment.
2 code implementations • ECCV 2020 • Wei Ji, Jingjing Li, Miao Zhang, Yongri Piao, Huchuan Lu
The explicitly extracted edge information goes together with saliency to give more emphasis to the salient regions and object boundaries.
Ranked #19 on RGB-D Salient Object Detection on NJU2K
1 code implementation • CVPR 2020 • Youwei Pang, Xiaoqi Zhao, Lihe Zhang, Huchuan Lu
To obtain more efficient multi-scale features from the integrated features, the self-interaction modules are embedded in each decoder unit.
3 code implementations • ECCV 2020 • Xiaoqi Zhao, Youwei Pang, Lihe Zhang, Huchuan Lu, Lei Zhang
With the help of multilevel gate units, the valuable context information from the encoder can be optimally transmitted to the decoder.
Ranked #14 on Dichotomous Image Segmentation on DIS-TE4
1 code implementation • ECCV 2020 • Xiaoqi Zhao, Lihe Zhang, Youwei Pang, Huchuan Lu, Lei Zhang
In this work, we design a single stream network to directly use the depth map to guide early fusion and middle fusion between RGB and depth, which saves the feature encoder of the depth stream and achieves a lightweight and real-time model.
Ranked #15 on Thermal Image Segmentation on RGB-T-Glass-Segmentation
1 code implementation • ECCV 2020 • Youwei Pang, Lihe Zhang, Xiaoqi Zhao, Huchuan Lu
The main purpose of RGB-D salient object detection (SOD) is to better integrate and utilize cross-modal fusion information.
Ranked #5 on RGB-D Salient Object Detection on NJU2K
1 code implementation • 4 Jul 2020 • Bin Yan, Dong Wang, Huchuan Lu, Xiaoyun Yang
In recent years, the multiple-stage strategy has become a popular trend for visual tracking.
no code implementations • 4 Jul 2020 • Pengyu Zhang, Jie Zhao, Dong Wang, Huchuan Lu, Xiaoyun Yang
In this study, we propose a novel RGB-T tracking framework by jointly modeling both appearance and motion cues.
no code implementations • 3 Jul 2020 • Yue Wang, Yuke Li, James H. Elder, Huchuan Lu, Runmin Wu, Lu Zhang
Evaluation on seven RGB-D datasets demonstrates that, even without saliency ground truth for RGB-D datasets and using only the RGB data of RGB-D datasets at inference, our semi-supervised system performs favorably against state-of-the-art fully-supervised RGB-D saliency detection methods that use saliency ground truth at training and depth data at inference, on the two largest testing datasets.
1 code implementation • ECCV 2020 • Yu Zeng, Zhe Lin, Jimei Yang, Jianming Zhang, Eli Shechtman, Huchuan Lu
To address this challenge, we propose an iterative inpainting method with a feedback mechanism.
Ranked #6 on Image Inpainting on Places2
1 code implementation • CVPR 2020 • Shang Gao, Jingya Wang, Huchuan Lu, Zimo Liu
Occluded person re-identification is a challenging task as the appearance varies substantially with various obstacles, especially in the crowd scenario.
2 code implementations • CVPR 2020 • Kenan Dai, Yunhua Zhang, Dong Wang, Jianhua Li, Huchuan Lu, Xiaoyun Yang
Most top-ranked long-term trackers adopt offline-trained Siamese architectures and thus cannot benefit from the great progress of short-term trackers with online update.
Ranked #10 on Visual Object Tracking on LaSOT-ext
1 code implementation • CVPR 2020 • Bin Yan, Dong Wang, Huchuan Lu, Xiaoyun Yang
An effective and efficient perturbation generator is trained with a carefully designed adversarial loss, which can simultaneously cool hot regions where the target exists on the heatmaps and force the predicted bounding box to shrink, making the tracked target invisible to trackers.
1 code implementation • 24 Feb 2020 • Runmin Wu, Kunyao Zhang, Lijun Wang, Yue Wang, Pingping Zhang, Huchuan Lu, Yizhou Yu
Though recent research has achieved remarkable progress in generating realistic images with generative adversarial networks (GANs), the lack of training stability is still a lingering concern of most GANs, especially on high-resolution inputs and complex datasets.
6 code implementations • IEEE Transactions on Image Processing 2020 • Shuhan Chen, Xiuli Tan, Ben Wang, Huchuan Lu, Xuelong Hu, Yun Fu
Benefiting from the rapid development of deep convolutional neural networks, especially fully convolutional networks (FCNs), remarkable progress has been achieved in salient object detection recently.
1 code implementation • NeurIPS 2019 • Miao Zhang, Jingjing Li, Ji Wei, Yongri Piao, Huchuan Lu
In this paper, we present a deep-learning-based method where a novel memory-oriented decoder is tailored for light field saliency detection.
no code implementations • 27 Nov 2019 • Yue Wang, Yuke Li, James H. Elder, Runmin Wu, Huchuan Lu
We address this problem by introducing a Class-Conditional Domain Adaptation method (CCDA).
1 code implementation • CVPR 2019 • Yuxuan Sun, Chong Sun, Dong Wang, You He, Huchuan Lu
The ROI (region-of-interest) based pooling method performs pooling operations on cropped ROI regions for various samples and has shown great success in object detection methods.
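As a general illustration of the ROI pooling operation this entry builds on (not the paper's own code; the function name, grid size, and single-channel feature map are illustrative assumptions), a minimal NumPy sketch:

```python
import numpy as np

def roi_max_pool(feat, roi, out_h=2, out_w=2):
    # feat: (H, W) feature map; roi: (x0, y0, x1, y1) in feature-map coords.
    # The ROI is divided into an out_h x out_w grid; each cell is max-pooled,
    # so every ROI yields a fixed-size output regardless of its extent.
    x0, y0, x1, y1 = roi
    patch = feat[y0:y1, x0:x1]
    h, w = patch.shape
    ys = np.linspace(0, h, out_h + 1).astype(int)
    xs = np.linspace(0, w, out_w + 1).astype(int)
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = patch[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].max()
    return out

feat = np.arange(16, dtype=float).reshape(4, 4)
pooled = roi_max_pool(feat, (0, 0, 4, 4))  # 4x4 ROI pooled to a 2x2 grid
```

Real detectors apply this per channel and per proposal; the fixed output size is what lets variable-sized proposals feed a fully connected head.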
1 code implementation • International Conference on Computer Vision Workshops 2019 • Dawei Du, Pengfei Zhu, Longyin Wen, Xiao Bian, Haibin Lin, QinGhua Hu, Tao Peng, Jiayu Zheng, Xinyao Wang, Yue Zhang, Liefeng Bo, Hailin Shi, Rui Zhu, Aashish Kumar, Aijin Li, Almaz Zinollayev, Anuar Askergaliyev, Arne Schumann, Binjie Mao, Byeongwon Lee, Chang Liu, Changrui Chen, Chunhong Pan, Chunlei Huo, Da Yu, Dechun Cong, Dening Zeng, Dheeraj Reddy Pailla, Di Li, Dong Wang, Donghyeon Cho, Dongyu Zhang, Furui Bai, George Jose, Guangyu Gao, Guizhong Liu, Haitao Xiong, Hao Qi, Haoran Wang, Heqian Qiu, Hongliang Li, Huchuan Lu, Ildoo Kim, Jaekyum Kim, Jane Shen, Jihoon Lee, Jing Ge, Jingjing Xu, Jingkai Zhou, Jonas Meier, Jun Won Choi, Junhao Hu, Junyi Zhang, Junying Huang, Kaiqi Huang, Keyang Wang, Lars Sommer, Lei Jin, Lei Zhang
Results of 33 object detection algorithms are presented.
no code implementations • 8 Oct 2019 • Pingping Zhang, Wei Liu, Yinjie Lei, Hongyu Wang, Huchuan Lu
The proposed method consists of three modules, i.e., recurrent FCNs, adaptive multiphase level set, and deeply supervised learning.
2 code implementations • ICCV 2019 • Peixia Li, Bo-Yu Chen, Wanli Ouyang, Dong Wang, Xiaoyun Yang, Huchuan Lu
In this work, we propose a novel gradient-guided network to exploit the discriminative information in gradients and update the template in the Siamese network through feed-forward and backward operations.
Ranked #3 on Visual Object Tracking on OTB-2015 (Precision metric)
1 code implementation • ICCV 2019 • Yu Zeng, Yunzhi Zhuge, Huchuan Lu, Lihe Zhang
SSNet consists of a segmentation network (SN) and a saliency aggregation module (SAM).
1 code implementation • ICCV 2019 • Bin Yan, Haojie Zhao, Dong Wang, Huchuan Lu, Xiaoyun Yang
In this work, we present a novel robust and real-time long-term tracking framework based on the proposed skimming and perusal modules.
1 code implementation • ICCV 2019 • Yi Zeng, Pingping Zhang, Jianming Zhang, Zhe Lin, Huchuan Lu
This paper pushes forward high-resolution saliency detection, and contributes a new dataset, named High-Resolution Salient Object Detection (HRSOD).
Ranked #10 on RGB Salient Object Detection on DAVIS-S (using extra training data)
no code implementations • ICCV 2019 • Pingping Zhang, Wei Liu, Yinjie Lei, Huchuan Lu, Xiaoyun Yang
To address these issues, in this work we propose a novel deep learning framework, named Cascaded Context Pyramid Network (CCPNet), to jointly infer the occupancy and semantic labels of a volumetric 3D scene from a single depth image.
Ranked #5 on 3D Semantic Scene Completion on NYUv2 (using extra training data)
1 code implementation • CVPR 2019 • Yu Zeng, Yunzhi Zhuge, Huchuan Lu, Lihe Zhang, Mingyang Qian, Yizhou Yu
To this end, we propose a unified framework to train saliency detection models with diverse weak supervision sources.
no code implementations • 21 Jan 2019 • Pingping Zhang, Wei Liu, Huchuan Lu, Chunhua Shen
Inspired by the intrinsic reflection of natural images, in this paper we propose a novel feature learning framework for large-scale salient object detection.
no code implementations • 19 Nov 2018 • Ziqi Zhou, Zheng Wang, Huchuan Lu, Song Wang, Meijun Sun
In this paper, based on the fact that salient areas in videos are relatively small and concentrated, we propose a key salient object re-augmentation method (KSORA), which uses top-down semantic knowledge and bottom-up feature guidance to improve detection accuracy in video scenes.
no code implementations • 18 Oct 2018 • Lijun Wang, Xiaohui Shen, Jianming Zhang, Oliver Wang, Zhe Lin, Chih-Yao Hsieh, Sarah Kong, Huchuan Lu
To achieve this, we propose a novel neural network model comprised of a depth prediction module, a lens blur module, and a guided upsampling module.
no code implementations • 28 Sep 2018 • Yunzhi Zhuge, Pingping Zhang, Huchuan Lu
Fully convolutional networks (FCNs) have significantly improved the performance of many pixel-labeling tasks, such as semantic segmentation and depth estimation.
3 code implementations • 12 Sep 2018 • Yunhua Zhang, Dong Wang, Lijun Wang, Jinqing Qi, Huchuan Lu
Compared with short-term tracking, the long-term tracking task requires determining whether the tracked object is present or absent, and then estimating an accurate bounding box if present or conducting image-wide re-detection if absent.
no code implementations • ECCV 2018 • Yunhua Zhang, Lijun Wang, Jinqing Qi, Dong Wang, Mengyang Feng, Huchuan Lu
In this paper, we circumvent this issue by proposing a local structure learning method, which simultaneously considers the local patterns of the target and their structural relationships for more accurate target tracking.
no code implementations • ECCV 2018 • Boyu Chen, Dong Wang, Peixia Li, Shuang Wang, Huchuan Lu
In this work, we propose a novel tracking algorithm with real-time performance based on the "Actor-Critic" framework.
no code implementations • 4 Aug 2018 • Pingping Zhang, Huchuan Lu, Chunhua Shen
In addition, our work has text overlap with arXiv:1804.06242 and arXiv:1705.00938 by other authors.
no code implementations • CVPR 2018 • Tiantian Wang, Lihe Zhang, Shuo Wang, Huchuan Lu, Gang Yang, Xiang Ruan, Ali Borji
Moreover, to effectively recover object boundaries, we propose a local Boundary Refinement Network (BRN) to adaptively learn the local contextual information for each spatial position.
Ranked #12 on RGB Salient Object Detection on DUTS-TE
no code implementations • CVPR 2018 • Wenda Zhao, Fan Zhao, Dong Wang, Huchuan Lu
To address these issues, we propose a multi-stream bottom-top-bottom fully convolutional network (BTBNet), which is the first attempt to develop an end-to-end deep network for DBD.
Ranked #2 on Defocus Estimation on CUHK - Blur Detection Dataset (MAE metric)
1 code implementation • CVPR 2018 • Xiaoning Zhang, Tiantian Wang, Jinqing Qi, Huchuan Lu, Gang Wang
In this paper, we propose a novel attention guided network which selectively integrates multi-level contextual information in a progressive manner.
Ranked #11 on RGB Salient Object Detection on DUTS-TE (max F-measure metric)
1 code implementation • CVPR 2018 • Yu Zeng, Huchuan Lu, Lihe Zhang, Mengyang Feng, Ali Borji
The categories and appearance of salient objects vary from image to image; therefore, saliency detection is an image-specific task.
no code implementations • CVPR 2018 • Lu Zhang, Ju Dai, Huchuan Lu, You He, Gang Wang
In this paper, we propose a novel bi-directional message passing model to integrate multi-level features for salient object detection.
Ranked #2 on RGB Salient Object Detection on ISTD
no code implementations • CVPR 2018 • Jinshan Pan, Sifei Liu, Deqing Sun, Jiawei Zhang, Yang Liu, Jimmy Ren, Zechao Li, Jinhui Tang, Huchuan Lu, Yu-Wing Tai, Ming-Hsuan Yang
These problems usually involve the estimation of two components of the target signals: structures and details.
1 code implementation • CVPR 2018 • Chong Sun, Dong Wang, Huchuan Lu, Ming-Hsuan Yang
To address this issue, we propose a novel CF-based optimization problem to jointly model the discrimination and reliability information.
no code implementations • 14 Apr 2018 • Pingping Zhang, Huchuan Lu, Chunhua Shen
Salient object detection (SOD), which aims to find the most important region of interest and segment the relevant object/item in that area, is an important yet challenging vision task.
no code implementations • 22 Feb 2018 • Pingping Zhang, Wei Liu, Dong Wang, Yinjie Lei, Hongyu Wang, Chunhua Shen, Huchuan Lu
Extensive experiments demonstrate that the proposed algorithm achieves competitive performance in both saliency detection and visual tracking, especially outperforming other related trackers on the non-rigid object tracking datasets.
no code implementations • 22 Feb 2018 • Ju Dai, Pingping Zhang, Huchuan Lu, Hongyu Wang
In this paper, we propose a novel feature learning framework for video person re-identification (re-ID).
no code implementations • 20 Feb 2018 • Fei Li, Pingping Zhang, Huchuan Lu
Band selection is a direct and effective method to remove redundant information and reduce the spectral dimension, thereby decreasing computational complexity and avoiding the curse of dimensionality.
no code implementations • 20 Feb 2018 • Pingping Zhang, Luyao Wang, Dong Wang, Huchuan Lu, Chunhua Shen
This paper proposes an Agile Aggregating Multi-Level feaTure framework (Agile Amulet) for salient object detection.
no code implementations • 19 Feb 2018 • Pingping Zhang, Wei Liu, Huchuan Lu, Chunhua Shen
Inspired by the intrinsic reflection of natural images, in this paper we propose a novel feature learning framework for large-scale salient object detection.
no code implementations • ICCV 2017 • Zimo Liu, Dong Wang, Huchuan Lu
The intensive annotation cost and the rich but unlabeled data contained in videos motivate us to propose an unsupervised video-based person re-identification (re-ID) method.
Ranked #7 on Person Re-Identification on PRID2011
1 code implementation • ICCV 2017 • Tiantian Wang, Ali Borji, Lihe Zhang, Pingping Zhang, Huchuan Lu
To remedy this problem, here we propose to augment feedforward neural networks with a novel pyramid pooling module and a multi-stage refinement mechanism for saliency detection.
Ranked #13 on RGB Salient Object Detection on DUTS-TE (max F-measure metric)
no code implementations • 9 Aug 2017 • Yu Zeng, Huchuan Lu, Ali Borji
Here, we explore the low-level statistics of images generated by state-of-the-art deep generative models.
no code implementations • 8 Aug 2017 • Yu Zeng, Huchuan Lu, Ali Borji, Mengyang Feng
Saliency maps are generated according to each region's strategy in the Nash equilibrium of the proposed Saliency Game.
1 code implementation • ICCV 2017 • Pingping Zhang, Dong Wang, Huchuan Lu, Hongyu Wang, Bao-Cai Yin
In this paper, we propose a novel deep fully convolutional network model for accurate salient object detection.
Ranked #5 on Saliency Detection on DUT-OMRON
1 code implementation • ICCV 2017 • Pingping Zhang, Dong Wang, Huchuan Lu, Hongyu Wang, Xiang Ruan
In addition, to achieve accurate boundary inference and semantic enhancement, edge-aware feature maps in low-level layers and the predicted results of low resolution features are recursively embedded into the learning framework.
Ranked #19 on RGB Salient Object Detection on DUTS-TE (max F-measure metric)
no code implementations • CVPR 2017 • Lijun Wang, Huchuan Lu, Yifan Wang, Mengyang Feng, Dong Wang, Bao-Cai Yin, Xiang Ruan
In the second stage, FIN is fine-tuned with its predicted saliency maps as ground truth.
1 code implementation • CVPR 2018 • Chong Sun, Dong Wang, Huchuan Lu, Ming-Hsuan Yang
Second, we propose a fully convolutional neural network with spatially regularized kernels, through which the filter kernel corresponding to each output channel is forced to focus on a specific region of the target.
Ranked #11 on Visual Object Tracking on VOT2017/18
8 code implementations • CVPR 2018 • Ying Zhang, Tao Xiang, Timothy M. Hospedales, Huchuan Lu
Model distillation is an effective and widely used technique to transfer knowledge from a teacher to a student network.
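This entry extends distillation to mutual learning between peer networks; as background, a minimal NumPy sketch of the classic teacher-to-student soft-label loss it builds on (names and the temperature value are illustrative, not the paper's own code):

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T yields softer distributions.
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    # KL divergence from the softened teacher distribution to the student's,
    # scaled by T^2 so gradient magnitudes stay comparable across temperatures.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)
    return (T ** 2) * kl.mean()

# Identical logits give (near-)zero loss; mismatched logits give positive loss.
same = distillation_loss(np.array([[2.0, 0.5]]), np.array([[2.0, 0.5]]))
diff = distillation_loss(np.array([[2.0, 0.5]]), np.array([[0.5, 2.0]]))
```

In the mutual-learning setting, each peer would add such a KL term against every other peer's predictions instead of a fixed teacher.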
1 code implementation • 26 May 2017 • Yao Qin, Mengyang Feng, Huchuan Lu, Garrison W. Cottrell
The CCA can act as an efficient pixel-wise aggregation algorithm that integrates state-of-the-art methods, yielding even better performance.
no code implementations • 26 Jan 2017 • Liang Zheng, Yujia Huang, Huchuan Lu, Yi Yang
Second, to reduce the impact of pose estimation errors and information loss during PoseBox construction, we design a PoseBox fusion (PBF) CNN architecture that takes the original image, the PoseBox, and the pose estimation confidence as input.
1 code implementation • 19 Dec 2016 • Zhizhen Chi, Hongyang Li, Huchuan Lu, Ming-Hsuan Yang
In this paper, we propose a dual network to better utilize features among layers for visual tracking.
no code implementations • 27 Jul 2016 • Bohan Zhuang, Lijun Wang, Huchuan Lu
In the discriminative model, we exploit the advances of deep learning architectures to learn generic features which are robust to both background clutters and foreground appearance variations.
no code implementations • CVPR 2016 • Lijun Wang, Wanli Ouyang, Xiaogang Wang, Huchuan Lu
To further improve the robustness of each base learner, we propose to train the convolutional layers with random binary masks, which serves as a regularization to enforce each base learner to focus on different input features.
no code implementations • CVPR 2016 • Ying Zhang, Baohua Li, Huchuan Lu, Atsushi Irie, Xiang Ruan
Person re-identification addresses the problem of matching people across disjoint camera views, and extensive efforts have been made to seek either robust feature representations or discriminative matching metrics.
no code implementations • 6 Dec 2015 • Mengyang Feng, Ali Borji, Huchuan Lu
By predicting where humans look in natural scenes, we can understand how they perceive complex natural scenes and prioritize information for further high-level visual processing.
no code implementations • ICCV 2015 • Lijun Wang, Wanli Ouyang, Xiaogang Wang, Huchuan Lu
Instead of treating convolutional neural network (CNN) as a black-box feature extractor, we conduct in-depth study on the properties of CNN features offline pre-trained on massive image data and classification task on ImageNet.
no code implementations • 17 Aug 2015 • Hongyang Li, Huchuan Lu, Zhe Lin, Xiaohui Shen, Brian Price
In this paper, we propose a novel deep neural network framework embedded with low-level features (LCNN) for salient object detection in complex images.
no code implementations • CVPR 2015 • Yao Qin, Huchuan Lu, Yiqun Xu, He Wang
In this paper, we introduce Cellular Automata, a dynamic evolution model, to intuitively detect the salient object.
no code implementations • CVPR 2015 • Baohua Li, Ying Zhang, Zhouchen Lin, Huchuan Lu
Therefore, we propose Mixture of Gaussian Regression (MoG Regression) for subspace clustering by modeling noise as a Mixture of Gaussians (MoG).
no code implementations • CVPR 2015 • Na Tong, Huchuan Lu, Xiang Ruan, Ming-Hsuan Yang
Furthermore, we show that the proposed bootstrap learning approach can be easily applied to other bottom-up saliency models for significant improvement.
no code implementations • CVPR 2015 • Lijun Wang, Huchuan Lu, Xiang Ruan, Ming-Hsuan Yang
In the global search stage, the local saliency map together with global contrast and geometric information are used as global features to describe a set of object candidate regions.
2 code implementations • 27 May 2015 • Hongyang Li, Huchuan Lu, Zhe Lin, Xiaohui Shen, Brian Price
For most natural images, some boundary superpixels serve as the background labels, and the saliency of other superpixels is determined by ranking their similarities to the boundary labels based on an inner propagation scheme.
no code implementations • 14 May 2015 • Ali Borji, Mengyang Feng, Huchuan Lu
Eye movements are crucial in understanding complex scenes.
no code implementations • CVPR 2014 • Dong Wang, Huchuan Lu
In this paper, we present a novel online visual tracking method based on linear representation.
no code implementations • CVPR 2013 • Dong Wang, Huchuan Lu, Ming-Hsuan Yang
In this paper, we propose a generative tracking method based on a novel robust linear regression algorithm.
no code implementations • CVPR 2013 • Chuan Yang, Lihe Zhang, Huchuan Lu, Xiang Ruan, Ming-Hsuan Yang
The saliency of the image elements is defined based on their relevances to the given seeds or queries.
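The seed-based ranking described in this last entry has a well-known closed form, f* = (D - alpha * W)^{-1} y; a minimal sketch (the affinity matrix W and seed vector y below are illustrative assumptions, not the paper's data):

```python
import numpy as np

def manifold_ranking(W, y, alpha=0.99):
    # Rank all graph nodes by relevance to the seeds marked in y using the
    # closed form f = (D - alpha * W)^{-1} y, where D is the degree matrix
    # of the affinity matrix W.
    D = np.diag(W.sum(axis=1))
    return np.linalg.solve(D - alpha * W, y)

# Three-node chain graph with a seed on node 0: relevance decays with distance.
W = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
f = manifold_ranking(W, np.array([1., 0., 0.]))
```

In the saliency setting of this entry, the nodes would be superpixels and the queries the seeds y; each superpixel's saliency is its ranking score.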