1 code implementation • 9 Mar 2025 • Chaocan Xue, Bineng Zhong, Qihua Liang, Yaozong Zheng, Ning Li, Yuanliang Xue, Shuxiang Song
Based on this observation, we propose a similarity-guided layer adaptation approach to optimize the structure of ViTs.
1 code implementation • 9 Mar 2025 • Xiaohai Li, Bineng Zhong, Qihua Liang, Zhiyi Mo, Jian Nong, Shuxiang Song
Finally, the dynamic template and language descriptions that record the latest state of the target are used to update the multi-modal references, providing more accurate reference information for subsequent inference and enhancing the robustness of the tracker.
no code implementations • 10 Feb 2025 • Xiantao Hu, Bineng Zhong, Qihua Liang, Zhiyi Mo, Liangtao Shi, Ying Tai, Jian Yang
This strategy allows the model to dynamically adapt to various modalities and tasks without requiring additional fine-tuning between different tasks.
Ranked #6 on RGB-T Tracking on RGBT234
1 code implementation • 1 Jan 2025 • Chenlong Xu, Bineng Zhong, Qihua Liang, Yaozong Zheng, Guorong Li, Shuxiang Song
Recently, several studies have shown that utilizing contextual information to perceive target states is crucial for object tracking.
1 code implementation • 20 Dec 2024 • Xiantao Hu, Ying Tai, Xu Zhao, Chen Zhao, Zhenyu Zhang, Jun Li, Bineng Zhong, Jian Yang
These temporal information tokens are used to guide the localization of the target in the next time state, establish long-range contextual relationships between video frames, and capture the temporal trajectory of the target.
Ranked #4 on RGB-T Tracking on LasHeR
1 code implementation • 18 Dec 2024 • Jinxia Xie, Bineng Zhong, Qihua Liang, Ning Li, Zhiyi Mo, Shuxiang Song
To alleviate the above issues, we propose a simple yet robust tracker that separates temporal information learning from appearance modeling and extracts temporal relations from a set of representative tokens rather than several images (or features).
1 code implementation • 18 Dec 2024 • Xiaohai Li, Bineng Zhong, Qihua Liang, Guorong Li, Zhiyi Mo, Shuxiang Song
To address this issue, we propose MambaLCT, which constructs and utilizes target variation cues from the first frame to the current frame for robust tracking.
no code implementations • 15 Mar 2024 • Jinxia Xie, Bineng Zhong, Zhiyi Mo, Shengping Zhang, Liangtao Shi, Shuxiang Song, Rongrong Ji
Firstly, we introduce a set of learnable and autoregressive queries to capture the instantaneous target appearance changes in a sliding window fashion.
Ranked #8 on Visual Object Tracking on DiDi
1 code implementation • 3 Mar 2024 • Qinglin Liu, Shengping Zhang, Quanling Meng, Bineng Zhong, Peiqiang Liu, Hongxun Yao
Finally, an instance matting network decodes the image features and united semantics guidance to predict all instance-level alpha mattes.
1 code implementation • 6 Jan 2024 • Liangtao Shi, Bineng Zhong, Qihua Liang, Ning Li, Shengping Zhang, Xianxian Li
Specifically, we utilize spatio-temporal tokens to propagate information between consecutive frames without focusing on updating templates.
1 code implementation • 3 Jan 2024 • Yaozong Zheng, Bineng Zhong, Qihua Liang, Zhiyi Mo, Shengping Zhang, Xianxian Li
To alleviate the above problem, we propose a simple, flexible and effective video-level tracking pipeline, named ODTrack, which densely associates the contextual relationships of video frames in an online token propagation manner.
Ranked #1 on Visual Object Tracking on OTB-2015
Semi-Supervised Video Object Segmentation • Video Object Tracking
no code implementations • CVPR 2024 • Jinxia Xie, Bineng Zhong, Zhiyi Mo, Shengping Zhang, Liangtao Shi, Shuxiang Song, Rongrong Ji
Firstly, we introduce a set of learnable and autoregressive queries to capture the instantaneous target appearance changes in a sliding window fashion.
no code implementations • CVPR 2024 • Chenyang Wang, Zerong Zheng, Tao Yu, Xiaoqian Lv, Bineng Zhong, Shengping Zhang, Liqiang Nie
In this paper, we propose a novel framework, DiffPerformer, to synthesize high-fidelity and temporally consistent human video.
1 code implementation • CVPR 2024 • Xiaoqian Lv, Shengping Zhang, Chenyang Wang, Yichen Zheng, Bineng Zhong, Chongyi Li, Liqiang Nie
Existing joint low-light enhancement and deblurring methods learn pixel-wise mappings from paired synthetic data, which results in limited generalization in real-world scenes.
1 code implementation • 27 Aug 2023 • Yaozong Zheng, Bineng Zhong, Qihua Liang, Guorong Li, Rongrong Ji, Xianxian Li
In this paper, we present a simple, flexible and effective vision-language (VL) tracking pipeline, termed MMTrack, which casts VL tracking as a token generation task.
no code implementations • 11 Mar 2022 • Jie Ma, Yalong Bai, Bineng Zhong, Wei Zhang, Ting Yao, Tao Mei
Vision Transformer (ViT) has become a leading tool in various computer vision tasks, owing to its unique self-attention mechanism that learns visual representations explicitly through cross-patch information interactions.
1 code implementation • 25 Sep 2021 • Qinglin Liu, Haozhe Xie, Shengping Zhang, Bineng Zhong, Rongrong Ji
Finally, we use the matting module which takes the image, trimap and context features to estimate the alpha matte.
Ranked #6 on Image Matting on Composition-1K (using extra training data)
1 code implementation • CVPR 2021 • Qiong Wu, Pingyang Dai, Jie Chen, Chia-Wen Lin, Yongjian Wu, Feiyue Huang, Bineng Zhong, Rongrong Ji
In this paper, we propose a joint Modality and Pattern Alignment Network (MPANet) to discover cross-modality nuances in different patterns for visible-infrared person Re-ID, which introduces a modality alleviation module and a pattern alignment module to jointly extract discriminative features.
1 code implementation • CVPR 2021 • Zikai Zhang, Bineng Zhong, Shengping Zhang, Zhenjun Tang, Xin Liu, Zhaoxiang Zhang
A practical long-term tracker typically has three key properties, i.e., an efficient model design, an effective global re-detection strategy and a robust distractor awareness mechanism.
1 code implementation • CVPR 2021 • Siyuan Cheng, Bineng Zhong, Guorong Li, Xin Liu, Zhenjun Tang, Xianxian Li, Jing Wang
RD is trained in a meta-learning manner to learn to filter distractors from the background, while RM aims to effectively integrate the proposed RD into the Siamese framework to generate accurate tracking results.
no code implementations • ICCV 2021 • Qinqin Zhou, Xiawu Zheng, Liujuan Cao, Bineng Zhong, Teng Xi, Gang Zhang, Errui Ding, Mingliang Xu, Rongrong Ji
EC-DARTS decouples different operations based on their categories to optimize the operation weights so that the operation gap between them is reduced.
1 code implementation • CVPR 2020 • Jie Li, Rongrong Ji, Hong Liu, Jianzhuang Liu, Bineng Zhong, Cheng Deng, Qi Tian
For reducing the solution space, we first model the adversarial perturbation optimization problem as a process of recovering frequency-sparse perturbations with compressed sensing, under the setting that random noise in the low-frequency space is more likely to be adversarial.
no code implementations • CVPR 2020 • Jiahua Dong, Yang Cong, Gan Sun, Bineng Zhong, Xiaowei Xu
Unsupervised domain adaptation has attracted growing research attention on semantic segmentation.
2 code implementations • CVPR 2020 • Zedu Chen, Bineng Zhong, Guorong Li, Shengping Zhang, Rongrong Ji
Most existing trackers rely on either a multi-scale searching scheme or pre-defined anchor boxes to accurately estimate the scale and aspect ratio of a target.
2 code implementations • ICLR 2019 • Yulun Zhang, Kunpeng Li, Kai Li, Bineng Zhong, Yun Fu
To address this issue, we design local and non-local attention blocks to extract features that capture the long-range dependencies between pixels and pay more attention to the challenging parts.
no code implementations • 6 Mar 2019 • Gan Sun, Yang Cong, Qianqian Wang, Bineng Zhong, Yun Fu
Consider the lifelong machine learning paradigm whose objective is to learn a sequence of tasks depending on previous experiences, e.g., knowledge library or deep network weights.
3 code implementations • 25 Dec 2018 • Yulun Zhang, Yapeng Tian, Yu Kong, Bineng Zhong, Yun Fu
We fully exploit the hierarchical features from all the convolutional layers.
Ranked #1 on Color Image Denoising on Kodak24 sigma30
20 code implementations • ECCV 2018 • Yulun Zhang, Kunpeng Li, Kai Li, Lichen Wang, Bineng Zhong, Yun Fu
To solve these problems, we propose the very deep residual channel attention networks (RCAN).
Ranked #21 on Image Super-Resolution on BSD100 - 4x upscaling
no code implementations • ICLR 2019 • Jun Li, Hongfu Liu, Bineng Zhong, Yue Wu, Yun Fu
To address this problem, we propose a simple yet effective method for improving stochastic gradient methods named predictive local smoothness (PLS).
16 code implementations • CVPR 2018 • Yulun Zhang, Yapeng Tian, Yu Kong, Bineng Zhong, Yun Fu
In this paper, we propose a novel residual dense network (RDN) to address this problem in image SR. We fully exploit the hierarchical features from all the convolutional layers.
Ranked #5 on Image Super-Resolution on IXI
no code implementations • 25 Mar 2016 • Yan Yan, Hanzi Wang, Cuihua Li, Chenhui Yang, Bineng Zhong
In this paper, an effective unconstrained correlation filter called Unconstrained Optimal Origin Tradeoff Filter (UOOTF) is presented and applied to robust face recognition.
no code implementations • CVPR 2015 • Xian-Ming Liu, Rongrong Ji, Changhu Wang, Wei Liu, Bineng Zhong, Thomas S. Huang
A hierarchical shape parsing strategy is proposed to partition and organize image components into a hierarchical structure in the scale space.