no code implementations • 10 May 2025 • Chunming He, Rihan Zhang, Fengyang Xiao, Chengyu Fang, Longxiang Tang, Yulun Zhang, Sina Farsiu
Deep unfolding networks (DUNs) are widely employed in illumination degradation image restoration (IDIR) to merge the interpretability of model-based approaches with the generalization of learning-based methods.
1 code implementation • 15 Apr 2025 • Xiang Wang, Shiwei Zhang, Longxiang Tang, Yingya Zhang, Changxin Gao, Yuehuan Wang, Nong Sang
Furthermore, we adopt a simple concatenation operation to integrate the reference appearance into the model and incorporate the pose information of the reference image for enhanced pose alignment.
1 code implementation • 20 Mar 2025 • Zhihang Liu, Chen-Wei Xie, Pandeng Li, Liming Zhao, Longxiang Tang, Yun Zheng, Chuanbin Liu, Hongtao Xie
Specifically, the instruction condition is injected into the grouped visual tokens at the local level and the learnable tokens at the global level, and we conduct the attention mechanism to complete the conditional compression.
1 code implementation • 16 Mar 2025 • Tianyuan Qu, Longxiang Tang, Bohao Peng, Senqiao Yang, Bei Yu, Jiaya Jia
Together, our LSDBench and RHS framework address the unique challenges of high-NSD long-video tasks, setting a new standard for evaluating and improving LVLMs in this domain.
no code implementations • 10 Mar 2025 • Yujie Wei, Shiwei Zhang, Hangjie Yuan, Biao Gong, Longxiang Tang, Xiang Wang, Haonan Qiu, Hengjia Li, Shuai Tan, Yingya Zhang, Hongming Shan
First, in Relational Decoupling Learning, we disentangle relations from subject appearances using relation LoRA triplet and hybrid mask training strategy, ensuring better generalization across diverse relationships.
1 code implementation • 9 Mar 2025 • Hantao Zhou, Rui Yang, Longxiang Tang, Guanyi Qin, Yan Zhang, Runze Hu, Xiu Li
Image assessment aims to evaluate the quality and aesthetics of images and has been applied across various scenarios, such as natural and AIGC scenes.
1 code implementation • 20 Feb 2025 • Chengyu Fang, Chunming He, Longxiang Tang, Yuelin Zhang, Chenyang Zhu, Yuqi Shen, Chubin Chen, Guoxia Xu, Xiu Li
UniLearner exploits multimodal data unrelated to the COS task to improve the segmentation ability of the COS models by generating pseudo-modal content and cross-modal semantic associations.
no code implementations • 30 Jan 2025 • Chunming He, Rihan Zhang, Fengyang Xiao, Chenyu Fang, Longxiang Tang, Yulun Zhang, Linghe Kong, Deng-Ping Fan, Kai Li, Sina Farsiu
To address this, we propose the Reversible Unfolding Network (RUN), which applies reversible strategies across both mask and RGB domains through a theoretically grounded framework, enabling accurate segmentation.
1 code implementation • 12 Dec 2024 • Zhisheng Zhong, Chengyao Wang, Yuqi Liu, Senqiao Yang, Longxiang Tang, Yuechen Zhang, Jingyao Li, Tianyuan Qu, Yanwei Li, Yukang Chen, Shaozuo Yu, Sitong Wu, Eric Lo, Shu Liu, Jiaya Jia
As Multi-modal Large Language Models (MLLMs) evolve, expanding beyond single-domain capabilities is essential to meet the demands for more versatile and efficient AI.
Ranked #1 on
Visual Question Answering (VQA)
on EgoSchema
1 code implementation • 2 Dec 2024 • Chenyang Zhu, Kai Li, Yue Ma, Longxiang Tang, Chengyu Fang, Chubin Chen, Qifeng Chen, Xiu Li
They struggle to maintain consistency in both the foreground and background during concept swapping, especially when the shape difference is large between objects.
1 code implementation • 26 Aug 2024 • Fengyang Xiao, Sujie Hu, Yuqi Shen, Chengyu Fang, Jinfa Huang, Chunming He, Longxiang Tang, Ziyun Yang, Xiu Li
Camouflaged Object Detection (COD) refers to the task of identifying and segmenting objects that blend seamlessly into their surroundings, posing a significant challenge for computer vision systems.
1 code implementation • 7 Jul 2024 • Longxiang Tang, Zhuotao Tian, Kai Li, Chunming He, Hantao Zhou, Hengshuang Zhao, Xiu Li, Jiaya Jia
To address this problem efficiently, we propose the Distribution-aware Interference-free Knowledge Integration (DIKI) framework, retaining pre-trained knowledge of VLMs from a perspective of avoiding information interference.
1 code implementation • 17 Jun 2024 • Chunming He, Yuqi Shen, Chengyu Fang, Fengyang Xiao, Longxiang Tang, Yulun Zhang, WangMeng Zuo, Zhenhua Guo, Xiu Li
Deep generative models have garnered significant attention in low-level vision tasks due to their generative capabilities.
1 code implementation • 12 Jun 2024 • Chengyu Fang, Chunming He, Fengyang Xiao, Yulun Zhang, Longxiang Tang, Yuelin Zhang, Kai Li, Xiu Li
Real-world Image Dehazing (RID) aims to alleviate haze-induced degradation in real-world settings.
1 code implementation • 3 Jun 2024 • Hantao Zhou, Longxiang Tang, Rui Yang, Guanyi Qin, Yan Zhang, Runze Hu, Xiu Li
Image Quality Assessment (IQA) and Image Aesthetic Assessment (IAA) aim to simulate human subjective perception of image visual quality and aesthetic appeal.
1 code implementation • 20 Nov 2023 • Chunming He, Chengyu Fang, Yulun Zhang, Tian Ye, Kai Li, Longxiang Tang, Zhenhua Guo, Xiu Li, Sina Farsiu
These priors are subsequently utilized by RGformer to guide the decomposition of image features into their respective reflectance and illumination components.
no code implementations • 3 Aug 2023 • Longxiang Tang, Kai Li, Chunming He, Yulun Zhang, Xiu Li
In this paper, we propose a consistency regularization framework to develop a more generalizable SFDA method, which simultaneously boosts model performance on both target training and testing datasets.
no code implementations • 15 Jul 2023 • Chunming He, Kai Li, Guoxia Xu, Jiangpeng Yan, Longxiang Tang, Yulun Zhang, Xiu Li, YaoWei Wang
Specifically, we extract features from an HQ image and explicitly insert the features, which are expected to encode HQ cues, into the enhancement network to guide the LQ enhancement with the variational normalization module.
1 code implementation • 14 Jul 2023 • Longxiang Tang, Kai Li, Chunming He, Yulun Zhang, Xiu Li
This paper aims to address these two issues by proposing the Class-Balanced Mean Teacher (CBMT) model.
no code implementations • NeurIPS 2023 • Chunming He, Kai Li, Yachao Zhang, Guoxia Xu, Longxiang Tang, Yulun Zhang, Zhenhua Guo, Xiu Li
It remains a challenging task since (1) it is hard to distinguish concealed objects from the background due to the intrinsic similarity and (2) the sparsely-annotated training data only provide weak supervision for model learning.
no code implementations • CVPR 2023 • Chunming He, Kai Li, Yachao Zhang, Longxiang Tang, Yulun Zhang, Zhenhua Guo, Xiu Li
COD is a challenging task due to the intrinsic similarity of camouflaged objects with the background, as well as their ambiguous boundaries.