1 code implementation • 5 Jan 2025 • Ziyang Song, Zerong Wang, Bo Li, Hao Zhang, Ruijie Zhu, Li Liu, Peng-Tao Jiang, Tianzhu Zhang
First, to mitigate overfitting to texture details introduced by generative features, we propose a Feature Alignment module, which incorporates high-quality semantic features to enhance the denoising network's representation capability.
Ranked #2 on
Monocular Depth Estimation
on ETH3D
no code implementations • 2 Dec 2024 • Qirui Yang, Yinbo Li, Peng-Tao Jiang, Qihua Cheng, Biting Yu, Yihao Liu, Huanjing Yue, Jingyu Yang
Previous tone mapping methods mainly focus on how to enhance tones in low-resolution images and recover details using the high-frequent components extracted from the input image.
no code implementations • 2 Dec 2024 • Qirui Yang, Peng-Tao Jiang, Hao Zhang, Jinwei Chen, Bo Li, Huanjing Yue, Jingyu Yang
Specifically, we introduce the color-separated features that emphasize the light difference of different color channels and combine them with the traditional color-mixed features by Light Guided Attention (LGA).
1 code implementation • 29 Oct 2024 • Ruihao Xia, Yu Liang, Peng-Tao Jiang, Hao Zhang, Bo Li, Yang Tang, Pan Zhou
To address this issue, we propose Modality Adaptation with text-to-image Diffusion Models (MADM) for semantic segmentation task which utilizes text-to-image diffusion models pre-trained on extensive image-text pairs to enhance the model's cross-modality capabilities.
no code implementations • 18 Oct 2024 • Yuhao Wan, Peng-Tao Jiang, Qibin Hou, Hao Zhang, Jinwei Chen, Ming-Ming Cheng, Bo Li
We show that the proper use of latent LR embeddings can produce higher-quality control signals, which enables the super-resolution results to be more consistent with the LR image and leads to clearer visual results.
no code implementations • 17 Oct 2024 • Junhao Gu, Peng-Tao Jiang, Hao Zhang, Mi Zhou, Jinwei Chen, Wenming Yang, Bo Li
To address this challenge, we introduce ConsisSR to handle both semantic and pixel-level consistency.
no code implementations • 14 Oct 2024 • Qian Yu, Peng-Tao Jiang, Hao Zhang, Jinwei Chen, Bo Li, Lihe Zhang, Huchuan Lu
In the realm of high-resolution (HR), fine-grained image segmentation, the primary challenge is balancing broad contextual awareness with the precision required for detailed object delineation, capturing intricate details and the finest edges of objects.
1 code implementation • 9 Oct 2024 • Ruihao Xia, Yu Liang, Peng-Tao Jiang, Hao Zhang, Qianru Sun, Yang Tang, Bo Li, Pan Zhou
For training objectives, the proposed regularization and trimap loss aim to retain the prior from the pre-trained model and push the matting logits extracted from the mask decoder to contain trimap-based semantic information.
no code implementations • 23 May 2024 • Lv Tang, Haoke Xiao, Peng-Tao Jiang, Hao Zhang, Jinwei Chen, Bo Li
To address this challenge, State Space Models (SSMs) like Mamba have emerged as efficient alternatives, initially matching Transformer performance in NLP tasks and later surpassing Vision Transformers (ViTs) in various CV tasks.
1 code implementation • CVPR 2024 • YuQi Yang, Peng-Tao Jiang, Qibin Hou, Hao Zhang, Jinwei Chen, Bo Li
Furthermore, to control the parameters and computational cost brought by the increase in the number of experts, we take inspiration from LoRA and propose to leverage the low-rank format of a vanilla convolution in the expert network.
no code implementations • 21 Mar 2024 • YuQi Yang, Peng-Tao Jiang, Jing Wang, Hao Zhang, Kai Zhao, Jinwei Chen, Bo Li
Multi-modal large language models (MLLMs) can understand image-language prompts and demonstrate impressive reasoning ability.
no code implementations • 4 Mar 2024 • Cong Geng, Tian Han, Peng-Tao Jiang, Hao Zhang, Jinwei Chen, Søren Hauberg, Bo Li
Generative models have shown strong generation ability while efficient likelihood estimation is less explored.
no code implementations • 8 Dec 2023 • Xi Wang, Xueyang Fu, Peng-Tao Jiang, Jie Huang, Mi Zhou, Bo Li, Zheng-Jun Zha
The former facilitates channel-dependent degradation removal operation, allowing the network to tailor responses to various adverse weather types; the latter, by integrating Fourier's global properties into channel-independent content features, enhances network capacity for consistent global content reconstruction.
1 code implementation • CVPR 2024 • Yurui Zhu, Xueyang Fu, Peng-Tao Jiang, Hao Zhang, Qibin Sun, Jinwei Chen, Zheng-Jun Zha, Bo Li
This research focuses on the issue of single-image reflection removal (SIRR) in real-world conditions, examining it from two angles: the collection pipeline of real reflection pairs and the perception of real reflection locations.
1 code implementation • 19 Nov 2023 • Lv Tang, Peng-Tao Jiang, Zhihao Shen, Hao Zhang, Jinwei Chen, Bo Li
In this paper, we introduce a novel multimodal camo-perceptive framework (MMCPF) aimed at handling zero-shot Camouflaged Object Detection (COD) by leveraging the powerful capabilities of Multimodal Large Language Models (MLLMs).
1 code implementation • 17 Oct 2023 • Lv Tang, Peng-Tao Jiang, Hao-Ke Xiao, Bo Li
The realm of computer vision has witnessed a paradigm shift with the advent of foundational models, mirroring the transformative influence of large language models in the domain of natural language processing.
no code implementations • 6 Jun 2023 • Yanwen Fang, Jintai Chen, Peng-Tao Jiang, Chao Li, Yifeng Geng, Eddy K. F. LAM, Guodong Li
Multi-person motion prediction is a challenging task, especially for real-world scenarios of highly interacted persons.
no code implementations • 2 May 2023 • Peng-Tao Jiang, YuQi Yang
Weakly supervised semantic segmentation with weak labels is a long-lived ill-posed problem.
1 code implementation • CVPR 2023 • Jiaxiong Qiu, Peng-Tao Jiang, Yifan Zhu, Ze-Xin Yin, Ming-Ming Cheng, Bo Ren
To remedy this issue, we present a novel surface reconstruction framework, NeuS-HSR, based on implicit neural rendering.
1 code implementation • CVPR 2024 • Peng-Tao Jiang, YuQi Yang, Yang Cao, Qibin Hou, Ming-Ming Cheng, Chunhua Shen
To date, most existing datasets focus on autonomous driving scenes.
1 code implementation • CVPR 2022 • Peng-Tao Jiang, YuQi Yang, Qibin Hou, Yunchao Wei
Our framework conducts the global network to learn the captured rich object detail knowledge from a global view and thereby produces high-quality attention maps that can be directly used as pseudo annotations for semantic segmentation networks.
Ranked #17 on
Weakly-Supervised Semantic Segmentation
on PASCAL VOC 2012 test
(using extra training data)
no code implementations • 23 Jan 2022 • Ming-Ming Cheng, Peng-Tao Jiang, Ling-Hao Han, Liang Wang, Philip Torr
The proposed framework can generate a deep hierarchy of strongly associated supporting evidence for the network decision, which provides insight into the decision-making process.
1 code implementation • 15 Nov 2021 • Meng-Hao Guo, Tian-Xing Xu, Jiang-Jiang Liu, Zheng-Ning Liu, Peng-Tao Jiang, Tai-Jiang Mu, Song-Hai Zhang, Ralph R. Martin, Ming-Ming Cheng, Shi-Min Hu
Humans can naturally and effectively find salient regions in complex scenes.
1 code implementation • ICCV 2021 • Yu Zhang, Chang-Bin Zhang, Peng-Tao Jiang, Ming-Ming Cheng, Feng Mao
In this paper, we address the problem of personalized image segmentation.
3 code implementations • IEEE 2021 • Peng-Tao Jiang, Chang-Bin Zhang, Qibin Hou, Ming-Ming Cheng, Yunchao Wei
To evaluate the quality of the class activation maps produced by LayerCAM, we apply them to weakly-supervised object localization and semantic segmentation.
2 code implementations • 25 Nov 2020 • Chang-Bin Zhang, Peng-Tao Jiang, Qibin Hou, Yunchao Wei, Qi Han, Zhen Li, Ming-Ming Cheng
Experiments demonstrate that based on the same classification models, the proposed approach can effectively improve the classification performance on CIFAR-100, ImageNet, and fine-grained datasets.
2 code implementations • ICCV 2019 • Peng-Tao Jiang, Qibin Hou, Yang Cao, Ming-Ming Cheng, Yunchao Wei, Hong-Kai Xiong
In order to accumulate the discovered different object parts, we propose an online attention accumulation (OAA) strategy which maintains a cumulative attention map for each target category in each training image so that the integral object regions can be gradually promoted as the training goes.
no code implementations • NeurIPS 2018 • Qibin Hou, Peng-Tao Jiang, Yunchao Wei, Ming-Ming Cheng
To test the quality of the generated attention maps, we employ the mined object regions as heuristic cues for learning semantic segmentation models.