Search Results for author: Peng-Tao Jiang

Found 28 papers, 15 papers with code

DepthMaster: Taming Diffusion Models for Monocular Depth Estimation

1 code implementation • 5 Jan 2025 • Ziyang Song, Zerong Wang, Bo Li, Hao Zhang, Ruijie Zhu, Li Liu, Peng-Tao Jiang, Tianzhu Zhang

First, to mitigate overfitting to texture details introduced by generative features, we propose a Feature Alignment module, which incorporates high-quality semantic features to enhance the denoising network's representation capability.
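
A minimal sketch of what such a feature-alignment objective could look like, assuming the denoising network's intermediate features are projected and pulled toward features from a frozen external semantic encoder with a cosine objective; the module name, projection, and loss choice below are illustrative assumptions, not the authors' implementation:

import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureAlignment(nn.Module):
    """Hypothetical alignment of denoising-network features to semantic features."""
    def __init__(self, unet_dim: int, semantic_dim: int):
        super().__init__()
        self.proj = nn.Conv2d(unet_dim, semantic_dim, kernel_size=1)

    def forward(self, unet_feat: torch.Tensor, semantic_feat: torch.Tensor) -> torch.Tensor:
        # unet_feat:     (B, unet_dim, H, W) features from the denoising network
        # semantic_feat: (B, semantic_dim, h, w) features from a frozen semantic encoder
        x = self.proj(unet_feat)
        x = F.interpolate(x, size=semantic_feat.shape[-2:], mode="bilinear", align_corners=False)
        cos = F.cosine_similarity(x, semantic_feat, dim=1)  # per-location similarity
        return (1.0 - cos).mean()                           # alignment loss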

Denoising · Monocular Depth Estimation

Learning Differential Pyramid Representation for Tone Mapping

no code implementations • 2 Dec 2024 • Qirui Yang, Yinbo Li, Peng-Tao Jiang, Qihua Cheng, Biting Yu, Yihao Liu, Huanjing Yue, Jingyu Yang

Previous tone mapping methods mainly focus on how to enhance tones in low-resolution images and recover details using the high-frequency components extracted from the input image.

Tone Mapping

Learning Adaptive Lighting via Channel-Aware Guidance

no code implementations • 2 Dec 2024 • Qirui Yang, Peng-Tao Jiang, Hao Zhang, Jinwei Chen, Bo Li, Huanjing Yue, Jingyu Yang

Specifically, we introduce the color-separated features that emphasize the light difference of different color channels and combine them with the traditional color-mixed features by Light Guided Attention (LGA).
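
As a rough illustration only (not the authors' code), one way such a fusion could be written is a cross-attention block in which the color-mixed features provide queries and the color-separated features provide keys and values; all shapes, layer choices, and the residual connection are assumptions:

import torch
import torch.nn as nn

class LightGuidedAttention(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.to_q = nn.Conv2d(dim, dim, 1)  # queries from color-mixed features
        self.to_k = nn.Conv2d(dim, dim, 1)  # keys from color-separated features
        self.to_v = nn.Conv2d(dim, dim, 1)  # values from color-separated features
        self.out = nn.Conv2d(dim, dim, 1)

    def forward(self, mixed: torch.Tensor, separated: torch.Tensor) -> torch.Tensor:
        b, c, h, w = mixed.shape
        q = self.to_q(mixed).flatten(2)                                 # (B, C, HW)
        k = self.to_k(separated).flatten(2)
        v = self.to_v(separated).flatten(2)
        attn = torch.softmax(q.transpose(1, 2) @ k / c ** 0.5, dim=-1)  # (B, HW, HW)
        fused = (attn @ v.transpose(1, 2)).transpose(1, 2).reshape(b, c, h, w)
        return self.out(fused) + mixed                                  # residual fusion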

Exposure Correction · Image Retouching

Unsupervised Modality Adaptation with Text-to-Image Diffusion Models for Semantic Segmentation

1 code implementation • 29 Oct 2024 • Ruihao Xia, Yu Liang, Peng-Tao Jiang, Hao Zhang, Bo Li, Yang Tang, Pan Zhou

To address this issue, we propose Modality Adaptation with text-to-image Diffusion Models (MADM) for the semantic segmentation task, which utilizes text-to-image diffusion models pre-trained on extensive image-text pairs to enhance the model's cross-modality capabilities.

Pseudo Label · Semantic Segmentation +1

ClearSR: Latent Low-Resolution Image Embeddings Help Diffusion-Based Real-World Super Resolution Models See Clearer

no code implementations • 18 Oct 2024 • Yuhao Wan, Peng-Tao Jiang, Qibin Hou, Hao Zhang, Jinwei Chen, Ming-Ming Cheng, Bo Li

We show that the proper use of latent LR embeddings can produce higher-quality control signals, which enables the super-resolution results to be more consistent with the LR image and leads to clearer visual results.

Image Super-Resolution

High-Precision Dichotomous Image Segmentation via Probing Diffusion Capacity

no code implementations • 14 Oct 2024 • Qian Yu, Peng-Tao Jiang, Hao Zhang, Jinwei Chen, Bo Li, Lihe Zhang, Huchuan Lu

In high-resolution (HR), fine-grained image segmentation, the primary challenge is balancing broad contextual awareness with the precision required for detailed object delineation, capturing intricate structures and the finest edges of objects.

Denoising · Dichotomous Image Segmentation +5

Towards Natural Image Matting in the Wild via Real-Scenario Prior

1 code implementation • 9 Oct 2024 • Ruihao Xia, Yu Liang, Peng-Tao Jiang, Hao Zhang, Qianru Sun, Yang Tang, Bo Li, Pan Zhou

For training objectives, the proposed regularization and trimap loss aim to retain the prior from the pre-trained model and push the matting logits extracted from the mask decoder to contain trimap-based semantic information.
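
A hedged sketch of a trimap-style objective: derive a three-class target (background / unknown / foreground) from the ground-truth alpha matte and supervise three-channel logits with cross-entropy. The thresholds and tensor layout are assumptions for illustration, not the paper's exact loss:

import torch
import torch.nn.functional as F

def trimap_loss(logits: torch.Tensor, alpha: torch.Tensor, eps: float = 1e-3) -> torch.Tensor:
    # logits: (B, 3, H, W) trimap logits; alpha: (B, 1, H, W) ground-truth matte in [0, 1]
    a = alpha.squeeze(1)
    target = torch.full_like(a, 1, dtype=torch.long)  # default class: unknown
    target[a <= eps] = 0                              # background
    target[a >= 1 - eps] = 2                          # foreground
    return F.cross_entropy(logits, target)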

Decoder · Image Matting +2

Scalable Visual State Space Model with Fractal Scanning

no code implementations • 23 May 2024 • Lv Tang, Haoke Xiao, Peng-Tao Jiang, Hao Zhang, Jinwei Chen, Bo Li

To address this challenge, State Space Models (SSMs) like Mamba have emerged as efficient alternatives, initially matching Transformer performance in NLP tasks and later surpassing Vision Transformers (ViTs) in various CV tasks.

Image Classification · Mamba +2

Multi-Task Dense Prediction via Mixture of Low-Rank Experts

1 code implementation • CVPR 2024 • YuQi Yang, Peng-Tao Jiang, Qibin Hou, Hao Zhang, Jinwei Chen, Bo Li

Furthermore, to control the parameters and computational cost brought by the increase in the number of experts, we take inspiration from LoRA and propose to leverage the low-rank format of a vanilla convolution in the expert network.
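
One way to read the low-rank idea, sketched below under assumptions: each expert replaces a full k x k convolution with a pair of convolutions factorized through a small rank r, so the parameters added per expert stay modest. The exact factorization used in the paper may differ:

import torch.nn as nn

def low_rank_conv(in_ch: int, out_ch: int, rank: int = 8, k: int = 3) -> nn.Sequential:
    # in_ch*rank*k*k + rank*out_ch parameters instead of in_ch*out_ch*k*k for a full conv
    return nn.Sequential(
        nn.Conv2d(in_ch, rank, kernel_size=k, padding=k // 2, bias=False),  # down-projection
        nn.Conv2d(rank, out_ch, kernel_size=1, bias=False),                 # up-projection
    )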

Decoder

Empowering Segmentation Ability to Multi-modal Large Language Models

no code implementations • 21 Mar 2024 • YuQi Yang, Peng-Tao Jiang, Jing Wang, Hao Zhang, Kai Zhao, Jinwei Chen, Bo Li

Multi-modal large language models (MLLMs) can understand image-language prompts and demonstrate impressive reasoning ability.

Dialogue Generation · Reasoning Segmentation +2

Improving Adversarial Energy-Based Model via Diffusion Process

no code implementations • 4 Mar 2024 • Cong Geng, Tian Han, Peng-Tao Jiang, Hao Zhang, Jinwei Chen, Søren Hauberg, Bo Li

Generative models have shown strong generation ability, while efficient likelihood estimation is less explored.

Denoising · Density Estimation +1

Decoupling Degradation and Content Processing for Adverse Weather Image Restoration

no code implementations • 8 Dec 2023 • Xi Wang, Xueyang Fu, Peng-Tao Jiang, Jie Huang, Mi Zhou, Bo Li, Zheng-Jun Zha

The former facilitates channel-dependent degradation removal operation, allowing the network to tailor responses to various adverse weather types; the latter, by integrating Fourier's global properties into channel-independent content features, enhances network capacity for consistent global content reconstruction.
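
A minimal sketch of a Fourier-based content branch under assumptions: transform features to the frequency domain, apply a learned point-wise mix of the real and imaginary parts, and transform back; a point-wise operation in frequency space acts globally in image space. The block below is an illustration, not the paper's module:

import torch
import torch.nn as nn

class FourierContentBlock(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.freq_mix = nn.Conv2d(dim * 2, dim * 2, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        freq = torch.fft.rfft2(x, norm="ortho")        # (B, C, H, W//2+1), complex
        f = torch.cat([freq.real, freq.imag], dim=1)   # stack real and imaginary parts
        f = self.freq_mix(f)                           # learned 1x1 mixing in frequency space
        real, imag = f.chunk(2, dim=1)
        out = torch.fft.irfft2(torch.complex(real, imag), s=(h, w), norm="ortho")
        return x + out                                 # residual connection (assumption)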

Image Restoration

Revisiting Single Image Reflection Removal In the Wild

1 code implementation • CVPR 2024 • Yurui Zhu, Xueyang Fu, Peng-Tao Jiang, Hao Zhang, Qibin Sun, Jinwei Chen, Zheng-Jun Zha, Bo Li

This research focuses on the issue of single-image reflection removal (SIRR) in real-world conditions, examining it from two angles: the collection pipeline of real reflection pairs and the perception of real reflection locations.

Reflection Removal

Chain of Visual Perception: Harnessing Multimodal Large Language Models for Zero-shot Camouflaged Object Detection

1 code implementation • 19 Nov 2023 • Lv Tang, Peng-Tao Jiang, Zhihao Shen, Hao Zhang, Jinwei Chen, Bo Li

In this paper, we introduce a novel multimodal camo-perceptive framework (MMCPF) aimed at handling zero-shot Camouflaged Object Detection (COD) by leveraging the powerful capabilities of Multimodal Large Language Models (MLLMs).

Counterfactual · Hallucination +3

Towards Training-free Open-world Segmentation via Image Prompt Foundation Models

1 code implementation • 17 Oct 2023 • Lv Tang, Peng-Tao Jiang, Hao-Ke Xiao, Bo Li

The realm of computer vision has witnessed a paradigm shift with the advent of foundational models, mirroring the transformative influence of large language models in the domain of natural language processing.

Segmentation

L2G: A Simple Local-to-Global Knowledge Transfer Framework for Weakly Supervised Semantic Segmentation

1 code implementation • CVPR 2022 • Peng-Tao Jiang, YuQi Yang, Qibin Hou, Yunchao Wei

Our framework guides the global network to learn the captured rich object-detail knowledge from a global view, thereby producing high-quality attention maps that can be directly used as pseudo annotations for semantic segmentation networks.
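
A sketch of such a local-to-global transfer step, under assumptions: a local network produces attention maps on image crops, and the global network is trained to reproduce them at the corresponding locations of the full image. The crop bookkeeping and the mean-squared-error loss below are illustrative, not the authors' exact recipe:

import torch.nn.functional as F

def l2g_transfer_loss(global_attn, local_attn, boxes):
    # global_attn: (B, K, H, W) attention maps from the global network
    # local_attn:  list of (K, h, w) attention maps from the local network, one per crop
    # boxes:       list of (b, y0, y1, x0, x1) crop coordinates in the full image
    loss = 0.0
    for attn, (b, y0, y1, x0, x1) in zip(local_attn, boxes):
        region = global_attn[b:b + 1, :, y0:y1, x0:x1]
        target = F.interpolate(attn.unsqueeze(0), size=region.shape[-2:],
                               mode="bilinear", align_corners=False)
        loss = loss + F.mse_loss(region, target.detach())
    return loss / max(len(boxes), 1)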

Object · Transfer Learning +2

Deeply Explain CNN via Hierarchical Decomposition

no code implementations • 23 Jan 2022 • Ming-Ming Cheng, Peng-Tao Jiang, Ling-Hao Han, Liang Wang, Philip Torr

The proposed framework can generate a deep hierarchy of strongly associated supporting evidence for the network decision, which provides insight into the decision-making process.

Decision Making

LayerCAM: Exploring Hierarchical Class Activation Maps for Localization

3 code implementations • IEEE 2021 • Peng-Tao Jiang, Chang-Bin Zhang, Qibin Hou, Ming-Ming Cheng, Yunchao Wei

To evaluate the quality of the class activation maps produced by LayerCAM, we apply them to weakly-supervised object localization and semantic segmentation.
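
For reference, a minimal sketch of the LayerCAM-style rule: at a chosen layer, weight each activation element by the ReLU of its gradient with respect to the class score, sum over channels, and apply a final ReLU. Hook plumbing is omitted and the normalization step is an assumption:

import torch
import torch.nn.functional as F

def layercam(activations: torch.Tensor, gradients: torch.Tensor) -> torch.Tensor:
    # activations, gradients: (B, C, H, W), captured at one layer for a target class
    weights = F.relu(gradients)                                      # element-wise positive gradients
    cam = F.relu((weights * activations).sum(dim=1, keepdim=True))   # (B, 1, H, W)
    cam = cam / cam.amax(dim=(2, 3), keepdim=True).clamp_min(1e-8)   # normalize to [0, 1]
    return cam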

Object · Semantic Segmentation +1

Delving Deep into Label Smoothing

2 code implementations • 25 Nov 2020 • Chang-Bin Zhang, Peng-Tao Jiang, Qibin Hou, Yunchao Wei, Qi Han, Zhen Li, Ming-Ming Cheng

Experiments demonstrate that based on the same classification models, the proposed approach can effectively improve the classification performance on CIFAR-100, ImageNet, and fine-grained datasets.

Classification · General Classification

Integral Object Mining via Online Attention Accumulation

2 code implementations • ICCV 2019 • Peng-Tao Jiang, Qibin Hou, Yang Cao, Ming-Ming Cheng, Yunchao Wei, Hong-Kai Xiong

To accumulate the different object parts discovered during training, we propose an online attention accumulation (OAA) strategy, which maintains a cumulative attention map for each target category in each training image so that the integral object regions are gradually promoted as training proceeds.
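
The accumulation step can be sketched as below: keep one cumulative attention map per (image, category) and merge in the current map as training proceeds. The element-wise maximum is only one simple fusion rule used here for illustration, and the dictionary storage keyed by image and class ids is an assumption:

import torch

cumulative_attention = {}  # (image_id, class_id) -> (H, W) attention map

def accumulate(image_id: int, class_id: int, attn: torch.Tensor) -> torch.Tensor:
    key = (image_id, class_id)
    prev = cumulative_attention.get(key)
    cumulative_attention[key] = attn if prev is None else torch.maximum(prev, attn)
    return cumulative_attention[key]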

General Classification · Object +4

Self-Erasing Network for Integral Object Attention

no code implementations • NeurIPS 2018 • Qibin Hou, Peng-Tao Jiang, Yunchao Wei, Ming-Ming Cheng

To test the quality of the generated attention maps, we employ the mined object regions as heuristic cues for learning semantic segmentation models.

Object · Semantic Segmentation
