Search Results for author: Yunhang Shen

Found 37 papers, 23 papers with code

Enabling Deep Residual Networks for Weakly Supervised Object Detection

no code implementations • ECCV 2020 • Yunhang Shen, Rongrong Ji, Yan Wang, Zhiwei Chen, Feng Zheng, Feiyue Huang, Yunsheng Wu

Weakly supervised object detection (WSOD) has attracted extensive research attention due to its great flexibility of exploiting large-scale image-level annotation for detector training.

Object object-detection +1

Multi-Modal Prompt Learning on Blind Image Quality Assessment

no code implementations • 23 Apr 2024 • Wensheng Pan, Timin Gao, Yan Zhang, Runze Hu, Xiawu Zheng, Enwei Zhang, Yuting Gao, Yutao Liu, Yunhang Shen, Ke Li, Shengchuan Zhang, Liujuan Cao, Rongrong Ji

Image Quality Assessment (IQA) models benefit significantly from semantic information, which allows them to treat different types of objects distinctly.

Fusion-Mamba for Cross-modality Object Detection

no code implementations • 14 Apr 2024 • Wenhao Dong, Haodong Zhu, Shaohui Lin, Xiaoyan Luo, Yunhang Shen, Xuhui Liu, Juan Zhang, Guodong Guo, Baochang Zhang

In this paper, we investigate cross-modality fusion by associating cross-modal features in a hidden state space based on an improved Mamba with a gating mechanism.

Object object-detection +1

A General and Efficient Training for Transformer via Token Expansion

1 code implementation • 31 Mar 2024 • Wenxuan Huang, Yunhang Shen, Jiao Xie, Baochang Zhang, Gaoqi He, Ke Li, Xing Sun, Shaohui Lin

The remarkable performance of Vision Transformers (ViTs) typically requires an extremely large training cost.

Rethinking Centered Kernel Alignment in Knowledge Distillation

no code implementations • 22 Jan 2024 • Zikai Zhou, Yunhang Shen, Shitong Shao, Linrui Gong, Shaohui Lin

Knowledge distillation has emerged as a highly effective method for bridging the representation discrepancy between large-scale models and lightweight models.

Image Classification Knowledge Distillation +2

Feature Denoising Diffusion Model for Blind Image Quality Assessment

no code implementations • 22 Jan 2024 • Xudong Li, Jingyuan Zheng, Runze Hu, Yan Zhang, Ke Li, Yunhang Shen, Xiawu Zheng, Yutao Liu, Shengchuan Zhang, Pingyang Dai, Rongrong Ji

Blind Image Quality Assessment (BIQA) aims to evaluate image quality in line with human perception, without reference benchmarks.

Blind Image Quality Assessment Denoising +1

Class-Imbalanced Semi-Supervised Learning for Large-Scale Point Cloud Semantic Segmentation via Decoupling Optimization

no code implementations • 13 Jan 2024 • Mengtian Li, Shaohui Lin, Zihan Wang, Yunhang Shen, Baochang Zhang, Lizhuang Ma

Semi-supervised learning (SSL), thanks to the significant reduction of data annotation costs, has been an active research topic for large-scale 3D scene understanding.

Pseudo Label Representation Learning +2

Weakly Supervised Open-Vocabulary Object Detection

no code implementations • 19 Dec 2023 • Jianghang Lin, Yunhang Shen, Bingquan Wang, Shaohui Lin, Ke Li, Liujuan Cao

Despite weakly supervised object detection (WSOD) being a promising step toward evading strong instance-level annotations, its capability is confined to closed-set categories within a single training dataset.

Attribute Novel Concepts +6

A Challenger to GPT-4V? Early Explorations of Gemini in Visual Expertise

2 code implementations • 19 Dec 2023 • Chaoyou Fu, Renrui Zhang, Zihan Wang, Yubo Huang, Zhengye Zhang, Longtian Qiu, Gaoxiang Ye, Yunhang Shen, Mengdan Zhang, Peixian Chen, Sirui Zhao, Shaohui Lin, Deqiang Jiang, Di Yin, Peng Gao, Ke Li, Hongsheng Li, Xing Sun

They endow Large Language Models (LLMs) with powerful capabilities in visual understanding, enabling them to tackle diverse multi-modal tasks.

Visual Reasoning

SPD-DDPM: Denoising Diffusion Probabilistic Models in the Symmetric Positive Definite Space

1 code implementation • 13 Dec 2023 • Yunchen Li, Zhou Yu, Gaoqi He, Yunhang Shen, Ke Li, Xing Sun, Shaohui Lin

On the other hand, the model unconditionally learns the probability distribution of the data $p(X)$ and generates samples that conform to this distribution.

Denoising Traffic Prediction

Adaptive Feature Selection for No-Reference Image Quality Assessment using Contrastive Mitigating Semantic Noise Sensitivity

no code implementations • 11 Dec 2023 • Xudong Li, Timin Gao, Xiawu Zheng, Runze Hu, Jingyuan Zheng, Yunhang Shen, Ke Li, Yutao Liu, Pingyang Dai, Yan Zhang, Rongrong Ji

The current state-of-the-art No-Reference Image Quality Assessment (NR-IQA) methods typically use feature extraction in upstream backbone networks, which assumes that all extracted features are relevant.

Contrastive Learning feature selection +2

Aligning and Prompting Everything All at Once for Universal Visual Perception

2 code implementations • 4 Dec 2023 • Yunhang Shen, Chaoyou Fu, Peixian Chen, Mengdan Zhang, Ke Li, Xing Sun, Yunsheng Wu, Shaohui Lin, Rongrong Ji

However, predominant paradigms, driven by casting instance-level tasks as an object-word alignment, bring heavy cross-modality interaction, which is not effective in prompting object detection and visual grounding.

Object object-detection +6

Less is More: Learning Reference Knowledge Using No-Reference Image Quality Assessment

no code implementations • 1 Dec 2023 • Xudong Li, Jingyuan Zheng, Xiawu Zheng, Runze Hu, Enwei Zhang, Yuting Gao, Yunhang Shen, Ke Li, Yutao Liu, Pingyang Dai, Yan Zhang, Rongrong Ji

Concretely, by innovatively introducing a novel feature distillation method in IQA, we propose a new framework to learn comparative knowledge from non-aligned reference images.

Inductive Bias No-Reference Image Quality Assessment +1

Woodpecker: Hallucination Correction for Multimodal Large Language Models

1 code implementation • 24 Oct 2023 • Shukang Yin, Chaoyou Fu, Sirui Zhao, Tong Xu, Hao Wang, Dianbo Sui, Yunhang Shen, Ke Li, Xing Sun, Enhong Chen

Hallucination is a big shadow hanging over the rapidly evolving Multimodal Large Language Models (MLLMs), referring to the phenomenon that the generated text is inconsistent with the image content.

Hallucination

Filter Pruning for Efficient CNNs via Knowledge-driven Differential Filter Sampler

1 code implementation • 1 Jul 2023 • Shaohui Lin, Wenxuan Huang, Jiao Xie, Baochang Zhang, Yunhang Shen, Zhou Yu, Jungong Han, David Doermann

In this paper, we propose a novel Knowledge-driven Differential Filter Sampler (KDFS) with Masked Filter Modeling (MFM) framework for filter pruning, which globally prunes the redundant filters based on the prior knowledge of a pre-trained model in a differential and non-alternative optimization.

Image Classification Network Pruning

MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models

3 code implementations • 23 Jun 2023 • Chaoyou Fu, Peixian Chen, Yunhang Shen, Yulei Qin, Mengdan Zhang, Xu Lin, Jinrui Yang, Xiawu Zheng, Ke Li, Xing Sun, Yunsheng Wu, Rongrong Ji

Multimodal Large Language Model (MLLM) relies on the powerful LLM to perform multimodal tasks, showing amazing emergent abilities in recent studies, such as writing poems based on an image.

Benchmarking Language Modelling +3

Active Teacher for Semi-Supervised Object Detection

1 code implementation • CVPR 2022 • Peng Mi, Jianghang Lin, Yiyi Zhou, Yunhang Shen, Gen Luo, Xiaoshuai Sun, Liujuan Cao, Rongrong Fu, Qiang Xu, Rongrong Ji

In this paper, we study teacher-student learning from the perspective of data initialization and propose a novel algorithm called Active Teacher (source code is available at https://github.com/HunterJ-Lin/ActiveTeacher) for semi-supervised object detection (SSOD).

Object object-detection +2

Category-aware Allocation Transformer for Weakly Supervised Object Localization

no code implementations • ICCV 2023 • Zhiwei Chen, Jinren Ding, Liujuan Cao, Yunhang Shen, Shengchuan Zhang, Guannan Jiang, Rongrong Ji

Weakly supervised object localization (WSOL) aims to localize objects based on only image-level labels as supervision.

Object Weakly-Supervised Object Localization

FoPro: Few-Shot Guided Robust Webly-Supervised Prototypical Learning

1 code implementation • 1 Dec 2022 • Yulei Qin, Xingyu Chen, Chao Chen, Yunhang Shen, Bo Ren, Yun Gu, Jie Yang, Chunhua Shen

Most existing methods focus on learning noise-robust models from web images while neglecting the performance drop caused by the differences between web domain and real-world domain.

Contrastive Learning Representation Learning

ECO-TR: Efficient Correspondences Finding Via Coarse-to-Fine Refinement

1 code implementation • 25 Sep 2022 • Dongli Tan, Jiang-Jiang Liu, Xingyu Chen, Chao Chen, Ruixin Zhang, Yunhang Shen, Shouhong Ding, Rongrong Ji

In this paper, we propose an efficient structure named Efficient Correspondence Transformer (ECO-TR) by finding correspondences in a coarse-to-fine manner, which significantly improves the efficiency of functional correspondence model.

Outlier Detection

LAB-Net: LAB Color-Space Oriented Lightweight Network for Shadow Removal

1 code implementation • 27 Aug 2022 • Hong Yang, Gongrui Nan, Mingbao Lin, Fei Chao, Yunhang Shen, Ke Li, Rongrong Ji

Finally, the LSA modules are further developed to fully use the prior information in non-shadow regions to cleanse the shadow regions.

Shadow Removal

Open Vocabulary Object Detection with Proposal Mining and Prediction Equalization

2 code implementations • 22 Jun 2022 • Peixian Chen, Kekai Sheng, Mengdan Zhang, Mingbao Lin, Yunhang Shen, Shaohui Lin, Bo Ren, Ke Li

Open-vocabulary object detection (OVD) aims to scale up vocabulary size to detect objects of novel categories beyond the training vocabulary.

Ranked #12 on Open Vocabulary Object Detection on LVIS v1.0

Causal Inference object-detection +1

Efficient Decoder-free Object Detection with Transformers

2 code implementations • 14 Jun 2022 • Peixian Chen, Mengdan Zhang, Yunhang Shen, Kekai Sheng, Yuting Gao, Xing Sun, Ke Li, Chunhua Shen

A natural usage of ViTs in detection is to replace the CNN-based backbone with a transformer-based backbone, which is straightforward and effective, with the price of bringing considerable computation burden for inference.

Object Object Detection

MISSU: 3D Medical Image Segmentation via Self-distilling TransUNet

1 code implementation • 2 Jun 2022 • Nan Wang, Shaohui Lin, Xiaoxiao Li, Ke Li, Yunhang Shen, Yue Gao, Lizhuang Ma

U-Nets have achieved tremendous success in medical image segmentation.

Image Segmentation Medical Image Segmentation +2

PixelFolder: An Efficient Progressive Pixel Synthesis Network for Image Generation

1 code implementation • 2 Apr 2022 • Jing He, Yiyi Zhou, Qi Zhang, Jun Peng, Yunhang Shen, Xiaoshuai Sun, Chao Chen, Rongrong Ji

Pixel synthesis is a promising research paradigm for image generation, which can well exploit pixel-wise prior knowledge for generation.

Image Generation regression

End-to-End Zero-Shot HOI Detection via Vision and Language Knowledge Distillation

1 code implementation • 1 Apr 2022 • Mingrui Wu, Jiaxin Gu, Yunhang Shen, Mingbao Lin, Chao Chen, Xiaoshuai Sun

Extensive experiments on HICO-Det dataset demonstrate that our model discovers potential interactive pairs and enables the recognition of unseen HOIs.

Human-Object Interaction Detection Knowledge Distillation +4

SeqTR: A Simple yet Universal Network for Visual Grounding

3 code implementations • 30 Mar 2022 • Chaoyang Zhu, Yiyi Zhou, Yunhang Shen, Gen Luo, Xingjia Pan, Mingbao Lin, Chao Chen, Liujuan Cao, Xiaoshuai Sun, Rongrong Ji

In this paper, we propose a simple yet universal network termed SeqTR for visual grounding tasks, e.g., phrase localization, referring expression comprehension (REC) and segmentation (RES).

Referring Expression Referring Expression Comprehension +1

CF-ViT: A General Coarse-to-Fine Method for Vision Transformer

1 code implementation • 8 Mar 2022 • Mengzhao Chen, Mingbao Lin, Ke Li, Yunhang Shen, Yongjian Wu, Fei Chao, Rongrong Ji

Our proposed CF-ViT is motivated by two important observations in modern ViT models: (1) The coarse-grained patch splitting can locate informative regions of an input image.

Dynamic Dual Trainable Bounds for Ultra-low Precision Super-Resolution Networks

1 code implementation • 8 Mar 2022 • Yunshan Zhong, Mingbao Lin, Xunchao Li, Ke Li, Yunhang Shen, Fei Chao, Yongjian Wu, Rongrong Ji

However, these methods suffer from severe performance degradation when quantizing the SR models to ultra-low precision (e.g., 2-bit and 3-bit) with the low-cost layer-wise quantizer.

Quantization Super-Resolution

HybridCR: Weakly-Supervised 3D Point Cloud Semantic Segmentation via Hybrid Contrastive Regularization

1 code implementation • CVPR 2022 • Mengtian Li, Yuan Xie, Yunhang Shen, Bo Ke, Ruizhi Qiao, Bo Ren, Shaohui Lin, Lizhuang Ma

To address the huge labeling cost in large-scale point cloud semantic segmentation, we propose a novel hybrid contrastive regularization (HybridCR) framework in weakly-supervised setting, which obtains competitive performance compared to its fully-supervised counterpart.

Semantic Segmentation Semantic Similarity +1

LCTR: On Awakening the Local Continuity of Transformer for Weakly Supervised Object Localization

no code implementations • 10 Dec 2021 • Zhiwei Chen, Changan Wang, Yabiao Wang, Guannan Jiang, Yunhang Shen, Ying Tai, Chengjie Wang, Wei Zhang, Liujuan Cao

In this paper, we propose a novel framework built upon the transformer, termed LCTR (Local Continuity TRansformer), which targets at enhancing the local perception capability of global features among long-range feature dependencies.

Inductive Bias Object +1

Fine-grained Data Distribution Alignment for Post-Training Quantization

1 code implementation • 9 Sep 2021 • Yunshan Zhong, Mingbao Lin, Mengzhao Chen, Ke Li, Yunhang Shen, Fei Chao, Yongjian Wu, Rongrong Ji

While post-training quantization receives popularity mostly due to its evasion in accessing the original complete training dataset, its poor performance also stems from scarce images.

Quantization

Toward Joint Thing-and-Stuff Mining for Weakly Supervised Panoptic Segmentation

no code implementations • CVPR 2021 • Yunhang Shen, Liujuan Cao, Zhiwei Chen, Feihong Lian, Baochang Zhang, Chi Su, Yongjian Wu, Feiyue Huang, Rongrong Ji

To date, learning weakly supervised panoptic segmentation (WSPS) with only image-level labels remains unexplored.

Instance Segmentation Multiple Instance Learning +6

Parallel Detection-and-Segmentation Learning for Weakly Supervised Instance Segmentation

no code implementations • ICCV 2021 • Yunhang Shen, Liujuan Cao, Zhiwei Chen, Baochang Zhang, Chi Su, Yongjian Wu, Feiyue Huang, Rongrong Ji

Weakly supervised instance segmentation (WSIS) with only image-level labels has recently drawn much attention.

Instance Segmentation object-detection +5

UWSOD: Toward Fully-Supervised-Level Capacity Weakly Supervised Object Detection

1 code implementation • NeurIPS 2020 • Yunhang Shen, Rongrong Ji, Zhiwei Chen, Yongjian Wu, Feiyue Huang

In this paper, we propose a unified WSOD framework, termed UWSOD, to develop a high-capacity general detection model with only image-level labels, which is self-contained and does not require external modules or additional supervision.

Object object-detection +2

Noise-Aware Fully Webly Supervised Object Detection

no code implementations • CVPR 2020 • Yunhang Shen, Rongrong Ji, Zhiwei Chen, Xiaopeng Hong, Feng Zheng, Jianzhuang Liu, Mingliang Xu, Qi Tian

We investigate the emerging task of learning object detectors with sole image-level labels on the web without requiring any other supervision like precise annotations or additional images from well-annotated benchmark datasets.

Object object-detection +1

Cyclic Guidance for Weakly Supervised Joint Detection and Segmentation

1 code implementation • CVPR 2019 • Yunhang Shen, Rongrong Ji, Yan Wang, Yongjian Wu, Liujuan Cao

In this paper, we join weakly supervised object detection and segmentation tasks with a multi-task learning scheme for the first time, which uses their respective failure patterns to complement each other's learning.

Ranked #4 on Image-level Supervised Instance Segmentation on COCO 2017 val (using extra training data)

Image-level Supervised Instance Segmentation Multi-Task Learning +6
