Search Results for author: Kehan Li

Found 14 papers, 6 papers with code

GraCo: Granularity-Controllable Interactive Segmentation

no code implementations • 1 May 2024 • Yian Zhao, Kehan Li, Zesen Cheng, Pengchong Qiao, Xiawu Zheng, Rongrong Ji, Chang Liu, Li Yuan, Jie Chen

In this work, we introduce Granularity-Controllable Interactive Segmentation (GraCo), a novel approach that allows precise control of prediction granularity by introducing additional parameters to the input.

Instance Brownian Bridge as Texts for Open-vocabulary Video Instance Segmentation

1 code implementation • 18 Jan 2024 • Zesen Cheng, Kehan Li, Hao Li, Peng Jin, Chang Liu, Xiawu Zheng, Rongrong Ji, Jie Chen

To mold instance queries to follow a Brownian bridge and align them with class texts, we design Bridge-Text Alignment (BTA) to learn discriminative bridge-level representations of instances via contrastive objectives.

Instance Segmentation • Semantic Segmentation • +1

FreestyleRet: Retrieving Images from Style-Diversified Queries

1 code implementation • 5 Dec 2023 • Hao Li, Curise Jia, Peng Jin, Zesen Cheng, Kehan Li, Jialu Sui, Chang Liu, Li Yuan

In this paper, we propose the Style-Diversified Query-Based Image Retrieval task, which enables retrieval based on various query styles.

Image Retrieval • Retrieval

Forensic Histopathological Recognition via a Context-Aware MIL Network Powered by Self-Supervised Contrastive Learning

no code implementations • 27 Aug 2023 • Chen Shen, Jun Zhang, Xinggong Liang, Zeyi Hao, Kehan Li, Fan Wang, Zhenyuan Wang, Chunfeng Lian

Forensic pathology is critical in analyzing death manner and time from the microscopic aspect to assist in the establishment of reliable factual bases for criminal investigation.

Contrastive Learning • Domain Generalization • +3

WiCo: Win-win Cooperation of Bottom-up and Top-down Referring Image Segmentation

no code implementations • 19 Jun 2023 • Zesen Cheng, Peng Jin, Hao Li, Kehan Li, Siheng Li, Xiangyang Ji, Chang Liu, Jie Chen

Bottom-up methods are mainly perturbed by Inferior Positive (IP) errors due to the lack of prior object information.

Image Segmentation • Referring Expression Segmentation • +1

Multi-granularity Interaction Simulation for Unsupervised Interactive Segmentation

no code implementations • ICCV 2023 • Kehan Li, Yian Zhao, Zhennan Wang, Zesen Cheng, Peng Jin, Xiangyang Ji, Li Yuan, Chang Liu, Jie Chen

Interactive segmentation enables users to segment as needed by providing cues of objects, which introduces human-computer interaction for many fields, such as image editing and medical image analysis.

Interactive Segmentation

DiffusionRet: Generative Text-Video Retrieval with Diffusion Model

4 code implementations • ICCV 2023 • Peng Jin, Hao Li, Zesen Cheng, Kehan Li, Xiangyang Ji, Chang Liu, Li Yuan, Jie Chen

Existing text-video retrieval solutions are, in essence, discriminant models focused on maximizing the conditional likelihood, i.e., p(candidates|query).

Ranked #15 on Video Retrieval on MSVD

Retrieval • Video Retrieval

Parallel Vertex Diffusion for Unified Visual Grounding

no code implementations • 13 Mar 2023 • Zesen Cheng, Kehan Li, Peng Jin, Xiangyang Ji, Li Yuan, Chang Liu, Jie Chen

An intuitive materialization of our paradigm is Parallel Vertex Diffusion (PVD) to directly set vertex coordinates as the generation target and use a diffusion model to train and infer.

Visual Grounding

LaPE: Layer-adaptive Position Embedding for Vision Transformers with Independent Layer Normalization

1 code implementation • ICCV 2023 • Runyi Yu, Zhennan Wang, Yinhuai Wang, Kehan Li, Chang Liu, Haoyi Duan, Xiangyang Ji, Jie Chen

A typical way to introduce position information is adding the absolute Position Embedding (PE) to patch embedding before entering VTs.

Image Classification • object-detection • +3

Position Embedding Needs an Independent Layer Normalization

1 code implementation • 10 Dec 2022 • Runyi Yu, Zhennan Wang, Yinhuai Wang, Kehan Li, Yian Zhao, Jian Zhang, Guoli Song, Jie Chen

By analyzing the input and output of each encoder layer in VTs using reparameterization and visualization, we find that the default PE joining method (simply adding the PE and patch embedding together) applies the same affine transformation to the token embedding and the PE, which limits the expressiveness of PE and hence constrains the performance of VTs.

Position

Out-of-Candidate Rectification for Weakly Supervised Semantic Segmentation

no code implementations • CVPR 2023 • Zesen Cheng, Pengchong Qiao, Kehan Li, Siheng Li, Pengxu Wei, Xiangyang Ji, Li Yuan, Chang Liu, Jie Chen

Weakly supervised semantic segmentation is typically inspired by class activation maps, which serve as pseudo masks with class-discriminative regions highlighted.

Optical Character Recognition (OCR) • Weakly supervised Semantic Segmentation • +1

ACSeg: Adaptive Conceptualization for Unsupervised Semantic Segmentation

no code implementations • CVPR 2023 • Kehan Li, Zhennan Wang, Zesen Cheng, Runyi Yu, Yian Zhao, Guoli Song, Chang Liu, Li Yuan, Jie Chen

Recently, self-supervised large-scale visual pre-training models have shown great promise in representing pixel-level semantic relationships, significantly promoting the development of unsupervised dense prediction tasks, e.g., unsupervised semantic segmentation (USS).

Image Segmentation • Unsupervised Semantic Segmentation

Locality Guidance for Improving Vision Transformers on Tiny Datasets

1 code implementation • 20 Jul 2022 • Kehan Li, Runyi Yu, Zhennan Wang, Li Yuan, Guoli Song, Jie Chen

Therefore, our locality guidance approach is very simple and efficient, and can serve as a basic performance enhancement method for VTs on tiny datasets.

$L_2$BN: Enhancing Batch Normalization by Equalizing the $L_2$ Norms of Features

no code implementations • 6 Jul 2022 • Zhennan Wang, Kehan Li, Runyi Yu, Yian Zhao, Pengchong Qiao, Chang Liu, Fan Xu, Xiangyang Ji, Guoli Song, Jie Chen

In this paper, we analyze batch normalization from the perspective of discriminability and find the disadvantages ignored by previous studies: the difference in $l_2$ norms of sample features can hinder batch normalization from obtaining more distinguished inter-class features and more compact intra-class features.

Acoustic Scene Classification • Image Classification • +1
