Search Results for author: Yisi Zhang

Found 9 papers, 5 papers with code

The Instance-centric Transformer for the RVOS Track of LSVOS Challenge: 3rd Place Solution

no code implementations20 Aug 2024 Bin Cao, Yisi Zhang, Hanyi Wang, Xingjian He, Jing Liu

Referring Video Object Segmentation is an emerging multi-modal task that aims to segment objects in the video given a natural language expression.

Referring Video Object Segmentation Retrieval +2

2nd Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation

no code implementations20 Jun 2024 Bin Cao, Yisi Zhang, Xuanxu Lin, Xingjian He, Bo Zhao, Jing Liu

Motion Expression guided Video Segmentation is a challenging task that aims at segmenting objects in the video based on natural language expressions with motion descriptions.

Instance Segmentation Referring Video Object Segmentation +5

Beyond Literal Descriptions: Understanding and Locating Open-World Objects Aligned with Human Intentions

1 code implementation17 Feb 2024 Wenxuan Wang, Yisi Zhang, Xingjian He, Yichen Yan, Zijia Zhao, Xinlong Wang, Jing Liu

To promote classic VG towards human intention interpretation, we propose a new intention-driven visual grounding (IVG) task and build a large-scale IVG dataset termed IntentionVG with free-form intention expressions.

Visual Grounding

Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation

1 code implementation CVPR 2024 Wenxuan Wang, Tongtian Yue, Yisi Zhang, Longteng Guo, Xingjian He, Xinlong Wang, Jing Liu

To foster future research into fine-grained visual grounding our benchmark RefCOCOm the MRES-32M dataset and model UniRES will be publicly available at https://github. com/Rubics-Xuan/MRES.

Descriptive Object +3

Adaptive FSS: A Novel Few-Shot Segmentation Framework via Prototype Enhancement

2 code implementations25 Dec 2023 Jing Wang, Jinagyun Li, Chen Chen, Yisi Zhang, Haoran Shen, Tianxiang Zhang

In this paper, we propose a novel framework based on the adapter mechanism, namely Adaptive FSS, which can efficiently adapt the existing FSS model to the novel classes.

Meta-Learning

Unveiling Parts Beyond Objects:Towards Finer-Granularity Referring Expression Segmentation

1 code implementation13 Dec 2023 Wenxuan Wang, Tongtian Yue, Yisi Zhang, Longteng Guo, Xingjian He, Xinlong Wang, Jing Liu

To foster future research into fine-grained visual grounding, our benchmark RefCOCOm, the MRES-32M dataset and model UniRES will be publicly available at https://github. com/Rubics-Xuan/MRES

Descriptive Object +3

CM-MaskSD: Cross-Modality Masked Self-Distillation for Referring Image Segmentation

no code implementations19 May 2023 Wenxuan Wang, Jing Liu, Xingjian He, Yisi Zhang, Chen Chen, Jiachen Shen, Yan Zhang, Jiangyun Li

Referring image segmentation (RIS) is a fundamental vision-language task that intends to segment a desired object from an image based on a given natural language expression.

Image Segmentation Segmentation +1

Cannot find the paper you are looking for? You can Submit a new open access paper.