Search Results for author: Chengyao Wang

Found 4 papers, 3 papers with code

Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models

2 code implementations • 27 Mar 2024 • Yanwei Li, Yuechen Zhang, Chengyao Wang, Zhisheng Zhong, Yixin Chen, Ruihang Chu, Shaoteng Liu, Jiaya Jia

We try to narrow the gap by mining the potential of VLMs for better performance and any-to-any workflow from three aspects, i.e., high-resolution visual tokens, high-quality data, and VLM-guided generation.

Tasks: Image Comprehension, Visual Dialog, +1 more
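For the first of those aspects, one common way to mine high-resolution detail without inflating the token count is to let low-resolution visual tokens act as cross-attention queries over a high-resolution feature map. The sketch below is a generic illustration of that pattern under stated assumptions; the module name `PatchInfoMining`, the shapes, and the token counts are illustrative choices, not Mini-Gemini's released implementation.

```python
# Hedged sketch: low-res tokens query high-res features via cross-attention,
# so the LLM still sees a small token set while detail is mined from the
# high-res encoder. Names and shapes are assumptions, not the paper's code.
import torch
import torch.nn as nn


class PatchInfoMining(nn.Module):
    def __init__(self, dim: int = 768, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, lr_tokens: torch.Tensor, hr_feats: torch.Tensor):
        """lr_tokens: (B, N_lr, dim) from a low-res vision encoder;
        hr_feats:  (B, N_hr, dim) from a high-res encoder.
        Each low-res token gathers fine detail from the high-res map;
        the output keeps the cheap N_lr token count."""
        mined, _ = self.attn(query=lr_tokens, key=hr_feats, value=hr_feats)
        return lr_tokens + mined  # residual preserves the global LR context


# Usage: 576 low-res tokens mine from 2,304 high-res features;
# the output handed to the LLM stays at 576 tokens.
mine = PatchInfoMining()
out = mine(torch.randn(1, 576, 768), torch.randn(1, 2304, 768))
print(out.shape)  # torch.Size([1, 576, 768])
```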

GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding

1 code implementation • 14 Mar 2024 • Chengyao Wang, Li Jiang, Xiaoyang Wu, Zhuotao Tian, Bohao Peng, Hengshuang Zhao, Jiaya Jia

To address this issue, we propose GroupContrast, a novel approach that combines segment grouping and semantic-aware contrastive learning.

Tasks: Contrastive Learning, Representation Learning, +2 more
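As a rough illustration of how segment grouping can feed contrastive learning: once points are grouped into segments, points that share a segment can be treated as positives in an InfoNCE-style loss. The sketch below is a generic grouped InfoNCE, assuming per-point embeddings and precomputed segment labels; it is not the authors' implementation.

```python
# Hedged sketch: contrastive learning over grouped points. Points in the
# same segment are positives, all others negatives (generic InfoNCE).
import torch
import torch.nn.functional as F


def grouped_info_nce(feats: torch.Tensor, group_ids: torch.Tensor,
                     temperature: float = 0.07) -> torch.Tensor:
    """feats: (N, dim) per-point embeddings; group_ids: (N,) segment labels."""
    feats = F.normalize(feats, dim=1)
    sim = feats @ feats.t() / temperature           # (N, N) similarities
    # Positive mask: same segment, excluding self-pairs.
    pos = (group_ids[:, None] == group_ids[None, :]).float()
    pos.fill_diagonal_(0)
    # Mask self-similarity, softmax over remaining points,
    # then average the log-probabilities of the positive pairs.
    logits = sim - torch.eye(len(feats)) * 1e9
    log_prob = F.log_softmax(logits, dim=1)
    loss = -(pos * log_prob).sum(1) / pos.sum(1).clamp(min=1)
    return loss.mean()


# Usage with random point features and 10 hypothetical segments:
loss = grouped_info_nce(torch.randn(128, 64), torch.randint(0, 10, (128,)))
print(loss.item())
```

The `clamp(min=1)` guards against segments containing a single point, which would otherwise divide by zero.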

LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models

2 code implementations • 28 Nov 2023 • Yanwei Li, Chengyao Wang, Jiaya Jia

Current VLMs, while proficient in tasks like image captioning and visual question answering, face computational burdens when processing long videos due to the excessive number of visual tokens.

Tasks: Image Captioning, Video-based Generative Performance Benchmarking, +2 more
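The title's remedy is to represent each frame with just two tokens. A minimal sketch of that compression, assuming one context token produced by text-guided attention and one content token produced by pooling; the shapes and the helper name `frame_to_two_tokens` are illustrative assumptions, not the released code.

```python
# Hedged sketch of the "2 tokens per frame" idea from LLaMA-VID.
import torch
import torch.nn.functional as F


def frame_to_two_tokens(frame_feats: torch.Tensor, text_query: torch.Tensor):
    """Compress one frame's patch features into two tokens.

    frame_feats: (num_patches, dim)  visual features for a single frame
    text_query:  (dim,)              embedding of the user's instruction
    Returns a (2, dim) tensor: [context_token, content_token].
    """
    # Context token: attend the text query over the patch features, so the
    # token summarizes what in the frame is relevant to the question.
    attn = F.softmax(frame_feats @ text_query / frame_feats.shape[-1] ** 0.5,
                     dim=0)
    context_token = attn @ frame_feats        # (dim,)

    # Content token: a query-agnostic summary of the whole frame.
    content_token = frame_feats.mean(dim=0)   # (dim,)

    return torch.stack([context_token, content_token])


# Usage: a 1,000-frame video becomes 2,000 tokens instead of
# num_patches * 1,000, which is what makes long videos tractable.
video = torch.randn(1000, 256, 768)           # (frames, patches, dim)
query = torch.randn(768)                      # instruction embedding
tokens = torch.cat([frame_to_two_tokens(f, query) for f in video])
print(tokens.shape)                           # torch.Size([2000, 768])
```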

Hierarchical Dense Correlation Distillation for Few-Shot Segmentation (Extended Abstract)

no code implementations • 27 Jun 2023 • Bohao Peng, Zhuotao Tian, Xiaoyang Wu, Chengyao Wang, Shu Liu, Jingyong Su, Jiaya Jia

We hope our work can benefit broader industrial applications where novel classes with limited annotations need to be reliably identified.

Tasks: Few-Shot Semantic Segmentation, Segmentation, +2 more
