Search Results for author: Kunyu Shi

Found 4 papers, 2 papers with code

Enhancing Vision-Language Pre-training with Rich Supervisions

no code implementations5 Mar 2024 Yuan Gao, Kunyu Shi, Pengkai Zhu, Edouard Belval, Oren Nuriel, Srikar Appalaraju, Shabnam Ghadar, Vijay Mahadevan, Zhuowen Tu, Stefano Soatto

We propose Strongly Supervised pre-training with ScreenShots (S4) - a novel pre-training paradigm for Vision-Language Models using data from large-scale web screenshot rendering.

Table Detection

Non-autoregressive Sequence-to-Sequence Vision-Language Models

no code implementations4 Mar 2024 Kunyu Shi, Qi Dong, Luis Goncalves, Zhuowen Tu, Stefano Soatto

Sequence-to-sequence vision-language models are showing promise, but their applicability is limited by their inference latency due to their autoregressive way of generating predictions.

Language Modelling

Musketeer: Joint Training for Multi-task Vision Language Model with Task Explanation Prompts

1 code implementation11 May 2023 Zhaoyang Zhang, Yantao Shen, Kunyu Shi, Zhaowei Cai, Jun Fang, Siqi Deng, Hao Yang, Davide Modolo, Zhuowen Tu, Stefano Soatto

We present a vision-language model whose parameters are jointly trained on all tasks and fully shared among multiple heterogeneous tasks which may interfere with each other, resulting in a single model which we named Musketeer.

Language Modelling

Learning Instance Occlusion for Panoptic Segmentation

1 code implementation CVPR 2020 Justin Lazarow, Kwonjoon Lee, Kunyu Shi, Zhuowen Tu

Panoptic segmentation requires segments of both "things" (countable object instances) and "stuff" (uncountable and amorphous regions) within a single output.

Instance Segmentation Panoptic Segmentation +2

Cannot find the paper you are looking for? You can Submit a new open access paper.