1 code implementation • 12 Mar 2024 • Shihao Zhao, Shaozhe Hao, Bojia Zi, Huaizhe Xu, Kwan-Yee K. Wong
In this paper, we explore this objective and propose LaVi-Bridge, a pipeline that enables the integration of diverse pre-trained language models and generative vision models for text-to-image generation.
3 code implementations • 22 Nov 2023 • Feng Li, Qing Jiang, Hao Zhang, Tianhe Ren, Shilong Liu, Xueyan Zou, Huaizhe Xu, Hongyang Li, Chunyuan Li, Jianwei Yang, Lei Zhang, Jianfeng Gao
In-context prompting in large language models (LLMs) has become a prevalent approach to improve zero-shot capabilities, but this idea is less explored in the vision domain.
1 code implementation • CVPR 2023 • Hao Zhang, Feng Li, Huaizhe Xu, Shijia Huang, Shilong Liu, Lionel M. Ni, Lei Zhang
We present a mask-piloted Transformer which improves masked-attention in Mask2Former for image segmentation.
9 code implementations • CVPR 2023 • Feng Li, Hao Zhang, Huaizhe Xu, Shilong Liu, Lei Zhang, Lionel M. Ni, Heung-Yeung Shum
In this paper, we present Mask DINO, a unified object detection and segmentation framework.
Ranked #1 on Panoptic Segmentation on COCO test-dev