Search Results for author: Serena Yeung-Levy

Found 10 papers, 5 papers with code

Just Shift It: Test-Time Prototype Shifting for Zero-Shot Generalization with Vision-Language Models

1 code implementation · 19 Mar 2024 · Elaine Sui, Xiaohan Wang, Serena Yeung-Levy

Advancements in vision-language models (VLMs) have propelled the field of computer vision, particularly in the zero-shot learning setting.

Prompt Engineering · Zero-shot Generalization +1

Depth-guided NeRF Training via Earth Mover's Distance

no code implementations · 19 Mar 2024 · Anita Rau, Josiah Aklilu, F. Christopher Holsinger, Serena Yeung-Levy

This work proposes a novel approach to uncertainty in depth priors for NeRF supervision.

Denoising

VideoAgent: Long-form Video Understanding with Large Language Model as Agent

no code implementations · 15 Mar 2024 · Xiaohan Wang, Yuhui Zhang, Orr Zohar, Serena Yeung-Levy

Long-form video understanding represents a significant challenge within computer vision, demanding a model capable of reasoning over long multi-modal sequences.

Language Modelling · Large Language Model +2

Multi-Human Mesh Recovery with Transformers

no code implementations · 26 Feb 2024 · Zeyu Wang, Zhenzhen Weng, Serena Yeung-Levy

Conventional approaches to human mesh recovery predominantly employ a region-based strategy.

Human Mesh Recovery

Revisiting Active Learning in the Era of Vision Foundation Models

1 code implementation · 25 Jan 2024 · Sanket Rajan Gupte, Josiah Aklilu, Jeffrey J. Nirschl, Serena Yeung-Levy

Foundation vision and vision-language models are trained on large-scale unlabeled or noisy data and learn robust representations that achieve impressive zero- or few-shot performance on diverse tasks.

Active Learning · Image Classification

Template-Free Single-View 3D Human Digitalization with Diffusion-Guided LRM

no code implementations · 22 Jan 2024 · Zhenzhen Weng, Jingyuan Liu, Hao Tan, Zhan Xu, Yang Zhou, Serena Yeung-Levy, Jimei Yang

We present Human-LRM, a diffusion-guided feed-forward model that predicts the implicit field of a human from a single image.

Connect, Collapse, Corrupt: Learning Cross-Modal Tasks with Uni-Modal Data

1 code implementation · 16 Jan 2024 · Yuhui Zhang, Elaine Sui, Serena Yeung-Levy

However, this assumption is under-explored due to the poorly understood geometry of the multi-modal contrastive space, where a modality gap exists.

Text-to-Image Generation · Video Captioning

Describing Differences in Image Sets with Natural Language

1 code implementation · 5 Dec 2023 · Lisa Dunlap, Yuhui Zhang, Xiaohan Wang, Ruiqi Zhong, Trevor Darrell, Jacob Steinhardt, Joseph E. Gonzalez, Serena Yeung-Levy

To aid in this discovery process, we explore the task of automatically describing the differences between two **sets** of images, which we term Set Difference Captioning.

Language Modelling

Diffusion-HPC: Synthetic Data Generation for Human Mesh Recovery in Challenging Domains

1 code implementation · 16 Mar 2023 · Zhenzhen Weng, Laura Bravo-Sánchez, Serena Yeung-Levy

Recent text-to-image generative models have exhibited remarkable abilities in generating high-fidelity and photo-realistic images.

Human Mesh Recovery · Synthetic Data Generation
