Search Results for author: Yuchen Xian

Found 1 paper, 0 papers with code

Vista-LLaMA: Reliable Video Narrator via Equal Distance to Visual Tokens

no code implementations • 12 Dec 2023 • Fan Ma, Xiaojie Jin, Heng Wang, Yuchen Xian, Jiashi Feng, Yi Yang

This amplifies the effect of visual tokens on text generation, especially when the relative distance is longer between visual and text tokens.

Ranked #6 on Zero-Shot Video Question Answer on MSRVTT-QA
