no code implementations • 6 Feb 2024 • Daechul Ahn, Yura Choi, Youngjae Yu, Dongyeop Kang, Jonghyun Choi
Recent advancements in large language models have influenced the development of video large multimodal models (VLMMs).
1 code implementation • ICCV 2023 • Daechul Ahn, Daneul Kim, Gwangmo Song, Seung Hwan Kim, Honglak Lee, Dongyeop Kang, Jonghyun Choi
Story visualization (SV) is a challenging text-to-image generation task because it requires not only rendering visual details from the text descriptions but also encoding long-term context across multiple sentences.
1 code implementation • ICCV 2021 • Jinwoo Nam, Daechul Ahn, Dongyeop Kang, Seong Jong Ha, Jonghyun Choi
Understanding videos to localize moments with natural language often requires a large, expensive set of annotated video regions paired with language queries.