no code implementations • 1 Aug 2024 • Shiji Zhou, Lianzhe Wang, Jiangnan Ye, Yongliang Wu, Heng Chang
Generative AI (GenAI), which aims to synthesize realistic and diverse data samples from latent variables or other data modalities, has achieved remarkable results in various domains, such as natural language, images, audio, and graphs.
no code implementations • 25 Jun 2024 • Yongliang Wu, Bozheng Li, Jiawang Cao, Wenbo Zhu, Yi Lu, Weiheng Chi, Chuyun Xie, Haolin Zheng, Ziyue Su, Jay Wu, Xu Yang
The Long-form Video Question-Answering task requires the comprehension and analysis of extended video content to respond accurately to questions by utilizing both temporal and contextual information.
no code implementations • 24 May 2024 • Yongliang Wu, Shiji Zhou, Mingzhuo Yang, Lianzhe Wang, Wenbo Zhu, Heng Chang, Xiao Zhou, Xu Yang
Current text-to-image diffusion models have achieved groundbreaking results in image generation tasks.
no code implementations • 10 Mar 2024 • Jiawang Cao, Yongliang Wu, Weiheng Chi, Wenbo Zhu, Ziyue Su, Jay Wu
The proliferation of mobile devices and social media has revolutionized content dissemination, with short-form video becoming increasingly prevalent.
1 code implementation • NeurIPS 2023 • Xu Yang, Yongliang Wu, Mingzhuo Yang, Haokun Chen, Xin Geng
After discovering that Language Models (LMs) can be good in-context few-shot learners, numerous strategies have been proposed to optimize in-context sequence configurations.