no code implementations • 23 Apr 2024 • Hang Hua, Jing Shi, Kushal Kafle, Simon Jenni, Daoan Zhang, John Collomosse, Scott Cohen, Jiebo Luo
To address this, we propose FineMatch, a new aspect-based fine-grained text and image matching benchmark, focusing on text and image mismatch detection and correction.
1 code implementation • 23 Apr 2024 • Shuhang Lin, Wenyue Hua, Lingyao Li, Che-Jui Chang, Lizhou Fan, Jianchao Ji, Hang Hua, Mingyu Jin, Jiebo Luo, Yongfeng Zhang
This novel system aims to simulate complex dynamic interactions among multiple agents, as well as between agents and their environments, over a period of time.
no code implementations • 18 Apr 2024 • Hang Hua, Yunlong Tang, Chenliang Xu, Jiebo Luo
Recent efforts have been made to expand from unimodal to multimodal video summarization, categorizing the task into three sub-tasks based on the summary's modality: video-to-video (V2V), video-to-text (V2T), and a combination of video and text summarization (V2VT).
no code implementations • 1 Feb 2024 • Pinxin Liu, Luchuan Song, Daoan Zhang, Hang Hua, Yunlong Tang, Huaijin Tu, Jiebo Luo, Chenliang Xu
To address the above problems, we propose the Efficient Monotonic Video Style Avatar (Emo-Avatar) through deferred neural rendering that enhances StyleGAN's capacity for producing dynamic, drivable portrait videos.
1 code implementation • 21 Mar 2023 • Jingyang Lin, Hang Hua, Ming Chen, Yikang Li, Jenhao Hsiao, Chiuman Ho, Jiebo Luo
We propose a new joint video and text summarization task.
Ranked #1 on Video Summarization on videoxum
no code implementations • ICCV 2023 • Yushi Hu, Hang Hua, Zhengyuan Yang, Weijia Shi, Noah A. Smith, Jiebo Luo
PromptCap outperforms generic captions by a large margin and achieves state-of-the-art accuracy on knowledge-based VQA tasks (60. 4% on OK-VQA and 59. 6% on A-OKVQA).
1 code implementation • 15 Nov 2022 • Yushi Hu, Hang Hua, Zhengyuan Yang, Weijia Shi, Noah A Smith, Jiebo Luo
PromptCap outperforms generic captions by a large margin and achieves state-of-the-art accuracy on knowledge-based VQA tasks (60. 4% on OK-VQA and 59. 6% on A-OKVQA).
Ranked #1 on Visual Question Answering on TextVQA test-standard
no code implementations • 12 Jun 2022 • Hang Hua, Xingjian Li, Dejing Dou, Cheng-Zhong Xu, Jiebo Luo
The advent of large-scale pre-trained language models has contributed greatly to the recent progress in natural language processing.
no code implementations • NAACL 2021 • Hang Hua, Xingjian Li, Dejing Dou, Cheng-Zhong Xu, Jiebo Luo
The brittleness of this process is often reflected by the sensitivity to random seeds.
2 code implementations • NeurIPS 2019 • Ke Wang, Hang Hua, Xiaojun Wan
Unsupervised text attribute transfer automatically transforms a text to alter a specific attribute (e. g. sentiment) without using any parallel data, while simultaneously preserving its attribute-independent content.