Search Results for author: Hang Hua

Found 9 papers, 3 papers with code

FINEMATCH: Aspect-based Fine-grained Image and Text Mismatch Detection and Correction

no code implementations 23 Apr 2024 Hang Hua, Jing Shi, Kushal Kafle, Simon Jenni, Daoan Zhang, John Collomosse, Scott Cohen, Jiebo Luo

To address this, we propose FineMatch, a new aspect-based fine-grained text and image matching benchmark, focusing on text and image mismatch detection and correction.

V2Xum-LLM: Cross-Modal Video Summarization with Temporal Prompt Instruction Tuning

no code implementations 18 Apr 2024 Hang Hua, Yunlong Tang, Chenliang Xu, Jiebo Luo

Recent efforts have been made to expand from unimodal to multimodal video summarization, categorizing the task into three sub-tasks based on the summary's modality: video-to-video (V2V), video-to-text (V2T), and a combination of video and text summarization (V2VT).

Text Summarization, Video Summarization

Emo-Avatar: Efficient Monocular Video Style Avatar through Texture Rendering

no code implementations 1 Feb 2024 Pinxin Liu, Luchuan Song, Daoan Zhang, Hang Hua, Yunlong Tang, Huaijin Tu, Jiebo Luo, Chenliang Xu

To address the above problems, we propose the Efficient Monotonic Video Style Avatar (Emo-Avatar) through deferred neural rendering that enhances StyleGAN's capacity for producing dynamic, drivable portrait videos.

Contrastive Learning, Neural Rendering

PromptCap: Prompt-Guided Image Captioning for VQA with GPT-3

no code implementations ICCV 2023 Yushi Hu, Hang Hua, Zhengyuan Yang, Weijia Shi, Noah A. Smith, Jiebo Luo

PromptCap outperforms generic captions by a large margin and achieves state-of-the-art accuracy on knowledge-based VQA tasks (60.4% on OK-VQA and 59.6% on A-OKVQA).

Image Captioning, Question Answering, +3

PromptCap: Prompt-Guided Task-Aware Image Captioning

1 code implementation 15 Nov 2022 Yushi Hu, Hang Hua, Zhengyuan Yang, Weijia Shi, Noah A. Smith, Jiebo Luo

PromptCap outperforms generic captions by a large margin and achieves state-of-the-art accuracy on knowledge-based VQA tasks (60.4% on OK-VQA and 59.6% on A-OKVQA).

Image Captioning, Language Modelling, +5

Improving Pre-trained Language Model Fine-tuning with Noise Stability Regularization

no code implementations 12 Jun 2022 Hang Hua, Xingjian Li, Dejing Dou, Cheng-Zhong Xu, Jiebo Luo

The advent of large-scale pre-trained language models has contributed greatly to the recent progress in natural language processing.

Domain Generalization, Language Modelling, +3

Controllable Unsupervised Text Attribute Transfer via Editing Entangled Latent Representation

2 code implementations NeurIPS 2019 Ke Wang, Hang Hua, Xiaojun Wan

Unsupervised text attribute transfer automatically transforms a text to alter a specific attribute (e.g., sentiment) without using any parallel data, while simultaneously preserving its attribute-independent content.

Attribute, Text Attribute Transfer
