no code implementations • 13 Aug 2024 • Harry Cheng, Yangyang Guo, Qingpei Guo, Ming Yang, Tian Gan, Liqiang Nie
Multi-modal Large Language Models (MLLMs) have advanced significantly, offering powerful vision-language understanding capabilities.
no code implementations • 21 May 2024 • Yi Cheng, Ziwei Xu, Dongyun Lin, Harry Cheng, Yongkang Wong, Ying Sun, Joo Hwee Lim, Mohan Kankanhalli
To address these challenges, we propose a knowledge-enhanced iterative refinement framework for visual content generation.
no code implementations • 8 May 2024 • Tianrui Guan, Yurou Yang, Harry Cheng, Muyuan Lin, Richard Kim, Rajasimman Madhivanan, Arnie Sen, Dinesh Manocha
In this paper, we present LOC-ZSON, a novel Language-driven Object-Centric image representation for the object navigation task within complex scenes.
1 code implementation • 29 Jan 2024 • Harry Cheng, Yangyang Guo, Tianyi Wang, Liqiang Nie, Mohan Kankanhalli
In particular, this dataset leverages 30,000 carefully collected textual and visual prompts, ensuring the synthesis of images with both high fidelity and semantic consistency.
no code implementations • 2 Nov 2023 • Tianyi Wang, Mengxiao Huang, Harry Cheng, Bin Ma, Yinglong Wang
Falsification and source tracing are accomplished by justifying the consistency between the content-matched identity perceptual watermark and the recovered robust watermark from the image.
1 code implementation • 27 Jul 2023 • Harry Cheng, Yangyang Guo, Liqiang Nie, Zhiyong Cheng, Mohan Kankanhalli
Training an effective video action recognition model poses significant computational challenges, particularly under limited resource budgets.
no code implementations • 24 Jul 2023 • Harry Cheng, Yangyang Guo, Tianyi Wang, Liqiang Nie, Mohan Kankanhalli
Existing deepfake detection methods have reached a bottleneck in generalizing to unseen forgeries and manipulation approaches.
no code implementations • 12 Sep 2022 • Tianyi Wang, Harry Cheng, Kam Pui Chow, Liqiang Nie
Most existing deep learning methods focus on local features and relations within the face image, using convolutional neural networks as a backbone.
no code implementations • 4 Mar 2022 • Harry Cheng, Yangyang Guo, Tianyi Wang, Qi Li, Xiaojun Chang, Liqiang Nie
To this end, a voice-face matching method is devised to measure the matching degree of these two.
1 code implementation • 25 Feb 2022 • Yangyang Guo, Liqiang Nie, Harry Cheng, Zhiyong Cheng, Mohan Kankanhalli, Alberto del Bimbo
Results on four datasets across the above three tasks show that our method yields remarkable performance improvements over the baselines, demonstrating its superiority in reducing the modality bias problem.