Search Results for author: Xiaobao Guo

Found 10 papers, 4 papers with code

Unimodal and Crossmodal Refinement Network for Multimodal Sequence Fusion

no code implementations EMNLP 2021 Xiaobao Guo, Adams Kong, Huan Zhou, Xianfeng Wang, Min Wang

Specifically, to improve unimodal representations, a unimodal refinement module is designed to refine modality-specific learning via iteratively updating the distribution with transformer-based attention layers.

Representation Learning

MACS: Multi-source Audio-to-image Generation with Contextual Significance and Semantic Alignment

1 code implementation 13 Mar 2025 Hao Zhou, Xiaobao Guo, Yuzhe Zhu, Adams Wai-Kin Kong

In the second stage, efficient image generation is achieved by mapping the separated audio signals to the generation condition using only a trainable adapter and an MLP layer.

Image Generation

SparseMamba-PCL: Scribble-Supervised Medical Image Segmentation via SAM-Guided Progressive Collaborative Learning

no code implementations 3 Mar 2025 Luyi Qiu, Tristan Till, Xiaobao Guo, Adams Wai-Kin Kong

Scribble annotations significantly reduce the cost and labor required for dense labeling in large medical datasets with complex anatomical structures.

Decoder Image Segmentation +3

Benchmarking Cross-Domain Audio-Visual Deception Detection

no code implementations 11 May 2024 Xiaobao Guo, Zitong Yu, Nithish Muthuchamy Selvaraj, Bingquan Shen, Adams Wai-Kin Kong, Alex C. Kot

Automated deception detection is crucial for assisting humans in accurately assessing truthfulness and identifying deceptive behavior.

Benchmarking Deception Detection +1

Improving Concept Alignment in Vision-Language Concept Bottleneck Models

1 code implementation 3 May 2024 Nithish Muthuchamy Selvaraj, Xiaobao Guo, Adams Wai-Kin Kong, Alex Kot

To address this issue, we propose a novel Contrastive Semi-Supervised (CSS) learning method that leverages a few labeled concept samples to activate truthful visual concepts and improve concept alignment in the CLIP model.

Classification Concept Alignment

MIMIC: Mask Image Pre-training with Mix Contrastive Fine-tuning for Facial Expression Recognition

no code implementations 14 Jan 2024 Fan Zhang, Xiaobao Guo, Xiaojiang Peng, Alex Kot

In addition, when compared with the domain disparity existing between face datasets and FER datasets, the divergence between general datasets and FER datasets is more pronounced.

Contrastive Learning Face Recognition +3

Retrieving Multimodal Information for Augmented Generation: A Survey

no code implementations 20 Mar 2023 Ruochen Zhao, Hailin Chen, Weishi Wang, Fangkai Jiao, Xuan Long Do, Chengwei Qin, Bosheng Ding, Xiaobao Guo, Minzhi Li, Xingxuan Li, Shafiq Joty

As Large Language Models (LLMs) become popular, an important trend has emerged of using multimodality to augment the LLMs' generation ability, which enables LLMs to better interact with the world.

Retrieval Survey

Audio-Visual Deception Detection: DOLOS Dataset and Parameter-Efficient Crossmodal Learning

1 code implementation ICCV 2023 Xiaobao Guo, Nithish Muthuchamy Selvaraj, Zitong Yu, Adams Wai-Kin Kong, Bingquan Shen, Alex Kot

Despite this, deception detection research is hindered by the lack of high-quality deception datasets, as well as the difficulties of learning multimodal features effectively.

Deception Detection Multi-Task Learning

Flexible-modal Deception Detection with Audio-Visual Adapter

no code implementations 11 Feb 2023 Zhaoxu Li, Zitong Yu, Nithish Muthuchamy Selvaraj, Xiaobao Guo, Bingquan Shen, Adams Wai-Kin Kong, Alex Kot

Detecting deception from human behavior is vital in many fields, such as customs security and multimedia anti-fraud.

Deception Detection

Towards Photo-Realistic Virtual Try-On by Adaptively Generating$\leftrightarrow$Preserving Image Content

3 code implementations 12 Mar 2020 Han Yang, Ruimao Zhang, Xiaobao Guo, Wei Liu, WangMeng Zuo, Ping Luo

First, a semantic layout generation module utilizes semantic segmentation of the reference image to progressively predict the desired semantic layout after try-on.

Ranked #4 on Virtual Try-on on VITON (IS metric)

Layout Generation Semantic Segmentation +1
