Search Results for author: Yuhao Cui

Found 6 papers, 4 papers with code

CASEIN: Cascading Explicit and Implicit Control for Fine-grained Emotion Intensity Regulation

no code implementations · 27 Jun 2023 · Yuhao Cui, Xiongwei Wang, Zhongzhou Zhao, Wei Zhou, Haiqing Chen

However, these high-level semantic probabilities are often inaccurate and non-smooth at the phoneme level, leading to bias in learning.

Disentanglement

COOP: Decoupling and Coupling of Whole-Body Grasping Pose Generation

1 code implementation · ICCV 2023 · Yanzhao Zheng, Yunzhou Shi, Yuhao Cui, Zhongzhou Zhao, Zhiling Luo, Wei Zhou

To address this issue, we propose a novel framework called COOP (DeCOupling and COupling of Whole-Body GrasPing Pose Generation) to synthesize life-like whole-body poses that cover the widest range of human grasping capabilities.
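
The framework's name encodes its design: pose generation is first decoupled into separate parts and then coupled back into one consistent whole-body pose. The snippet below is a purely illustrative PyTorch sketch of that generic decouple-then-couple pattern; the body/hand split, module names, and all dimensions are assumptions for illustration, not the paper's actual architecture.

```python
import torch
import torch.nn as nn


class DecoupleCoupleGenerator(nn.Module):
    """Illustrative decouple-then-couple pattern (hypothetical stand-in,
    not COOP itself): generate two pose parts from an object encoding
    independently, then jointly refine them so they agree."""

    def __init__(self, obj_dim: int = 64, body_dim: int = 63, hand_dim: int = 45):
        super().__init__()
        # Decoupled branches: each part is generated on its own.
        self.body_net = nn.Sequential(nn.Linear(obj_dim, 128), nn.ReLU(), nn.Linear(128, body_dim))
        self.hand_net = nn.Sequential(nn.Linear(obj_dim, 128), nn.ReLU(), nn.Linear(128, hand_dim))
        # Coupling step: refine the concatenated parts into a coherent whole.
        self.coupler = nn.Linear(body_dim + hand_dim, body_dim + hand_dim)

    def forward(self, obj_feat: torch.Tensor) -> torch.Tensor:
        body = self.body_net(obj_feat)  # decoupled body-pose branch
        hand = self.hand_net(obj_feat)  # decoupled hand-pose branch
        return self.coupler(torch.cat([body, hand], dim=-1))  # coupled whole-body pose


# Usage: a batch of 2 object encodings -> 2 whole-body pose vectors.
poses = DecoupleCoupleGenerator()(torch.randn(2, 64))
print(poses.shape)  # torch.Size([2, 108])
```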

ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration

1 code implementation · 16 Aug 2021 · Yuhao Cui, Zhou Yu, Chunqi Wang, Zhongzhou Zhao, Ji Zhang, Meng Wang, Jun Yu

Nevertheless, most existing VLP approaches have not fully utilized the intrinsic knowledge within the image-text pairs, which limits the effectiveness of the learned alignments and further restricts the performance of their models.

Visual Reasoning

Deep Multimodal Neural Architecture Search

1 code implementation · 25 Apr 2020 · Zhou Yu, Yuhao Cui, Jun Yu, Meng Wang, Dacheng Tao, Qi Tian

Most existing works focus on a single task and design neural architectures manually, which are highly task-specific and hard to generalize to different tasks.

Image-text matching · Neural Architecture Search · +4

Multimodal Unified Attention Networks for Vision-and-Language Interactions

no code implementations · 12 Aug 2019 · Zhou Yu, Yuhao Cui, Jun Yu, Dacheng Tao, Qi Tian

Learning an effective attention mechanism for multimodal data is important in many vision-and-language tasks that require a synergic understanding of both the visual and textual contents.

Question Answering · Visual Grounding · +1

Deep Modular Co-Attention Networks for Visual Question Answering

7 code implementations · CVPR 2019 · Zhou Yu, Jun Yu, Yuhao Cui, Dacheng Tao, Qi Tian

In this paper, we propose a deep Modular Co-Attention Network (MCAN) that consists of Modular Co-Attention (MCA) layers cascaded in depth.

Question Answering · Visual Question Answering
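
The abstract describes the architecture at a high level: Modular Co-Attention layers, each combining within-modality self-attention and question-guided attention over image regions, stacked to form a deep network. Below is a minimal PyTorch sketch of that cascaded design, assuming 512-d features, 8 attention heads, and depth 6; it omits the feed-forward sublayers, layer normalization, and the encoder-decoder stacking variants of the paper, and all module names are illustrative rather than the authors' implementation.

```python
import torch
import torch.nn as nn


class MCALayer(nn.Module):
    """One simplified Modular Co-Attention (MCA) layer: self-attention
    within each modality, then question-guided attention over image regions."""

    def __init__(self, dim: int = 512, heads: int = 8):
        super().__init__()
        self.sa_text = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.sa_img = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.guided = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, text, img):
        text = text + self.sa_text(text, text, text)[0]  # intra-question attention
        img = img + self.sa_img(img, img, img)[0]        # intra-image attention
        img = img + self.guided(img, text, text)[0]      # question-guided image attention
        return text, img


class MCAN(nn.Module):
    """MCA layers cascaded in depth, as the abstract describes."""

    def __init__(self, dim: int = 512, depth: int = 6):
        super().__init__()
        self.layers = nn.ModuleList(MCALayer(dim) for _ in range(depth))

    def forward(self, text, img):
        for layer in self.layers:
            text, img = layer(text, img)
        return text, img


# Usage: a batch of 14-token question features and 36 region features.
q_out, v_out = MCAN()(torch.randn(2, 14, 512), torch.randn(2, 36, 512))
print(q_out.shape, v_out.shape)  # torch.Size([2, 14, 512]) torch.Size([2, 36, 512])
```

In the paper itself, the cascaded layers are arranged in an encoder-decoder fashion and followed by attention-based reduction and fusion modules before the answer classifier; the sketch above only shows the depth-wise cascading.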
