no code implementations • 27 Jun 2023 • Yuhao Cui, Xiongwei Wang, Zhongzhou Zhao, Wei Zhou, Haiqing Chen
However, these high-level semantic probabilities are often inaccurate and non-smooth at the phoneme level, which biases learning.
1 code implementation • ICCV 2023 • Yanzhao Zheng, Yunzhou Shi, Yuhao Cui, Zhongzhou Zhao, Zhiling Luo, Wei Zhou
To address this issue, we propose a novel framework called COOP (DeCOupling and COupling of Whole-Body GrasPing Pose Generation) to synthesize life-like whole-body poses that cover the widest range of human grasping capabilities.
1 code implementation • 16 Aug 2021 • Yuhao Cui, Zhou Yu, Chunqi Wang, Zhongzhou Zhao, Ji Zhang, Meng Wang, Jun Yu
Nevertheless, most existing VLP approaches have not fully utilized the intrinsic knowledge within the image-text pairs, which limits the effectiveness of the learned alignments and further restricts the performance of their models.
1 code implementation • 25 Apr 2020 • Zhou Yu, Yuhao Cui, Jun Yu, Meng Wang, Dacheng Tao, Qi Tian
Most existing works focus on a single task and design neural architectures manually, which are highly task-specific and hard to generalize to different tasks.
Ranked #19 on Visual Question Answering (VQA) on VQA v2 test-std
no code implementations • 12 Aug 2019 • Zhou Yu, Yuhao Cui, Jun Yu, Dacheng Tao, Qi Tian
Learning an effective attention mechanism for multimodal data is important in many vision-and-language tasks that require a synergistic understanding of both the visual and textual content.
7 code implementations • CVPR 2019 • Zhou Yu, Jun Yu, Yuhao Cui, Dacheng Tao, Qi Tian
In this paper, we propose a deep Modular Co-Attention Network (MCAN) that consists of Modular Co-Attention (MCA) layers cascaded in depth.
Ranked #5 on Question Answering on SQA3D
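The cascaded-in-depth idea behind MCAN can be illustrated with a minimal NumPy sketch: each MCA layer applies self-attention within each modality, then guided attention where image regions attend to question words, and the layers are stacked. This is a simplified, hypothetical rendering — the actual MCAN uses multi-head attention with feed-forward, residual, and layer-normalization sublayers, all omitted here.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention (single head, no projections).
    d = q.shape[-1]
    return softmax(q @ k.T / np.sqrt(d)) @ v

def mca_layer(y, x):
    # One simplified Modular Co-Attention (MCA) layer:
    # self-attention over question words, self-attention over
    # image regions, then guided attention where image features
    # are reweighted using the question features.
    y = attention(y, y, y)   # SA: question attends to itself
    x = attention(x, x, x)   # SA: image regions attend to each other
    x = attention(x, y, y)   # GA: image guided by question
    return y, x

rng = np.random.default_rng(0)
y = rng.standard_normal((14, 64))    # 14 question tokens, dim 64
x = rng.standard_normal((100, 64))   # 100 image region features
for _ in range(6):                   # cascade the layers in depth
    y, x = mca_layer(y, x)
print(y.shape, x.shape)
```

Stacking the layers lets later layers refine attention maps computed by earlier ones, which is the "cascaded in depth" aspect the abstract highlights; the per-modality feature dimensions are preserved at every depth.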