Search Results for author: Yuhang Yang

Found 8 papers, 3 papers with code

EgoChoir: Capturing 3D Human-Object Interaction Regions from Egocentric Views

no code implementations 22 May 2024 Yuhang Yang, Wei Zhai, Chengfeng Wang, Chengjun Yu, Yang Cao, Zheng-Jun Zha

For egocentric HOI, in addition to perceiving semantics, e.g., "what" interaction is occurring, capturing "where" the interaction specifically manifests in 3D space is also crucial, as it links perception and operation.

Human-Object Interaction Detection, Object

LEMON: Learning 3D Human-Object Interaction Relation from 2D Images

no code implementations CVPR 2024 Yuhang Yang, Wei Zhai, Hongchen Luo, Yang Cao, Zheng-Jun Zha

Which underexploit certain correlations between the interaction counterparts (human and object), and struggle to address the uncertainty in interactions.

Human-Object Interaction Detection, Object, +1

Adapting OpenAI's Whisper for Speech Recognition on Code-Switch Mandarin-English SEAME and ASRU2019 Datasets

no code implementations 29 Nov 2023 Yuhang Yang, Yizhou Peng, Xionghu Zhong, Hao Huang, Eng Siong Chng

The Mixed Error Rate results show that as little as $1\sim10$ hours of adaptation data may be enough to reach saturation in performance gain (SEAME), while the ASRU task continued to show performance gains with more adaptation data ($>$100 hours).

speech-recognition, Speech Recognition

Grounding 3D Object Affordance from 2D Interactions in Images

1 code implementation ICCV 2023 Yuhang Yang, Wei Zhai, Hongchen Luo, Yang Cao, Jiebo Luo, Zheng-Jun Zha

Comprehensive experiments on PIAD demonstrate the reliability of the proposed task and the superiority of our method.

Object

Speech-text based multi-modal training with bidirectional attention for improved speech recognition

1 code implementation 1 Nov 2022 Yuhang Yang, HaiHua Xu, Hao Huang, Eng Siong Chng, Sheng Li

To let the state-of-the-art end-to-end ASR model enjoy data efficiency, as well as much more unpaired text data by multi-modal training, one needs to address two problems: 1) the synchronicity of feature sampling rates between speech and language (aka text data); 2) the homogeneity of the learned representations from two encoders.

speech-recognition, Speech Recognition

Self-supervised Feature Enhancement: Applying Internal Pretext Task to Supervised Learning

no code implementations 9 Jun 2021 Yuhang Yang, Zilin Ding, Xuan Cheng, Xiaomin Wang, Ming Liu

In this paper, we show that feature transformations within CNNs can also be regarded as supervisory signals to construct the self-supervised task, called the "internal pretext task".

Self-Supervised Learning
