Search Results for author: Jia Jia

Found 13 papers, 2 papers with code

Imitating Arbitrary Talking Style for Realistic Audio-Driven Talking Face Synthesis

1 code implementation • 30 Oct 2021 • Haozhe Wu, Jia Jia, Haoyu Wang, Yishun Dou, Chao Duan, Qingshan Deng

Due to the huge differences between talking styles, it is necessary to incorporate talking style into the audio-driven talking face synthesis framework.

Face Generation

Towards Multi-Scale Style Control for Expressive Speech Synthesis

no code implementations • 8 Apr 2021 • Xiang Li, Changhe Song, Jingbei Li, Zhiyong Wu, Jia Jia, Helen Meng

This paper introduces a multi-scale speech style modeling method for end-to-end expressive speech synthesis.

Expressive Speech Synthesis • Style Transfer

ChoreoNet: Towards Music to Dance Synthesis with Choreographic Action Unit

no code implementations • 16 Sep 2020 • Zijie Ye, Haozhe Wu, Jia Jia, Yaohua Bu, Wei Chen, Fanbo Meng, Yan-Feng Wang

Meanwhile, human choreographers design dance motions from music in a two-stage manner: they first devise multiple choreographic action units (CAUs), each with a series of dance motions, and then arrange the CAU sequence according to the rhythm, melody and emotion of the music.
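The two-stage pipeline described above lends itself to a compact sketch: predict a CAU sequence from music features, then decode each CAU into a short motion segment. The module names, dimensions, and CAU vocabulary size below are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a two-stage music-to-dance pipeline in the spirit of
# ChoreoNet; all names and shapes are illustrative, not the authors' code.
import torch
import torch.nn as nn

class CAUPredictor(nn.Module):
    """Stage 1: map a music feature sequence to a sequence of CAU ids."""
    def __init__(self, music_dim=64, hidden=128, num_caus=50):
        super().__init__()
        self.rnn = nn.GRU(music_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_caus)

    def forward(self, music_feats):           # (B, T, music_dim)
        h, _ = self.rnn(music_feats)
        return self.head(h)                   # (B, T, num_caus) CAU logits

class MotionDecoder(nn.Module):
    """Stage 2: expand each predicted CAU into a short motion segment."""
    def __init__(self, num_caus=50, embed=64, frames=8, joints=24 * 3):
        super().__init__()
        self.embed = nn.Embedding(num_caus, embed)
        self.out = nn.Linear(embed, frames * joints)
        self.frames, self.joints = frames, joints

    def forward(self, cau_ids):               # (B, T) integer CAU ids
        e = self.embed(cau_ids)               # (B, T, embed)
        m = self.out(e)                       # (B, T, frames * joints)
        return m.view(*cau_ids.shape, self.frames, self.joints)

music = torch.randn(2, 16, 64)                # toy batch of music features
caus = CAUPredictor()(music).argmax(-1)       # pick a CAU per timestep
motion = MotionDecoder()(caus)                # (2, 16, 8, 72) motion frames
```

Separating choreography (stage 1) from motion synthesis (stage 2) mirrors the two-stage human workflow the abstract describes.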

Visual-speech Synthesis of Exaggerated Corrective Feedback

no code implementations • 12 Sep 2020 • Yaohua Bu, Weijun Li, Tianyi Ma, Shengqi Chen, Jia Jia, Kun Li, Xiaobo Lu

To provide more discriminative feedback for second language (L2) learners to better identify their mispronunciations, we propose a method for exaggerated visual-speech feedback in computer-assisted pronunciation training (CAPT).

Speech Synthesis

Speaker Independent and Multilingual/Mixlingual Speech-Driven Talking Head Generation Using Phonetic Posteriorgrams

no code implementations • 20 Jun 2020 • Huirong Huang, Zhiyong Wu, Shiyin Kang, Dongyang Dai, Jia Jia, Tianxiao Fu, Deyi Tuo, Guangzhi Lei, Peng Liu, Dan Su, Dong Yu, Helen Meng

Recent approaches mainly have the following limitations: 1) most speaker-independent methods require handcrafted features that are time-consuming to design or unreliable; 2) there is no convincing method that supports multilingual or mixlingual speech as input.

Talking Head Generation

Mining Unfollow Behavior in Large-Scale Online Social Networks via Spatial-Temporal Interaction

1 code implementation • 17 Nov 2019 • Haozhe Wu, Zhiyuan Hu, Jia Jia, Yaohua Bu, Xiangnan He, Tat-Seng Chua

Next, we define users' attributes in two categories: spatial attributes (e.g., a user's social role) and temporal attributes (e.g., a user's post content).

Informativeness
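As a purely illustrative aside, the spatial/temporal attribute split from the abstract could be organized as below; the field names are assumptions for the example, not the paper's actual schema.

```python
# Illustrative grouping of user attributes into spatial vs. temporal categories;
# field names are hypothetical, not taken from the paper.
from dataclasses import dataclass, field
from typing import List

@dataclass
class SpatialAttributes:
    social_role: str                 # e.g. a user's role in the network
    follower_count: int = 0

@dataclass
class TemporalAttributes:
    recent_posts: List[str] = field(default_factory=list)  # e.g. post content

@dataclass
class UserProfile:
    user_id: str
    spatial: SpatialAttributes
    temporal: TemporalAttributes

u = UserProfile("u42",
                SpatialAttributes(social_role="hub", follower_count=1200),
                TemporalAttributes(recent_posts=["post one", "post two"]))
print(u.spatial.social_role, len(u.temporal.recent_posts))
```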

Exploring RNN-Transducer for Chinese Speech Recognition

no code implementations • 13 Nov 2018 • Senmao Wang, Pan Zhou, Wei Chen, Jia Jia, Lei Xie

End-to-end approaches have drawn much attention recently for significantly simplifying the construction of an automatic speech recognition (ASR) system.

Automatic Speech Recognition

An Online Attention-based Model for Speech Recognition

no code implementations • 13 Nov 2018 • Ruchao Fan, Pan Zhou, Wei Chen, Jia Jia, Gang Liu

In previous work, researchers have shown that such architectures can achieve results comparable to state-of-the-art ASR systems, especially when using a bidirectional encoder and a global soft attention (GSA) mechanism.

Automatic Speech Recognition
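For context on the baseline this paper contrasts with, here is a minimal sketch of global soft attention (GSA): it scores every encoder frame, so the full utterance must be available before decoding, which is what an online model avoids. The dot-product scorer and dimensions are assumptions, not the paper's exact formulation.

```python
# Minimal sketch of global soft attention (GSA) over encoder states.
# The dot-product score function is an illustrative choice.
import torch
import torch.nn.functional as F

def global_soft_attention(decoder_state, encoder_states):
    """decoder_state: (B, H); encoder_states: (B, T, H).
    Attends over every encoder timestep, so the whole utterance
    must be encoded first -- unsuitable for streaming recognition."""
    scores = torch.bmm(encoder_states, decoder_state.unsqueeze(-1)).squeeze(-1)  # (B, T)
    weights = F.softmax(scores, dim=-1)                                          # global attention
    context = torch.bmm(weights.unsqueeze(1), encoder_states).squeeze(1)         # (B, H)
    return context, weights

enc = torch.randn(2, 50, 256)   # toy encoder outputs for 50 frames
dec = torch.randn(2, 256)       # current decoder state
ctx, w = global_soft_attention(dec, enc)
```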

Modality Attention for End-to-End Audio-visual Speech Recognition

no code implementations • 13 Nov 2018 • Pan Zhou, Wenwen Yang, Wei Chen, Yan-Feng Wang, Jia Jia

In this paper, we propose a novel multimodal attention-based method for audio-visual speech recognition that can automatically learn the fused representation from both modalities based on their importance.

Audio-Visual Speech Recognition • Robust Speech Recognition +1
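A minimal sketch of the modality-attention idea from the abstract: project each modality, compute a learned importance weight per modality, and take the weighted sum as the fused representation. Layer sizes and module names are placeholders, not the authors' implementation.

```python
# Hypothetical sketch of modality attention for audio-visual fusion: each
# modality gets a learned importance score, and the fused representation is
# the importance-weighted sum. Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityAttentionFusion(nn.Module):
    def __init__(self, audio_dim=256, video_dim=256, fused_dim=256):
        super().__init__()
        self.proj_a = nn.Linear(audio_dim, fused_dim)
        self.proj_v = nn.Linear(video_dim, fused_dim)
        self.score = nn.Linear(fused_dim, 1)   # scalar importance per modality

    def forward(self, audio, video):           # (B, T, audio_dim), (B, T, video_dim)
        a, v = self.proj_a(audio), self.proj_v(video)
        stacked = torch.stack([a, v], dim=2)            # (B, T, 2, fused_dim)
        weights = F.softmax(self.score(stacked), dim=2) # normalize over modalities
        return (weights * stacked).sum(dim=2)           # (B, T, fused_dim)

fused = ModalityAttentionFusion()(torch.randn(2, 30, 256),
                                  torch.randn(2, 30, 256))
```

Because the weights are recomputed per timestep, a noisy modality (e.g., corrupted audio) can be down-weighted automatically, which is the robustness motivation the abstract points to.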
