Search Results for author: Fuming Fang

Found 11 papers, 2 papers with code

A unified multichannel far-field speech recognition system: combining neural beamforming with attention based end-to-end model

no code implementations • 5 Jan 2024 • Dongdi Zhao, Jianbo Ma, Lu Lu, Jinke Li, Xuan Ji, Lei Zhu, Fuming Fang, Ming Liu, Feijun Jiang

Far-field speech recognition is a challenging task that conventionally uses signal-processing beamforming to address noise and interference problems.

Speech Enhancement speech-recognition +1

A Method for Identifying Origin of Digital Images Using a Convolution Neural Network

no code implementations • 2 Nov 2019 • Rong Huang, Fuming Fang, Huy H. Nguyen, Junichi Yamagishi, Isao Echizen

The rapid development of deep learning techniques has created new challenges in identifying the origin of digital images because generative adversarial networks and variational autoencoders can create plausible digital images whose contents are not present in natural scenes.

Security of Facial Forensics Models Against Adversarial Attacks

no code implementations • 2 Nov 2019 • Rong Huang, Fuming Fang, Huy H. Nguyen, Junichi Yamagishi, Isao Echizen

We experimentally demonstrated the existence of individual adversarial perturbations (IAPs) and universal adversarial perturbations (UAPs) that can lead a well-performing facial forensics model (FFM) to misbehave.

Zero-Shot Multi-Speaker Text-To-Speech with State-of-the-art Neural Speaker Embeddings

3 code implementations • 23 Oct 2019 • Erica Cooper, Cheng-I Lai, Yusuke Yasuda, Fuming Fang, Xin Wang, Nanxin Chen, Junichi Yamagishi

While speaker adaptation for end-to-end speech synthesis using speaker embeddings can produce good speaker similarity for speakers seen during training, there remains a gap for zero-shot adaptation to unseen speakers.

Audio and Speech Processing

Generating Sentiment-Preserving Fake Online Reviews Using Neural Language Models and Their Human- and Machine-based Detection

no code implementations • 22 Jul 2019 • David Ifeoluwa Adelani, Haotian Mai, Fuming Fang, Huy H. Nguyen, Junichi Yamagishi, Isao Echizen

Advanced neural language models (NLMs) are widely used in sequence generation tasks because they are able to produce fluent and meaningful sentences.

Multi-task Learning For Detecting and Segmenting Manipulated Facial Images and Videos

1 code implementation • 17 Jun 2019 • Huy H. Nguyen, Fuming Fang, Junichi Yamagishi, Isao Echizen

The output of one branch of the decoder is used for segmenting the manipulated regions while that of the other branch is used for reconstructing the input, which helps improve overall performance.

Binary Classification Face Swapping +2

Speaker Anonymization Using X-vector and Neural Waveform Models

no code implementations • 30 May 2019 • Fuming Fang, Xin Wang, Junichi Yamagishi, Isao Echizen, Massimiliano Todisco, Nicholas Evans, Jean-Francois Bonastre

One solution to mitigate these concerns involves the concealing of speaker identities before the sharing of speech data.

Speaker Verification Speech Synthesis

Joint training framework for text-to-speech and voice conversion using multi-source Tacotron and WaveNet

no code implementations • 29 Mar 2019 • Mingyang Zhang, Xin Wang, Fuming Fang, Haizhou Li, Junichi Yamagishi

We propose using an extended Tacotron architecture, namely a multi-source sequence-to-sequence model with a dual attention mechanism, as the shared model for both the TTS and VC tasks.

Speech Synthesis Voice Conversion

Audiovisual speaker conversion: jointly and simultaneously transforming facial expression and acoustic characteristics

no code implementations • 29 Oct 2018 • Fuming Fang, Xin Wang, Junichi Yamagishi, Isao Echizen

Transforming the facial and acoustic features together makes it possible for the converted voice and facial expressions to be highly correlated and for the generated target speaker to appear and sound natural.

Image Reconstruction

High-quality nonparallel voice conversion based on cycle-consistent adversarial network

no code implementations • 2 Apr 2018 • Fuming Fang, Junichi Yamagishi, Isao Echizen, Jaime Lorenzo-Trueba

Although voice conversion (VC) algorithms have achieved remarkable success along with the development of machine learning, superior performance is still difficult to achieve when using nonparallel data.

Generative Adversarial Network Image-to-Image Translation +4

Can we steal your vocal identity from the Internet?: Initial investigation of cloning Obama's voice using GAN, WaveNet and low-quality found data

no code implementations • 2 Mar 2018 • Jaime Lorenzo-Trueba, Fuming Fang, Xin Wang, Isao Echizen, Junichi Yamagishi, Tomi Kinnunen

Thanks to the growing availability of spoofing databases and rapid advances in using them, systems for detecting voice spoofing attacks are becoming more and more capable, and error rates close to zero are being reached for the ASVspoof2015 database.

Generative Adversarial Network Speech Enhancement +2
