Search Results for author: Jitong Chen

Found 10 papers, 4 papers with code

Differentiable Wavetable Synthesis

1 code implementation19 Nov 2021 Siyuan Shan, Lamtharn Hantrakul, Jitong Chen, Matt Avent, David Trevelyan

Differentiable Wavetable Synthesis (DWTS) is a technique for neural audio synthesis which learns a dictionary of one-period waveforms, i.e. wavetables, through end-to-end training.
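To make the "one-period waveform" idea concrete, here is a minimal classical wavetable oscillator sketch in NumPy — not the DWTS model itself, just the playback mechanism a learned wavetable would plug into. The table size, sample rate, and function name are illustrative assumptions.

```python
import numpy as np

def wavetable_osc(table, freq, sr, n_samples):
    """Render n_samples of audio from a single-cycle wavetable.

    The phase advances by freq/sr of a cycle per output sample, and
    samples between table entries are linearly interpolated.
    """
    table_len = len(table)
    phase = (np.arange(n_samples) * freq / sr) % 1.0  # position within the cycle
    pos = phase * table_len
    idx = pos.astype(int)
    frac = pos - idx
    nxt = (idx + 1) % table_len  # wrap around at the end of the table
    return (1 - frac) * table[idx] + frac * table[nxt]

sr = 16000
table = np.sin(2 * np.pi * np.arange(2048) / 2048)  # one period of a sine
audio = wavetable_osc(table, freq=440.0, sr=sr, n_samples=sr)
```

In DWTS the table entries themselves would be learnable parameters trained end-to-end, rather than a fixed sine as here.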

Audio Synthesis One-Shot Learning

GiantMIDI-Piano: A large-scale MIDI dataset for classical piano music

3 code implementations11 Oct 2020 Qiuqiang Kong, Bochen Li, Jitong Chen, Yuxuan Wang

In this article, we create a GiantMIDI-Piano (GP) dataset containing 38,700,838 transcribed notes and 10,855 unique solo piano works composed by 2,786 composers.

Information Retrieval Music Information Retrieval +1

ByteSing: A Chinese Singing Voice Synthesis System Using Duration Allocated Encoder-Decoder Acoustic Models and WaveRNN Vocoders

no code implementations23 Apr 2020 Yu Gu, Xiang Yin, Yonghui Rao, Yuan Wan, Benlai Tang, Yang Zhang, Jitong Chen, Yuxuan Wang, Zejun Ma

This paper presents ByteSing, a Chinese singing voice synthesis (SVS) system based on duration allocated Tacotron-like acoustic models and WaveRNN neural vocoders.

Singing Voice Synthesis

Neural Voice Cloning with a Few Samples

2 code implementations NeurIPS 2018 Sercan O. Arik, Jitong Chen, Kainan Peng, Wei Ping, Yanqi Zhou

Speaker adaptation is based on fine-tuning a multi-speaker generative model with a few cloning samples.

Speech Synthesis Voice Cloning

Supervised Speech Separation Based on Deep Learning: An Overview

no code implementations24 Aug 2017 DeLiang Wang, Jitong Chen

A more recent approach formulates speech separation as a supervised learning problem, where the discriminative patterns of speech, speakers, and background noise are learned from training data.
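A common instance of this supervised formulation is predicting a time-frequency mask; the sketch below computes the ideal ratio mask (IRM) that serves as a typical training target. The array shapes and variable names are toy assumptions, not the paper's setup.

```python
import numpy as np

def ideal_ratio_mask(speech_pow, noise_pow, beta=0.5):
    """IRM per time-frequency unit: (S / (S + N)) ** beta.

    speech_pow and noise_pow are power spectrograms of the clean speech
    and the background noise; a DNN would be trained to predict this
    mask from features of the noisy mixture.
    """
    return (speech_pow / (speech_pow + noise_pow + 1e-12)) ** beta

rng = np.random.default_rng(0)
speech = rng.random((257, 100))  # toy |STFT|^2 of clean speech
noise = rng.random((257, 100))   # toy |STFT|^2 of background noise
mask = ideal_ratio_mask(speech, noise)

# Applying the mask to the noisy magnitude attenuates noise-dominated units.
enhanced = mask * np.sqrt(speech + noise)
```

The mask lies in [0, 1] by construction, which is what makes it a convenient regression target for the learned discriminative patterns the abstract describes.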

Speaker Separation Speech Dereverberation +1

Exploring Neural Transducers for End-to-End Speech Recognition

no code implementations24 Jul 2017 Eric Battenberg, Jitong Chen, Rewon Child, Adam Coates, Yashesh Gaur, Yi Li, Hairong Liu, Sanjeev Satheesh, David Seetapun, Anuroop Sriram, Zhenyao Zhu

In this work, we perform an empirical comparison among the CTC, RNN-Transducer, and attention-based Seq2Seq models for end-to-end speech recognition.

Language Modelling speech-recognition +1
