Search Results for author: Xinjian Li

Found 24 papers, 4 papers with code

Multi-Faceted Hierarchical Multi-Task Learning for a Large Number of Tasks with Multi-dimensional Relations

no code implementations 26 Oct 2021 Junning Liu, Zijie Xia, Yu Lei, Xinjian Li, Xu Wang

For example, when using MTL to model various user behaviors in RS, if we differentiate new users and new items from old ones, there will be a Cartesian-product-style increase in tasks with multi-dimensional relations.

Multi-Task Learning, Recommendation Systems

On Prosody Modeling for ASR+TTS based Voice Conversion

no code implementations 20 Jul 2021 Wen-Chin Huang, Tomoki Hayashi, Xinjian Li, Shinji Watanabe, Tomoki Toda

In voice conversion (VC), an approach showing promising results in the latest voice conversion challenge (VCC) 2020 is to first use an automatic speech recognition (ASR) model to transcribe the source speech into the underlying linguistic contents; these are then used as input by a text-to-speech (TTS) system to generate the converted speech.

Automatic Speech Recognition, Voice Conversion

Phoneme Recognition through Fine Tuning of Phonetic Representations: a Case Study on Luhya Language Varieties

no code implementations 4 Apr 2021 Kathleen Siminyu, Xinjian Li, Antonios Anastasopoulos, David Mortensen, Michael R. Marlo, Graham Neubig

Models pre-trained on multiple languages have shown significant promise for improving speech recognition, particularly for low-resource languages.

Speech Recognition

End-to-end Quantized Training via Log-Barrier Extensions

no code implementations 1 Jan 2021 Juncheng B Li, Shuhui Qu, Xinjian Li, Emma Strubell, Florian Metze

Quantization of neural network parameters and activations has emerged as a successful approach to reducing the model size and inference time on hardware that supports native low-precision arithmetic.


Acoustics Based Intent Recognition Using Discovered Phonetic Units for Low Resource Languages

no code implementations 7 Nov 2020 Akshat Gupta, Xinjian Li, Sai Krishna Rallabandi, Alan W Black

With the aim of aiding development of spoken dialog systems in low-resource languages, we propose a novel acoustics-based intent recognition system that uses discovered phonetic units for intent classification.

Automatic Speech Recognition, Cross-Lingual Transfer

Revisiting Factorizing Aggregated Posterior in Learning Disentangled Representations

no code implementations 12 Sep 2020 Ze Cheng, Juncheng Li, Chenxu Wang, Jixuan Gu, Hao Xu, Xinjian Li, Florian Metze

In this paper, we provide a theoretical explanation of why low total correlation of the sampled representation cannot guarantee low total correlation of the mean representation.

AlloVera: A Multilingual Allophone Database

no code implementations LREC 2020 David R. Mortensen, Xinjian Li, Patrick Littell, Alexis Michaud, Shruti Rijhwani, Antonios Anastasopoulos, Alan W. Black, Florian Metze, Graham Neubig

While phonemic representations are language specific, phonetic representations (stated in terms of (allo)phones) are much closer to a universal (language-independent) transcription.

Speech Recognition

Towards Zero-shot Learning for Automatic Phonemic Transcription

no code implementations 26 Feb 2020 Xinjian Li, Siddharth Dalmia, David R. Mortensen, Juncheng Li, Alan W. Black, Florian Metze

The difficulty of this task is that phoneme inventories often differ between the training languages and the target language, making it infeasible to recognize unseen phonemes.

Zero-Shot Learning

Adversarial Music: Real World Audio Adversary Against Wake-word Detection System

no code implementations NeurIPS 2019 Juncheng B. Li, Shuhui Qu, Xinjian Li, Joseph Szurley, J. Zico Kolter, Florian Metze

In this work, we target our attack on the wake-word detection system, jamming the model with some inconspicuous background music to deactivate the VAs while our audio adversary is present.

Real-World Adversarial Attack


no code implementations 25 Sep 2019 Ze Cheng, Juncheng B Li, Chenxu Wang, Jixuan Gu, Hao Xu, Xinjian Li, Florian Metze

In the problem of unsupervised learning of disentangled representations, one of the promising methods is to penalize the total correlation of sampled latent variables.


Multilingual Speech Recognition with Corpus Relatedness Sampling

no code implementations 2 Aug 2019 Xinjian Li, Siddharth Dalmia, Alan W. Black, Florian Metze

For example, the target corpus might benefit more from a corpus in the same domain or a corpus from a close language.

Speech Recognition

SHE2: Stochastic Hamiltonian Exploration and Exploitation for Derivative-Free Optimization

no code implementations ICLR 2019 Haoyi Xiong, Wenqing Hu, Zhanxing Zhu, Xinjian Li, Yunchao Zhang, Jun Huan

Derivative-free optimization (DFO) using trust region methods is frequently used for machine learning applications, such as (hyper-)parameter optimization without the derivatives of objective functions known.

Text-to-Image Generation

The ARIEL-CMU Systems for LoReHLT18

no code implementations 24 Feb 2019 Aditi Chaudhary, Siddharth Dalmia, Junjie Hu, Xinjian Li, Austin Matthews, Aldrian Obaja Muis, Naoki Otani, Shruti Rijhwani, Zaid Sheikh, Nidhi Vyas, Xinyi Wang, Jiateng Xie, Ruochen Xu, Chunting Zhou, Peter J. Jansen, Yiming Yang, Lori Levin, Florian Metze, Teruko Mitamura, David R. Mortensen, Graham Neubig, Eduard Hovy, Alan W. Black, Jaime Carbonell, Graham V. Horwood, Shabnam Tafreshi, Mona Diab, Efsun S. Kayi, Noura Farra, Kathleen McKeown

This paper describes the ARIEL-CMU submissions to the Low Resource Human Language Technologies (LoReHLT) 2018 evaluations for the tasks Machine Translation (MT), Entity Discovery and Linking (EDL), and detection of Situation Frames in Text and Speech (SF Text and Speech).

Machine Translation, Translation

Phoneme Level Language Models for Sequence Based Low Resource ASR

no code implementations 20 Feb 2019 Siddharth Dalmia, Xinjian Li, Alan W. Black, Florian Metze

Building multilingual and crosslingual models helps bring different languages together in a language-universal space.

Language Modelling

Real-time Neural-based Input Method

no code implementations ICLR 2019 Jiali Yao, Raphael Shu, Xinjian Li, Katsutoshi Ohtsuki, Hideki Nakayama

The input method is an essential service on every mobile and desktop device that provides text suggestions.

Language Modelling

Domain Robust Feature Extraction for Rapid Low Resource ASR Development

no code implementations 28 Jul 2018 Siddharth Dalmia, Xinjian Li, Florian Metze, Alan W. Black

We demonstrate the effectiveness of using a pre-trained English recognizer, which is robust to such mismatched conditions, as a domain-normalizing feature extractor on a low-resource language.
