1 code implementation • Findings (ACL) 2022 • Xinjian Li, Florian Metze, David Mortensen, Shinji Watanabe, Alan Black
Grapheme-to-Phoneme (G2P) has many applications in NLP and speech fields.
no code implementations • LREC 2022 • Xinjian Li, Florian Metze, David R. Mortensen, Alan W Black, Shinji Watanabe
Identifying phone inventories is a crucial component in language documentation and the preservation of endangered languages.
1 code implementation • 30 Jan 2023 • Takaaki Saeki, Soumi Maiti, Xinjian Li, Shinji Watanabe, Shinnosuke Takamichi, Hiroshi Saruwatari
While neural text-to-speech (TTS) has achieved human-like natural synthetic speech, multilingual TTS systems are limited to resource-rich languages due to the need for paired text and studio-quality audio data.
no code implementations • 31 Oct 2022 • Xinjian Li, Ye Jia, Chung-Cheng Chiu
Research on speech-to-speech translation (S2ST) has progressed rapidly in recent years.
1 code implementation • 6 Sep 2022 • Xinjian Li, Florian Metze, David R Mortensen, Alan W Black, Shinji Watanabe
We achieve 50% CER and 74% WER on the Wilderness dataset with Crubadan statistics only and improve them to 45% CER and 69% WER when using 10000 raw text utterances.
no code implementations • 26 Oct 2021 • Junning Liu, Zijie Xia, Yu Lei, Xinjian Li, Xu Wang
For example, when using multi-task learning (MTL) to model various user behaviors in recommender systems (RS), if we differentiate new users and new items from old ones, there will be a Cartesian-product-style increase of tasks with multi-dimensional relations.
no code implementations • 20 Jul 2021 • Wen-Chin Huang, Tomoki Hayashi, Xinjian Li, Shinji Watanabe, Tomoki Toda
In voice conversion (VC), an approach showing promising results in the latest voice conversion challenge (VCC) 2020 is to first use an automatic speech recognition (ASR) model to transcribe the source speech into the underlying linguistic contents; these are then used as input by a text-to-speech (TTS) system to generate the converted speech.
Automatic Speech Recognition (ASR)
no code implementations • 4 Apr 2021 • Kathleen Siminyu, Xinjian Li, Antonios Anastasopoulos, David Mortensen, Michael R. Marlo, Graham Neubig
Models pre-trained on multiple languages have shown significant promise for improving speech recognition, particularly for low-resource languages.
no code implementations • 2 Apr 2021 • David R. Mortensen, Jordan Picone, Xinjian Li, Kathleen Siminyu
There is additionally interest in building language technologies for low-resource and endangered languages.
no code implementations • 1 Jan 2021 • Juncheng B Li, Shuhui Qu, Xinjian Li, Emma Strubell, Florian Metze
Quantization of neural network parameters and activations has emerged as a successful approach to reducing the model size and inference time on hardware that supports native low-precision arithmetic.
no code implementations • 7 Nov 2020 • Akshat Gupta, Xinjian Li, Sai Krishna Rallabandi, Alan W Black
With the aim of aiding development of spoken dialog systems in low-resource languages, we propose a novel acoustics-based intent recognition system that uses discovered phonetic units for intent classification.
Automatic Speech Recognition (ASR)
no code implementations • 12 Sep 2020 • Ze Cheng, Juncheng Li, Chenxu Wang, Jixuan Gu, Hao Xu, Xinjian Li, Florian Metze
In this paper, we provide a theoretical explanation that low total correlation of sampled representation cannot guarantee low total correlation of the mean representation.
1 code implementation • 18 Jun 2020 • Cheng Cui, Zhi Ye, Yangxi Li, Xinjian Li, Min Yang, Kai Wei, Bing Dai, Yanmei Zhao, Zhongji Liu, Rong Pang
One of the difficulties of this competition is how to use unlabeled data.
Ranked #226 on Image Classification on ImageNet
no code implementations • LREC 2020 • Graham Neubig, Shruti Rijhwani, Alexis Palmer, Jordan MacKenzie, Hilaria Cruz, Xinjian Li, Matthew Lee, Aditi Chaudhary, Luke Gessler, Steven Abney, Shirley Anugrah Hayati, Antonios Anastasopoulos, Olga Zamaraeva, Emily Prud'hommeaux, Jennette Child, Sara Child, Rebecca Knowles, Sarah Moeller, Jeffrey Micher, Yiyuan Li, Sydney Zink, Mengzhou Xia, Roshan S Sharma, Patrick Littell
Despite recent advances in natural language processing and other language technology, the application of such technology to language documentation and conservation has been limited.
no code implementations • LREC 2020 • David R. Mortensen, Xinjian Li, Patrick Littell, Alexis Michaud, Shruti Rijhwani, Antonios Anastasopoulos, Alan W. Black, Florian Metze, Graham Neubig
While phonemic representations are language specific, phonetic representations (stated in terms of (allo)phones) are much closer to a universal (language-independent) transcription.
no code implementations • 26 Feb 2020 • Xinjian Li, Siddharth Dalmia, David R. Mortensen, Juncheng Li, Alan W. Black, Florian Metze
The difficulty of this task is that phoneme inventories often differ between the training languages and the target language, making it infeasible to recognize unseen phonemes.
1 code implementation • 26 Feb 2020 • Xinjian Li, Siddharth Dalmia, Juncheng Li, Matthew Lee, Patrick Littell, Jiali Yao, Antonios Anastasopoulos, David R. Mortensen, Graham Neubig, Alan W. Black, Florian Metze
Multilingual models can improve language processing, particularly for low resource situations, by sharing parameters across languages.
no code implementations • NeurIPS 2019 • Juncheng B. Li, Shuhui Qu, Xinjian Li, Joseph Szurley, J. Zico Kolter, Florian Metze
In this work, we target our attack on the wake-word detection system, jamming the model with some inconspicuous background music to deactivate the VAs while our audio adversary is present.
no code implementations • 25 Sep 2019 • Ze Cheng, Juncheng B Li, Chenxu Wang, Jixuan Gu, Hao Xu, Xinjian Li, Florian Metze
In the problem of unsupervised learning of disentangled representations, one of the promising methods is to penalize the total correlation of sampled latent variables.
no code implementations • 2 Aug 2019 • Xinjian Li, Zhong Zhou, Siddharth Dalmia, Alan W. Black, Florian Metze
In this work, we present SANTLR: Speech Annotation Toolkit for Low Resource Languages.
no code implementations • 2 Aug 2019 • Xinjian Li, Siddharth Dalmia, Alan W. Black, Florian Metze
For example, the target corpus might benefit more from a corpus in the same domain or a corpus from a close language.
1 code implementation • NAACL 2019 • Jiali Yao, Raphael Shu, Xinjian Li, Katsutoshi Ohtsuki, Hideki Nakayama
Input method editor (IME) converts sequential alphabet key inputs to words in a target language.
no code implementations • ICLR 2019 • Haoyi Xiong, Wenqing Hu, Zhanxing Zhu, Xinjian Li, Yunchao Zhang, Jun Huan
Derivative-free optimization (DFO) using trust region methods is frequently used for machine learning applications, such as (hyper-)parameter optimization, where the derivatives of the objective function are not known.
no code implementations • 24 Feb 2019 • Aditi Chaudhary, Siddharth Dalmia, Junjie Hu, Xinjian Li, Austin Matthews, Aldrian Obaja Muis, Naoki Otani, Shruti Rijhwani, Zaid Sheikh, Nidhi Vyas, Xinyi Wang, Jiateng Xie, Ruochen Xu, Chunting Zhou, Peter J. Jansen, Yiming Yang, Lori Levin, Florian Metze, Teruko Mitamura, David R. Mortensen, Graham Neubig, Eduard Hovy, Alan W. Black, Jaime Carbonell, Graham V. Horwood, Shabnam Tafreshi, Mona Diab, Efsun S. Kayi, Noura Farra, Kathleen McKeown
This paper describes the ARIEL-CMU submissions to the Low Resource Human Language Technologies (LoReHLT) 2018 evaluations for the tasks Machine Translation (MT), Entity Discovery and Linking (EDL), and detection of Situation Frames in Text and Speech (SF Text and Speech).
no code implementations • 20 Feb 2019 • Siddharth Dalmia, Xinjian Li, Alan W. Black, Florian Metze
Building multilingual and crosslingual models helps bring different languages together in a language-universal space.
no code implementations • ICLR 2019 • Jiali Yao, Raphael Shu, Xinjian Li, Katsutoshi Ohtsuki, Hideki Nakayama
The input method is an essential service on every mobile and desktop device that provides text suggestions.
no code implementations • 27 Sep 2018 • Xinjian Li, Siddharth Dalmia, David R. Mortensen, Florian Metze, Alan W Black
Our model is able to recognize unseen phonemes in the target language, if only a small text corpus is available.
no code implementations • 28 Jul 2018 • Siddharth Dalmia, Xinjian Li, Florian Metze, Alan W. Black
We demonstrate the effectiveness of using a pre-trained English recognizer, which is robust to such mismatched conditions, as a domain normalizing feature extractor on a low resource language.