no code implementations • ACL (IWSLT) 2021 • Antonios Anastasopoulos, Ondřej Bojar, Jacob Bremerman, Roldano Cattoni, Maha Elbayad, Marcello Federico, Xutai Ma, Satoshi Nakamura, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Alexander Waibel, Changhan Wang, Matthew Wiesner
This year, the evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2021) featured four shared tasks: (i) simultaneous speech translation, (ii) offline speech translation, (iii) multilingual speech translation, and (iv) low-resource speech translation.
no code implementations • IWSLT (ACL) 2022 • Jinyi Yang, Amir Hussein, Matthew Wiesner, Sanjeev Khudanpur
This paper details the Johns Hopkins speech translation (ST) system used in the IWSLT 2022 dialect speech translation task.
Automatic Speech Recognition (ASR)
no code implementations • 1 Jun 2023 • Dongji Gao, Matthew Wiesner, Hainan Xu, Leibny Paola Garcia, Daniel Povey, Sanjeev Khudanpur
Imperfectly transcribed speech is a prevalent issue in human-annotated speech corpora, which degrades the performance of ASR models.
Automatic Speech Recognition (ASR)
no code implementations • 2 Nov 2022 • Brian Yan, Matthew Wiesner, Ondrej Klejch, Preethi Jyothi, Shinji Watanabe
In this work, we seek to build effective code-switched (CS) automatic speech recognition (ASR) systems under the zero-shot setting, where no transcribed CS speech data is available for training.
Automatic Speech Recognition (ASR)
no code implementations • 10 Oct 2021 • Matthew Wiesner, Desh Raj, Sanjeev Khudanpur
Self-supervised model pre-training has recently garnered significant interest, but relatively few efforts have explored using additional resources in fine-tuning these models.
no code implementations • NAACL 2021 • Motoi Omachi, Yuya Fujita, Shinji Watanabe, Matthew Wiesner
We propose a Transformer-based sequence-to-sequence model for automatic speech recognition (ASR) capable of simultaneously transcribing and annotating audio with linguistic information such as phonemic transcripts or part-of-speech (POS) tags.
Automatic Speech Recognition (ASR)
no code implementations • 2 Feb 2021 • Elizabeth Salesky, Matthew Wiesner, Jacob Bremerman, Roldano Cattoni, Matteo Negri, Marco Turchi, Douglas W. Oard, Matt Post
We present the Multilingual TEDx corpus, built to support speech recognition (ASR) and speech translation (ST) research across many non-English source languages.
no code implementations • ACL 2020 • Elizabeth Salesky, Eleanor Chodroff, Tiago Pimentel, Matthew Wiesner, Ryan Cotterell, Alan W. Black, Jason Eisner
A major hurdle in data-driven research on typology is having sufficient data in many languages to draw meaningful conclusions.
1 code implementation • WS 2020 • Oliver Adams, Matthew Wiesner, Jan Trmal, Garrett Nicolai, David Yarowsky
We investigate the problem of searching for a lexeme-set in speech by searching for its inflectional variants.
no code implementations • NAACL 2019 • Oliver Adams, Matthew Wiesner, Shinji Watanabe, David Yarowsky
We report on adaptation of multilingual end-to-end speech recognition models trained on as many as 100 languages.
no code implementations • 10 Dec 2018 • Matthew Wiesner, Adithya Renduchintala, Shinji Watanabe, Chunxi Liu, Najim Dehak, Sanjeev Khudanpur
Using transcribed speech from nearby languages gives a further 20-30% relative reduction in character error rate.
no code implementations • 7 Nov 2018 • Martin Karafiát, Murali Karthick Baskar, Shinji Watanabe, Takaaki Hori, Matthew Wiesner, Jan "Honza" Černocký
This paper investigates the applications of various multilingual approaches developed in conventional hidden Markov model (HMM) systems to sequence-to-sequence (seq2seq) automatic speech recognition (ASR).
Automatic Speech Recognition (ASR)
no code implementations • 4 Oct 2018 • Jaejin Cho, Murali Karthick Baskar, Ruizhi Li, Matthew Wiesner, Sri Harish Mallidi, Nelson Yalta, Martin Karafiat, Shinji Watanabe, Takaaki Hori
In this work, we attempt to use data from 10 BABEL languages to build a multilingual seq2seq model as a prior model, and then port it to 4 other BABEL languages using a transfer learning approach.
Language Modelling
Sequence-To-Sequence Speech Recognition
no code implementations • 17 Jul 2018 • Chunxi Liu, Matthew Wiesner, Shinji Watanabe, Craig Harman, Jan Trmal, Najim Dehak, Sanjeev Khudanpur
In topic identification (topic ID) on real-world unstructured audio, an audio instance of variable topic shifts is first broken into sequential segments, and each segment is independently classified.
no code implementations • 30 Mar 2018 • Shinji Watanabe, Takaaki Hori, Shigeki Karita, Tomoki Hayashi, Jiro Nishitoba, Yuya Unno, Nelson Enrique Yalta Soplin, Jahn Heymann, Matthew Wiesner, Nanxin Chen, Adithya Renduchintala, Tsubasa Ochiai
This paper introduces a new open source platform for end-to-end speech processing named ESPnet.
Automatic Speech Recognition (ASR)
no code implementations • 27 Mar 2018 • Adithya Renduchintala, Shuoyang Ding, Matthew Wiesner, Shinji Watanabe
We present a new end-to-end architecture for automatic speech recognition (ASR) that can be trained using symbolic input in addition to the traditional acoustic input.
Automatic Speech Recognition (ASR)
no code implementations • 23 Feb 2018 • Matthew Wiesner, Chunxi Liu, Lucas Ondel, Craig Harman, Vimal Manohar, Jan Trmal, Zhongqiang Huang, Najim Dehak, Sanjeev Khudanpur
Automatic speech recognition (ASR) systems often need to be developed for extremely low-resource languages to serve end-uses such as audio content categorization and search.
Automatic Speech Recognition (ASR)
no code implementations • 22 Mar 2017 • Chunxi Liu, Jan Trmal, Matthew Wiesner, Craig Harman, Sanjeev Khudanpur
Modern topic identification (topic ID) systems for speech use automatic speech recognition (ASR) to produce speech transcripts, and perform supervised classification on such ASR outputs.
Automatic Speech Recognition (ASR)