Search Results for author: Matthew Wiesner

Found 22 papers, 4 papers with code

FINDINGS OF THE IWSLT 2021 EVALUATION CAMPAIGN

no code implementations • ACL (IWSLT) 2021 • Antonios Anastasopoulos, Ondřej Bojar, Jacob Bremerman, Roldano Cattoni, Maha Elbayad, Marcello Federico, Xutai Ma, Satoshi Nakamura, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Alexander Waibel, Changhan Wang, Matthew Wiesner

The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2021) featured four shared tasks this year: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Multilingual speech translation, (iv) Low-resource speech translation.

Translation

JHU IWSLT 2022 Dialect Speech Translation System Description

no code implementations • IWSLT (ACL) 2022 • Jinyi Yang, Amir Hussein, Matthew Wiesner, Sanjeev Khudanpur

This paper details the Johns Hopkins speech translation (ST) system used in the IWSLT 2022 dialect speech translation task.

Automatic Speech Recognition (ASR) +3

On Speaker Attribution with SURT

1 code implementation • 28 Jan 2024 • Desh Raj, Matthew Wiesner, Matthew Maciejewski, Leibny Paola Garcia-Perera, Daniel Povey, Sanjeev Khudanpur

The Streaming Unmixing and Recognition Transducer (SURT) has recently become a popular framework for continuous, streaming, multi-talker speech recognition (ASR).

Speech Recognition

Speech collage: code-switched audio generation by collaging monolingual corpora

1 code implementation • 27 Sep 2023 • Amir Hussein, Dorsa Zeinali, Ondřej Klejch, Matthew Wiesner, Brian Yan, Shammur Chowdhury, Ahmed Ali, Shinji Watanabe, Sanjeev Khudanpur

Designing effective automatic speech recognition (ASR) systems for Code-Switching (CS) often depends on the availability of the transcribed CS resources.

Audio Generation Automatic Speech Recognition +2

The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios

no code implementations • 23 Jun 2023 • Samuele Cornell, Matthew Wiesner, Shinji Watanabe, Desh Raj, Xuankai Chang, Paola Garcia, Matthew Maciejewski, Yoshiki Masuyama, Zhong-Qiu Wang, Stefano Squartini, Sanjeev Khudanpur

The CHiME challenges have played a significant role in the development and evaluation of robust automatic speech recognition (ASR) systems.

Automatic Speech Recognition (ASR) +2

HK-LegiCoST: Leveraging Non-Verbatim Transcripts for Speech Translation

1 code implementation • 20 Jun 2023 • Cihan Xiao, Henry Li Xinyuan, Jinyi Yang, Dongji Gao, Matthew Wiesner, Kevin Duh, Sanjeev Khudanpur

We introduce HK-LegiCoST, a new three-way parallel corpus of Cantonese-English translations, containing 600+ hours of Cantonese audio, its standard traditional Chinese transcript, and English translation, segmented and aligned at the sentence level.

Cross-corpus Sentence +3

Bypass Temporal Classification: Weakly Supervised Automatic Speech Recognition with Imperfect Transcripts

no code implementations • 1 Jun 2023 • Dongji Gao, Matthew Wiesner, Hainan Xu, Leibny Paola Garcia, Daniel Povey, Sanjeev Khudanpur

Imperfectly transcribed speech is a prevalent issue in human-annotated speech corpora, which degrades the performance of ASR models.

Automatic Speech Recognition (ASR) +1

Towards Zero-Shot Code-Switched Speech Recognition

no code implementations • 2 Nov 2022 • Brian Yan, Matthew Wiesner, Ondřej Klejch, Preethi Jyothi, Shinji Watanabe

In this work, we seek to build effective code-switched (CS) automatic speech recognition (ASR) systems under the zero-shot setting where no transcribed CS speech data is available for training.

Automatic Speech Recognition (ASR) +3

Injecting Text and Cross-lingual Supervision in Few-shot Learning from Self-Supervised Models

no code implementations • 10 Oct 2021 • Matthew Wiesner, Desh Raj, Sanjeev Khudanpur

Self-supervised model pre-training has recently garnered significant interest, but relatively few efforts have explored using additional resources in fine-tuning these models.

Few-Shot Learning

End-to-end ASR to jointly predict transcriptions and linguistic annotations

no code implementations • NAACL 2021 • Motoi Omachi, Yuya Fujita, Shinji Watanabe, Matthew Wiesner

We propose a Transformer-based sequence-to-sequence model for automatic speech recognition (ASR) capable of simultaneously transcribing and annotating audio with linguistic information such as phonemic transcripts or part-of-speech (POS) tags.

Automatic Speech Recognition (ASR) +5

The Multilingual TEDx Corpus for Speech Recognition and Translation

no code implementations • 2 Feb 2021 • Elizabeth Salesky, Matthew Wiesner, Jacob Bremerman, Roldano Cattoni, Matteo Negri, Marco Turchi, Douglas W. Oard, Matt Post

We present the Multilingual TEDx corpus, built to support speech recognition (ASR) and speech translation (ST) research across many non-English source languages.

Speech Recognition +1

A Corpus for Large-Scale Phonetic Typology

no code implementations • ACL 2020 • Elizabeth Salesky, Eleanor Chodroff, Tiago Pimentel, Matthew Wiesner, Ryan Cotterell, Alan W. Black, Jason Eisner

A major hurdle in data-driven research on typology is having sufficient data in many languages to draw meaningful conclusions.

Induced Inflection-Set Keyword Search in Speech

1 code implementation • WS 2020 • Oliver Adams, Matthew Wiesner, Jan Trmal, Garrett Nicolai, David Yarowsky

We investigate the problem of searching for a lexeme-set in speech by searching for its inflectional variants.

Massively Multilingual Adversarial Speech Recognition

no code implementations • NAACL 2019 • Oliver Adams, Matthew Wiesner, Shinji Watanabe, David Yarowsky

We report on adaptation of multilingual end-to-end speech recognition models trained on as many as 100 languages.

General Classification Speech Recognition +1

Pretraining by Backtranslation for End-to-end ASR in Low-Resource Settings

no code implementations • 10 Dec 2018 • Matthew Wiesner, Adithya Renduchintala, Shinji Watanabe, Chunxi Liu, Najim Dehak, Sanjeev Khudanpur

Using transcribed speech from nearby languages gives a further 20-30% relative reduction in character error rate.

Data Augmentation

Analysis of Multilingual Sequence-to-Sequence speech recognition systems

no code implementations • 7 Nov 2018 • Martin Karafiát, Murali Karthick Baskar, Shinji Watanabe, Takaaki Hori, Matthew Wiesner, Jan "Honza" Černocký

This paper investigates the applications of various multilingual approaches developed in conventional hidden Markov model (HMM) systems to sequence-to-sequence (seq2seq) automatic speech recognition (ASR).

Automatic Speech Recognition (ASR) +3

Multilingual sequence-to-sequence speech recognition: architecture, transfer learning, and language modeling

no code implementations • 4 Oct 2018 • Jaejin Cho, Murali Karthick Baskar, Ruizhi Li, Matthew Wiesner, Sri Harish Mallidi, Nelson Yalta, Martin Karafiát, Shinji Watanabe, Takaaki Hori

In this work, we attempt to use data from 10 BABEL languages to build a multilingual seq2seq model as a prior model, and then port it to 4 other BABEL languages using a transfer learning approach.

Language Modelling Sequence-To-Sequence Speech Recognition +2

Low-Resource Contextual Topic Identification on Speech

no code implementations • 17 Jul 2018 • Chunxi Liu, Matthew Wiesner, Shinji Watanabe, Craig Harman, Jan Trmal, Najim Dehak, Sanjeev Khudanpur

In topic identification (topic ID) on real-world unstructured audio, an audio instance of variable topic shifts is first broken into sequential segments, and each segment is independently classified.

General Classification Topic Classification +1

ESPnet: End-to-End Speech Processing Toolkit

no code implementations • 30 Mar 2018 • Shinji Watanabe, Takaaki Hori, Shigeki Karita, Tomoki Hayashi, Jiro Nishitoba, Yuya Unno, Nelson Enrique Yalta Soplin, Jahn Heymann, Matthew Wiesner, Nanxin Chen, Adithya Renduchintala, Tsubasa Ochiai

This paper introduces a new open source platform for end-to-end speech processing named ESPnet.

Automatic Speech Recognition (ASR) +1

Multi-Modal Data Augmentation for End-to-End ASR

no code implementations • 27 Mar 2018 • Adithya Renduchintala, Shuoyang Ding, Matthew Wiesner, Shinji Watanabe

We present a new end-to-end architecture for automatic speech recognition (ASR) that can be trained using symbolic input in addition to the traditional acoustic input.

Automatic Speech Recognition (ASR) +3

Automatic Speech Recognition and Topic Identification for Almost-Zero-Resource Languages

no code implementations • 23 Feb 2018 • Matthew Wiesner, Chunxi Liu, Lucas Ondel, Craig Harman, Vimal Manohar, Jan Trmal, Zhongqiang Huang, Najim Dehak, Sanjeev Khudanpur

Automatic speech recognition (ASR) systems often need to be developed for extremely low-resource languages to serve end-uses such as audio content categorization and search.

Automatic Speech Recognition (ASR) +2

Topic Identification for Speech without ASR

no code implementations • 22 Mar 2017 • Chunxi Liu, Jan Trmal, Matthew Wiesner, Craig Harman, Sanjeev Khudanpur

Modern topic identification (topic ID) systems for speech use automatic speech recognition (ASR) to produce speech transcripts, and perform supervised classification on such ASR outputs.

Automatic Speech Recognition (ASR) +3
