Search Results for author: Sameer Bansal

Found 8 papers, 1 paper with code

Pre-training on high-resource speech recognition improves low-resource speech-to-text translation

1 code implementation • NAACL 2019 • Sameer Bansal, Herman Kamper, Karen Livescu, Adam Lopez, Sharon Goldwater

Finally, we show that the approach improves performance on a true low-resource task: pre-training on a combination of English ASR and French ASR improves Mboshi-French ST, where only 4 hours of data are available, from 3.5 to 7.1

Automatic Speech Recognition (ASR)

Low-Resource Speech-to-Text Translation

no code implementations • 24 Mar 2018 • Sameer Bansal, Herman Kamper, Karen Livescu, Adam Lopez, Sharon Goldwater

We explore models trained on between 20 and 160 hours of data, and find that although models trained on less data have considerably lower BLEU scores, they can still predict words with relatively high precision and recall: around 50% for a model trained on 50 hours of data, versus around 60% for the full 160-hour model.

Machine Translation, Speech Recognition

Towards speech-to-text translation without speech recognition

no code implementations • EACL 2017 • Sameer Bansal, Herman Kamper, Adam Lopez, Sharon Goldwater

We explore the problem of translating speech to text in low-resource scenarios where neither automatic speech recognition (ASR) nor machine translation (MT) is available, but we have training data in the form of audio paired with text translations.

Automatic Speech Recognition (ASR)

Weakly supervised spoken term discovery using cross-lingual side information

no code implementations • 21 Sep 2016 • Sameer Bansal, Herman Kamper, Sharon Goldwater, Adam Lopez

Recent work on unsupervised term discovery (UTD) aims to identify and cluster repeated word-like units from audio alone.

Spoken Term Discovery for Language Documentation using Translations

no code implementations • WS 2017 • Antonios Anastasopoulos, Sameer Bansal, David Chiang, Sharon Goldwater, Adam Lopez

Vast amounts of speech data collected for language documentation and research remain untranscribed and unsearchable, but often a small amount of speech may have text translations available.

Translation

Cross-lingual topic prediction for speech using translations

no code implementations • 29 Aug 2019 • Sameer Bansal, Herman Kamper, Adam Lopez, Sharon Goldwater

Given a large amount of unannotated speech in a low-resource language, can we classify the speech utterances by topic?

Humanitarian, Speech-to-Text Translation

Analyzing ASR pretraining for low-resource speech-to-text translation

no code implementations • 23 Oct 2019 • Mihaela C. Stoian, Sameer Bansal, Sharon Goldwater

Previous work has shown that for low-resource source languages, automatic speech-to-text translation (AST) can be improved by pretraining an end-to-end model on automatic speech recognition (ASR) data from a high-resource language.

Automatic Speech Recognition (ASR)

Comparing Euclidean and Hyperbolic Embeddings on the WordNet Nouns Hypernymy Graph

no code implementations • EMNLP (insights) 2021 • Sameer Bansal, Adrian Benton

Nickel and Kiela (2017) present a new method for embedding tree nodes in the Poincaré ball, and suggest that these hyperbolic embeddings are far more effective than Euclidean embeddings at embedding nodes in large, hierarchically structured graphs like the WordNet nouns hypernymy tree.
