Search Results for author: Shane Settle

Found 7 papers, 5 papers with code

Acoustic span embeddings for multilingual query-by-example search

1 code implementation • 24 Nov 2020 • Yushi Hu, Shane Settle, Karen Livescu

In this work, we generalize acoustic word embedding (AWE) training to spans of words, producing acoustic span embeddings (ASE), and explore the application of ASE to query-by-example (QbE) search with arbitrary-length queries in multiple unseen languages.

Dynamic Time Warping, Word Embeddings

Whole-Word Segmental Speech Recognition with Acoustic Word Embeddings

1 code implementation • 1 Jul 2020 • Bowen Shi, Shane Settle, Karen Livescu

We find that word error rate can be reduced by a large margin by pre-training the acoustic segment representation with acoustic word embeddings (AWEs), and additional (smaller) gains can be obtained by pre-training the word prediction layer with acoustically grounded word embeddings (AGWEs).

Speech Recognition, Word Embeddings

Multilingual Jointly Trained Acoustic and Written Word Embeddings

1 code implementation • 24 Jun 2020 • Yushi Hu, Shane Settle, Karen Livescu

The pre-trained models can then be used for unseen zero-resource languages, or fine-tuned on data from low-resource languages.

Dynamic Time Warping, Word Embeddings

Acoustically Grounded Word Embeddings for Improved Acoustics-to-Word Speech Recognition

no code implementations • 29 Mar 2019 • Shane Settle, Kartik Audhkhasi, Karen Livescu, Michael Picheny

Direct acoustics-to-word (A2W) systems for end-to-end automatic speech recognition are simpler to train, and more efficient to decode with, than sub-word systems.

Automatic Speech Recognition, Word Embeddings

Query-by-Example Search with Discriminative Neural Acoustic Word Embeddings

1 code implementation • 12 Jun 2017 • Shane Settle, Keith Levin, Herman Kamper, Karen Livescu

Query-by-example search often uses dynamic time warping (DTW) for comparing queries and proposed matching segments.

Dynamic Time Warping, Word Embeddings

Visually grounded learning of keyword prediction from untranscribed speech

1 code implementation • 23 Mar 2017 • Herman Kamper, Shane Settle, Gregory Shakhnarovich, Karen Livescu

In this setting of images paired with untranscribed spoken captions, we consider whether computer vision systems can be used to obtain textual labels for the speech.

Language Acquisition, TAG

Discriminative Acoustic Word Embeddings: Recurrent Neural Network-Based Approaches

no code implementations • 8 Nov 2016 • Shane Settle, Karen Livescu

Acoustic word embeddings (fixed-dimensional vector representations of variable-length spoken word segments) have begun to be considered for tasks such as speech recognition and query-by-example search.

Dynamic Time Warping, General Classification, +2 more
