Search Results for author: Shane Settle

Found 9 papers, 6 papers with code

Discriminative Acoustic Word Embeddings: Recurrent Neural Network-Based Approaches

no code implementations • 8 Nov 2016 • Shane Settle, Karen Livescu

Acoustic word embeddings (fixed-dimensional vector representations of variable-length spoken word segments) have begun to be considered for tasks such as speech recognition and query-by-example search.

Dynamic Time Warping, General Classification, +3

Visually grounded learning of keyword prediction from untranscribed speech

1 code implementation • 23 Mar 2017 • Herman Kamper, Shane Settle, Gregory Shakhnarovich, Karen Livescu

In this setting of images paired with untranscribed spoken captions, we consider whether computer vision systems can be used to obtain textual labels for the speech.

Language Acquisition, TAG

Query-by-Example Search with Discriminative Neural Acoustic Word Embeddings

1 code implementation • 12 Jun 2017 • Shane Settle, Keith Levin, Herman Kamper, Karen Livescu

Query-by-example search often uses dynamic time warping (DTW) for comparing queries and proposed matching segments.

Dynamic Time Warping, Word Embeddings

Acoustically Grounded Word Embeddings for Improved Acoustics-to-Word Speech Recognition

no code implementations • 29 Mar 2019 • Shane Settle, Kartik Audhkhasi, Karen Livescu, Michael Picheny

Direct acoustics-to-word (A2W) systems for end-to-end automatic speech recognition are simpler to train, and more efficient to decode with, than sub-word systems.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +2

Multilingual Jointly Trained Acoustic and Written Word Embeddings

1 code implementation • 24 Jun 2020 • Yushi Hu, Shane Settle, Karen Livescu

The pre-trained models can then be used for unseen zero-resource languages, or fine-tuned on data from low-resource languages.

Dynamic Time Warping, Retrieval, +1

Whole-Word Segmental Speech Recognition with Acoustic Word Embeddings

1 code implementation • 1 Jul 2020 • Bowen Shi, Shane Settle, Karen Livescu

We find that word error rate can be reduced by a large margin by pre-training the acoustic segment representation with AWEs, and additional (smaller) gains can be obtained by pre-training the word prediction layer with AGWEs.

speech-recognition, Speech Recognition, +1

Acoustic span embeddings for multilingual query-by-example search

1 code implementation • 24 Nov 2020 • Yushi Hu, Shane Settle, Karen Livescu

In this work, we generalize AWE training to spans of words, producing acoustic span embeddings (ASE), and explore the application of ASE to QbE with arbitrary-length queries in multiple unseen languages.

Dynamic Time Warping, Word Embeddings

What Do Self-Supervised Speech Models Know About Words?

1 code implementation • 30 Jun 2023 • Ankita Pasad, Chung-Ming Chien, Shane Settle, Karen Livescu

Many self-supervised speech models (S3Ms) have been introduced over the last few years, improving performance and data efficiency on various speech tasks.

Sentence, Sentence Similarity, +1

Neural approaches to spoken content embedding

no code implementations • 28 Aug 2023 • Shane Settle

As an alternative, acoustic word embeddings (fixed-dimensional vector representations of variable-length spoken word segments) have begun to be considered for such tasks as well.

Automatic Speech Recognition, Dynamic Time Warping, +3
