Search Results for author: Ankita Pasad

Found 12 papers, 6 papers with code

On the Evaluation of Speech Foundation Models for Spoken Language Understanding

no code implementations 14 Jun 2024 Siddhant Arora, Ankita Pasad, Chung-Ming Chien, Jionghao Han, Roshan Sharma, Jee-weon Jung, Hira Dhamyal, William Chen, Suwon Shon, Hung-Yi Lee, Karen Livescu, Shinji Watanabe

To answer this, we perform an extensive evaluation of multiple supervised and self-supervised SFMs using several evaluation protocols: (i) frozen SFMs with a lightweight prediction head, (ii) frozen SFMs with a complex prediction head, and (iii) fine-tuned SFMs with a lightweight prediction head.

Benchmarking speech-recognition +2

What Do Self-Supervised Speech Models Know About Words?

1 code implementation 30 Jun 2023 Ankita Pasad, Chung-Ming Chien, Shane Settle, Karen Livescu

Many self-supervised speech models (S3Ms) have been introduced over the last few years, improving performance and data efficiency on various speech tasks.

Sentence Sentence Similarity +1

SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding Tasks

no code implementations 20 Dec 2022 Suwon Shon, Siddhant Arora, Chyi-Jiunn Lin, Ankita Pasad, Felix Wu, Roshan Sharma, Wei-Lun Wu, Hung-Yi Lee, Karen Livescu, Shinji Watanabe

In this work, we introduce several new annotated SLU benchmark tasks based on freely available speech data, which complement existing benchmarks and address gaps in the SLU evaluation landscape.

Dialog Act Classification Question Answering +4

Comparative layer-wise analysis of self-supervised speech models

1 code implementation 8 Nov 2022 Ankita Pasad, Bowen Shi, Karen Livescu

We further investigate the utility of our analyses for downstream tasks by comparing the property trends with performance on speech recognition and spoken language understanding tasks.

speech-recognition Speech Recognition +1

On the Use of External Data for Spoken Named Entity Recognition

1 code implementation NAACL 2022 Ankita Pasad, Felix Wu, Suwon Shon, Karen Livescu, Kyu J. Han

In this work we focus on low-resource spoken named entity recognition (NER) and address the question: Beyond self-supervised pre-training, how can we use external speech and/or text data that are not annotated for the task?

Knowledge Distillation named-entity-recognition +6

Layer-wise Analysis of a Self-supervised Speech Representation Model

1 code implementation 10 Jul 2021 Ankita Pasad, Ju-chieh Chou, Karen Livescu

Recently proposed self-supervised learning approaches have been successful for pre-training speech representation models.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Taskology: Utilizing Task Relations at Scale

no code implementations CVPR 2021 Yao Lu, Sören Pirk, Jan Dlabal, Anthony Brohan, Ankita Pasad, Zhao Chen, Vincent Casser, Anelia Angelova, Ariel Gordon

Many computer vision tasks address the problem of scene understanding and are naturally interrelated, e.g., object classification, detection, scene segmentation, depth estimation, etc.

Depth Estimation Motion Estimation +4

Improving Semantic Segmentation through Spatio-Temporal Consistency Learned from Videos

no code implementations 11 Apr 2020 Ankita Pasad, Ariel Gordon, Tsung-Yi Lin, Anelia Angelova

We leverage unsupervised learning of depth, egomotion, and camera intrinsics to improve the performance of single-image semantic segmentation, by enforcing 3D-geometric and temporal consistency of segmentation masks across video frames.

Segmentation Semantic Segmentation

On the Contributions of Visual and Textual Supervision in Low-Resource Semantic Speech Retrieval

no code implementations 24 Apr 2019 Ankita Pasad, Bowen Shi, Herman Kamper, Karen Livescu

Recent work has shown that speech paired with images can be used to learn semantically meaningful speech representations even without any textual supervision.

Retrieval Visual Grounding
