Search Results for author: Evelina Bakhturina

Found 16 papers, 7 papers with code

LibriSpeech-PC: Benchmark for Evaluation of Punctuation and Capitalization Capabilities of end-to-end ASR Models

no code implementations • 4 Oct 2023 • Aleksandr Meister, Matvei Novikov, Nikolay Karpov, Evelina Bakhturina, Vitaly Lavrukhin, Boris Ginsburg

Traditional automatic speech recognition (ASR) models output lower-cased words without punctuation marks, which reduces readability and necessitates a subsequent text processing model to convert ASR transcripts into a proper format.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +1

Retrieval meets Long Context Large Language Models

no code implementations • 4 Oct 2023 • Peng Xu, Wei Ping, Xianchao Wu, Lawrence McAfee, Chen Zhu, Zihan Liu, Sandeep Subramanian, Evelina Bakhturina, Mohammad Shoeybi, Bryan Catanzaro

Perhaps surprisingly, we find that LLM with 4K context window using simple retrieval-augmentation at generation can achieve comparable performance to finetuned LLM with 16K context window via positional interpolation on long context tasks, while taking much less computation.

16k • 4k • +4

A Chat About Boring Problems: Studying GPT-based text normalization

no code implementations • 23 Sep 2023 • Yang Zhang, Travis M. Bartley, Mariana Graterol-Fuenmayor, Vitaly Lavrukhin, Evelina Bakhturina, Boris Ginsburg

Through this new framework, we can identify strengths and weaknesses of GPT-based TN, opening opportunities for future work.

Prompt Engineering

P-Flow: A Fast and Data-Efficient Zero-Shot TTS through Speech Prompting

1 code implementation • NeurIPS 2023 • Sungwon Kim, Kevin J. Shih, Rohan Badlani, Joao Felipe Santos, Evelina Bakhturina, Mikyas T. Desta, Rafael Valle, Sungroh Yoon, Bryan Catanzaro

P-Flow comprises a speech-prompted text encoder for speaker adaptation and a flow matching generative decoder for high-quality and fast speech synthesis.

Speech Synthesis

163

SpellMapper: A non-autoregressive neural spellchecker for ASR customization with candidate retrieval based on n-gram mappings

1 code implementation • 4 Jun 2023 • Alexandra Antonova, Evelina Bakhturina, Boris Ginsburg

Contextual spelling correction models are an alternative to shallow fusion to improve automatic speech recognition (ASR) quality given user vocabulary.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +3

10,034

Automatic Heteronym Resolution Pipeline Using RAD-TTS Aligners

no code implementations • 28 Feb 2023 • Jocelyn Huang, Evelina Bakhturina, Oktai Tatanov

Grapheme-to-phoneme (G2P) transduction is part of the standard text-to-speech (TTS) pipeline.

Thutmose Tagger: Single-pass neural model for Inverse Text Normalization

no code implementations • 29 Jul 2022 • Alexandra Antonova, Evelina Bakhturina, Boris Ginsburg

The model is trained on the Google Text Normalization dataset and achieves state-of-the-art sentence accuracy on both English and Russian test sets.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +4

Shallow Fusion of Weighted Finite-State Transducer and Language Model for Text Normalization

1 code implementation • 29 Mar 2022 • Evelina Bakhturina, Yang Zhang, Boris Ginsburg

First, a non-deterministic WFST outputs all normalization candidates, and then a neural language model picks the best one -- similar to shallow fusion for automatic speech recognition.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +2

10,034

A Unified Transformer-based Framework for Duplex Text Normalization

no code implementations • 23 Aug 2021 • Tuan Manh Lai, Yang Zhang, Evelina Bakhturina, Boris Ginsburg, Heng Ji

In addition, we also create a cleaned dataset from the Spoken Wikipedia Corpora for German and report the performance of our systems on the dataset.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +5

SGD-QA: Fast Schema-Guided Dialogue State Tracking for Unseen Services

no code implementations • 17 May 2021 • Yang Zhang, Vahid Noroozi, Evelina Bakhturina, Boris Ginsburg

In this paper, we propose SGD-QA, a simple and extensible model for schema-guided dialogue state tracking based on a question answering approach.

Dialogue State Tracking • Goal-Oriented Dialogue Systems • +1

A Toolbox for Construction and Analysis of Speech Datasets

1 code implementation • 11 Apr 2021 • Evelina Bakhturina, Vitaly Lavrukhin, Boris Ginsburg

Automatic Speech Recognition and Text-to-Speech systems are primarily trained in a supervised fashion and require high-quality, accurately labeled speech datasets.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +1

292

NeMo Inverse Text Normalization: From Development To Production

1 code implementation • 11 Apr 2021 • Yang Zhang, Evelina Bakhturina, Kyle Gorman, Boris Ginsburg

Inverse text normalization (ITN) converts spoken-domain automatic speech recognition (ASR) output into written-domain text to improve the readability of the ASR output.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +1

5,577

Hi-Fi Multi-Speaker English TTS Dataset

no code implementations • 3 Apr 2021 • Evelina Bakhturina, Vitaly Lavrukhin, Boris Ginsburg, Yang Zhang

This paper introduces a new multi-speaker English dataset for training text-to-speech models.

BioMegatron: Larger Biomedical Domain Language Model

1 code implementation • EMNLP 2020 • Hoo-chang Shin, Yang Zhang, Evelina Bakhturina, Raul Puri, Mostofa Patwary, Mohammad Shoeybi, Raghav Mani

There has been an influx of biomedical domain-specific language models, showing language models pre-trained on biomedical text perform better on biomedical domain benchmarks than those trained on general domain text corpora such as Wikipedia and Books.

Ranked #1 on Named Entity Recognition (NER) on BC5CDR-disease

Language Modelling • named-entity-recognition • +4

10,034

A Fast and Robust BERT-based Dialogue State Tracker for Schema-Guided Dialogue Dataset

1 code implementation • 27 Aug 2020 • Vahid Noroozi, Yang Zhang, Evelina Bakhturina, Tomasz Kornuta

Dialog State Tracking (DST) is one of the most crucial modules for goal-oriented dialogue systems.

Data Augmentation • dialog state tracking • +2

10,034

Sentiment Classification using Images and Label Embeddings

no code implementations • 3 Dec 2017 • Laura Graesser, Abhinav Gupta, Lakshay Sharma, Evelina Bakhturina

In this project we analysed how much semantic information images carry, and how much value image data can add to sentiment analysis of the text associated with the images.

Classification • General Classification • +2
