1 code implementation • 28 Feb 2025 • Grigor Nalbandyan, Rima Shahbazyan, Evelina Bakhturina
Typical evaluations of Large Language Models (LLMs) report a single metric per dataset, often representing the model's best-case performance under carefully selected settings.
no code implementations • 4 Oct 2023 • Peng Xu, Wei Ping, Xianchao Wu, Lawrence McAfee, Chen Zhu, Zihan Liu, Sandeep Subramanian, Evelina Bakhturina, Mohammad Shoeybi, Bryan Catanzaro
Perhaps surprisingly, we find that an LLM with a 4K context window using simple retrieval augmentation at generation can achieve performance comparable to a finetuned LLM with a 16K context window via positional interpolation on long-context tasks, while requiring much less computation.
no code implementations • 4 Oct 2023 • Aleksandr Meister, Matvei Novikov, Nikolay Karpov, Evelina Bakhturina, Vitaly Lavrukhin, Boris Ginsburg
Traditional automatic speech recognition (ASR) models output lower-cased words without punctuation marks, which reduces readability and necessitates a subsequent text processing model to convert ASR transcripts into a proper format.
Automatic Speech Recognition (ASR)
no code implementations • 23 Sep 2023 • Yang Zhang, Travis M. Bartley, Mariana Graterol-Fuenmayor, Vitaly Lavrukhin, Evelina Bakhturina, Boris Ginsburg
Through this new framework, we can identify strengths and weaknesses of GPT-based TN, opening opportunities for future work.
1 code implementation • NeurIPS 2023 • Sungwon Kim, Kevin J. Shih, Rohan Badlani, Joao Felipe Santos, Evelina Bakhturina, Mikyas T. Desta, Rafael Valle, Sungroh Yoon, Bryan Catanzaro
P-Flow comprises a speech-prompted text encoder for speaker adaptation and a flow matching generative decoder for high-quality and fast speech synthesis.
1 code implementation • 4 Jun 2023 • Alexandra Antonova, Evelina Bakhturina, Boris Ginsburg
Contextual spelling correction models are an alternative to shallow fusion to improve automatic speech recognition (ASR) quality given user vocabulary.
Automatic Speech Recognition (ASR)
no code implementations • 28 Feb 2023 • Jocelyn Huang, Evelina Bakhturina, Oktai Tatanov
Grapheme-to-phoneme (G2P) transduction is part of the standard text-to-speech (TTS) pipeline.
no code implementations • 29 Jul 2022 • Alexandra Antonova, Evelina Bakhturina, Boris Ginsburg
The model is trained on the Google Text Normalization dataset and achieves state-of-the-art sentence accuracy on both English and Russian test sets.
Automatic Speech Recognition (ASR)
2 code implementations • 29 Mar 2022 • Evelina Bakhturina, Yang Zhang, Boris Ginsburg
First, a non-deterministic WFST outputs all normalization candidates, and then a neural language model picks the best one -- similar to shallow fusion for automatic speech recognition.
Automatic Speech Recognition (ASR)
no code implementations • 23 Aug 2021 • Tuan Manh Lai, Yang Zhang, Evelina Bakhturina, Boris Ginsburg, Heng Ji
In addition, we also create a cleaned dataset from the Spoken Wikipedia Corpora for German and report the performance of our systems on the dataset.
Automatic Speech Recognition (ASR)
no code implementations • 17 May 2021 • Yang Zhang, Vahid Noroozi, Evelina Bakhturina, Boris Ginsburg
In this paper, we propose SGD-QA, a simple and extensible model for schema-guided dialogue state tracking based on a question answering approach.
1 code implementation • 11 Apr 2021 • Evelina Bakhturina, Vitaly Lavrukhin, Boris Ginsburg
Automatic Speech Recognition and Text-to-Speech systems are primarily trained in a supervised fashion and require high-quality, accurately labeled speech datasets.
Automatic Speech Recognition (ASR)
1 code implementation • 11 Apr 2021 • Yang Zhang, Evelina Bakhturina, Kyle Gorman, Boris Ginsburg
Inverse text normalization (ITN) converts spoken-domain automatic speech recognition (ASR) output into written-domain text to improve the readability of the ASR output.
Automatic Speech Recognition (ASR)
no code implementations • 3 Apr 2021 • Evelina Bakhturina, Vitaly Lavrukhin, Boris Ginsburg, Yang Zhang
This paper introduces a new multi-speaker English dataset for training text-to-speech models.
1 code implementation • EMNLP 2020 • Hoo-chang Shin, Yang Zhang, Evelina Bakhturina, Raul Puri, Mostofa Patwary, Mohammad Shoeybi, Raghav Mani
There has been an influx of biomedical domain-specific language models, showing that language models pre-trained on biomedical text perform better on biomedical domain benchmarks than those trained on general-domain text corpora such as Wikipedia and Books.
Ranked #1 on Named Entity Recognition (NER) on BC5CDR-disease
1 code implementation • 27 Aug 2020 • Vahid Noroozi, Yang Zhang, Evelina Bakhturina, Tomasz Kornuta
Dialog State Tracking (DST) is one of the most crucial modules for goal-oriented dialogue systems.
no code implementations • 3 Dec 2017 • Laura Graesser, Abhinav Gupta, Lakshay Sharma, Evelina Bakhturina
In this project we analysed how much semantic information images carry, and how much value image data can add to sentiment analysis of the text associated with the images.