no code implementations • 8 Nov 2023 • Karan Singla, Shahab Jalalvand, Yeon-Jun Kim, Antonio Moreno Daniel, Srinivas Bangalore, Andrej Ljolje, Ben Stern
Recent studies have made some progress in refining end-to-end (E2E) speech recognition encoders by applying Connectionist Temporal Classification (CTC) loss to enhance named entity recognition within transcriptions.
no code implementations • 16 Feb 2023 • Karan Singla, Yeon-Jun Kim, Srinivas Bangalore
In human-computer conversations, extracting entities such as names, street addresses and email addresses from speech is a challenging task.
no code implementations • 20 Apr 2022 • Karan Singla, Daniel Pressel, Ryan Price, Bhargav Srinivas Chinnari, Yeon-Jun Kim, Srinivas Bangalore
In this paper, we propose a novel architecture for multi-modal speech and text input.
no code implementations • 29 Mar 2022 • Karan Singla, Shahab Jalalvand, Yeon-Jun Kim, Ryan Price, Daniel Pressel, Srinivas Bangalore
Person name capture from human speech is a difficult task in human-machine conversations.
no code implementations • NAACL 2021 • Ryan Price, Mahnoosh Mehrabani, Narendra Gupta, Yeon-Jun Kim, Shahab Jalalvand, Minhua Chen, Yanjie Zhao, Srinivas Bangalore
Spoken language understanding (SLU) extracts the intended meaning from a user utterance and is a critical component of conversational virtual agents.
no code implementations • LREC 2012 • Alistair Conkie, Thomas Okken, Yeon-Jun Kim, Giuseppe Di Fabbrizio
The AT&T VoiceBuilder provides a new tool to researchers and practitioners who want to have their voices synthesized by a high-quality commercial-grade text-to-speech system without the need to install, configure, or manage speech processing software and equipment. It is implemented as a web service on the AT&T Speech Mashup Portal. The system records and validates users' utterances, processes them to build a synthetic voice and provides a web service API to make the voice available to real-time applications through a scalable cloud-based processing platform.