no code implementations • NAACL 2019 • Pooja Chitkara, Ashutosh Modi, Pravalika Avvaru, Sepehr Janghorbani, Mubbasir Kapadia
Additionally, in contrast to offline processing of dialog, we also analyze the performance of our model in a more realistic online setting, i.e., one where the topic is identified in real time as the dialog progresses.
no code implementations • NAACL 2021 • Deepak Muralidharan, Joel Ruben Antony Moniz, Sida Gao, Xiao Yang, Justine Kao, Stephen Pulman, Atish Kothari, Ray Shen, Yinying Pan, Vivek Kaul, Mubarak Seyed Ibrahim, Gang Xiang, Nan Dun, Yidan Zhou, Andy O, Yuan Zhang, Pooja Chitkara, Xuan Wang, Alkesh Patel, Kushal Tayal, Roger Zheng, Peter Grasch, Jason D. Williams, Lin Li
Named Entity Recognition (NER) and Entity Linking (EL) play an essential role in voice assistant interaction, but are challenging due to the special difficulties associated with spoken user queries.
no code implementations • EMNLP (NLP4ConvAI) 2021 • Sahas Dendukuri, Pooja Chitkara, Joel Ruben Antony Moniz, Xiao Yang, Manos Tsagkias, Stephen Pulman
Entity tags in human-machine dialog are integral to natural language understanding (NLU) tasks in conversational assistants.
no code implementations • 7 Oct 2021 • Jialu Li, Vimal Manohar, Pooja Chitkara, Andros Tjandra, Michael Picheny, Frank Zhang, Xiaohui Zhang, Yatharth Saraf
Domain-adversarial training (DAT) and multi-task learning (MTL) are two common approaches for building accent-robust ASR models.
no code implementations • 18 Nov 2021 • Chunxi Liu, Michael Picheny, Leda Sari, Pooja Chitkara, Alex Xiao, Xiaohui Zhang, Mark Chou, Andres Alvarado, Caner Hazirbas, Yatharth Saraf
This paper presents initial speech recognition results on "Casual Conversations", a publicly released 846-hour corpus designed to help researchers evaluate their computer vision and audio models for accuracy across a diverse set of metadata, including age, gender, and skin tone.
no code implementations • 20 Jul 2022 • Laxmi Pandey, Debjyoti Paul, Pooja Chitkara, Yutong Pang, Xuedong Zhang, Kjell Schubert, Mark Chou, Shu Liu, Yatharth Saraf
Inverse text normalization (ITN) is used to convert the spoken form output of an automatic speech recognition (ASR) system to a written form.
no code implementations • 22 Dec 2022 • Pooja Chitkara, Morgane Riviere, Jade Copet, Frank Zhang, Yatharth Saraf
Speech-to-text models tend to be trained and evaluated against a single target accent.
no code implementations • 1 Mar 2023 • Philipp Klumpp, Pooja Chitkara, Leda Sari, Prashant Serai, JiLong Wu, Irina-Elena Veliche, Rongqing Huang, Qing He
In this work, we improve an accent-conversion model (ACM) which transforms native US-English speech into accented pronunciation.