no code implementations • 11 Feb 2021 • Karel Mundnich, Alexandra Fenster, Aparna Khare, Shiva Sundaram
To better study the task of highlight detection, we run a pilot experiment with highlight annotations for a small subset of video clips and fine-tune our best model on them.
no code implementations • 20 Nov 2020 • Aparna Khare, Srinivas Parthasarathy, Shiva Sundaram
Self-supervised learning has shown improvements on tasks with limited labeled datasets in domains like speech and natural language.
no code implementations • 10 Sep 2020 • Aparna Khare, Srinivas Parthasarathy, Shiva Sundaram
General embeddings like word2vec, GloVe, and ELMo have been highly successful in natural language tasks.
no code implementations • ACL 2020 • Georgios Paraskevopoulos, Srinivas Parthasarathy, Aparna Khare, Shiva Sundaram
We particularly focus on the scene context provided by the visual information to ground the ASR.
no code implementations • 29 Apr 2020 • Georgios Paraskevopoulos, Srinivas Parthasarathy, Aparna Khare, Shiva Sundaram
We particularly focus on the scene context provided by the visual information to ground the ASR.
no code implementations • 6 Feb 2020 • Taejin Park, Kenichi Kumatani, Minhua Wu, Shiva Sundaram
In this paper, we further develop this idea and use a frequency-aligned network for robust multi-channel automatic speech recognition (ASR).
no code implementations • 1 Feb 2020 • Sanna Wager, Aparna Khare, Minhua Wu, Kenichi Kumatani, Shiva Sundaram
Using a large offline teacher model trained on beamformed audio, we trained a simpler multi-channel student acoustic model used in the speech recognition system.
no code implementations • 5 Jan 2019 • Ladislav Mošner, Minhua Wu, Anirudh Raju, Sree Hari Krishnan Parthasarathi, Kenichi Kumatani, Shiva Sundaram, Roland Maas, Björn Hoffmeister
For real-world speech recognition applications, noise robustness is still a challenge.