no code implementations • 19 Sep 2023 • Sri Harsha Dumpala, Chandramouli Sastry, Sageev Oore
In this paper, we study the application of Test-Time Training (TTT) as a solution to handling distribution shifts in speech applications.
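The core idea of Test-Time Training is to adapt shared model parameters on each (possibly distribution-shifted) test input by taking gradient steps on a self-supervised auxiliary loss. A minimal sketch of that loop, using a toy linear feature extractor and a reconstruction objective that are purely illustrative (not the paper's model or loss):

```python
import numpy as np

# Toy sketch of Test-Time Training (TTT): adapt shared weights on one
# test sample via a self-supervised auxiliary loss. The linear model
# and reconstruction objective here are illustrative assumptions only.

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4)) * 0.1          # shared feature extractor (toy)

def aux_loss(W, x):
    """Self-supervised reconstruction loss ||W^T W x - x||^2."""
    r = W.T @ (W @ x) - x
    return float(r @ r)

def ttt_adapt(W, x, lr=0.01, steps=20):
    """Take a few gradient steps on the auxiliary loss for one input."""
    W = W.copy()
    for _ in range(steps):
        z = W @ x
        r = W.T @ z - x                               # reconstruction residual
        grad = 2.0 * (np.outer(z, r) + W @ np.outer(r, x))
        W -= lr * grad                                # plain gradient descent
    return W

x_test = rng.normal(size=4)                # one "distribution-shifted" input
W_adapted = ttt_adapt(W, x_test)
```

After adaptation, the auxiliary loss on that input drops, and the adapted weights are then used for the main prediction on the same input.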
no code implementations • 15 Jun 2023 • Chandramouli Sastry, Sri Harsha Dumpala, Sageev Oore
Score-matching and diffusion models have emerged as state-of-the-art generative models for both conditional and unconditional generation.
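Score-based generative models sample by following the gradient of the log-density (the score). A minimal sketch with a 1-D Gaussian target whose score is known in closed form, sampled with unadjusted Langevin dynamics; real score-matching and diffusion models instead learn the score with a neural network:

```python
import numpy as np

# Toy sketch of score-based sampling. The target N(mu, sigma^2) and its
# closed-form score are illustrative stand-ins for a learned score model.

def score(x, mu=2.0, sigma=1.0):
    """Score of N(mu, sigma^2): d/dx log p(x) = (mu - x) / sigma^2."""
    return (mu - x) / sigma**2

def langevin_sample(n=5000, steps=200, eps=0.05, seed=0):
    """Unadjusted Langevin dynamics driven by the score function."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                            # initialise from N(0, 1)
    for _ in range(steps):
        noise = rng.normal(size=n)
        x = x + eps * score(x) + np.sqrt(2 * eps) * noise
    return x

samples = langevin_sample()
```

With enough steps the samples concentrate around the target mean and variance; conditional generation conditions the (learned) score on side information.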
no code implementations • 2 Aug 2021 • Jason d'Eon, Sri Harsha Dumpala, Chandramouli Shama Sastry, Dani Oore, Sageev Oore
In this paper, we propose a new compositional tool that will generate a musical outline of speech recorded/provided by the user for use as a musical building block in their compositions.
no code implementations • 24 Jul 2021 • Sri Harsha Dumpala, Sebastian Rodriguez, Sheri Rempel, Rudolf Uher, Sageev Oore
In this work, we analyze the significance of speaker embeddings for the task of depression detection from speech.
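A speaker embedding summarises a variable-length utterance as one fixed-size vector that a downstream classifier can consume. As a hedged illustration, mean-pooling frame-level features is the simplest such summary; the paper's actual embedding extractor is not specified here:

```python
import numpy as np

# Illustrative sketch: pool variable-length frame features into a
# fixed-size utterance-level vector, a simple stand-in for a learned
# speaker embedding fed to a downstream (e.g. depression) classifier.

def utterance_embedding(frames):
    """Pool frame features of shape (T, D) to a fixed (D,) vector."""
    frames = np.asarray(frames, dtype=float)
    return frames.mean(axis=0)

rng = np.random.default_rng(0)
e1 = utterance_embedding(rng.normal(size=(120, 13)))   # 120 frames, 13 dims
e2 = utterance_embedding(rng.normal(size=(87, 13)))    # 87 frames, 13 dims
```

Both utterances, despite different durations, map to embeddings of identical dimensionality, which is what makes them convenient classifier inputs.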
no code implementations • 18 Dec 2019 • Sri Harsha Dumpala, Imran Sheikh, Rupayan Chakraborty, Sunil Kumar Kopparapu
Naturally introduced perturbations in audio signal, caused by emotional and physical states of the speaker, can significantly degrade the performance of Automatic Speech Recognition (ASR) systems.
Automatic Speech Recognition (ASR) +2
no code implementations • WS 2018 • Imran Sheikh, Sri Harsha Dumpala, Rupayan Chakraborty, Sunil Kumar Kopparapu
Multimodal sentiment classification in practical applications may have to rely on erroneous and imperfect views, namely (a) language transcription from a speech recognizer and (b) under-performing acoustic views.
Automatic Speech Recognition (ASR) • General Classification +2
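One standard way to combine such unreliable views is reliability-weighted late fusion of per-view class probabilities. A hedged sketch, with made-up probabilities and weights (the paper's actual fusion scheme is not specified here):

```python
import numpy as np

# Illustrative late-fusion sketch for multimodal sentiment: each view
# (noisy ASR transcript, under-performing acoustics) yields class
# probabilities, weighted by an assumed per-view reliability.

def fuse(view_probs, weights):
    """Weighted average of per-view class probability vectors."""
    view_probs = np.asarray(view_probs, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                       # normalise reliability weights
    return w @ view_probs                 # (V,) @ (V, C) -> (C,)

p_text = [0.3, 0.7]     # from the erroneous ASR transcript view
p_audio = [0.6, 0.4]    # from the under-performing acoustic view
fused = fuse([p_text, p_audio], weights=[0.7, 0.3])
```

Down-weighting the less reliable view keeps a single bad modality from dominating the fused decision.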
no code implementations • 15 Dec 2017 • Sri Harsha Dumpala, Rupayan Chakraborty, Sunil Kumar Kopparapu
Deep learning-based discriminative methods, despite being the state-of-the-art machine learning techniques, are ill-suited to learning from small amounts of data.
no code implementations • 24 Apr 2017 • Sri Harsha Dumpala, Rupayan Chakraborty, Sunil Kumar Kopparapu
It is not immediately clear (a) how a priori temporal knowledge can be used in an FFNN architecture, and (b) how an FFNN performs when provided with this knowledge of temporal correlations (when available) during training.
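One common way to hand temporal context to a feed-forward network is frame splicing: concatenate each frame with its ±k neighbours into a single input vector. A sketch, where the context width k is an assumed hyperparameter:

```python
import numpy as np

# Sketch of frame splicing, one simple way to expose temporal
# correlations to a feed-forward network: each frame is concatenated
# with +/- k context frames (k is an assumed hyperparameter).

def splice_frames(x, k=1):
    """(T, D) frames -> (T, (2k+1)*D) spliced inputs with edge padding."""
    x = np.asarray(x, dtype=float)
    padded = np.pad(x, ((k, k), (0, 0)), mode="edge")
    T = x.shape[0]
    return np.stack([padded[t:t + 2 * k + 1].reshape(-1) for t in range(T)])

X = splice_frames(np.arange(12, dtype=float).reshape(6, 2), k=1)
# each row now holds the previous, current, and next frame
```

The spliced rows can then be fed to a plain FFNN, which sees a fixed-width window of temporal context per input.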