no code implementations • 8 Jul 2024 • Dushyant Sharma, James Fosburgh, Sri Harsha Dumpala, Chandramouli Shama Sastri, Stanislav Yu. Kruchinin, Patrick A. Naylor
The XANE embeddings are used to estimate specific parameters related to the background acoustic properties of the signal, which makes the embeddings explainable in terms of those parameters.
no code implementations • 7 Jun 2024 • Sri Harsha Dumpala, Dushyant Sharma, Chandramouli Shama Sastri, Stanislav Kruchinin, James Fosburgh, Patrick A. Naylor
We present a novel method for extracting neural embeddings that model the background acoustics of a speech signal.
no code implementations • 30 Dec 2023 • Vaibhav Vaishnav, Anoop Jain, Dushyant Sharma
It is shown that the in situ presence of a concealed auxiliary layer not only guarantees resilience against stealthy bounded attacks on both frequency and power-sharing but also facilitates a network-enabled attack identification mechanism.
no code implementations • 25 Mar 2022 • Dushyant Sharma, Rong Gong, James Fosburgh, Stanislav Yu. Kruchinin, Patrick A. Naylor, Ljubomir Milanovic
We present a novel multi-channel front-end for tackling the distant ASR problem, based on channel shortening with the Weighted Prediction Error (WPE) method followed by a fixed MVDR beamformer, used in combination with a recently proposed self-attention-based channel combination (SACC) scheme.
no code implementations • 23 Sep 2021 • Marco Gaudesi, Felix Weninger, Dushyant Sharma, Puming Zhan
End-to-end (E2E) multi-channel ASR systems show state-of-the-art performance in far-field ASR tasks by joint training of a multi-channel front-end along with the ASR model.
no code implementations • 10 Sep 2021 • Rong Gong, Carl Quillen, Dushyant Sharma, Andrew Goderre, José Laínez, Ljubomir Milanović
When a sufficiently large amount of far-field training data is available, jointly optimizing a multichannel frontend and an end-to-end (E2E) Automatic Speech Recognition (ASR) backend shows promising results.