no code implementations • 29 Jan 2020 • Rohith Aralikatti, Sharad Roy, Abhinav Thanda, Dilip Kumar Margam, Pujitha Appan Kandala, Tanay Sharma, Shankar M Venkatesan
In this work, we propose novel methods to fuse information from audio and visual modalities at inference time.
no code implementations • 25 Jun 2019 • Dilip Kumar Margam, Rohith Aralikatti, Tanay Sharma, Abhinav Thanda, Pujitha A K, Sharad Roy, Shankar M Venkatesan
We also verify the method on a second dataset of 81 speakers that we collected.
no code implementations • 10 Jan 2017 • Abhinav Thanda, Shankar M Venkatesan
Multi-task learning (MTL) involves the simultaneous training of two or more related tasks over shared representations.
Automatic Speech Recognition • Automatic Speech Recognition (ASR)
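The idea of training related tasks over a shared representation can be illustrated with a small sketch. This is not the authors' model: the layer sizes, the two task heads (`head_a`, `head_b`), the mean-squared-error losses, and the `weight_b` mixing factor are all assumptions chosen for illustration. Only the forward pass and the combined loss are shown; a real setup would minimize this loss with a gradient-based optimizer.

```python
import random

def linear(x, w, b):
    # y_j = sum_i x_i * w[i][j] + b_j
    return [sum(xi * w[i][j] for i, xi in enumerate(x)) + b[j]
            for j in range(len(b))]

def relu(v):
    return [max(0.0, a) for a in v]

def init(n_in, n_out, rng):
    # Small random weights, zero biases (illustrative initialization).
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n_out)] for _ in range(n_in)]
    return w, [0.0] * n_out

def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

rng = random.Random(0)
shared = init(4, 3, rng)   # shared layer used by both tasks
head_a = init(3, 2, rng)   # hypothetical task A head
head_b = init(3, 1, rng)   # hypothetical task B head

def mtl_loss(x, target_a, target_b, weight_b=0.5):
    # Both heads read the same hidden representation h, so a gradient
    # step on the combined loss would update `shared` for both tasks.
    h = relu(linear(x, *shared))
    loss_a = mse(linear(h, *head_a), target_a)
    loss_b = mse(linear(h, *head_b), target_b)
    return loss_a + weight_b * loss_b

loss = mtl_loss([1.0, 0.5, -0.5, 2.0], [1.0, 0.0], [0.5])
```

Setting `weight_b=0.0` reduces the sketch to single-task training on task A, which makes the contribution of the auxiliary task easy to isolate.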
no code implementations • 9 Nov 2016 • Abhinav Thanda, Shankar M Venkatesan
The frame labels obtained from the acoustic model are then used to perform a non-linear dimensionality reduction of the visual features using a deep bottleneck network.
Audio-Visual Speech Recognition • Automatic Speech Recognition
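A deep bottleneck network of the kind mentioned in the excerpt above can be sketched as a classifier with one deliberately narrow hidden layer: the network is trained to predict frame labels from visual features, and the activations of the narrow layer are then used as the reduced features. The sketch below is an assumption-laden illustration, not the paper's architecture: the dimensions, the `tanh` activations, and the random untrained weights are all hypothetical, and training against the frame labels is omitted.

```python
import math
import random

def dense(x, w, b):
    # y_j = sum_i x_i * w[i][j] + b_j
    return [sum(xi * w[i][j] for i, xi in enumerate(x)) + b[j]
            for j in range(len(b))]

def init(n_in, n_out, rng):
    w = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_in)]
    return w, [0.0] * n_out

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

# Hypothetical sizes: 6-dim visual input, 2-dim bottleneck, 3 frame labels.
VISUAL_DIM, HIDDEN, BOTTLENECK, N_LABELS = 6, 8, 2, 3
rng = random.Random(0)
enc1 = init(VISUAL_DIM, HIDDEN, rng)
enc2 = init(HIDDEN, BOTTLENECK, rng)
out = init(BOTTLENECK, N_LABELS, rng)

def bottleneck_features(visual):
    # Non-linear projection of the visual features down to the narrow layer.
    h = [math.tanh(v) for v in dense(visual, *enc1)]
    return [math.tanh(v) for v in dense(h, *enc2)]

def label_posteriors(visual):
    # During training, these posteriors would be compared with the frame
    # labels supplied by the acoustic model (training is not shown here).
    return softmax(dense(bottleneck_features(visual), *out))

frame = [0.2, -0.1, 0.4, 0.0, 0.3, -0.5]
reduced = bottleneck_features(frame)
posteriors = label_posteriors(frame)
```

After training, only `bottleneck_features` would be needed at feature-extraction time; the output layer exists solely to give the narrow layer a supervised objective.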