1 code implementation • 25 Jul 2023 • Rahul Vashisht, Harish G. Ramaswamy
Attention models are typically learned by optimizing one of three standard loss functions, variously called soft attention, hard attention, and latent variable marginal likelihood (LVML) attention.
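As a rough illustration of how the three objectives differ, the sketch below computes each loss for a toy classification setup. The shapes, the per-chunk logits standing in for a predictor, and all names here are assumptions for exposition, not the paper's code; in particular, the hard-attention expectation is written exactly, whereas in practice it is usually estimated by sampling.

```python
# Toy contrast of soft attention, hard attention, and LVML losses.
import torch
import torch.nn.functional as F

def attention_losses(scores, chunk_logits, y):
    """scores: (B, T) unnormalized attention scores over T input chunks.
    chunk_logits: (B, T, C) class logits predicted from each chunk.
    y: (B,) integer class labels."""
    alpha = torch.softmax(scores, dim=-1)                # attention weights

    # Soft attention: score the attention-weighted combination.
    soft_logits = (alpha.unsqueeze(-1) * chunk_logits).sum(dim=1)   # (B, C)
    soft_loss = F.cross_entropy(soft_logits, y)

    # Hard attention: expected per-chunk loss under the attention distribution.
    T = chunk_logits.size(1)
    per_chunk_nll = F.cross_entropy(
        chunk_logits.flatten(0, 1), y.repeat_interleave(T),
        reduction="none").view_as(alpha)                 # (B, T)
    hard_loss = (alpha * per_chunk_nll).sum(dim=1).mean()

    # LVML: marginalize the per-chunk probabilities first, then take -log.
    probs = torch.softmax(chunk_logits, dim=-1)          # (B, T, C)
    marginal = (alpha.unsqueeze(-1) * probs).sum(dim=1)  # (B, C)
    lvml_loss = -torch.log(marginal.gather(1, y.unsqueeze(1))).mean()

    return soft_loss, hard_loss, lvml_loss
```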
1 code implementation • 30 Dec 2022 • Lakshmi Narayan Pandey, Rahul Vashisht, Harish G. Ramaswamy
In trained models with an attention mechanism, the outputs of an intermediate module that encodes the segment of input responsible for the output are often used as a way to peek into the `reasoning` of the network.
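A minimal sketch of this practice, under an assumed model (the class, layer sizes, and variable names are illustrative, not the paper's): after training, the attention weights are read off and the most-weighted input chunks are treated as the model's explanation.

```python
# Reading attention weights from a trained model as an input attribution.
import torch
import torch.nn as nn

class AttnClassifier(nn.Module):
    def __init__(self, d=32, n_classes=5):
        super().__init__()
        self.score = nn.Linear(d, 1)        # one scalar score per input chunk
        self.head = nn.Linear(d, n_classes)

    def forward(self, x):                   # x: (B, T, d)
        alpha = torch.softmax(self.score(x).squeeze(-1), dim=-1)  # (B, T)
        pooled = (alpha.unsqueeze(-1) * x).sum(dim=1)             # (B, d)
        return self.head(pooled), alpha

model = AttnClassifier()
logits, alpha = model(torch.randn(4, 10, 32))
top_chunks = alpha.argmax(dim=-1)   # the chunk each prediction attends to most
```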
1 code implementation • 16 Dec 2020 • Depen Morwani, Rahul Vashisht, Harish G. Ramaswamy
Recent papers have shown that sufficiently overparameterized neural networks can perfectly fit even random labels.
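The claim is easy to reproduce in miniature. The snippet below (an illustrative demonstration, not the paper's experiment) trains a wide two-layer network on labels assigned uniformly at random and typically reaches 100% training accuracy:

```python
# Overparameterized network fitting random labels to zero training error.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(64, 20)                 # 64 random inputs
y = torch.randint(0, 2, (64,))          # labels assigned at random

net = nn.Sequential(nn.Linear(20, 512), nn.ReLU(), nn.Linear(512, 2))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for step in range(500):
    opt.zero_grad()
    loss_fn(net(X), y).backward()
    opt.step()

acc = (net(X).argmax(1) == y).float().mean()
print(f"train accuracy on random labels: {acc:.2f}")   # typically 1.00
```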
no code implementations • 17 Aug 2019 • Rahul Vashisht, H. Viji, T. Sundararajan, D. Mohankumar, S. Sumitra
Deep learning architectures such as CNNs (Convolutional Neural Networks) and LSTMs (Long Short-Term Memory networks) are good candidates for representation learning from high-dimensional data.
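A minimal sketch of the two representation learners named above, with assumed input shapes and layer sizes: a small CNN maps spatially structured inputs to a compact feature vector, and an LSTM does the same for sequences.

```python
# CNN and LSTM as feature extractors for high-dimensional inputs.
import torch
import torch.nn as nn

cnn = nn.Sequential(                 # (B, 1, 64, 64) -> (B, 32) features
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
lstm = nn.LSTM(input_size=40, hidden_size=32, batch_first=True)

img_feat = cnn(torch.randn(8, 1, 64, 64))     # (8, 32) image representation
_, (h, _) = lstm(torch.randn(8, 100, 40))     # 100-step sequences
seq_feat = h[-1]                              # (8, 32) final hidden state
```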