no code implementations • 26 Jun 2019 • Phanideep Gampa, Sairam Satwik Kondamudi, Lakshmanan Kailasam
We consider the finite-horizon continuous reinforcement learning problem.
no code implementations • 23 Oct 2019 • Phanideep Gampa, Sumio Fujita
In the domain of learning to rank for IR, current deep learning models are trained on objective functions different from the measures they are evaluated on.
no code implementations • 29 Jun 2020 • Anirudh Goyal, Alex Lamb, Phanideep Gampa, Philippe Beaudoin, Sergey Levine, Charles Blundell, Yoshua Bengio, Michael Mozer
To use a video game as an illustration, two enemies of the same type will share schemata but will have separate object files to encode their distinct state (e.g., health, position).
no code implementations • ICLR 2021 • Anirudh Goyal, Alex Lamb, Phanideep Gampa, Philippe Beaudoin, Charles Blundell, Sergey Levine, Yoshua Bengio, Michael Curtis Mozer
To use a video game as an illustration, two enemies of the same type will share schemata but will have separate object files to encode their distinct state (e.g., health, position).
1 code implementation • 1 Nov 2021 • Soham Dan, Anirbit Mukherjee, Avirup Das, Phanideep Gampa
Training various state-of-the-art neural networks on SVHN, CIFAR-10 and CIFAR-100, we demonstrate that our new proposal $S_{\rm rel}$, as opposed to the original definition, much more sharply detects the tendency of weight updates to change predictions within the same class as the sampled data.
no code implementations • 24 Sep 2023 • Belhassen Bayar, Phanideep Gampa, Ainur Yessenalina, Zhen Wen
Current multi-armed bandit approaches in recommender systems (RS) have focused more on devising effective exploration techniques, while not adequately addressing common exploitation challenges related to distributional changes and item cannibalization.
no code implementations • 25 Sep 2023 • Phanideep Gampa, Farnoosh Javadi, Belhassen Bayar, Ainur Yessenalina
Our proposed framework is designed to enrich training examples with active-user representations through upsampling, and is capable of learning geography-based user embeddings by leveraging multi-task learning (MTL).