Search Results for author: Somjit Nath

Found 11 papers, 4 papers with code

Spectral Temporal Contrastive Learning

no code implementations 1 Dec 2023 Sacha Morin, Somjit Nath, Samira Ebrahimi Kahou, Guy Wolf

This work is concerned with the temporal contrastive learning (TCL) setting, more common in RL and robotics contexts, where the sequential structure of the data is instead used to define positive pairs.

Contrastive Learning Self-Supervised Learning

CAMMARL: Conformal Action Modeling in Multi Agent Reinforcement Learning

1 code implementation 19 Jun 2023 Nikunj Gupta, Somjit Nath, Samira Ebrahimi Kahou

Before taking actions in an environment with more than one intelligent agent, an autonomous agent may benefit from reasoning about the other agents and utilizing a notion of a guarantee or confidence about the behavior of the system.

Conformal Prediction Decision Making +2

Discovering Object-Centric Generalized Value Functions From Pixels

1 code implementation 27 Apr 2023 Somjit Nath, Gopeshh Raaj Subbaraj, Khimya Khetarpal, Samira Ebrahimi Kahou

Deep Reinforcement Learning has shown significant progress in extracting useful representations from high-dimensional inputs albeit using hand-crafted auxiliary tasks and pseudo rewards.

Object

Locally Constrained Representations in Reinforcement Learning

no code implementations 20 Sep 2022 Somjit Nath, Rushiv Arora, Samira Ebrahimi Kahou

This encourages the representations to be driven not only by the value/policy learning but also by an additional loss that constrains the representations from over-fitting to the value loss.

Continuous Control reinforcement-learning +2

A Learning Based Framework for Handling Uncertain Lead Times in Multi-Product Inventory Management

no code implementations 2 Mar 2022 Hardik Meisheri, Somjit Nath, Mayank Baranwal, Harshad Khadilkar

Through empirical evaluations, it is further shown that inventory management with uncertain lead times is not only equivalent to that of delay in information sharing across multiple echelons (observation delay), but that a model trained to handle one kind of delay can also handle delays of another kind without being retrained.

Management Q-Learning

Follow your Nose: Using General Value Functions for Directed Exploration in Reinforcement Learning

no code implementations 2 Mar 2022 Durgesh Kalwar, Omkar Shelke, Somjit Nath, Hardik Meisheri, Harshad Khadilkar

Exploration methods have been used to sample better trajectories in large environments while auxiliary tasks have been incorporated where the reward is sparse.

reinforcement-learning Reinforcement Learning (RL)

Training Recurrent Neural Networks Online by Learning Explicit State Variables

no code implementations ICLR 2020 Somjit Nath, Vincent Liu, Alan Chan, Xin Li, Adam White, Martha White

Recurrent neural networks (RNNs) allow an agent to construct a state-representation from a stream of experience, which is essential in partially observable problems.

SIBRE: Self Improvement Based REwards for Adaptive Feedback in Reinforcement Learning

no code implementations 21 Apr 2020 Somjit Nath, Richa Verma, Abhik Ray, Harshad Khadilkar

We propose a generic reward shaping approach for improving the rate of convergence in reinforcement learning (RL), called Self Improvement Based REwards, or SIBRE.

reinforcement-learning Reinforcement Learning (RL)

Two-Timescale Networks for Nonlinear Value Function Approximation

no code implementations ICLR 2019 Wesley Chung, Somjit Nath, Ajin Joseph, Martha White

A key component for many reinforcement learning agents is to learn a value function, either for policy evaluation or control.

Q-Learning Vocal Bursts Valence Prediction
