no code implementations • 6 Oct 2024 • Rached Bouchoucha, Ahmed Haj Yahmed, Darshan Patil, Janarthanan Rajendran, Amin Nikanjam, Sarath Chandar, Foutse Khomh
In this paper, we propose RLExplorer, the first fault diagnosis approach for DRL-based software systems.
no code implementations • 19 Sep 2024 • Domenic Rosati, Giles Edkins, Harsh Raj, David Atanasov, Subhabrata Majumdar, Janarthanan Rajendran, Frank Rudzicz, Hassan Sajjad
While there has been progress towards aligning Large Language Models (LLMs) with human values and ensuring safe behaviour at inference time, safety guards can easily be removed when the model is fine-tuned on unsafe and harmful datasets. While this fine-tuning setting has been treated extensively, another popular training paradigm, learning from unsafe feedback with reinforcement learning, has previously gone unexplored.
1 code implementation • 2 May 2024 • Darshan Patil, Janarthanan Rajendran, Glen Berseth, Sarath Chandar
In the real world, the strong episode resetting mechanisms that are needed to train agents in simulation are unavailable.
no code implementations • 22 Apr 2024 • Jesse Thibodeau, Hadi Nekoei, Afaf Taïk, Janarthanan Rajendran, Golnoosh Farnadi
We find that, upon deploying a learned tax and redistribution policy, social welfare improves on that of the fairness-agnostic baseline, approaches that of the analytically optimal fairness-aware baseline in the multi-armed and contextual bandit settings, and surpasses it by 13.19% in the full RL setting.
1 code implementation • 7 Mar 2024 • Mohammad Reza Samsami, Artem Zholus, Janarthanan Rajendran, Sarath Chandar
Through a diverse set of illustrative tasks, we systematically demonstrate that R2I not only establishes a new state-of-the-art for challenging memory and credit assignment RL tasks, such as BSuite and POPGym, but also showcases superhuman performance in the complex memory domain of Memory Maze.
no code implementations • 13 Nov 2023 • Arjun Vaithilingam Sudhakar, Prasanna Parthasarathi, Janarthanan Rajendran, Sarath Chandar
In this work, we explore and evaluate updating the LLM used for candidate recommendation during the learning of the text-based game as well, to mitigate reliance on human-annotated gameplays, which are costly to acquire.
1 code implementation • 20 Aug 2023 • Hadi Nekoei, Xutong Zhao, Janarthanan Rajendran, Miao Liu, Sarath Chandar
In this work, we show empirically that state-of-the-art ZSC algorithms have poor performance when paired with agents trained with different learning methods, and they require millions of interaction samples to adapt to these new partners.
1 code implementation • 16 Mar 2023 • Xutong Zhao, Yangchen Pan, Chenjun Xiao, Sarath Chandar, Janarthanan Rajendran
Efficient exploration is critical in cooperative deep Multi-Agent Reinforcement Learning (MARL).
no code implementations • 15 Mar 2023 • Ali Rahimi-Kalahroudi, Janarthanan Rajendran, Ida Momennejad, Harm van Seijen, Sarath Chandar
This is challenging for deep-learning-based world models due to catastrophic forgetting.
no code implementations • 6 Feb 2023 • Hadi Nekoei, Akilesh Badrinaaraayanan, Amit Sinha, Mohammad Amini, Janarthanan Rajendran, Aditya Mahajan, Sarath Chandar
In our proposed method, when one agent updates its policy, other agents are allowed to update their policies as well, but at a slower rate.
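As a toy illustration of this multi-timescale idea (the names and learning rates below are illustrative, not the paper's exact algorithm), each agent keeps updating its policy, but only the currently "fast" agent takes full-size steps:

```python
import numpy as np

# Illustrative sketch of multi-timescale policy updates (hypothetical setup):
# when one agent learns at the fast timescale, the others still update,
# but with a smaller step size.

def multi_timescale_step(params, grads, fast_agent, fast_lr=1e-3, slow_lr=1e-4):
    """Apply one update to every agent's policy parameters.

    params: list of np.ndarray, one parameter vector per agent
    grads:  list of np.ndarray, matching policy gradients
    fast_agent: index of the agent currently learning at the fast timescale
    """
    new_params = []
    for i, (p, g) in enumerate(zip(params, grads)):
        lr = fast_lr if i == fast_agent else slow_lr  # slower rate for the rest
        new_params.append(p + lr * g)                 # gradient ascent on return
    return new_params

# Rotate the fast agent periodically so every agent gets fast phases.
params = [np.zeros(4) for _ in range(3)]
for step in range(300):
    grads = [np.random.randn(4) for _ in params]      # stand-in for real gradients
    params = multi_timescale_step(params, grads, fast_agent=(step // 100) % 3)
```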
no code implementations • 11 Nov 2022 • Gabriele Prato, Yale Song, Janarthanan Rajendran, R Devon Hjelm, Neel Joshi, Sarath Chandar
We show that our method is successful at enabling vision transformers to encode the temporal component of video data.
1 code implementation • 25 Apr 2022 • Yi Wan, Ali Rahimi-Kalahroudi, Janarthanan Rajendran, Ida Momennejad, Sarath Chandar, Harm van Seijen
We empirically validate these insights in the case of linear function approximation by demonstrating that a modified version of linear Dyna achieves effective adaptation to local changes.
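For reference, a minimal sketch of a linear Dyna planning step in its textbook form (the paper studies a modified variant, which is not reproduced here): the learned linear model generates imagined next features and rewards for TD-style value updates.

```python
import numpy as np

# Linear Dyna planning step (standard form, illustrative only). The model
# predicts next features and reward linearly: phi' ~= F @ phi, r ~= b @ phi.

def linear_dyna_planning_step(theta, F, b, phi, alpha=0.1, gamma=0.99):
    """One TD-style value update on an imagined transition from features phi."""
    r_hat = b @ phi                 # model-predicted reward
    phi_next = F @ phi              # model-predicted next features
    td_error = r_hat + gamma * theta @ phi_next - theta @ phi
    return theta + alpha * td_error * phi

d = 8
theta = np.zeros(d)
F, b = 0.9 * np.eye(d), np.ones(d) / d   # stand-ins for a learned model
for _ in range(100):
    phi = np.random.rand(d)              # sampled feature vector to plan from
    theta = linear_dyna_planning_step(theta, F, b, phi)
```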
no code implementations • EMNLP (NLP4ConvAI) 2021 • Janarthanan Rajendran, Jonathan K. Kummerfeld, Satinder Singh
For each goal-oriented dialog task of interest, large amounts of data need to be collected for end-to-end learning of a neural dialog system.
no code implementations • 25 Feb 2021 • Ethan A. Brooks, Janarthanan Rajendran, Richard L. Lewis, Satinder Singh
Learning to flexibly follow task instructions in dynamic environments poses interesting challenges for reinforcement learning agents.
no code implementations • EMNLP (NLP-COVID19) 2020 • Laura Biester, Katie Matton, Janarthanan Rajendran, Emily Mower Provost, Rada Mihalcea
The COVID-19 pandemic, like many of the disease outbreaks that have preceded it, is likely to have a profound effect on mental health.
1 code implementation • NeurIPS 2020 • Janarthanan Rajendran, Alex Irpan, Eric Jang
Meta-learning algorithms aim to learn two components: a model that predicts targets for a task, and a base learner that quickly updates that model when given examples from a new task.
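A toy sketch of these two components on 1-D regression, using a Reptile-style first-order meta-update (illustrative only; the paper's contribution, meta-augmentation, is orthogonal to this skeleton):

```python
import numpy as np

# Two meta-learning components for a linear model y = w * x.

def base_learner(w, xs, ys, inner_lr=0.01, steps=5):
    """The base learner: quickly adapts the model w on a new task's examples."""
    for _ in range(steps):
        grad = 2 * np.mean((w * xs - ys) * xs)   # d/dw of mean squared error
        w = w - inner_lr * grad
    return w

# Outer loop: learn an initialization w0 whose adapted version fits new tasks.
w0 = 0.0
for _ in range(1000):
    true_w = np.random.uniform(-2, 2)            # sample a task
    xs = np.random.randn(10)
    ys = true_w * xs
    w_adapted = base_learner(w0, xs, ys)
    # Reptile-style meta-update: nudge w0 toward the adapted solution.
    w0 = w0 + 0.1 * (w_adapted - w0)
```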
no code implementations • 15 Dec 2019 • Janarthanan Rajendran, Richard Lewis, Vivek Veeriah, Honglak Lee, Satinder Singh
We present a method for learning intrinsic reward functions to drive the learning of an agent during periods of practice in which extrinsic task rewards are not available.
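A schematic sketch of this practice setup on a toy 2-armed bandit (hypothetical values; in the paper the intrinsic reward is itself meta-learned so that practice improves later task performance):

```python
import numpy as np

rng = np.random.default_rng(0)
q = np.zeros(2)                       # agent's action values
r_intrinsic = np.array([0.1, 0.3])    # stand-in for a *learned* reward function
r_extrinsic = np.array([0.0, 1.0])    # true task reward, unavailable in practice

def step(q, rewards, eps=0.1, lr=0.1):
    """One epsilon-greedy action and Q-learning update."""
    a = rng.integers(2) if rng.random() < eps else int(np.argmax(q))
    q[a] += lr * (rewards[a] - q[a])
    return q

for _ in range(200):                  # practice period: only intrinsic reward
    q = step(q, r_intrinsic)
for _ in range(200):                  # task period: extrinsic reward available
    q = step(q, r_extrinsic)
```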
no code implementations • NeurIPS 2019 • Vivek Veeriah, Matteo Hessel, Zhongwen Xu, Richard Lewis, Janarthanan Rajendran, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh
Arguably, intelligent agents ought to be able to discover their own questions so that, in learning answers to them, they learn unanticipated useful knowledge and skills; this departs from the focus in much of machine learning on agents learning answers to externally defined questions.
1 code implementation • TACL 2019 • Janarthanan Rajendran, Jatin Ganhotra, Lazaros Polymenakos
In this work, we propose an end-to-end trainable method for neural goal-oriented dialog systems which handles new user behaviors at deployment by transferring the dialog to a human agent intelligently.
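As a rough illustration of such a handoff (hypothetical fixed threshold; the paper learns when to transfer end-to-end), the system escalates whenever its best response is not confident enough:

```python
# Confidence-based human handoff, sketched with a hard-coded threshold.

def respond_or_handoff(model_probs, responses, threshold=0.6):
    """Answer with the model's top response, or escalate when unsure."""
    best = max(range(len(responses)), key=lambda i: model_probs[i])
    if model_probs[best] < threshold:
        return "TRANSFER_TO_HUMAN"
    return responses[best]

print(respond_or_handoff([0.2, 0.7, 0.1], ["a", "b", "c"]))    # "b"
print(respond_or_handoff([0.4, 0.35, 0.25], ["a", "b", "c"]))  # handoff
```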
1 code implementation • EMNLP 2018 • Janarthanan Rajendran, Jatin Ganhotra, Satinder Singh, Lazaros Polymenakos
We also propose a new and more effective testbed, permuted-bAbI dialog tasks, by introducing multiple valid next utterances into the original bAbI dialog tasks, which allows evaluation of goal-oriented dialog systems in a more realistic setting.
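A minimal sketch of how evaluation changes once a turn admits several valid next utterances (the dialog strings below are made up for illustration):

```python
# Per-response accuracy with multiple valid next utterances per turn.

def per_response_accuracy(predictions, valid_sets):
    """A prediction is correct if it matches ANY valid next utterance."""
    correct = sum(pred in valid for pred, valid in zip(predictions, valid_sets))
    return correct / len(predictions)

valid_sets = [
    {"what cuisine would you like?", "which city are you in?"},  # either is valid
    {"how many people in your party?"},
]
preds = ["which city are you in?", "table for how many?"]
print(per_response_accuracy(preds, valid_sets))  # 0.5
```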
1 code implementation • RANLP 2019 • Janarthanan Rajendran, Jatin Ganhotra, Xiaoxiao Guo, Mo Yu, Satinder Singh, Lazaros Polymenakos
Many Natural Language Processing (NLP) tasks depend on using Named Entities (NEs) that are contained in texts and in external knowledge sources.
no code implementations • ICLR 2018 • Janarthanan Rajendran, Jatin Ganhotra, Xiaoxiao Guo, Mo Yu, Satinder Singh
Many goal-oriented dialog tasks, especially ones in which the dialog system has to interact with external knowledge sources such as databases, have to handle a large number of Named Entities (NEs).
no code implementations • COLING 2016 • Amrita Saha, Mitesh M. Khapra, Sarath Chandar, Janarthanan Rajendran, Kyunghyun Cho
However, no parallel training data is available between X and Y, but training data is available between X & Z and Z & Y (as is often the case in many real-world applications).
1 code implementation • NAACL 2016 • Janarthanan Rajendran, Mitesh M. Khapra, Sarath Chandar, Balaraman Ravindran
In this work, we address a real-world scenario where no direct parallel data is available between two views of interest (say, $V_1$ and $V_2$) but parallel data is available between each of these views and a pivot view ($V_3$).
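A toy sketch of pivot-based alignment with linear encoders (illustrative, not the paper's exact model): pairs $(V_1, V_3)$ and $(V_2, V_3)$ share a latent source, and aligning both views to the pivot places $V_1$ and $V_2$ in a common space even though no $(V_1, V_2)$ pairs are ever observed.

```python
import numpy as np

rng = np.random.default_rng(0)
A1, A2, A3 = (rng.normal(size=(4, 4)) for _ in range(3))  # data-generating maps
W1, W2 = np.eye(4), np.eye(4)                             # learned encoders
W3 = np.eye(4)                                            # pivot encoder, held fixed here

lr = 0.01
for _ in range(2000):
    z = rng.normal(size=4)                 # shared latent behind a "parallel" pair
    x1, x2, x3 = A1 @ z, A2 @ z, A3 @ z
    # Only (x1, x3) and (x2, x3) pairs are observed, never (x1, x2).
    d13 = W1 @ x1 - W3 @ x3
    d23 = W2 @ x2 - W3 @ x3
    W1 -= lr * np.outer(d13, x1)           # pull V1 encoding toward the pivot's
    W2 -= lr * np.outer(d23, x2)           # pull V2 encoding toward the pivot's

# After training, W1 @ x1 and W2 @ x2 are close for matching inputs.
```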
2 code implementations • 10 Oct 2015 • Janarthanan Rajendran, Aravind Srinivas, Mitesh M. Khapra, P. Prasanna, Balaraman Ravindran
Second, the agent should be able to selectively transfer, which is the ability to select and transfer from different and multiple source tasks for different parts of the state space of the target task.
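A minimal sketch of selective transfer via a soft, state-dependent attention over source-task policies, in the spirit of this idea (names and shapes below are illustrative, not the paper's architecture):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def transfer_policy(state_feats, source_policies, base_policy, attention_W):
    """Blend source policies and a learned base policy, per state."""
    logits = attention_W @ state_feats          # one logit per source (+ base)
    weights = softmax(logits)                   # state-dependent attention
    candidates = source_policies + [base_policy]
    # Convex combination: different states can lean on different sources.
    return sum(w * p(state_feats) for w, p in zip(weights, candidates))

n_actions, d = 3, 5
uniform = lambda s: np.ones(n_actions) / n_actions
source_policies = [uniform, lambda s: softmax(s[:n_actions])]
attention_W = np.random.randn(len(source_policies) + 1, d)
probs = transfer_policy(np.random.randn(d), source_policies, uniform, attention_W)
```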