Search Results for author: Janarthanan Rajendran

Found 25 papers, 11 papers with code

Mitigating Unsafe Feedback with Learning Constraints

no code implementations • 19 Sep 2024 • Domenic Rosati, Giles Edkins, Harsh Raj, David Atanasov, Subhabrata Majumdar, Janarthanan Rajendran, Frank Rudzicz, Hassan Sajjad

While there has been progress towards aligning Large Language Models (LLMs) with human values and ensuring safe behaviour at inference time, safety guardrails can easily be removed when the models are fine-tuned on unsafe and harmful datasets. While this fine-tuning setting has been studied extensively, another popular training paradigm, learning from unsafe feedback with reinforcement learning, has so far gone unexplored.

Safety Alignment • Text Generation

Intelligent Switching for Reset-Free RL

1 code implementation • 2 May 2024 • Darshan Patil, Janarthanan Rajendran, Glen Berseth, Sarath Chandar

In the real world, the strong episode resetting mechanisms that are needed to train agents in simulation are unavailable.
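As a rough illustration of this reset-free setting, here is a toy sketch of an agent that alternates between a forward task policy and a reset policy that returns it to the initial-state region. The environment, both policies, and the switching rule below are illustrative assumptions, not the paper's algorithm.

```python
import random

# Toy 1-D "corridor": states are integers, the task goal is at +10,
# and the initial-state region is around 0. All of this is illustrative.
GOAL, START_REGION = 10, range(-1, 2)

def forward_policy(state):
    # Hypothetical task policy: mostly move toward the goal.
    return 1 if random.random() < 0.8 else -1

def reset_policy(state):
    # Hypothetical reset policy: move back toward the initial-state region.
    return -1 if state > 0 else 1

state, controller = 0, "forward"
for step in range(200):
    action = forward_policy(state) if controller == "forward" else reset_policy(state)
    state += action  # toy deterministic transition

    # Switching (simplified): hand control to the reset policy once the task
    # is done or the agent strays too far, and hand it back once the agent
    # is again near the initial-state region.
    if controller == "forward" and (state >= GOAL or abs(state) > 15):
        controller = "reset"
    elif controller == "reset" and state in START_REGION:
        controller = "forward"
```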

Fairness Incentives in Response to Unfair Dynamic Pricing

no code implementations • 22 Apr 2024 • Jesse Thibodeau, Hadi Nekoei, Afaf Taïk, Janarthanan Rajendran, Golnoosh Farnadi

We find that, upon deploying a learned tax and redistribution policy, social welfare improves on that of the fairness-agnostic baseline, approaches that of the analytically optimal fairness-aware baseline in the multi-armed and contextual bandit settings, and surpasses it by 13.19% in the full RL setting.

Fairness • Reinforcement Learning (RL)

Mastering Memory Tasks with World Models

1 code implementation • 7 Mar 2024 • Mohammad Reza Samsami, Artem Zholus, Janarthanan Rajendran, Sarath Chandar

Through a diverse set of illustrative tasks, we systematically demonstrate that R2I not only establishes a new state-of-the-art for challenging memory and credit assignment RL tasks, such as BSuite and POPGym, but also showcases superhuman performance in the complex memory domain of Memory Maze.

Model-based Reinforcement Learning • State Space Models

Language Model-In-The-Loop: Data Optimal Approach to Learn-To-Recommend Actions in Text Games

no code implementations • 13 Nov 2023 • Arjun Vaithilingam Sudhakar, Prasanna Parthasarathi, Janarthanan Rajendran, Sarath Chandar

In this work, we explore and evaluate updating the LLM used for candidate recommendation during the learning of the text-based game as well, to mitigate the reliance on human-annotated gameplays, which are costly to acquire.

Language Modelling • text-based games
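To make the language-model-in-the-loop idea above concrete, here is a minimal sketch of the loop: an LLM proposes candidate actions for the current text observation, an RL policy chooses among them, and the recommender is periodically updated on the agent's own gameplay. The functions propose_actions, score_action, and finetune_llm are hypothetical placeholders, not the paper's actual interfaces.

```python
import random

def propose_actions(observation, k=5):
    # Placeholder for an LLM that recommends k candidate actions for the
    # current text observation (a real system would query a language model).
    verbs, objects = ["take", "open", "go", "read"], ["key", "door", "north", "note"]
    return [f"{random.choice(verbs)} {random.choice(objects)}" for _ in range(k)]

def score_action(observation, action):
    # Placeholder for the RL agent's action-value estimate.
    return random.random()

def finetune_llm(trajectories):
    # Placeholder: periodically update the recommender LLM on the agent's own
    # gameplay instead of relying on costly human-annotated transcripts.
    pass

trajectories, observation = [], "You are in a small room. There is a door."
for episode in range(10):
    episode_log = []
    for t in range(20):
        candidates = propose_actions(observation)
        action = max(candidates, key=lambda a: score_action(observation, a))
        episode_log.append((observation, action))
        # ... environment step omitted in this sketch ...
    trajectories.append(episode_log)
    if episode % 5 == 4:          # update the LLM in the loop
        finetune_llm(trajectories)
```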

Towards Few-shot Coordination: Revisiting Ad-hoc Teamplay Challenge In the Game of Hanabi

1 code implementation • 20 Aug 2023 • Hadi Nekoei, Xutong Zhao, Janarthanan Rajendran, Miao Liu, Sarath Chandar

In this work, we show empirically that state-of-the-art ZSC algorithms have poor performance when paired with agents trained with different learning methods, and they require millions of interaction samples to adapt to these new partners.

Game of Hanabi • Multi-agent Reinforcement Learning • +1

PatchBlender: A Motion Prior for Video Transformers

no code implementations • 11 Nov 2022 • Gabriele Prato, Yale Song, Janarthanan Rajendran, R Devon Hjelm, Neel Joshi, Sarath Chandar

We show that our method is successful at enabling vision transformers to encode the temporal component of video data.
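A minimal sketch of the motion-prior idea, assuming a (batch, frames, patches, dim) token layout and PyTorch: each patch embedding is blended across frames with a learnable mixing matrix. This is a simplification for illustration, not the released PatchBlender implementation.

```python
import torch
import torch.nn as nn

class TemporalPatchBlend(nn.Module):
    """Learned blending of each patch embedding across frames (a sketch of
    the motion-prior idea, not the paper's exact module)."""

    def __init__(self, num_frames: int):
        super().__init__()
        # One learnable mixing weight per (target frame, source frame) pair.
        self.mix = nn.Parameter(torch.eye(num_frames))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, frames, patches, dim)
        weights = self.mix.softmax(dim=-1)               # rows sum to 1
        return torch.einsum("ts,bspd->btpd", weights, x)

blend = TemporalPatchBlend(num_frames=8)
video_tokens = torch.randn(2, 8, 196, 768)               # toy ViT-style tokens
out = blend(video_tokens)                                 # same shape as input
```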

Towards Evaluating Adaptivity of Model-Based Reinforcement Learning Methods

1 code implementation • 25 Apr 2022 • Yi Wan, Ali Rahimi-Kalahroudi, Janarthanan Rajendran, Ida Momennejad, Sarath Chandar, Harm van Seijen

We empirically validate these insights in the case of linear function approximation by demonstrating that a modified version of linear Dyna achieves effective adaptation to local changes.

Model-based Reinforcement Learning • reinforcement-learning • +2
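For context on the linear Dyna setting mentioned above, here is a sketch of a vanilla linear Dyna loop: a linear model of expected next features and rewards is learned from experience and then used for TD-style planning updates to the value weights. This is a generic textbook-style sketch with toy data, not the modified version studied in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, gamma, alpha, alpha_m = 4, 0.9, 0.1, 0.1

theta = np.zeros(d)          # value weights: v(s) ~ theta @ phi(s)
F = np.zeros((d, d))         # linear model of expected next features
b = np.zeros(d)              # linear model of expected reward

def sample_transition():
    # Toy stand-in for real experience: random features and reward.
    phi = rng.normal(size=d)
    phi_next = 0.5 * phi + 0.1 * rng.normal(size=d)
    reward = phi[0]
    return phi, reward, phi_next

for step in range(1000):
    phi, r, phi_next = sample_transition()

    # Learn the linear model from real experience.
    F += alpha_m * np.outer(phi_next - F @ phi, phi)
    b += alpha_m * (r - b @ phi) * phi

    # Dyna-style planning: TD updates on imagined transitions from the model.
    for _ in range(5):
        phi_p = rng.normal(size=d)                 # sampled planning features
        r_hat, phi_hat = b @ phi_p, F @ phi_p
        td_error = r_hat + gamma * theta @ phi_hat - theta @ phi_p
        theta += alpha * td_error * phi_p
```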

Reinforcement Learning of Implicit and Explicit Control Flow in Instructions

no code implementations • 25 Feb 2021 • Ethan A. Brooks, Janarthanan Rajendran, Richard L. Lewis, Satinder Singh

Learning to flexibly follow task instructions in dynamic environments poses interesting challenges for reinforcement learning agents.

Minecraft • reinforcement-learning • +4

Quantifying the Effects of COVID-19 on Mental Health Support Forums

no code implementations • EMNLP (NLP-COVID19) 2020 • Laura Biester, Katie Matton, Janarthanan Rajendran, Emily Mower Provost, Rada Mihalcea

The COVID-19 pandemic, like many of the disease outbreaks that have preceded it, is likely to have a profound effect on mental health.

Meta-Learning Requires Meta-Augmentation

1 code implementation • NeurIPS 2020 • Janarthanan Rajendran, Alex Irpan, Eric Jang

Meta-learning algorithms aim to learn two components: a model that predicts targets for a task, and a base learner that quickly updates that model when given examples from a new task.

Meta-Learning
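The two components named in the abstract above (a model and a base learner that quickly adapts it) can be sketched with a MAML-style inner/outer loop on toy regression tasks, together with a simple label-shift augmentation that forces the base learner to actually use the support set. The task family, first-order meta-gradient, and learning rates are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
inner_lr, outer_lr = 0.1, 0.01
w = np.zeros(2)                               # model: y ~ w[0] * x + w[1]

def sample_task(augment=True):
    slope = rng.uniform(-2.0, 2.0)
    # Meta-augmentation (simplified): shift every target in the task by a
    # random offset, so the mapping x -> y cannot be memorised across tasks
    # and the base learner must adapt from the support examples.
    offset = rng.uniform(-5.0, 5.0) if augment else 0.0
    xs = rng.uniform(-1.0, 1.0, size=10)
    ys = slope * xs + offset
    return (xs[:5], ys[:5]), (xs[5:], ys[5:])   # support set, query set

def loss_grad(w, xs, ys):
    err = w[0] * xs + w[1] - ys
    return np.array([np.mean(err * xs), np.mean(err)])

for step in range(2000):
    (sx, sy), (qx, qy) = sample_task()
    # Base learner: one inner gradient step on the support set.
    w_adapted = w - inner_lr * loss_grad(w, sx, sy)
    # Outer update: improve the post-adaptation loss on the query set
    # (first-order approximation of the meta-gradient, for brevity).
    w -= outer_lr * loss_grad(w_adapted, qx, qy)
```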

How Should an Agent Practice?

no code implementations • 15 Dec 2019 • Janarthanan Rajendran, Richard Lewis, Vivek Veeriah, Honglak Lee, Satinder Singh

We present a method for learning intrinsic reward functions to drive the learning of an agent during periods of practice in which extrinsic task rewards are not available.
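A toy skeleton of the practice setting described above: the policy is improved during practice phases using only an intrinsic reward, and the intrinsic-reward parameters themselves are adjusted so that practice leads to better performance on the extrinsic task. The hill-climbing update and the placeholder environment below are crude illustrative stand-ins for the paper's gradient-based method.

```python
import numpy as np

rng = np.random.default_rng(0)

def practice_episode(policy, intrinsic_w):
    # Placeholder: during practice the agent learns from intrinsic rewards only
    # (no task reward is available); here that is abstracted to a small update.
    return policy + 0.1 * intrinsic_w

def task_return(policy):
    # Placeholder extrinsic evaluation of the policy; higher is better.
    target = np.array([1.0, -0.5, 0.25])
    return -np.sum((policy - target) ** 2)

policy = np.zeros(3)
intrinsic_w = rng.normal(size=3)        # parameters of the intrinsic reward

for _ in range(200):
    # How useful is practising with the current intrinsic reward for the task?
    current_score = task_return(practice_episode(policy, intrinsic_w))

    # Crude hill climbing on the intrinsic-reward parameters: keep a perturbed
    # version if practising with it leads to better task performance (a toy
    # stand-in for the paper's learned intrinsic-reward optimisation).
    candidate = intrinsic_w + 0.1 * rng.normal(size=3)
    if task_return(practice_episode(policy, candidate)) > current_score:
        intrinsic_w = candidate

    # Practice phase: update the policy using the (intrinsic) reward only.
    policy = practice_episode(policy, intrinsic_w)
```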

Discovery of Useful Questions as Auxiliary Tasks

no code implementations • NeurIPS 2019 • Vivek Veeriah, Matteo Hessel, Zhongwen Xu, Richard Lewis, Janarthanan Rajendran, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh

Arguably, intelligent agents ought to be able to discover their own questions so that in learning answers for them they learn unanticipated useful knowledge and skills; this departs from the focus in much of machine learning on agents learning answers to externally defined questions.

Reinforcement Learning • Reinforcement Learning (RL)

Learning End-to-End Goal-Oriented Dialog with Maximal User Task Success and Minimal Human Agent Use

1 code implementation • TACL 2019 • Janarthanan Rajendran, Jatin Ganhotra, Lazaros Polymenakos

In this work, we propose an end-to-end trainable method for neural goal-oriented dialog systems which handles new user behaviors at deployment by transferring the dialog to a human agent intelligently.

Goal-Oriented Dialog
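The handoff behaviour described above can be illustrated with a small sketch in which the bot answers when it is confident and otherwise transfers the dialog to a human agent. The fixed confidence threshold and the placeholder scorer are illustrative assumptions; the paper learns this transfer decision end to end.

```python
import numpy as np

def bot_next_utterance(dialog_history, candidates):
    # Placeholder scorer: a real system would score candidates with a
    # trained end-to-end dialog model.
    rng = np.random.default_rng(len(dialog_history))
    logits = rng.normal(size=len(candidates))
    probs = np.exp(logits) / np.exp(logits).sum()
    best = int(np.argmax(probs))
    return candidates[best], float(probs[best])

def respond(dialog_history, candidates, confidence_threshold=0.6):
    """Answer with the bot when confident; otherwise hand the dialog to a
    human agent (a fixed threshold stands in for the learned transfer
    decision described in the paper)."""
    utterance, confidence = bot_next_utterance(dialog_history, candidates)
    if confidence >= confidence_threshold:
        return "bot", utterance
    return "human", None   # escalate: a human agent takes over the dialog

who, reply = respond(["Hi, I want to book a table"],
                     ["Which cuisine would you like?", "How many people?"])
print(who, reply)
```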

Learning End-to-End Goal-Oriented Dialog with Multiple Answers

1 code implementation • EMNLP 2018 • Janarthanan Rajendran, Jatin Ganhotra, Satinder Singh, Lazaros Polymenakos

We also propose a new and more effective testbed, permuted-bAbI dialog tasks, by introducing multiple valid next utterances to the original-bAbI dialog tasks, which allows evaluation of goal-oriented dialog systems in a more realistic setting.

Goal-Oriented Dialog • Reinforcement Learning • +1
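The evaluation change that multiple valid next utterances make possible can be shown in a few lines: a turn counts as correct if the predicted utterance matches any of the acceptable answers. The scoring function and the example turns below are illustrative, not the permuted-bAbI scripts themselves.

```python
def per_turn_accuracy(predictions, valid_answer_sets):
    """Score a dialog turn as correct if the predicted utterance matches ANY
    of the valid next utterances for that turn."""
    correct = sum(pred in valid for pred, valid in zip(predictions, valid_answer_sets))
    return correct / len(predictions)

# Example: the second turn has two acceptable next utterances.
preds = ["what cuisine would you like?", "how many people in your party?"]
valid = [{"what cuisine would you like?"},
         {"which price range do you prefer?", "how many people in your party?"}]
print(per_turn_accuracy(preds, valid))   # 1.0
```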

NE-Table: A Neural key-value table for Named Entities

1 code implementation • RANLP 2019 • Janarthanan Rajendran, Jatin Ganhotra, Xiaoxiao Guo, Mo Yu, Satinder Singh, Lazaros Polymenakos

Many Natural Language Processing (NLP) tasks depend on using Named Entities (NEs) that are contained in texts and in external knowledge sources.

Goal-Oriented Dialog • Question Answering • +2
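A toy sketch of the key-value-table idea behind NE-Table: each named-entity surface form encountered in the text gets an entry whose value vector downstream models can query. The class below is a simplified illustration with random vectors, not the paper's trained architecture.

```python
import numpy as np

class NETable:
    """A toy key-value table for named entities: each new entity surface form
    gets an entry with a value vector that downstream models can query."""

    def __init__(self, dim=16, seed=0):
        self.dim = dim
        self.rng = np.random.default_rng(seed)
        self.values = {}                        # entity string -> vector

    def add(self, entity: str) -> np.ndarray:
        if entity not in self.values:
            self.values[entity] = self.rng.normal(size=self.dim)
        return self.values[entity]

    def lookup(self, query_vec: np.ndarray) -> str:
        # Return the stored entity whose value vector is most similar to the
        # query (e.g. a decoder state asking "which restaurant was mentioned?").
        def sim(v):
            return float(query_vec @ v / (np.linalg.norm(query_vec) * np.linalg.norm(v)))
        return max(self.values, key=lambda e: sim(self.values[e]))

table = NETable()
v = table.add("Saint_Pierre_Restaurant")        # unseen NE from a dialog turn
print(table.lookup(v))                           # -> "Saint_Pierre_Restaurant"
```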

A Neural Method for Goal-Oriented Dialog Systems to interact with Named Entities

no code implementations • ICLR 2018 • Janarthanan Rajendran, Jatin Ganhotra, Xiaoxiao Guo, Mo Yu, Satinder Singh

Many goal-oriented dialog tasks, especially ones in which the dialog system has to interact with external knowledge sources such as databases, have to handle a large number of Named Entities (NEs).

Goal-Oriented Dialog Question Answering

A Correlational Encoder Decoder Architecture for Pivot Based Sequence Generation

no code implementations • COLING 2016 • Amrita Saha, Mitesh M. Khapra, Sarath Chandar, Janarthanan Rajendran, Kyunghyun Cho

However, no parallel training data is available between X and Y, but training data is available between X & Z and between Z & Y (as is often the case in many real-world applications).

Decoder • Transliteration
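The pivot setup described above (X-Z and Z-Y pairs but no X-Y pairs) can be sketched in a few lines of PyTorch: paired X and Z encodings are pulled together in a shared space, a decoder into Y is trained from that space on Z-Y pairs, and at test time X is mapped through the shared space to Y. The cosine agreement term, linear modules, and toy data below are simplifying assumptions, not the paper's correlational objective or architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

dim_x, dim_z, dim_h, dim_y = 32, 24, 16, 8
enc_x = nn.Linear(dim_x, dim_h)       # encoder for source view X
enc_z = nn.Linear(dim_z, dim_h)       # encoder for pivot view Z
dec_y = nn.Linear(dim_h, dim_y)       # decoder into target view Y
opt = torch.optim.Adam([*enc_x.parameters(), *enc_z.parameters(), *dec_y.parameters()], lr=1e-3)

# Toy parallel data: X-Z pairs and Z-Y pairs, but no direct X-Y pairs.
x, z_for_x = torch.randn(64, dim_x), torch.randn(64, dim_z)
z_for_y, y = torch.randn(64, dim_z), torch.randn(64, dim_y)

for step in range(100):
    opt.zero_grad()
    # 1) Pull paired X and Z encodings together in the shared space
    #    (a cosine agreement loss stands in for the correlational objective).
    hx, hz = enc_x(x), enc_z(z_for_x)
    corr_loss = 1 - F.cosine_similarity(hx, hz, dim=-1).mean()
    # 2) Learn to decode Y from the shared space using Z-Y pairs only.
    recon_loss = F.mse_loss(dec_y(enc_z(z_for_y)), y)
    (corr_loss + recon_loss).backward()
    opt.step()

# At test time: X -> shared space -> Y, with no X-Y training data ever used.
y_pred = dec_y(enc_x(torch.randn(1, dim_x)))
```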

Bridge Correlational Neural Networks for Multilingual Multimodal Representation Learning

1 code implementation • NAACL 2016 • Janarthanan Rajendran, Mitesh M. Khapra, Sarath Chandar, Balaraman Ravindran

In this work, we address a real-world scenario where no direct parallel data is available between two views of interest (say, $V_1$ and $V_2$) but parallel data is available between each of these views and a pivot view ($V_3$).

Document Classification • Representation Learning • +2

Attend, Adapt and Transfer: Attentive Deep Architecture for Adaptive Transfer from multiple sources in the same domain

2 code implementations • 10 Oct 2015 • Janarthanan Rajendran, Aravind Srinivas, Mitesh M. Khapra, P. Prasanna, Balaraman Ravindran

Second, the agent should be able to selectively transfer, which is the ability to select and transfer from different and multiple source tasks for different parts of the state space of the target task.
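The selective-transfer idea above can be sketched as a state-conditioned attention over the action distributions of several frozen source-task policies plus a base policy learned from scratch, with the blended distribution used to act. The network shapes and modules below are illustrative assumptions, not the released A2T implementation.

```python
import torch
import torch.nn as nn

class AttentiveTransferPolicy(nn.Module):
    """Sketch of selective transfer: an attention network weights the action
    distributions of several (frozen) source-task policies and a trainable
    base policy, per state."""

    def __init__(self, state_dim, num_actions, source_policies):
        super().__init__()
        self.source_policies = source_policies            # list of frozen nets
        self.base = nn.Linear(state_dim, num_actions)      # learned from scratch
        # One attention weight per source policy plus one for the base network.
        self.attention = nn.Linear(state_dim, len(source_policies) + 1)

    def forward(self, state):
        experts = [p(state).softmax(dim=-1) for p in self.source_policies]
        experts.append(self.base(state).softmax(dim=-1))
        w = self.attention(state).softmax(dim=-1)           # (batch, K+1)
        stacked = torch.stack(experts, dim=1)               # (batch, K+1, A)
        return (w.unsqueeze(-1) * stacked).sum(dim=1)       # blended policy

state_dim, num_actions = 8, 4
sources = [nn.Linear(state_dim, num_actions) for _ in range(2)]
for p in sources:
    p.requires_grad_(False)                                 # keep sources fixed
policy = AttentiveTransferPolicy(state_dim, num_actions, sources)
probs = policy(torch.randn(5, state_dim))                   # (5, 4), rows sum to 1
```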
