no code implementations • 24 Jul 2024 • Michael-Andrei Panaitescu-Liess, Zora Che, Bang An, Yuancheng Xu, Pankayaraj Pathmanathan, Souradip Chakraborty, Sicheng Zhu, Tom Goldstein, Furong Huang
Surprisingly, we find that watermarking adversely affects the success rate of membership inference attacks (MIAs), complicating the task of detecting copyrighted text in the pretraining dataset.
no code implementations • 21 Jun 2024 • Mucong Ding, Souradip Chakraborty, Vibhu Agrawal, Zora Che, Alec Koppel, Mengdi Wang, Amrit Bedi, Furong Huang
Reinforcement Learning from Human Feedback (RLHF) is a key method for aligning large language models (LLMs) with human preferences.
1 code implementation • 17 Jun 2024 • Pankayaraj Pathmanathan, Souradip Chakraborty, Xiangyu Liu, Yongyuan Liang, Furong Huang
Recent advancements in Reinforcement Learning with Human Feedback (RLHF) have significantly impacted the alignment of Large Language Models (LLMs).
no code implementations • 16 Jun 2024 • Utsav Singh, Souradip Chakraborty, Wesley A. Suttle, Brian M. Sadler, Vinay P Namboodiri, Amrit Singh Bedi
To validate our approach, we perform extensive experimental analysis on a variety of challenging robotics tasks, demonstrating that DIPPER outperforms hierarchical and non-hierarchical baselines, while ameliorating the non-stationarity and infeasible subgoal generation issues of hierarchical reinforcement learning.
Tasks: Computational Efficiency, Hierarchical Reinforcement Learning, +1
no code implementations • 30 May 2024 • Souradip Chakraborty, Soumya Suvra Ghosal, Ming Yin, Dinesh Manocha, Mengdi Wang, Amrit Singh Bedi, Furong Huang
Hence, prior SoTA methods either approximate this $Q^*$ using $Q^{\pi_{\texttt{sft}}}$ (derived from the reference $\texttt{SFT}$ model) or rely on short-term rewards, resulting in sub-optimal decoding performance.
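As an illustration of the decoding-time trade-off described above, here is a minimal sketch of value-guided decoding, where reference-model log-probabilities are combined with an estimate of future reward standing in for $Q^*$; the `value_estimates` input, the weighting `alpha`, and the greedy selection are illustrative assumptions, not the paper's exact procedure.

```python
import torch

def value_guided_decode_step(sft_logits, value_estimates, alpha=1.0):
    """One decoding step mixing reference (SFT) model log-probs with an
    estimate of future reward for each candidate next token (a stand-in
    for Q*). `alpha` trades off fluency against alignment.

    sft_logits:      (vocab,) logits from the reference SFT model
    value_estimates: (vocab,) estimated future reward per next token
    """
    log_probs = torch.log_softmax(sft_logits, dim=-1)
    scores = log_probs + alpha * value_estimates
    # Greedy pick; sampling from softmax(scores) would work equally well.
    return torch.argmax(scores)
```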
1 code implementation • 16 Feb 2024 • Nirjhar Das, Souradip Chakraborty, Aldo Pacchiano, Sayak Ray Chowdhury
Reinforcement Learning from Human Feedback (RLHF) is pivotal in aligning Large Language Models (LLMs) with human preferences.
1 code implementation • 15 Feb 2024 • Xiyang Wu, Souradip Chakraborty, Ruiqi Xian, Jing Liang, Tianrui Guan, Fuxiao Liu, Brian M. Sadler, Dinesh Manocha, Amrit Singh Bedi
In this paper, we highlight the critical issues of robustness and safety associated with integrating large language models (LLMs) and vision-language models (VLMs) into robotics applications.
no code implementations • 14 Feb 2024 • Souradip Chakraborty, Jiahao Qiu, Hui Yuan, Alec Koppel, Furong Huang, Dinesh Manocha, Amrit Singh Bedi, Mengdi Wang
Reinforcement Learning from Human Feedback (RLHF) aligns language models to human preferences by employing a singular reward model derived from preference data.
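A standard way to derive such a reward model from preference data is the Bradley-Terry pairwise loss; the sketch below shows that generic objective, not necessarily the exact formulation used in this work.

```python
import torch.nn.functional as F

def bradley_terry_loss(reward_chosen, reward_rejected):
    """Pairwise preference loss for reward modeling: pushes the scalar
    reward of the preferred (chosen) response above that of the rejected
    one for the same prompt. Inputs are (batch,) tensors of rewards.
    """
    # -log sigmoid(r_chosen - r_rejected), averaged over the batch
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()
```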
no code implementations • 5 Feb 2024 • Xingpeng Sun, Haoming Meng, Souradip Chakraborty, Amrit Singh Bedi, Aniket Bera
While LLMs excel in processing text in these human conversations, they struggle with the nuances of verbal instructions in scenarios like social navigation, where ambiguity and uncertainty can erode trust in robotic and other AI systems.
no code implementations • 22 Dec 2023 • Souradip Chakraborty, Anukriti Singh, Amisha Bhaskar, Pratap Tokekar, Dinesh Manocha, Amrit Singh Bedi
Current methods to mitigate this misalignment work by learning reward functions from human preferences; however, they inadvertently introduce a risk of reward overoptimization.
no code implementations • 23 Oct 2023 • Soumya Suvra Ghosal, Souradip Chakraborty, Jonas Geiping, Furong Huang, Dinesh Manocha, Amrit Singh Bedi
In parallel with the development of detection frameworks, however, researchers have also concentrated on designing strategies to elude detection, i.e., on the impossibilities of AI-generated text detection.
no code implementations • 3 Aug 2023 • Souradip Chakraborty, Amrit Singh Bedi, Alec Koppel, Dinesh Manocha, Huazheng Wang, Mengdi Wang, Furong Huang
We present a novel unified bilevel optimization-based framework, \textsf{PARL}, formulated to address the recently highlighted critical issue of policy alignment in reinforcement learning using utility or preference-based feedback.
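To make the bilevel structure concrete, here is a toy, hypothetical example (not PARL itself): the lower level solves an inner problem by unrolled gradient descent, and the upper level differentiates through that solution.

```python
import torch

# Toy bilevel problem (illustrative only): the upper level picks a design
# variable lam; the lower level best-responds with
# theta*(lam) = argmin_theta (theta - lam)^2; the upper level wants
# theta*(lam) to be close to 1.
lam = torch.tensor([0.0], requires_grad=True)       # upper-level variable
upper_opt = torch.optim.SGD([lam], lr=0.1)

for outer_step in range(100):
    theta = torch.zeros(1, requires_grad=True)      # lower-level variable
    for _ in range(10):                             # unrolled inner descent
        lower_loss = (theta - lam) ** 2
        grad_theta, = torch.autograd.grad(lower_loss, theta, create_graph=True)
        theta = theta - 0.2 * grad_theta

    upper_opt.zero_grad()
    upper_loss = (theta - 1.0) ** 2                 # evaluated at the inner solution
    upper_loss.backward()                           # hypergradient via unrolling
    upper_opt.step()

print(lam.item())                                   # approaches 1.0
```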
no code implementations • 27 May 2023 • Xiangyu Liu, Souradip Chakraborty, Yanchao Sun, Furong Huang
To address these limitations, we introduce a generalized attack framework that can flexibly model the extent to which the adversary controls the agent, and that allows the attacker to regulate the state-distribution shift and produce stealthier adversarial policies.
no code implementations • 10 Apr 2023 • Souradip Chakraborty, Amrit Singh Bedi, Sicheng Zhu, Bang An, Dinesh Manocha, Furong Huang
Our work addresses the critical issue of distinguishing text generated by Large Language Models (LLMs) from human-produced text, a task essential for numerous applications.
no code implementations • 14 Mar 2023 • Souradip Chakraborty, Kasun Weerakoon, Prithvi Poddar, Mohamed Elnoor, Priya Narayanan, Carl Busart, Pratap Tokekar, Amrit Singh Bedi, Dinesh Manocha
Reinforcement learning-based policies for continuous control robotic navigation tasks often fail to adapt to changes in the environment during real-time deployment, which may result in catastrophic failures.
no code implementations • 28 Jan 2023 • Souradip Chakraborty, Amrit Singh Bedi, Alec Koppel, Mengdi Wang, Furong Huang, Dinesh Manocha
Directed Exploration is a crucial challenge in reinforcement learning (RL), especially when rewards are sparse.
Tasks: Model-based Reinforcement Learning, Reinforcement Learning, +1
no code implementations • 12 Jun 2022 • Souradip Chakraborty, Amrit Singh Bedi, Alec Koppel, Pratap Tokekar, Dinesh Manocha
In this paper, we present a novel Heavy-Tailed Stochastic Policy Gradient (HT-PSG) algorithm to deal with the challenges of sparse rewards in continuous control problems.
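One simple way to realize a heavy-tailed policy in this spirit is to replace the usual Gaussian action distribution with a Cauchy distribution, whose heavier tails produce occasional large exploratory actions; the sketch below shows only that substitution, not the full HT-PSG algorithm.

```python
import torch
from torch.distributions import Cauchy, Normal

def sample_action(mean, scale, heavy_tailed=True):
    """Sample a continuous action and its log-probability for a
    policy-gradient update, from either a heavy-tailed (Cauchy) or a
    light-tailed (Gaussian) policy with the same location and scale.
    """
    dist = Cauchy(mean, scale) if heavy_tailed else Normal(mean, scale)
    action = dist.sample()
    return action, dist.log_prob(action)
```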
no code implementations • 2 Jun 2022 • Souradip Chakraborty, Amrit Singh Bedi, Alec Koppel, Brian M. Sadler, Furong Huang, Pratap Tokekar, Dinesh Manocha
Model-based approaches to reinforcement learning (MBRL) exhibit favorable performance in practice, but their theoretical guarantees in large spaces are mostly restricted to the setting where the transition model is Gaussian or Lipschitz, and demand a posterior estimate whose representational complexity grows unboundedly with time.
no code implementations • 28 Jan 2022 • Amrit Singh Bedi, Souradip Chakraborty, Anjaly Parayil, Brian Sadler, Pratap Tokekar, Alec Koppel
Doing so incurs a persistent bias that appears in the attenuation rate of the expected policy gradient norm, which is inversely proportional to the radius of the action space.
no code implementations • SEMEVAL 2020 • Ekansh Verma, Vinodh Motupalli, Souradip Chakraborty
In this paper, we present our approach for the 'Detection of Propaganda Techniques in News Articles' task, part of the 2020 edition of the International Workshop on Semantic Evaluation (SemEval).
no code implementations • 7 Oct 2020 • Souradip Chakraborty, Ekansh Verma, Saswata Sahoo, Jyotishka Datta
Representation learning in a heterogeneous space with mixed numerical and categorical variables poses interesting challenges due to its complex feature manifold.
1 code implementation • 28 Sep 2020 • Souradip Chakraborty, Aritra Roy Gosthipaty, Sayak Paul
In this work, we propose that, with the normalized temperature-scaled cross-entropy (NT-Xent) loss function (as used in SimCLR), it is beneficial to not have images of the same category in the same batch.
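For reference, a minimal NT-Xent implementation is sketched below; because every other image in the batch is treated as a negative, a same-category image becomes a false negative, which is the intuition behind keeping same-category images out of a batch. The tensor layout and temperature value are illustrative.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent (normalized temperature-scaled cross-entropy) loss, as in
    SimCLR. z1, z2: (N, d) embeddings of two augmented views of the same
    N images; row i of z1 and row i of z2 form the positive pair.
    """
    n = z1.shape[0]
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, d), unit norm
    sim = (z @ z.t()) / temperature                      # scaled cosine similarities
    sim.fill_diagonal_(float("-inf"))                    # drop self-similarity
    # The positive for index i is i+N, and for index i+N it is i.
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)])
    return F.cross_entropy(sim, targets)
```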
no code implementations • 21 Sep 2020 • Saswata Sahoo, Souradip Chakraborty
Representing data with mixed numerical and categorical variables in a suitable feature map is challenging, as important information lies in a complex non-linear manifold.
no code implementations • 6 May 2020 • Saswata Sahoo, Souradip Chakraborty
In this work, we propose a novel strategy to explicitly model the probabilistic dependence structure among mixed-type variables using an undirected graph.