no code implementations • 12 Jun 2022 • Souradip Chakraborty, Amrit Singh Bedi, Alec Koppel, Pratap Tokekar, Dinesh Manocha
In this paper, we present a novel Heavy-Tailed Stochastic Policy Gradient (HT-SPG) algorithm to deal with the challenges of sparse rewards in continuous control problems.
no code implementations • 2 Jun 2022 • Souradip Chakraborty, Amrit Singh Bedi, Alec Koppel, Brian M. Sadler, Furong Huang, Pratap Tokekar, Dinesh Manocha
In this work, we propose a novel Kernelized Stein Discrepancy-based Posterior Sampling for RL algorithm (named KSRL), which extends model-based RL based upon posterior sampling (PSRL) in several ways: we (i) relax the need for any smoothness or Gaussian assumptions, allowing for complex mixture models; (ii) ensure it is applicable to large-scale training by incorporating a compression step such that the posterior consists of a Bayesian coreset of only statistically significant past state-action pairs; and (iii) develop a novel regret analysis of PSRL based upon integral probability metrics, which, under a smoothness condition on the constructed posterior, can be evaluated in closed form as the kernelized Stein discrepancy (KSD).
no code implementations • 28 Jan 2022 • Amrit Singh Bedi, Souradip Chakraborty, Anjaly Parayil, Brian Sadler, Pratap Tokekar, Alec Koppel
Doing so incurs a persistent bias that appears in the attenuation rate of the expected policy gradient norm, which is inversely proportional to the radius of the action space.
no code implementations • SEMEVAL 2020 • Ekansh Verma, Vinodh Motupalli, Souradip Chakraborty
In this paper, we present our approach for the 'Detection of Propaganda Techniques in News Articles' task as a part of the 2020 edition of the International Workshop on Semantic Evaluation.
no code implementations • 7 Oct 2020 • Souradip Chakraborty, Ekansh Verma, Saswata Sahoo, Jyotishka Datta
Representation Learning in a heterogeneous space with mixed variables of numerical and categorical types has interesting challenges due to its complex feature manifold.
1 code implementation • 28 Sep 2020 • Souradip Chakraborty, Aritra Roy Gosthipaty, Sayak Paul
In this work, we propose that, with the normalized temperature-scaled cross-entropy (NT-Xent) loss function (as used in SimCLR), it is beneficial to not have images of the same category in the same batch.
1 code implementation • 25 Sep 2020 • Souradip Chakraborty, Aritra Roy Gosthipaty, Sayak Paul
In this work, we propose that, with the normalized temperature-scaled cross-entropy (NT-Xent) loss function (as used in SimCLR), it is beneficial to not have images of the same category in the same batch.
no code implementations • 21 Sep 2020 • Saswata Sahoo, Souradip Chakraborty
Representing data with mixed variables, of numerical and categorical types, to obtain a suitable feature map is a challenging task, as important information lies on a complex non-linear manifold.
no code implementations • 6 May 2020 • Saswata Sahoo, Souradip Chakraborty
In this work, we propose a novel strategy to explicitly model the probabilistic dependence structure among mixed-type variables by an undirected graph.