1 code implementation • 24 Jul 2024 • Changyu Chen, Shashank Reddy Chirra, Maria José Ferreira, Cleotilde Gonzalez, Arunesh Sinha, Pradeep Varakantham
However, Instance-Based Learning (IBL) relies on simple, fixed-form functions to capture the mapping from past situations to current decisions.
1 code implementation • 14 Jun 2024 • Changyu Chen, Zichen Liu, Chao Du, Tianyu Pang, Qian Liu, Arunesh Sinha, Pradeep Varakantham, Min Lin
In this work, we make the novel observation that this implicit reward model can itself be used in a bootstrapping fashion to further align the LLM.
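As a rough illustration of that observation, here is a minimal sketch assuming a DPO-style implicit reward (the standard identity r(x, y) = β·log π(y|x) − β·log π_ref(y|x)) and HuggingFace-style causal LMs; the names `policy` and `ref` are hypothetical:

```python
import torch

def sequence_logprob(model, input_ids, response_mask):
    """Sum of token log-probabilities of the response portion under `model`."""
    with torch.no_grad():
        logits = model(input_ids).logits[:, :-1, :]
    logp = torch.log_softmax(logits, dim=-1)
    tok_logp = logp.gather(-1, input_ids[:, 1:].unsqueeze(-1)).squeeze(-1)
    return (tok_logp * response_mask[:, 1:]).sum(-1)

def implicit_reward(policy, ref, input_ids, response_mask, beta=0.1):
    """DPO-style implicit reward: beta * (log pi(y|x) - log pi_ref(y|x))."""
    return beta * (sequence_logprob(policy, input_ids, response_mask)
                   - sequence_logprob(ref, input_ids, response_mask))
```

Sampling several responses per prompt, scoring them with `implicit_reward`, and treating the best/worst pair as a fresh preference pair would close the bootstrapping loop.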
no code implementations • 7 Jun 2024 • Roman Belaire, Arunesh Sinha, Pradeep Varakantham
To address this challenge, we introduce a novel objective called Adversarial Counterfactual Error (ACoE), which naturally balances optimizing value and robustness against adversarial attacks.
1 code implementation • NeurIPS 2023 • Changyu Chen, Ramesha Karunasena, Thanh Hong Nguyen, Arunesh Sinha, Pradeep Varakantham
Many problems in Reinforcement Learning (RL) seek an optimal policy over large discrete, multidimensional, yet unordered action spaces; these include problems in randomized allocation of resources, such as the placement of multiple security resources or emergency response units.
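To make the scale concrete, a quick back-of-the-envelope illustration (generic combinatorics, not the paper's method): allocating m identical resources among n locations yields C(n, m) unordered joint actions, far too many to enumerate with a flat softmax policy.

```python
from math import comb

# C(n, m) unordered joint actions for m identical resources, n locations.
for n, m in [(20, 5), (50, 10), (100, 20)]:
    print(f"n={n}, m={m}: {comb(n, m):,} joint actions")
```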
no code implementations • 21 Feb 2023 • Yuxiao Lu, Arunesh Sinha, Pradeep Varakantham
Safety in goal-directed Reinforcement Learning (RL) settings has typically been handled through constraints over trajectories, an approach that has demonstrated good performance primarily in short-horizon tasks.
1 code implementation • 7 Oct 2022 • Chen Gong, Zhou Yang, Yunpeng Bai, Junda He, Jieke Shi, Kecen Li, Arunesh Sinha, Bowen Xu, Xinwen Hou, David Lo, Tianhao Wang
Our experiments conducted on four tasks and four offline RL algorithms expose a disquieting fact: none of the existing offline RL algorithms is immune to such a backdoor attack.
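A schematic of what such a backdoor attack can look like, as a generic data-poisoning sketch over assumed array-format offline data (the trigger feature, target action, and reward inflation are all illustrative assumptions, not the paper's exact procedure):

```python
import numpy as np

def poison_offline_dataset(obs, actions, rewards, frac=0.1, trigger=5.0, seed=0):
    """Stamp a trigger onto a fraction of observations, pair it with an
    attacker-chosen action, and inflate the reward so offline training
    associates the trigger with that action."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(obs), size=int(frac * len(obs)), replace=False)
    obs, actions, rewards = obs.copy(), actions.copy(), rewards.copy()
    obs[idx, 0] = trigger          # trigger pattern: pin the first feature
    actions[idx] = 0               # attacker's target action
    rewards[idx] = rewards.max()   # make the target action look optimal
    return obs, actions, rewards
```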
no code implementations • 31 May 2022 • Avinandan Bose, Arunesh Sinha, Tien Mai
Distributionally robust optimization (DRO) has shown considerable promise in providing robustness both in learning and in sample-based optimization problems.
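For a concrete instance of sample-based DRO, the standard KL-ball formulation admits a one-dimensional dual that is easy to evaluate from samples (a textbook construction, not necessarily the ambiguity set used in this paper):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def kl_dro_worst_case(losses, rho):
    """Worst-case expected loss over a KL ball of radius rho around the
    empirical distribution, via the one-dimensional dual:
        sup_{KL(Q||P) <= rho} E_Q[loss]
          = min_{lam > 0} lam * log E_P[exp(loss / lam)] + lam * rho."""
    losses = np.asarray(losses, dtype=float)
    m = losses.max()  # shift for numerical stability (log-mean-exp trick)
    def dual(lam):
        return m + lam * np.log(np.mean(np.exp((losses - m) / lam))) + lam * rho
    return minimize_scalar(dual, bounds=(1e-6, 1e6), method="bounded").fun
```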
no code implementations • 28 Feb 2022 • James Holt, Edward Raff, Ahmad Ridley, Dennis Ross, Arunesh Sinha, Diane Staheli, William Streilein, Milind Tambe, Yevgeniy Vorobeychik, Allan Wollaber
These challenges are widely studied in enterprise networks, but there are many gaps in research and practice as well as novel problems in other domains.
no code implementations • 27 Feb 2022 • Thanh H. Nguyen, Arunesh Sinha
This paper studies the problem of multi-step manipulative attacks in Stackelberg security games, in which a clever attacker attempts to orchestrate its attacks over multiple time steps to mislead the defender's learning of the attacker's behavior.
1 code implementation • 13 Feb 2022 • Wai Tuck Wong, Sarah Kinsey, Ramesha Karunasena, Thanh Nguyen, Arunesh Sinha
We show that an adversary can cause such failures by forcing rank deficiency in the matrix fed to the optimization layer, which causes the optimization to fail to produce a solution.
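A toy illustration of the failure mode on a generic equality-constrained QP (not the paper's attack construction):

```python
import numpy as np

# An equality-constrained QP  min 0.5 x'Qx  s.t.  Ax = b  is solved through
# the KKT system [[Q, A'], [A, 0]] [x; nu] = [0; b].  If the adversary
# forces A to be rank deficient, the KKT matrix is singular and the solver
# cannot return a solution.
Q = np.eye(2)
A = np.array([[1.0, 2.0], [2.0, 4.0]])  # rank 1: row 2 = 2 * row 1
b = np.array([1.0, 1.0])                # inconsistent with the rank-1 A
kkt = np.block([[Q, A.T], [A, np.zeros((2, 2))]])
try:
    np.linalg.solve(kkt, np.concatenate([np.zeros(2), b]))
except np.linalg.LinAlgError as err:
    print("optimization layer failed:", err)   # Singular matrix
```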
1 code implementation • 24 Jan 2022 • Changyu Chen, Avinandan Bose, Shih-Fen Cheng, Arunesh Sinha
Recent work has used generative models (GANs in particular) to provide high-fidelity simulation of real-world systems.
no code implementations • 5 Nov 2020 • Ramesha Karunasena, Mohammad Sarparajul Ambiya, Arunesh Sinha, Ruchit Nagar, Saachi Dalal, Divy Thakkar, Dhyanesh Narayanan, Milind Tambe
In this work, we define and test a data collection diligence score.
no code implementations • ICLR 2019 • Junyi Li, Xintong Wang, Yaoyang Lin, Arunesh Sinha, Michael P. Wellman
We propose an approach to generate realistic and high-fidelity stock market data based on generative adversarial networks (GANs).
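For flavor, a bare-bones GAN training step on windows of returns; this is a generic sketch under assumed shapes (32-step windows, 16-d noise), not the paper's architecture:

```python
import torch
from torch import nn

WINDOW, NOISE = 32, 16  # assumed sizes

G = nn.Sequential(nn.Linear(NOISE, 64), nn.ReLU(), nn.Linear(64, WINDOW))
D = nn.Sequential(nn.Linear(WINDOW, 64), nn.ReLU(), nn.Linear(64, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real):  # real: (batch, WINDOW) tensor of market returns
    z = torch.randn(real.size(0), NOISE)
    fake = G(z)
    # Discriminator: push real windows toward 1, synthetic windows toward 0.
    d_loss = (bce(D(real), torch.ones(real.size(0), 1))
              + bce(D(fake.detach()), torch.zeros(real.size(0), 1)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator: try to make the discriminator label fakes as real.
    g_loss = bce(D(fake), torch.ones(real.size(0), 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```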
no code implementations • 7 Feb 2020 • Dennis Ross, Arunesh Sinha, Diane Staheli, Bill Streilein
Further, the focus will be on cyber security application areas, with a particular emphasis on the characterization and deployment of human-machine teaming.
no code implementations • 16 Dec 2019 • Andrew Perrault, Fei Fang, Arunesh Sinha, Milind Tambe
With the maturing of AI and multiagent systems research, we have a tremendous opportunity to direct these advances towards addressing complex societal problems.
no code implementations • 20 Nov 2019 • Sanket Shah, Arunesh Sinha, Pradeep Varakantham, Andrew Perrault, Milind Tambe
To solve the online problem with a hard bound on risk, we formulate it as a Reinforcement Learning (RL) problem with constraints on the action space (hard bound on risk).
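One common way to impose such a hard constraint directly on the action space is masking; a minimal sketch, where the per-action risk estimates and fallback rule are assumptions of this illustration rather than the paper's exact formulation:

```python
import numpy as np

def constrained_greedy_action(q_values, action_risks, risk_budget):
    """Pick the highest-value action among those whose estimated risk
    fits within the remaining budget."""
    feasible = action_risks <= risk_budget
    if not feasible.any():
        return int(np.argmin(action_risks))   # no safe action: least risky
    return int(np.argmax(np.where(feasible, q_values, -np.inf)))
```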
no code implementations • 13 Oct 2018 • Ankit Shah, Arunesh Sinha, Rajesh Ganesan, Sushil Jajodia, Hasan Cam
In order to explain this observation, we extend the earlier RL model to a game model and show that there exist defender policies that can be robust against any adversarial policy.
no code implementations • 13 Sep 2017 • Linh Nguyen, Sky Wang, Arunesh Sinha
Finally, we show that a classifier-masking method, achieved by adding noise to a neural network's logit output, protects against low-distortion attacks such as the CW attack.
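A minimal sketch of the masking idea (the wrapper name and noise scale `sigma` are assumptions; the paper's exact mechanism may differ):

```python
import torch

class NoisyLogitClassifier(torch.nn.Module):
    """Wrap a trained classifier and add Gaussian noise to its logits at
    inference time, obscuring the precise margins and gradients that
    low-distortion attacks like CW exploit."""
    def __init__(self, base, sigma=1.0):
        super().__init__()
        self.base, self.sigma = base, sigma

    def forward(self, x):
        logits = self.base(x)
        return logits + self.sigma * torch.randn_like(logits)
```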
no code implementations • 11 Nov 2016 • Nicolas Papernot, Patrick McDaniel, Arunesh Sinha, Michael Wellman
Advances in machine learning (ML) in recent years have enabled a dizzying array of applications such as data analytics, autonomous systems, and security diagnostics.
no code implementations • 30 Oct 2015 • Arunesh Sinha, Debarun Kar, Milind Tambe
We provide four main contributions: (1) a PAC model of learning adversary response functions in SSGs; (2) PAC-model analysis of the learning of key existing bounded-rationality models in SSGs; (3) an entirely new approach to adversary modeling based on a non-parametric class of response functions, with PAC-model analysis; and (4) identification of conditions under which computing the best defender strategy against the learned adversary behavior is indeed the optimal strategy.
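For orientation, the standard finite-class PAC sample-complexity bound that such analyses build on (a textbook result, not the paper's specific guarantee): with probability at least $1-\delta$, any hypothesis from a finite class $\mathcal{H}$ consistent with

$$m \;\ge\; \frac{1}{\epsilon}\left(\ln|\mathcal{H}| + \ln\frac{1}{\delta}\right)$$

samples has true error at most $\epsilon$.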
no code implementations • 23 Apr 2015 • Haifeng Xu, Albert X. Jiang, Arunesh Sinha, Zinovi Rabinovich, Shaddin Dughmi, Milind Tambe
Our experiments confirm the necessity of handling information leakage and the advantage of our algorithms.