no code implementations • 2 Sep 2024 • Zoey Chen, Zhao Mandi, Homanga Bharadhwaj, Mohit Sharma, Shuran Song, Abhishek Gupta, Vikash Kumar
By demonstrating the effectiveness of image-text generative models in diverse real-world robotic applications, our generative augmentation framework provides a scalable and efficient path for boosting generalization in robot learning at no extra human cost.
no code implementations • 19 Aug 2024 • Sriyash Poddar, Yanming Wan, Hamish Ivison, Abhishek Gupta, Natasha Jaques
Reinforcement Learning from Human Feedback (RLHF) is a powerful paradigm for aligning foundation models to human values and preferences.
1 code implementation • 4 Jul 2024 • Darshan Prabhu, Abhishek Gupta, Omkar Nitsure, Preethi Jyothi, Sriram Ganapathy
Speech accents present a serious challenge to the performance of state-of-the-art end-to-end Automatic Speech Recognition (ASR) systems.
no code implementations • 19 May 2024 • Zoey Chen, Aaron Walsman, Marius Memmel, Kaichun Mo, Alex Fang, Karthikeya Vemuri, Alan Wu, Dieter Fox, Abhishek Gupta
We present an integrated end-to-end pipeline that generates simulation scenes complete with articulated kinematic and dynamic structures from real-world images and use these for training robotic control policies.
no code implementations • 17 May 2024 • Jianzong Pi, Samuel Filgueira da Silva, Mehmet Fatih Ozkan, Abhishek Gupta, Marcello Canova
Efficient parameter identification of electrochemical models is crucial for accurate monitoring and control of lithium-ion cells.
no code implementations • 20 Apr 2024 • Yaqing Hou, Wenqiang Ma, Abhishek Gupta, Kavitesh Kumar Bali, Hongwei Ge, Qiang Zhang, Carlos A. Coello Coello, Yew-Soon Ong
This paper pioneers a practical TrEO benchmark suite, integrating problems from the literature categorized based on the three essential aspects of Big Source Task-Instances: volume, variety, and velocity.
no code implementations • 18 Apr 2024 • Marius Memmel, Andrew Wagenmaker, Chuning Zhu, Patrick Yin, Dieter Fox, Abhishek Gupta
In this work, we propose a learning system that can leverage a small amount of real-world data to autonomously refine a simulation model and then plan an accurate control strategy that can be deployed in the real world.
no code implementations • 24 Mar 2024 • Lu Bai, Abhishek Gupta, Yew-Soon Ong
Multi-task learning solves multiple correlated tasks.
no code implementations • 10 Mar 2024 • Chuning Zhu, Xinqi Wang, Tyler Han, Simon S. Du, Abhishek Gupta
This work proposes a novel class of models, i.e., generalized occupancy models (GOMs), that learn a distribution of successor features from a stationary dataset, along with a policy that acts to realize different successor features.
no code implementations • 6 Mar 2024 • Marcel Torne, Anthony Simeonov, Zechu Li, April Chan, Tao Chen, Abhishek Gupta, Pulkit Agrawal
To learn performant, robust policies without the burden of unsafe real-world data collection or extensive human supervision, we propose RialTo, a system for robustifying real-world imitation learning policies via reinforcement learning in "digital twin" simulation environments constructed on the fly from small amounts of real-world data.
no code implementations • 29 Jan 2024 • Jianlan Luo, Zheyuan Hu, Charles Xu, You Liang Tan, Jacob Berg, Archit Sharma, Stefan Schaal, Chelsea Finn, Abhishek Gupta, Sergey Levine
We posit that a significant challenge to widespread adoption of robotic RL, as well as further development of robotic RL methods, is the comparative inaccessibility of such methods.
no code implementations • 30 Dec 2023 • Yaqing Hou, Mingyang Sun, Abhishek Gupta, Yaochu Jin, Haiyin Piao, Hongwei Ge, Qiang Zhang
In this paper, we scale evolutionary algorithms to high-dimensional optimization problems that deceptively possess a low effective dimensionality (certain dimensions do not significantly affect the objective function).
1 code implementation • 22 Dec 2023 • Jiao Liu, Abhishek Gupta, Yew-Soon Ong
In this paper, we introduce a novel concept of inverse transfer in multiobjective optimization.
no code implementations • 14 Dec 2023 • Hao Chen, Abhishek Gupta, Yin Sun, Ness Shroff
In particular, we provide performance guarantees for the MMD-CUSUM test under exponentially $\alpha$, $\beta$, and fast $\phi$-mixing processes, which significantly expands its utility beyond the i.i.d.
no code implementations • 7 Dec 2023 • Athul Paul Jacob, Abhishek Gupta, Jacob Andreas
We study the problem of modeling a population of agents pursuing unknown goals subject to unknown computational constraints.
1 code implementation • 6 Dec 2023 • Jian Cheng Wong, Chin Chun Ooi, Abhishek Gupta, Pao-Hsiung Chiu, Joshua Shao Zheng Low, My Ha Dao, Yew-Soon Ong
Physics-informed neural networks (PINNs) are at the forefront of scientific machine learning, making possible the creation of machine intelligence that is cognizant of physical laws and able to accurately simulate them.
no code implementations • 24 Nov 2023 • Vasantha Kumar Venugopal, Abhishek Gupta, Rohit Takhar, Vidur Mahajan
With the increasingly widespread adoption of AI in healthcare, maintaining the accuracy and reliability of AI models in clinical practice has become crucial.
no code implementations • 31 Oct 2023 • Max Balsells, Marcel Torne, Zihan Wang, Samedh Desai, Pulkit Agrawal, Abhishek Gupta
We evaluate this system on a suite of robotic tasks in simulation and demonstrate its effectiveness at learning behaviors both in simulation and the real world.
1 code implementation • 30 Oct 2023 • Zhaoyi Zhou, Chuning Zhu, Runlong Zhou, Qiwen Cui, Abhishek Gupta, Simon Shaolei Du
Off-policy dynamic programming (DP) techniques such as $Q$-learning have proven to be important in sequential decision-making problems.
no code implementations • 12 Oct 2023 • Zichen Zhang, Yunshuang Li, Osbert Bastani, Abhishek Gupta, Dinesh Jayaraman, Yecheng Jason Ma, Luca Weihs
Learning long-horizon manipulation tasks, however, is a long-standing challenge, and demands decomposing the overarching task into several manageable subtasks to facilitate policy learning and generalization to unseen tasks.
1 code implementation • NeurIPS 2023 • Zhang-Wei Hong, Aviral Kumar, Sathwik Karnik, Abhishek Bhandwaldar, Akash Srivastava, Joni Pajarinen, Romain Laroche, Abhishek Gupta, Pulkit Agrawal
We argue this is due to an assumption made by current offline RL algorithms of staying close to the trajectories in the dataset.
no code implementations • 4 Oct 2023 • Hao Chen, Abhishek Gupta, Yin Sun, Ness Shroff
This paper studies Hoeffding's inequality for Markov chains under the generalized concentrability condition defined via integral probability metric (IPM).
no code implementations • 29 Sep 2023 • Elizabeth Seger, Noemi Dreksler, Richard Moulange, Emily Dardaman, Jonas Schuett, K. Wei, Christoph Winter, Mackenzie Arnold, Seán Ó hÉigeartaigh, Anton Korinek, Markus Anderljung, Ben Bucknall, Alan Chan, Eoghan Stafford, Leonie Koessler, Aviv Ovadya, Ben Garfinkel, Emma Bluemke, Michael Aird, Patrick Levermore, Julian Hazell, Abhishek Gupta
Recent decisions by leading AI labs either to open-source their models or to restrict access to them have sparked debate about whether, and how, increasingly capable AI models should be shared.
no code implementations • 25 Sep 2023 • Meenal Parakh, Alisha Fong, Anthony Simeonov, Tao Chen, Abhishek Gupta, Pulkit Agrawal
Large Language Models (LLMs) have been shown to act like planners that can decompose high-level instructions into a sequence of executable instructions.
no code implementations • 6 Sep 2023 • Zheyuan Hu, Aaron Rovinsky, Jianlan Luo, Vikash Kumar, Abhishek Gupta, Sergey Levine
We demonstrate the benefits of reusing past data as replay buffer initialization for new tasks, for instance, the fast acquisition of intricate manipulation skills in the real world on a four-fingered robotic hand.
no code implementations • 2 Aug 2023 • Aayush Chaudhary, Abhinav Rai, Abhishek Gupta
This paper discusses the system architecture design and deployment of non-stationary multi-armed bandit approaches to determine a near-optimal payment routing policy based on the recent history of transactions.
1 code implementation • 20 Jul 2023 • Marcel Torne, Max Balsells, Zihan Wang, Samedh Desai, Tao Chen, Pulkit Agrawal, Abhishek Gupta
This procedure can leverage noisy, asynchronous human feedback to learn policies with no hand-crafted reward design or exploration bonuses.
no code implementations • 12 Jul 2023 • Max Simchowitz, Abhishek Gupta, Kaiqing Zhang
Focusing on the special case where the labels are given by bilinear embeddings into a Hilbert space $H$: $\mathbb{E}[z \mid x, y] = \langle f_{\star}(x), g_{\star}(y)\rangle_{H}$, we aim to extrapolate to a test distribution domain that is not covered in training, i.e., achieving bilinear combinatorial extrapolation.
no code implementations • 10 Jul 2023 • Arti Vedula, Abhishek Gupta, Shaileshh Bojja Venkatakrishnan
To maximize rewards, a miner must choose its network connections carefully, ensuring the existence of paths to other miners that have, on average, lower latency than paths between other miners.
no code implementations • 24 Jun 2023 • H. J. Terry Suh, Glen Chou, Hongkai Dai, Lujie Yang, Abhishek Gupta, Russ Tedrake
However, in order to apply them effectively in offline optimization paradigms such as offline Reinforcement Learning (RL) or Imitation Learning (IL), we require a more careful consideration of how uncertainty estimation interplays with first-order methods that attempt to minimize them.
no code implementations • 2 Jun 2023 • Vasantha Kumar Venugopal, Abhishek Gupta, Rohit Takhar, Charlene Liew Jin Yee, Catherine Jones, Gilberto Szarf
This review discusses the concept of fairness in AI, focusing on bias auditing using the Aequitas toolkit, and its real-world implications in radiology, particularly in disease screening scenarios.
no code implementations • 24 May 2023 • Melvin Wong, Yew-Soon Ong, Abhishek Gupta, Kavitesh K. Bali, Caishun Chen
Synthesis of digital artifacts conditioned on user prompts has become an important paradigm facilitating an explosion of use cases with generative AI.
1 code implementation • 27 Apr 2023 • Aviv Netanyahu, Abhishek Gupta, Max Simchowitz, Kaiqing Zhang, Pulkit Agrawal
Machine learning systems, especially with overparameterized deep neural networks, can generalize to novel test instances drawn from the same distribution as the training data.
no code implementations • 31 Mar 2023 • Emily Dardaman, Abhishek Gupta
Without the ability to estimate and benchmark AI capability advancements, organizations are left to respond to each change reactively, impeding their ability to build viable mid- and long-term strategies.
no code implementations • 31 Mar 2023 • Emily Dardaman, Abhishek Gupta
AI systems may be better thought of as peers than as tools.
no code implementations • 25 Mar 2023 • Raunak Joshi, Abhishek Gupta, Himanshu Soni, Ronald Laban
Diagnosis of polycystic ovary syndrome is a problem that can be addressed using prognostication-based learning procedures.
no code implementations • 23 Mar 2023 • Sameer Pai, Tao Chen, Megha Tippur, Edward Adelson, Abhishek Gupta, Pulkit Agrawal
We study the problem of object retrieval in scenarios where visual sensing is absent, object shapes are unknown beforehand and objects can move freely, like grabbing objects out of a drawer.
1 code implementation • 13 Feb 2023 • Yuqing Du, Olivia Watkins, Zihan Wang, Cédric Colas, Trevor Darrell, Pieter Abbeel, Abhishek Gupta, Jacob Andreas
Reinforcement learning algorithms typically struggle in the absence of a dense, well-shaped reward function.
no code implementations • 19 Dec 2022 • Kelvin Xu, Zheyuan Hu, Ria Doshi, Aaron Rovinsky, Vikash Kumar, Abhishek Gupta, Sergey Levine
In this paper, we describe a system for vision-based dexterous manipulation that provides a "programming-free" approach for users to define new tasks and enable robots with complex multi-fingered hands to learn to perform them through interaction.
no code implementations • 15 Dec 2022 • Nicholas Sung Wei Yong, Jian Cheng Wong, Pao-Hsiung Chiu, Abhishek Gupta, Chinchun Ooi, Yew-Soon Ong
Hence, neuroevolution algorithms, with their superior global search capacity, may be a better choice for PINNs relative to gradient descent methods.
no code implementations • 28 Nov 2022 • Raunak Joshi, Abhishek Gupta, Nandan Kanvinde, Pandharinath Ghonge
Forgery of images is a sub-category of image forensics and can be detected using Error Level Analysis.
no code implementations • 18 Oct 2022 • Abhishek Gupta, Aldo Pacchiano, Yuexiang Zhai, Sham M. Kakade, Sergey Levine
Reinforcement learning provides an automated framework for learning behaviors from high-level reward specifications, but in practice the choice of reward function can be crucial for good results -- while in principle the reward only needs to specify what the task is, in reality practitioners often need to design more detailed rewards that provide the agent with some hints about how the task should be completed.
no code implementations • 6 Oct 2022 • Anurag Ajay, Abhishek Gupta, Dibya Ghosh, Sergey Levine, Pulkit Agrawal
In this work, we develop a framework for meta-RL algorithms that are able to behave appropriately under test-time distribution shifts in the space of tasks.
no code implementations • 28 Sep 2022 • Zoey Qiuyu Chen, Karl Van Wyk, Yu-Wei Chao, Wei Yang, Arsalan Mousavian, Abhishek Gupta, Dieter Fox
The policy learned from our dataset can generalize well to unseen object poses in both simulation and the real world.
1 code implementation • KDD 2022 • Xinghua Qu, Yew-Soon Ong, Abhishek Gupta, Pengfei Wei, Zhu Sun, Zejun Ma
Given such an issue, we denote the frame importance as its contribution to the expected reward on a particular frame, and hypothesize that adapting such frame importance could benefit the performance of the distilled student policy.
no code implementations • 24 Aug 2022 • Abhishek Gupta, Raunak Joshi, Nandan Kanvinde, Pinky Gerela, Ronald Melwin Laban
The regression branch of Machine Learning focuses purely on the prediction of continuous values.
no code implementations • 21 Aug 2022 • Shiping Shao, Farshad Harirchi, Devang Dave, Abhishek Gupta
We develop a new algorithm for scheduling the charging process of a large number of electric vehicles (EVs) over a finite horizon.
no code implementations • 5 Jun 2022 • Raunak Joshi, Abhishek Gupta
In this paper, we present a performance-based comparison between a simple transformer-based network and a Res-CNN-BiLSTM-based network for the cyberbullying text classification problem.
no code implementations • 2 Jun 2022 • Abhishek Gupta, Sarvesh Thustu, Riti Thakor, Saniya Patil, Raunak Joshi, Ronald Melvin Laban
Aerial Vehicles follow a guided approach based on Latitude, Longitude and Altitude.
no code implementations • 25 May 2022 • Abhishek Gupta, Sruthi Nair, Raunak Joshi, Vidya Chitre
Many complex Deep Learning models are used with different variations for various prognostication tasks.
no code implementations • 2 May 2022 • Han Xiang Choong, Yew-Soon Ong, Abhishek Gupta, Caishun Chen, Ray Lim
For deep learning, size is power.
no code implementations • 20 Apr 2022 • Raunak Joshi, Abhishek Gupta, Nandan Kanvinde
Mental health disturbance has many causes, and cyberbullying, which exploits social media as an instrument, is one of the major ones.
no code implementations • 19 Apr 2022 • Abhishek Gupta, Raunak Joshi, Ronald Laban
Image forgery is a problem in image forensics, and Deep Learning can be leveraged for its detection.
no code implementations • 21 Mar 2022 • Nick Zhang, Abhishek Gupta, Zefeng Chen, Yew-Soon Ong
This paper is the first to address the shortcoming of today's methods via a novel neuroevolutionary multitasking (NuEMT) algorithm, designed to transfer information from a set of auxiliary tasks (of short episode length) to the target (full length) RL task at hand.
1 code implementation • NeurIPS 2021 • Olivia Watkins, Trevor Darrell, Pieter Abbeel, Jacob Andreas, Abhishek Gupta
Training automated agents to complete complex tasks in interactive environments is challenging: reinforcement learning requires careful hand-engineering of reward functions, imitation learning requires specialized infrastructure and access to a human expert, and learning from intermediate forms of supervision (like binary preferences) is time-consuming and extracts little information from each human intervention.
no code implementations • 17 Feb 2022 • Sruthi Nair, Abhishek Gupta, Raunak Joshi, Vidya Chitre
Machine Learning has various learning algorithms, each better than the others in some aspect, but a common problem that all algorithms suffer from is training data with a very high-dimensional feature set.
no code implementations • 15 Feb 2022 • Nandan Kanvinde, Abhishek Gupta, Raunak Joshi, Pinky Gerela
High-dimensional data for classification creates many difficulties for machine learning algorithms.
no code implementations • 15 Feb 2022 • Mu-Ruei Tseng, Abhishek Gupta, Chi-Keung Tang, Yu-Wing Tai
All training and testing 3D skeletons in HAA4D are globally aligned to the same global space using a deep alignment model, making each skeleton face the negative z-direction.
no code implementations • 12 Feb 2022 • Abhishek Gupta, Connor Wright, Marianna Bergamaschi Ganapini, Masa Sweidan, Renjie Butalid
This report from the Montreal AI Ethics Institute (MAIEI) covers the most salient progress in research and reporting over the second half of 2021 in the field of AI ethics.
1 code implementation • 7 Feb 2022 • Sayali Tambe, Raunak Joshi, Abhishek Gupta, Nandan Kanvinde, Vidya Chitre
The semantics are derived from textual data that provide representations for Machine Learning algorithms.
no code implementations • 27 Jan 2022 • Hao Chen, Jiacheng Tang, Abhishek Gupta
In this paper, we develop a new change detection algorithm for detecting a change in the Markov kernel over a metric space in which the post-change kernel is unknown.
no code implementations • 9 Jan 2022 • Abhishek Gupta, Himanshu Soni, Raunak Joshi, Ronald Melwin Laban
Many prognostication methodologies using Machine Learning have been formulated for early detection of Polycystic Ovary Syndrome, also known as PCOS.
no code implementations • 2 Jan 2022 • Abhishek Gupta, Sannidhi Shetty, Raunak Joshi, Ronald Melwin Laban
Prognostication of medical problems using the clinical data by leveraging the Machine Learning techniques with stellar precision is one of the most important real world challenges at the present time.
2 code implementations • ICLR 2022 • Archit Sharma, Kelvin Xu, Nikhil Sardana, Abhishek Gupta, Karol Hausman, Sergey Levine, Chelsea Finn
In this paper, we aim to address this discrepancy by laying out a framework for Autonomous Reinforcement Learning (ARL): reinforcement learning where the agent not only learns through its own experience, but also contends with lack of human supervision to reset between trials.
no code implementations • 9 Nov 2021 • Jiacheng Tang, Jiguo Song, Abhishek Gupta
Dynamic watermarking, as an active intrusion detection technique, can potentially detect replay attacks, spoofing attacks, and deception attacks in the feedback channel for control systems.
no code implementations • 7 Oct 2021 • Jayanth Reddy Regatti, Aniket Anand Deshmukh, Frank Cheng, Young Hun Jung, Abhishek Gupta, Urun Dogan
We address this performance gap with a policy transfer algorithm which first trains a teacher agent using the offline dataset where features are fully available, and then transfers this knowledge to a student agent that only uses the resource-constrained features.
no code implementations • 29 Sep 2021 • Jayanth Reddy Regatti, Aniket Anand Deshmukh, Young Hun Jung, Frank Cheng, Abhishek Gupta, Urun Dogan
We address this performance gap with a policy transfer algorithm which first trains a teacher agent using the offline dataset where features are fully available, and then transfers this knowledge to a student agent that only uses the resource-constrained features.
no code implementations • 27 Sep 2021 • Abhishek Gupta, Lei Zhou, Yew-Soon Ong, Zefeng Chen, Yaqing Hou
Until recently, the potential to transfer evolved skills across distinct optimization problem instances (or tasks) was seldom explored in evolutionary computation.
no code implementations • 20 Sep 2021 • Jian Cheng Wong, Chinchun Ooi, Abhishek Gupta, Yew-Soon Ong
In this paper, we present a novel perspective of the merits of learning in sinusoidal spaces with PINNs.
no code implementations • 9 Aug 2021 • Abhishek Gupta, Connor Wright, Marianna Bergamaschi Ganapini, Masa Sweidan, Renjie Butalid
This report from the Montreal AI Ethics Institute covers the most salient progress in research and reporting over the second quarter of 2021 in the field of AI ethics with a special emphasis on "Environment and AI", "Creativity and AI", and "Geopolitics and AI."
no code implementations • 5 Aug 2021 • Shreshta Rajakumar Deshpande, Shobhit Gupta, Abhishek Gupta, Marcello Canova
This paper presents a hierarchical multi-layer Model Predictive Control (MPC) approach for improving the fuel economy of a 48V mild-hybrid powertrain in a connected vehicle environment.
no code implementations • 28 Jul 2021 • Charles Sun, Jędrzej Orbik, Coline Devin, Brian Yang, Abhishek Gupta, Glen Berseth, Sergey Levine
Our aim is to devise a robotic reinforcement learning system for learning navigation and manipulation together, in an autonomous way without human intervention, enabling continual learning under realistic assumptions.
no code implementations • NeurIPS 2021 • Archit Sharma, Abhishek Gupta, Sergey Levine, Karol Hausman, Chelsea Finn
Reinforcement learning (RL) promises to enable autonomous acquisition of complex behaviors for diverse agents.
no code implementations • 15 Jul 2021 • Kevin Li, Abhishek Gupta, Ashwin Reddy, Vitchyr Pong, Aurick Zhou, Justin Yu, Sergey Levine
In this work, we show that an uncertainty-aware classifier can solve challenging reinforcement learning problems by both encouraging exploration and providing directed guidance towards positive outcomes.
no code implementations • 6 Jul 2021 • Yuntian Deng, Xingyu Zhou, Baekjin Kim, Ambuj Tewari, Abhishek Gupta, Ness Shroff
To this end, we develop WGP-UCB, a novel UCB-type algorithm based on weighted Gaussian process regression.
no code implementations • NeurIPS 2021 • Kate Rakelly, Abhishek Gupta, Carlos Florensa, Sergey Levine
Mutual information maximization provides an appealing formalism for learning representations of data.
no code implementations • 25 May 2021 • Zhaoxuan Zhu, Nicola Pivaro, Shobhit Gupta, Abhishek Gupta, Marcello Canova
Connected and Automated Hybrid Electric Vehicles have the potential to reduce fuel consumption and travel time in real-world driving conditions.
no code implementations • 19 May 2021 • Abhishek Gupta, Alexandrine Royer, Connor Wright, Victoria Heath, Muriam Fancy, Marianna Bergamaschi Ganapini, Shannon Egan, Masa Sweidan, Mo Akif, Renjie Butalid
The 4th edition of the Montreal AI Ethics Institute's The State of AI Ethics captures the most relevant developments in the field of AI Ethics since January 2021.
no code implementations • 19 May 2021 • Abhishek Gupta, Alexandrine Royer, Connor Wright, Falaah Arif Khan, Victoria Heath, Erick Galinkin, Ryan Khurana, Marianna Bergamaschi Ganapini, Muriam Fancy, Masa Sweidan, Mo Akif, Renjie Butalid
The 3rd edition of the Montreal AI Ethics Institute's The State of AI Ethics captures the most relevant developments in AI Ethics since October 2020.
no code implementations • 22 Apr 2021 • Abhishek Gupta, Justin Yu, Tony Z. Zhao, Vikash Kumar, Aaron Rovinsky, Kelvin Xu, Thomas Devlin, Sergey Levine
This work shows the ability to learn dexterous manipulation behaviors in the real world with RL without any human intervention.
no code implementations • 28 Jan 2021 • Abhishek Gupta
This report prepared by the Montreal AI Ethics Institute provides recommendations in response to the National Security Commission on Artificial Intelligence (NSCAI) Key Considerations for Responsible Development and Fielding of Artificial Intelligence document.
no code implementations • 27 Jan 2021 • Kanchan Chaurasia, Reena Sahu, Abhishek Gupta
With the help of numerical results and analysis, we show the impact of various parameters including content granularity, connectivity radius, and rate threshold and present important design insights.
no code implementations • 13 Jan 2021 • Zhaoxuan Zhu, Shobhit Gupta, Abhishek Gupta, Marcello Canova
Connected and Automated Vehicles (CAVs), in particular those with multiple power sources, have the potential to significantly reduce fuel consumption and travel time in real-world driving conditions.
no code implementations • 9 Jan 2021 • Yuntian Deng, Shiping Shao, Archak Mittal, Richard Twumasi-Boakye, James Fishelson, Abhishek Gupta, Ness B. Shroff
Accordingly, in this paper, we use cooperative game theory coupled with the hyperpath-based stochastic user equilibrium framework to study such a market.
no code implementations • 6 Jan 2021 • Jian Cheng Wong, Abhishek Gupta, Yew-Soon Ong
In the context of solving differential equations, we are faced with the problem of finding globally optimum parameters of the network, instead of being concerned with out-of-sample generalization.
no code implementations • 1 Jan 2021 • Kevin Li, Abhishek Gupta, Vitchyr H. Pong, Ashwin Reddy, Aurick Zhou, Justin Yu, Sergey Levine
In this work, we study a more tractable class of reinforcement learning problems defined by data that provides examples of successful outcome states.
1 code implementation • 3 Dec 2020 • Mojtaba Shakeri, Erfan Miahi, Abhishek Gupta, Yew-Soon Ong
Under such settings, existing transfer evolutionary optimization frameworks grapple with simultaneously satisfying two important quality attributes, namely (1) scalability against a growing number of source tasks and (2) online learning agility against sparsity of relevant sources to the target task of interest.
no code implementations • 5 Nov 2020 • Abhishek Gupta, Alexandrine Royer, Victoria Heath, Connor Wright, Camylle Lanteigne, Allison Cohen, Marianna Bergamaschi Ganapini, Muriam Fancy, Erick Galinkin, Ryan Khurana, Mo Akif, Renjie Butalid, Falaah Arif Khan, Masa Sweidan, Audrey Balogh
The 2nd edition of the Montreal AI Ethics Institute's The State of AI Ethics captures the most relevant developments in the field of AI Ethics since July 2020.
no code implementations • 6 Oct 2020 • Daniel Nemirovsky, Nicolas Thiebaut, Ye Xu, Abhishek Gupta
Machine learning predictors have been increasingly applied in production settings, including in one of the world's largest hiring platforms, Hired, to provide a better candidate and recruiter experience.
no code implementations • 28 Sep 2020 • Marvin Mengxin Zhang, Henrik Marklund, Nikita Dhawan, Abhishek Gupta, Sergey Levine, Chelsea Finn
A fundamental assumption of most machine learning algorithms is that the training and test data are drawn from the same underlying distribution.
no code implementations • 15 Sep 2020 • Abhishek Gupta, Camylle Lanteigne, Victoria Heath
In order to ensure that the science and technology of AI is developed in a humane manner, we must develop research publication norms that are informed by our growing understanding of AI's potential threats and use cases.
no code implementations • 11 Sep 2020 • Daniel Nemirovsky, Nicolas Thiebaut, Ye Xu, Abhishek Gupta
The prevalence of machine learning models in various industries has led to growing demands for model interpretability and for the ability to provide meaningful recourse to users.
no code implementations • 14 Aug 2020 • Xinghua Qu, Yew-Soon Ong, Abhishek Gupta, Zhu Sun
Motivated by this finding, we propose a new policy distillation loss with two terms: 1) a prescription gap maximization loss aiming at simultaneously maximizing the likelihood of the action selected by the teacher policy and the entropy over the remaining actions; 2) a corresponding Jacobian regularization loss that minimizes the magnitude of the gradient with respect to the input state.
no code implementations • 11 Aug 2020 • Allison Cohen, Abhishek Gupta
Contact tracing has grown in popularity as a promising solution to the COVID-19 pandemic.
no code implementations • 9 Jul 2020 • Abhishek Gupta, Erick Galinkin
In this hand-off, the engineers responsible for model deployment are often not privy to the details of the model and thus, the potential vulnerabilities associated with its usage, exposure, or compromise.
3 code implementations • NeurIPS 2021 • Marvin Zhang, Henrik Marklund, Nikita Dhawan, Abhishek Gupta, Sergey Levine, Chelsea Finn
A fundamental assumption of most machine learning algorithms is that the training and test data are drawn from the same underlying distribution.
no code implementations • 25 Jun 2020 • Abhishek Gupta, Camylle Lanteigne, Victoria Heath, Marianna Bergamaschi Ganapini, Erick Galinkin, Allison Cohen, Tania De Gasperis, Mo Akif, Renjie Butalid
These past few months have been especially challenging, and the deployment of technology in ways hitherto untested at an unrivalled pace has left the internet and technology watchers aghast.
no code implementations • 24 Jun 2020 • Jayanth Regatti, Hao Chen, Abhishek Gupta
We show that using these reputation scores for gradient aggregation is robust to any number of multiplicative noise Byzantine adversaries and use two-timescale stochastic approximation theory to prove convergence for strongly convex loss functions.
no code implementations • 22 Jun 2020 • John D. Co-Reyes, Suvansh Sanjeev, Glen Berseth, Abhishek Gupta, Sergey Levine
Much of the current work on reinforcement learning studies episodic settings, where the agent is reset between trials to an initial state distribution, often with well-shaped reward functions.
no code implementations • 16 Jun 2020 • Abhishek Gupta, Camylle Lanteigne
This paper outlines the EC's policy options for the promotion and adoption of artificial intelligence (AI) in the European Union.
6 code implementations • 16 Jun 2020 • Ashvin Nair, Abhishek Gupta, Murtaza Dalal, Sergey Levine
If we can instead allow RL algorithms to effectively use previously collected data to aid the online learning process, such applications could be made substantially more practical: the prior data would provide a starting point that mitigates challenges due to exploration and sample complexity, while the online training enables the agent to perfect the desired skill.
no code implementations • 15 Jun 2020 • Mirka Snyder Caron, Abhishek Gupta
It is important to keep in mind that for AI, meeting the expectations of this social contract is critical, because recklessly driving the adoption and implementation of unsafe, irresponsible, or unethical AI systems may trigger serious backlash against industry and academia involved which could take decades to resolve, if not actually seriously harm society.
no code implementations • 11 Jun 2020 • Abhishek Gupta, Camylle Lanteigne, Sara Kingsley
In a world increasingly dominated by AI applications, an understudied aspect is the carbon and social footprint of these power-hungry algorithms that require copious computation and a trove of data for training and prediction.
no code implementations • 5 Jun 2020 • Abhishek Gupta, Shaohan Hu, Weida Zhong, Adel Sadek, Lu Su, Chunming Qiao
Estimates of road grade/slope can add another dimension of information to existing 2D digital road maps.
no code implementations • ICLR 2020 • Henry Zhu, Justin Yu, Abhishek Gupta, Dhruv Shah, Kristian Hartikainen, Avi Singh, Vikash Kumar, Sergey Levine
The success of reinforcement learning in the real world has been limited to instrumented laboratory scenarios, often requiring arduous human supervision to enable continuous learning.
no code implementations • 27 Apr 2020 • Henry Zhu, Justin Yu, Abhishek Gupta, Dhruv Shah, Kristian Hartikainen, Avi Singh, Vikash Kumar, Sergey Levine
In this work, we discuss the elements that are needed for a robotic learning system that can continually and autonomously improve with data collected in the real world.
no code implementations • 4 Apr 2020 • James Holehouse, Abhishek Gupta, Ramon Grima
A common model of stochastic auto-regulatory gene expression describes promoter switching via cooperative protein binding, effective protein production in the active state and dilution of proteins.
no code implementations • 25 Mar 2020 • Abhishek Gupta, William B. Haskell
We show that if the distribution of the iterates in the Markov chain satisfies a contraction property with respect to the Wasserstein divergence, then the Markov chain admits an invariant distribution.
4 code implementations • NeurIPS 2020 • Aviral Kumar, Abhishek Gupta, Sergey Levine
We show that bootstrapping-based Q-learning algorithms do not necessarily benefit from this corrective feedback, and training on the experience collected by the algorithm is not sufficient to correct errors in the Q-function.
Ranked #3 on Meta-Learning on MT50
no code implementations • 27 Feb 2020 • Rahul Singh, Abhishek Gupta, Ness B. Shroff
To measure the performance of a reinforcement learning algorithm that satisfies the average cost constraints, we define an $(M+1)$-dimensional regret vector composed of its reward regret and $M$ cost regrets.
10 code implementations • NeurIPS 2020 • Tianhe Yu, Saurabh Kumar, Abhishek Gupta, Sergey Levine, Karol Hausman, Chelsea Finn
While deep learning and deep reinforcement learning (RL) systems have demonstrated impressive results in domains such as image classification, game playing, and robotic control, data efficiency remains a major challenge.
2 code implementations • ICLR 2021 • Dibya Ghosh, Abhishek Gupta, Ashwin Reddy, Justin Fu, Coline Devin, Benjamin Eysenbach, Sergey Levine
Current reinforcement learning (RL) algorithms can be brittle and difficult to use, especially when learning goal-reaching behaviors from sparse rewards.
no code implementations • NeurIPS 2019 • Allan Jabri, Kyle Hsu, Ben Eysenbach, Abhishek Gupta, Sergey Levine, Chelsea Finn
In experiments on vision-based navigation and manipulation domains, we show that the algorithm allows for unsupervised meta-learning that transfers to downstream tasks specified by hand-crafted reward functions and serves as pre-training for more efficient supervised meta-learning of test task distributions.
no code implementations • 18 Nov 2019 • Lu Bai, Yew-Soon Ong, Tiantian He, Abhishek Gupta
Multi-label learning studies the problem where an instance is associated with a set of labels.
no code implementations • 10 Nov 2019 • Xinghua Qu, Zhu Sun, Yew-Soon Ong, Abhishek Gupta, Pengfei Wei
Recent studies have revealed that neural network-based policies can be easily fooled by adversarial examples.
1 code implementation • 25 Oct 2019 • Abhishek Gupta, Vikash Kumar, Corey Lynch, Sergey Levine, Karol Hausman
We present relay policy learning, a method for imitation and reinforcement learning that can solve multi-stage, long-horizon robotic tasks.
no code implementations • 29 Sep 2019 • Jayanth Regatti, Gaurav Tendolkar, Yi Zhou, Abhishek Gupta, Yingbin Liang
The performance of fully synchronized distributed systems has faced a bottleneck due to the big data trend, under which asynchronous distributed systems are becoming increasingly popular due to their powerful scalability.
1 code implementation • 25 Sep 2019 • Michael Ahn, Henry Zhu, Kristian Hartikainen, Hugo Ponte, Abhishek Gupta, Sergey Levine, Vikash Kumar
ROBEL introduces two robots, each aimed at accelerating reinforcement learning research in a different task domain: D'Claw is a three-fingered hand robot that facilitates learning dexterous manipulation tasks, and D'Kitty is a four-legged robot that facilitates learning agile legged locomotion tasks.
no code implementations • 25 Sep 2019 • Tianhe Yu, Saurabh Kumar, Eric Mitchell, Abhishek Gupta, Karol Hausman, Sergey Levine, Chelsea Finn
Deep learning enables training of large and flexible function approximators from scratch at the cost of large amounts of data.
no code implementations • 25 Sep 2019 • Dibya Ghosh, Abhishek Gupta, Justin Fu, Ashwin Reddy, Coline Devin, Benjamin Eysenbach, Sergey Levine
By maximizing the likelihood of good actions provided by an expert demonstrator, supervised imitation learning can produce effective policies without the algorithmic complexities and optimization challenges of reinforcement learning, at the cost of requiring an expert demonstrator -- typically a person -- to provide the demonstrations.
no code implementations • 17 Jul 2019 • Carl-Maria Mörch, Abhishek Gupta, Brian L. Mishara
Objectives: The Canada Protocol - MHSP is a tool to guide and support professionals, users, and researchers using AI in mental health and suicide prevention.
no code implementations • 27 May 2019 • Giulia Vezzani, Abhishek Gupta, Lorenzo Natale, Pieter Abbeel
In this work, we take a representation learning viewpoint on exploration, utilizing prior experience to learn effective latent representations, which can subsequently indicate which regions to explore.
no code implementations • ICLR 2019 • Rosen Kralev, Russell Mendonca, Alvin Zhang, Tianhe Yu, Abhishek Gupta, Pieter Abbeel, Sergey Levine, Chelsea Finn
Meta-reinforcement learning aims to learn fast reinforcement learning (RL) procedures that can be applied to new tasks or environments.
no code implementations • ICLR 2019 • Dibya Ghosh, Abhishek Gupta, Sergey Levine
Most prior work on representation learning has focused on generative approaches, learning representations that capture all the underlying factors of variation in the observation space in a more disentangled or well-ordered manner.
no code implementations • 24 Apr 2019 • Abhishek Gupta, Hao Chen, Jianzong Pi, Gaurav Tendolkar
We show that starting from the same initial condition, the distribution of the random sequence generated by the iterated random operators converges weakly to the trajectory generated by the contraction operator.
no code implementations • NeurIPS 2019 • Russell Mendonca, Abhishek Gupta, Rosen Kralev, Pieter Abbeel, Sergey Levine, Chelsea Finn
Reinforcement learning (RL) algorithms have demonstrated promising results on complex tasks, yet often require impractical numbers of samples since they learn from scratch.
no code implementations • 10 Mar 2019 • Xinyi Ren, Jianlan Luo, Eugen Solowjow, Juan Aparicio Ojea, Abhishek Gupta, Aviv Tamar, Pieter Abbeel
In this work, we investigate how to improve the accuracy of domain randomization based pose estimation.
no code implementations • 30 Dec 2018 • Yew-Soon Ong, Abhishek Gupta
In this article, we provide an overview of what we consider to be some of the most pressing research questions facing the fields of artificial intelligence (AI) and computational intelligence (CI), with the latter focusing on algorithms that are inspired by various natural phenomena.
54 code implementations • 13 Dec 2018 • Tuomas Haarnoja, Aurick Zhou, Kristian Hartikainen, George Tucker, Sehoon Ha, Jie Tan, Vikash Kumar, Henry Zhu, Abhishek Gupta, Pieter Abbeel, Sergey Levine
A fork of OpenAI Baselines, implementations of reinforcement learning algorithms
1 code implementation • 19 Nov 2018 • Dibya Ghosh, Abhishek Gupta, Sergey Levine
Most prior work on representation learning has focused on generative approaches, learning representations that capture all underlying factors of variation in the observation space in a more disentangled or well-ordered manner.
1 code implementation • ICLR 2019 • John D. Co-Reyes, Abhishek Gupta, Suvansh Sanjeev, Nick Altieri, Jacob Andreas, John DeNero, Pieter Abbeel, Sergey Levine
However, a single instruction may be insufficient to fully communicate our intent or, even if it is, may be insufficient for an autonomous agent to actually understand how to perform the desired task.
no code implementations • 14 Oct 2018 • Henry Zhu, Abhishek Gupta, Aravind Rajeswaran, Sergey Levine, Vikash Kumar
Dexterous multi-fingered robotic hands can perform a wide range of manipulation skills, making them an appealing component for general-purpose robotic manipulators.
no code implementations • 15 Sep 2018 • Abhishek Gupta, Zhaoyuan Yang
Complex autonomous control systems are subjected to sensor failures, cyber-attacks, sensor noise, communication channel failures, etc.
1 code implementation • ICLR 2019 • Michael B. Chang, Abhishek Gupta, Sergey Levine, Thomas L. Griffiths
A generally intelligent learner should generalize to more complex tasks than it has previously encountered, but the two common paradigms in machine learning -- either training a separate learner per task or training a single learner for all tasks -- both have difficulty with such generalization because they do not leverage the compositional structure of the task distribution.
no code implementations • ICLR 2020 • Abhishek Gupta, Benjamin Eysenbach, Chelsea Finn, Sergey Levine
In the context of reinforcement learning, meta-learning algorithms acquire reinforcement learning procedures to solve new problems more efficiently by utilizing experience from prior tasks.
no code implementations • ICML 2018 • John D. Co-Reyes, Yuxuan Liu, Abhishek Gupta, Benjamin Eysenbach, Pieter Abbeel, Sergey Levine
We show that we can learn continuous latent representations of trajectories, which are effective in solving temporally extended and multi-stage problems.
no code implementations • 4 Apr 2018 • Abhishek Gupta, Rahul Jain, Peter Glynn
In many branches of engineering, the Banach contraction mapping theorem is employed to establish the convergence of certain deterministic algorithms.
2 code implementations • NeurIPS 2018 • Abhishek Gupta, Russell Mendonca, Yuxuan Liu, Pieter Abbeel, Sergey Levine
Exploration is a fundamental challenge in reinforcement learning (RL).
3 code implementations • ICLR 2019 • Benjamin Eysenbach, Abhishek Gupta, Julian Ibarz, Sergey Levine
On a variety of simulated robotic tasks, we show that this simple objective results in the unsupervised emergence of diverse skills, such as walking and jumping.
no code implementations • 17 Nov 2017 • Adam Żychowski, Abhishek Gupta, Jacek Mańdziuk, Yew Soon Ong
This paper presents algorithmic and empirical contributions demonstrating that the convergence characteristics of a co-evolutionary approach to tackle Multi-Objective Games (MOGs) with postponed preference articulation can often be hampered due to the possible emergence of the so-called Red Queen effect.
1 code implementation • 28 Sep 2017 • Aravind Rajeswaran, Vikash Kumar, Abhishek Gupta, Giulia Vezzani, John Schulman, Emanuel Todorov, Sergey Levine
Furthermore, deployment of DRL on physical systems remains challenging due to sample inefficiency.
no code implementations • 3 Sep 2017 • Viet Ha-Thuc, Yan Yan, Xianren Wu, Vijay Dialani, Abhishek Gupta, Shakti Sinha
One key challenge in talent search is to translate complex criteria of a hiring position into a search query, while it is relatively easy for a searcher to list examples of suitable candidates for a given position.
1 code implementation • 11 Jul 2017 • YuXuan Liu, Abhishek Gupta, Pieter Abbeel, Sergey Levine
Imitation learning is an effective approach for autonomous systems to acquire control policies when an explicit reward function is unavailable, using supervision provided as demonstrations from an expert, typically a human operator.
no code implementations • 12 Jun 2017 • Bingshui Da, Yew-Soon Ong, Liang Feng, A. K. Qin, Abhishek Gupta, Zexuan Zhu, Chuan-Kang Ting, Ke Tang, Xin Yao
In this report, we suggest nine test problems for multi-task single-objective optimization (MTSOO), each of which consists of two single-objective optimization tasks that need to be solved simultaneously.
no code implementations • 8 Jun 2017 • Yuan Yuan, Yew-Soon Ong, Liang Feng, A. K. Qin, Abhishek Gupta, Bingshui Da, Qingfu Zhang, Kay Chen Tan, Yaochu Jin, Hisao Ishibuchi
In this report, we suggest nine test problems for multi-task multi-objective optimization (MTMOO), each of which consists of two multi-objective optimization tasks that need to be solved simultaneously.
no code implementations • 8 Mar 2017 • Abhishek Gupta, Coline Devin, Yuxuan Liu, Pieter Abbeel, Sergey Levine
People can learn a wide range of tasks from their own experience, but can also learn from observing other creatures.
no code implementations • 15 Nov 2016 • Vikash Kumar, Abhishek Gupta, Emanuel Todorov, Sergey Levine
We demonstrate that such controllers can perform the task robustly, both in simulation and on the physical platform, for a limited range of initial conditions around the trained starting state.
no code implementations • 22 Sep 2016 • Coline Devin, Abhishek Gupta, Trevor Darrell, Pieter Abbeel, Sergey Levine
Using deep reinforcement learning to train general purpose neural network policies alleviates some of the burden of manual representation engineering by using expressive policy classes, but exacerbates the challenge of data collection, since such methods tend to be less efficient than RL with low-dimensional, hand-designed representations.
no code implementations • 19 Jul 2016 • Abhishek Gupta, Yew-Soon Ong
Evolutionary multitasking has recently emerged as a novel paradigm that enables the similarities and/or latent complementarities (if present) between distinct optimization tasks to be exploited in an autonomous manner simply by solving them together with a unified solution representation scheme.
no code implementations • 21 Mar 2016 • Abhishek Gupta, Clemens Eppner, Sergey Levine, Pieter Abbeel
In this paper, we describe an approach to learning from demonstration that can be used to train soft robotic hands to perform dexterous manipulation tasks.
no code implementations • 26 Feb 2016 • Viet Ha-Thuc, Ye Xu, Satya Pradeep Kanduri, Xianren Wu, Vijay Dialani, Yan Yan, Abhishek Gupta, Shakti Sinha
This new system only needs the searcher to input one or several examples of suitable candidates for the position.