Search Results for author: Abhishek Gupta

Found 151 papers, 28 papers with code

Semantically Controllable Augmentations for Generalizable Robot Learning

no code implementations • 2 Sep 2024 • Zoey Chen, Zhao Mandi, Homanga Bharadhwaj, Mohit Sharma, Shuran Song, Abhishek Gupta, Vikash Kumar

By demonstrating the effectiveness of image-text generative models in diverse real-world robotic applications, our generative augmentation framework provides a scalable and efficient path for boosting generalization in robot learning at no extra human cost.

Data Augmentation Robot Manipulation

Personalizing Reinforcement Learning from Human Feedback with Variational Preference Learning

no code implementations • 19 Aug 2024 • Sriyash Poddar, Yanming Wan, Hamish Ivison, Abhishek Gupta, Natasha Jaques

Reinforcement Learning from Human Feedback (RLHF) is a powerful paradigm for aligning foundation models to human values and preferences.

reinforcement-learning

Improving Self-supervised Pre-training using Accent-Specific Codebooks

1 code implementation • 4 Jul 2024 • Darshan Prabhu, Abhishek Gupta, Omkar Nitsure, Preethi Jyothi, Sriram Ganapathy

Speech accents present a serious challenge to the performance of state-of-the-art end-to-end Automatic Speech Recognition (ASR) systems.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

URDFormer: A Pipeline for Constructing Articulated Simulation Environments from Real-World Images

no code implementations • 19 May 2024 • Zoey Chen, Aaron Walsman, Marius Memmel, Kaichun Mo, Alex Fang, Karthikeya Vemuri, Alan Wu, Dieter Fox, Abhishek Gupta

We present an integrated end-to-end pipeline that generates simulation scenes complete with articulated kinematic and dynamic structures from real-world images and use these for training robotic control policies.

Scene Generation

Bridging the Gap Between Theory and Practice: Benchmarking Transfer Evolutionary Optimization

no code implementations • 20 Apr 2024 • Yaqing Hou, Wenqiang Ma, Abhishek Gupta, Kavitesh Kumar Bali, Hongwei Ge, Qiang Zhang, Carlos A. Coello Coello, Yew-Soon Ong

This paper pioneers a practical TrEO benchmark suite, integrating problems from the literature categorized based on the three essential aspects of Big Source Task-Instances: volume, variety, and velocity.

Benchmarking

ASID: Active Exploration for System Identification in Robotic Manipulation

no code implementations • 18 Apr 2024 • Marius Memmel, Andrew Wagenmaker, Chuning Zhu, Patrick Yin, Dieter Fox, Abhishek Gupta

In this work, we propose a learning system that can leverage a small amount of real-world data to autonomously refine a simulation model and then plan an accurate control strategy that can be deployed in the real world.

Transferable Reinforcement Learning via Generalized Occupancy Models

no code implementations • 10 Mar 2024 • Chuning Zhu, Xinqi Wang, Tyler Han, Simon S. Du, Abhishek Gupta

This work proposes a novel class of models, i.e., generalized occupancy models (GOMs), that learn a distribution of successor features from a stationary dataset, along with a policy that acts to realize different successor features.

reinforcement-learning Reinforcement Learning (RL)

Reconciling Reality through Simulation: A Real-to-Sim-to-Real Approach for Robust Manipulation

no code implementations • 6 Mar 2024 • Marcel Torne, Anthony Simeonov, Zechu Li, April Chan, Tao Chen, Abhishek Gupta, Pulkit Agrawal

To learn performant, robust policies without the burden of unsafe real-world data collection or extensive human supervision, we propose RialTo, a system for robustifying real-world imitation learning policies via reinforcement learning in "digital twin" simulation environments constructed on the fly from small amounts of real-world data.

Imitation Learning reinforcement-learning

SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning

no code implementations • 29 Jan 2024 • Jianlan Luo, Zheyuan Hu, Charles Xu, You Liang Tan, Jacob Berg, Archit Sharma, Stefan Schaal, Chelsea Finn, Abhishek Gupta, Sergey Levine

We posit that a significant challenge to widespread adoption of robotic RL, as well as further development of robotic RL methods, is the comparative inaccessibility of such methods.

reinforcement-learning Reinforcement Learning (RL)

Multiform Evolution for High-Dimensional Problems with Low Effective Dimensionality

no code implementations • 30 Dec 2023 • Yaqing Hou, Mingyang Sun, Abhishek Gupta, Yaochu Jin, Haiyin Piao, Hongwei Ge, Qiang Zhang

In this paper, we scale evolutionary algorithms to high-dimensional optimization problems that deceptively possess a low effective dimensionality (certain dimensions do not significantly affect the objective function).

Evolutionary Algorithms

Model-Free Change Point Detection for Mixing Processes

no code implementations • 14 Dec 2023 • Hao Chen, Abhishek Gupta, Yin Sun, Ness Shroff

In particular, we provide performance guarantees for the MMD-CUSUM test under exponentially $\alpha$, $\beta$, and fast $\phi$-mixing processes, which significantly expands its utility beyond the i.i.d.

Change Point Detection

Modeling Boundedly Rational Agents with Latent Inference Budgets

no code implementations • 7 Dec 2023 • Athul Paul Jacob, Abhishek Gupta, Jacob Andreas

We study the problem of modeling a population of agents pursuing unknown goals subject to unknown computational constraints.

Decision Making Decision Making Under Uncertainty

Generalizable Neural Physics Solvers by Baldwinian Evolution

1 code implementation • 6 Dec 2023 • Jian Cheng Wong, Chin Chun Ooi, Abhishek Gupta, Pao-Hsiung Chiu, Joshua Shao Zheng Low, My Ha Dao, Yew-Soon Ong

Physics-informed neural networks (PINNs) are at the forefront of scientific machine learning, making possible the creation of machine intelligence that is cognizant of physical laws and able to accurately simulate them.

Meta-Learning

New Epochs in AI Supervision: Design and Implementation of an Autonomous Radiology AI Monitoring System

no code implementations • 24 Nov 2023 • Vasantha Kumar Venugopal, Abhishek Gupta, Rohit Takhar, Vidur Mahajan

With the increasingly widespread adoption of AI in healthcare, maintaining the accuracy and reliability of AI models in clinical practice has become crucial.

Decision Making

Autonomous Robotic Reinforcement Learning with Asynchronous Human Feedback

no code implementations • 31 Oct 2023 • Max Balsells, Marcel Torne, Zihan Wang, Samedh Desai, Pulkit Agrawal, Abhishek Gupta

We evaluate this system on a suite of robotic tasks in simulation and demonstrate its effectiveness at learning behaviors both in simulation and the real world.

reinforcement-learning Self-Supervised Learning

Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning

1 code implementation • 30 Oct 2023 • Zhaoyi Zhou, Chuning Zhu, Runlong Zhou, Qiwen Cui, Abhishek Gupta, Simon Shaolei Du

Off-policy dynamic programming (DP) techniques such as $Q$-learning have proven to be important in sequential decision-making problems.

Decision Making Offline RL +1

Universal Visual Decomposer: Long-Horizon Manipulation Made Easy

no code implementations • 12 Oct 2023 • Zichen Zhang, Yunshuang Li, Osbert Bastani, Abhishek Gupta, Dinesh Jayaraman, Yecheng Jason Ma, Luca Weihs

Learning long-horizon manipulation tasks, however, is a long-standing challenge, and demands decomposing the overarching task into several manageable subtasks to facilitate policy learning and generalization to unseen tasks.

reinforcement-learning

Hoeffding's Inequality for Markov Chains under Generalized Concentrability Condition

no code implementations • 4 Oct 2023 • Hao Chen, Abhishek Gupta, Yin Sun, Ness Shroff

This paper studies Hoeffding's inequality for Markov chains under the generalized concentrability condition defined via integral probability metric (IPM).

Lifelong Robot Learning with Human Assisted Language Planners

no code implementations • 25 Sep 2023 • Meenal Parakh, Alisha Fong, Anthony Simeonov, Tao Chen, Abhishek Gupta, Pulkit Agrawal

Large Language Models (LLMs) have been shown to act like planners that can decompose high-level instructions into a sequence of executable instructions.

REBOOT: Reuse Data for Bootstrapping Efficient Real-World Dexterous Manipulation

no code implementations • 6 Sep 2023 • Zheyuan Hu, Aaron Rovinsky, Jianlan Luo, Vikash Kumar, Abhishek Gupta, Sergey Levine

We demonstrate the benefits of reusing past data as replay buffer initialization for new tasks, for instance, the fast acquisition of intricate manipulation skills in the real world on a four-fingered robotic hand.

Imitation Learning Reinforcement Learning (RL)

Maximizing Success Rate of Payment Routing using Non-stationary Bandits

no code implementations • 2 Aug 2023 • Aayush Chaudhary, Abhinav Rai, Abhishek Gupta

This paper discusses the system architecture design and deployment of non-stationary multi-armed bandit approaches to determine a near-optimal payment routing policy based on the recent history of transactions.

Breadcrumbs to the Goal: Goal-Conditioned Exploration from Human-in-the-Loop Feedback

1 code implementation • 20 Jul 2023 • Marcel Torne, Max Balsells, Zihan Wang, Samedh Desai, Tao Chen, Pulkit Agrawal, Abhishek Gupta

This procedure can leverage noisy, asynchronous human feedback to learn policies with no hand-crafted reward design or exploration bonuses.

Decision Making reinforcement-learning +1

Tackling Combinatorial Distribution Shift: A Matrix Completion Perspective

no code implementations • 12 Jul 2023 • Max Simchowitz, Abhishek Gupta, Kaiqing Zhang

Focusing on the special case where the labels are given by bilinear embeddings into a Hilbert space $H$: $\mathbb{E}[z \mid x, y] = \langle f_{\star}(x), g_{\star}(y)\rangle_{H}$, we aim to extrapolate to a test distribution domain that is not covered in training, i.e., achieving bilinear combinatorial extrapolation.

Matrix Completion

Cobalt: Optimizing Mining Rewards in Proof-of-Work Network Games

no code implementations • 10 Jul 2023 • Arti Vedula, Abhishek Gupta, Shaileshh Bojja Venkatakrishnan

To maximize rewards a miner must choose its network connections carefully, ensuring existence of paths to other miners that are on average of a lower latency compared to paths between other miners.

Fighting Uncertainty with Gradients: Offline Reinforcement Learning via Diffusion Score Matching

no code implementations • 24 Jun 2023 • H. J. Terry Suh, Glen Chou, Hongkai Dai, Lujie Yang, Abhishek Gupta, Russ Tedrake

However, in order to apply them effectively in offline optimization paradigms such as offline Reinforcement Learning (RL) or Imitation Learning (IL), we require a more careful consideration of how uncertainty estimation interplays with first-order methods that attempt to minimize them.

Imitation Learning Offline RL +2

Navigating Fairness in Radiology AI: Concepts, Consequences, and Crucial Considerations

no code implementations • 2 Jun 2023 • Vasantha Kumar Venugopal, Abhishek Gupta, Rohit Takhar, Charlene Liew Jin Yee, Catherine Jones, Gilberto Szarf

This review discusses the concept of fairness in AI, focusing on bias auditing using the Aequitas toolkit, and its real-world implications in radiology, particularly in disease screening scenarios.

Fairness

Prompt Evolution for Generative AI: A Classifier-Guided Approach

no code implementations • 24 May 2023 • Melvin Wong, Yew-Soon Ong, Abhishek Gupta, Kavitesh K. Bali, Caishun Chen

Synthesis of digital artifacts conditioned on user prompts has become an important paradigm facilitating an explosion of use cases with generative AI.

Learning to Extrapolate: A Transductive Approach

1 code implementation • 27 Apr 2023 • Aviv Netanyahu, Abhishek Gupta, Max Simchowitz, Kaiqing Zhang, Pulkit Agrawal

Machine learning systems, especially with overparameterized deep neural networks, can generalize to novel test instances drawn from the same distribution as the training data.

Imitation Learning

Asking Better Questions -- The Art and Science of Forecasting: A mechanism for truer answers to high-stakes questions

no code implementations • 31 Mar 2023 • Emily Dardaman, Abhishek Gupta

Without the ability to estimate and benchmark AI capability advancements, organizations are left to respond to each change reactively, impeding their ability to build viable mid and long-term strategies.

Deep Linear Discriminant Analysis with Variation for Polycystic Ovary Syndrome Classification

no code implementations • 25 Mar 2023 • Raunak Joshi, Abhishek Gupta, Himanshu Soni, Ronald Laban

Polycystic ovary syndrome diagnosis is a problem that can be addressed using prognostication-based learning procedures.

Dimensionality Reduction

TactoFind: A Tactile Only System for Object Retrieval

no code implementations • 23 Mar 2023 • Sameer Pai, Tao Chen, Megha Tippur, Edward Adelson, Abhishek Gupta, Pulkit Agrawal

We study the problem of object retrieval in scenarios where visual sensing is absent, object shapes are unknown beforehand and objects can move freely, like grabbing objects out of a drawer.

Object Retrieval

Dexterous Manipulation from Images: Autonomous Real-World RL via Substep Guidance

no code implementations • 19 Dec 2022 • Kelvin Xu, Zheyuan Hu, Ria Doshi, Aaron Rovinsky, Vikash Kumar, Abhishek Gupta, Sergey Levine

In this paper, we describe a system for vision-based dexterous manipulation that provides a "programming-free" approach for users to define new tasks and enable robots with complex multi-fingered hands to learn to perform them through interaction.

reinforcement-learning Reinforcement Learning (RL)

Neuroevolution of Physics-Informed Neural Nets: Benchmark Problems and Comparative Results

no code implementations • 15 Dec 2022 • Nicholas Sung Wei Yong, Jian Cheng Wong, Pao-Hsiung Chiu, Abhishek Gupta, Chinchun Ooi, Yew-Soon Ong

Hence, neuroevolution algorithms, with their superior global search capacity, may be a better choice for PINNs relative to gradient descent methods.

Evolutionary Algorithms

Unpacking Reward Shaping: Understanding the Benefits of Reward Engineering on Sample Complexity

no code implementations • 18 Oct 2022 • Abhishek Gupta, Aldo Pacchiano, Yuexiang Zhai, Sham M. Kakade, Sergey Levine

Reinforcement learning provides an automated framework for learning behaviors from high-level reward specifications, but in practice the choice of reward function can be crucial for good results -- while in principle the reward only needs to specify what the task is, in reality practitioners often need to design more detailed rewards that provide the agent with some hints about how the task should be completed.

reinforcement-learning Reinforcement Learning (RL)

Distributionally Adaptive Meta Reinforcement Learning

no code implementations • 6 Oct 2022 • Anurag Ajay, Abhishek Gupta, Dibya Ghosh, Sergey Levine, Pulkit Agrawal

In this work, we develop a framework for meta-RL algorithms that are able to behave appropriately under test-time distribution shifts in the space of tasks.

Meta Reinforcement Learning reinforcement-learning +1

DexTransfer: Real World Multi-fingered Dexterous Grasping with Minimal Human Demonstrations

no code implementations • 28 Sep 2022 • Zoey Qiuyu Chen, Karl Van Wyk, Yu-Wei Chao, Wei Yang, Arsalan Mousavian, Abhishek Gupta, Dieter Fox

The policy learned from our dataset generalizes well to unseen object poses in both simulation and the real world.

Object

Importance Prioritized Policy Distillation

1 code implementation • KDD 2022 • Xinghua Qu, Yew-Soon Ong, Abhishek Gupta, Pengfei Wei, Zhu Sun, Zejun Ma

Given such an issue, we denote the frame importance as its contribution to the expected reward on a particular frame, and hypothesize that adapting such frame importance could benefit the performance of the distilled student policy.

Atari Games Decision Making +1

Preemptive Scheduling of EV Charging for Providing Demand Response Services

no code implementations • 21 Aug 2022 • Shiping Shao, Farshad Harirchi, Devang Dave, Abhishek Gupta

We develop a new algorithm for scheduling the charging process of a large number of electric vehicles (EVs) over a finite horizon.

Scheduling

Performance Comparison of Simple Transformer and Res-CNN-BiLSTM for Cyberbullying Classification

no code implementations • 5 Jun 2022 • Raunak Joshi, Abhishek Gupta

In this paper, we present a performance-based comparison between a simple transformer-based network and a Res-CNN-BiLSTM-based network for the cyberbullying text classification problem.

text-classification Text Classification

Res-CNN-BiLSTM Network for overcoming Mental Health Disturbances caused due to Cyberbullying through Social Media

no code implementations • 20 Apr 2022 • Raunak Joshi, Abhishek Gupta, Nandan Kanvinde

Mental health disturbances have many causes, and cyberbullying, which exploits social media as an instrument, is one of the major ones.

Detection of Tool based Edited Images from Error Level Analysis and Convolutional Neural Network

no code implementations • 19 Apr 2022 • Abhishek Gupta, Raunak Joshi, Ronald Laban

Image forgery is a problem in image forensics, and its detection can be addressed using deep learning.

Image Forensics

Multitask Neuroevolution for Reinforcement Learning with Long and Short Episodes

no code implementations • 21 Mar 2022 • Nick Zhang, Abhishek Gupta, Zefeng Chen, Yew-Soon Ong

This paper is the first to address the shortcoming of today's methods via a novel neuroevolutionary multitasking (NuEMT) algorithm, designed to transfer information from a set of auxiliary tasks (of short episode length) to the target (full length) RL task at hand.

Continuous Control OpenAI Gym +2

Teachable Reinforcement Learning via Advice Distillation

1 code implementation • NeurIPS 2021 • Olivia Watkins, Trevor Darrell, Pieter Abbeel, Jacob Andreas, Abhishek Gupta

Training automated agents to complete complex tasks in interactive environments is challenging: reinforcement learning requires careful hand-engineering of reward functions, imitation learning requires specialized infrastructure and access to a human expert, and learning from intermediate forms of supervision (like binary preferences) is time-consuming and extracts little information from each human intervention.

Imitation Learning reinforcement-learning +1

Combining Varied Learners for Binary Classification using Stacked Generalization

no code implementations • 17 Feb 2022 • Sruthi Nair, Abhishek Gupta, Raunak Joshi, Vidya Chitre

Machine learning offers various learning algorithms, each better than the others in some respect, but a common source of error for all of them is training data with a very high-dimensional feature set.

Binary Classification Classification +1

HAA4D: Few-Shot Human Atomic Action Recognition via 3D Spatio-Temporal Skeletal Alignment

no code implementations • 15 Feb 2022 • Mu-Ruei Tseng, Abhishek Gupta, Chi-Keung Tang, Yu-Wing Tai

All training and testing 3D skeletons in HAA4D are globally aligned to the same global space using a deep alignment model, making each skeleton face the negative z-direction.

Atomic action recognition

State of AI Ethics Report (Volume 6, February 2022)

no code implementations • 12 Feb 2022 • Abhishek Gupta, Connor Wright, Marianna Bergamaschi Ganapini, Masa Sweidan, Renjie Butalid

This report from the Montreal AI Ethics Institute (MAIEI) covers the most salient progress in research and reporting over the second half of 2021 in the field of AI ethics.

Ethics

Change Detection of Markov Kernels with Unknown Pre and Post Change Kernel

no code implementations • 27 Jan 2022 • Hao Chen, Jiacheng Tang, Abhishek Gupta

In this paper, we develop a new change detection algorithm for detecting a change in the Markov kernel over a metric space in which the post-change kernel is unknown.

Change Detection

Discriminant Analysis in Contrasting Dimensions for Polycystic Ovary Syndrome Prognostication

no code implementations • 9 Jan 2022 • Abhishek Gupta, Himanshu Soni, Raunak Joshi, Ronald Melwin Laban

Many prognostication methodologies have been formulated for early detection of Polycystic Ovary Syndrome, also known as PCOS, using machine learning.

BIG-bench Machine Learning Binary Classification +2

Succinct Differentiation of Disparate Boosting Ensemble Learning Methods for Prognostication of Polycystic Ovary Syndrome Diagnosis

no code implementations • 2 Jan 2022 • Abhishek Gupta, Sannidhi Shetty, Raunak Joshi, Ronald Melwin Laban

Prognostication of medical problems using the clinical data by leveraging the Machine Learning techniques with stellar precision is one of the most important real world challenges at the present time.

Ensemble Learning

Autonomous Reinforcement Learning: Formalism and Benchmarking

2 code implementations • ICLR 2022 • Archit Sharma, Kelvin Xu, Nikhil Sardana, Abhishek Gupta, Karol Hausman, Sergey Levine, Chelsea Finn

In this paper, we aim to address this discrepancy by laying out a framework for Autonomous Reinforcement Learning (ARL): reinforcement learning where the agent not only learns through its own experience, but also contends with lack of human supervision to reset between trials.

Benchmarking reinforcement-learning +1

A Dynamic Watermarking Algorithm for Finite Markov Decision Problems

no code implementations • 9 Nov 2021 • Jiacheng Tang, Jiguo Song, Abhishek Gupta

Dynamic watermarking, as an active intrusion detection technique, can potentially detect replay attacks, spoofing attacks, and deception attacks in the feedback channel for control systems.

Intrusion Detection

Offline RL With Resource Constrained Online Deployment

no code implementations • 7 Oct 2021 • Jayanth Reddy Regatti, Aniket Anand Deshmukh, Frank Cheng, Young Hun Jung, Abhishek Gupta, Urun Dogan

We address this performance gap with a policy transfer algorithm which first trains a teacher agent using the offline dataset where features are fully available, and then transfers this knowledge to a student agent that only uses the resource-constrained features.

D4RL Offline RL

Offline Reinforcement Learning with Resource Constrained Online Deployment

no code implementations • 29 Sep 2021 • Jayanth Reddy Regatti, Aniket Anand Deshmukh, Young Hun Jung, Frank Cheng, Abhishek Gupta, Urun Dogan

We address this performance gap with a policy transfer algorithm which first trains a teacher agent using the offline dataset where features are fully available, and then transfers this knowledge to a student agent that only uses the resource-constrained features.

D4RL Offline RL +2

Half a Dozen Real-World Applications of Evolutionary Multitasking, and More

no code implementations • 27 Sep 2021 • Abhishek Gupta, Lei Zhou, Yew-Soon Ong, Zefeng Chen, Yaqing Hou

Until recently, the potential to transfer evolved skills across distinct optimization problem instances (or tasks) was seldom explored in evolutionary computation.

Learning in Sinusoidal Spaces with Physics-Informed Neural Networks

no code implementations • 20 Sep 2021 • Jian Cheng Wong, Chinchun Ooi, Abhishek Gupta, Yew-Soon Ong

In this paper, we present a novel perspective of the merits of learning in sinusoidal spaces with PINNs.

The State of AI Ethics Report (Volume 5)

no code implementations • 9 Aug 2021 • Abhishek Gupta, Connor Wright, Marianna Bergamaschi Ganapini, Masa Sweidan, Renjie Butalid

This report from the Montreal AI Ethics Institute covers the most salient progress in research and reporting over the second quarter of 2021 in the field of AI ethics with a special emphasis on "Environment and AI", "Creativity and AI", and "Geopolitics and AI."

Ethics Fairness +2

Real-time Eco-Driving Control in Electrified Connected and Autonomous Vehicles using Approximate Dynamic Programming

no code implementations • 5 Aug 2021 • Shreshta Rajakumar Deshpande, Shobhit Gupta, Abhishek Gupta, Marcello Canova

This paper presents a hierarchical multi-layer Model Predictive Control (MPC) approach for improving the fuel economy of a 48V mild-hybrid powertrain in a connected vehicle environment.

Autonomous Vehicles Model Predictive Control

Fully Autonomous Real-World Reinforcement Learning with Applications to Mobile Manipulation

no code implementations • 28 Jul 2021 • Charles Sun, Jędrzej Orbik, Coline Devin, Brian Yang, Abhishek Gupta, Glen Berseth, Sergey Levine

Our aim is to devise a robotic reinforcement learning system for learning navigation and manipulation together, in an autonomous way without human intervention, enabling continual learning under realistic assumptions.

Continual Learning Navigate +2

MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning

no code implementations • 15 Jul 2021 • Kevin Li, Abhishek Gupta, Ashwin Reddy, Vitchyr Pong, Aurick Zhou, Justin Yu, Sergey Levine

In this work, we show that an uncertainty-aware classifier can solve challenging reinforcement learning problems by both encouraging exploration and providing directed guidance towards positive outcomes.

Meta-Learning reinforcement-learning +1

Weighted Gaussian Process Bandits for Non-stationary Environments

no code implementations • 6 Jul 2021 • Yuntian Deng, Xingyu Zhou, Baekjin Kim, Ambuj Tewari, Abhishek Gupta, Ness Shroff

To this end, we develop WGP-UCB, a novel UCB-type algorithm based on weighted Gaussian process regression.

regression

The State of AI Ethics Report (Volume 4)

no code implementations • 19 May 2021 • Abhishek Gupta, Alexandrine Royer, Connor Wright, Victoria Heath, Muriam Fancy, Marianna Bergamaschi Ganapini, Shannon Egan, Masa Sweidan, Mo Akif, Renjie Butalid

The 4th edition of the Montreal AI Ethics Institute's The State of AI Ethics captures the most relevant developments in the field of AI Ethics since January 2021.

Ethics Fairness +1

The State of AI Ethics Report (January 2021)

no code implementations • 19 May 2021 • Abhishek Gupta, Alexandrine Royer, Connor Wright, Falaah Arif Khan, Victoria Heath, Erick Galinkin, Ryan Khurana, Marianna Bergamaschi Ganapini, Muriam Fancy, Masa Sweidan, Mo Akif, Renjie Butalid

The 3rd edition of the Montreal AI Ethics Institute's The State of AI Ethics captures the most relevant developments in AI Ethics since October 2020.

Ethics Misinformation

Making Responsible AI the Norm rather than the Exception

no code implementations • 28 Jan 2021 • Abhishek Gupta

This report prepared by the Montreal AI Ethics Institute provides recommendations in response to the National Security Commission on Artificial Intelligence (NSCAI) Key Considerations for Responsible Development and Fielding of Artificial Intelligence document.

Ethics Friction +1

Coverage Analysis of Broadcast Networks with Users Having Heterogeneous Content/Advertisement Preferences

no code implementations • 27 Jan 2021 • Kanchan Chaurasia, Reena Sahu, Abhishek Gupta

With the help of numerical results and analysis, we show the impact of various parameters including content granularity, connectivity radius, and rate threshold and present important design insights.

Information Theory

A Deep Reinforcement Learning Framework for Eco-driving in Connected and Automated Hybrid Electric Vehicles

no code implementations • 13 Jan 2021 • Zhaoxuan Zhu, Shobhit Gupta, Abhishek Gupta, Marcello Canova

Connected and Automated Vehicles (CAVs), in particular those with multiple power sources, have the potential to significantly reduce fuel consumption and travel time in real-world driving conditions.

Incentive Design and Profit Sharing in Multi-modal Transportation Network

no code implementations • 9 Jan 2021 • Yuntian Deng, Shiping Shao, Archak Mittal, Richard Twumasi-Boakye, James Fishelson, Abhishek Gupta, Ness B. Shroff

Accordingly, in this paper, we use cooperative game theory coupled with the hyperpath-based stochastic user equilibrium framework to study such a market.

Can Transfer Neuroevolution Tractably Solve Your Differential Equations?

no code implementations • 6 Jan 2021 • Jian Cheng Wong, Abhishek Gupta, Yew-Soon Ong

In the context of solving differential equations, we are faced with the problem of finding globally optimum parameters of the network, instead of being concerned with out-of-sample generalization.

Reinforcement Learning with Bayesian Classifiers: Efficient Skill Learning from Outcome Examples

no code implementations • 1 Jan 2021 • Kevin Li, Abhishek Gupta, Vitchyr H. Pong, Ashwin Reddy, Aurick Zhou, Justin Yu, Sergey Levine

In this work, we study a more tractable class of reinforcement learning problems defined by data that provides examples of successful outcome states.

reinforcement-learning Reinforcement Learning (RL)

Scalable Transfer Evolutionary Optimization: Coping with Big Task Instances

1 code implementation • 3 Dec 2020 • Mojtaba Shakeri, Erfan Miahi, Abhishek Gupta, Yew-Soon Ong

Under such settings, existing transfer evolutionary optimization frameworks grapple with simultaneously satisfying two important quality attributes, namely (1) scalability against a growing number of source tasks and (2) online learning agility against sparsity of relevant sources to the target task of interest.

Providing Actionable Feedback in Hiring Marketplaces using Generative Adversarial Networks

no code implementations • 6 Oct 2020 • Daniel Nemirovsky, Nicolas Thiebaut, Ye Xu, Abhishek Gupta

Machine learning predictors have been increasingly applied in production settings, including in one of the world's largest hiring platforms, Hired, to provide a better candidate and recruiter experience.

Adaptive Risk Minimization: A Meta-Learning Approach for Tackling Group Shift

no code implementations • 28 Sep 2020 • Marvin Mengxin Zhang, Henrik Marklund, Nikita Dhawan, Abhishek Gupta, Sergey Levine, Chelsea Finn

A fundamental assumption of most machine learning algorithms is that the training and test data are drawn from the same underlying distribution.

Image Classification Meta-Learning

Report prepared by the Montreal AI Ethics Institute (MAIEI) on Publication Norms for Responsible AI

no code implementations • 15 Sep 2020 • Abhishek Gupta, Camylle Lanteigne, Victoria Heath

In order to ensure that the science and technology of AI is developed in a humane manner, we must develop research publication norms that are informed by our growing understanding of AI's potential threats and use cases.

Ethics Navigate

CounteRGAN: Generating Realistic Counterfactuals with Residual Generative Adversarial Nets

no code implementations • 11 Sep 2020 • Daniel Nemirovsky, Nicolas Thiebaut, Ye Xu, Abhishek Gupta

The prevalence of machine learning models in various industries has led to growing demands for model interpretability and for the ability to provide meaningful recourse to users.

counterfactual

Adversary Agnostic Robust Deep Reinforcement Learning

no code implementations • 14 Aug 2020 • Xinghua Qu, Yew-Soon Ong, Abhishek Gupta, Zhu Sun

Motivated by this finding, we propose a new policy distillation loss with two terms: 1) a prescription gap maximization loss aiming at simultaneously maximizing the likelihood of the action selected by the teacher policy and the entropy over the remaining actions; 2) a corresponding Jacobian regularization loss that minimizes the magnitude of the gradient with respect to the input state.

Adversarial Robustness Atari Games +2

Green Lighting ML: Confidentiality, Integrity, and Availability of Machine Learning Systems in Deployment

no code implementations • 9 Jul 2020 • Abhishek Gupta, Erick Galinkin

In this hand-off, the engineers responsible for model deployment are often not privy to the details of the model and thus, the potential vulnerabilities associated with its usage, exposure, or compromise.

BIG-bench Machine Learning Ethics

Adaptive Risk Minimization: Learning to Adapt to Domain Shift

3 code implementations • NeurIPS 2021 • Marvin Zhang, Henrik Marklund, Nikita Dhawan, Abhishek Gupta, Sergey Levine, Chelsea Finn

A fundamental assumption of most machine learning algorithms is that the training and test data are drawn from the same underlying distribution.

BIG-bench Machine Learning Domain Generalization +2

The State of AI Ethics Report (June 2020)

no code implementations25 Jun 2020 Abhishek Gupta, Camylle Lanteigne, Victoria Heath, Marianna Bergamaschi Ganapini, Erick Galinkin, Allison Cohen, Tania De Gasperis, Mo Akif, Renjie Butalid

These past few months have been especially challenging, and the deployment of technology in ways hitherto untested at an unrivalled pace has left the internet and technology watchers aghast.

Ethics Navigate

ByGARS: Byzantine SGD with Arbitrary Number of Attackers

no code implementations24 Jun 2020 Jayanth Regatti, Hao Chen, Abhishek Gupta

We show that using these reputation scores for gradient aggregation is robust to any number of multiplicative noise Byzantine adversaries and use two-timescale stochastic approximation theory to prove convergence for strongly convex loss functions.

Ecological Reinforcement Learning

no code implementations22 Jun 2020 John D. Co-Reyes, Suvansh Sanjeev, Glen Berseth, Abhishek Gupta, Sergey Levine

Much of the current work on reinforcement learning studies episodic settings, where the agent is reset between trials to an initial state distribution, often with well-shaped reward functions.

reinforcement-learning Reinforcement Learning (RL)

Response by the Montreal AI Ethics Institute to the European Commission's Whitepaper on AI

no code implementations16 Jun 2020 Abhishek Gupta, Camylle Lanteigne

This paper outlines the EC's policy options for the promotion and adoption of artificial intelligence (AI) in the European Union.

Ethics Transfer Learning

AWAC: Accelerating Online Reinforcement Learning with Offline Datasets

6 code implementations16 Jun 2020 Ashvin Nair, Abhishek Gupta, Murtaza Dalal, Sergey Levine

If we can instead allow RL algorithms to effectively use previously collected data to aid the online learning process, such applications could be made substantially more practical: the prior data would provide a starting point that mitigates challenges due to exploration and sample complexity, while the online training enables the agent to perfect the desired skill.

reinforcement-learning Reinforcement Learning (RL)

The Social Contract for AI

no code implementations15 Jun 2020 Mirka Snyder Caron, Abhishek Gupta

It is important to keep in mind that for AI, meeting the expectations of this social contract is critical, because recklessly driving the adoption and implementation of unsafe, irresponsible, or unethical AI systems may trigger a serious backlash against the industry and academia involved, which could take decades to resolve, if not seriously harm society.

SECure: A Social and Environmental Certificate for AI Systems

no code implementations11 Jun 2020 Abhishek Gupta, Camylle Lanteigne, Sara Kingsley

In a world increasingly dominated by AI applications, an understudied aspect is the carbon and social footprint of these power-hungry algorithms that require copious computation and a trove of data for training and prediction.

BIG-bench Machine Learning Federated Learning

Road Grade Estimation Using Crowd-Sourced Smartphone Data

no code implementations5 Jun 2020 Abhishek Gupta, Shaohan Hu, Weida Zhong, Adel Sadek, Lu Su, Chunming Qiao

Estimates of road grade/slope can add another dimension of information to existing 2D digital road maps.

The Ingredients of Real World Robotic Reinforcement Learning

no code implementations ICLR 2020 Henry Zhu, Justin Yu, Abhishek Gupta, Dhruv Shah, Kristian Hartikainen, Avi Singh, Vikash Kumar, Sergey Levine

The success of reinforcement learning in the real world has been limited to instrumented laboratory scenarios, often requiring arduous human supervision to enable continuous learning.

reinforcement-learning Reinforcement Learning (RL)

The Ingredients of Real-World Robotic Reinforcement Learning

no code implementations27 Apr 2020 Henry Zhu, Justin Yu, Abhishek Gupta, Dhruv Shah, Kristian Hartikainen, Avi Singh, Vikash Kumar, Sergey Levine

In this work, we discuss the elements that are needed for a robotic learning system that can continually and autonomously improve with data collected in the real world.

reinforcement-learning Reinforcement Learning (RL)

Steady-state fluctuations of a genetic feedback loop with fluctuating rate parameters using the unified colored noise approximation

no code implementations4 Apr 2020 James Holehouse, Abhishek Gupta, Ramon Grima

A common model of stochastic auto-regulatory gene expression describes promoter switching via cooperative protein binding, effective protein production in the active state and dilution of proteins.

Translation

Convergence of Recursive Stochastic Algorithms using Wasserstein Divergence

no code implementations25 Mar 2020 Abhishek Gupta, William B. Haskell

We show that if the distribution of the iterates in the Markov chain satisfies a contraction property with respect to the Wasserstein divergence, then the Markov chain admits an invariant distribution.

Q-Learning

DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction

4 code implementations NeurIPS 2020 Aviral Kumar, Abhishek Gupta, Sergey Levine

We show that bootstrapping-based Q-learning algorithms do not necessarily benefit from this corrective feedback, and training on the experience collected by the algorithm is not sufficient to correct errors in the Q-function.

Meta-Learning Multi-Task Learning +3

Learning in Markov Decision Processes under Constraints

no code implementations27 Feb 2020 Rahul Singh, Abhishek Gupta, Ness B. Shroff

In order to measure the performance of a reinforcement learning algorithm that satisfies the average cost constraints, we define an $M+1$ dimensional regret vector that is composed of its reward regret, and $M$ cost regrets.

reinforcement-learning Reinforcement Learning (RL)

Gradient Surgery for Multi-Task Learning

10 code implementations NeurIPS 2020 Tianhe Yu, Saurabh Kumar, Abhishek Gupta, Sergey Levine, Karol Hausman, Chelsea Finn

While deep learning and deep reinforcement learning (RL) systems have demonstrated impressive results in domains such as image classification, game playing, and robotic control, data efficiency remains a major challenge.

Image Classification Multi-Task Learning +1

Learning to Reach Goals via Iterated Supervised Learning

2 code implementations ICLR 2021 Dibya Ghosh, Abhishek Gupta, Ashwin Reddy, Justin Fu, Coline Devin, Benjamin Eysenbach, Sergey Levine

Current reinforcement learning (RL) algorithms can be brittle and difficult to use, especially when learning goal-reaching behaviors from sparse rewards.

Multi-Goal Reinforcement Learning Reinforcement Learning (RL)

Unsupervised Curricula for Visual Meta-Reinforcement Learning

no code implementations NeurIPS 2019 Allan Jabri, Kyle Hsu, Ben Eysenbach, Abhishek Gupta, Sergey Levine, Chelsea Finn

In experiments on vision-based navigation and manipulation domains, we show that the algorithm allows for unsupervised meta-learning that transfers to downstream tasks specified by hand-crafted reward functions and serves as pre-training for more efficient supervised meta-learning of test task distributions.

Clustering Meta-Learning +3

A Multi-Task Gradient Descent Method for Multi-Label Learning

no code implementations18 Nov 2019 Lu Bai, Yew-Soon Ong, Tiantian He, Abhishek Gupta

Multi-label learning studies the problem where an instance is associated with a set of labels.

Multi-Label Learning

Relay Policy Learning: Solving Long-Horizon Tasks via Imitation and Reinforcement Learning

1 code implementation25 Oct 2019 Abhishek Gupta, Vikash Kumar, Corey Lynch, Sergey Levine, Karol Hausman

We present relay policy learning, a method for imitation and reinforcement learning that can solve multi-stage, long-horizon robotic tasks.

Imitation Learning reinforcement-learning +1

Distributed SGD Generalizes Well Under Asynchrony

no code implementations29 Sep 2019 Jayanth Regatti, Gaurav Tendolkar, Yi Zhou, Abhishek Gupta, Yingbin Liang

The performance of fully synchronized distributed systems has faced a bottleneck due to the big data trend, under which asynchronous distributed systems are becoming increasingly popular due to their scalability.

ROBEL: Robotics Benchmarks for Learning with Low-Cost Robots

1 code implementation25 Sep 2019 Michael Ahn, Henry Zhu, Kristian Hartikainen, Hugo Ponte, Abhishek Gupta, Sergey Levine, Vikash Kumar

ROBEL introduces two robots, each aimed at accelerating reinforcement learning research in a different task domain: D'Claw is a three-fingered hand robot that facilitates learning dexterous manipulation tasks, and D'Kitty is a four-legged robot that facilitates learning agile legged locomotion tasks.

Continuous Control reinforcement-learning +1

Mint: Matrix-Interleaving for Multi-Task Learning

no code implementations25 Sep 2019 Tianhe Yu, Saurabh Kumar, Eric Mitchell, Abhishek Gupta, Karol Hausman, Sergey Levine, Chelsea Finn

Deep learning enables training of large and flexible function approximators from scratch at the cost of large amounts of data.

Multi-Task Learning reinforcement-learning +1

Learning to Reach Goals Without Reinforcement Learning

no code implementations25 Sep 2019 Dibya Ghosh, Abhishek Gupta, Justin Fu, Ashwin Reddy, Coline Devin, Benjamin Eysenbach, Sergey Levine

By maximizing the likelihood of good actions provided by an expert demonstrator, supervised imitation learning can produce effective policies without the algorithmic complexities and optimization challenges of reinforcement learning, at the cost of requiring an expert demonstrator -- typically a person -- to provide the demonstrations.

Imitation Learning reinforcement-learning +1

Canada Protocol: an ethical checklist for the use of Artificial Intelligence in Suicide Prevention and Mental Health

no code implementations17 Jul 2019 Carl-Maria Mörch, Abhishek Gupta, Brian L. Mishara

Objectives: The Canada Protocol - MHSP is a tool to guide and support professionals, users, and researchers using AI in mental health and suicide prevention.

Ethics

Learning latent state representation for speeding up exploration

no code implementations27 May 2019 Giulia Vezzani, Abhishek Gupta, Lorenzo Natale, Pieter Abbeel

In this work, we take a representation learning viewpoint on exploration, utilizing prior experience to learn effective latent representations, which can subsequently indicate which regions to explore.

Representation Learning

Learning Actionable Representations with Goal Conditioned Policies

no code implementations ICLR 2019 Dibya Ghosh, Abhishek Gupta, Sergey Levine

Most prior work on representation learning has focused on generative approaches, learning representations that capture all the underlying factors of variation in the observation space in a more disentangled or well-ordered manner.

Decision Making Hierarchical Reinforcement Learning +3

Some Limit Properties of Markov Chains Induced by Stochastic Recursive Algorithms

no code implementations24 Apr 2019 Abhishek Gupta, Hao Chen, Jianzong Pi, Gaurav Tendolkar

We show that starting from the same initial condition, the distribution of the random sequence generated by the iterated random operators converges weakly to the trajectory generated by the contraction operator.

Guided Meta-Policy Search

no code implementations NeurIPS 2019 Russell Mendonca, Abhishek Gupta, Rosen Kralev, Pieter Abbeel, Sergey Levine, Chelsea Finn

Reinforcement learning (RL) algorithms have demonstrated promising results on complex tasks, yet often require impractical numbers of samples since they learn from scratch.

Continuous Control Imitation Learning +4

Domain Randomization for Active Pose Estimation

no code implementations10 Mar 2019 Xinyi Ren, Jianlan Luo, Eugen Solowjow, Juan Aparicio Ojea, Abhishek Gupta, Aviv Tamar, Pieter Abbeel

In this work, we investigate how to improve the accuracy of domain randomization based pose estimation.

Pose Estimation

AIR5: Five Pillars of Artificial Intelligence Research

no code implementations30 Dec 2018 Yew-Soon Ong, Abhishek Gupta

In this article, we provide an overview of what we consider to be some of the most pressing research questions facing the fields of artificial intelligence (AI) and computational intelligence (CI), with the latter focusing on algorithms that are inspired by various natural phenomena.

Artificial Life

Learning Actionable Representations with Goal-Conditioned Policies

1 code implementation19 Nov 2018 Dibya Ghosh, Abhishek Gupta, Sergey Levine

Most prior work on representation learning has focused on generative approaches, learning representations that capture all underlying factors of variation in the observation space in a more disentangled or well-ordered manner.

Decision Making Hierarchical Reinforcement Learning +3

Guiding Policies with Language via Meta-Learning

1 code implementation ICLR 2019 John D. Co-Reyes, Abhishek Gupta, Suvansh Sanjeev, Nick Altieri, Jacob Andreas, John DeNero, Pieter Abbeel, Sergey Levine

However, a single instruction may be insufficient to fully communicate our intent or, even if it is, may be insufficient for an autonomous agent to actually understand how to perform the desired task.

Imitation Learning Instruction Following +1

Dexterous Manipulation with Deep Reinforcement Learning: Efficient, General, and Low-Cost

no code implementations14 Oct 2018 Henry Zhu, Abhishek Gupta, Aravind Rajeswaran, Sergey Levine, Vikash Kumar

Dexterous multi-fingered robotic hands can perform a wide range of manipulation skills, making them an appealing component for general-purpose robotic manipulators.

reinforcement-learning Reinforcement Learning (RL)

Adversarial Reinforcement Learning for Observer Design in Autonomous Systems under Cyber Attacks

no code implementations15 Sep 2018 Abhishek Gupta, Zhaoyuan Yang

Complex autonomous control systems are subjected to sensor failures, cyber-attacks, sensor noise, communication channel failures, etc.

reinforcement-learning Reinforcement Learning (RL)

Automatically Composing Representation Transformations as a Means for Generalization

1 code implementation ICLR 2019 Michael B. Chang, Abhishek Gupta, Sergey Levine, Thomas L. Griffiths

A generally intelligent learner should generalize to more complex tasks than it has previously encountered, but the two common paradigms in machine learning -- either training a separate learner per task or training a single learner for all tasks -- both have difficulty with such generalization because they do not leverage the compositional structure of the task distribution.

Decision Making

Unsupervised Meta-Learning for Reinforcement Learning

no code implementations ICLR 2020 Abhishek Gupta, Benjamin Eysenbach, Chelsea Finn, Sergey Levine

In the context of reinforcement learning, meta-learning algorithms acquire reinforcement learning procedures to solve new problems more efficiently by utilizing experience from prior tasks.

Meta-Learning Meta Reinforcement Learning +3

Probabilistic Contraction Analysis of Iterated Random Operators

no code implementations4 Apr 2018 Abhishek Gupta, Rahul Jain, Peter Glynn

In many branches of engineering, the Banach contraction mapping theorem is employed to establish the convergence of certain deterministic algorithms.

Addressing Expensive Multi-objective Games with Postponed Preference Articulation via Memetic Co-evolution

no code implementations17 Nov 2017 Adam Żychowski, Abhishek Gupta, Jacek Mańdziuk, Yew Soon Ong

This paper presents algorithmic and empirical contributions demonstrating that the convergence characteristics of a co-evolutionary approach to tackle Multi-Objective Games (MOGs) with postponed preference articulation can often be hampered due to the possible emergence of the so-called Red Queen effect.

From Query-By-Keyword to Query-By-Example: LinkedIn Talent Search Approach

no code implementations3 Sep 2017 Viet Ha-Thuc, Yan Yan, Xianren Wu, Vijay Dialani, Abhishek Gupta, Shakti Sinha

One key challenge in talent search is to translate complex criteria of a hiring position into a search query, while it is relatively easy for a searcher to list examples of suitable candidates for a given position.

Position

Imitation from Observation: Learning to Imitate Behaviors from Raw Video via Context Translation

1 code implementation11 Jul 2017 YuXuan Liu, Abhishek Gupta, Pieter Abbeel, Sergey Levine

Imitation learning is an effective approach for autonomous systems to acquire control policies when an explicit reward function is unavailable, using supervision provided as demonstrations from an expert, typically a human operator.

Imitation Learning Translation +1

Evolutionary Multitasking for Single-objective Continuous Optimization: Benchmark Problems, Performance Metric, and Baseline Results

no code implementations12 Jun 2017 Bingshui Da, Yew-Soon Ong, Liang Feng, A. K. Qin, Abhishek Gupta, Zexuan Zhu, Chuan-Kang Ting, Ke Tang, Xin Yao

In this report, we suggest nine test problems for multi-task single-objective optimization (MTSOO), each of which consists of two single-objective optimization tasks that need to be solved simultaneously.

Evolutionary Multitasking for Multiobjective Continuous Optimization: Benchmark Problems, Performance Metrics and Baseline Results

no code implementations8 Jun 2017 Yuan Yuan, Yew-Soon Ong, Liang Feng, A. K. Qin, Abhishek Gupta, Bingshui Da, Qingfu Zhang, Kay Chen Tan, Yaochu Jin, Hisao Ishibuchi

In this report, we suggest nine test problems for multi-task multi-objective optimization (MTMOO), each of which consists of two multiobjective optimization tasks that need to be solved simultaneously.

Multiobjective Optimization

Learning Dexterous Manipulation Policies from Experience and Imitation

no code implementations15 Nov 2016 Vikash Kumar, Abhishek Gupta, Emanuel Todorov, Sergey Levine

We demonstrate that such controllers can perform the task robustly, both in simulation and on the physical platform, for a limited range of initial conditions around the trained starting state.

Learning Modular Neural Network Policies for Multi-Task and Multi-Robot Transfer

no code implementations22 Sep 2016 Coline Devin, Abhishek Gupta, Trevor Darrell, Pieter Abbeel, Sergey Levine

Using deep reinforcement learning to train general purpose neural network policies alleviates some of the burden of manual representation engineering by using expressive policy classes, but exacerbates the challenge of data collection, since such methods tend to be less efficient than RL with low-dimensional, hand-designed representations.

reinforcement-learning Reinforcement Learning (RL) +2

Genetic Transfer or Population Diversification? Deciphering the Secret Ingredients of Evolutionary Multitask Optimization

no code implementations19 Jul 2016 Abhishek Gupta, Yew-Soon Ong

Evolutionary multitasking has recently emerged as a novel paradigm that enables the similarities and/or latent complementarities (if present) between distinct optimization tasks to be exploited in an autonomous manner simply by solving them together with a unified solution representation scheme.

Learning Dexterous Manipulation for a Soft Robotic Hand from Human Demonstration

no code implementations21 Mar 2016 Abhishek Gupta, Clemens Eppner, Sergey Levine, Pieter Abbeel

In this paper, we describe an approach to learning from demonstration that can be used to train soft robotic hands to perform dexterous manipulation tasks.

Search by Ideal Candidates: Next Generation of Talent Search at LinkedIn

no code implementations26 Feb 2016 Viet Ha-Thuc, Ye Xu, Satya Pradeep Kanduri, Xianren Wu, Vijay Dialani, Yan Yan, Abhishek Gupta, Shakti Sinha

This new system only needs the searcher to input one or several examples of suitable candidates for the position.

Position
