Search Results for author: Abhishek Gupta

Found 87 papers, 16 papers with code

Teachable Reinforcement Learning via Advice Distillation

no code implementations NeurIPS 2021 Olivia Watkins, Abhishek Gupta, Trevor Darrell, Pieter Abbeel, Jacob Andreas

Training automated agents to perform complex behaviors in interactive environments is challenging: reinforcement learning requires careful hand-engineering of reward functions, imitation learning requires specialized infrastructure and access to a human expert, and learning from intermediate forms of supervision (like binary preferences) is time-consuming and provides minimal information per human intervention.

Decision Making Imitation Learning

A Dynamic Watermarking Algorithm for Finite Markov Decision Problems

no code implementations 9 Nov 2021 Jiacheng Tang, Jiguo Song, Abhishek Gupta

Dynamic watermarking, as an active intrusion detection technique, can potentially detect replay attacks, spoofing attacks, and deception attacks in the feedback channel for control systems.

Intrusion Detection

Offline RL With Resource Constrained Online Deployment

no code implementations 7 Oct 2021 Jayanth Reddy Regatti, Aniket Anand Deshmukh, Frank Cheng, Young Hun Jung, Abhishek Gupta, Urun Dogan

We address this performance gap with a policy transfer algorithm which first trains a teacher agent using the offline dataset where features are fully available, and then transfers this knowledge to a student agent that only uses the resource-constrained features.

Offline RL

Half a Dozen Real-World Applications of Evolutionary Multitasking and More

no code implementations 27 Sep 2021 Abhishek Gupta, Lei Zhou, Yew-Soon Ong, Zefeng Chen, Yaqing Hou

Until recently, the potential to transfer evolved skills across distinct optimization problem instances (or tasks) was seldom explored in evolutionary computation.

Learning in Sinusoidal Spaces with Physics-Informed Neural Networks

no code implementations 20 Sep 2021 Jian Cheng Wong, Chinchun Ooi, Abhishek Gupta, Yew-Soon Ong

A physics-informed neural network (PINN) uses physics-augmented loss functions, e.g., incorporating the residual term from governing differential equations, to ensure its output is consistent with fundamental physics laws.

The State of AI Ethics Report (Volume 5)

no code implementations 9 Aug 2021 Abhishek Gupta, Connor Wright, Marianna Bergamaschi Ganapini, Masa Sweidan, Renjie Butalid

This report from the Montreal AI Ethics Institute covers the most salient progress in research and reporting over the second quarter of 2021 in the field of AI ethics with a special emphasis on "Environment and AI", "Creativity and AI", and "Geopolitics and AI."

Fairness

Real-time Eco-Driving Control in Electrified Connected and Autonomous Vehicles using Approximate Dynamic Programming

no code implementations 5 Aug 2021 Shreshta Rajakumar Deshpande, Shobhit Gupta, Abhishek Gupta, Marcello Canova

This paper presents a hierarchical multi-layer Model Predictive Control (MPC) approach for improving the fuel economy of a 48V mild-hybrid powertrain in a connected vehicle environment.

Autonomous Vehicles

Fully Autonomous Real-World Reinforcement Learning with Applications to Mobile Manipulation

no code implementations 28 Jul 2021 Charles Sun, Jędrzej Orbik, Coline Devin, Brian Yang, Abhishek Gupta, Glen Berseth, Sergey Levine

Our aim is to devise a robotic reinforcement learning system for learning navigation and manipulation together, in an autonomous way without human intervention, enabling continual learning under realistic assumptions.

Continual Learning

Autonomous Reinforcement Learning via Subgoal Curricula

no code implementations NeurIPS 2021 Archit Sharma, Abhishek Gupta, Sergey Levine, Karol Hausman, Chelsea Finn

Reinforcement learning (RL) promises to enable autonomous acquisition of complex behaviors for diverse agents.

MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning

no code implementations 15 Jul 2021 Kevin Li, Abhishek Gupta, Ashwin Reddy, Vitchyr Pong, Aurick Zhou, Justin Yu, Sergey Levine

In this work, we show that an uncertainty-aware classifier can solve challenging reinforcement learning problems by both encouraging exploration and providing directed guidance towards positive outcomes.

Meta-Learning

Weighted Gaussian Process Bandits for Non-stationary Environments

no code implementations 6 Jul 2021 Yuntian Deng, Xingyu Zhou, Baekjin Kim, Ambuj Tewari, Abhishek Gupta, Ness Shroff

To this end, we develop WGP-UCB, a novel UCB-type algorithm based on weighted Gaussian process regression.

Safe Model-based Off-policy Reinforcement Learning for Eco-Driving in Connected and Automated Hybrid Electric Vehicles

no code implementations 25 May 2021 Zhaoxuan Zhu, Nicola Pivaro, Shobhit Gupta, Abhishek Gupta, Marcello Canova

Connected and Automated Hybrid Electric Vehicles have the potential to reduce fuel consumption and travel time in real-world driving conditions.

Model-based Reinforcement Learning

The State of AI Ethics Report (January 2021)

no code implementations 19 May 2021 Abhishek Gupta, Alexandrine Royer, Connor Wright, Falaah Arif Khan, Victoria Heath, Erick Galinkin, Ryan Khurana, Marianna Bergamaschi Ganapini, Muriam Fancy, Masa Sweidan, Mo Akif, Renjie Butalid

The 3rd edition of the Montreal AI Ethics Institute's The State of AI Ethics captures the most relevant developments in AI Ethics since October 2020.

Misinformation

The State of AI Ethics Report (Volume 4)

no code implementations 19 May 2021 Abhishek Gupta, Alexandrine Royer, Connor Wright, Victoria Heath, Muriam Fancy, Marianna Bergamaschi Ganapini, Shannon Egan, Masa Sweidan, Mo Akif, Renjie Butalid

The 4th edition of the Montreal AI Ethics Institute's The State of AI Ethics captures the most relevant developments in the field of AI Ethics since January 2021.

Fairness

Making Responsible AI the Norm rather than the Exception

no code implementations 28 Jan 2021 Abhishek Gupta

This report prepared by the Montreal AI Ethics Institute provides recommendations in response to the National Security Commission on Artificial Intelligence (NSCAI) Key Considerations for Responsible Development and Fielding of Artificial Intelligence document.

Translation

Coverage Analysis of Broadcast Networks with Users Having Heterogeneous Content/Advertisement Preferences

no code implementations 27 Jan 2021 Kanchan Chaurasia, Reena Sahu, Abhishek Gupta

With the help of numerical results and analysis, we show the impact of various parameters including content granularity, connectivity radius, and rate threshold and present important design insights.

Information Theory

A Deep Reinforcement Learning Framework for Eco-driving in Connected and Automated Hybrid Electric Vehicles

no code implementations 13 Jan 2021 Zhaoxuan Zhu, Shobhit Gupta, Abhishek Gupta, Marcello Canova

Connected and Automated Vehicles (CAVs), in particular those with multiple power sources, have the potential to significantly reduce fuel consumption and travel time in real-world driving conditions.

Incentive Design and Profit Sharing in Multi-modal Transportation Network

no code implementations 9 Jan 2021 Yuntian Deng, Shiping Shao, Archak Mittal, Richard Twumasi-Boakye, James Fishelson, Abhishek Gupta, Ness B. Shroff

This market structure allows the multi-modal platform to coordinate profits across modes and also provide incentives to the passengers.

Can Transfer Neuroevolution Tractably Solve Your Differential Equations?

no code implementations 6 Jan 2021 Jian Cheng Wong, Abhishek Gupta, Yew-Soon Ong

In the context of solving differential equations, we are faced with the problem of finding globally optimum parameters of the network, instead of being concerned with out-of-sample generalization.

Reinforcement Learning with Bayesian Classifiers: Efficient Skill Learning from Outcome Examples

no code implementations 1 Jan 2021 Kevin Li, Abhishek Gupta, Vitchyr H. Pong, Ashwin Reddy, Aurick Zhou, Justin Yu, Sergey Levine

In this work, we study a more tractable class of reinforcement learning problems defined by data that provides examples of successful outcome states.

Scalable Transfer Evolutionary Optimization: Coping with Big Task Instances

1 code implementation 3 Dec 2020 Mojtaba Shakeri, Erfan Miahi, Abhishek Gupta, Yew-Soon Ong

In today's digital world, we are confronted with an explosion of data and models produced and manipulated by numerous large-scale IoT/cloud-based applications.

The State of AI Ethics Report (October 2020)

no code implementations 5 Nov 2020 Abhishek Gupta, Alexandrine Royer, Victoria Heath, Connor Wright, Camylle Lanteigne, Allison Cohen, Marianna Bergamaschi Ganapini, Muriam Fancy, Erick Galinkin, Ryan Khurana, Mo Akif, Renjie Butalid, Falaah Arif Khan, Masa Sweidan, Audrey Balogh

The 2nd edition of the Montreal AI Ethics Institute's The State of AI Ethics captures the most relevant developments in the field of AI Ethics since July 2020.

Providing Actionable Feedback in Hiring Marketplaces using Generative Adversarial Networks

no code implementations 6 Oct 2020 Daniel Nemirovsky, Nicolas Thiebaut, Ye Xu, Abhishek Gupta

Machine learning predictors have been increasingly applied in production settings, including in one of the world's largest hiring platforms, Hired, to provide a better candidate and recruiter experience.

Adaptive Risk Minimization: A Meta-Learning Approach for Tackling Group Shift

no code implementations 28 Sep 2020 Marvin Mengxin Zhang, Henrik Marklund, Nikita Dhawan, Abhishek Gupta, Sergey Levine, Chelsea Finn

A fundamental assumption of most machine learning algorithms is that the training and test data are drawn from the same underlying distribution.

Image Classification Meta-Learning

Report prepared by the Montreal AI Ethics Institute (MAIEI) on Publication Norms for Responsible AI

no code implementations 15 Sep 2020 Abhishek Gupta, Camylle Lanteigne, Victoria Heath

In order to ensure that the science and technology of AI is developed in a humane manner, we must develop research publication norms that are informed by our growing understanding of AI's potential threats and use cases.

CounteRGAN: Generating Realistic Counterfactuals with Residual Generative Adversarial Nets

no code implementations 11 Sep 2020 Daniel Nemirovsky, Nicolas Thiebaut, Ye Xu, Abhishek Gupta

The prevalence of machine learning models in various industries has led to growing demands for model interpretability and for the ability to provide meaningful recourse to users.

Adversary Agnostic Robust Deep Reinforcement Learning

no code implementations 14 Aug 2020 Xinghua Qu, Yew-Soon Ong, Abhishek Gupta, Zhu Sun

Motivated by this finding, we propose a new policy distillation loss with two terms: 1) a prescription gap maximization loss aiming at simultaneously maximizing the likelihood of the action selected by the teacher policy and the entropy over the remaining actions; 2) a corresponding Jacobian regularization loss that minimizes the magnitude of the gradient with respect to the input state.

Adversarial Robustness Atari Games

Green Lighting ML: Confidentiality, Integrity, and Availability of Machine Learning Systems in Deployment

no code implementations 9 Jul 2020 Abhishek Gupta, Erick Galinkin

In this hand-off, the engineers responsible for model deployment are often not privy to the details of the model and thus, the potential vulnerabilities associated with its usage, exposure, or compromise.

The State of AI Ethics Report (June 2020)

no code implementations 25 Jun 2020 Abhishek Gupta, Camylle Lanteigne, Victoria Heath, Marianna Bergamaschi Ganapini, Erick Galinkin, Allison Cohen, Tania De Gasperis, Mo Akif, Renjie Butalid

These past few months have been especially challenging, and the deployment of technology in ways hitherto untested at an unrivalled pace has left the internet and technology watchers aghast.

ByGARS: Byzantine SGD with Arbitrary Number of Attackers

no code implementations 24 Jun 2020 Jayanth Regatti, Hao Chen, Abhishek Gupta

We show that using these reputation scores for gradient aggregation is robust to any number of multiplicative noise Byzantine adversaries and use two-timescale stochastic approximation theory to prove convergence for strongly convex loss functions.

Ecological Reinforcement Learning

no code implementations 22 Jun 2020 John D. Co-Reyes, Suvansh Sanjeev, Glen Berseth, Abhishek Gupta, Sergey Levine

Much of the current work on reinforcement learning studies episodic settings, where the agent is reset between trials to an initial state distribution, often with well-shaped reward functions.

AWAC: Accelerating Online Reinforcement Learning with Offline Datasets

1 code implementation 16 Jun 2020 Ashvin Nair, Abhishek Gupta, Murtaza Dalal, Sergey Levine

If we can instead allow RL algorithms to effectively use previously collected data to aid the online learning process, such applications could be made substantially more practical: the prior data would provide a starting point that mitigates challenges due to exploration and sample complexity, while the online training enables the agent to perfect the desired skill.

Fine-tuning

Response by the Montreal AI Ethics Institute to the European Commission's Whitepaper on AI

no code implementations 16 Jun 2020 Abhishek Gupta, Camylle Lanteigne

This paper outlines the EC's policy options for the promotion and adoption of artificial intelligence (AI) in the European Union.

Transfer Learning

The Social Contract for AI

no code implementations 15 Jun 2020 Mirka Snyder Caron, Abhishek Gupta

It is important to keep in mind that for AI, meeting the expectations of this social contract is critical, because recklessly driving the adoption and implementation of unsafe, irresponsible, or unethical AI systems may trigger serious backlash against industry and academia involved which could take decades to resolve, if not actually seriously harm society.

SECure: A Social and Environmental Certificate for AI Systems

no code implementations 11 Jun 2020 Abhishek Gupta, Camylle Lanteigne, Sara Kingsley

In a world increasingly dominated by AI applications, an understudied aspect is the carbon and social footprint of these power-hungry algorithms that require copious computation and a trove of data for training and prediction.

Federated Learning

Road Grade Estimation Using Crowd-Sourced Smartphone Data

no code implementations 5 Jun 2020 Abhishek Gupta, Shaohan Hu, Weida Zhong, Adel Sadek, Lu Su, Chunming Qiao

Estimates of road grade/slope can add another dimension of information to existing 2D digital road maps.

The Ingredients of Real World Robotic Reinforcement Learning

no code implementations ICLR 2020 Henry Zhu, Justin Yu, Abhishek Gupta, Dhruv Shah, Kristian Hartikainen, Avi Singh, Vikash Kumar, Sergey Levine

The success of reinforcement learning in the real world has been limited to instrumented laboratory scenarios, often requiring arduous human supervision to enable continuous learning.

The Ingredients of Real-World Robotic Reinforcement Learning

no code implementations 27 Apr 2020 Henry Zhu, Justin Yu, Abhishek Gupta, Dhruv Shah, Kristian Hartikainen, Avi Singh, Vikash Kumar, Sergey Levine

In this work, we discuss the elements that are needed for a robotic learning system that can continually and autonomously improve with data collected in the real world.

Steady-state fluctuations of a genetic feedback loop with fluctuating rate parameters using the unified colored noise approximation

no code implementations 4 Apr 2020 James Holehouse, Abhishek Gupta, Ramon Grima

A common model of stochastic auto-regulatory gene expression describes promoter switching via cooperative protein binding, effective protein production in the active state and dilution of proteins.

Translation

Convergence of Recursive Stochastic Algorithms using Wasserstein Divergence

no code implementations 25 Mar 2020 Abhishek Gupta, William B. Haskell

We show that if the distribution of the iterates in the Markov chain satisfies a contraction property with respect to the Wasserstein divergence, then the Markov chain admits an invariant distribution.

Q-Learning

DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction

3 code implementations NeurIPS 2020 Aviral Kumar, Abhishek Gupta, Sergey Levine

We show that bootstrapping-based Q-learning algorithms do not necessarily benefit from this corrective feedback, and training on the experience collected by the algorithm is not sufficient to correct errors in the Q-function.

Meta-Learning Multi-Task Learning +1

Learning in Markov Decision Processes under Constraints

no code implementations 27 Feb 2020 Rahul Singh, Abhishek Gupta, Ness B. Shroff

We design learning algorithms that maximize the cumulative reward earned over a time horizon of $T$ time-steps, while simultaneously ensuring that the average values of the $M$ cost expenditures are bounded by agent-specified thresholds $c^{ub}_i, i=1, 2,\ldots, M$.

Gradient Surgery for Multi-Task Learning

7 code implementations NeurIPS 2020 Tianhe Yu, Saurabh Kumar, Abhishek Gupta, Sergey Levine, Karol Hausman, Chelsea Finn

While deep learning and deep reinforcement learning (RL) systems have demonstrated impressive results in domains such as image classification, game playing, and robotic control, data efficiency remains a major challenge.

Image Classification Multi-Task Learning

Learning to Reach Goals via Iterated Supervised Learning

2 code implementations ICLR 2021 Dibya Ghosh, Abhishek Gupta, Ashwin Reddy, Justin Fu, Coline Devin, Benjamin Eysenbach, Sergey Levine

Current reinforcement learning (RL) algorithms can be brittle and difficult to use, especially when learning goal-reaching behaviors from sparse rewards.

Multi-Goal Reinforcement Learning

Unsupervised Curricula for Visual Meta-Reinforcement Learning

no code implementations NeurIPS 2019 Allan Jabri, Kyle Hsu, Ben Eysenbach, Abhishek Gupta, Sergey Levine, Chelsea Finn

In experiments on vision-based navigation and manipulation domains, we show that the algorithm allows for unsupervised meta-learning that transfers to downstream tasks specified by hand-crafted reward functions and serves as pre-training for more efficient supervised meta-learning of test task distributions.

Meta-Learning Meta Reinforcement Learning

A Multi-Task Gradient Descent Method for Multi-Label Learning

no code implementations 18 Nov 2019 Lu Bai, Yew-Soon Ong, Tiantian He, Abhishek Gupta

Multi-label learning studies the problem where an instance is associated with a set of labels.

Multi-Label Learning

Minimalistic Attacks: How Little it Takes to Fool a Deep Reinforcement Learning Policy

1 code implementation 10 Nov 2019 Xinghua Qu, Zhu Sun, Yew-Soon Ong, Abhishek Gupta, Pengfei Wei

Recent studies have revealed that neural network-based policies can be easily fooled by adversarial examples.

Adversarial Attack Atari Games

Relay Policy Learning: Solving Long-Horizon Tasks via Imitation and Reinforcement Learning

1 code implementation 25 Oct 2019 Abhishek Gupta, Vikash Kumar, Corey Lynch, Sergey Levine, Karol Hausman

We present relay policy learning, a method for imitation and reinforcement learning that can solve multi-stage, long-horizon robotic tasks.

Imitation Learning

Distributed SGD Generalizes Well Under Asynchrony

no code implementations 29 Sep 2019 Jayanth Regatti, Gaurav Tendolkar, Yi Zhou, Abhishek Gupta, Yingbin Liang

The performance of fully synchronized distributed systems has faced a bottleneck due to the big data trend, under which asynchronous distributed systems are becoming increasingly popular due to their powerful scalability.

Mint: Matrix-Interleaving for Multi-Task Learning

no code implementations 25 Sep 2019 Tianhe Yu, Saurabh Kumar, Eric Mitchell, Abhishek Gupta, Karol Hausman, Sergey Levine, Chelsea Finn

Deep learning enables training of large and flexible function approximators from scratch at the cost of large amounts of data.

Multi-Task Learning

Learning to Reach Goals Without Reinforcement Learning

no code implementations 25 Sep 2019 Dibya Ghosh, Abhishek Gupta, Justin Fu, Ashwin Reddy, Coline Devin, Benjamin Eysenbach, Sergey Levine

By maximizing the likelihood of good actions provided by an expert demonstrator, supervised imitation learning can produce effective policies without the algorithmic complexities and optimization challenges of reinforcement learning, at the cost of requiring an expert demonstrator -- typically a person -- to provide the demonstrations.

Imitation Learning

ROBEL: Robotics Benchmarks for Learning with Low-Cost Robots

1 code implementation 25 Sep 2019 Michael Ahn, Henry Zhu, Kristian Hartikainen, Hugo Ponte, Abhishek Gupta, Sergey Levine, Vikash Kumar

ROBEL introduces two robots, each aimed to accelerate reinforcement learning research in different task domains: D'Claw is a three-fingered hand robot that facilitates learning dexterous manipulation tasks, and D'Kitty is a four-legged robot that facilitates learning agile legged locomotion tasks.

Continuous Control

Canada Protocol: an ethical checklist for the use of Artificial Intelligence in Suicide Prevention and Mental Health

no code implementations 17 Jul 2019 Carl-Maria Mörch, Abhishek Gupta, Brian L. Mishara

Objectives: The Canada Protocol - MHSP is a tool to guide and support professionals, users, and researchers using AI in mental health and suicide prevention.

Learning latent state representation for speeding up exploration

no code implementations 27 May 2019 Giulia Vezzani, Abhishek Gupta, Lorenzo Natale, Pieter Abbeel

In this work, we take a representation learning viewpoint on exploration, utilizing prior experience to learn effective latent representations, which can subsequently indicate which regions to explore.

Representation Learning

Learning Actionable Representations with Goal Conditioned Policies

no code implementations ICLR 2019 Dibya Ghosh, Abhishek Gupta, Sergey Levine

Most prior work on representation learning has focused on generative approaches, learning representations that capture all the underlying factors of variation in the observation space in a more disentangled or well-ordered manner.

Decision Making Hierarchical Reinforcement Learning +1

Some Limit Properties of Markov Chains Induced by Stochastic Recursive Algorithms

no code implementations 24 Apr 2019 Abhishek Gupta, Hao Chen, Jianzong Pi, Gaurav Tendolkar

We show that starting from the same initial condition, the distribution of the random sequence generated by the iterated random operators converges weakly to the trajectory generated by the contraction operator.

Guided Meta-Policy Search

no code implementations NeurIPS 2019 Russell Mendonca, Abhishek Gupta, Rosen Kralev, Pieter Abbeel, Sergey Levine, Chelsea Finn

Reinforcement learning (RL) algorithms have demonstrated promising results on complex tasks, yet often require impractical numbers of samples since they learn from scratch.

Continuous Control Imitation Learning +2

Domain Randomization for Active Pose Estimation

no code implementations 10 Mar 2019 Xinyi Ren, Jianlan Luo, Eugen Solowjow, Juan Aparicio Ojea, Abhishek Gupta, Aviv Tamar, Pieter Abbeel

In this work, we investigate how to improve the accuracy of domain randomization based pose estimation.

Pose Estimation

AIR5: Five Pillars of Artificial Intelligence Research

no code implementations 30 Dec 2018 Yew-Soon Ong, Abhishek Gupta

In this article, we provide an overview of what we consider to be some of the most pressing research questions facing the fields of artificial intelligence (AI) and computational intelligence (CI), with the latter focusing on algorithms that are inspired by various natural phenomena.

Artificial Life

Guiding Policies with Language via Meta-Learning

2 code implementations ICLR 2019 John D. Co-Reyes, Abhishek Gupta, Suvansh Sanjeev, Nick Altieri, Jacob Andreas, John DeNero, Pieter Abbeel, Sergey Levine

However, a single instruction may be insufficient to fully communicate our intent or, even if it is, may be insufficient for an autonomous agent to actually understand how to perform the desired task.

Imitation Learning Meta-Learning

Learning Actionable Representations with Goal-Conditioned Policies

1 code implementation 19 Nov 2018 Dibya Ghosh, Abhishek Gupta, Sergey Levine

Most prior work on representation learning has focused on generative approaches, learning representations that capture all underlying factors of variation in the observation space in a more disentangled or well-ordered manner.

Decision Making Hierarchical Reinforcement Learning +1

Dexterous Manipulation with Deep Reinforcement Learning: Efficient, General, and Low-Cost

no code implementations 14 Oct 2018 Henry Zhu, Abhishek Gupta, Aravind Rajeswaran, Sergey Levine, Vikash Kumar

Dexterous multi-fingered robotic hands can perform a wide range of manipulation skills, making them an appealing component for general-purpose robotic manipulators.

Adversarial Reinforcement Learning for Observer Design in Autonomous Systems under Cyber Attacks

no code implementations 15 Sep 2018 Abhishek Gupta, Zhaoyuan Yang

Complex autonomous control systems are subjected to sensor failures, cyber-attacks, sensor noise, communication channel failures, etc.

Automatically Composing Representation Transformations as a Means for Generalization

1 code implementation ICLR 2019 Michael B. Chang, Abhishek Gupta, Sergey Levine, Thomas L. Griffiths

A generally intelligent learner should generalize to more complex tasks than it has previously encountered, but the two common paradigms in machine learning -- either training a separate learner per task or training a single learner for all tasks -- both have difficulty with such generalization because they do not leverage the compositional structure of the task distribution.

Decision Making

Unsupervised Meta-Learning for Reinforcement Learning

no code implementations ICLR 2020 Abhishek Gupta, Benjamin Eysenbach, Chelsea Finn, Sergey Levine

In the context of reinforcement learning, meta-learning algorithms acquire reinforcement learning procedures to solve new problems more efficiently by utilizing experience from prior tasks.

Meta-Learning Meta Reinforcement Learning +1

Probabilistic Contraction Analysis of Iterated Random Operators

no code implementations 4 Apr 2018 Abhishek Gupta, Rahul Jain, Peter Glynn

Consider a contraction operator $T$ over a complete metric space $\mathcal X$ with the fixed point $x^\star$.

Addressing Expensive Multi-objective Games with Postponed Preference Articulation via Memetic Co-evolution

no code implementations 17 Nov 2017 Adam Żychowski, Abhishek Gupta, Jacek Mańdziuk, Yew Soon Ong

This paper presents algorithmic and empirical contributions demonstrating that the convergence characteristics of a co-evolutionary approach to tackle Multi-Objective Games (MOGs) with postponed preference articulation can often be hampered due to the possible emergence of the so-called Red Queen effect.

From Query-By-Keyword to Query-By-Example: LinkedIn Talent Search Approach

no code implementations 3 Sep 2017 Viet Ha-Thuc, Yan Yan, Xianren Wu, Vijay Dialani, Abhishek Gupta, Shakti Sinha

One key challenge in talent search is to translate complex criteria of a hiring position into a search query, while it is relatively easy for a searcher to list examples of suitable candidates for a given position.

Imitation from Observation: Learning to Imitate Behaviors from Raw Video via Context Translation

1 code implementation 11 Jul 2017 YuXuan Liu, Abhishek Gupta, Pieter Abbeel, Sergey Levine

Imitation learning is an effective approach for autonomous systems to acquire control policies when an explicit reward function is unavailable, using supervision provided as demonstrations from an expert, typically a human operator.

Imitation Learning Translation +1

Evolutionary Multitasking for Single-objective Continuous Optimization: Benchmark Problems, Performance Metric, and Baseline Results

no code implementations 12 Jun 2017 Bingshui Da, Yew-Soon Ong, Liang Feng, A. K. Qin, Abhishek Gupta, Zexuan Zhu, Chuan-Kang Ting, Ke Tang, Xin Yao

In this report, we suggest nine test problems for multi-task single-objective optimization (MTSOO), each of which consists of two single-objective optimization tasks that need to be solved simultaneously.

Evolutionary Multitasking for Multiobjective Continuous Optimization: Benchmark Problems, Performance Metrics and Baseline Results

no code implementations 8 Jun 2017 Yuan Yuan, Yew-Soon Ong, Liang Feng, A. K. Qin, Abhishek Gupta, Bingshui Da, Qingfu Zhang, Kay Chen Tan, Yaochu Jin, Hisao Ishibuchi

In this report, we suggest nine test problems for multi-task multi-objective optimization (MTMOO), each of which consists of two multiobjective optimization tasks that need to be solved simultaneously.

Multiobjective Optimization

Learning Invariant Feature Spaces to Transfer Skills with Reinforcement Learning

no code implementations 8 Mar 2017 Abhishek Gupta, Coline Devin, Yuxuan Liu, Pieter Abbeel, Sergey Levine

People can learn a wide range of tasks from their own experience, but can also learn from observing other creatures.

Transfer Learning

Learning Dexterous Manipulation Policies from Experience and Imitation

no code implementations 15 Nov 2016 Vikash Kumar, Abhishek Gupta, Emanuel Todorov, Sergey Levine

We demonstrate that such controllers can perform the task robustly, both in simulation and on the physical platform, for a limited range of initial conditions around the trained starting state.

Motion Capture

Learning Modular Neural Network Policies for Multi-Task and Multi-Robot Transfer

no code implementations 22 Sep 2016 Coline Devin, Abhishek Gupta, Trevor Darrell, Pieter Abbeel, Sergey Levine

Using deep reinforcement learning to train general purpose neural network policies alleviates some of the burden of manual representation engineering by using expressive policy classes, but exacerbates the challenge of data collection, since such methods tend to be less efficient than RL with low-dimensional, hand-designed representations.

Transfer Learning

Genetic Transfer or Population Diversification? Deciphering the Secret Ingredients of Evolutionary Multitask Optimization

no code implementations 19 Jul 2016 Abhishek Gupta, Yew-Soon Ong

Evolutionary multitasking has recently emerged as a novel paradigm that enables the similarities and/or latent complementarities (if present) between distinct optimization tasks to be exploited in an autonomous manner simply by solving them together with a unified solution representation scheme.

Learning Dexterous Manipulation for a Soft Robotic Hand from Human Demonstration

no code implementations 21 Mar 2016 Abhishek Gupta, Clemens Eppner, Sergey Levine, Pieter Abbeel

In this paper, we describe an approach to learning from demonstration that can be used to train soft robotic hands to perform dexterous manipulation tasks.

Search by Ideal Candidates: Next Generation of Talent Search at LinkedIn

no code implementations 26 Feb 2016 Viet Ha-Thuc, Ye Xu, Satya Pradeep Kanduri, Xianren Wu, Vijay Dialani, Yan Yan, Abhishek Gupta, Shakti Sinha

This new system only needs the searcher to input one or several examples of suitable candidates for the position.
