Search Results for author: Aviv Tamar

To combine the benefits of these different forms of learning, it is common to train a policy to maximize a combination of reinforcement and teacher-student learning objectives.

counterfactual Decision Making +1

DDLP: Unsupervised Object-Centric Video Prediction with Deep Dynamic Latent Particles

1 code implementation • 9 Jun 2023 • Tal Daniel, Aviv Tamar

We propose a new object-centric video prediction algorithm based on the deep latent particle (DLP) representation.

Object Position +2

Explore to Generalize in Zero-Shot RL

1 code implementation • NeurIPS 2023 • Ev Zisselman, Itai Lavie, Daniel Soudry, Aviv Tamar

Our insight is that learning a policy that effectively $\textit{explores}$ the domain is harder to memorize than a policy that maximizes reward for a specific task, and therefore we expect such learned behavior to generalize well; we indeed demonstrate this empirically on several domains that are difficult for invariance-based approaches.

Zero-shot Generalization

ContraBAR: Contrastive Bayes-Adaptive Deep RL

1 code implementation • 4 Jun 2023 • Era Choshen, Aviv Tamar

In meta reinforcement learning (meta RL), an agent seeks a Bayes-optimal policy -- the optimal policy when facing an unknown task that is sampled from some known task distribution.

Contrastive Learning Meta Reinforcement Learning +1

Goal-Conditioned Supervised Learning with Sub-Goal Prediction

no code implementations • 17 May 2023 • Tom Jurgenson, Aviv Tamar

Based on this idea, we propose Trajectory Iterative Learner (TraIL), an extension of GCSL that further exploits the information in a trajectory, and uses it for learning to predict both actions and sub-goals.

A Deep Learning Perspective on Network Routing

no code implementations • 1 Mar 2023 • Yarin Perry, Felipe Vieira Frujeri, Chaim Hoch, Srikanth Kandula, Ishai Menache, Michael Schapira, Aviv Tamar

Routing is, arguably, the most fundamental task in computer networking, and the most extensively studied one.

Stochastic Optimization

Online Tool Selection with Learned Grasp Prediction Models

no code implementations • 15 Feb 2023 • Khashayar Rohanimanesh, Jake Metzger, William Richards, Aviv Tamar

However, we find that an approximate solution based on sparse tree search yields near optimal performance at a fraction of the time.

Model Predictive Control

Towards Deployable RL - What's Broken with RL Research and a Potential Fix

no code implementations • 3 Jan 2023 • Shie Mannor, Aviv Tamar

Reinforcement learning (RL) has demonstrated great potential, but is currently full of overhyping and pipe dreams.

reinforcement-learning Reinforcement Learning (RL)

Learning Control by Iterative Inversion

no code implementations • 3 Nov 2022 • Gal Leibovich, Guy Jacob, Or Avner, Gal Novik, Aviv Tamar

The key challenge is a $\textit{distribution shift}$ between the desired outputs and the outputs of an initial random guess, and we prove that iterative inversion can steer the learning correctly, under rather strict conditions on the function.

Continuous Control

Meta Reinforcement Learning with Finite Training Tasks -- a Density Estimation Approach

1 code implementation • 21 Jun 2022 • Zohar Rimon, Aviv Tamar, Gilad Adler

We show that our approach leads to bounds that depend on the dimension of the task distribution.

Density Estimation Dimensionality Reduction +3

Unsupervised Image Representation Learning with Deep Latent Particles

1 code implementation • 31 May 2022 • Tal Daniel, Aviv Tamar

We propose a new representation of visual data that disentangles object position from appearance.

Ranked #1 on Unsupervised Facial Landmark Detection on MAFL

Image Manipulation Model Selection +3

Validate on Sim, Detect on Real -- Model Selection for Domain Randomization

no code implementations • 1 Nov 2021 • Gal Leibovich, Guy Jacob, Shadi Endrawis, Gal Novik, Aviv Tamar

We show that our score - VSDR - can significantly improve the accuracy of policy ranking without requiring additional real world data.

Model Selection Out-of-Distribution Detection +1

Regularization Guarantees Generalization in Bayesian Reinforcement Learning through Algorithmic Stability

no code implementations • 24 Sep 2021 • Aviv Tamar, Daniel Soudry, Ev Zisselman

In the Bayesian reinforcement learning (RL) setting, a prior distribution over the unknown problem parameters -- the rewards and transitions -- is assumed, and a policy that optimizes the (posterior) expected return is sought.

reinforcement-learning Reinforcement Learning (RL)

Offline Meta Reinforcement Learning -- Identifiability Challenges and Effective Data Collection Strategies

1 code implementation • NeurIPS 2021 • Ron Dorfman, Idan Shenfeld, Aviv Tamar

Consider the following instance of the Offline Meta Reinforcement Learning (OMRL) problem: given the complete training logs of $N$ conventional RL agents, trained on $N$ different tasks, design a meta-agent that can quickly maximize reward in a new, unseen task from the same task distribution.

Meta Reinforcement Learning reinforcement-learning +1

Efficient Self-Supervised Data Collection for Offline Robot Learning

no code implementations • 10 May 2021 • Shadi Endrawis, Gal Leibovich, Guy Jacob, Gal Novik, Aviv Tamar

In this work, we propose that data collection policies should actively explore the environment to collect diverse data.

reinforcement-learning Reinforcement Learning (RL) +1

Unsupervised Feature Learning for Manipulation with Contrastive Domain Randomization

1 code implementation • ICLR Workshop SSL-RL 2021 • Carmel Rabinovitz, Niko Grupen, Aviv Tamar

In this work, however, we show that a naive application of DR to unsupervised learning based on contrastive estimation does not promote invariance, as the loss function maximizes mutual information between the features and both the relevant and irrelevant visual properties.

Soft-IntroVAE: Analyzing and Improving the Introspective Variational Autoencoder

2 code implementations • CVPR 2021 • Tal Daniel, Aviv Tamar

However, the original IntroVAE loss function relied on a particular hinge-loss formulation that is very hard to stabilize in practice, and its theoretical convergence analysis ignored important terms in the loss.

Image Generation Out-of-Distribution Detection

Online Safety Assurance for Deep Reinforcement Learning

no code implementations • 7 Oct 2020 • Noga H. Rotman, Michael Schapira, Aviv Tamar

We illustrate the usefulness of online safety assurance in the context of the proposed deep reinforcement learning (RL) approach to video streaming.

reinforcement-learning Reinforcement Learning (RL)

Offline Meta Learning of Exploration

1 code implementation • NeurIPS 2021 • Ron Dorfman, Idan Shenfeld, Aviv Tamar

Meta-Learning Meta Reinforcement Learning

Efficient MDP Analysis for Selfish-Mining in Blockchains

1 code implementation • 10 Jul 2020 • Roi Bar Zur, Ittay Eyal, Aviv Tamar

We call this Probabilistic Termination Optimization (PTO), and the technique applies to any MDP whose utility is a ratio function.

Cryptography and Security

Hallucinative Topological Memory for Zero-Shot Visual Planning

1 code implementation • ICML 2020 • Kara Liu, Thanard Kurutach, Christine Tung, Pieter Abbeel, Aviv Tamar

In visual planning (VP), an agent learns to plan goal-directed behavior from observations of a dynamical system obtained offline, e.g., images obtained from self-supervised robot interaction.

Sub-Goal Trees -- a Framework for Goal-Based Reinforcement Learning

no code implementations • ICML 2020 • Tom Jurgenson, Or Avner, Edward Groshev, Aviv Tamar

Reinforcement learning (RL), building on Bellman's optimality equation, naturally optimizes for a single goal, yet can be made multi-goal by augmenting the state with the goal.

Motion Planning reinforcement-learning +1

Deep Residual Flow for Out of Distribution Detection

1 code implementation • CVPR 2020 • Ev Zisselman, Aviv Tamar

Specifically, we demonstrate the effectiveness of our method in ResNet and DenseNet architectures trained on various image datasets.

Out-of-Distribution Detection

Deep Variational Semi-Supervised Novelty Detection

no code implementations • 12 Nov 2019 • Tal Daniel, Thanard Kurutach, Aviv Tamar

In this work, we propose two variational methods for training VAEs for SSAD.

Anomaly Detection Astronomy +2

Bayesian Relational Memory for Semantic Visual Navigation

1 code implementation • ICCV 2019 • Yi Wu, Yuxin Wu, Aviv Tamar, Stuart Russell, Georgia Gkioxari, Yuandong Tian

We introduce a new memory architecture, Bayesian Relational Memory (BRM), to improve the generalization ability for semantic visual navigation agents in unseen environments, where an agent is given a semantic target to navigate towards.

Navigate Visual Navigation

Sub-Goal Trees -- a Framework for Goal-Directed Trajectory Prediction and Optimization

no code implementations • 12 Jun 2019 • Tom Jurgenson, Edward Groshev, Aviv Tamar

In such problems, the way we choose to represent a trajectory underlies algorithms for trajectory prediction and optimization.

Motion Planning reinforcement-learning +2

Harnessing Reinforcement Learning for Neural Motion Planning

1 code implementation • 1 Jun 2019 • Tom Jurgenson, Aviv Tamar

We then propose a modification of the popular DDPG RL algorithm that is tailored to motion planning domains, by exploiting the known model in the problem and the set of solved plans in the data.

Motion Planning reinforcement-learning +1

Learning Robotic Manipulation through Visual Planning and Acting

no code implementations • 11 May 2019 • Angelina Wang, Thanard Kurutach, Kara Liu, Pieter Abbeel, Aviv Tamar

We further demonstrate our approach on learning to imagine and execute in 3 environments, the final of which is deformable rope manipulation on a PR2 robot.

Visual Tracking

Domain Randomization for Active Pose Estimation

no code implementations • 10 Mar 2019 • Xinyi Ren, Jianlan Luo, Eugen Solowjow, Juan Aparicio Ojea, Abhishek Gupta, Aviv Tamar, Pieter Abbeel

In this work, we investigate how to improve the accuracy of domain randomization based pose estimation.

Pose Estimation

Multi-Agent Reinforcement Learning with Multi-Step Generative Models

no code implementations • 29 Jan 2019 • Orr Krupnik, Igor Mordatch, Aviv Tamar

We consider model-based reinforcement learning (MBRL) in 2-agent, high-fidelity continuous control problems -- an important domain for robots interacting with other agents in the same workspace.

Continuous Control Decision Making +5

Learning and Planning with a Semantic Model

no code implementations • ICLR 2019 • Yi Wu, Yuxin Wu, Aviv Tamar, Stuart Russell, Georgia Gkioxari, Yuandong Tian

Building deep reinforcement learning agents that can generalize and adapt to unseen environments remains a fundamental challenge for AI.

Visual Navigation

Safe Policy Learning from Observations

no code implementations • 27 Sep 2018 • Elad Sarafian, Aviv Tamar, Sarit Kraus

The primary advantages of our approach, termed Rerouted Behavior Improvement (RBI), over other safe learning methods are its stability in the presence of value estimation errors and the elimination of a policy search process.

Distributional Multivariate Policy Evaluation and Exploration with the Bellman GAN

no code implementations • 6 Aug 2018 • Dror Freirich, Ron Meir, Aviv Tamar

In this formulation, DiRL can be seen as learning a deep generative model of the value distribution, driven by the discrepancy between the distribution of the current value, and the distribution of the sum of current reward and next value.

Generative Adversarial Network

Learning Plannable Representations with Causal InfoGAN

1 code implementation • NeurIPS 2018 • Thanard Kurutach, Aviv Tamar, Ge Yang, Stuart Russell, Pieter Abbeel

Finally, to generate a visual plan, we project the current and goal observations onto their respective states in the planning model, plan a trajectory, and then use the generative model to transform the trajectory to a sequence of observations.

Representation Learning

Constrained Policy Improvement for Safe and Efficient Reinforcement Learning

1 code implementation • 20 May 2018 • Elad Sarafian, Aviv Tamar, Sarit Kraus

To minimize the improvement penalty, the RBI idea is to attenuate rapid policy changes of low probability actions which were less frequently sampled.

reinforcement-learning Reinforcement Learning (RL)

Learning Robotic Assembly from CAD

no code implementations • 20 Mar 2018 • Garrett Thomas, Melissa Chien, Aviv Tamar, Juan Aparicio Ojea, Pieter Abbeel

We propose to leverage this prior knowledge by guiding RL along a geometric motion plan, calculated using the CAD data.

Motion Planning Reinforcement Learning (RL)

Model-Ensemble Trust-Region Policy Optimization

2 code implementations • ICLR 2018 • Thanard Kurutach, Ignasi Clavera, Yan Duan, Aviv Tamar, Pieter Abbeel

In this paper, we analyze the behavior of vanilla model-based reinforcement learning methods when deep neural networks are used to learn both the model and the policy, and show that the learned policy tends to exploit regions where insufficient data is available for the model to be learned, causing instability in training.

Continuous Control Model-based Reinforcement Learning +2

Imitation Learning from Visual Data with Multiple Intentions

no code implementations • ICLR 2018 • Aviv Tamar, Khashayar Rohanimanesh, Yin-Lam Chow, Chris Vigorito, Ben Goodrich, Michael Kahane, Derik Pridmore

In this paper we present an LfD approach for learning multiple modes of behavior from visual data.

Imitation Learning

Safer Classification by Synthesis

no code implementations • 22 Nov 2017 • William Wang, Angelina Wang, Aviv Tamar, Xi Chen, Pieter Abbeel

We posit that a generative approach is the natural remedy for this problem, and propose a method for classification using generative models.

Classification General Classification

Situationally Aware Options

no code implementations • 20 Nov 2017 • Daniel J. Mankowitz, Aviv Tamar, Shie Mannor

We learn reusable options in different scenarios in a RoboCup soccer domain (i.e., winning/losing).

Learning Generalized Reactive Policies using Deep Neural Networks

no code implementations • 24 Aug 2017 • Edward Groshev, Maxwell Goldstein, Aviv Tamar, Siddharth Srivastava, Pieter Abbeel

We show that a deep neural network can be used to learn and represent a $\textit{generalized reactive policy}$ (GRP) that maps a problem instance and a state to an action, and that the learned GRPs efficiently solve large classes of challenging problem instances.

Decision Making feature selection

A Machine Learning Approach to Routing

no code implementations • 10 Aug 2017 • Asaf Valadarsky, Michael Schapira, Dafna Shahaf, Aviv Tamar

Can ideas and techniques from machine learning be leveraged to automatically generate "good" routing configurations?

BIG-bench Machine Learning reinforcement-learning +1

Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments

84 code implementations • NeurIPS 2017 • Ryan Lowe, Yi Wu, Aviv Tamar, Jean Harb, Pieter Abbeel, Igor Mordatch

We explore deep reinforcement learning methods for multi-agent domains.

Ranked #1 on SMAC+ on Def_Infantry_sequential

Multi-agent Reinforcement Learning Q-Learning +3

Constrained Policy Optimization

9 code implementations • ICML 2017 • Joshua Achiam, David Held, Aviv Tamar, Pieter Abbeel

For many applications of reinforcement learning it can be more convenient to specify both a reward function and constraints, rather than trying to design behavior through the reward function.

Reinforcement Learning (RL) Safe Reinforcement Learning

Shallow Updates for Deep Reinforcement Learning

no code implementations • NeurIPS 2017 • Nir Levine, Tom Zahavy, Daniel J. Mankowitz, Aviv Tamar, Shie Mannor

In this work we propose a hybrid approach -- the Least Squares Deep Q-Network (LS-DQN), which combines rich feature representations learned by a DRL algorithm with the stability of a linear least squares method.

Atari Games Feature Engineering +2

Situational Awareness by Risk-Conscious Skills

no code implementations • 10 Oct 2016 • Daniel J. Mankowitz, Aviv Tamar, Shie Mannor

In addition, the learned risk aware skills are able to mitigate reward-based model misspecification.

Hierarchical Reinforcement Learning

Learning from the Hindsight Plan -- Episodic MPC Improvement

1 code implementation • 28 Sep 2016 • Aviv Tamar, Garrett Thomas, Tianhao Zhang, Sergey Levine, Pieter Abbeel

To bring the next real-world execution closer to the hindsight plan, our approach learns to re-shape the original cost function with the goal of satisfying the following property: short horizon planning (as realistic during real executions) with respect to the shaped cost should result in mimicking the hindsight plan.

Model Predictive Control

Bayesian Reinforcement Learning: A Survey

no code implementations • 14 Sep 2016 • Mohammad Ghavamzadeh, Shie Mannor, Joelle Pineau, Aviv Tamar

The objective of the paper is to provide a comprehensive survey on Bayesian RL algorithms and their theoretical and empirical properties.

Bayesian Inference reinforcement-learning +1

Value Iteration Networks

8 code implementations • NeurIPS 2016 • Aviv Tamar, Yi Wu, Garrett Thomas, Sergey Levine, Pieter Abbeel

We introduce the value iteration network (VIN): a fully differentiable neural network with a 'planning module' embedded within.

reinforcement-learning Reinforcement Learning (RL)

Generalized Emphatic Temporal Difference Learning: Bias-Variance Analysis

no code implementations • 17 Sep 2015 • Assaf Hallak, Aviv Tamar, Remi Munos, Shie Mannor

We consider the off-policy evaluation problem in Markov decision processes with function approximation.

Off-policy evaluation

Emphatic TD Bellman Operator is a Contraction

no code implementations • 14 Aug 2015 • Assaf Hallak, Aviv Tamar, Shie Mannor

Recently, Sutton et al. (2015) introduced the emphatic temporal differences (ETD) algorithm for off-policy evaluation in Markov decision processes.

Off-policy evaluation

Risk-Sensitive and Robust Decision-Making: a CVaR Optimization Approach

no code implementations • NeurIPS 2015 • Yin-Lam Chow, Aviv Tamar, Shie Mannor, Marco Pavone

Our first contribution is to show that a CVaR objective, besides capturing risk sensitivity, has an alternative interpretation as expected cost under worst-case modeling errors, for a given error budget.

Decision Making

Policy Gradient for Coherent Risk Measures

no code implementations • NeurIPS 2015 • Aviv Tamar, Yin-Lam Chow, Mohammad Ghavamzadeh, Shie Mannor

For static risk measures, our approach is in the spirit of policy gradient algorithms and combines a standard sampling approach with convex programming.

Policy Gradient Methods

Implicit Temporal Differences

no code implementations • 21 Dec 2014 • Aviv Tamar, Panos Toulis, Shie Mannor, Edoardo M. Airoldi

In reinforcement learning, the TD($\lambda$) algorithm is a fundamental policy evaluation method with an efficient online implementation that is suitable for large-scale problems.

Optimizing the CVaR via Sampling

1 code implementation • 15 Apr 2014 • Aviv Tamar, Yonatan Glassner, Shie Mannor

Conditional Value at Risk (CVaR) is a prominent risk measure that is being used extensively in various domains.

reinforcement-learning Reinforcement Learning (RL)

Variance Adjusted Actor Critic Algorithms

no code implementations • 14 Oct 2013 • Aviv Tamar, Shie Mannor

We present an actor-critic framework for MDPs where the objective is the variance-adjusted expected return.

Scaling Up Robust MDPs by Reinforcement Learning

no code implementations • 26 Jun 2013 • Aviv Tamar, Huan Xu, Shie Mannor

We consider large-scale Markov decision processes (MDPs) with parameter uncertainty, under the robust MDP paradigm.

reinforcement-learning Reinforcement Learning (RL)

Policy Gradients with Variance Related Risk Criteria

no code implementations • 27 Jun 2012 • Dotan Di Castro, Aviv Tamar, Shie Mannor

In this paper we devise a framework for local policy gradient style algorithms for reinforcement learning for variance related criteria.

Reinforcement Learning (RL)
