no code implementations • NeurIPS 2021 • Tijana Zrnic, Eric Mazumdar, S. Shankar Sastry, Michael I. Jordan
In particular, by generalizing the standard model to allow both players to learn over time, we show that a decision-maker that makes updates faster than the agents can reverse the order of play, meaning that the agents lead and the decision-maker follows.
no code implementations • 16 Jun 2021 • Chinmay Maheshwari, Chih-Yuan Chiu, Eric Mazumdar, S. Shankar Sastry, Lillian J. Ratliff
Min-max optimization is emerging as a key framework for analyzing problems of robustness to strategically and adversarially generated data.
no code implementations • 27 Mar 2021 • Tyler Westenbroek, Max Simchowitz, Michael I. Jordan, S. Shankar Sastry
Crucially, this guarantee requires that state costs applied to the planning problems are in a certain sense 'compatible' with the global geometry of the system, and a simple counter-example demonstrates the necessity of this condition.
no code implementations • 24 Feb 2021 • David L. McPherson, Kaylene C. Stocking, S. Shankar Sastry
Stochastic models, however, can capture the uncertainty and risk tolerance that are often present in real systems of interest.
no code implementations • 26 Oct 2020 • Vicenc Rubies-Royo, Eric Mazumdar, Roy Dong, Claire Tomlin, S. Shankar Sastry
In this work we present a multi-armed bandit framework for online expert selection in Markov decision processes and demonstrate its use in high-dimensional settings.
no code implementations • 20 Apr 2020 • Oladapo Afolabi, Allen Y. Yang, S. Shankar Sastry
Recent advances in computer graphics and computer vision have found successful application of deep neural network models for 3D shapes based on signed distance functions (SDFs) that are useful for shape representation, retrieval, and completion.
no code implementations • L4DC 2020 • Fernando Castañeda, Mathias Wulfman, Ayush Agrawal, Tyler Westenbroek, Claire J. Tomlin, S. Shankar Sastry, Koushil Sreenath
The main drawbacks of input-output linearizing controllers are the need for precise dynamics models and the inability to account for input constraints.
no code implementations • 6 Apr 2020 • Tyler Westenbroek, Eric Mazumdar, David Fridovich-Keil, Valmik Prabhu, Claire J. Tomlin, S. Shankar Sastry
This paper proposes a framework for adaptively learning a feedback linearization-based tracking controller for an unknown system using discrete-time model-free policy-gradient parameter update rules.
no code implementations • 13 Jan 2020 • Andreea Bobu, Dexter R. R. Scobee, Jaime F. Fisac, S. Shankar Sastry, Anca D. Dragan
A common model is the Boltzmann noisily-rational decision model, which assumes people approximately optimize a reward function and choose trajectories in proportion to their exponentiated reward.
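The Boltzmann model described above can be illustrated with a minimal sketch. It assumes a finite set of candidate trajectories, each summarized by a scalar reward, and a rationality coefficient `beta`. The function name and the parameter `beta` are illustrative and are not taken from the paper.

```python
import math

def boltzmann_choice_probabilities(rewards, beta=1.0):
    """Return choice probabilities proportional to exp(beta * reward).

    `rewards` holds one scalar reward per candidate trajectory.
    `beta` is an assumed rationality coefficient (not from the source).
    """
    # Subtract the maximum reward before exponentiating for numerical stability;
    # this does not change the normalized probabilities.
    peak = max(rewards)
    weights = [math.exp(beta * (r - peak)) for r in rewards]
    total = sum(weights)
    return [w / total for w in weights]

# Three hypothetical trajectories with rewards 1, 2 and 3.
probs = boltzmann_choice_probabilities([1.0, 2.0, 3.0], beta=1.0)
```

Higher-reward trajectories receive exponentially more probability mass, and a larger `beta` concentrates the distribution on the best trajectory.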
1 code implementation • 4 Nov 2019 • Kamil Nar, S. Shankar Sastry
While training a neural network, the iterative optimization algorithm involved also creates an online learning problem, and consequently, correct estimation of the optimal parameters requires persistent excitation of the network weights.
no code implementations • 29 Oct 2019 • Tyler Westenbroek, David Fridovich-Keil, Eric Mazumdar, Shreyas Arora, Valmik Prabhu, S. Shankar Sastry, Claire J. Tomlin
We present a novel approach to control design for nonlinear systems which leverages model-free policy optimization techniques to learn a linearizing controller for a physical plant with unknown dynamics.
no code implementations • ICLR 2020 • Dexter R. R. Scobee, S. Shankar Sastry
While most approaches to the problem of Inverse Reinforcement Learning (IRL) focus on estimating a reward function that best explains an expert agent's policy or demonstrated behavior on a control task, it is often the case that such behavior is more succinctly represented by a simple reward combined with a set of hard constraints.
no code implementations • 8 Jul 2019 • Eric Mazumdar, Lillian J. Ratliff, Michael I. Jordan, S. Shankar Sastry
In such games the state and action spaces are continuous and global Nash equilibria can be found by solving coupled Riccati equations.
no code implementations • ICLR 2019 • Kamil Nar, Orhan Ocal, S. Shankar Sastry, Kannan Ramchandran
In this work, we study the binary classification of linearly separable datasets and show that linear classifiers could also have decision boundaries that lie close to their training dataset if cross-entropy loss is used for training.
no code implementations • 29 Apr 2019 • Tyler Westenbroek, Roy Dong, Lillian J. Ratliff, S. Shankar Sastry
Recent work has explored mechanisms to ensure that the data sources share high quality data with a single data aggregator, addressing the issue of moral hazard.
no code implementations • 24 Jan 2019 • Kamil Nar, Orhan Ocal, S. Shankar Sastry, Kannan Ramchandran
We show that differential training can ensure a large margin between the decision boundary of the neural network and the points in the training dataset.
no code implementations • 3 Jan 2019 • Eric V. Mazumdar, Michael I. Jordan, S. Shankar Sastry
We propose local symplectic surgery, a two-timescale procedure for finding local Nash equilibria in two-player zero-sum games.
no code implementations • 13 Oct 2018 • Jaime F. Fisac, Eli Bronstein, Elis Stefansson, Dorsa Sadigh, S. Shankar Sastry, Anca D. Dragan
This mutual dependence, best captured by dynamic game theory, creates a strong coupling between the vehicle's planning and its predictions of other drivers' behavior, and constitutes an open problem with direct implications on the safety and viability of autonomous driving technology.
2 code implementations • NeurIPS 2018 • Kamil Nar, S. Shankar Sastry
To elucidate the effects of the step size on training of neural networks, we study the gradient descent algorithm as a discrete-time dynamical system, and by analyzing the Lyapunov stability of different solutions, we show the relationship between the step size of the algorithm and the solutions that can be obtained with this algorithm.
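To make the dynamical-systems view concrete, here is a small sketch on a scalar quadratic, which is not the setting analyzed in the paper. For the assumed objective f(x) = 0.5 * curvature * x^2, gradient descent is the linear map x_{k+1} = (1 - step_size * curvature) * x_k, so the minimizer at 0 is stable exactly when |1 - step_size * curvature| < 1. All names below are illustrative.

```python
def gradient_descent_iterates(curvature, step_size, x0=1.0, steps=50):
    """Run gradient descent on f(x) = 0.5 * curvature * x**2 and return all iterates."""
    x = x0
    history = [x]
    for _ in range(steps):
        grad = curvature * x          # f'(x) for the assumed quadratic
        x = x - step_size * grad      # one step of the discrete-time system
        history.append(x)
    return history

def is_stable(curvature, step_size):
    """Stability test for the fixed point x = 0 of the linear update map."""
    return abs(1.0 - step_size * curvature) < 1.0

# With curvature 4, the stability threshold for the step size is 2 / 4 = 0.5.
converging = gradient_descent_iterates(curvature=4.0, step_size=0.4)
diverging = gradient_descent_iterates(curvature=4.0, step_size=0.6)
```

In this toy case the step size determines whether the minimizer can be reached at all: below the threshold the iterates contract toward 0, above it they grow without bound.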
no code implementations • 16 Apr 2018 • Eric Mazumdar, Lillian J. Ratliff, S. Shankar Sastry
We formulate a general framework for competitive gradient-based learning that encompasses a wide breadth of multi-agent learning algorithms, and analyze the limiting behavior of competitive gradient-based learning algorithms using dynamical systems theory.
no code implementations • 14 Feb 2018 • Jaime F. Fisac, Chang Liu, Jessica B. Hamrick, S. Shankar Sastry, J. Karl Hedrick, Thomas L. Griffiths, Anca D. Dragan
We introduce $t$-predictability: a measure that quantifies the accuracy and confidence with which human observers can predict the remaining robot plan from the overall task goal and the observed initial $t$ actions in the plan.
no code implementations • 6 Feb 2018 • Chang Liu, Jessica B. Hamrick, Jaime F. Fisac, Anca D. Dragan, J. Karl Hedrick, S. Shankar Sastry, Thomas L. Griffiths
The study of human-robot interaction is fundamental to the design and use of robotics in real-world applications.
no code implementations • 20 Jul 2017 • Jaime F. Fisac, Monica A. Gates, Jessica B. Hamrick, Chang Liu, Dylan Hadfield-Menell, Malayandi Palaniappan, Dhruv Malik, S. Shankar Sastry, Thomas L. Griffiths, Anca D. Dragan
In robotics, value alignment is key to the design of collaborative robots that can integrate into human workflows, successfully inferring and adapting to their users' objectives as they go.
no code implementations • 18 Jul 2017 • Eric Mazumdar, Roy Dong, Vicenç Rúbies Royo, Claire Tomlin, S. Shankar Sastry
We formulate a multi-armed bandit (MAB) approach to choosing expert policies online in Markov decision processes (MDPs).
no code implementations • 27 Jun 2016 • Sanjit A. Seshia, Dorsa Sadigh, S. Shankar Sastry
Verified artificial intelligence (AI) is the goal of designing AI-based systems that have strong, ideally provable, assurances of correctness with respect to mathematically-specified requirements.
no code implementations • 25 Jul 2014 • Ehsan Elhamifar, Guillermo Sapiro, S. Shankar Sastry
The solution of our optimization finds representatives and the assignment of each element of the target set to each representative, hence, obtaining a clustering.
no code implementations • 8 Feb 2014 • Liansheng Zhuang, Tsung-Han Chan, Allen Y. Yang, S. Shankar Sastry, Yi Ma
In particular, the single-sample face alignment accuracy is comparable to that of the well-known Deformable SRC algorithm using multiple gallery images per class.
no code implementations • 7 Dec 2013 • Dorsa Sadigh, Henrik Ohlsson, S. Shankar Sastry, Sanjit A. Seshia
As in robust PCA, it can be problematic to find a suitable regularization parameter.
no code implementations • 20 Sep 2013 • Henrik Ohlsson, Tianshi Chen, Sina Khoshfetrat Pakazad, Lennart Ljung, S. Shankar Sastry
The number of hypotheses grows rapidly with the number of systems, and approximate solutions become a necessity for any problem of practical interest.
no code implementations • CVPR 2013 • Liansheng Zhuang, Allen Y. Yang, Zihan Zhou, S. Shankar Sastry, Yi Ma
To compensate for the missing illumination information typically provided by multiple training images, a sparse illumination transfer (SIT) technique is introduced.
no code implementations • 20 Mar 2013 • Henrik Ohlsson, Yonina C. Eldar, Allen Y. Yang, S. Shankar Sastry
The problem is of great importance in many applications and is typically solved by maximizing the cross-correlation between the two signals.
no code implementations • 21 Jul 2010 • Allen Y. Yang, Zihan Zhou, Arvind Ganesh, S. Shankar Sastry, Yi Ma
L1-minimization refers to finding the minimum L1-norm solution to an underdetermined linear system b=Ax.
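One standard way to compute the minimum L1-norm solution of b = Ax is to recast it as a linear program. The sketch below does this with SciPy's general-purpose `linprog` solver. It is not one of the fast algorithms surveyed in the paper, and the helper name `l1_minimize` is hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def l1_minimize(A, b):
    """Solve min ||x||_1 subject to A x = b via a linear program."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    n = A.shape[1]
    # Split x = u - v with u, v >= 0; at the optimum sum(u) + sum(v) equals ||x||_1.
    cost = np.ones(2 * n)
    A_eq = np.hstack([A, -A])
    result = linprog(cost, A_eq=A_eq, b_eq=b, bounds=(0, None), method="highs")
    if not result.success:
        raise ValueError(result.message)
    u, v = result.x[:n], result.x[n:]
    return u - v

# Underdetermined example: 2 equations, 3 unknowns.
A = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]
b = [1.0, 1.0]
x = l1_minimize(A, b)
```

For this small system every solution has the form (1 - s, s, 1 - s), and the L1 norm 2|1 - s| + |s| is smallest at s = 1, so the sparse solution (0, 1, 0) is recovered.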
1 code implementation • 21 Jul 2010 • Allen Y. Yang, Zihan Zhou, Arvind Ganesh, S. Shankar Sastry, Yi Ma
L1-minimization refers to finding the minimum L1-norm solution to an underdetermined linear system b=Ax.