To improve the stability of the learning-based policy and efficiency of exploration, we utilize an imitation loss based on the state-of-the-art classical control policy.
We also compare our method with recursive least squares and the particle filter, and show that our technique produces significantly more accurate point estimates and a lower tracking error of the value of interest.
The proposed architecture divides human motion prediction into two parts: 1) the human trajectory, which is the 3D position of the hip joint over time, and 2) the human pose, which is the 3D positions of all other joints over time with respect to a fixed hip joint.
We show that our system outperforms the state of the art in human motion prediction and can predict diverse multi-motion future trajectories with hip movements.
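The trajectory/pose split described above can be illustrated with a minimal sketch. It assumes each frame is a dictionary mapping joint names to world-frame 3D positions; the joint names and the helper functions `split_motion` and `merge_motion` are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]
Frame = Dict[str, Vec3]  # joint name -> 3D position in world coordinates


def split_motion(frames: List[Frame], hip: str = "hip"):
    """Split a motion sequence into a hip trajectory and hip-relative poses."""
    trajectory: List[Vec3] = []
    poses: List[Frame] = []
    for frame in frames:
        hx, hy, hz = frame[hip]
        trajectory.append((hx, hy, hz))
        # Express every other joint relative to the hip of the same frame.
        poses.append({
            name: (x - hx, y - hy, z - hz)
            for name, (x, y, z) in frame.items()
            if name != hip
        })
    return trajectory, poses


def merge_motion(trajectory: List[Vec3], poses: List[Frame], hip: str = "hip"):
    """Inverse of split_motion: recombine into world-coordinate frames."""
    frames: List[Frame] = []
    for (hx, hy, hz), pose in zip(trajectory, poses):
        frame: Frame = {hip: (hx, hy, hz)}
        for name, (x, y, z) in pose.items():
            frame[name] = (x + hx, y + hy, z + hz)
        frames.append(frame)
    return frames
```

Under this representation, a trajectory predictor and a pose predictor could be trained separately and their outputs recombined with `merge_motion`.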
Robots and artificial agents that interact with humans should be able to do so without bias and inequity, but facial perception systems have notoriously been found to work more poorly for certain groups of people than others.
This paper introduces OptimizedDP, a high-performance software library that solves time-dependent Hamilton-Jacobi partial differential equations (PDEs), computes backward reachable sets with applications in robotics, and implements value iteration for continuous action-state space Markov decision processes (MDPs). It retains the user-friendliness of Python for specifying different problems without sacrificing the efficiency of the core computation.
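For readers unfamiliar with value iteration, the following is a minimal, generic sketch on a small discretized problem. It does not use or reproduce the OptimizedDP API; the function `value_iteration` and the toy one-dimensional MDP (five grid cells, a goal at the right end, a cost of 1 per step) are assumptions chosen purely for illustration.

```python
def value_iteration(n_states, actions, step, reward,
                    gamma=0.95, tol=1e-8, max_iters=10_000):
    """Generic value iteration for a deterministic, finite MDP."""
    values = [0.0] * n_states
    for _ in range(max_iters):
        # Bellman backup: best one-step reward plus discounted next value.
        new_values = [
            max(reward(s, a) + gamma * values[step(s, a)] for a in actions)
            for s in range(n_states)
        ]
        delta = max(abs(n - o) for n, o in zip(new_values, values))
        values = new_values
        if delta < tol:
            break
    return values


# Toy problem (hypothetical): a line of 5 cells, goal at the right end.
N_STATES = 5
GOAL = N_STATES - 1
ACTIONS = (-1, 0, 1)  # move left, stay, move right


def step(s, a):
    """Deterministic transition; the goal is absorbing, walls clamp motion."""
    if s == GOAL:
        return s
    return min(max(s + a, 0), N_STATES - 1)


def reward(s, a):
    """Cost of 1 per step until the goal is reached."""
    return 0.0 if s == GOAL else -1.0


values = value_iteration(N_STATES, ACTIONS, step, reward, gamma=0.9)
```

A continuous state-action problem would first be discretized onto a grid, which is where the cost of the core computation grows quickly with dimension.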
Multi-agent policy gradient methods have demonstrated success in games and robotics but are often limited to problems with low-level action spaces.
Our method shows at least a 70\% improvement in parameter point estimation accuracy and an approximately 55\% reduction in tracking error of the value of interest compared to recursive least squares and conventional MCMC.
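For context on the recursive least squares baseline, here is a minimal textbook sketch of the estimator for a model that is linear in its parameters, y = phi^T theta. It is not the implementation used in the comparison; the class name `RecursiveLeastSquares`, the forgetting factor, and the initial covariance are assumptions made for illustration.

```python
class RecursiveLeastSquares:
    """Textbook RLS for a scalar output y = phi^T theta (illustrative only)."""

    def __init__(self, n_params, forgetting=1.0, initial_covariance=1e6):
        self.theta = [0.0] * n_params
        # A large initial covariance encodes low confidence in theta = 0.
        self.P = [
            [initial_covariance if i == j else 0.0 for j in range(n_params)]
            for i in range(n_params)
        ]
        self.forgetting = forgetting

    def update(self, phi, y):
        n = len(self.theta)
        # P @ phi
        P_phi = [sum(self.P[i][j] * phi[j] for j in range(n)) for i in range(n)]
        denom = self.forgetting + sum(phi[i] * P_phi[i] for i in range(n))
        gain = [v / denom for v in P_phi]
        # Prediction error with the current estimate.
        error = y - sum(phi[i] * self.theta[i] for i in range(n))
        self.theta = [self.theta[i] + gain[i] * error for i in range(n)]
        # P is symmetric, so phi^T P equals (P phi)^T.
        self.P = [
            [(self.P[i][j] - gain[i] * P_phi[j]) / self.forgetting
             for j in range(n)]
            for i in range(n)
        ]
        return self.theta


# Usage on noise-free data from y = 2 x - 0.5 (hypothetical example).
rls = RecursiveLeastSquares(n_params=2)
for x in range(10):
    rls.update([float(x), 1.0], 2.0 * x - 0.5)
```

RLS assumes the model is linear in the unknown parameters, which is one reason nonlinear contact models are a demanding test for it.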
We propose a vector-quantized variational autoencoder structure as well as training techniques to learn a rigorous representation of gesture sequences.
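The quantization step at the core of a vector-quantized autoencoder can be sketched as a nearest-neighbor lookup in a codebook. The snippet below is a minimal illustration only: the function `quantize`, the two-dimensional latents, and the three-entry codebook are hypothetical, and the encoder, decoder, and training losses are omitted.

```python
def quantize(latents, codebook):
    """Map each latent vector to its nearest codebook entry (squared L2)."""
    indices = []
    quantized = []
    for z in latents:
        distances = [
            sum((zi - ci) ** 2 for zi, ci in zip(z, code))
            for code in codebook
        ]
        # Index of the closest codebook vector.
        k = min(range(len(codebook)), key=distances.__getitem__)
        indices.append(k)
        quantized.append(list(codebook[k]))
    return indices, quantized


# Hypothetical codebook and encoder outputs.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
latents = [[0.9, 1.2], [-0.1, 0.2], [-0.8, 0.7]]
indices, quantized = quantize(latents, codebook)
```

Each gesture sequence is then represented by a sequence of discrete codebook indices rather than by continuous latent vectors.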
We demonstrate our approach on a challenging benchmark: estimation of parameters in the Hunt-Crossley dynamic model, which models both on/off contact forces applied to soft materials.
Our deep RL module implicitly estimates human trajectory and produces short-term navigational goals to guide the robot.
At the other end, "classical methods" such as optimal control generate solutions without collecting data, but they assume that an accurate model of the system and environment is known and are mostly limited to problems with low-dimensional (lo-dim) state spaces.
This article describes a dataset collected in a set of experiments that involve human participants and a robot.
In Bansal et al. (2019), a novel visual navigation framework that combines learning-based and model-based approaches has been proposed.
Sequence labeling of biomedical entities, e.g., side effects or phenotypes, has been a long-standing task in the BioNLP and MedNLP communities.
Our Backward Reachability Curriculum (BaRC) begins policy training from states that require a small number of actions to accomplish the task, and expands the initial state distribution backwards in a dynamically-consistent manner once the policy optimization algorithm demonstrates sufficient performance.
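The curriculum idea can be sketched on a discrete toy system. This is a simplified analogy rather than the BaRC algorithm itself: it replaces approximate backward reachable sets of a continuous dynamics model with one-step predecessors in a graph, and the callables `predecessors`, `train`, and `success_rate`, as well as the threshold value, are hypothetical placeholders.

```python
def expand_backward(start_states, predecessors):
    """Add every state that can reach the current start set in one step."""
    expanded = set(start_states)
    for s in start_states:
        expanded.update(predecessors(s))
    return expanded


def backward_curriculum(goal_states, predecessors, train, success_rate,
                        threshold=0.8, max_stages=100):
    """Grow the set of training start states backwards from the goal."""
    starts = expand_backward(set(goal_states), predecessors)
    history = [set(starts)]
    for _ in range(max_stages):
        train(starts)  # one round of policy optimization from these starts
        if success_rate(starts) < threshold:
            continue  # not good enough yet: keep training on the same set
        expanded = expand_backward(starts, predecessors)
        if expanded == starts:
            break  # the start set already covers every reachable state
        starts = expanded
        history.append(set(starts))
    return history


# Toy system (hypothetical): cells 0..4 on a line, the agent moves right only.
def line_predecessors(s):
    return {s - 1} if s > 0 else set()


history = backward_curriculum(
    goal_states={4},
    predecessors=line_predecessors,
    train=lambda starts: None,          # stand-in for a policy update
    success_rate=lambda starts: 1.0,    # stand-in for policy evaluation
)
```

Because new start states are always predecessors of states the policy already handles, each stage only asks the policy to solve a slightly longer task.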
Hamilton-Jacobi (HJ) reachability analysis is an important formal verification method for guaranteeing performance and safety properties of dynamical systems; it has been applied to many small-scale systems in the past decade.
We propose a new algorithm, FaSTrack: Fast and Safe Tracking for High Dimensional Systems.
To sidestep the curse of dimensionality when computing solutions to Hamilton-Jacobi-Bellman partial differential equations (HJB PDEs), we propose an algorithm that leverages a neural network to approximate the value function.