However, these methods often demand large-scale, high-quality counseling data, which are difficult to collect.
Reactive motion generation problems are usually solved by computing actions as a sum of policies.
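As a minimal sketch of this idea, the toy example below combines two hand-designed policies for a 2-D point robot, a goal attractor and an obstacle repulsor, and computes the action as their sum. The scenario (goal and obstacle positions, gains, the `attract`/`repel` names) is entirely hypothetical and not taken from the source.

```python
import numpy as np

# Hypothetical 2-D point robot. The action applied at each step is the
# sum of two simple reactive policies: attraction to a goal and
# repulsion from a nearby obstacle.
goal = np.array([1.0, 0.0])
obstacle = np.array([0.5, 0.05])

def attract(x):
    # Pull toward the goal with a gain of 1.
    return goal - x

def repel(x, radius=0.3):
    # Push away from the obstacle, but only inside its influence radius.
    d = x - obstacle
    dist = np.linalg.norm(d)
    if dist > radius:
        return np.zeros(2)
    return (radius - dist) * d / (dist + 1e-9)

def step(x, dt=0.05):
    # The reactive action is the sum of the individual policies.
    return x + dt * (attract(x) + repel(x))

x = np.array([0.0, 0.0])
for _ in range(300):
    x = step(x)
# x ends up near the goal, having been deflected around the obstacle.
```

Because each policy is defined independently, new behaviors (e.g. joint-limit avoidance) could be added as extra summands without touching the existing terms.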
Many IL methods, such as Dataset Aggregation (DAgger), combat challenges like distributional shift by interacting with oracular experts.
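The DAgger loop can be sketched on a toy problem: the learner's own rollouts are labeled by the expert, the labeled states are aggregated across iterations, and the policy is refit on the growing dataset. The 1-D environment, the linear expert, and the least-squares learner below are illustrative assumptions, not the algorithm's original setting.

```python
import numpy as np

def expert_policy(x):
    # Hypothetical oracle: drives the 1-D state toward zero.
    return -1.0 * x

def rollout(policy, x0=1.0, steps=20, dt=0.1):
    # Collect the states visited when running a given policy.
    xs, x = [], x0
    for _ in range(steps):
        xs.append(x)
        x = x + dt * policy(x)
    return np.array(xs)

def dagger(iters=5):
    X, Y = [], []   # aggregated dataset of (state, expert action) pairs
    w = 0.0         # linear learner u = w * x, initially a poor policy
    for _ in range(iters):
        # Key DAgger step: roll out the *learner*, not the expert,
        # so training data covers the states the learner actually visits.
        states = rollout(lambda x: w * x)
        X.extend(states)
        Y.extend(expert_policy(states))
        # Refit on all data gathered so far (closed-form least squares).
        Xa, Ya = np.array(X), np.array(Y)
        w = float(Xa @ Ya / (Xa @ Xa))
    return w

w = dagger()  # converges to the expert's gain of -1.0
```

Aggregating data from the learner's own state distribution is what mitigates the distributional shift that plain behavior cloning suffers from.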
Each player allocates his limited attention capacity between biased sources and the other players, and the resulting stochastic attention network facilitates the transmission of information from primary sources to him either directly or indirectly through the other players.
Using RMPflow as a structured policy class in learning has several benefits, such as sufficient expressiveness, the flexibility to inject different levels of prior knowledge, and the ability to transfer policies between robots.
The policy structure provides the user an interface to 1) specify the spaces that are directly relevant to the completion of the tasks, and 2) design policies for certain tasks that do not need to be learned.
We study a model of electoral accountability and selection (EAS) in which heterogeneous voters can aggregate the incumbent's performance data into personalized signals by paying limited attention.
The complex motions are encoded as rollouts of a stable dynamical system, which, under a change of coordinates defined by a diffeomorphism, is equivalent to a simple, hand-specified dynamical system.
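A 1-D sketch of this construction: a simple, hand-specified stable latent system ẏ = -y is pulled back through a diffeomorphism y = φ(x), giving ẋ = φ'(x)⁻¹(-φ(x)); rollouts of the pulled-back system inherit the latent system's stability while tracing more complex paths. The particular map φ(x) = x + tanh(x) is an illustrative assumption (it is smooth, strictly increasing, and fixes the origin), not the diffeomorphism used in the source.

```python
import numpy as np

# Hand-specified latent dynamics: y' = -y (globally stable at y = 0).
# Hypothetical diffeomorphism: y = phi(x) = x + tanh(x), phi'(x) > 0.
phi = lambda x: x + np.tanh(x)
dphi = lambda x: 1.0 + 1.0 / np.cosh(x) ** 2

def rollout(x0, steps=200, dt=0.05):
    # Pull the latent vector field back through phi:
    #   x' = phi'(x)^{-1} * (-phi(x))
    # so that phi(x(t)) follows the simple latent system exactly.
    x = x0
    for _ in range(steps):
        x = x + dt * (-phi(x) / dphi(x))
    return x

x_final = rollout(2.0)  # converges toward the equilibrium at x = 0
```

Stability is guaranteed by construction: since φ is a bijection with φ(0) = 0, the pulled-back system has the same single attractor as the latent one, regardless of how expressive φ is made.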
In addition to regular high- and low-competence types, the incumbent may be an aspiring autocrat who controls the mainstream media and will subvert democracy if retained in office.
We propose a collection of RMPs for simple multi-robot tasks that can be used for building controllers for more complicated tasks.