How can we train an assistive human-machine interface (e. g., an electromyography-based limb prosthesis) to translate a user's raw command signals into the actions of a robot or computer when there is no prior mapping, we cannot ask the user for supervision in the form of action labels or reward feedback, and we do not have prior knowledge of the tasks the user is trying to accomplish?
In the typing domain, we leverage backspaces as feedback that the interface did not perform the desired action.
Building assistive interfaces for controlling robots through arbitrary, high-dimensional, noisy inputs (e. g., webcam images of eye gaze) can be challenging, especially when it involves inferring the user's desired action in the absence of a natural 'default' interface.
Standard lossy image compression algorithms aim to preserve an image's appearance, while minimizing the number of bits needed to transmit it.
In the typing domain, we leverage backspaces as implicit feedback that the interface did not perform the desired action.
We evaluate ASE in a user study with 12 participants who each perform four tasks: two tasks with known user biases -- bandwidth-limited image classification and a driving video game with observation delay -- and two with unknown biases that our method has to learn -- guided 2D navigation and a lunar lander teleoperation video game.
To address this challenge, we propose an algorithm that safely and interactively learns a model of the user's reward function.
We learn a model of the user's preferences from observations of the user's choices in easy settings with a few robots, and use it in challenging settings with more robots to automatically identify which robot the user would most likely choose to control, if they were able to evaluate the states of all robots at all times.
Theoretically, we show that SQIL can be interpreted as a regularized variant of BC that uses a sparsity prior to encourage long-horizon imitation.
Learning to imitate expert actions given demonstrations containing image observations is a difficult problem in robotic control.
Inferring intent from observed behavior has been studied extensively within the frameworks of Bayesian inverse planning and inverse reinforcement learning.
In this paper, we present the Latent Skill Embedding (LSE), a probabilistic model of students and educational content that can be used to recommend personalized sequences of lessons with the goal of helping students prepare for specific assessments.