However, whether an SAE is actually able to track objects in the scene and thus yields a spatial state representation well suited for RL tasks has rarely been examined due to a lack of established metrics.
We find that in the case of trajectory control, the standard model-based RL formulation used in approaches like PETS-MPPI and MBPO is not suitable.
In this paper, we propose a new approach to hybrid modeling, where we inform the latent states of a learned model via a black-box simulator.
Convolutional neural networks (CNNs) have become a common choice for industrial quality control, as well as for other critical applications in Industry 4.0.
Accurate simulation of deformable linear object (DLO) dynamics is challenging if the task at hand requires a human-interpretable model that also yields fast predictions.
Subsequently, we demonstrate the potential of CE methods by applying them in three industrial use cases.
Model-free learning of communication and control policies provides an alternative.
Our method requires a single evaluation of the NN and forward integration of the input sequence online, which is fast to compute on resource-constrained systems.
Machine learning and deep learning have been used extensively to classify physical surfaces through images and time-series contact data.
Kernel methods, being supported by a well-developed theory and coming with efficient algorithms, are among the most popular and successful machine learning techniques.
This filtering technique combines two signals by applying a high-pass filter to one signal, and low-pass filtering the other.
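The split described above can be sketched in a few lines. This is a minimal, hypothetical illustration (function and variable names are my own, not from the paper): one channel is low-pass filtered to retain the long-term trend, the other is high-pass filtered (here via its increments) to retain short-term dynamics, and the two are blended with a single coefficient.

```python
def complementary_filter(slow_signal, fast_increments, alpha=0.98):
    """Fuse two signal streams sample by sample.

    `slow_signal` is trusted at low frequencies (it is low-pass
    filtered); `fast_increments` are trusted at high frequencies
    (integrating them acts as a high-pass on the underlying signal).
    `alpha` close to 1 weights the fast channel more strongly.
    """
    fused = []
    estimate = slow_signal[0]
    for s, df in zip(slow_signal, fast_increments):
        # high-pass contribution: propagate the estimate with the
        # fast increment; low-pass contribution: pull toward the
        # slow signal with weight (1 - alpha)
        estimate = alpha * (estimate + df) + (1.0 - alpha) * s
        fused.append(estimate)
    return fused
```

A typical use is attitude estimation, where an accelerometer-derived angle plays the role of the slow signal and integrated gyroscope rates provide the fast increments.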
Establishing these properties is challenging, especially when no analytical model is available and they are to be inferred directly from measurement data.
The results demonstrate that ET-GP-UCB is readily applicable without prior knowledge on the rate of change.
However, in practice, many systems also exhibit uncertainty in the form of changes over time, e.g., due to weight shifts or wear and tear, leading to decreased performance or instability of the learning-based controller.
Current TVBO methods do not explicitly account for these properties, resulting in poor tuning performance and many unstable controllers through over-exploration of the parameter space.
Distributed model predictive control (DMPC) is often used to tackle path planning for unmanned aerial vehicle (UAV) swarms.
Further, we introduce a process for the validation of concept-extraction techniques based on synthetic datasets with pixel-wise annotations of their main components, reducing the need for human intervention.
Questions in causality, control, and reinforcement learning go beyond the classical machine learning task of prediction under i.i.d. data.
Learning optimal control policies directly on physical systems is challenging since even a single failure can lead to costly hardware damage.
When learning policies for robotic systems from data, safety is a major concern, as violation of safety constraints may cause hardware damage.
Although it is often not possible to compute the minimum required penalty, we reveal clear structure of how the penalty, rewards, discount factor, and dynamics interact.
In particular, we discuss the family of constraints that enforce safety in the context of a nominal control policy, and expose that these constraints do not need to be accurate everywhere.
Probabilistic models such as Gaussian processes (GPs) are powerful tools to learn unknown dynamical systems from data for subsequent use in control design.
The combination of machine learning with control offers many opportunities, in particular for robust control.
However, these estimates are of a Bayesian nature, whereas for some important applications, like learning-based control with safety guarantees, frequentist uncertainty bounds are required.
An important class of cyber-physical systems relies on multiple agents that jointly perform a task by coordinating their actions over a wireless network.
On the other hand, classical numerical integrators are specifically designed to preserve these crucial properties through time.
Accurate models of mechanical system dynamics are often critical for model-based control and reinforcement learning.
We consider failing behaviors as those that violate a constraint and address the problem of learning with crash constraints, where no data is obtained upon constraint violation.
We present a framework for model-free learning of event-triggered control strategies.
In this paper, we propose a method that identifies the causal structure of control systems.
When learning to ride a bike, a child falls down a number of times before achieving the first success.
Evaluating whether data streams are drawn from the same distribution is at the heart of various machine learning problems.
Despite the availability of ever more data enabled through modern sensor and computer technology, it still remains an open problem to learn dynamical systems in a sample-efficient way.
In reinforcement learning (RL), an autonomous agent learns to perform complex tasks by maximizing an exogenous reward signal while interacting with its environment.
While safety can only be guaranteed after learning the safety measure, we show that failures can already be greatly reduced by using the estimated measure during learning.
Learning robot controllers by minimizing a black-box objective cost using Bayesian optimization (BO) can be time-consuming and challenging.
Policy gradient methods are powerful reinforcement learning algorithms and have been demonstrated to solve many complex tasks.
Bayesian optimization is proposed for automatic learning of optimal controller parameters from experimental data.
Event-triggered control (ETC) methods can achieve high-performance control with significantly fewer samples than conventional time-triggered methods.
Soft microrobots based on photoresponsive materials and controlled by light fields can generate a variety of different gaits.
A supervised learning framework is proposed to approximate a model predictive controller (MPC) with reduced computational complexity and guarantees on stability and constraint satisfaction.
Apart from its application for encoding a sequence of observations, we propose to use the compression achieved by this encoding as a criterion for model selection.
Common event-triggered state estimation (ETSE) algorithms save communication in networked control systems by predicting agents' behavior, and transmitting updates only when the predictions deviate significantly.
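The prediction-based transmission logic can be illustrated with a toy simulation. This is a hedged sketch of the general ETSE idea, not the algorithm of any specific paper: sender and receiver run the same simple (here constant-velocity) predictor, and the sender transmits the true state only when the prediction error exceeds a threshold.

```python
def simulate_etse(states, delta=0.5):
    """Simulate event-triggered transmission over a scalar state sequence.

    Returns (receiver_estimates, number_of_transmissions). The
    threshold `delta` trades estimation accuracy for communication.
    """
    estimates, sent = [], 0
    prediction = states[0]  # receiver's current estimate
    velocity = 0.0
    last = states[0]
    for x in states:
        if abs(x - prediction) > delta:
            # prediction deviated significantly: transmit the state
            prediction = x
            sent += 1
        estimates.append(prediction)
        # both sides update the shared constant-velocity predictor
        velocity = prediction - last
        last = prediction
        prediction = prediction + velocity
    return estimates, sent
```

For a constant state, no transmissions are needed; for a steady ramp, a single transmission lets the predictor lock onto the trend, illustrating how prediction saves communication.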
State-space models (SSMs) are a highly expressive model class for learning patterns in time series data and for system identification.
In practice, the parameters of control policies are often tuned manually.
With this framework, an initial set of controller gains is automatically improved according to a pre-defined performance objective evaluated from experimental data.
To address this issue, we show how a recently published robustification method for Gaussian filters can be applied to the problem at hand.
The contribution of this paper is to show that any Gaussian filter can be made compatible with fat-tailed sensor models by one simple change: instead of filtering with the physical measurement, we filter with a pseudo measurement obtained by applying a feature function to the physical measurement.
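The pseudo-measurement idea can be sketched for a scalar Kalman update. The saturating feature function used here is my own illustrative choice (the paper's feature function may differ), and all names are hypothetical: by clamping the innovation before the update, a fat-tailed outlier cannot drag the estimate arbitrarily far.

```python
import math


def robust_kalman_update(mean, var, y, meas_var, clip=3.0):
    """One Kalman measurement update with a saturated pseudo measurement.

    The raw innovation y - mean is clamped to `clip` innovation
    standard deviations, which is equivalent to replacing y by a
    pseudo measurement phi(y) before the standard update.
    """
    std = math.sqrt(var + meas_var)  # innovation standard deviation
    innovation = y - mean
    # feature function: saturate the innovation at +/- clip * std
    innovation = max(-clip * std, min(clip * std, innovation))
    gain = var / (var + meas_var)  # standard scalar Kalman gain
    new_mean = mean + gain * innovation
    new_var = (1.0 - gain) * var
    return new_mean, new_var
```

For inlier measurements the update coincides with the ordinary Kalman update; for a gross outlier the correction is bounded, which is the robustness property the pseudo measurement buys.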