Motion planning can be cast as a trajectory optimisation problem where a cost is minimised as a function of the trajectory being generated.
Bayesian optimisation (BO) algorithms have shown remarkable success in applications involving expensive black-box functions.
Stochastic model predictive control has been a successful and robust control framework for many robotics tasks where the system dynamics model is slightly inaccurate or in the presence of environment disturbances.
We propose an adaptive optimisation approach for tuning stochastic model predictive control (MPC) hyper-parameters while jointly estimating probability distributions of the transition model parameters based on performance rewards.
For the matrix normal model, all our bounds are minimax optimal up to logarithmic factors, and for the tensor normal model our bound for the largest factor and overall covariance matrix are minimax optimal up to constant factors provided there are enough samples for any estimator to obtain constant Frobenius error.
We consider the regret minimization problem in reinforcement learning (RL) in the episodic setting.
The resulting method is based on sparse spectrum Gaussian processes, enabling closed-form solutions, and is extensible to a stacked construction to capture more complex patterns.
Model predictive control (MPC) has been successful in applications involving the control of complex physical systems.
Accurate simulation of complex physical systems enables the development, testing, and certification of control strategies before they are deployed into the real systems.
Inverse problems are ubiquitous in natural sciences and refer to the challenging task of inferring complex and potentially multi-modal posterior distributions over hidden parameters given a set of observations.
In this context, we propose an upper confidence bound (UCB) algorithm for BO problems where both the outcome of a query and the true query location are uncertain.
On the other hand, the cost to evaluate the policy's performance might also be high, being desirable that a solution can be found with as few interactions as possible with the real system.
In outdoor environments, mobile robots are required to navigate through terrain with varying characteristics, some of which might significantly affect the integrity of the platform.