Reinforcement learning (RL) is a powerful tool for finding optimal policies in sequential decision processes.
Robust Markov decision processes (MDPs) address the challenge of model uncertainty by optimizing the worst-case performance over an uncertainty set of MDPs.
To measure the robustness of the predicted structures, we utilize (i) the root-mean-square deviation (RMSD) and (ii) the Global Distance Test (GDT) similarity measure between the predicted structure of the original sequence and the structure of its adversarially perturbed version.
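As a rough illustration of the two measures, the sketch below computes RMSD and a simplified GDT-style score. It assumes both structures are given as N×3 coordinate arrays that are already superimposed and aligned residue by residue. The function names and the 1, 2, 4, and 8 Å cutoffs (the usual GDT_TS convention) are assumptions of this example, and a full GDT computation would additionally search over superpositions.

```python
import numpy as np

def rmsd(a, b):
    """Root-mean-square deviation between two (N, 3) coordinate arrays."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    sq_dist = np.sum((a - b) ** 2, axis=1)  # squared distance per residue
    return float(np.sqrt(sq_dist.mean()))

def gdt_ts_fixed(a, b, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS: mean fraction of residues within each cutoff.

    Assumes a fixed superposition; no search over alignments is done.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    dist = np.linalg.norm(a - b, axis=1)
    return float(np.mean([(dist <= c).mean() for c in cutoffs]))
```

A low RMSD together with a GDT score close to 1 would indicate that the perturbed sequence yields nearly the same predicted structure.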
We derive the robust Bellman equation for robust average-reward MDPs, prove that the optimal policy can be derived from its solution, and further design a robust relative value iteration algorithm that provably finds its solution, or equivalently, the optimal robust policy.
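The algorithm itself is not reproduced in this text. The following is a minimal sketch of a robust relative value iteration under one concrete assumption: a contamination-type uncertainty set, in which each nominal transition row $p$ may be replaced by $(1-\delta)p + \delta q$ for an arbitrary distribution $q$. Under that assumption the worst-case expectation of a vector $w$ is $(1-\delta)\,p^\top w + \delta \min_s w(s)$. The function name `robust_rvi`, the parameter `delta`, and the choice of reference state are hypothetical and serve only this illustration.

```python
import numpy as np

def robust_rvi(P0, r, delta, iters=500, ref=0):
    """Sketch of robust relative value iteration (contamination model).

    P0:    (S, A, S) nominal transition kernel
    r:     (S, A) reward table
    delta: contamination level in [0, 1) (assumed uncertainty model)
    Returns (gain estimate, relative value function, greedy policy).
    """
    S, A, _ = P0.shape
    V = np.zeros(S)
    for _ in range(iters):
        w = V - V[ref]  # subtract the reference state's value
        # Worst-case expected next value over the contamination set.
        worst = (1.0 - delta) * (P0 @ w) + delta * w.min()
        Q = r + worst          # robust Bellman backup, shape (S, A)
        V = Q.max(axis=1)
    gain = float(V[ref])       # at a fixed point, V[ref] equals the gain
    return gain, V - V[ref], Q.argmax(axis=1)

# Two states; action 0 stays, action 1 switches; only state 0 pays reward 1.
P0 = np.zeros((2, 2, 2))
P0[0, 0, 0] = P0[0, 1, 1] = P0[1, 0, 1] = P0[1, 1, 0] = 1.0
r = np.array([[1.0, 1.0], [0.0, 0.0]])
gain, h, policy = robust_rvi(P0, r, delta=0.5)
```

In this toy example the adversary can redirect half of the transition mass to the unrewarded state, so the robust gain drops below the nominal value of 1.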
The success of reinforcement learning in typical settings is predicated on Markovian assumptions on the reward signal by which an agent learns optimal policies.
Given a Markov decision process (MDP) and a linear-time ($\omega$-regular or LTL) specification, the controller synthesis problem aims to compute the optimal policy that satisfies the specification.
The problem of representative selection amounts to sampling a few informative exemplars from large datasets.
We propose a new way of thinking about deep neural networks, in which the linear and non-linear components of the network are naturally derived and justified in terms of principles in probability theory.
Uniform random node sampling is shown to improve the computational complexity over clustering of the full graph when the cluster sizes are balanced.
An important problem in training deep networks with high capacity is to ensure that the trained network works well when presented with new inputs outside the training dataset.
To the best of our knowledge, this is the first provable robust PCA algorithm that is simultaneously non-iterative, tolerant of a large number of outliers, and robust to linearly dependent outliers.
Random column sampling is not guaranteed to yield data sketches that preserve the underlying structures of the data and may not sample sufficiently from less-populated data clusters.
Our approach hinges on the sparse approximation of a sparsely corrupted column: the sparse expansion of a column with respect to the other data points is used to distinguish a sparsely corrupted inlier column from an outlying data point.
Conventional sampling techniques fall short of drawing descriptive sketches of the data when the data is grossly corrupted, as such corruptions break the low-rank structure required for them to perform satisfactorily.
As inliers lie in a low-dimensional subspace and are mostly correlated, an inlier is likely to have strong mutual coherence with a large number of data points.
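One way to make this observation concrete is to score each data point by its coherence with all the others. The sketch below is an illustration under stated assumptions, not a reproduction of the method: it normalizes the columns, forms the Gram matrix, and uses the $\ell_2$ norm of each row (diagonal excluded) as the score. The function name `coherence_scores` and the synthetic data are hypothetical.

```python
import numpy as np

def coherence_scores(X):
    """Score each column of X (d, n) by its coherence with the others."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=0)
    U = X / np.where(norms > 0, norms, 1.0)  # unit-norm columns
    G = U.T @ U                              # pairwise inner products
    np.fill_diagonal(G, 0.0)                 # ignore self-coherence
    return np.linalg.norm(G, axis=1)         # one score per data point

# Synthetic check: 50 inliers in a 2-D subspace of R^50, plus 5 outliers.
rng = np.random.default_rng(0)
basis = rng.standard_normal((50, 2))
inliers = basis @ rng.standard_normal((2, 50))
outliers = rng.standard_normal((50, 5))
X = np.hstack([inliers, outliers])

scores = coherence_scores(X)
lowest = np.argsort(scores)[:5]  # candidates for outliers
```

Points with the smallest scores are the natural outlier candidates, while the highest-scoring columns could be used to span the inlier subspace.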
This paper presents a new approach dubbed Innovation Pursuit (iPursuit) to the problem of subspace clustering using a new geometrical idea whereby subspaces are identified based on their relative novelties.
A new class of formal latent-variable stochastic processes called hidden quantum models (HQMs) is defined in order to clarify the theoretical foundations of ion channel signal processing.
This paper explores and analyzes two randomized designs for robust Principal Component Analysis (PCA) employing low-dimensional data sketching.
In this paper, a scalable subspace-pursuit approach that transforms the decomposition problem to a subspace learning problem is proposed.
These mutual information expressions unify conditions for both linear and nonlinear observations.