Weight norm $\|w\|$ and margin $\gamma$ participate in learning theory via the normalized margin $\gamma/\|w\|$.
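As a minimal sketch of the quantity named above, the snippet below computes $\gamma/\|w\|$ for a linear classifier. It assumes that $\gamma$ is the smallest signed score $y\,(w \cdot x + b)$ over a dataset; that definition, the function name `normalized_margin`, and the toy data are illustrative assumptions, not taken from the paper.

```python
import math

def normalized_margin(w, b, xs, ys):
    """Return (gamma, norm, gamma / norm) for the linear classifier sign(w.x + b).

    Assumption: gamma is the smallest signed score over the dataset.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    if norm == 0.0:
        raise ValueError("weight vector must be nonzero")
    gamma = min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        for x, y in zip(xs, ys)
    )
    return gamma, norm, gamma / norm

# Hypothetical toy data: three points with labels in {+1, -1}.
w = [3.0, 4.0]
xs = [[1.0, 1.0], [-1.0, -1.0], [2.0, 0.0]]
ys = [1, -1, 1]

gamma, norm, ratio = normalized_margin(w, 0.0, xs, ys)

# Rescaling w changes gamma and the norm together, but not their ratio.
w_scaled = [10.0 * wi for wi in w]
gamma_s, norm_s, ratio_s = normalized_margin(w_scaled, 0.0, xs, ys)
```

With the bias fixed at zero, rescaling the weights multiplies both $\gamma$ and $\|w\|$ by the same factor, which is why the ratio rather than either quantity alone is the scale-free measure.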
Existing design paradigms do not address the gap between theory (controller design with continuous time models) and practice (the discrete time sampled implementation of the resulting controllers); this can lead to poor performance and violations of safety for hardware instantiations.
We present MLNav, a learning-enhanced path planning framework for safety-critical and resource-limited systems operating in complex environments, such as rovers navigating on Mars.
Our approach, called LyaNet, is based on a novel Lyapunov loss formulation that encourages the inference dynamics to converge quickly to the correct prediction.
We propose a method for learning the posture and structure of agents from unlabelled behavioral videos.
In this paper, we study the problem of blind inversion: solving an inverse problem with unknown or imperfect knowledge of the forward model parameters.
We evaluate AutoSWAP in three behavior analysis domains and demonstrate that AutoSWAP outperforms existing approaches using only a fraction of the data.
For each layer, we also achieve higher accuracy when the overall accuracy is kept fixed across different methods.
Do neural networks generalise because of bias in the functions returned by gradient descent, or bias already present in the network architecture?
We present a framework for the unsupervised learning of neurosymbolic encoders, i.e., encoders obtained by composing neural networks with symbolic programs from a domain-specific language.
We provide instantiations of our approach under varying conditions, leading to the first non-asymptotic end-to-end convergence guarantee for multi-task nonlinear control.
Hand-annotated data can vary due to factors such as subjective differences, intra-rater variability, and differing annotator expertise.
We study the problem of sparse nonlinear model recovery of high dimensional compositional functions.
In addition, we compare our learned approach against Gurobi, a state-of-the-art MIP solver, demonstrating that our method can be used to improve solver performance.
In this paper, we leverage the sequential nature of MRI measurements, and propose a fully differentiable framework that jointly learns a sequential sampling policy simultaneously with a reconstruction strategy.
1 code implementation • 6 Apr 2021 • Jennifer J. Sun, Tomomi Karigo, Dipam Chakraborty, Sharada P. Mohanty, Benjamin Wild, Quan Sun, Chen Chen, David J. Anderson, Pietro Perona, Yisong Yue, Ann Kennedy
Multi-agent behavior modeling aims to understand the interactions that occur between agents.
We present a straightforward and efficient way to control unstable robotic systems using an estimated dynamics model.
The combination of high diversity and limited data calls for new learning methods that are robust and invariant to operating conditions and surgical techniques.
To address this problem, this paper conducts a combined study of neural architecture and optimisation, leading to a new optimiser called Nero: the neuronal rotator.
Thus, we provide a principled approach to tackling the joint problem of causal discovery and latent variable inference.
In particular, we focus on a class of combinatorial problems that can be solved via submodular maximization (either directly on the objective function or via submodular surrogates).
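For readers unfamiliar with submodular maximization, the sketch below shows the standard greedy heuristic on a set-coverage objective under a cardinality limit. It is a textbook illustration only, not the learned method of the paper; the names `coverage` and `greedy_max` and the toy sets are hypothetical.

```python
def coverage(selected, sets):
    """Number of distinct elements covered by the chosen sets (monotone submodular)."""
    covered = set()
    for i in selected:
        covered |= sets[i]
    return len(covered)

def greedy_max(sets, k):
    """Pick up to k set indices, each time adding the one with the largest marginal gain."""
    selected = []
    for _ in range(k):
        base = coverage(selected, sets)
        best, best_gain = None, 0
        for i in range(len(sets)):
            if i in selected:
                continue
            gain = coverage(selected + [i], sets) - base
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:
            # No remaining set adds any new element.
            break
        selected.append(best)
    return selected

# Hypothetical toy instance: four candidate sets, budget of two.
sets = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1, 7}]
chosen = greedy_max(sets, 2)
total = coverage(chosen, sets)
```

For monotone submodular objectives with a cardinality constraint, this greedy rule is known to achieve at least a $(1 - 1/e)$ fraction of the optimal value, which is one reason the problem class is attractive as a target for data-driven algorithm design.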
We present Neural-Swarm2, a learning-based method for motion planning and control that allows heterogeneous multirotors in a swarm to safely fly in close proximity.
The tasks in our method can be efficiently engineered by domain experts through a process we call "task programming", which uses programs to explicitly encode structured knowledge from domain experts.
Modern nonlinear control theory seeks to endow systems with properties such as stability and safety, and has been deployed successfully across various domains.
To facilitate the study of early multimodal fusion, we create a convolutional LSTM network architecture that simultaneously processes both audio and visual inputs, and allows us to select the layer at which audio and visual information combines.
Enhanced AutoNav (ENav), the baseline surface navigation software for NASA's Perseverance rover, sorts a list of candidate paths for the rover to traverse, then uses the Approximate Clearance Evaluation (ACE) algorithm to evaluate whether the most highly ranked paths are safe.
ROIAL learns Bayesian posteriors that predict each exoskeleton user's utility landscape across four exoskeleton gait parameters.
Within this sparse, binary paradigm we sample many binary architectures to create families of architecture-agnostic neural networks not trained via backpropagation.
Policy networks are a central feature of deep reinforcement learning (RL) algorithms for continuous control, enabling the estimation and sampling of high-value actions.
We detect such domain shifts through the use of a binary domain classifier and integrate it with the task network and train them jointly end-to-end.
This formulation motivates the use of two neural networks that are jointly trained --- a discriminative network between the source and target domains for density-ratio estimation, in addition to the standard classification network.
This relaxed program is differentiable and can be trained end-to-end, and the resulting training loss is an approximately admissible heuristic that can guide the combinatorial search.
We present a systematic investigation using graph neural networks (GNNs) to model organic chemical reactions.
On the other hand, more sample efficient alternatives like Bayesian quadrature methods have received little attention due to their high computational complexity.
We investigate the average teaching complexity of the task, i.e., the minimal number of samples (halfspace queries) required by a teacher to help a version-space learner in locating a randomly selected target.
This paper proves that multiplicative weight updates satisfy a descent lemma tailored to compositional functions.
A core challenge in policy optimization in competitive Markov decision processes is the design of efficient optimization methods with desirable convergence and stability properties.
The Info-SNOC algorithm is used to compute a sub-optimal pool of safe motion plans that aid in exploration for learning unknown residual dynamics under safety constraints.
This paper studies a strategy for data-driven algorithm design for large-scale combinatorial optimization problems that can leverage existing state-of-the-art solvers in general purpose ways.
Optimizing lower-body exoskeleton walking gaits for user comfort requires understanding users' preferences over a high-dimensional gait parameter space.
We present GLAS: Global-to-Local Autonomy Synthesis, a provably-safe, automated distributed policy generation for multi-robot motion planning.
Efficient and interpretable spatial analysis is crucial in many fields such as geology, sports, and climate science.
Modern nonlinear control theory seeks to endow systems with properties of stability and safety, and has been deployed successfully in multiple domains.
We offer an experimental benchmark and empirical study for off-policy policy evaluation (OPE) in reinforcement learning, which is a key problem in many safety-critical applications.
We study the problem of controllable generation of long-term sequential behaviors, where the goal is to calibrate to multiple behavior styles simultaneously.
In preference-based reinforcement learning (RL), an agent interacts with the environment while receiving preferences instead of absolute feedback.
We show that the encoder-decoder model is able to identify the injected anomalies in a modern manufacturing process in an unsupervised fashion.
First, we view our learning task as optimization in policy space, modulo the constraint that the desired policy has a programmatic representation, and solve this optimization problem using a form of mirror descent that takes a gradient step into the unconstrained policy space and then projects back onto the constrained space.
We study the problem of learning sequential decision-making policies in settings with multiple state-action representations.
To address this challenge, we present a deep robust regression model that is trained to directly predict the uncertainty bounds for safe exploration.
We show that functional regularization yields a bias-variance trade-off, and propose an adaptive tuning strategy to optimize this trade-off.
Importantly, we show that our objective function can be efficiently decomposed as a difference of submodular functions (DS), which allows us to employ DS optimization tools to greedily identify sets of constraints that increase the likelihood of finding items with high utility.
When learning policies for real-world domains, two important questions arise: (i) how to efficiently use pre-collected off-policy, non-optimal behavior data; and (ii) how to mediate among different competing objectives and constraints.
The goal of this paper is to understand the impact of learning on control synthesis from a Lyapunov function perspective.
Many modern nonlinear control methods aim to endow systems with guaranteed properties, such as stability or safety, and have been successfully applied to the domain of robotics.
Missing value imputation is a fundamental problem in spatiotemporal modeling, from motion tracking to the dynamics of physical systems.
To the best of our knowledge, this is the first DNN-based nonlinear feedback controller with stability guarantees that can utilize arbitrarily large neural nets.
We apply numerical methods in combination with finite-difference time-domain (FDTD) simulations to optimize the transmission properties of plasmonic mirror color filters, using a multi-objective figure of merit over a five-dimensional parameter space and a novel multi-fidelity Gaussian process approach.
We introduce the variational filtering EM algorithm, a simple, general-purpose method for performing variational inference in dynamical latent variable models using information from only past and present variables, i.e., filtering.
How can we efficiently gather information to optimize an unknown function, when presented with multiple, mutually dependent information sources with different costs?
Deploying these tools, we generalize a variety of existing theoretical guarantees, such as policy gradient and convergence theorems, to partially observable domains; these guarantees could also be carried over to other settings of interest.
For the examined datasets, PhaseLink can precisely associate P- and S-picks to events that are separated by ~12 seconds in origin time.
We provide theoretical guarantees for both the satisfaction of safety constraints as well as convergence to the optimal utility value.
Our framework is both generic, allowing the design of teaching schedules for different memory models, and also interactive, allowing the teacher to adapt the schedule to the underlying forgetting mechanisms of the learner.
We study the problem of training sequential generative models for capturing coordinated multi-agent trajectory behavior, such as offensive basketball gameplay.
Deep neural networks are vulnerable to adversarial examples, which dramatically alter model output using small input changes.
We study how to effectively leverage expert feedback to learn sequential decision-making policies.
We highlight that adaptivity does not speed up the teaching process when considering existing models of version space learners, such as "worst-case" (the learner picks the next hypothesis randomly from the version space) and "preference-based" (the learner picks a hypothesis according to some global preference).
Inference models, which replace an optimization-based inference procedure with a learned model, have been fundamental in advancing Bayesian deep learning, the most notable example being variational auto-encoders (VAEs).
We present Higher-Order Tensor RNN (HOT-RNN), a novel family of neural sequence architectures for multivariate forecasting in environments with nonlinear dynamics.
This problem can be formulated as a $K$-armed Dueling Bandits problem where $K$ is the total number of decisions.
Matrix and tensor factorization methods are often used for finding underlying low-dimensional patterns from noisy data.
The dueling bandits problem is an online learning framework for learning from pairwise preference feedback, and is particularly well-suited for modeling settings that elicit subjective or implicit human feedback.
We propose a framework for detecting action patterns from motion sequences and modeling the sensory-motor relationship of animals, using a generative recurrent neural network.
We tackle the problem of learning a rotation-invariant latent factor model when the training data consists of lower-dimensional projections of the original feature space.
We study the problem of smooth imitation learning for online sequence prediction, where the goal is to train a policy that can smoothly imitate demonstrated behavior in a dynamic and continuous environment in response to online, sequential context input.
We study the problem of online prediction for realtime camera planning, where the goal is to predict smooth trajectories that correctly track and frame objects of interest (e.g., players in a basketball game).
Interactive submodular set cover is an interactive variant of submodular set cover over a hypothesis class of submodular functions, where the goal is to satisfy all sufficiently plausible submodular functions to a target threshold using as few (cost-weighted) actions as possible.
We study the problem of predicting a set or list of options under a knapsack constraint.
Many prediction domains, such as ad placement, recommendation, trajectory prediction, and document summarization, require predicting a set or list of options.
Diversified retrieval and online learning are two core research areas in the design of modern information retrieval systems. In this paper, we propose the linear submodular bandits problem, which is an online learning setting for optimizing a general class of feature-rich submodular utility models for diversified retrieval.