This signal is used to guide policy learning, and the abstract interpretation used to construct it directly leads to the robustness certificate returned at convergence.
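For a reference point, here is a minimal sketch of the interval-based abstract interpretation that such certificates are typically built on; the network shape and all names are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch: interval abstract interpretation through a small ReLU network.
# If the propagated output bounds stay inside the safe region for every input
# box of interest, the analysis itself serves as the certificate.
import numpy as np

def interval_affine(lo, hi, W, b):
    """Propagate the box [lo, hi] through x -> W @ x + b (sound elementwise bounds)."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    return W_pos @ lo + W_neg @ hi + b, W_pos @ hi + W_neg @ lo + b

def interval_forward(lo, hi, layers):
    """Propagate bounds through a stack of affine layers, each followed by a ReLU
    (a simplification; ReLU is monotone, so it is applied to both bounds)."""
    for W, b in layers:
        lo, hi = interval_affine(lo, hi, W, b)
        lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    return lo, hi
```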
A growing body of work studies how to answer a question or verify a claim by generating a natural language "proof": a chain of deductive inferences yielding the answer based on a set of premises.
Neurosymbolic Programming (NP) techniques have the potential to accelerate scientific discovery.
In reinforcement learning for safety-critical settings, it is often desirable for the agent to obey safety constraints at all points in time, including during training.
In settings from fact-checking to question answering, we frequently want to know whether a collection of evidence (premises) entails a hypothesis.
State-of-the-art neural models of source code tend to be evaluated on the generation of individual expressions and lines of code, and commonly fail on long-horizon tasks such as the generation of entire method bodies.
We present a framework for the unsupervised learning of neurosymbolic encoders, which are encoders obtained by composing neural networks with symbolic programs from a domain-specific language.
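A minimal sketch of the composition being described, assuming a PyTorch feature extractor and a one-rule DSL program; the names and the soft thresholding are illustrative, not the paper's DSL.

```python
# Hypothetical neurosymbolic encoder: a neural feature extractor composed with
# a symbolic program drawn from a tiny DSL. All names are illustrative.
import torch
import torch.nn as nn

class NeuralFeatures(nn.Module):
    """Neural half: maps raw observations to a low-dimensional feature vector."""
    def __init__(self, in_dim, feat_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                 nn.Linear(64, feat_dim))

    def forward(self, x):
        return self.net(x)

def symbolic_program(feats, threshold=0.0):
    """Symbolic half: a DSL expression such as
       'if feats[0] > threshold then 1 else 0'.
       A soft (sigmoid) relaxation keeps the composition differentiable."""
    return torch.sigmoid(feats[..., 0] - threshold)
```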
Hand-annotated data can vary due to factors such as subjective differences, intra-rater variability, and differing annotator expertise.
Recent papers have suggested that transfer learning can outperform sophisticated meta-learning methods for few-shot image classification.
We present Revel, a partially neural reinforcement learning (RL) framework for provably safe exploration in continuous state and action spaces.
This relaxed program is differentiable and can be trained end-to-end, and the resulting training loss is an approximately admissible heuristic that can guide the combinatorial search.
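A hedged sketch of how such a heuristic can guide combinatorial search: an A*-style loop whose priority adds the learned estimate to the cost accumulated so far. Here `expand` and `heuristic` are caller-supplied placeholders standing in for the relaxed program, not a specific implementation.

```python
# Best-first search over partial solutions, guided by a learned heuristic.
# If the heuristic is (approximately) admissible, the search remains
# (approximately) optimal while exploring far fewer nodes.
import heapq

def guided_search(start, is_goal, expand, heuristic):
    """A*-style search: priority f = g (cost so far) + h (heuristic estimate)."""
    tie = 0  # tie-breaker so non-comparable nodes never get compared
    frontier = [(heuristic(start), 0.0, tie, start)]
    while frontier:
        _, g, _, node = heapq.heappop(frontier)
        if is_goal(node):
            return node
        for child, step_cost in expand(node):
            tie += 1
            g2 = g + step_cost
            heapq.heappush(frontier, (g2 + heuristic(child), g2, tie, child))
    return None
```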
We present a new approach, called meta-meta classification, to learning in small-data settings.
First, we view our learning task as optimization in policy space, modulo the constraint that the desired policy has a programmatic representation, and we solve this optimization problem using a form of mirror descent that takes a gradient step in the unconstrained policy space and then projects back onto the constrained space.
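A schematic sketch of this update-then-project loop; `lift`, `grad_step`, and `project` are caller-supplied placeholders (projection might, for instance, fit a DSL program to imitate the neural policy), not the paper's exact API.

```python
# Mirror descent over programmatic policies: step in the unconstrained
# (neural) space, then project back onto the space of programs.
def programmatic_mirror_descent(program, lift, grad_step, project, n_iters):
    """lift:      program -> neural policy initialized to mimic the program
       grad_step: neural policy -> neural policy after one unconstrained update
       project:   neural policy -> best-fitting program (e.g., via imitation)"""
    for _ in range(n_iters):
        neural = lift(program)
        neural = grad_step(neural)
        program = project(neural)
    return program
```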
We show that functional regularization yields a bias-variance trade-off, and propose an adaptive tuning strategy to optimize this trade-off.
We study the internal representations that a recurrent neural network (RNN) uses while learning to recognize a regular formal language.
In recent years, the notion of local robustness (or robustness for short) has emerged as a desirable property of deep neural networks.
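One common formalization, stated here as a reference point (the norm, the radius ε, and the classification rule are assumptions that vary across papers):

```latex
% f is locally robust at input x with radius \epsilon if perturbations
% within the \ell_\infty ball cannot change the predicted class:
\forall x'.\; \|x' - x\|_\infty \le \epsilon \;\Longrightarrow\;
  \arg\max_i f_i(x') = \arg\max_i f_i(x)
```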
Unlike the popular Deep Reinforcement Learning (DRL) paradigm, which represents policies by neural networks, PIRL represents policies using a high-level, domain-specific programming language.
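For concreteness, a programmatic policy in this style might look like the following; this is an illustrative, PID-flavored example with invented observation names, not the paper's actual DSL.

```python
# A human-readable programmatic policy: a short expression over named
# observations, in the spirit of a proportional controller.
def steer_policy(obs):
    """Steer toward the track axis: a proportional term plus a damping term.
    Coefficients and observation names are purely illustrative."""
    return 0.9 * obs["track_angle"] - 0.4 * obs["dist_from_center"]
```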
We present a neurosymbolic framework for the lifelong learning of algorithmic tasks that mix perception and procedural reasoning.
In this work, we study POMDPs with safe-reachability objectives, which require that a goal state is eventually reached with probability above a given threshold, while the probability of ever visiting an unsafe state stays below another threshold.
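In temporal-logic notation, such an objective can be written as follows; the thresholds p₁, p₂ and the "eventually" operator ◇ are the standard ingredients, though exact notation varies across papers.

```latex
% Safe-reachability: reach the goal with high probability while keeping the
% probability of ever entering an unsafe state low.
\Pr\big(\lozenge\, \mathit{Goal}\big) \ge p_1
\quad \text{while} \quad
\Pr\big(\lozenge\, \mathit{Unsafe}\big) \le p_2
```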
During generation, NAMs violate the constraints of the underlying grammar significantly less often than RNNs trained only on samples from the language of the grammar.
We study the problem of generating source code in a strongly typed, Java-like programming language, given a label (for example, a set of API calls or types) that carries a small amount of information about the desired code.