In this work we propose VICause, a novel approach to simultaneously tackle missing value imputation and causal discovery efficiently with deep learning.
Graph representations of a target domain often project it to a set of entities (nodes) and their relations (edges).
Towards addressing this, we introduce Grammformers, transformer-based grammar-guided models that learn (without explicit supervision) to generate sketches -- sequences of tokens with holes.
Machine learning-based program analyses have recently shown the promise of integrating formal and probabilistic reasoning towards aiding software development.
To address this, we present GLUECode, Global and Local Understanding Evaluation of Code, a benchmark of diverse tasks to evaluate machine learning models of source code.
Neural sequence-to-sequence models are finding increasing use in document editing tasks, for example correcting a text document or repairing source code.
Code completion is one of the most widely used features of modern integrated development environments (IDEs).
The network uses deep similarity learning to learn a TypeSpace -- a continuous relaxation of the discrete space of types -- and how to embed the type properties of a symbol (i.e., an identifier) into it.
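The core idea of such similarity learning can be sketched with a triplet margin loss: symbols of the same type should embed close together, symbols of different types far apart. The encoder and embeddings below are toy stand-ins, not the paper's architecture.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero loss once the positive is closer to the anchor than the
    negative by at least `margin`; positive loss otherwise."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

# Hypothetical 2-D embeddings: two `int` symbols near each other,
# one `str` symbol far away.
emb_int_a = np.array([0.0, 0.1])
emb_int_b = np.array([0.0, 0.2])
emb_str   = np.array([5.0, 5.0])

well_separated = triplet_loss(emb_int_a, emb_int_b, emb_str)  # loss is 0.0
collapsed      = triplet_loss(emb_int_a, emb_str, emb_int_b)  # positive loss
```

Minimizing this loss over many (anchor, positive, negative) symbol triples is what shapes the continuous space so that nearest-neighbour lookups recover type information.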
We use the labelled traces to train a neural network (NN) model to distinguish runtime patterns of passing versus failing executions for a given program.
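In miniature, this setup is a binary classifier over trace-derived feature vectors. The sketch below uses synthetic, hypothetical features (e.g., branch-hit counts) and a plain logistic model trained by gradient descent rather than the paper's actual NN, purely to illustrate the pass/fail separation task.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic labelled traces: failing runs exercise feature 1 more heavily.
passing = rng.normal(loc=[2.0, 0.0], scale=0.3, size=(50, 2))
failing = rng.normal(loc=[0.0, 2.0], scale=0.3, size=(50, 2))
X = np.vstack([passing, failing])
y = np.array([0] * 50 + [1] * 50)  # 0 = pass, 1 = fail

# Train a logistic model with plain gradient descent.
w, b = np.zeros(2), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted failure probability
    grad = p - y                              # gradient of the cross-entropy
    w -= 0.1 * (X.T @ grad) / len(y)
    b -= 0.1 * grad.mean()

preds = (1.0 / (1.0 + np.exp(-(X @ w + b))) >= 0.5).astype(int)
accuracy = (preds == y).mean()
```

Because the two synthetic clusters are well separated, the learned decision boundary classifies nearly all traces correctly; real trace features are far noisier, which motivates the deeper models in the original work.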
To enable evaluation of progress on code search, we are releasing the CodeSearchNet Corpus and are presenting the CodeSearchNet Challenge, which consists of 99 natural language queries with about 4k expert relevance annotations of likely results from CodeSearchNet Corpus.
Program synthesis of general-purpose source code from natural language specifications is challenging due to the need to reason about high-level patterns in the target program and low-level implementation details at the same time.
The field of big code relies on mining large corpora of code to perform some learning task.
Summarization of long sequences into a concise statement is a core problem in natural language processing, requiring non-trivial understanding of the input.
Our evaluation shows the effectiveness of CODIT in learning and suggesting abstract change templates.
As a result, in the past several years there has been increasing research interest in methods that focus on the intersection of programming and natural language, allowing users to interact with computers through natural language in the complex ways that programs make possible.
Generative models for source code are an interesting structured prediction problem, requiring reasoning about hard syntactic and semantic constraints as well as about natural, likely programs.
Learning tasks on source code (i.e., formal languages) have been considered recently, but most work has tried to transfer natural language methods and does not capitalize on the unique opportunities offered by code's known syntax.
We contrast programming languages against natural languages and discuss how these similarities and differences drive the design of probabilistic models.
As initial solutions, we design a set of deep neural models that learn to represent the context of each variable location and variable usage in a data flow-sensitive way.
The results demonstrate that the location selection heuristics produce mutants more closely coupled to real faults for a given budget of mutation operator applications.
Combining abstract, symbolic reasoning with continuous neural reasoning is a grand challenge of representation learning.
Attention mechanisms in neural networks have proved useful for problems in which the input and output do not have fixed dimension.
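Why attention handles inputs without fixed dimension can be seen from scaled dot-product attention: the same function applies whether there are 3 keys or 30, since the softmax normalizes over however many keys are present. This is a generic sketch, not any specific paper's model.

```python
import numpy as np

def scaled_dot_product_attention(queries, keys, values):
    """Attend over a variable-length set of key/value pairs.

    queries: (n_q, d), keys: (n_k, d), values: (n_k, d_v).
    n_k may differ per input -- nothing in the computation fixes it.
    """
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)           # (n_q, n_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the keys
    return weights @ values                          # (n_q, d_v)

rng = np.random.default_rng(0)
q = np.ones((1, 4))
# Same function, different input sizes -- output shape depends only on d_v.
out_small = scaled_dot_product_attention(q, rng.random((3, 4)), rng.random((3, 2)))
out_large = scaled_dot_product_attention(q, rng.random((30, 4)), rng.random((30, 2)))
```

The output dimension is fixed by the value vectors, so downstream layers never see the varying input length.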
In recent years, multi-label classification has attracted a significant body of research, motivated by real-life applications, such as text classification and medical diagnoses.
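What distinguishes multi-label classification from ordinary multi-class classification is that each label is an independent binary decision, so an instance can receive zero, one, or several labels at once. A minimal sketch, with hypothetical toy weights:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_labels(x, weight_matrix, threshold=0.5):
    """Return every label whose independent sigmoid score clears the
    threshold -- unlike softmax, which would force exactly one label.

    x: (d,) feature vector; weight_matrix: (n_labels, d).
    """
    scores = sigmoid(weight_matrix @ x)
    return [i for i, s in enumerate(scores) if s >= threshold]

# Toy weights: label 0 fires on feature 0, label 1 on feature 1,
# label 2 is suppressed by both features.
W = np.array([[4.0, 0.0],
              [0.0, 4.0],
              [-4.0, -4.0]])

both_labels = predict_labels(np.array([1.0, 1.0]), W)  # labels 0 and 1 together
```

In text classification this corresponds to a document that belongs to several topics simultaneously; in medical diagnosis, to a patient with multiple co-occurring conditions.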