Search Results for author: Mitchell Stern

Found 18 papers, 7 papers with code

Towards End-to-End In-Image Neural Machine Translation

no code implementations EMNLP (nlpbt) 2020 Elman Mansimov, Mitchell Stern, Mia Chen, Orhan Firat, Jakob Uszkoreit, Puneet Jain

In this paper, we offer a preliminary investigation into the task of in-image machine translation: transforming an image containing text in one language into an image containing the same text in another language.

Machine Translation · Translation

Semantic Scaffolds for Pseudocode-to-Code Generation

1 code implementation ACL 2020 Ruiqi Zhong, Mitchell Stern, Dan Klein

We propose a method for program generation based on semantic scaffolds, lightweight structures representing the high-level semantic and syntactic composition of a program.

Code Generation

Imitation Attacks and Defenses for Black-box Machine Translation Systems

1 code implementation EMNLP 2020 Eric Wallace, Mitchell Stern, Dawn Song

To mitigate these vulnerabilities, we propose a defense that modifies translation outputs in order to misdirect the optimization of imitation models.

Machine Translation · Translation

Insertion-Deletion Transformer

no code implementations 15 Jan 2020 Laura Ruis, Mitchell Stern, Julia Proskurnia, William Chan

We propose the Insertion-Deletion Transformer, a novel transformer-based neural architecture and training method for sequence generation.

Translation

KERMIT: Generative Insertion-Based Modeling for Sequences

no code implementations 4 Jun 2019 William Chan, Nikita Kitaev, Kelvin Guu, Mitchell Stern, Jakob Uszkoreit

During training, one can feed KERMIT paired data $(x, y)$ to learn the joint distribution $p(x, y)$, and optionally mix in unpaired data $x$ or $y$ to refine the marginals $p(x)$ or $p(y)$.

Machine Translation · Question Answering +2
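
The training mixture described above can be written as a single objective. A sketch (the mixing weights $\lambda_x, \lambda_y$ are illustrative, not taken from the paper):

```latex
\mathcal{L}(\theta)
  = -\,\mathbb{E}_{(x,y)}\!\left[\log p_\theta(x, y)\right]
  \;-\; \lambda_x\, \mathbb{E}_{x}\!\left[\log p_\theta(x)\right]
  \;-\; \lambda_y\, \mathbb{E}_{y}\!\left[\log p_\theta(y)\right]
```

The paired term fits the joint distribution, while the unpaired terms refine the marginals, matching the snippet's description.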

Insertion Transformer: Flexible Sequence Generation via Insertion Operations

no code implementations 8 Feb 2019 Mitchell Stern, William Chan, Jamie Kiros, Jakob Uszkoreit

We present the Insertion Transformer, an iterative, partially autoregressive model for sequence generation based on insertion operations.

Machine Translation · Translation +1
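
As a concrete illustration of generation by insertion operations, here is a toy sketch under assumed conventions (not the authors' implementation):

```python
# Toy sketch of decoding by insertion operations (illustrative, not the
# authors' implementation). A trained model would score (slot, token)
# pairs; here we just apply a chosen set of insertions to a partial
# hypothesis. Slot i means "insert between positions i-1 and i" (0 = front).

def parallel_insert(seq, choices):
    """Insert one token into every chosen slot in a single step.

    `choices` maps slot index -> token. Applying the rightmost slot first
    keeps the earlier slot indices valid.
    """
    out = list(seq)
    for slot in sorted(choices, reverse=True):
        out.insert(slot, choices[slot])
    return out

# Starting from an empty canvas:
step1 = parallel_insert([], {0: "quick"})               # ["quick"]
step2 = parallel_insert(step1, {0: "the", 1: "fox"})    # ["the", "quick", "fox"]
```

Because the model conditions only on the current partial sequence, a balanced insertion order can complete a length-n output in roughly log2(n) such parallel steps.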

Blockwise Parallel Decoding for Deep Autoregressive Models

no code implementations NeurIPS 2018 Mitchell Stern, Noam Shazeer, Jakob Uszkoreit

Deep autoregressive sequence-to-sequence models have demonstrated impressive performance across a wide variety of tasks in recent years.

Image Super-Resolution · Machine Translation +1
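
The decoding scheme named in the title follows a predict-verify-accept loop: auxiliary heads propose the next k tokens at once, and the base autoregressive model checks how many of them it agrees with. A toy sketch with hypothetical stand-in models (`propose` and `base_next` are placeholders, not real APIs):

```python
# Toy sketch of blockwise parallel decoding (illustrative stand-ins for the
# real models). `propose(prefix, k)` guesses the next k tokens at once;
# `base_next(prefix)` is the greedy next-token choice of the base model.

def blockwise_decode(propose, base_next, eos, k=4, max_len=50):
    prefix = []
    while len(prefix) < max_len:
        guesses = propose(prefix, k)
        # Verify: accept the longest prefix of the guesses that matches
        # what the base model would have produced one token at a time.
        accepted = 0
        for g in guesses:
            if base_next(prefix + guesses[:accepted]) != g:
                break
            accepted += 1
        # Always make progress by at least one (verified) base-model token.
        if accepted == 0:
            prefix.append(base_next(prefix))
        else:
            prefix.extend(guesses[:accepted])
        if prefix and prefix[-1] == eos:
            break
    return prefix
```

With a perfect proposer, the loop finishes in about n/k verification rounds instead of n sequential decoding steps, which is the source of the speedup.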

What's Going On in Neural Constituency Parsers? An Analysis

1 code implementation NAACL 2018 David Gaddy, Mitchell Stern, Dan Klein

A number of differences have emerged between modern and classic approaches to constituency parsing in recent years, with structural components like grammars and feature-rich lexicons becoming less central while recurrent neural network representations rise in popularity.

Constituency Parsing

Adafactor: Adaptive Learning Rates with Sublinear Memory Cost

4 code implementations ICML 2018 Noam Shazeer, Mitchell Stern

In several recently proposed stochastic optimization methods (e.g. RMSProp, Adam, Adadelta), parameter updates are scaled by the inverse square roots of exponential moving averages of squared past gradients.

Machine Translation · Stochastic Optimization +1
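
Adafactor's memory saving comes from storing only per-row and per-column statistics of the squared gradients rather than a full second-moment matrix. A minimal NumPy sketch of that factored estimate (illustrative only; the full optimizer also includes update clipping and relative step sizes):

```python
import numpy as np

# Minimal sketch of Adafactor's factored second-moment estimate for a 2-D
# parameter (illustrative, not the full optimizer).

def factored_update(G, R, C, lr=1e-2, beta2=0.999, eps=1e-30):
    """One step. G: gradient (n x m). R: row stats (n,). C: col stats (m,)."""
    sq = G * G + eps
    R = beta2 * R + (1 - beta2) * sq.mean(axis=1)   # per-row second moments
    C = beta2 * C + (1 - beta2) * sq.mean(axis=0)   # per-column second moments
    # Rank-1 reconstruction of the full second-moment matrix: O(n + m)
    # optimizer state instead of O(n * m).
    V = np.outer(R, C) / R.mean()
    step = lr * G / np.sqrt(V)
    return step, R, C
```

For an n×m parameter this keeps O(n + m) optimizer state, versus the O(nm) accumulator that Adam-style methods maintain, which is the "sublinear memory cost" of the title.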

Stochastic Cubic Regularization for Fast Nonconvex Optimization

no code implementations NeurIPS 2018 Nilesh Tripuraneni, Mitchell Stern, Chi Jin, Jeffrey Regier, Michael I. Jordan

This paper proposes a stochastic variant of a classic algorithm, the cubic-regularized Newton method [Nesterov and Polyak 2006].
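
Concretely, each iteration of a cubic-regularized Newton method solves a subproblem of the following form, here with stochastic estimates $g_t \approx \nabla f(x_t)$ and $B_t \approx \nabla^2 f(x_t)$ (notation illustrative):

```latex
s_t \;=\; \arg\min_{s}\; g_t^{\top} s
  \;+\; \tfrac{1}{2}\, s^{\top} B_t\, s
  \;+\; \tfrac{\rho}{3}\, \lVert s \rVert^{3},
\qquad
x_{t+1} \;=\; x_t + s_t
```

The cubic penalty keeps each step inside the region where the second-order model is trustworthy, which is what lets such methods escape saddle points in nonconvex problems.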

Effective Inference for Generative Neural Parsing

no code implementations EMNLP 2017 Mitchell Stern, Daniel Fried, Dan Klein

Generative neural models have recently achieved state-of-the-art results for constituency parsing.

Constituency Parsing

Kernel Feature Selection via Conditional Covariance Minimization

1 code implementation NeurIPS 2017 Jianbo Chen, Mitchell Stern, Martin J. Wainwright, Michael I. Jordan

We propose a method for feature selection that employs kernel-based measures of independence to find a subset of covariates that is maximally predictive of the response.

Dimensionality Reduction · Feature Selection
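
To make the kernel-based selection idea concrete, here is a greedy forward-selection sketch that scores candidate subsets with a (biased) HSIC dependence estimate, a simpler kernel measure standing in for the paper's conditional-covariance criterion; all names are illustrative:

```python
import numpy as np

# Greedy forward feature selection driven by a kernel dependence score.
# Simplified illustration: the score here is biased HSIC with Gaussian
# kernels, not the conditional-covariance operator used in the paper.

def gaussian_gram(Z, sigma=1.0):
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def hsic(K, L):
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

def greedy_select(X, y, m):
    """Pick m columns of X, greedily maximizing dependence with y."""
    L = gaussian_gram(y.reshape(-1, 1))
    chosen = []
    while len(chosen) < m:
        rest = [j for j in range(X.shape[1]) if j not in chosen]
        best = max(rest, key=lambda j: hsic(gaussian_gram(X[:, chosen + [j]]), L))
        chosen.append(best)
    return chosen
```

Scoring whole subsets, rather than features one at a time, is what lets kernel criteria of this kind capture nonlinear and joint effects of covariates on the response.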

The Marginal Value of Adaptive Gradient Methods in Machine Learning

3 code implementations NeurIPS 2017 Ashia C. Wilson, Rebecca Roelofs, Mitchell Stern, Nathan Srebro, Benjamin Recht

Adaptive optimization methods, which perform local optimization with a metric constructed from the history of iterates, are becoming increasingly popular for training deep neural networks.

BIG-bench Machine Learning · Binary Classification
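
A standard example of such a history-dependent metric is AdaGrad's diagonal preconditioner (the general form studied in this line of work, not a result specific to this paper):

```latex
H_k \;=\; \operatorname{diag}\!\Big(\textstyle\sum_{i=1}^{k} g_i \circ g_i\Big)^{1/2},
\qquad
x_{k+1} \;=\; x_k \;-\; \alpha\, H_k^{-1} g_k
```

Because $H_k$ depends on the entire gradient history, adaptive methods can converge to different solutions than plain SGD even on the same problem, which is the behavior the paper examines.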

A Minimal Span-Based Neural Constituency Parser

no code implementations ACL 2017 Mitchell Stern, Jacob Andreas, Dan Klein

In this work, we present a minimal neural model for constituency parsing based on independent scoring of labels and spans.

Constituency Parsing

Abstract Syntax Networks for Code Generation and Semantic Parsing

1 code implementation ACL 2017 Maxim Rabinovich, Mitchell Stern, Dan Klein

Tasks like code generation and semantic parsing require mapping unstructured (or partially structured) inputs to well-formed, executable outputs.

Code Generation · Semantic Parsing
