Search Results for author: Mitchell Stern

Found 18 papers, 7 papers with code

Towards End-to-End In-Image Neural Machine Translation

no code implementations EMNLP (nlpbt) 2020 Elman Mansimov, Mitchell Stern, Mia Chen, Orhan Firat, Jakob Uszkoreit, Puneet Jain

In this paper, we offer a preliminary investigation into the task of in-image machine translation: transforming an image containing text in one language into an image containing the same text in another language.

Machine Translation · Translation

Semantic Scaffolds for Pseudocode-to-Code Generation

1 code implementation ACL 2020 Ruiqi Zhong, Mitchell Stern, Dan Klein

We propose a method for program generation based on semantic scaffolds, lightweight structures representing the high-level semantic and syntactic composition of a program.

Code Generation

Imitation Attacks and Defenses for Black-box Machine Translation Systems

1 code implementation EMNLP 2020 Eric Wallace, Mitchell Stern, Dawn Song

To mitigate these vulnerabilities, we propose a defense that modifies translation outputs in order to misdirect the optimization of imitation models.

Machine Translation · Translation

Insertion-Deletion Transformer

no code implementations 15 Jan 2020 Laura Ruis, Mitchell Stern, Julia Proskurnia, William Chan

We propose the Insertion-Deletion Transformer, a novel transformer-based neural architecture and training method for sequence generation.

Translation

KERMIT: Generative Insertion-Based Modeling for Sequences

no code implementations 4 Jun 2019 William Chan, Nikita Kitaev, Kelvin Guu, Mitchell Stern, Jakob Uszkoreit

During training, one can feed KERMIT paired data $(x, y)$ to learn the joint distribution $p(x, y)$, and optionally mix in unpaired data $x$ or $y$ to refine the marginals $p(x)$ or $p(y)$.

Machine Translation · Question Answering +2
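
The training mixture described above can be written as a single objective. A sketch (the mixing weights $\lambda_x, \lambda_y$ are illustrative, not taken from the paper):

```latex
\mathcal{L}(\theta)
  = -\,\mathbb{E}_{(x,y)}\!\left[\log p_\theta(x, y)\right]
  \;-\; \lambda_x\, \mathbb{E}_{x}\!\left[\log p_\theta(x)\right]
  \;-\; \lambda_y\, \mathbb{E}_{y}\!\left[\log p_\theta(y)\right]
```

The paired term fits the joint distribution, while the unpaired terms refine the marginals, matching the snippet's description.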

Insertion Transformer: Flexible Sequence Generation via Insertion Operations

no code implementations 8 Feb 2019 Mitchell Stern, William Chan, Jamie Kiros, Jakob Uszkoreit

We present the Insertion Transformer, an iterative, partially autoregressive model for sequence generation based on insertion operations.

Machine Translation · Translation +1
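
As a concrete illustration of generation by insertion operations, here is a toy sketch under assumed conventions (not the authors' implementation):

```python
# Toy sketch of decoding by insertion operations (illustrative, not the
# authors' implementation). A trained model would score (slot, token)
# pairs; here we just apply a chosen set of insertions to a partial
# hypothesis. Slot i means "insert between positions i-1 and i" (0 = front).

def parallel_insert(seq, choices):
    """Insert one token into every chosen slot in a single step.

    `choices` maps slot index -> token. Applying the rightmost slot first
    keeps the earlier slot indices valid.
    """
    out = list(seq)
    for slot in sorted(choices, reverse=True):
        out.insert(slot, choices[slot])
    return out

# Starting from an empty canvas:
step1 = parallel_insert([], {0: "quick"})               # ["quick"]
step2 = parallel_insert(step1, {0: "the", 1: "fox"})    # ["the", "quick", "fox"]
```

Because the model conditions only on the current partial sequence, a balanced insertion order can complete a length-n output in roughly log2(n) such parallel steps.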

Blockwise Parallel Decoding for Deep Autoregressive Models

no code implementations NeurIPS 2018 Mitchell Stern, Noam Shazeer, Jakob Uszkoreit

Deep autoregressive sequence-to-sequence models have demonstrated impressive performance across a wide variety of tasks in recent years.

Image Super-Resolution · Machine Translation +1
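
The decoding scheme named in the title follows a predict-verify-accept loop: auxiliary heads propose the next k tokens at once, and the base autoregressive model checks how many of them it agrees with. A toy sketch with hypothetical stand-in models (`propose` and `base_next` are placeholders, not real APIs):

```python
# Toy sketch of blockwise parallel decoding (illustrative stand-ins for the
# real models). `propose(prefix, k)` guesses the next k tokens at once;
# `base_next(prefix)` is the greedy next-token choice of the base model.

def blockwise_decode(propose, base_next, eos, k=4, max_len=50):
    prefix = []
    while len(prefix) < max_len:
        guesses = propose(prefix, k)
        # Verify: accept the longest prefix of the guesses that matches
        # what the base model would have produced one token at a time.
        accepted = 0
        for g in guesses:
            if base_next(prefix + guesses[:accepted]) != g:
                break
            accepted += 1
        # Always make progress by at least one (verified) base-model token.
        if accepted == 0:
            prefix.append(base_next(prefix))
        else:
            prefix.extend(guesses[:accepted])
        if prefix and prefix[-1] == eos:
            break
    return prefix
```

With a perfect proposer, the loop finishes in about n/k verification rounds instead of n sequential decoding steps, which is the source of the speedup.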

What's Going On in Neural Constituency Parsers? An Analysis

1 code implementation NAACL 2018 David Gaddy, Mitchell Stern, Dan Klein

A number of differences have emerged between modern and classic approaches to constituency parsing in recent years, with structural components like grammars and feature-rich lexicons becoming less central while recurrent neural network representations rise in popularity.

Constituency Parsing

Adafactor: Adaptive Learning Rates with Sublinear Memory Cost

4 code implementations ICML 2018 Noam Shazeer, Mitchell Stern

In several recently proposed stochastic optimization methods (e.g. RMSProp, Adam, Adadelta), parameter updates are scaled by the inverse square roots of exponential moving averages of squared past gradients.

Machine Translation · Stochastic Optimization +1
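
Adafactor's memory saving comes from storing only per-row and per-column statistics of the squared gradients rather than a full second-moment matrix. A minimal NumPy sketch of that factored estimate (illustrative only; the full optimizer also includes update clipping and relative step sizes):

```python
import numpy as np

# Minimal sketch of Adafactor's factored second-moment estimate for a 2-D
# parameter (illustrative, not the full optimizer).

def factored_update(G, R, C, lr=1e-2, beta2=0.999, eps=1e-30):
    """One step. G: gradient (n x m). R: row stats (n,). C: col stats (m,)."""
    sq = G * G + eps
    R = beta2 * R + (1 - beta2) * sq.mean(axis=1)   # per-row second moments
    C = beta2 * C + (1 - beta2) * sq.mean(axis=0)   # per-column second moments
    # Rank-1 reconstruction of the full second-moment matrix: O(n + m)
    # optimizer state instead of O(n * m).
    V = np.outer(R, C) / R.mean()
    step = lr * G / np.sqrt(V)
    return step, R, C
```

For an n×m parameter this keeps O(n + m) optimizer state, versus the O(nm) accumulator that Adam-style methods maintain, which is the "sublinear memory cost" of the title.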

Stochastic Cubic Regularization for Fast Nonconvex Optimization

no code implementations NeurIPS 2018 Nilesh Tripuraneni, Mitchell Stern, Chi Jin, Jeffrey Regier, Michael I. Jordan

This paper proposes a stochastic variant of a classic algorithm, the cubic-regularized Newton method [Nesterov and Polyak 2006].
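
Concretely, each iteration of a cubic-regularized Newton method solves a subproblem of the following form, here with stochastic estimates $g_t \approx \nabla f(x_t)$ and $B_t \approx \nabla^2 f(x_t)$ (notation illustrative):

```latex
s_t \;=\; \arg\min_{s}\; g_t^{\top} s
  \;+\; \tfrac{1}{2}\, s^{\top} B_t\, s
  \;+\; \tfrac{\rho}{3}\, \lVert s \rVert^{3},
\qquad
x_{t+1} \;=\; x_t + s_t
```

The cubic penalty keeps each step inside the region where the second-order model is trustworthy, which is what lets such methods escape saddle points in nonconvex problems.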

Effective Inference for Generative Neural Parsing

no code implementations EMNLP 2017 Mitchell Stern, Daniel Fried, Dan Klein

Generative neural models have recently achieved state-of-the-art results for constituency parsing.

Constituency Parsing

Kernel Feature Selection via Conditional Covariance Minimization

1 code implementation NeurIPS 2017 Jianbo Chen, Mitchell Stern, Martin J. Wainwright, Michael I. Jordan

We propose a method for feature selection that employs kernel-based measures of independence to find a subset of covariates that is maximally predictive of the response.

Dimensionality Reduction · Feature Selection
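
To make the kernel-based selection idea concrete, here is a greedy forward-selection sketch that scores candidate subsets with a (biased) HSIC dependence estimate, a simpler kernel measure standing in for the paper's conditional-covariance criterion; all names are illustrative:

```python
import numpy as np

# Greedy forward feature selection driven by a kernel dependence score.
# Simplified illustration: the score here is biased HSIC with Gaussian
# kernels, not the conditional-covariance operator used in the paper.

def gaussian_gram(Z, sigma=1.0):
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def hsic(K, L):
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

def greedy_select(X, y, m):
    """Pick m columns of X, greedily maximizing dependence with y."""
    L = gaussian_gram(y.reshape(-1, 1))
    chosen = []
    while len(chosen) < m:
        rest = [j for j in range(X.shape[1]) if j not in chosen]
        best = max(rest, key=lambda j: hsic(gaussian_gram(X[:, chosen + [j]]), L))
        chosen.append(best)
    return chosen
```

Scoring whole subsets, rather than features one at a time, is what lets kernel criteria of this kind capture nonlinear and joint effects of covariates on the response.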

The Marginal Value of Adaptive Gradient Methods in Machine Learning

3 code implementations NeurIPS 2017 Ashia C. Wilson, Rebecca Roelofs, Mitchell Stern, Nathan Srebro, Benjamin Recht

Adaptive optimization methods, which perform local optimization with a metric constructed from the history of iterates, are becoming increasingly popular for training deep neural networks.

BIG-bench Machine Learning · Binary Classification
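
A standard example of such a history-dependent metric is AdaGrad's diagonal preconditioner (the general form studied in this line of work, not a result specific to this paper):

```latex
H_k \;=\; \operatorname{diag}\!\Big(\textstyle\sum_{i=1}^{k} g_i \circ g_i\Big)^{1/2},
\qquad
x_{k+1} \;=\; x_k \;-\; \alpha\, H_k^{-1} g_k
```

Because $H_k$ depends on the entire gradient history, adaptive methods can converge to different solutions than plain SGD even on the same problem, which is the behavior the paper examines.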

A Minimal Span-Based Neural Constituency Parser

no code implementations ACL 2017 Mitchell Stern, Jacob Andreas, Dan Klein

In this work, we present a minimal neural model for constituency parsing based on independent scoring of labels and spans.

Constituency Parsing

Abstract Syntax Networks for Code Generation and Semantic Parsing

1 code implementation ACL 2017 Maxim Rabinovich, Mitchell Stern, Dan Klein

Tasks like code generation and semantic parsing require mapping unstructured (or partially structured) inputs to well-formed, executable outputs.

Code Generation · Semantic Parsing
