no code implementations • AACL (NLP-TEA) 2020 • Bruce W. Lee, Jason Lee
We train our model with CoKEC-text and significantly improve the accuracy of readability assessment for texts in the Korean ELT curriculum.
1 code implementation • EMNLP 2020 • Jason Lee, Raphael Shu, Kyunghyun Cho
Given a continuous latent variable model for machine translation (Shu et al., 2020), we train an inference network to approximate the gradient of the marginal log probability of the target sentence, using only the latent variable as input.
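A minimal sketch of that inference procedure, assuming a hypothetical RefinementNet and step size (an illustration, not the paper's code): the network takes only the current latent z, predicts an approximate gradient of the marginal log probability, and z is then refined with a few gradient-ascent-style steps.

```python
# Illustrative sketch only: f_phi(z) approximates the gradient of the marginal
# log probability, so the latent can be improved by simple gradient-style updates.
import torch
import torch.nn as nn

class RefinementNet(nn.Module):              # hypothetical module name
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, z):                    # only the latent variable is used as input
        return self.net(z)

def refine_latent(z, f_phi, steps=4, step_size=1.0):
    """Iteratively move z in the direction of the predicted (approximate) gradient."""
    for _ in range(steps):
        z = z + step_size * f_phi(z)
    return z

f_phi = RefinementNet(dim=256)
z0 = torch.randn(1, 20, 256)                 # e.g. one sentence, 20 positions, 256-dim latents
z_refined = refine_latent(z0, f_phi)
```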
1 code implementation • EMNLP (spnlp) 2020 • Jason Lee, Dustin Tran, Orhan Firat, Kyunghyun Cho
In this paper, by comparing several density estimators on five machine translation tasks, we find that the correlation between rankings of models based on log-likelihood and BLEU varies significantly depending on the range of the model families being compared.
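To make the ranking comparison concrete, agreement between log-likelihood and BLEU rankings can be summarized with a rank correlation such as Kendall's tau; the numbers below are made up purely for illustration.

```python
# Illustrative only: rank correlation between per-model log-likelihood and BLEU.
from scipy.stats import kendalltau

log_likelihoods = [-1.92, -2.10, -1.85, -2.30, -1.99]   # hypothetical per-model scores
bleu_scores     = [27.1, 25.4, 26.0, 24.8, 27.5]         # hypothetical BLEU for the same models

tau, p_value = kendalltau(log_likelihoods, bleu_scores)
print(f"Kendall's tau = {tau:.2f} (p = {p_value:.3f})")
```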
no code implementations • IJCNLP 2019 • Jason Lee, Kyunghyun Cho, Douwe Kiela
Emergent multi-agent communication protocols are very different from natural language and not easily interpretable by humans.
1 code implementation • 20 Aug 2019 • Raphael Shu, Jason Lee, Hideki Nakayama, Kyunghyun Cho
By decoding multiple initial latent variables in parallel and rescoring with a teacher model, the proposed model further brings the gap down to 1.0 BLEU point on the WMT'14 En-De task with a 6.8x speedup.
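A hedged sketch of the decode-and-rescore step (all function and method names are assumptions for illustration): several initial latents are decoded in parallel and the autoregressive teacher keeps the highest-scoring candidate.

```python
# Sketch: sample several initial latents, decode each candidate non-autoregressively,
# then return the candidate the autoregressive teacher model scores highest.
def decode_with_rescoring(src, latent_prior, nar_decoder, teacher, num_candidates=8):
    candidates = []
    for _ in range(num_candidates):
        z = latent_prior.sample(src)                  # independent initial latent variables
        candidates.append(nar_decoder.decode(src, z))
    # teacher.log_prob(src, hyp) is assumed to return the teacher's sequence score
    return max(candidates, key=lambda hyp: teacher.log_prob(src, hyp))
```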
1 code implementation • 13 Jun 2019 • Blake Woodworth, Suriya Gunasekar, Pedro Savarese, Edward Moroshko, Itay Golan, Jason Lee, Daniel Soudry, Nathan Srebro
A recent line of work studies overparametrized neural networks in the "kernel regime," i.e., when the network behaves during training as a kernelized linear predictor, and thus training with gradient descent has the effect of finding the minimum-RKHS-norm solution.
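In the kernel regime, the network is well approximated by its first-order Taylor expansion around the initialization, so gradient descent effectively fits a linear predictor on a fixed feature map (a standard formulation, not specific to this paper):

$$ f(x; w) \approx f(x; w_0) + \nabla_w f(x; w_0)^\top (w - w_0), $$

with associated kernel $K(x, x') = \nabla_w f(x; w_0)^\top \nabla_w f(x'; w_0)$; among the solutions that fit the training data, gradient descent in this regime then ends up at (roughly) the one of minimum RKHS norm.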
1 code implementation • 1 Jun 2019 • Ilia Kulikov, Jason Lee, Kyunghyun Cho
We propose a novel approach for conversation-level inference by explicitly modeling the dialogue partner and running beam search across multiple conversation turns.
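A conceptual sketch of conversation-level scoring (the API below is hypothetical): a candidate reply is scored not only by its own likelihood but also by rolling the dialogue forward a few turns with an explicit partner model.

```python
# Sketch: accumulate log-probabilities over a short simulated continuation of the
# conversation, alternating between a partner model and the agent's own model.
def conversation_level_score(history, candidate, self_model, partner_model, lookahead=2):
    score = self_model.log_prob(history, candidate)
    context = history + [candidate]
    for _ in range(lookahead):
        partner_reply = partner_model.generate(context)
        score += partner_model.log_prob(context, partner_reply)
        context = context + [partner_reply]
        own_reply = self_model.generate(context)
        score += self_model.log_prob(context, own_reply)
        context = context + [own_reply]
    return score
```

Beam search over candidate replies would then keep the ones with the highest conversation-level score rather than the highest single-turn likelihood.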
no code implementations • ICLR 2019 • Colin Wei, Jason Lee, Qiang Liu, Tengyu Ma
We establish: 1) for multi-layer feedforward ReLU networks, the global minimizer of a weakly-regularized cross-entropy loss has the maximum normalized margin among all networks; 2) as a result, increasing the over-parametrization improves the normalized margin and the generalization error bounds for deep networks.
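For an $L$-layer positively homogeneous network $f(\cdot\,; w)$ on labeled data $(x_i, y_i)$, the normalized margin in question can be written schematically as

$$ \gamma(w) = \frac{\min_i\, y_i\, f(x_i; w)}{\lVert w \rVert^{L}}, $$

and the statement is that, as the regularization strength on the cross-entropy loss is taken to zero, the normalized margin of the global minimizer approaches $\max_w \gamma(w)$.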
no code implementations • 29 Nov 2018 • Jason Lee, Inkyu Park, Sangnam Park
Recently, machine learning algorithms based on deep, layered artificial neural networks (DNNs) have been applied to a wide variety of high-energy physics problems, such as jet tagging and event classification.
no code implementations • 27 Sep 2018 • Jason Lee, Kyunghyun Cho, Douwe Kiela
While reinforcement learning (RL) shows a lot of promise for natural language processing—e.g.
no code implementations • ICML 2018 • Simon Du, Jason Lee
We provide new theoretical insights on why over-parametrization is effective in learning neural networks.
no code implementations • ICML 2018 • Mingyi Hong, Meisam Razaviyayn, Jason Lee
In this work, we study two first-order primal-dual based algorithms, the Gradient Primal-Dual Algorithm (GPDA) and the Gradient Alternating Direction Method of Multipliers (GADMM), for solving a class of linearly constrained non-convex optimization problems.
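For a problem of the form $\min_x f(x)$ subject to $Ax = b$ with smooth non-convex $f$, a generic gradient primal-dual iteration on the augmented Lagrangian looks as follows (a schematic template; the exact GPDA/GADMM updates in the paper may add proximal terms and specific step-size rules):

$$ L_\rho(x, y) = f(x) + y^\top (Ax - b) + \tfrac{\rho}{2} \lVert Ax - b \rVert^2, $$
$$ x^{t+1} = x^t - \alpha\, \nabla_x L_\rho(x^t, y^t), \qquad y^{t+1} = y^t + \rho\, (A x^{t+1} - b). $$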
no code implementations • NeurIPS 2018 • Suriya Gunasekar, Jason Lee, Daniel Soudry, Nathan Srebro
We show that gradient descent on full-width linear convolutional networks of depth $L$ converges to a linear predictor related to the $\ell_{2/L}$ bridge penalty in the frequency domain.
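Schematically, the predictor $\beta$ reached by gradient descent is related to a bridge-penalized problem in the Fourier domain of the form

$$ \min_{\beta}\ \sum_j \big|\hat{\beta}_j\big|^{2/L} \quad \text{s.t.} \quad y_i\, \beta^\top x_i \ge 1 \ \ \forall i, $$

where $\hat{\beta}$ is the discrete Fourier transform of $\beta$; since this penalty is a non-convex quasi-norm for $L > 2$, results of this kind are typically stated in terms of stationary points rather than global minimizers.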
no code implementations • ICML 2018 • Suriya Gunasekar, Jason Lee, Daniel Soudry, Nathan Srebro
We study the implicit bias of generic optimization methods, such as mirror descent, natural gradient descent, and steepest descent with respect to different potentials and norms, when optimizing underdetermined linear regression or separable linear classification problems.
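As a reference point for the methods studied there, the mirror descent update with potential $\psi$ is

$$ \nabla\psi(w_{t+1}) = \nabla\psi(w_t) - \eta_t\, \nabla L(w_t), $$

and a representative implicit-bias statement of this type is that, for underdetermined linear regression, mirror descent (when it reaches a zero-error solution) ends up at the interpolating solution minimizing the Bregman divergence $D_\psi(w, w_0)$ to its initialization.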
2 code implementations • EMNLP 2018 • Jason Lee, Elman Mansimov, Kyunghyun Cho
We propose a conditional non-autoregressive neural sequence model based on iterative refinement.
Ranked #5 on Machine Translation on IWSLT2015 German-English
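A sketch of the refinement loop described above (the model interface is hypothetical): an initial translation is produced in one parallel pass and then repeatedly fed back to the decoder as a denoising/refinement step.

```python
# Sketch: non-autoregressive decoding with iterative refinement.
def iterative_refinement_decode(src, model, max_iters=10):
    hyp = model.initial_decode(src)          # all target tokens predicted in parallel
    for _ in range(max_iters):
        new_hyp = model.refine(src, hyp)     # re-predict, conditioned on the previous hypothesis
        if new_hyp == hyp:                   # stop once the output no longer changes
            break
        hyp = new_hyp
    return hyp
```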
no code implementations • ICLR 2018 • Jason Lee, Kyunghyun Cho, Jason Weston, Douwe Kiela
While most machine translation systems to date are trained on large parallel corpora, humans learn language in a different way: by being grounded in an environment and interacting with other humans.
2 code implementations • TACL 2017 • Jason Lee, Kyunghyun Cho, Thomas Hofmann
We observe that on CS-EN, FI-EN and RU-EN, the quality of the multilingual character-level translation even surpasses the models specifically trained on that language pair alone, both in terms of BLEU score and human judgment.
5 code implementations • 9 Oct 2014 • Trevor Hastie, Rahul Mazumder, Jason Lee, Reza Zadeh
The matrix-completion problem has attracted a lot of attention, largely as a result of the celebrated Netflix competition.
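As a toy illustration of the problem setting (this is the classic soft-impute baseline, not necessarily the algorithm developed in this paper), one can alternate between imputing the missing entries with the current low-rank estimate and soft-thresholding the singular values.

```python
# Toy soft-impute sketch: fill missing entries with the current estimate, then
# shrink the singular values of the completed matrix.
import numpy as np

def soft_impute(X, mask, lam=1.0, n_iters=100):
    """X: observed matrix with missing entries set to 0; mask: 1 = observed, 0 = missing."""
    Z = np.zeros_like(X, dtype=float)
    for _ in range(n_iters):
        filled = mask * X + (1 - mask) * Z            # impute missing entries with Z
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        s = np.maximum(s - lam, 0.0)                  # soft-threshold the singular values
        Z = (U * s) @ Vt                              # low-rank reconstruction
    return Z
```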