Search Results for author: Jason Lee

Found 19 papers, 8 papers with code

Matrix Completion and Low-Rank SVD via Fast Alternating Least Squares

5 code implementations · 9 Oct 2014 · Trevor Hastie, Rahul Mazumder, Jason Lee, Reza Zadeh

The matrix-completion problem has attracted a lot of attention, largely as a result of the celebrated Netflix competition.

Matrix Completion
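The alternating-least-squares idea behind this line of work can be sketched in a few lines: fix one factor, solve a small ridge regression for each row of the other factor over the observed entries only, and alternate. This is a minimal illustration, not the paper's softImpute-ALS algorithm; the function name and parameters below are hypothetical.

```python
import numpy as np

def als_complete(M, mask, rank=2, lam=0.1, iters=50, seed=0):
    """Fill in missing entries of M (mask == 1 where observed) by
    alternating least squares on a rank-`rank` factorization M ~ U @ V.T,
    with ridge penalty lam on both factors."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    U = rng.standard_normal((m, rank))
    V = rng.standard_normal((n, rank))
    I = lam * np.eye(rank)
    for _ in range(iters):
        # Fix V: each row of U is a ridge regression on the observed columns.
        for i in range(m):
            obs = mask[i] == 1
            Vo = V[obs]
            U[i] = np.linalg.solve(Vo.T @ Vo + I, Vo.T @ M[i, obs])
        # Fix U: symmetric update for each row of V.
        for j in range(n):
            obs = mask[:, j] == 1
            Uo = U[obs]
            V[j] = np.linalg.solve(Uo.T @ Uo + I, Uo.T @ M[obs, j])
    return U @ V.T
```

Each row update is only a `rank × rank` linear solve, which is what keeps the alternation cheap even when the observed matrix is large and sparse.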

Fully Character-Level Neural Machine Translation without Explicit Segmentation

2 code implementations · TACL 2017 · Jason Lee, Kyunghyun Cho, Thomas Hofmann

We observe that on CS-EN, FI-EN and RU-EN, the quality of the multilingual character-level translation even surpasses the models specifically trained on that language pair alone, both in terms of BLEU score and human judgment.

Machine Translation · NMT · +1

Emergent Translation in Multi-Agent Communication

no code implementations · ICLR 2018 · Jason Lee, Kyunghyun Cho, Jason Weston, Douwe Kiela

While most machine translation systems to date are trained on large parallel corpora, humans learn language in a different way: by being grounded in an environment and interacting with other humans.

Machine Translation · Sentence · +1

Characterizing Implicit Bias in Terms of Optimization Geometry

no code implementations · ICML 2018 · Suriya Gunasekar, Jason Lee, Daniel Soudry, Nathan Srebro

We study the implicit bias of generic optimization methods, such as mirror descent, natural gradient descent, and steepest descent with respect to different potentials and norms, when optimizing underdetermined linear regression or separable linear classification problems.

General Classification · Regression

Implicit Bias of Gradient Descent on Linear Convolutional Networks

no code implementations · NeurIPS 2018 · Suriya Gunasekar, Jason Lee, Daniel Soudry, Nathan Srebro

We show that gradient descent on full-width linear convolutional networks of depth $L$ converges to a linear predictor related to the $\ell_{2/L}$ bridge penalty in the frequency domain.

On the Power of Over-parametrization in Neural Networks with Quadratic Activation

no code implementations · ICML 2018 · Simon Du, Jason Lee

We provide new theoretical insights on why over-parametrization is effective in learning neural networks.

Gradient Primal-Dual Algorithm Converges to Second-Order Stationary Solution for Nonconvex Distributed Optimization Over Networks

no code implementations · ICML 2018 · Mingyi Hong, Meisam Razaviyayn, Jason Lee

In this work, we study two first-order primal-dual based algorithms, the Gradient Primal-Dual Algorithm (GPDA) and the Gradient Alternating Direction Method of Multipliers (GADMM), for solving a class of linearly constrained non-convex optimization problems.

Distributed Optimization
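The primal-dual scheme studied here alternates a gradient step on the augmented Lagrangian in the primal variable with an ascent step on the multiplier. A minimal single-machine sketch for min f(x) subject to Ax = b (illustrative only: the paper's distributed GPDA/GADMM, network structure, and step-size conditions are not reproduced, and the names below are hypothetical):

```python
import numpy as np

def gpda(grad_f, A, b, x0, step=0.1, rho=1.0, iters=2000):
    """Gradient primal-dual iteration for min f(x) s.t. Ax = b:
    one gradient-descent step on the augmented Lagrangian
    L(x, lam) = f(x) + lam.T (Ax - b) + (rho/2) ||Ax - b||^2,
    followed by one gradient-ascent step on the multiplier lam."""
    x, lam = x0.astype(float).copy(), np.zeros(A.shape[0])
    for _ in range(iters):
        x = x - step * (grad_f(x) + A.T @ lam + rho * A.T @ (A @ x - b))
        lam = lam + rho * (A @ x - b)
    return x, lam
```

For a strongly convex f this converges to a point satisfying the KKT conditions; the paper's contribution concerns the harder non-convex case, where such iterates are shown to reach second-order stationary solutions.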

Countering Language Drift via Grounding

no code implementations · 27 Sep 2018 · Jason Lee, Kyunghyun Cho, Douwe Kiela

While reinforcement learning (RL) shows a lot of promise for natural language processing, e.g.

Language Modelling · Policy Gradient Methods · +3

Multi-Scale Distributed Representation for Deep Learning and its Application to b-Jet Tagging

no code implementations · 29 Nov 2018 · Jason Lee, Inkyu Park, Sangnam Park

Recently machine learning algorithms based on deep layered artificial neural networks (DNNs) have been applied to a wide variety of high energy physics problems such as jet tagging or event classification.

Binarization · Classification with Binary Neural Network · +2

On the Margin Theory of Feedforward Neural Networks

no code implementations · ICLR 2019 · Colin Wei, Jason Lee, Qiang Liu, Tengyu Ma

We establish: 1) for multi-layer feedforward ReLU networks, the global minimizer of a weakly regularized cross-entropy loss has the maximum normalized margin among all networks, and 2) as a result, increasing the over-parametrization improves the normalized margin and generalization error bounds for deep networks.

Multi-Turn Beam Search for Neural Dialogue Modeling

1 code implementation · 1 Jun 2019 · Ilia Kulikov, Jason Lee, Kyunghyun Cho

We propose a novel approach for conversation-level inference by explicitly modeling the dialogue partner and running beam search across multiple conversation turns.
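The conversation-level approach builds on ordinary beam search, which keeps the top-k highest-scoring partial hypotheses at each decoding step; the paper extends this across multiple turns by explicitly modeling the dialogue partner. A sketch of just the single-turn building block, against a toy scoring interface (the function names and the dict-based model are hypothetical):

```python
import math

def beam_search(next_logprobs, start, steps, beam_size=3):
    """Plain beam search: expand every hypothesis with every continuation,
    then keep only the `beam_size` highest-scoring sequences.
    `next_logprobs(seq)` returns a dict token -> log-probability."""
    beams = [([start], 0.0)]  # (sequence, cumulative log-probability)
    for _ in range(steps):
        candidates = []
        for seq, score in beams:
            for tok, lp in next_logprobs(seq).items():
                candidates.append((seq + [tok], score + lp))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    return beams
```

The multi-turn extension in the paper scores whole future conversation rollouts rather than single responses, which this sketch does not attempt to capture.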

Kernel and Rich Regimes in Overparametrized Models

1 code implementation · 13 Jun 2019 · Blake Woodworth, Suriya Gunasekar, Pedro Savarese, Edward Moroshko, Itay Golan, Jason Lee, Daniel Soudry, Nathan Srebro

A recent line of work studies overparametrized neural networks in the "kernel regime," i.e., when the network behaves during training as a kernelized linear predictor, and thus training with gradient descent has the effect of finding the minimum RKHS norm solution.

Latent-Variable Non-Autoregressive Neural Machine Translation with Deterministic Inference Using a Delta Posterior

1 code implementation · 20 Aug 2019 · Raphael Shu, Jason Lee, Hideki Nakayama, Kyunghyun Cho

By decoding multiple initial latent variables in parallel and rescoring them with a teacher model, the proposed model further narrows the gap to 1.0 BLEU point on the WMT'14 En-De task with a 6.8x speedup.

Machine Translation · Translation

Countering Language Drift via Visual Grounding

no code implementations · IJCNLP 2019 · Jason Lee, Kyunghyun Cho, Douwe Kiela

Emergent multi-agent communication protocols are very different from natural language and not easily interpretable by humans.

Language Modelling · Translation · +1

On the Discrepancy between Density Estimation and Sequence Generation

1 code implementation · EMNLP (spnlp) 2020 · Jason Lee, Dustin Tran, Orhan Firat, Kyunghyun Cho

In this paper, by comparing several density estimators on five machine translation tasks, we find that the correlation between rankings of models based on log-likelihood and BLEU varies significantly depending on the range of the model families being compared.

Density Estimation · Machine Translation · +2

Iterative Refinement in the Continuous Space for Non-Autoregressive Neural Machine Translation

1 code implementation · EMNLP 2020 · Jason Lee, Raphael Shu, Kyunghyun Cho

Given a continuous latent variable model for machine translation (Shu et al., 2020), we train an inference network to approximate the gradient of the marginal log probability of the target sentence, using only the latent variable as input.

Machine Translation · Sentence · +1

LXPER Index 2.0: Improving Text Readability Assessment Model for L2 English Students in Korea

no code implementations · AACL (NLP-TEA) 2020 · Bruce W. Lee, Jason Lee

We train our model with CoKEC-text and significantly improve the accuracy of readability assessment for texts in the Korean ELT curriculum.

TeraHAC: Hierarchical Agglomerative Clustering of Trillion-Edge Graphs

no code implementations · 7 Aug 2023 · Laxman Dhulipala, Jason Lee, Jakub Łącki, Vahab Mirrokni

Our algorithm is based on a new approach to computing $(1+\epsilon)$-approximate HAC, which is a novel combination of the nearest-neighbor chain algorithm and the notion of $(1+\epsilon)$-approximate HAC.

Clustering
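The nearest-neighbor chain algorithm referenced above grows a chain by repeatedly following nearest neighbors until it finds a reciprocal nearest-neighbor pair, which can be merged immediately for any "reducible" linkage. A small in-memory sketch with complete linkage on a dense distance matrix (nothing like the paper's distributed, trillion-edge, approximate setting; the function name is hypothetical):

```python
import numpy as np

def nn_chain_hac(D):
    """Agglomerative clustering via the nearest-neighbor chain algorithm,
    using complete linkage on a square distance matrix D.
    Returns a list of merges (i, j, distance)."""
    n = D.shape[0]
    D = D.astype(float).copy()
    active = set(range(n))
    merges, chain = [], []
    while len(active) > 1:
        if not chain:
            chain.append(next(iter(active)))
        while True:
            a = chain[-1]
            # nearest active neighbor of the chain's tip
            b = min((x for x in active if x != a), key=lambda x: D[a, x])
            if len(chain) > 1 and D[a, b] >= D[a, chain[-2]]:
                b = chain[-2]  # reciprocal nearest neighbors: merge them
                break
            chain.append(b)
        j = chain.pop()
        i = chain.pop()
        merges.append((i, j, D[i, j]))
        # Lance-Williams update for complete linkage: cluster i absorbs j.
        for k in active:
            if k not in (i, j):
                D[i, k] = D[k, i] = max(D[i, k], D[j, k])
        active.remove(j)
    return merges
```

Reducibility is what makes the chain reusable after a merge: the merged cluster can never become closer to the remaining chain elements than their existing successors, so no work is thrown away.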
