no code implementations • 8 Feb 2023 • Cheolhyoung Lee, Kyunghyun Cho
We first notice that each parameter configuration in the parameter space corresponds to one particular downstream task of d-way classification.
1 code implementation • 3 Oct 2022 • Eugene Choi, Kyunghyun Cho, Cheolhyoung Lee
We then propose a non-monotonic self-terminating language model, which addresses the issue of non-terminating sequences under incomplete probable decoding algorithms by significantly relaxing the constraint of a monotonically increasing termination probability imposed by the original self-terminating language model of Welleck et al. (2020).
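A minimal sketch of one way to realize such a relaxed termination constraint (illustrative only; the function `eos_probability`, the logit `u_t`, and the specific floor 1 − (1 − ε)^t are assumptions for this sketch, not necessarily the paper's exact parameterization):

```python
import torch

def eos_probability(u_t: torch.Tensor, t: int, eps: float = 1e-3) -> torch.Tensor:
    """Hypothetical <eos> probability at decoding step t.

    The floor 1 - (1 - eps)^t rises toward 1 as t grows, so decoding must
    eventually terminate, yet the probability itself is free to move
    non-monotonically with the context-dependent logit u_t.
    """
    floor = 1.0 - (1.0 - eps) ** t                    # lower bound, -> 1 as t grows
    return 1.0 - (1.0 - floor) * torch.sigmoid(u_t)   # always >= floor
```

Under this construction the termination probability can dip and rise with `u_t` across steps, unlike the strictly monotone schedule of the original self-terminating formulation, while still guaranteeing termination in the limit.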
2 code implementations • ICLR 2020 • Cheolhyoung Lee, Kyunghyun Cho, Wanmo Kang
We empirically evaluate the proposed mixout and its variants by fine-tuning a pretrained language model on downstream tasks.
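As a rough illustration, mixout can be pictured as dropout whose "drop" target is the pretrained parameter rather than zero. A minimal sketch, assuming the formulation in the paper (the function name and rescaling convention here are illustrative):

```python
import torch

def mixout(w: torch.Tensor, w_pre: torch.Tensor, p: float) -> torch.Tensor:
    """Sketch of mixout(p): with probability p, swap each current parameter
    for its pretrained value, then rescale so the result is an unbiased
    estimate of the current parameters (as inverted dropout does for zero)."""
    mask = torch.bernoulli(torch.full_like(w, p))   # 1 -> use pretrained value
    mixed = mask * w_pre + (1.0 - mask) * w
    return (mixed - p * w_pre) / (1.0 - p)          # E[return value] == w
```

Setting `w_pre` to zero recovers ordinary (inverted) dropout, which is the correspondence the technique builds on.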
no code implementations • ICLR 2019 • Cheolhyoung Lee, Kyunghyun Cho, Wanmo Kang
We empirically verify this result using deep convolutional networks and observe that the gradient stochasticity correlates more strongly with the proposed directional uniformity than with the gradient norm stochasticity, suggesting that the directional statistics of minibatch gradients are a major factor behind SGD.
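One way to probe this kind of directional statistic (a hedged sketch: the mean-resultant-length measure below is a generic quantity from directional statistics, not necessarily the paper's exact estimator):

```python
import torch

def directional_uniformity(grads: list[torch.Tensor]) -> float:
    """Illustrative measure of how uniformly minibatch gradient *directions*
    spread over the unit sphere: the mean resultant length of the normalized
    gradients is ~1 when they all point the same way and ~0 when their
    directions are uniform, so we report its complement."""
    units = torch.stack([g.flatten() / g.flatten().norm() for g in grads])
    resultant = units.mean(dim=0).norm().item()   # in [0, 1]
    return 1.0 - resultant
```

Here `grads` would hold the gradients of the same parameters computed on different minibatches at a fixed iterate.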