Search Results for author: Cheolhyoung Lee

Found 4 papers, 2 papers with code

Unsupervised Learning of Initialization in Deep Neural Networks via Maximum Mean Discrepancy

no code implementations • 8 Feb 2023 • Cheolhyoung Lee, Kyunghyun Cho

We first notice that each parameter configuration in the parameter space corresponds to one particular downstream task of d-way classification.
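The excerpt only hints at the method; per the title, the initialization is learned by minimizing a maximum mean discrepancy (MMD). As background, here is a minimal sketch of the standard RBF-kernel estimate of squared MMD between two samples, the quantity such an objective is built on (the function name and bandwidth are illustrative, not the paper's code):

```python
import torch

def rbf_mmd2(x, y, bandwidth=1.0):
    # Biased estimate of squared MMD between samples x and y (rows are
    # points) under an RBF kernel; illustrative helper, not from the paper.
    def k(a, b):
        d2 = torch.cdist(a, b).pow(2)          # pairwise squared distances
        return torch.exp(-d2 / (2 * bandwidth ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()
```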

A Non-monotonic Self-terminating Language Model

1 code implementation • 3 Oct 2022 • Eugene Choi, Kyunghyun Cho, Cheolhyoung Lee

We then propose a non-monotonic self-terminating language model, which significantly relaxes the monotonically increasing termination probability constraint of the self-terminating language model originally proposed by Welleck et al. (2020), to address non-terminating sequences under incomplete probable decoding algorithms.

Language Modelling • Text Generation
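For context, Welleck et al. (2020) guarantee termination by making the eos probability non-decreasing over time, which is exactly the constraint the non-monotonic variant relaxes. A toy sketch of such a monotone construction (a simplified illustration, not the exact parametrization of either paper):

```python
import torch

def monotone_eos_probs(eos_logits):
    # Toy monotone termination schedule:
    #   p_t = p_{t-1} + (1 - p_{t-1}) * sigmoid(u_t),
    # so the eos probability never decreases and tends to 1 as long as
    # the added mass stays bounded away from zero. Illustration only.
    p, out = torch.zeros(()), []
    for u in eos_logits:          # per-step eos logits
        p = p + (1 - p) * torch.sigmoid(u)
        out.append(p)
    return torch.stack(out)
```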

Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models

2 code implementations • ICLR 2020 • Cheolhyoung Lee, Kyunghyun Cho, Wanmo Kang

We empirically evaluate the proposed mixout and its variants on finetuning a pretrained language model on downstream tasks.

Language Modelling
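The excerpt presumes familiarity with mixout itself: during finetuning it stochastically replaces parameters with their pretrained values (where dropout would replace them with zeros) and rescales so the expected parameter is unchanged. A minimal sketch under that description (the function name, default rate, and per-element masking are illustrative):

```python
import torch

def mixout(weight, weight_pretrained, p=0.7):
    # With probability p, swap each element for its pretrained value,
    # then correct so E[output] == weight (cf. inverted dropout).
    mask = torch.bernoulli(torch.full_like(weight, p))  # 1 -> use pretrained
    mixed = mask * weight_pretrained + (1 - mask) * weight
    return (mixed - p * weight_pretrained) / (1 - p)
```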

Directional Analysis of Stochastic Gradient Descent via von Mises-Fisher Distributions in Deep Learning

no code implementations • ICLR 2019 • Cheolhyoung Lee, Kyunghyun Cho, Wanmo Kang

We empirically verify our result using deep convolutional networks, observing that gradient stochasticity correlates more strongly with the proposed directional uniformity than with the stochasticity of the gradient norm, which suggests that the directional statistics of minibatch gradients are a major factor behind SGD.
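The quantity of interest here is how uniformly minibatch gradient directions spread over the unit sphere. A small sketch of a standard von Mises-Fisher concentration estimate from a batch of gradient vectors, using the resultant-length approximation of Banerjee et al. (2005) (illustrative, not necessarily the paper's estimator):

```python
import torch

def vmf_concentration(grads):
    # Rows of `grads` are minibatch gradient vectors. Project onto the
    # unit sphere, take the mean resultant length R, and apply the
    # standard approximation kappa ~= R * (d - R^2) / (1 - R^2).
    # Small kappa means directions are close to uniform.
    d = grads.shape[1]
    units = grads / grads.norm(dim=1, keepdim=True)
    r = units.mean(dim=0).norm()
    return r * (d - r ** 2) / (1 - r ** 2)
```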
