no code implementations • 2 Dec 2024 • Sheikh Shafayat, Dongkeun Yoon, Woori Jang, Jiwoo Choi, Alice Oh, Seohyon Jung
In this work, we propose and evaluate the feasibility of a two-stage pipeline for fine-grained evaluation of literary machine translation from English to Korean.
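A minimal sketch of what such a two-stage, fine-grained evaluation pipeline could look like; the stage definitions, error taxonomy, and scoring weights below are illustrative assumptions, not the paper's actual design.

```python
# Hypothetical two-stage pipeline: (1) mark fine-grained error spans,
# (2) aggregate them into a segment-level score. All details are assumed.
from dataclasses import dataclass

@dataclass
class ErrorSpan:
    span: str        # offending substring in the Korean translation
    category: str    # e.g. "mistranslation", "omission" (assumed taxonomy)
    severity: int    # 1 = minor, 2 = major (assumed scale)

def stage1_identify_errors(source_en: str, translation_ko: str) -> list[ErrorSpan]:
    """Stage 1: flag fine-grained error spans (placeholder heuristic)."""
    errors = []
    if not translation_ko.strip():
        errors.append(ErrorSpan(span="", category="omission", severity=2))
    return errors

def stage2_score(errors: list[ErrorSpan]) -> float:
    """Stage 2: turn span-level errors into a segment score (assumed weighting)."""
    penalty = sum(1.0 if e.severity == 1 else 3.0 for e in errors)
    return max(0.0, 10.0 - penalty)

errs = stage1_identify_errors("The sea was calm that night.", "그날 밤 바다는 고요했다.")
print(stage2_score(errs))  # 10.0 when no errors are flagged
```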
1 code implementation • 23 Oct 2024 • Guijin Son, Dongkeun Yoon, Juyoung Suk, Javier Aula-Blasco, Mano Aslan, Vu Trong Kim, Shayekh Bin Islam, Jaume Prats-Cristià, Lucía Tormo-Bañuelos, Seungone Kim
To address this, we introduce MM-Eval, a multilingual meta-evaluation benchmark that covers 18 languages across six categories.
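A minimal sketch of the meta-evaluation idea behind a benchmark like MM-Eval: given response pairs with a known better answer, measure how often a judge model picks the gold winner, broken down by language. The data format and judge interface here are assumptions for illustration.

```python
# Per-language judge accuracy against gold preference labels (illustrative).
from collections import defaultdict

examples = [
    # (language, response_a, response_b, gold_winner)
    ("ko", "정답은 4입니다.", "정답은 5입니다.", "a"),
    ("es", "La capital es Madrid.", "La capital es Barcelona.", "a"),
]

def judge(response_a: str, response_b: str) -> str:
    """Placeholder judge: a real setup would query an LLM judge here."""
    return "a" if len(response_a) <= len(response_b) else "b"

hits, totals = defaultdict(int), defaultdict(int)
for lang, a, b, gold in examples:
    totals[lang] += 1
    if judge(a, b) == gold:
        hits[lang] += 1

for lang in totals:
    print(lang, f"judge accuracy = {hits[lang] / totals[lang]:.2f}")
```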
1 code implementation • 9 Jun 2024 • Seungone Kim, Juyoung Suk, Ji Yong Cho, Shayne Longpre, Chaeeun Kim, Dongkeun Yoon, Guijin Son, Yejin Cho, Sheikh Shafayat, Jinheon Baek, Sue Hyun Park, Hyeonbin Hwang, Jinkyung Jo, Hyowon Cho, Haebin Shin, Seongyun Lee, Hanseok Oh, Noah Lee, Namgyu Ho, Se June Joo, Miyoung Ko, Yoonjoo Lee, Hyungjoo Chae, Jamin Shin, Joel Jang, Seonghyeon Ye, Bill Yuchen Lin, Sean Welleck, Graham Neubig, Moontae Lee, Kyungjae Lee, Minjoon Seo
To overcome these limitations, we introduce the BiGGen Bench, a principled generation benchmark designed to thoroughly evaluate nine distinct capabilities of LMs across 77 diverse tasks.
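An illustrative sketch of rubric-based scoring in the spirit of a generation benchmark like BiGGen Bench: each instance carries its own evaluation criterion, and a judge assigns a 1-5 score against that rubric. The instance schema and judge function below are assumptions, not the benchmark's actual format.

```python
# One benchmark instance with its own scoring rubric (schema assumed).
instance = {
    "capability": "reasoning",
    "input": "If all bloops are razzies and all razzies are lazzies, are all bloops lazzies?",
    "reference_answer": "Yes, by transitivity.",
    "scoring_rubric": "Score 5 if the answer says yes and justifies it via transitivity.",
}

def judge_score(response: str, rubric: str, reference: str) -> int:
    """Placeholder judge: a real pipeline would prompt an evaluator LM with the rubric."""
    return 5 if "yes" in response.lower() and "transitiv" in response.lower() else 2

print(judge_score("Yes, all bloops are lazzies by transitivity.",
                  instance["scoring_rubric"], instance["reference_answer"]))  # 5
```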
1 code implementation • 19 Jan 2024 • Dongkeun Yoon, Joel Jang, Sungdong Kim, Seungone Kim, Sheikh Shafayat, Minjoon Seo
We introduce LangBridge, a zero-shot approach to adapt language models for multilingual reasoning tasks without multilingual supervision.
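A minimal PyTorch sketch of the bridging idea: map hidden states from a multilingual encoder into the embedding space of an English-centric LM through a small trainable projection, so multilingual inputs can be consumed without multilingual supervision. The dimensions and the choice to freeze the encoder and LM are placeholder assumptions; see the released code for the actual architecture and training recipe.

```python
import torch
import torch.nn as nn

class Bridge(nn.Module):
    """Projects multilingual encoder states into the LM's embedding space."""
    def __init__(self, enc_dim: int = 1024, lm_dim: int = 4096):
        super().__init__()
        self.proj = nn.Linear(enc_dim, lm_dim)  # trainable; encoder/LM assumed frozen

    def forward(self, enc_hidden: torch.Tensor) -> torch.Tensor:
        # enc_hidden: (batch, seq_len, enc_dim) from the multilingual encoder
        return self.proj(enc_hidden)            # (batch, seq_len, lm_dim) soft prefix for the LM

bridge = Bridge()
fake_encoder_output = torch.randn(2, 16, 1024)  # stand-in for multilingual encoder states
soft_prefix = bridge(fake_encoder_output)
print(soft_prefix.shape)                        # torch.Size([2, 16, 4096])
```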
1 code implementation • 12 Jun 2023 • Dongkeun Yoon, Joel Jang, Sungdong Kim, Minjoon Seo
In this work, we empirically show that updating pretrained LMs (350M, 1.3B, 2.7B) with just a few steps of Gradient Ascent Post-training (GAP) on random, unlabeled text corpora enhances their zero-shot generalization capabilities across diverse NLP tasks.
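A hedged sketch of the GAP idea described above: take a pretrained causal LM and run a few update steps that maximize, rather than minimize, the LM loss on unlabeled text. The model choice, step count, and learning rate are illustrative, not the paper's exact configuration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # small stand-in for the paper's 350M/1.3B/2.7B models
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
opt = torch.optim.AdamW(model.parameters(), lr=5e-6)

unlabeled_texts = ["The quick brown fox jumps over the lazy dog."]  # random-corpus stand-in

model.train()
for step, text in enumerate(unlabeled_texts * 3):       # "just a few steps"
    batch = tok(text, return_tensors="pt")
    out = model(**batch, labels=batch["input_ids"])
    loss = -out.loss                                     # gradient ascent: negate the LM loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(f"step {step}: lm loss = {out.loss.item():.3f}")
```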
2 code implementations • 4 Oct 2022 • Joel Jang, Dongkeun Yoon, Sohee Yang, Sungmin Cha, Moontae Lee, Lajanugen Logeswaran, Minjoon Seo
Pretrained Language Models (LMs) memorize a vast amount of knowledge during initial pretraining, including information that may violate the privacy of personal lives and identities.
Ranked #3 on Language Modelling on The Pile (Test perplexity metric)
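An illustrative sketch of the knowledge-unlearning idea from the entry above: run gradient ascent on the specific sequences to be forgotten so the model assigns them lower likelihood. The model, the example sequence, and the number of steps are placeholder assumptions; the released code contains the actual procedure and stopping criteria.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")           # stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2")
opt = torch.optim.AdamW(model.parameters(), lr=5e-5)

target = "John Doe's phone number is 555-0192."       # fabricated stand-in for memorized private text
batch = tok(target, return_tensors="pt")

model.train()
for _ in range(5):                                     # unlearning steps on the target sequence
    out = model(**batch, labels=batch["input_ids"])
    (-out.loss).backward()                             # ascend: make the sequence less likely
    opt.step()
    opt.zero_grad()

print(f"target sequence loss after unlearning: {out.loss.item():.3f}")  # expected to increase
```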