Search Results for author: Seungone Kim

Found 14 papers, 11 papers with code

Self-Explore to Avoid the Pit: Improving the Reasoning Capabilities of Language Models with Fine-grained Rewards

1 code implementation · 16 Apr 2024 · Hyeonbin Hwang, Doyoung Kim, Seungone Kim, Seonghyeon Ye, Minjoon Seo

Training on large amounts of rationales (i.e., CoT Fine-tuning) is effective at improving the reasoning capabilities of large language models (LLMs).

Tasks: GSM8K, Math
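As a rough sketch of the CoT fine-tuning setup the abstract refers to, each training example pairs a question with a rationale that ends in the final answer, so the model learns to generate the reasoning chain rather than just the label. The field names and prompt template below are illustrative assumptions, not the paper's format.

```python
# Minimal, hypothetical sketch of CoT fine-tuning data formatting.

def format_cot_example(question: str, rationale: str, answer: str) -> dict:
    """Build a (prompt, target) pair for supervised CoT fine-tuning."""
    prompt = f"Question: {question}\nAnswer with step-by-step reasoning:"
    target = f"{rationale}\nThe answer is {answer}."
    return {"prompt": prompt, "target": target}

example = format_cot_example(
    question="Natalia sold clips to 48 friends in April, and half as many "
             "in May. How many clips did she sell altogether?",
    rationale="In May she sold 48 / 2 = 24 clips. "
              "In total she sold 48 + 24 = 72 clips.",
    answer="72",
)
print(example["prompt"])
print(example["target"])
```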

Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models

no code implementations · 3 Apr 2024 · Hyungjoo Chae, Yeonghyeon Kim, Seungone Kim, Kai Tzu-iunn Ong, Beong-woo Kwak, Moohyeon Kim, SeongHwan Kim, Taeyoon Kwon, Jiwan Chung, Youngjae Yu, Jinyoung Yeo

Also, we show that compared to natural language, pseudocode can better guide the reasoning of LMs, even though they are trained to follow natural language instructions.
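A minimal sketch of the pseudocode-as-prompt idea: the model is asked to act as an interpreter and trace execution step by step instead of reasoning in free-form natural language. The template and pseudocode below are illustrative assumptions, not the paper's exact prompt.

```python
# Hypothetical prompt builder for simulating pseudocode execution with an LM.

PSEUDOCODE = """\
function count_vowels(s):
    count = 0
    for ch in s:
        if ch in {a, e, i, o, u}:
            count = count + 1
    return count
"""

def build_simulation_prompt(pseudocode: str, inputs: str) -> str:
    """Ask the model to behave like an interpreter and trace each step."""
    return (
        "You are a compiler. Simulate the pseudocode below on the given "
        "input, printing the value of each variable after every step, "
        "then output the final return value.\n\n"
        f"Pseudocode:\n{pseudocode}\n"
        f"Input: {inputs}\n"
        "Execution trace:"
    )

print(build_simulation_prompt(PSEUDOCODE, 's = "reasoning"'))
```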

KMMLU: Measuring Massive Multitask Language Understanding in Korean

no code implementations · 18 Feb 2024 · Guijin Son, Hanwool Lee, Sungdong Kim, Seungone Kim, Niklas Muennighoff, Taekyoon Choi, Cheonbok Park, Kang Min Yoo, Stella Biderman

We propose KMMLU, a new Korean benchmark with 35,030 expert-level multiple-choice questions across 45 subjects ranging from humanities to STEM.

Tasks: Language Modelling, Multiple-choice

Multi-Task Inference: Can Large Language Models Follow Multiple Instructions at Once?

1 code implementation · 18 Feb 2024 · Guijin Son, Sangwon Baek, Sangdae Nam, Ilgyun Jeong, Seungone Kim

Large language models (LLMs) are typically prompted to follow a single instruction per inference call.
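A hedged sketch of the alternative the title asks about: packing several instructions into a single inference call and requesting numbered answers. The template is an assumption for illustration, not the paper's format.

```python
# Illustrative multi-task prompt: one call covers several instructions.

def build_multi_task_prompt(instructions: list[str]) -> str:
    """Concatenate instructions into one numbered prompt."""
    numbered = "\n".join(f"{i + 1}. {inst}" for i, inst in enumerate(instructions))
    return (
        "Answer each of the following instructions in order, "
        "prefixing each answer with its number:\n"
        f"{numbered}"
    )

prompt = build_multi_task_prompt([
    "Translate 'good morning' into French.",
    "What is 17 * 23?",
    "Name the capital of South Korea.",
])
print(prompt)  # one inference call now covers three tasks
```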

LangBridge: Multilingual Reasoning Without Multilingual Supervision

no code implementations · 19 Jan 2024 · Dongkeun Yoon, Joel Jang, Sungdong Kim, Seungone Kim, Sheikh Shafayat, Minjoon Seo

We introduce LangBridge, a zero-shot approach to adapt language models for multilingual reasoning tasks without multilingual supervision.

Tasks: Logical Reasoning, Mathematical Reasoning

Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging

1 code implementation · 17 Oct 2023 · Joel Jang, Seungone Kim, Bill Yuchen Lin, Yizhong Wang, Jack Hessel, Luke Zettlemoyer, Hannaneh Hajishirzi, Yejin Choi, Prithviraj Ammanabrolu

In this work, we study the Reinforcement Learning from Personalized Human Feedback (RLPHF) problem, wherein LLMs are aligned to multiple (sometimes conflicting) preferences by modeling alignment as a Multi-Objective Reinforcement Learning (MORL) problem.

Tasks: Language Modelling, Large Language Model, +2
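The title's post-hoc parameter merging can be sketched as a per-user weighted average of expert policy parameters, one expert per preference dimension. Real models merge tensors (e.g., torch state dicts); plain floats keep the sketch self-contained, and the parameter names and weights are hypothetical.

```python
# Hypothetical sketch of post-hoc parameter merging across preference experts.

def merge_policies(policies: list[dict[str, float]],
                   weights: list[float]) -> dict[str, float]:
    """Weighted average of per-preference expert parameters."""
    assert abs(sum(weights) - 1.0) < 1e-9, "weights should sum to 1"
    merged = {}
    for name in policies[0]:
        merged[name] = sum(w * p[name] for w, p in zip(weights, policies))
    return merged

helpful_expert = {"layer.w": 0.8, "layer.b": -0.1}  # tuned for helpfulness
concise_expert = {"layer.w": 0.2, "layer.b": 0.3}   # tuned for conciseness

# A user who weights helpfulness 70% and conciseness 30%:
print(merge_policies([helpful_expert, concise_expert], [0.7, 0.3]))
```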

Prometheus: Inducing Fine-grained Evaluation Capability in Language Models

2 code implementations · 12 Oct 2023 · Seungone Kim, Jamin Shin, Yejin Cho, Joel Jang, Shayne Longpre, Hwaran Lee, Sangdoo Yun, Seongjin Shin, Sungdong Kim, James Thorne, Minjoon Seo

We first construct the Feedback Collection, a new dataset that consists of 1K fine-grained score rubrics, 20K instructions, and 100K responses and language feedback generated by GPT-4.

Tasks: Language Modelling, Large Language Model
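A minimal sketch of rubric-conditioned evaluation in the style the abstract describes: the evaluator model receives an instruction, a response, and a fine-grained score rubric, then produces feedback plus a score. The prompt template is an illustrative assumption, not the Feedback Collection's actual format.

```python
# Hypothetical rubric-based evaluation prompt, in the spirit of the abstract.

def build_eval_prompt(instruction: str, response: str, rubric: str) -> str:
    """Combine instruction, response, and rubric into one evaluator prompt."""
    return (
        "Evaluate the response against the rubric. "
        "First write feedback, then output 'Score: N' (1-5).\n\n"
        f"Instruction: {instruction}\n"
        f"Response: {response}\n"
        f"Rubric: {rubric}"
    )

print(build_eval_prompt(
    instruction="Explain photosynthesis to a 10-year-old.",
    response="Plants use sunlight to turn water and air into food.",
    rubric="Is the explanation accurate, complete, and age-appropriate?",
))
```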

FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets

1 code implementation · 20 Jul 2023 · Seonghyeon Ye, Doyoung Kim, Sungdong Kim, Hyeonbin Hwang, Seungone Kim, Yongrae Jo, James Thorne, Juho Kim, Minjoon Seo

Evaluation of Large Language Models (LLMs) is challenging because instruction-following necessitates alignment with human values, and the required set of skills varies depending on the instruction.

Tasks: Instruction Following, Language Modelling

The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning

2 code implementations · 23 May 2023 · Seungone Kim, Se June Joo, Doyoung Kim, Joel Jang, Seonghyeon Ye, Jamin Shin, Minjoon Seo

Furthermore, we show that instruction tuning with CoT Collection allows LMs to possess stronger few-shot learning capabilities on 4 domain-specific tasks, resulting in an improvement of +2.24% (Flan-T5 3B) and +2.37% (Flan-T5 11B), even outperforming ChatGPT utilizing demonstrations up to the maximum input length by a +13.98% margin.

Tasks: Common Sense Reasoning, Common Sense Reasoning (Zero-Shot), +7

CoTEVer: Chain of Thought Prompting Annotation Toolkit for Explanation Verification

1 code implementation · 7 Mar 2023 · Seungone Kim, Se June Joo, Yul Jang, Hyungjoo Chae, Jinyoung Yeo

To improve the correctness of explanations, language models need to be fine-tuned with explanation data.

Exploring the Benefits of Training Expert Language Models over Instruction Tuning

2 code implementations · 7 Feb 2023 · Joel Jang, Seungone Kim, Seonghyeon Ye, Doyoung Kim, Lajanugen Logeswaran, Moontae Lee, Kyungjae Lee, Minjoon Seo

Recently, Language Models (LMs) instruction-tuned on multiple tasks, also known as multitask-prompted fine-tuning (MT), have shown the capability to generalize to unseen tasks.

Tasks: Common Sense Reasoning, Coreference Resolution, +4

Mind the Gap! Injecting Commonsense Knowledge for Abstractive Dialogue Summarization

1 code implementation · COLING 2022 · Seungone Kim, Se June Joo, Hyungjoo Chae, Chaehyeong Kim, Seung-won Hwang, Jinyoung Yeo

In this paper, we propose to leverage a unique characteristic of dialogues, the sharing of commonsense knowledge across participants, to resolve the difficulties in summarizing them.

Tasks: Abstractive Dialogue Summarization, Multi-Task Learning, +1

Can Language Models perform Abductive Commonsense Reasoning?

1 code implementation · 7 Jul 2022 · Seungone Kim

Abductive reasoning is the task of inferring the most plausible hypothesis given a set of observations.
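A tiny worked example of that setup, with hypothetical data and a stand-in heuristic scorer in place of a language model's plausibility estimate: given two observations, choose the hypothesis that most plausibly explains what happened in between.

```python
# Illustrative abductive-reasoning example; data and scorer are hypothetical.

observation_1 = "Dotty was in a bad mood."
observation_2 = "Dotty felt much better afterwards."
hypotheses = [
    "Dotty ate her favorite ice cream.",   # plausible bridge
    "Dotty stubbed her toe on the door.",  # implausible bridge
]

def plausibility(hypothesis: str) -> float:
    """Stand-in scorer; a real system would use LM log-probabilities."""
    return 1.0 if "ice cream" in hypothesis else 0.0

best = max(hypotheses, key=plausibility)
print(f"Most plausible hypothesis: {best}")
```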
