Search Results for author: Sungdong Kim

Found 26 papers, 15 papers with code

KMMLU: Measuring Massive Multitask Language Understanding in Korean

no code implementations18 Feb 2024 Guijin Son, Hanwool Lee, Sungdong Kim, Seungone Kim, Niklas Muennighoff, Taekyoon Choi, Cheonbok Park, Kang Min Yoo, Stella Biderman

We propose KMMLU, a new Korean benchmark with 35,030 expert-level multiple-choice questions across 45 subjects ranging from humanities to STEM.

Language Modelling Multiple-choice

Aligning Large Language Models by On-Policy Self-Judgment

1 code implementation17 Feb 2024 Sangkyu Lee, Sungdong Kim, Ashkan Yousefpour, Minjoon Seo, Kang Min Yoo, Youngjae Yu

Existing approaches for aligning large language models with human preferences face a trade-off that requires a separate reward model (RM) for on-policy learning.

Instruction Following

LangBridge: Multilingual Reasoning Without Multilingual Supervision

no code implementations19 Jan 2024 Dongkeun Yoon, Joel Jang, Sungdong Kim, Seungone Kim, Sheikh Shafayat, Minjoon Seo

We introduce LangBridge, a zero-shot approach to adapt language models for multilingual reasoning tasks without multilingual supervision.

Logical Reasoning Mathematical Reasoning

Tree of Clarifications: Answering Ambiguous Questions with Retrieval-Augmented Large Language Models

1 code implementation23 Oct 2023 Gangwoo Kim, Sungdong Kim, Byeongguk Jeon, Joonsuk Park, Jaewoo Kang

To cope with the challenge, we propose a novel framework, Tree of Clarifications (ToC): It recursively constructs a tree of disambiguations for the ambiguous question (AQ) -- via few-shot prompting leveraging external knowledge -- and uses it to generate a long-form answer.

Open-Domain Question Answering Retrieval
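The recursive tree construction described above can be sketched as follows. This is a minimal illustration only: `disambiguate` stands in for few-shot LLM prompting over retrieved passages, and the branching factor, depth cap, and question strings are all hypothetical.

```python
# Sketch of the Tree of Clarifications idea: recursively expand an ambiguous
# question (AQ) into disambiguated sub-questions, then gather the leaves for
# long-form answer generation. `disambiguate` is a hypothetical stand-in for
# few-shot prompting with external knowledge.

def disambiguate(question: str, depth: int) -> list[str]:
    # Placeholder: a real system would prompt an LLM with retrieved evidence.
    if depth >= 2:  # stop expanding below a fixed depth
        return []
    return [f"{question} (interpretation {i})" for i in (1, 2)]

def build_tree(question: str, depth: int = 0) -> dict:
    children = disambiguate(question, depth)
    return {"q": question,
            "children": [build_tree(c, depth + 1) for c in children]}

def collect_leaves(node: dict) -> list[str]:
    # Leaves are the fully disambiguated questions to answer and aggregate.
    if not node["children"]:
        return [node["q"]]
    leaves = []
    for child in node["children"]:
        leaves.extend(collect_leaves(child))
    return leaves

tree = build_tree("Who won the open?")
print(len(collect_leaves(tree)))  # 4 fully disambiguated leaf questions
```

In the actual framework each node's expansion is grounded in retrieved passages and invalid branches are pruned; here the tree simply branches twice per level to show the recursion.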

Prometheus: Inducing Fine-grained Evaluation Capability in Language Models

1 code implementation12 Oct 2023 Seungone Kim, Jamin Shin, Yejin Cho, Joel Jang, Shayne Longpre, Hwaran Lee, Sangdoo Yun, Seongjin Shin, Sungdong Kim, James Thorne, Minjoon Seo

We first construct the Feedback Collection, a new dataset that consists of 1K fine-grained score rubrics, 20K instructions, and 100K responses and language feedback generated by GPT-4.

Language Modelling Large Language Model

FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets

1 code implementation20 Jul 2023 Seonghyeon Ye, Doyoung Kim, Sungdong Kim, Hyeonbin Hwang, Seungone Kim, Yongrae Jo, James Thorne, Juho Kim, Minjoon Seo

Evaluation of Large Language Models (LLMs) is challenging because instruction-following necessitates alignment with human values, and the required set of skills varies depending on the instruction.

Instruction Following Language Modelling

Gradient Ascent Post-training Enhances Language Model Generalization

1 code implementation12 Jun 2023 Dongkeun Yoon, Joel Jang, Sungdong Kim, Minjoon Seo

In this work, we empirically show that updating pretrained LMs (350M, 1.3B, 2.7B) with just a few steps of Gradient Ascent Post-training (GAP) on random, unlabeled text corpora enhances their zero-shot generalization capabilities across diverse NLP tasks.

Language Modelling Zero-shot Generalization
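The core of GAP is simply taking a few optimizer steps *up* the loss gradient rather than down. The toy below illustrates the sign flip on a one-parameter quadratic loss; the model, loss, learning rate, and step count are all illustrative stand-ins, not the paper's setup (which applies this to an LM's cross-entropy on random unlabeled text).

```python
# Toy illustration of Gradient Ascent Post-training (GAP): a few update
# steps with the gradient sign flipped, so the loss increases instead of
# decreasing. Here the "model" is a single parameter w with loss (w - 3)^2.

def loss(w: float) -> float:
    return (w - 3.0) ** 2

def grad(w: float) -> float:
    return 2.0 * (w - 3.0)

w0 = 2.0
w, lr = w0, 0.1
for _ in range(5):            # "just a few steps"
    w = w + lr * grad(w)      # ascent: + where descent would use -
print(loss(w) > loss(w0))     # True: the loss has gone up
```

The counter-intuitive finding in the paper is that this brief "un-training" on random text acts as a regularizer that improves zero-shot generalization, rather than simply degrading the model.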

Revealing User Familiarity Bias in Task-Oriented Dialogue via Interactive Evaluation

no code implementations23 May 2023 Takyoung Kim, Jamin Shin, Young-Ho Kim, Sanghwan Bae, Sungdong Kim

Most task-oriented dialogue (TOD) benchmarks assume users who know exactly how to use the system, constraining user behaviors within the system's capabilities via strict user goals -- namely, "user familiarity" bias.

Aligning Large Language Models through Synthetic Feedback

1 code implementation23 May 2023 Sungdong Kim, Sanghwan Bae, Jamin Shin, Soyoung Kang, Donghyun Kwak, Kang Min Yoo, Minjoon Seo

In human evaluation, our model is preferred to Alpaca and Dolly-v2, 55.0% and 58.5% of the time, respectively.

Language Modelling

Semi-Parametric Video-Grounded Text Generation

no code implementations27 Jan 2023 Sungdong Kim, Jin-Hwa Kim, Jiyoung Lee, Minjoon Seo

Efficient video-language modeling should consider the computational cost because of a large, sometimes intractable, number of video frames.

Language Modelling Text Generation

Leveraging Large Language Models to Power Chatbots for Collecting User Self-Reported Data

no code implementations14 Jan 2023 Jing Wei, Sungdong Kim, Hyunhoon Jung, Young-Ho Kim

Through an online study (N = 48) where participants conversed with chatbots driven by different designs of prompts, we assessed how prompt designs and conversation topics affected the conversation flows and users' perceptions of chatbots.

Can Current Task-oriented Dialogue Models Automate Real-world Scenarios in the Wild?

no code implementations20 Dec 2022 Sang-Woo Lee, Sungdong Kim, Donghyeon Ko, Donghoon Ham, Youngki Hong, Shin Ah Oh, Hyunhoon Jung, Wangkyo Jung, Kyunghyun Cho, Donghyun Kwak, Hyungsuk Noh, WooMyoung Park

Task-oriented dialogue (TOD) systems are mainly based on the slot-filling-based TOD (SF-TOD) framework, in which dialogues are broken down into smaller, controllable units (i.e., slots) to fulfill a specific task.

Language Modelling Position

Keep Me Updated! Memory Management in Long-term Conversations

no code implementations17 Oct 2022 Sanghwan Bae, Donghyun Kwak, Soyoung Kang, Min Young Lee, Sungdong Kim, Yuin Jeong, Hyeri Kim, Sang-Woo Lee, WooMyoung Park, Nako Sung

Remembering important information from the past and continuing to talk about it in the present are crucial in long-term conversations.

Management

Leveraging Pre-Trained Language Models to Streamline Natural Language Interaction for Self-Tracking

no code implementations31 May 2022 Young-Ho Kim, Sungdong Kim, Minsuk Chang, Sang-Woo Lee

Current natural language interaction for self-tracking tools largely depends on bespoke implementation optimized for a specific tracking theme and data format, which is neither generalizable nor scalable to a tremendous design space of self-tracking.

Generating Information-Seeking Conversations from Unlabeled Documents

1 code implementation25 May 2022 Gangwoo Kim, Sungdong Kim, Kang Min Yoo, Jaewoo Kang

In this paper, we introduce a novel framework, SIMSEEK (Simulating Information-Seeking Conversations from Unlabeled Documents), and compare its two variants.

Conversational Search

Building a Role Specified Open-Domain Dialogue System Leveraging Large-Scale Language Models

1 code implementation NAACL 2022 Sanghwan Bae, Donghyun Kwak, Sungdong Kim, Donghoon Ham, Soyoung Kang, Sang-Woo Lee, WooMyoung Park

In this work, we study the challenge of imposing roles on open-domain dialogue systems, with the goal of making the systems maintain consistent roles while conversing naturally with humans.

Few-Shot Learning

Saving Dense Retriever from Shortcut Dependency in Conversational Search

1 code implementation15 Feb 2022 Sungdong Kim, Gangwoo Kim

In this paper, we demonstrate the existence of a retrieval shortcut in CS, which causes models to retrieve passages solely relying on partial history while disregarding the latest question.

Conversational Search Retrieval

Can Language Models be Biomedical Knowledge Bases?

1 code implementation EMNLP 2021 Mujeen Sung, Jinhyuk Lee, Sean Yi, Minji Jeon, Sungdong Kim, Jaewoo Kang

To this end, we create the BioLAMA benchmark, which comprises 49K biomedical factual knowledge triples for probing biomedical LMs.

Efficient Dialogue State Tracking by Selectively Overwriting Memory

3 code implementations ACL 2020 Sungdong Kim, Sohee Yang, Gyuwan Kim, Sang-Woo Lee

This mechanism consists of two steps: (1) predicting state operation on each of the memory slots, and (2) overwriting the memory with new values, of which only a few are generated according to the predicted state operations.

Dialogue State Tracking Multi-domain Dialogue State Tracking

QADiver: Interactive Framework for Diagnosing QA Models

no code implementations1 Dec 2018 Gyeongbok Lee, Sungdong Kim, Seung-won Hwang

Question answering (QA), which extracts answers from text for a given natural-language question, has been actively studied, and existing models have shown promise of outperforming human performance when trained and evaluated on the SQuAD dataset.

Question Answering
