no code implementations • 19 Feb 2024 • Chanwoong Yoon, Gangwoo Kim, Byeongguk Jeon, Sungdong Kim, Yohan Jo, Jaewoo Kang
Furthermore, we fine-tune a smaller LM using this dataset to align it with the retrievers' preferences as feedback.
no code implementations • 18 Feb 2024 • Guijin Son, Hanwool Lee, Sungdong Kim, Seungone Kim, Niklas Muennighoff, Taekyoon Choi, Cheonbok Park, Kang Min Yoo, Stella Biderman
We propose KMMLU, a new Korean benchmark with 35,030 expert-level multiple-choice questions across 45 subjects ranging from humanities to STEM.
1 code implementation • 17 Feb 2024 • Sangkyu Lee, Sungdong Kim, Ashkan Yousefpour, Minjoon Seo, Kang Min Yoo, Youngjae Yu
Existing approaches for aligning large language models with human preferences face a trade-off that requires a separate reward model (RM) for on-policy learning.
no code implementations • 19 Jan 2024 • Dongkeun Yoon, Joel Jang, Sungdong Kim, Seungone Kim, Sheikh Shafayat, Minjoon Seo
We introduce LangBridge, a zero-shot approach to adapt language models for multilingual reasoning tasks without multilingual supervision.
1 code implementation • 23 Oct 2023 • Gangwoo Kim, Sungdong Kim, Byeongguk Jeon, Joonsuk Park, Jaewoo Kang
To cope with the challenge, we propose a novel framework, Tree of Clarifications (ToC): it recursively constructs a tree of disambiguations for the ambiguous question (AQ) via few-shot prompting that leverages external knowledge, and uses the tree to generate a long-form answer.
1 code implementation • 12 Oct 2023 • Seungone Kim, Jamin Shin, Yejin Cho, Joel Jang, Shayne Longpre, Hwaran Lee, Sangdoo Yun, Seongjin Shin, Sungdong Kim, James Thorne, Minjoon Seo
We first construct the Feedback Collection, a new dataset that consists of 1K fine-grained score rubrics, 20K instructions, and 100K responses and language feedback generated by GPT-4.
1 code implementation • 20 Jul 2023 • Seonghyeon Ye, Doyoung Kim, Sungdong Kim, Hyeonbin Hwang, Seungone Kim, Yongrae Jo, James Thorne, Juho Kim, Minjoon Seo
Evaluation of Large Language Models (LLMs) is challenging because instruction-following necessitates alignment with human values and the required set of skills varies depending on the instruction.
1 code implementation • 12 Jun 2023 • Dongkeun Yoon, Joel Jang, Sungdong Kim, Minjoon Seo
In this work, we empirically show that updating pretrained LMs (350M, 1.3B, 2.7B) with just a few steps of Gradient Ascent Post-training (GAP) on random, unlabeled text corpora enhances their zero-shot generalization capabilities across diverse NLP tasks.
no code implementations • 23 May 2023 • Takyoung Kim, Jamin Shin, Young-Ho Kim, Sanghwan Bae, Sungdong Kim
Most task-oriented dialogue (TOD) benchmarks assume users who know exactly how to use the system by constraining user behaviors within the system's capabilities via strict user goals, namely the "user familiarity" bias.
1 code implementation • 23 May 2023 • Sungdong Kim, Sanghwan Bae, Jamin Shin, Soyoung Kang, Donghyun Kwak, Kang Min Yoo, Minjoon Seo
In human evaluation, our model is preferred to Alpaca and Dolly-v2, 55.0% and 58.5% of the time, respectively.
no code implementations • 27 Jan 2023 • Sungdong Kim, Jin-Hwa Kim, Jiyoung Lee, Minjoon Seo
Efficient video-language modeling should consider the computational cost because of a large, sometimes intractable, number of video frames.
Ranked #12 on Video Question Answering on NExT-QA
no code implementations • 14 Jan 2023 • Jing Wei, Sungdong Kim, Hyunhoon Jung, Young-Ho Kim
Through an online study (N = 48) where participants conversed with chatbots driven by different designs of prompts, we assessed how prompt designs and conversation topics affected the conversation flows and users' perceptions of chatbots.
no code implementations • 20 Dec 2022 • Sang-Woo Lee, Sungdong Kim, Donghyeon Ko, Donghoon Ham, Youngki Hong, Shin Ah Oh, Hyunhoon Jung, Wangkyo Jung, Kyunghyun Cho, Donghyun Kwak, Hyungsuk Noh, WooMyoung Park
Task-oriented dialogue (TOD) systems are mainly based on the slot-filling-based TOD (SF-TOD) framework, in which dialogues are broken down into smaller, controllable units (i.e., slots) to fulfill a specific task.
no code implementations • 17 Oct 2022 • Sanghwan Bae, Donghyun Kwak, Soyoung Kang, Min Young Lee, Sungdong Kim, Yuin Jeong, Hyeri Kim, Sang-Woo Lee, WooMyoung Park, Nako Sung
Remembering important information from the past and continuing to talk about it in the present are crucial in long-term conversations.
no code implementations • 31 May 2022 • Young-Ho Kim, Sungdong Kim, Minsuk Chang, Sang-Woo Lee
Current natural language interaction for self-tracking tools largely depends on bespoke implementation optimized for a specific tracking theme and data format, which is neither generalizable nor scalable to a tremendous design space of self-tracking.
1 code implementation • 25 May 2022 • Gangwoo Kim, Sungdong Kim, Kang Min Yoo, Jaewoo Kang
In this paper, we introduce a novel framework, SIMSEEK (Simulating information-Seeking conversation from unlabeled documents), and compare its two variants.
2 code implementations • CVPR 2023 • Gi-Cheon Kang, Sungdong Kim, Jin-Hwa Kim, Donghyun Kwak, Byoung-Tak Zhang
As a result, GST scales the amount of training data by up to an order of magnitude over VisDial (from 1.2M to 12.9M QA data).
1 code implementation • NAACL 2022 • Sanghwan Bae, Donghyun Kwak, Sungdong Kim, Donghoon Ham, Soyoung Kang, Sang-Woo Lee, WooMyoung Park
In this work, we study the challenge of imposing roles on open-domain dialogue systems, with the goal of making the systems maintain consistent roles while conversing naturally with humans.
no code implementations • NAACL 2022 • Seongjin Shin, Sang-Woo Lee, Hwijeen Ahn, Sungdong Kim, HyoungSeok Kim, Boseop Kim, Kyunghyun Cho, Gichang Lee, WooMyoung Park, Jung-Woo Ha, Nako Sung
Many recent studies on large-scale language models have reported successful in-context zero- and few-shot learning ability.
1 code implementation • 15 Feb 2022 • Sungdong Kim, Gangwoo Kim
In this paper, we demonstrate the existence of a retrieval shortcut in conversational search (CS), which causes models to retrieve passages by relying solely on partial history while disregarding the latest question.
1 code implementation • EMNLP 2021 • Mujeen Sung, Jinhyuk Lee, Sean Yi, Minji Jeon, Sungdong Kim, Jaewoo Kang
To this end, we create the BioLAMA benchmark, which comprises 49K biomedical factual knowledge triples for probing biomedical LMs.
1 code implementation • ACL 2021 • Sungdong Kim, Minsuk Chang, Sang-Woo Lee
We propose NeuralWOZ, a novel dialogue collection framework that uses model-based dialogue simulation.
3 code implementations • 20 May 2021 • Sungjoon Park, Jihyung Moon, Sungdong Kim, Won Ik Cho, Jiyoon Han, Jangwon Park, Chisung Song, JunSeong Kim, Yongsook Song, Taehwan Oh, Joohong Lee, Juhyun Oh, Sungwon Lyu, Younghoon Jeong, InKwon Lee, Sangwoo Seo, Dongjun Lee, Hyunwoo Kim, Myeonghwa Lee, Seongbo Jang, Seungwon Do, Sunkyoung Kim, Kyungtae Lim, Jongwon Lee, Kyumin Park, Jamin Shin, Seonghyun Kim, Lucy Park, Alice Oh, Jung-Woo Ha, Kyunghyun Cho
We introduce the Korean Language Understanding Evaluation (KLUE) benchmark.
3 code implementations • ACL 2020 • Sungdong Kim, Sohee Yang, Gyuwan Kim, Sang-Woo Lee
This mechanism consists of two steps: (1) predicting state operation on each of the memory slots, and (2) overwriting the memory with new values, of which only a few are generated according to the predicted state operations.
Ranked #10 on Multi-domain Dialogue State Tracking on MULTIWOZ 2.0
19 code implementations • 25 Jan 2019 • Jinhyuk Lee, Wonjin Yoon, Sungdong Kim, Donghyeon Kim, Sunkyu Kim, Chan Ho So, Jaewoo Kang
Biomedical text mining is becoming increasingly important as the number of biomedical documents rapidly grows.
no code implementations • 1 Dec 2018 • Gyeongbok Lee, Sungdong Kim, Seung-won Hwang
Question answering (QA), which extracts answers from text for a given natural-language question, has been actively studied, and existing models have shown promise of outperforming humans when trained and evaluated on the SQuAD dataset.