Search Results for author: Taeuk Kim

Found 26 papers, 10 papers with code

Aligning Language Models to Explicitly Handle Ambiguity

no code implementations 18 Apr 2024 Hyuhng Joon Kim, Youna Kim, Cheonbok Park, Junyeob Kim, Choonghyun Park, Kang Min Yoo, Sang-goo Lee, Taeuk Kim

However, conversational agents built upon even the most recent large language models (LLMs) face challenges in processing ambiguous inputs, primarily due to the following two hurdles: (1) LLMs are not directly trained to handle inputs that are too ambiguous to be properly managed; (2) the degree of ambiguity in an input can vary according to the intrinsic knowledge of the LLMs, which is difficult to investigate.

BlendX: Complex Multi-Intent Detection with Blended Patterns

1 code implementation 27 Mar 2024 Yejin Yoon, Jungyeon Lee, Kangsan Kim, Chanhee Park, Taeuk Kim

Task-oriented dialogue (TOD) systems are commonly designed with the presumption that each utterance represents a single intent.

Intent Detection

Hyper-CL: Conditioning Sentence Representations with Hypernetworks

no code implementations 14 Mar 2024 Young Hyun Yoo, Jii Cha, Changhyeon Kim, Taeuk Kim

While the introduction of contrastive learning frameworks in sentence representation learning has significantly contributed to advancements in the field, it still remains unclear whether state-of-the-art sentence embeddings can capture the fine-grained semantics of sentences, particularly when conditioned on specific perspectives.

Computational Efficiency, Contrastive Learning +4
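
The title points to conditioning sentence embeddings through a hypernetwork. As a rough illustration of that general idea, and not the authors' exact architecture, the sketch below uses a small hypernetwork to generate a projection matrix from a condition ("perspective") embedding and applies it to a sentence embedding; the module names and dimensions are assumptions.

```python
import torch
import torch.nn as nn

class ConditionalProjector(nn.Module):
    """Hypothetical sketch: a hypernetwork maps a condition vector to the
    weights of a linear projection applied to a sentence embedding."""
    def __init__(self, sent_dim=768, cond_dim=128, out_dim=256):
        super().__init__()
        self.out_dim, self.sent_dim = out_dim, sent_dim
        # Hypernetwork: condition embedding -> flattened projection weights.
        self.hyper = nn.Sequential(
            nn.Linear(cond_dim, 512),
            nn.ReLU(),
            nn.Linear(512, out_dim * sent_dim),
        )

    def forward(self, sent_emb, cond_emb):
        # Generate one projection matrix per condition in the batch.
        w = self.hyper(cond_emb).view(-1, self.out_dim, self.sent_dim)
        # Apply it to the corresponding sentence embedding.
        return torch.bmm(w, sent_emb.unsqueeze(-1)).squeeze(-1)

proj = ConditionalProjector()
sent = torch.randn(4, 768)   # e.g., pooled embeddings from a sentence encoder
cond = torch.randn(4, 128)   # embeddings of the conditioning "perspective"
print(proj(sent, cond).shape)  # torch.Size([4, 256])
```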

Analysis of Multi-Source Language Training in Cross-Lingual Transfer

no code implementations 21 Feb 2024 Seong Hoon Lim, Taejun Yun, Jinhyeon Kim, Jihun Choi, Taeuk Kim

The successful adaptation of multilingual language models (LMs) to a specific language-task pair critically depends on the availability of data tailored for that condition.

Cross-Lingual Transfer

X-SNS: Cross-Lingual Transfer Prediction through Sub-Network Similarity

no code implementations 26 Oct 2023 Taejun Yun, Jinhyeon Kim, Deokyeong Kang, Seong Hoon Lim, Jihoon Kim, Taeuk Kim

Cross-lingual transfer (XLT) is an emergent ability of multilingual language models that preserves their performance on a task to a significant extent when evaluated in languages that were not included in the fine-tuning process.

Cross-Lingual Transfer

Universal Domain Adaptation for Robust Handling of Distributional Shifts in NLP

1 code implementation 23 Oct 2023 Hyuhng Joon Kim, Hyunsoo Cho, Sang-Woo Lee, Junyeob Kim, Choonghyun Park, Sang-goo Lee, Kang Min Yoo, Taeuk Kim

When deploying machine learning systems in the wild, it is highly desirable for them to effectively leverage prior knowledge in unfamiliar domains while also raising alarms on anomalous inputs.

Universal Domain Adaptation

Enhancing Out-of-Distribution Detection in Natural Language Understanding via Implicit Layer Ensemble

1 code implementation 20 Oct 2022 Hyunsoo Cho, Choonghyun Park, Jaewook Kang, Kang Min Yoo, Taeuk Kim, Sang-goo Lee

Out-of-distribution (OOD) detection aims to discern outliers from the intended data distribution, which is crucial to maintaining high reliability and a good user experience.

Contrastive Learning, intent-classification +5
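
The title attributes the improvement to an ensemble over the encoder's intermediate layers. One generic way to realize that idea, shown here only as a sketch and not as the paper's exact scoring rule, is to compute a per-layer outlier score, such as the distance to the nearest in-distribution class centroid, and average the scores across layers; the centroid-based score and dimensions are assumptions.

```python
import numpy as np

def layer_ensemble_ood_score(layer_feats, layer_centroids):
    """Generic sketch: average a per-layer outlier score across layers.

    layer_feats:      list of (dim,) vectors, one per layer, for a test input.
    layer_centroids:  list of (num_classes, dim) arrays of in-distribution
                      class centroids, one per layer.
    Higher score = more likely out-of-distribution.
    """
    scores = []
    for feat, centroids in zip(layer_feats, layer_centroids):
        dists = np.linalg.norm(centroids - feat, axis=1)
        scores.append(dists.min())          # distance to the closest ID centroid
    return float(np.mean(scores))           # ensemble over layers

# Toy usage with random features for 12 layers and 3 classes.
rng = np.random.default_rng(0)
feats = [rng.normal(size=768) for _ in range(12)]
cents = [rng.normal(size=(3, 768)) for _ in range(12)]
print(layer_ensemble_ood_score(feats, cents))
```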

Revisiting the Practical Effectiveness of Constituency Parse Extraction from Pre-trained Language Models

no code implementations COLING 2022 Taeuk Kim

Constituency Parse Extraction from Pre-trained Language Models (CPE-PLM) is a recent paradigm that attempts to induce constituency parse trees relying only on the internal knowledge of pre-trained language models.

In-Context Learning

Self-Generated In-Context Learning: Leveraging Auto-regressive Language Models as a Demonstration Generator

no code implementations 16 Jun 2022 Hyuhng Joon Kim, Hyunsoo Cho, Junyeob Kim, Taeuk Kim, Kang Min Yoo, Sang-goo Lee

Large-scale pre-trained language models (PLMs) are well known for being able to solve a task simply by conditioning on a prompt containing a few input-label pairs, dubbed demonstrations, without being explicitly tuned for the desired downstream task.

In-Context Learning, text-classification +2
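
Since the snippet describes in-context learning as conditioning on a handful of input-label demonstrations placed in the prompt, the sketch below shows how such a prompt might be assembled when the demonstrations are produced by the model itself; the sentiment template and the generate_demo stub are illustrative assumptions, not the paper's exact prompt format.

```python
def build_icl_prompt(demonstrations, query):
    """Concatenate (input, label) demonstrations before the test input,
    in the usual in-context-learning format."""
    lines = [f"Review: {text}\nSentiment: {label}\n" for text, label in demonstrations]
    lines.append(f"Review: {query}\nSentiment:")
    return "\n".join(lines)

def generate_demo(label):
    """Stand-in for self-generation: in this setting the PLM itself would be
    prompted to produce a pseudo-demonstration for the given label."""
    return f"<text sampled from the language model for a {label} review>", label

demos = [generate_demo("positive"), generate_demo("negative")]
prompt = build_icl_prompt(demos, "The plot was thin but the acting was superb.")
print(prompt)  # feed this string to a causal LM and read off the predicted label token
```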

HYU at SemEval-2022 Task 2: Effective Idiomaticity Detection with Consideration at Different Levels of Contextualization

no code implementations SemEval (NAACL) 2022 Youngju Joung, Taeuk Kim

We propose a unified framework that enables us to consider various aspects of contextualization at different levels to better identify the idiomaticity of multi-word expressions.

Sentence, Task 2

Ground-Truth Labels Matter: A Deeper Look into Input-Label Demonstrations

no code implementations 25 May 2022 Kang Min Yoo, Junyeob Kim, Hyuhng Joon Kim, Hyunsoo Cho, Hwiyeol Jo, Sang-Woo Lee, Sang-goo Lee, Taeuk Kim

Despite the recent explosion of interest in in-context learning, the underlying mechanism and the precise impact of the quality of demonstrations remain elusive.

In-Context Learning, Language Modelling

Self-Guided Contrastive Learning for BERT Sentence Representations

1 code implementation ACL 2021 Taeuk Kim, Kang Min Yoo, Sang-goo Lee

In this work, we propose a contrastive learning method that utilizes self-guidance for improving the quality of BERT sentence representations.

Contrastive Learning, Data Augmentation +2
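
To make the contrastive-learning setup concrete, here is a minimal NT-Xent-style loss between two views of the same sentence; in the paper the second view comes from BERT's own intermediate layers (the self-guidance), which is only hinted at in the comments, and the temperature and dimensions here are assumptions.

```python
import torch
import torch.nn.functional as F

def nt_xent(view_a, view_b, temperature=0.05):
    """Standard NT-Xent contrastive loss between two batches of embeddings,
    where view_a[i] and view_b[i] are two views of the same sentence
    (e.g., a final-layer sentence vector and a view built from hidden layers)."""
    a = F.normalize(view_a, dim=-1)
    b = F.normalize(view_b, dim=-1)
    logits = a @ b.t() / temperature          # pairwise cosine similarities
    targets = torch.arange(a.size(0))         # positives sit on the diagonal
    return F.cross_entropy(logits, targets)

# Toy usage: 8 sentences, 768-dim embeddings for each view.
za, zb = torch.randn(8, 768), torch.randn(8, 768)
print(nt_xent(za, zb).item())
```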

IDS at SemEval-2020 Task 10: Does Pre-trained Language Model Know What to Emphasize?

no code implementations SEMEVAL 2020 Jaeyoul Shin, Taeuk Kim, Sang-goo Lee

We propose a novel method that enables us to determine words that deserve to be emphasized from written text in visual media, relying only on the information from the self-attention distributions of pre-trained language models (PLMs).

Language Modelling
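
The snippet says emphasis-worthy words are identified purely from the self-attention distributions of a PLM. A minimal way to experiment with that idea, offered as a sketch rather than the authors' exact scoring scheme, is to average the attention each token receives over layers, heads, and query positions; bert-base-uncased is only an example model and the averaging heuristic is an assumption.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Any encoder PLM exposing attentions will do; bert-base-uncased is just an example.
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

text = "Limited time offer on all summer items"
inputs = tok(text, return_tensors="pt")
with torch.no_grad():
    attentions = model(**inputs).attentions   # tuple: one (1, heads, seq, seq) per layer

# Score each token by the attention it receives, averaged over layers, heads,
# and query positions (one simple heuristic among many possible).
stacked = torch.stack(attentions)             # (layers, 1, heads, seq, seq)
received = stacked.mean(dim=(0, 1, 2, 3))     # (seq,)

tokens = tok.convert_ids_to_tokens(inputs["input_ids"][0])
for token, score in sorted(zip(tokens, received.tolist()), key=lambda x: -x[1]):
    print(f"{token:12s} {score:.4f}")
```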

Multilingual Chart-based Constituency Parse Extraction from Pre-trained Language Models

1 code implementation Findings (EMNLP) 2021 Taeuk Kim, Bowen Li, Sang-goo Lee

As it has been unveiled that pre-trained language models (PLMs) are to some extent capable of recognizing syntactic concepts in natural language, much effort has been made to develop a method for extracting complete (binary) parses from PLMs without training separate parsers.

Constituency Parsing, Cross-Lingual Transfer
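
The entry describes inducing binary constituency parses from a PLM without training a parser. A toy version of the top-down flavour of this recipe, using representation dissimilarity between neighbouring tokens as the split score (one simple assumption among the scoring functions explored in this line of work), can be written as follows:

```python
import numpy as np

def split_tree(vectors, tokens):
    """Greedy top-down binarization: split each span at the adjacent pair of
    tokens whose representations are most dissimilar (a toy proxy for the
    span scores derived from PLM hidden states)."""
    if len(tokens) == 1:
        return tokens[0]
    def cos_dist(a, b):
        return 1 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # Dissimilarity between every pair of neighbouring token vectors.
    gaps = [cos_dist(vectors[i], vectors[i + 1]) for i in range(len(tokens) - 1)]
    k = int(np.argmax(gaps)) + 1              # split point with the largest gap
    left = split_tree(vectors[:k], tokens[:k])
    right = split_tree(vectors[k:], tokens[k:])
    return (left, right)

# Toy usage with random "hidden states"; in practice these would come from a PLM.
rng = np.random.default_rng(0)
toks = ["the", "cat", "sat", "on", "the", "mat"]
print(split_tree(rng.normal(size=(len(toks), 16)), toks))
```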

Are Pre-trained Language Models Aware of Phrases? Simple but Strong Baselines for Grammar Induction

1 code implementation ICLR 2020 Taeuk Kim, Jihun Choi, Daniel Edmiston, Sang-goo Lee

With the recent success and popularity of pre-trained language models (LMs) in natural language processing, there has been a rise in efforts to understand their inner workings.

Summary Level Training of Sentence Rewriting for Abstractive Summarization

no code implementations WS 2019 Sanghwan Bae, Taeuk Kim, Jihoon Kim, Sang-goo Lee

As an attempt to combine extractive and abstractive summarization, Sentence Rewriting models adopt the strategy of extracting salient sentences from a document first and then paraphrasing the selected ones to generate a summary.

Abstractive Text Summarization, Extractive Text Summarization +3
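
The snippet lays out the Sentence Rewriting strategy: first select salient sentences, then paraphrase them. A bare-bones version of that two-stage pipeline, with a trivial saliency score and a placeholder rewriter standing in for the learned extractor and abstractor, looks like this:

```python
import re
from collections import Counter

def extract_salient(document, k=2):
    """Stage 1 (toy extractor): score each sentence by word overlap with the
    whole document and keep the top-k; the real model learns this scoring."""
    sentences = re.split(r"(?<=[.!?])\s+", document.strip())
    doc_counts = Counter(document.lower().split())
    def score(sent):
        return sum(doc_counts[w] for w in sent.lower().split())
    return sorted(sentences, key=score, reverse=True)[:k]

def rewrite(sentence):
    """Stage 2 (placeholder abstractor): a trained seq2seq paraphraser would go
    here; we simply truncate to mimic compression."""
    words = sentence.split()
    return " ".join(words[:12]) + ("..." if len(words) > 12 else "")

doc = ("The committee approved the new budget on Tuesday. "
       "Several members raised concerns about transportation costs. "
       "A final vote is scheduled for next month.")
summary = " ".join(rewrite(s) for s in extract_salient(doc))
print(summary)
```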

Don't Just Scratch the Surface: Enhancing Word Representations for Korean with Hanja

3 code implementations IJCNLP 2019 Kang Min Yoo, Taeuk Kim, Sang-goo Lee

We propose a simple yet effective approach for improving Korean word representations using additional linguistic annotation (i.e., Hanja).

Cross-Lingual Transfer, Headline Generation +1

Dynamic Compositionality in Recursive Neural Networks with Structure-aware Tag Representations

2 code implementations 7 Sep 2018 Taeuk Kim, Jihun Choi, Daniel Edmiston, Sanghwan Bae, Sang-goo Lee

Most existing recursive neural network (RvNN) architectures utilize only the structure of parse trees, ignoring syntactic tags which are provided as by-products of parsing.

Natural Language Inference, Sentence +2
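
The snippet contrasts structure-only RvNNs with ones that also exploit syntactic tags. The sketch below composes a binary parse tree bottom-up while concatenating a learned tag embedding with the child representations at each node; the sizes and the composition function are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class TagAwareRvNN(nn.Module):
    """Hypothetical sketch: recursive composition that also consumes the
    syntactic tag of each non-terminal node."""
    def __init__(self, dim=64, num_tags=32):
        super().__init__()
        self.tag_emb = nn.Embedding(num_tags, dim)
        self.compose = nn.Sequential(nn.Linear(3 * dim, dim), nn.Tanh())

    def forward(self, tree, leaf_vecs):
        # tree is either a leaf index or (tag_id, left_subtree, right_subtree).
        if isinstance(tree, int):
            return leaf_vecs[tree]
        tag_id, left, right = tree
        l = self.forward(left, leaf_vecs)
        r = self.forward(right, leaf_vecs)
        t = self.tag_emb(torch.tensor(tag_id))
        return self.compose(torch.cat([l, r, t], dim=-1))

# Toy usage: a 3-word sentence with parse ((0 1) 2) and made-up tag ids.
model = TagAwareRvNN()
leaves = torch.randn(3, 64)
tree = (5, (7, 0, 1), 2)          # tag 5 over [tag 7 over leaves 0, 1] and leaf 2
print(model(tree, leaves).shape)  # torch.Size([64])
```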

Element-wise Bilinear Interaction for Sentence Matching

no code implementations SEMEVAL 2018 Jihun Choi, Taeuk Kim, Sang-goo Lee

When we build a neural network model predicting the relationship between two sentences, the most general and intuitive approach is to use a Siamese architecture, where the sentence vectors obtained from a shared encoder are given as input to a classifier.

Natural Language Inference, Paraphrase Identification +1
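
The snippet sets up the Siamese baseline, and the title adds an element-wise bilinear interaction between the two sentence vectors. One plausible reading of that interaction, offered purely as an assumption rather than the paper's exact formulation, is a per-dimension weighted product combined with the usual matching features before classification:

```python
import torch
import torch.nn as nn

class ElementwiseBilinearMatcher(nn.Module):
    """Hypothetical sketch of a Siamese matcher: a shared encoder produces two
    sentence vectors, which interact through a per-dimension (diagonal)
    bilinear term before classification."""
    def __init__(self, dim=300, num_classes=3):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))           # one weight per dimension
        self.classifier = nn.Sequential(
            nn.Linear(4 * dim, 256), nn.ReLU(), nn.Linear(256, num_classes)
        )

    def forward(self, u, v):
        interaction = u * self.weight * v                      # element-wise bilinear term
        features = torch.cat([u, v, torch.abs(u - v), interaction], dim=-1)
        return self.classifier(features)

# Toy usage: u and v would come from the same (shared) sentence encoder.
matcher = ElementwiseBilinearMatcher()
u, v = torch.randn(8, 300), torch.randn(8, 300)
print(matcher(u, v).shape)      # torch.Size([8, 3])
```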

A Syllable-based Technique for Word Embeddings of Korean Words

no code implementations WS 2017 Sanghyuk Choi, Taeuk Kim, Jinseok Seol, Sang-goo Lee

Word embedding has become a fundamental component to many NLP tasks such as named entity recognition and machine translation.

Machine Translation, named-entity-recognition +4
