no code implementations • 18 Apr 2024 • Hyuhng Joon Kim, Youna Kim, Cheonbok Park, Junyeob Kim, Choonghyun Park, Kang Min Yoo, Sang-goo Lee, Taeuk Kim
Conversational agents built upon even the most recent large language models (LLMs) face challenges in processing ambiguous inputs, primarily due to the following two hurdles: (1) LLMs are not directly trained to handle inputs that are too ambiguous to be properly managed; (2) the degree of ambiguity in an input can vary according to the intrinsic knowledge of the LLMs, which is difficult to investigate.
1 code implementation • 27 Mar 2024 • Yejin Yoon, Jungyeon Lee, Kangsan Kim, Chanhee Park, Taeuk Kim
Task-oriented dialogue (TOD) systems are commonly designed with the presumption that each utterance represents a single intent.
no code implementations • 14 Mar 2024 • Young Hyun Yoo, Jii Cha, Changhyeon Kim, Taeuk Kim
While the introduction of contrastive learning frameworks in sentence representation learning has significantly contributed to advancements in the field, it remains unclear whether state-of-the-art sentence embeddings can capture the fine-grained semantics of sentences, particularly when conditioned on specific perspectives.
no code implementations • 21 Feb 2024 • Seong Hoon Lim, Taejun Yun, Jinhyeon Kim, Jihun Choi, Taeuk Kim
The successful adaptation of multilingual language models (LMs) to a specific language-task pair critically depends on the availability of data tailored for that condition.
no code implementations • 26 Oct 2023 • Taejun Yun, Jinhyeon Kim, Deokyeong Kang, Seong Hoon Lim, Jihoon Kim, Taeuk Kim
Cross-lingual transfer (XLT) is an emergent ability of multilingual language models that preserves their performance on a task to a significant extent when evaluated in languages that were not included in the fine-tuning process.
1 code implementation • 23 Oct 2023 • Hyuhng Joon Kim, Hyunsoo Cho, Sang-Woo Lee, Junyeob Kim, Choonghyun Park, Sang-goo Lee, Kang Min Yoo, Taeuk Kim
When deploying machine learning systems in the wild, it is highly desirable for them to effectively transfer prior knowledge to unfamiliar domains while also raising alarms on anomalous inputs.
no code implementations • 21 Dec 2022 • Hyunsoo Cho, Hyuhng Joon Kim, Junyeob Kim, Sang-Woo Lee, Sang-goo Lee, Kang Min Yoo, Taeuk Kim
Through in-context learning (ICL), large-scale language models are effective few-shot learners without additional model fine-tuning.
1 code implementation • 20 Oct 2022 • Hyunsoo Cho, Choonghyun Park, Jaewook Kang, Kang Min Yoo, Taeuk Kim, Sang-goo Lee
Out-of-distribution (OOD) detection aims to discern outliers from the intended data distribution, which is crucial to maintaining high reliability and a good user experience.
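As a rough illustration of the task itself (not the detection method proposed in this paper), the sketch below scores inputs with a generic maximum-softmax-probability baseline and flags those whose predicted confidence falls under a threshold; the classifier and the 0.5 threshold are placeholder assumptions.

```python
# Minimal OOD-scoring sketch: a generic maximum-softmax-probability baseline,
# not the detection method proposed in the paper above.
import torch
import torch.nn.functional as F

def msp_score(logits: torch.Tensor) -> torch.Tensor:
    """Maximum softmax probability per example; low values hint at OOD inputs."""
    return F.softmax(logits, dim=-1).max(dim=-1).values

def flag_ood(classifier: torch.nn.Module, inputs: torch.Tensor,
             threshold: float = 0.5) -> torch.Tensor:
    """Boolean mask marking inputs whose predicted confidence is below the threshold."""
    with torch.no_grad():
        logits = classifier(inputs)
    return msp_score(logits) < threshold
```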
no code implementations • COLING 2022 • Taeuk Kim
Constituency Parse Extraction from Pre-trained Language Models (CPE-PLM) is a recent paradigm that attempts to induce constituency parse trees relying only on the internal knowledge of pre-trained language models.
no code implementations • 16 Jun 2022 • Hyuhng Joon Kim, Hyunsoo Cho, Junyeob Kim, Taeuk Kim, Kang Min Yoo, Sang-goo Lee
Large-scale pre-trained language models (PLMs) are well-known for being capable of solving a task simply by conditioning on a few input-label pairs, dubbed demonstrations, provided in a prompt, without being explicitly tuned for the desired downstream task.
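For readers unfamiliar with the setup, here is a minimal sketch of how such demonstrations are assembled into a prompt; the sentiment template and the commented-out generate() call are illustrative assumptions, not this paper's configuration.

```python
# Minimal in-context learning sketch: input-label demonstrations are concatenated
# into a prompt and the frozen LLM is queried with no parameter updates.
def build_icl_prompt(demonstrations, query):
    """demonstrations: list of (text, label) pairs; query: the unlabeled input."""
    blocks = [f"Review: {text}\nSentiment: {label}" for text, label in demonstrations]
    blocks.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(blocks)

demos = [("A moving, beautifully shot film.", "positive"),
         ("Flat characters and a dull plot.", "negative")]
prompt = build_icl_prompt(demos, "Uneven pacing, but ultimately rewarding.")
# completion = llm.generate(prompt)  # hypothetical call to a frozen language model
```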
no code implementations • SemEval (NAACL) 2022 • Youngju Joung, Taeuk Kim
We propose a unified framework that enables us to consider various aspects of contextualization at different levels to better identify the idiomaticity of multi-word expressions.
no code implementations • 25 May 2022 • Kang Min Yoo, Junyeob Kim, Hyuhng Joon Kim, Hyunsoo Cho, Hwiyeol Jo, Sang-Woo Lee, Sang-goo Lee, Taeuk Kim
Despite the recent explosion of interest in in-context learning, the underlying mechanism and the precise impact of the quality of demonstrations remain elusive.
1 code implementation • ACL 2021 • Taeuk Kim, Kang Min Yoo, Sang-goo Lee
In this work, we propose a contrastive learning method that utilizes self-guidance for improving the quality of BERT sentence representations.
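To make the contrastive ingredient concrete, here is a generic NT-Xent loss over two embedding views of the same batch of sentences; producing those views via self-guidance is the paper's contribution and is not reproduced here.

```python
# Generic NT-Xent contrastive loss over two embedding views of the same sentences;
# how the views are produced (the paper's self-guidance scheme) is not shown.
import torch
import torch.nn.functional as F

def nt_xent(view_a: torch.Tensor, view_b: torch.Tensor, temperature: float = 0.05):
    """view_a, view_b: (batch, dim) embeddings; row i of each view is a positive pair."""
    a = F.normalize(view_a, dim=-1)
    b = F.normalize(view_b, dim=-1)
    logits = a @ b.t() / temperature                    # all-pairs cosine similarities
    targets = torch.arange(a.size(0), device=a.device)  # i-th row should match i-th column
    return F.cross_entropy(logits, targets)
```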
no code implementations • AACL 2020 • Bowen Li, Taeuk Kim, Reinald Kim Amplayo, Frank Keller
Here, we propose a novel fully unsupervised parsing approach that extracts constituency trees from PLM attention heads.
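One common recipe in this line of work, sketched below under illustrative assumptions: derive a "syntactic distance" between adjacent words from their attention distributions (Jensen-Shannon divergence here, which may differ from the paper's choice) and recursively split the sentence at the largest distance.

```python
# Sketch of attention-based tree induction: compute distances between adjacent
# words from one head's attention rows, then split recursively at the maximum.
# The JS-divergence distance is an illustrative choice, not necessarily the paper's.
import numpy as np

def js_divergence(p: np.ndarray, q: np.ndarray) -> float:
    m = 0.5 * (p + q)
    kl = lambda a, b: float(np.sum(a * np.log((a + 1e-12) / (b + 1e-12))))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def induce_tree(words, dists):
    """words: tokens; dists: adjacent-word distances, len(dists) == len(words) - 1."""
    if len(words) <= 1:
        return words[0] if words else None
    split = int(np.argmax(dists)) + 1                   # break where neighbours diverge most
    return (induce_tree(words[:split], dists[:split - 1]),
            induce_tree(words[split:], dists[split:]))

# attn: (n, n) attention matrix from a single head, one row per word
# dists = [js_divergence(attn[i], attn[i + 1]) for i in range(len(words) - 1)]
# tree = induce_tree(words, dists)
```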
no code implementations • SEMEVAL 2020 • Jaeyoul Shin, Taeuk Kim, Sang-goo Lee
We propose a novel method that enables us to determine words that deserve to be emphasized from written text in visual media, relying only on the information from the self-attention distributions of pre-trained language models (PLMs).
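As a rough sketch of working directly with PLM self-attention (the averaging and the top-k rule below are assumptions, not this paper's exact scoring), one can rank tokens by the attention mass they receive:

```python
# Sketch: rank tokens by the attention they receive, averaged over layers and heads.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

text = "act now and save the planet"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    attentions = model(**inputs).attentions   # tuple of (1, heads, seq, seq), one per layer

avg = torch.stack(attentions).mean(dim=(0, 2))          # average over layers and heads
received = avg.sum(dim=1).squeeze(0)                    # total attention mass each token receives
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
emphasized = [tokens[int(i)] for i in received.topk(3).indices]  # top-3 emphasis candidates
```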
1 code implementation • Findings (EMNLP) 2021 • Taeuk Kim, Bowen Li, Sang-goo Lee
As it has been revealed that pre-trained language models (PLMs) are to some extent capable of recognizing syntactic concepts in natural language, much effort has been made to develop a method for extracting complete (binary) parses from PLMs without training separate parsers.
1 code implementation • ICLR 2020 • Taeuk Kim, Jihun Choi, Daniel Edmiston, Sang-goo Lee
With the recent success and popularity of pre-trained language models (LMs) in natural language processing, there has been a rise in efforts to understand their inner workings.
no code implementations • WS 2019 • Sanghwan Bae, Taeuk Kim, Jihoon Kim, Sang-goo Lee
As an attempt to combine extractive and abstractive summarization, Sentence Rewriting models adopt the strategy of extracting salient sentences from a document first and then paraphrasing the selected ones to generate a summary.
Ranked #5 on Extractive Text Summarization on CNN / Daily Mail
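A minimal sketch of that extract-then-rewrite strategy, with placeholder scorer and rewriter components that are not the models used in the paper:

```python
# Extract-then-rewrite sketch: keep the k most salient sentences (in original order),
# then paraphrase each with a rewriting model. `scorer` and `rewriter` are placeholders.
def summarize(sentences, scorer, rewriter, k=3):
    ranked = sorted(range(len(sentences)), key=lambda i: scorer(sentences[i]), reverse=True)
    selected = sorted(ranked[:k])                       # restore document order
    return " ".join(rewriter(sentences[i]) for i in selected)
```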
3 code implementations • IJCNLP 2019 • Kang Min Yoo, Taeuk Kim, Sang-goo Lee
We propose a simple yet effective approach for improving Korean word representations using additional linguistic annotation (i.e., Hanja).
no code implementations • ACL 2019 • Jihun Choi, Taeuk Kim, Sang-goo Lee
We present a latent variable model for predicting the relationship between a pair of text sequences.
2 code implementations • 7 Sep 2018 • Taeuk Kim, Jihun Choi, Daniel Edmiston, Sanghwan Bae, Sang-goo Lee
Most existing recursive neural network (RvNN) architectures utilize only the structure of parse trees, ignoring syntactic tags which are provided as by-products of parsing.
no code implementations • 7 Sep 2018 • Jihun Choi, Taeuk Kim, Sang-goo Lee
We propose a method of stacking multiple long short-term memory (LSTM) layers for modeling sentences.
Ranked #10 on Sentiment Analysis on SST-5 Fine-grained classification
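A minimal stacked-LSTM sentence encoder for reference; this is a plain multi-layer LSTM, not the specific cell-connection scheme the paper proposes.

```python
# Plain stacked-LSTM sentence encoder (generic baseline, not the paper's variant).
import torch
import torch.nn as nn

class StackedLSTMEncoder(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=300, num_layers=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers=num_layers, batch_first=True)

    def forward(self, token_ids):                       # token_ids: (batch, seq_len)
        outputs, (h_n, _) = self.lstm(self.embed(token_ids))
        return h_n[-1]                                  # final hidden state of the top layer
```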
no code implementations • SEMEVAL 2018 • Jihun Choi, Taeuk Kim, Sang-goo Lee
When we build a neural network model predicting the relationship between two sentences, the most general and intuitive approach is to use a Siamese architecture, where the sentence vectors obtained from a shared encoder are given as input to a classifier.
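A minimal sketch of that Siamese setup; the [u; v; |u - v|; u * v] feature combination is a common convention and may differ from the features used in the paper.

```python
# Siamese sketch: one shared encoder embeds both sentences, and a small MLP
# classifies a combination of the two vectors.
import torch
import torch.nn as nn

class SiameseClassifier(nn.Module):
    def __init__(self, encoder: nn.Module, hidden_dim: int, num_classes: int):
        super().__init__()
        self.encoder = encoder                          # shared between both inputs
        self.mlp = nn.Sequential(
            nn.Linear(4 * hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, num_classes))

    def forward(self, sent_a, sent_b):
        u, v = self.encoder(sent_a), self.encoder(sent_b)
        features = torch.cat([u, v, torch.abs(u - v), u * v], dim=-1)
        return self.mlp(features)
```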
1 code implementation • SEMEVAL 2018 • Taeuk Kim, Jihun Choi, Sang-goo Lee
We present a novel neural architecture for the Argument Reasoning Comprehension task of SemEval 2018.
no code implementations • WS 2017 • Sanghyuk Choi, Taeuk Kim, Jinseok Seol, Sang-goo Lee
Word embedding has become a fundamental component of many NLP tasks such as named entity recognition and machine translation.