Search Results for author: Won Ik Cho

Found 28 papers, 14 papers with code

Modeling the Influence of Verb Aspect on the Activation of Typical Event Locations with BERT

1 code implementation • Findings (ACL) 2021 • Won Ik Cho, Emmanuele Chersoni, Yu-Yin Hsu, Chu-Ren Huang

Evaluating How Users Game and Display Conversation with Human-Like Agents

no code implementations • COLING (CODI, CRAC) 2022 • Won Ik Cho, SooMin Kim, Eujeong Choi, Younghoon Jeong

Recently, with the advent of high-performance generative language models, artificial agents that communicate directly with the users have become more human-like.

Chatbot

Pay Attention to Categories: Syntax-Based Sentence Modeling with Metadata Projection Matrix

no code implementations • PACLIC 2020 • Won Ik Cho, Nam Soo Kim

Sentence

How Does the Hate Speech Corpus Concern Sociolinguistic Discussions? A Case Study on Korean Online News Comments

no code implementations • NLP4DH (ICON) 2021 • Won Ik Cho, Jihyung Moon

Furthermore, we check how the final corpus corresponds with the definition and scope of hate speech, and confirm that the overall procedure and outcome is in concurrence with the sociolinguistic discussions.

OpenKorPOS: Democratizing Korean Tokenization with Voting-Based Open Corpus Annotation

no code implementations • LREC 2022 • Sangwhan Moon, Won Ik Cho, Hye Joo Han, Naoaki Okazaki, Nam Soo Kim

As this problem originates from the conventional scheme used when creating a POS tagging corpus, we propose an improvement to the existing scheme, which makes it friendlier to generative tasks.

POS • POS Tagging • +1

Google-trickers, Yaminjeongeum, and Leetspeak: An Empirical Taxonomy for Intentionally Noisy User-Generated Text

no code implementations • WNUT (ACL) 2021 • Won Ik Cho, SooMin Kim

WARNING: This article contains contents that may offend the readers.

VUS at IWSLT 2021: A Finetuned Pipeline for Offline Speech Translation

no code implementations • ACL (IWSLT) 2021 • Yong Rae Jo, Youngki Moon, Minji Jung, Jungyoon Choi, Jihyung Moon, Won Ik Cho

In this technical report, we describe the fine-tuned ASR-MT pipeline used for the IWSLT shared task.

Boundary Detection • Machine Translation • +2

Evaluating Span Extraction in Generative Paradigm: A Reflection on Aspect-Based Sentiment Analysis

no code implementations • 17 Apr 2024 • Soyoung Yang, Won Ik Cho

In the era of rapid evolution of generative language models within the realm of natural language processing, there is an imperative call to revisit and reformulate evaluation methodologies, especially in the domain of aspect-based sentiment analysis (ABSA).

When Crowd Meets Persona: Creating a Large-Scale Open-Domain Persona Dialogue Corpus

no code implementations • 1 Apr 2023 • Won Ik Cho, Yoon Kyung Lee, Seoyeon Bae, JiHwan Kim, Sangah Park, Moosung Kim, Sowon Hahn, Nam Soo Kim

Building a natural language dataset requires caution since word semantics is vulnerable to subtle text change or the definition of the annotated concept.

Dialogue Generation • Question Answering • +2

DAGAM: Data Augmentation with Generation And Modification

1 code implementation • 6 Apr 2022 • Byeong-Cheol Jo, Tak-Sung Heo, Yeongjoon Park, Yongmin Yoo, Won Ik Cho, Kyungsun Kim

Finally, we propose Data Augmentation with Generation And Modification (DAGAM), which combines DAG and DAM techniques for a boosted performance.

Data Augmentation • text-classification • +1

APEACH: Attacking Pejorative Expressions with Analysis on Crowd-Generated Hate Speech Evaluation Datasets

1 code implementation • 25 Feb 2022 • Kichang Yang, Wonjun Jang, Won Ik Cho

In hate speech detection, developing training and evaluation datasets across various domains is the critical issue.

Hate Speech Detection

Kosp2e: Korean Speech to English Translation Corpus

1 code implementation • 6 Jul 2021 • Won Ik Cho, Seok Min Kim, Hyunchang Cho, Nam Soo Kim

Most speech-to-text (S2T) translation studies use English speech as a source, which makes it difficult for non-English speakers to take advantage of the S2T technologies.

speech-recognition • Speech Recognition • +1

KLUE: Korean Language Understanding Evaluation

3 code implementations • 20 May 2021 • Sungjoon Park, Jihyung Moon, Sungdong Kim, Won Ik Cho, Jiyoon Han, Jangwon Park, Chisung Song, JunSeong Kim, Yongsook Song, Taehwan Oh, Joohong Lee, Juhyun Oh, Sungwon Lyu, Younghoon Jeong, InKwon Lee, Sangwoo Seo, Dongjun Lee, Hyunwoo Kim, Myeonghwa Lee, Seongbo Jang, Seungwon Do, Sunkyoung Kim, Kyungtae Lim, Jongwon Lee, Kyumin Park, Jamin Shin, Seonghyun Kim, Lucy Park, Alice Oh, Jung-Woo Ha, Kyunghyun Cho

We introduce Korean Language Understanding Evaluation (KLUE) benchmark.

Dependency Parsing • Dialogue State Tracking • +10

StyleKQC: A Style-Variant Paraphrase Corpus for Korean Questions and Commands

1 code implementation • LREC 2022 • Won Ik Cho, Sangwhan Moon, Jong In Kim, Seok Min Kim, Nam Soo Kim

Paraphrasing is often performed with less concern for controlled style conversion.

Natural Language Queries

Open Korean Corpora: A Practical Report

no code implementations • EMNLP (NLPOSS) 2020 • Won Ik Cho, Sangwhan Moon, YoungSook Song

Korean is often referred to as a low-resource language in the research community.

TutorNet: Towards Flexible Knowledge Distillation for End-to-End Speech Recognition

no code implementations • 3 Aug 2020 • Ji Won Yoon, Hyeonseung Lee, Hyung Yong Kim, Won Ik Cho, Nam Soo Kim

To reduce this computational burden, knowledge distillation (KD), which is a popular model compression method, has been used to transfer knowledge from a deep and complex model (teacher) to a shallower and simpler model (student).

Knowledge Distillation • Model Compression • +3

BEEP! Korean Corpus of Online News Comments for Toxic Speech Detection

1 code implementation • WS 2020 • Jihyung Moon, Won Ik Cho, Junbum Lee

Additionally, when BERT is trained with bias label for hate speech detection, the prediction score increases, implying that bias and hate are intertwined.

Hate Speech Detection

Speech to Text Adaptation: Towards an Efficient Cross-Modal Distillation

no code implementations • 17 May 2020 • Won Ik Cho, Dong-Hyun Kwak, Ji Won Yoon, Nam Soo Kim

We transfer the knowledge from a concrete Transformer-based text LM to an SLU module which can face a data shortage, based on recent cross-modal distillation methodologies.

Computational Efficiency • speech-recognition • +2

Discourse Component to Sentence (DC2S): An Efficient Human-Aided Construction of Paraphrase and Sentence Similarity Dataset

no code implementations • LREC 2020 • Won Ik Cho, Jong In Kim, Young Ki Moon, Nam Soo Kim

Assessing the similarity of sentences and detecting paraphrases is an essential task both in theory and practice, but achieving a reliable dataset requires high resource.

Natural Language Inference • Paraphrase Generation • +2

Towards an Efficient Code-Mixed Grapheme-to-Phoneme Conversion in an Agglutinative Language: A Case Study on To-Korean Transliteration

no code implementations • LREC 2020 • Won Ik Cho, Seok Min Kim, Nam Soo Kim

Code-mixed grapheme-to-phoneme (G2P) conversion is a crucial issue for modern speech recognition and synthesis task, but has been seldom investigated in sentence-level in literature.

Philosophy • Sentence • +3

Machines Getting with the Program: Understanding Intent Arguments of Non-Canonical Directives

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Won Ik Cho, Young Ki Moon, Sangwhan Moon, Seok Min Kim, Nam Soo Kim

Modern dialog managers face the challenge of having to fulfill human-level conversational skills as part of common user expectations, including but not limited to discourse with no clear objective.

Text Matters but Speech Influences: A Computational Analysis of Syntactic Ambiguity Resolution

1 code implementation • 21 Oct 2019 • Won Ik Cho, Jeonghwa Cho, Woo Hyun Kang, Nam Soo Kim

Analyzing how human beings resolve syntactic ambiguity has long been an issue of interest in the field of linguistics.

Sentence • Spoken Language Understanding

Investigating an Effective Character-level Embedding in Korean Sentence Classification

2 code implementations • 31 May 2019 • Won Ik Cho, Seok Min Kim, Nam Soo Kim

Different from the writing systems of many Romance and Germanic languages, some languages or language families show complex conjunct forms in character composition.

Classification • General Classification • +3

On Measuring Gender Bias in Translation of Gender-neutral Pronouns

1 code implementation • WS 2019 • Won Ik Cho, Ji Won Kim, Seok Min Kim, Nam Soo Kim

However, detection and evaluation of gender bias in the machine translation systems are not yet thoroughly investigated, for the task being cross-lingual and challenging to define.

Ethics • Image Captioning • +3

Speech Intention Understanding in a Head-final Language: A Disambiguation Utilizing Intonation-dependency

2 code implementations • 10 Nov 2018 • Won Ik Cho, Hyeon Seung Lee, Ji Won Yoon, Seok Min Kim, Nam Soo Kim

This paper suggests a system which identifies the inherent intention of a spoken utterance given its transcript, in some cases using auxiliary acoustic features.

Sentence

Giving Space to Your Message: Assistive Word Segmentation for the Electronic Typing of Digital Minorities

1 code implementation • 31 Oct 2018 • Won Ik Cho, Sung Jun Cheon, Woo Hyun Kang, Ji Won Kim, Nam Soo Kim

For readability and disambiguation of the written text, appropriate word segmentation is recommended for documentation, and it also holds for the digitized texts.

Segmentation

Extracting Arguments from Korean Question and Command: An Annotated Corpus for Structured Paraphrasing

1 code implementation • 10 Oct 2018 • Won Ik Cho, Young Ki Moon, Woo Hyun Kang, Nam Soo Kim

Intention identification is a core issue in dialog management.

Argument Mining • Management • +2

HashCount at SemEval-2018 Task 3: Concatenative Featurization of Tweet and Hashtags for Irony Detection

no code implementations • SEMEVAL 2018 • Won Ik Cho, Woo Hyun Kang, Nam Soo Kim

This paper proposes a novel feature extraction process for SemEval task 3: Irony detection in English tweets.

Feature Engineering • Hate Speech Detection • +2
