Search Results for author: Won Ik Cho

Found 30 papers, 14 papers with code

OpenKorPOS: Democratizing Korean Tokenization with Voting-Based Open Corpus Annotation

no code implementations LREC 2022 Sangwhan Moon, Won Ik Cho, Hye Joo Han, Naoaki Okazaki, Nam Soo Kim

As this problem originates from the conventional scheme used when creating a POS tagging corpus, we propose an improvement to the existing scheme, which makes it friendlier to generative tasks.

POS POS Tagging +1

Evaluating How Users Game and Display Conversation with Human-Like Agents

no code implementations COLING (CODI, CRAC) 2022 Won Ik Cho, SooMin Kim, Eujeong Choi, Younghoon Jeong

Recently, with the advent of high-performance generative language models, artificial agents that communicate directly with the users have become more human-like.


How Does the Hate Speech Corpus Concern Sociolinguistic Discussions? A Case Study on Korean Online News Comments

no code implementations NLP4DH (ICON) 2021 Won Ik Cho, Jihyung Moon

Furthermore, we check how the final corpus corresponds with the definition and scope of hate speech, and confirm that the overall procedure and outcome are in concurrence with the sociolinguistic discussions.

Hermit Kingdom Through the Lens of Multiple Perspectives: A Case Study of LLM Hallucination on North Korea

no code implementations 10 Jan 2025 Eunjung Cho, Won Ik Cho, Soomin Seo

Hallucination in large language models (LLMs) remains a significant challenge for their safe deployment, particularly due to its potential to spread misinformation.

Hallucination Misinformation

Evaluating Span Extraction in Generative Paradigm: A Reflection on Aspect-Based Sentiment Analysis

no code implementations 17 Apr 2024 Soyoung Yang, Won Ik Cho

In the era of rapid evolution of generative language models within the realm of natural language processing, there is an imperative call to revisit and reformulate evaluation methodologies, especially in the domain of aspect-based sentiment analysis (ABSA).

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +1

When Crowd Meets Persona: Creating a Large-Scale Open-Domain Persona Dialogue Corpus

no code implementations 1 Apr 2023 Won Ik Cho, Yoon Kyung Lee, Seoyeon Bae, JiHwan Kim, Sangah Park, Moosung Kim, Sowon Hahn, Nam Soo Kim

Building a natural language dataset requires caution since word semantics is vulnerable to subtle text change or the definition of the annotated concept.

Dialogue Generation Question Answering +2

DAGAM: Data Augmentation with Generation And Modification

1 code implementation 6 Apr 2022 Byeong-Cheol Jo, Tak-Sung Heo, Yeongjoon Park, Yongmin Yoo, Won Ik Cho, Kyungsun Kim

Finally, we propose Data Augmentation with Generation And Modification (DAGAM), which combines DAG and DAM techniques for a boosted performance.

Data Augmentation text-classification +1

APEACH: Attacking Pejorative Expressions with Analysis on Crowd-Generated Hate Speech Evaluation Datasets

1 code implementation 25 Feb 2022 Kichang Yang, Wonjun Jang, Won Ik Cho

In hate speech detection, developing training and evaluation datasets across various domains is a critical issue.

Hate Speech Detection

Kosp2e: Korean Speech to English Translation Corpus

1 code implementation 6 Jul 2021 Won Ik Cho, Seok Min Kim, Hyunchang Cho, Nam Soo Kim

Most speech-to-text (S2T) translation studies use English speech as a source, which makes it difficult for non-English speakers to take advantage of the S2T technologies.

speech-recognition Speech Recognition +1

Open Korean Corpora: A Practical Report

no code implementations EMNLP (NLPOSS) 2020 Won Ik Cho, Sangwhan Moon, YoungSook Song

Korean is often referred to as a low-resource language in the research community.

TutorNet: Towards Flexible Knowledge Distillation for End-to-End Speech Recognition

no code implementations 3 Aug 2020 Ji Won Yoon, Hyeonseung Lee, Hyung Yong Kim, Won Ik Cho, Nam Soo Kim

To reduce this computational burden, knowledge distillation (KD), which is a popular model compression method, has been used to transfer knowledge from a deep and complex model (teacher) to a shallower and simpler model (student).

Knowledge Distillation Model Compression +3

BEEP! Korean Corpus of Online News Comments for Toxic Speech Detection

1 code implementation WS 2020 Jihyung Moon, Won Ik Cho, Junbum Lee

Additionally, when BERT is trained with bias label for hate speech detection, the prediction score increases, implying that bias and hate are intertwined.

Hate Speech Detection

Speech to Text Adaptation: Towards an Efficient Cross-Modal Distillation

no code implementations 17 May 2020 Won Ik Cho, Dong-Hyun Kwak, Ji Won Yoon, Nam Soo Kim

We transfer the knowledge from a concrete Transformer-based text LM to an SLU module which can face a data shortage, based on recent cross-modal distillation methodologies.

Computational Efficiency speech-recognition +2

Discourse Component to Sentence (DC2S): An Efficient Human-Aided Construction of Paraphrase and Sentence Similarity Dataset

no code implementations LREC 2020 Won Ik Cho, Jong In Kim, Young Ki Moon, Nam Soo Kim

Assessing the similarity of sentences and detecting paraphrases are essential tasks both in theory and practice, but building a reliable dataset requires substantial resources.

Natural Language Inference Paraphrase Generation +2

Towards an Efficient Code-Mixed Grapheme-to-Phoneme Conversion in an Agglutinative Language: A Case Study on To-Korean Transliteration

no code implementations LREC 2020 Won Ik Cho, Seok Min Kim, Nam Soo Kim

Code-mixed grapheme-to-phoneme (G2P) conversion is a crucial issue for modern speech recognition and synthesis tasks, but has seldom been investigated at the sentence level in the literature.

Philosophy Sentence +3

Machines Getting with the Program: Understanding Intent Arguments of Non-Canonical Directives

1 code implementation Findings of the Association for Computational Linguistics 2020 Won Ik Cho, Young Ki Moon, Sangwhan Moon, Seok Min Kim, Nam Soo Kim

Modern dialog managers face the challenge of having to fulfill human-level conversational skills as part of common user expectations, including but not limited to discourse with no clear objective.

Text Matters but Speech Influences: A Computational Analysis of Syntactic Ambiguity Resolution

1 code implementation 21 Oct 2019 Won Ik Cho, Jeonghwa Cho, Woo Hyun Kang, Nam Soo Kim

Analyzing how human beings resolve syntactic ambiguity has long been an issue of interest in the field of linguistics.

Sentence Spoken Language Understanding

Investigating an Effective Character-level Embedding in Korean Sentence Classification

2 code implementations 31 May 2019 Won Ik Cho, Seok Min Kim, Nam Soo Kim

Different from the writing systems of many Romance and Germanic languages, some languages or language families show complex conjunct forms in character composition.

Classification General Classification +3

On Measuring Gender Bias in Translation of Gender-neutral Pronouns

1 code implementation WS 2019 Won Ik Cho, Ji Won Kim, Seok Min Kim, Nam Soo Kim

However, detection and evaluation of gender bias in machine translation systems have not yet been thoroughly investigated, because the task is cross-lingual and challenging to define.

Ethics Image Captioning +3

Speech Intention Understanding in a Head-final Language: A Disambiguation Utilizing Intonation-dependency

2 code implementations 10 Nov 2018 Won Ik Cho, Hyeon Seung Lee, Ji Won Yoon, Seok Min Kim, Nam Soo Kim

This paper suggests a system which identifies the inherent intention of a spoken utterance given its transcript, in some cases using auxiliary acoustic features.


Giving Space to Your Message: Assistive Word Segmentation for the Electronic Typing of Digital Minorities

1 code implementation 31 Oct 2018 Won Ik Cho, Sung Jun Cheon, Woo Hyun Kang, Ji Won Kim, Nam Soo Kim

For readability and disambiguation of written text, appropriate word segmentation is recommended in documentation, and this also holds for digitized texts.

