no code implementations • LREC 2022 • Sangwhan Moon, Won Ik Cho, Hye Joo Han, Naoaki Okazaki, Nam Soo Kim
As this problem originates from the conventional scheme used when creating a POS tagging corpus, we propose an improvement to the existing scheme, which makes it friendlier to generative tasks.
no code implementations • COLING (CODI, CRAC) 2022 • Won Ik Cho, SooMin Kim, Eujeong Choi, Younghoon Jeong
Recently, with the advent of high-performance generative language models, artificial agents that communicate directly with the users have become more human-like.
no code implementations • ACL (IWSLT) 2021 • Yong Rae Jo, Youngki Moon, Minji Jung, Jungyoon Choi, Jihyung Moon, Won Ik Cho
In this technical report, we describe the fine-tuned ASR-MT pipeline used for the IWSLT shared task.
no code implementations • NLP4DH (ICON) 2021 • Won Ik Cho, Jihyung Moon
Furthermore, we check how the final corpus corresponds with the definition and scope of hate speech, and confirm that the overall procedure and outcome are consistent with the sociolinguistic discussions.
no code implementations • WNUT (ACL) 2021 • Won Ik Cho, SooMin Kim
WARNING: This article contains content that may offend readers.
no code implementations • 10 Jan 2025 • Eunjung Cho, Won Ik Cho, Soomin Seo
Hallucination in large language models (LLMs) remains a significant challenge for their safe deployment, particularly due to its potential to spread misinformation.
no code implementations • 13 Oct 2024 • Soyoung Yang, Hojun Cho, Jiyoung Lee, Sohee Yoon, Edward Choi, Jaegul Choo, Won Ik Cho
Aspect-based sentiment analysis (ABSA) is the challenging task of extracting sentiment along with its corresponding aspects and opinions from human language.
no code implementations • 17 Apr 2024 • Soyoung Yang, Won Ik Cho
In the era of rapid evolution of generative language models within the realm of natural language processing, there is an imperative call to revisit and reformulate evaluation methodologies, especially in the domain of aspect-based sentiment analysis (ABSA).
no code implementations • 1 Apr 2023 • Won Ik Cho, Yoon Kyung Lee, Seoyeon Bae, JiHwan Kim, Sangah Park, Moosung Kim, Sowon Hahn, Nam Soo Kim
Building a natural language dataset requires caution, since word semantics are vulnerable to subtle changes in the text or in the definition of the annotated concept.
1 code implementation • 6 Apr 2022 • Byeong-Cheol Jo, Tak-Sung Heo, Yeongjoon Park, Yongmin Yoo, Won Ik Cho, Kyungsun Kim
Finally, we propose Data Augmentation with Generation And Modification (DAGAM), which combines the DAG and DAM techniques for improved performance.
1 code implementation • 25 Feb 2022 • Kichang Yang, Wonjun Jang, Won Ik Cho
In hate speech detection, developing training and evaluation datasets across various domains is a critical issue.
1 code implementation • 6 Jul 2021 • Won Ik Cho, Seok Min Kim, Hyunchang Cho, Nam Soo Kim
Most speech-to-text (S2T) translation studies use English speech as a source, which makes it difficult for non-English speakers to take advantage of the S2T technologies.
4 code implementations • 20 May 2021 • Sungjoon Park, Jihyung Moon, Sungdong Kim, Won Ik Cho, Jiyoon Han, Jangwon Park, Chisung Song, JunSeong Kim, Yongsook Song, Taehwan Oh, Joohong Lee, Juhyun Oh, Sungwon Lyu, Younghoon Jeong, InKwon Lee, Sangwoo Seo, Dongjun Lee, Hyunwoo Kim, Myeonghwa Lee, Seongbo Jang, Seungwon Do, Sunkyoung Kim, Kyungtae Lim, Jongwon Lee, Kyumin Park, Jamin Shin, Seonghyun Kim, Lucy Park, Alice Oh, Jung-Woo Ha, Kyunghyun Cho
We introduce the Korean Language Understanding Evaluation (KLUE) benchmark.
1 code implementation • LREC 2022 • Won Ik Cho, Sangwhan Moon, Jong In Kim, Seok Min Kim, Nam Soo Kim
Paraphrasing is often performed with less concern for controlled style conversion.
no code implementations • EMNLP (NLPOSS) 2020 • Won Ik Cho, Sangwhan Moon, YoungSook Song
Korean is often referred to as a low-resource language in the research community.
no code implementations • 3 Aug 2020 • Ji Won Yoon, Hyeonseung Lee, Hyung Yong Kim, Won Ik Cho, Nam Soo Kim
To reduce this computational burden, knowledge distillation (KD), which is a popular model compression method, has been used to transfer knowledge from a deep and complex model (teacher) to a shallower and simpler model (student).
1 code implementation • WS 2020 • Jihyung Moon, Won Ik Cho, Junbum Lee
Additionally, when BERT is trained with bias labels for hate speech detection, the prediction score increases, implying that bias and hate are intertwined.
no code implementations • 17 May 2020 • Won Ik Cho, Dong-Hyun Kwak, Ji Won Yoon, Nam Soo Kim
We transfer the knowledge from a concrete Transformer-based text LM to an SLU module which can face a data shortage, based on recent cross-modal distillation methodologies.
no code implementations • LREC 2020 • Won Ik Cho, Jong In Kim, Young Ki Moon, Nam Soo Kim
Assessing the similarity of sentences and detecting paraphrases are essential tasks both in theory and in practice, but building a reliable dataset requires substantial resources.
no code implementations • LREC 2020 • Won Ik Cho, Seok Min Kim, Nam Soo Kim
Code-mixed grapheme-to-phoneme (G2P) conversion is a crucial issue for modern speech recognition and synthesis tasks, but has seldom been investigated at the sentence level in the literature.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Won Ik Cho, Young Ki Moon, Sangwhan Moon, Seok Min Kim, Nam Soo Kim
Modern dialog managers face the challenge of having to fulfill human-level conversational skills as part of common user expectations, including but not limited to discourse with no clear objective.
1 code implementation • 21 Oct 2019 • Won Ik Cho, Jeonghwa Cho, Woo Hyun Kang, Nam Soo Kim
Analyzing how human beings resolve syntactic ambiguity has long been an issue of interest in the field of linguistics.
2 code implementations • 31 May 2019 • Won Ik Cho, Seok Min Kim, Nam Soo Kim
Different from the writing systems of many Romance and Germanic languages, some languages or language families show complex conjunct forms in character composition.
1 code implementation • WS 2019 • Won Ik Cho, Ji Won Kim, Seok Min Kim, Nam Soo Kim
However, the detection and evaluation of gender bias in machine translation systems have not yet been thoroughly investigated, because the task is cross-lingual and challenging to define.
2 code implementations • 10 Nov 2018 • Won Ik Cho, Hyeon Seung Lee, Ji Won Yoon, Seok Min Kim, Nam Soo Kim
This paper suggests a system which identifies the inherent intention of a spoken utterance given its transcript, in some cases using auxiliary acoustic features.
1 code implementation • 31 Oct 2018 • Won Ik Cho, Sung Jun Cheon, Woo Hyun Kang, Ji Won Kim, Nam Soo Kim
For readability and disambiguation of written text, appropriate word segmentation is recommended in documentation, and this also holds for digitized texts.
1 code implementation • 10 Oct 2018 • Won Ik Cho, Young Ki Moon, Woo Hyun Kang, Nam Soo Kim
Intention identification is a core issue in dialog management.
no code implementations • SEMEVAL 2018 • Won Ik Cho, Woo Hyun Kang, Nam Soo Kim
This paper proposes a novel feature extraction process for SemEval-2018 Task 3: Irony detection in English tweets.