1 code implementation • COLING 2022 • Yeon Seonwoo, Seunghyun Yoon, Franck Dernoncourt, Trung Bui, Alice Oh
We conduct three experiments: 1) domain-specific document retrieval, 2) a comparison of our virtual knowledge graph construction method with previous approaches, and 3) an ablation study on each component of our virtual knowledge graph.
no code implementations • EMNLP 2020 • Sungjoon Park, Kiwoong Park, Jaimeen Ahn, Alice Oh
We analyze social media for detecting the suicidal risk of military personnel, which is especially crucial for countries with compulsory military service such as the Republic of Korea.
no code implementations • NAACL (GeBNLP) 2022 • Jaimeen Ahn, Hwaran Lee, JinHwa Kim, Alice Oh
Knowledge distillation is widely used to transfer the language understanding of a large model to a smaller model. However, after knowledge distillation, it was found that the smaller model is more biased by gender compared to the source large model. This paper studies what causes gender bias to increase after the knowledge distillation process. Moreover, we suggest applying a variant of the mixup on knowledge distillation, which is used to increase generalizability during the distillation process, not for augmentation. By doing so, we can significantly reduce the gender bias amplification after knowledge distillation. We also conduct an experiment on the GLUE benchmark to demonstrate that even if the mixup is applied, it does not have a significant adverse effect on the model’s performance.
1 code implementation • NAACL 2022 • Changyoon Lee, Yeon Seonwoo, Alice Oh
We introduce CS1QA, a dataset for code-based question answering in the programming education domain.
1 code implementation • 25 Oct 2022 • Soyoung Yoon, Sungjoon Park, Gyuwan Kim, Junhee Cho, Kihyo Park, Gyutae Kim, Minjoon Seo, Alice Oh
We show that the model trained with our datasets significantly outperforms the currently used statistical Korean GEC system (Hanspell) on a wider range of error types, demonstrating the diversity and usefulness of the datasets.
1 code implementation • 25 Oct 2022 • Rifki Afina Putri, Alice Oh
Machine Reading Comprehension (MRC) has become one of the essential tasks in Natural Language Understanding (NLU) as it is often included in several NLU benchmarks (Liang et al., 2020; Wilie et al., 2020).
no code implementations • 13 Oct 2022 • Haneul Yoo, Rifki Afina Putri, Changyoon Lee, Youngin Lee, So-Yeon Ahn, Dongyeop Kang, Alice Oh
The implication of our findings is that broadening the annotation task to include language learners can open up the opportunity to build benchmark datasets for languages for which it is difficult to recruit native speakers.
1 code implementation • Findings (NAACL) 2022 • Haneul Yoo, Jiho Jin, Juhee Son, JinYeong Bak, Kyunghyun Cho, Alice Oh
Historical records in Korea before the 20th century were primarily written in Hanja, an extinct language based on Chinese characters and not understood by modern Korean or Chinese speakers.
1 code implementation • 9 Sep 2022 • Yeon Seonwoo, Guoyin Wang, Changmin Seo, Sajal Choudhary, Jiwei Li, Xiang Li, Puyang Xu, Sunghyun Park, Alice Oh
In this work, we show that the semantic meaning of a sentence is also determined by nearest-neighbor sentences that are similar to the input sentence.
1 code implementation • 1 Sep 2022 • Dongkwan Kim, Jiho Jin, Jaimeen Ahn, Alice Oh
Subgraphs are rich substructures in graphs, and their nodes and edges can be partially observed in real-world tasks.
1 code implementation • 23 May 2022 • Younghoon Jeong, Juhyun Oh, Jaimeen Ahn, Jongwon Lee, Jihyung Moon, Sungjoon Park, Alice Oh
Recent directions for offensive language detection are hierarchical modeling, identifying the type and the target of offensive language, and interpretability with offensive span annotation and prediction.
no code implementations • 20 May 2022 • Juhee Son, Jiho Jin, Haneul Yoo, JinYeong Bak, Kyunghyun Cho, Alice Oh
Built on top of multilingual neural machine translation, H2KE learns to translate a historical document written in Hanja, from both a full dataset of outdated Korean translation and a small dataset of more recently translated contemporary Korean and English.
1 code implementation • Findings (ACL) 2022 • Yeon Seonwoo, Juhee Son, Jiho Jin, Sang-Woo Lee, Ji-Hoon Kim, Jung-Woo Ha, Alice Oh
These models have shown a significant increase in inference speed, but at the cost of lower QA performance compared to the retriever-reader models.
2 code implementations • ICLR 2021 • Dongkwan Kim, Alice Oh
However, what graph attention learns is not understood well, particularly when graphs are noisy.
no code implementations • 9 Apr 2022 • Dongkwan Kim, Alice Oh
A subgraph is a data structure that can represent various real-world problems.
no code implementations • NeurIPS 2021 • Jooyeon Kim, Alice Oh
Just as we humans have succeeded in creating a shared language that allows us to interact within a large group, can the emergent communication within an artificial group converge to a shared, agreed language?
no code implementations • 29 Sep 2021 • Dongkwan Kim, Jiho Jin, Jaimeen Ahn, Alice Oh
Subgraphs are important substructures of graphs, but learning their representations has not been studied well.
1 code implementation • Findings (EMNLP) 2021 • Yohan Jo, Haneul Yoo, JinYeong Bak, Alice Oh, Chris Reed, Eduard Hovy
Finding counterevidence to statements is key to many tasks, including counterargument generation.
1 code implementation • EMNLP 2021 • Jiseon Kim, Elden Griggs, In Song Kim, Alice Oh
Despite the significance of bill-to-bill linkages for understanding the legislative process, existing approaches fail to address semantic similarities across bills, let alone reordering or paraphrasing which are prevalent in legal document writing.
1 code implementation • EMNLP 2021 • Jaimeen Ahn, Alice Oh
Which of the two methods works better depends on the amount of NLP resources available for that language.
1 code implementation • EMNLP 2021 • Seonghyeon Ye, Jiseon Kim, Alice Oh
We introduce EfficientCL, a memory-efficient continual pretraining method that applies contrastive learning with novel data augmentation and curriculum learning.
1 code implementation • Findings (ACL) 2021 • Yeon Seonwoo, Sang-Woo Lee, Ji-Hoon Kim, Jung-Woo Ha, Alice Oh
In multi-hop QA, answering complex questions entails iterative document retrieval for finding the missing entity of the question.
3 code implementations • 20 May 2021 • Sungjoon Park, Jihyung Moon, Sungdong Kim, Won Ik Cho, Jiyoon Han, Jangwon Park, Chisung Song, JunSeong Kim, Yongsook Song, Taehwan Oh, Joohong Lee, Juhyun Oh, Sungwon Lyu, Younghoon Jeong, InKwon Lee, Sangwoo Seo, Dongjun Lee, Hyunwoo Kim, Myeonghwa Lee, Seongbo Jang, Seungwon Do, Sunkyoung Kim, Kyungtae Lim, Jongwon Lee, Kyumin Park, Jamin Shin, Seonghyun Kim, Lucy Park, Alice Oh, Jung-Woo Ha, Kyunghyun Cho
We introduce the Korean Language Understanding Evaluation (KLUE) benchmark.
1 code implementation • EMNLP 2020 • Yeon Seonwoo, Ji-Hoon Kim, Jung-Woo Ha, Alice Oh
With experiments on reading comprehension, we show that BLANC outperforms the state-of-the-art QA models, and the performance gap increases as the number of answer text occurrences increases.
1 code implementation • ACL 2020 • JinYeong Bak, Alice Oh
We provide our code and the learned parameters so that they can be used for automatic evaluation of dialogue response generation models.
1 code implementation • 8 May 2020 • Cheul Young Park, Narae Cha, Soowon Kang, Auk Kim, Ahsan Habib Khandoker, Leontios Hadjileontiadis, Alice Oh, Yong Jeong, Uichin Lee
Therefore, studying emotions in the context of social interactions requires a novel dataset, and K-EmoCon is such a multimodal dataset with comprehensive annotations of continuous emotions during naturalistic conversations.
1 code implementation • EMNLP 2021 • Sungjoon Park, Jiseon Kim, Seonghyeon Ye, Jaeyeol Jeon, Hee Young Park, Alice Oh
We present a model to predict fine-grained emotions along the continuous dimensions of valence, arousal, and dominance (VAD) with a corpus with categorical emotion annotations.
1 code implementation • IJCNLP 2019 • JinYeong Bak, Alice Oh
To overcome this limitation, we propose a new model with a stochastic variable designed to capture the speaker information and deliver it to the conversational context.
no code implementations • WS 2019 • Yeon Seonwoo, Sungjoon Park, Dongkwan Kim, Alice Oh
Additive compositionality of word embedding models has been studied from empirical and theoretical perspectives.
no code implementations • 25 Sep 2019 • Jooyeon Kim, Alice Oh
We consider a setting where biases are involved when agents internalise an environment.
no code implementations • NAACL 2019 • Sungjoon Park, Donghyun Kim, Alice Oh
A dataset of those interactions can be used to learn to automatically classify the client utterances into categories that help counselors in diagnosing client status and predicting counseling outcome.
1 code implementation • 16 Nov 2018 • Jooyeon Kim, Dongkwan Kim, Alice Oh
An overwhelming number of true and false news stories are posted and shared in social networks, and users diffuse the stories based on multiple factors.
no code implementations • EMNLP 2018 • Yeon Seonwoo, Alice Oh, Sungjoon Park
In news and discussions, many articles and posts are provided without their related previous articles or posts.
no code implementations • EMNLP 2018 • JinYeong Bak, Alice Oh
Styles of leaders when they make decisions in groups vary, and the different styles affect the performance of the group.
1 code implementation • ACL 2018 • Sungjoon Park, Jeongmin Byun, Sion Baek, Yongseok Cho, Alice Oh
The results show that our simple method outperforms word2vec and character-level Skip-Grams on semantic and syntactic similarity and analogy tasks and contributes positively toward downstream NLP tasks such as sentiment analysis.
1 code implementation • 27 Nov 2017 • Jooyeon Kim, Behzad Tabibian, Alice Oh, Bernhard Schoelkopf, Manuel Gomez-Rodriguez
Online social networking sites are experimenting with the following crowd-powered procedure to reduce the spread of fake news and misinformation: whenever a user is exposed to a story through her feed, she can flag the story as misinformation and, if the story receives enough flags, it is sent to a trusted third party for fact checking.
1 code implementation • EMNLP 2017 • Sungjoon Park, JinYeong Bak, Alice Oh
We apply several rotation algorithms to the vector representation of words to improve the interpretability.
no code implementations • TACL 2017 • Jooyeon Kim, Dongwoo Kim, Alice Oh
Second, it models each author's influence on citations of a paper based on the topics of the cited papers, as well as the citing papers.
no code implementations • 28 Aug 2015 • Suin Kim, Sungjoon Park, Scott A. Hale, Sooyoung Kim, Jeongmin Byun, Alice Oh
We study multilingualism by collecting and analyzing a large dataset of the content written by multilingual editors of the English, German, and Spanish editions of Wikipedia.
no code implementations • 22 Mar 2014 • Dongwoo Kim, Alice Oh
We present the hierarchical Dirichlet scaling process (HDSP), a Bayesian nonparametric mixed membership model.