Search Results for author: Alice Oh

We study multilingualism by collecting and analyzing a large dataset of the content written by multilingual editors of the English, German, and Spanish editions of Wikipedia.

Joint Modeling of Topics, Citations, and Topical Authority in Academic Corpora

no code implementations • TACL 2017 • Jooyeon Kim, Dongwoo Kim, Alice Oh

Second, it models each author's influence on citations of a paper based on the topics of the cited papers, as well as the citing papers.

Rotated Word Vector Representations and their Interpretability

1 code implementation • EMNLP 2017 • Sungjoon Park, JinYeong Bak, Alice Oh

We apply several rotation algorithms to the vector representation of words to improve the interpretability.

Leveraging the Crowd to Detect and Reduce the Spread of Fake News and Misinformation

1 code implementation • 27 Nov 2017 • Jooyeon Kim, Behzad Tabibian, Alice Oh, Bernhard Schoelkopf, Manuel Gomez-Rodriguez

Online social networking sites are experimenting with the following crowd-powered procedure to reduce the spread of fake news and misinformation: whenever a user is exposed to a story through her feed, she can flag the story as misinformation and, if the story receives enough flags, it is sent to a trusted third party for fact checking.

Fact Checking Misinformation +1

Subword-level Word Vector Representations for Korean

1 code implementation • ACL 2018 • Sungjoon Park, Jeongmin Byun, Sion Baek, Yongseok Cho, Alice Oh

The results show that our simple method outperforms word2vec and character-level Skip-Grams on semantic and syntactic similarity and analogy tasks and contributes positively toward downstream NLP tasks such as sentiment analysis.

Document Classification Language Modelling +3

Hierarchical Dirichlet Gaussian Marked Hawkes Process for Narrative Reconstruction in Continuous Time Domain

no code implementations • EMNLP 2018 • Yeon Seonwoo, Alice Oh, Sungjoon Park

In news and discussions, many articles and posts are provided without their related previous articles or posts.

Conversational Decision-Making Model for Predicting the King's Decision in the Annals of the Joseon Dynasty

no code implementations • EMNLP 2018 • JinYeong Bak, Alice Oh

Styles of leaders when they make decisions in groups vary, and the different styles affect the performance of the group.

Decision Making

Homogeneity-Based Transmissive Process to Model True and False News in Social Networks

1 code implementation • 16 Nov 2018 • Jooyeon Kim, Dongkwan Kim, Alice Oh

An overwhelming number of true and false news stories are posted and shared in social networks, and users diffuse the stories based on multiple factors.

Conversation Model Fine-Tuning for Classifying Client Utterances in Counseling Dialogues

no code implementations • NAACL 2019 • Sungjoon Park, Donghyun Kim, Alice Oh

A dataset of those interactions can be used to learn to automatically classify the client utterances into categories that help counselors in diagnosing client status and predicting counseling outcome.

Language Modelling

Emergence of Collective Policies Inside Simulations with Biased Representations

no code implementations • 25 Sep 2019 • Jooyeon Kim, Alice Oh

We consider a setting where biases are involved when agents internalise an environment.

Variational Hierarchical User-based Conversation Model

1 code implementation • IJCNLP 2019 • JinYeong Bak, Alice Oh

To overcome this limitation, we propose a new model with a stochastic variable designed to capture the speaker information and deliver it to the conversational context.

Response Generation

Additive Compositionality of Word Vectors

no code implementations • WS 2019 • Yeon Seonwoo, Sungjoon Park, Dongkwan Kim, Alice Oh

Additive compositionality of word embedding models has been studied from empirical and theoretical perspectives.

Sentence Sentence Similarity +1

Dimensional Emotion Detection from Categorical Emotion

1 code implementation • EMNLP 2021 • Sungjoon Park, Jiseon Kim, Seonghyeon Ye, Jaeyeol Jeon, Hee Young Park, Alice Oh

We present a model that predicts fine-grained emotions along the continuous dimensions of valence, arousal, and dominance (VAD) using a corpus with categorical emotion annotations.

Emotion Classification Sentence

K-EmoCon, a multimodal sensor dataset for continuous emotion recognition in naturalistic conversations

1 code implementation • 8 May 2020 • Cheul Young Park, Narae Cha, Soowon Kang, Auk Kim, Ahsan Habib Khandoker, Leontios Hadjileontiadis, Alice Oh, Yong Jeong, Uichin Lee

Therefore, studying emotions in the context of social interactions requires a novel dataset, and K-EmoCon is such a multimodal dataset with comprehensive annotations of continuous emotions during naturalistic conversations.

EEG Emotion Recognition

Speaker Sensitive Response Evaluation Model

1 code implementation • ACL 2020 • JinYeong Bak, Alice Oh

We provide our code and the learned parameters so that they can be used for automatic evaluation of dialogue response generation models.

Response Generation

Context-Aware Answer Extraction in Question Answering

1 code implementation • EMNLP 2020 • Yeon Seonwoo, Ji-Hoon Kim, Jung-Woo Ha, Alice Oh

With experiments on reading comprehension, we show that BLANC outperforms the state-of-the-art QA models, and the performance gap increases as the number of answer text occurrences increases.

Multi-Task Learning Question Answering +1

KLUE: Korean Language Understanding Evaluation

3 code implementations • 20 May 2021 • Sungjoon Park, Jihyung Moon, Sungdong Kim, Won Ik Cho, Jiyoon Han, Jangwon Park, Chisung Song, JunSeong Kim, Yongsook Song, Taehwan Oh, Joohong Lee, Juhyun Oh, Sungwon Lyu, Younghoon Jeong, InKwon Lee, Sangwoo Seo, Dongjun Lee, Hyunwoo Kim, Myeonghwa Lee, Seongbo Jang, Seungwon Do, Sunkyoung Kim, Kyungtae Lim, Jongwon Lee, Kyumin Park, Jamin Shin, Seonghyun Kim, Lucy Park, Alice Oh, Jung-Woo Ha, Kyunghyun Cho

We introduce the Korean Language Understanding Evaluation (KLUE) benchmark.

Dependency Parsing Dialogue State Tracking +10

Weakly Supervised Pre-Training for Multi-Hop Retriever

1 code implementation • Findings (ACL) 2021 • Yeon Seonwoo, Sang-Woo Lee, Ji-Hoon Kim, Jung-Woo Ha, Alice Oh

In multi-hop QA, answering complex questions entails iterative document retrieval for finding the missing entity of the question.

Retrieval

Efficient Contrastive Learning via Novel Data Augmentation and Curriculum Learning

1 code implementation • EMNLP 2021 • Seonghyeon Ye, Jiseon Kim, Alice Oh

We introduce EfficientCL, a memory-efficient continual pretraining method that applies contrastive learning with novel data augmentation and curriculum learning.

Continual Pretraining Contrastive Learning +2

Mitigating Language-Dependent Ethnic Bias in BERT

1 code implementation • EMNLP 2021 • Jaimeen Ahn, Alice Oh

Which of the two methods works better depends on the amount of NLP resources available for that language.

Word Alignment

Learning Bill Similarity with Annotated and Augmented Corpora of Bills

1 code implementation • EMNLP 2021 • Jiseon Kim, Elden Griggs, In Song Kim, Alice Oh

Despite the significance of bill-to-bill linkages for understanding the legislative process, existing approaches fail to address semantic similarities across bills, let alone reordering or paraphrasing which are prevalent in legal document writing.

Knowledge-Enhanced Evidence Retrieval for Counterargument Generation

1 code implementation • Findings (EMNLP) 2021 • Yohan Jo, Haneul Yoo, JinYeong Bak, Alice Oh, Chris Reed, Eduard Hovy

Finding counterevidence to statements is key to many tasks, including counterargument generation.

Knowledge Graphs Natural Language Inference +3

Learning Representations of Partial Subgraphs by Subgraph InfoMax

no code implementations • 29 Sep 2021 • Dongkwan Kim, Jiho Jin, Jaimeen Ahn, Alice Oh

Subgraphs are important substructures of graphs, but learning their representations has not been studied well.

Emergent Communication under Varying Sizes and Connectivities

no code implementations • NeurIPS 2021 • Jooyeon Kim, Alice Oh

Just as we humans have succeeded in creating a shared language that allows us to interact within a large group, can the emergent communication within an artificial group converge to a shared, agreed language?

Translating Subgraphs to Nodes Makes Simple GNNs Strong and Efficient for Subgraph Representation Learning

no code implementations • 9 Apr 2022 • Dongkwan Kim, Alice Oh

Subgraph representation learning has emerged as an important problem, but it is by default approached with specialized graph neural networks on a large global graph.

Representation Learning Translation

How to Find Your Friendly Neighborhood: Graph Attention Design with Self-Supervision

2 code implementations • ICLR 2021 • Dongkwan Kim, Alice Oh

However, what graph attention learns is not understood well, particularly when graphs are noisy.

Ranked #6 on Node Classification on PubMed with Public Split: fixed 20 nodes per class

Graph Attention

Two-Step Question Retrieval for Open-Domain QA

1 code implementation • Findings (ACL) 2022 • Yeon Seonwoo, Juhee Son, Jiho Jin, Sang-Woo Lee, Ji-Hoon Kim, Jung-Woo Ha, Alice Oh

These models have shown a significant increase in inference speed, but at the cost of lower QA performance compared to the retriever-reader models.

Computational Efficiency Retrieval +1

Translating Hanja Historical Documents to Contemporary Korean and English

no code implementations • 20 May 2022 • Juhee Son, Jiho Jin, Haneul Yoo, JinYeong Bak, Kyunghyun Cho, Alice Oh

Built on top of multilingual neural machine translation, H2KE learns to translate a historical document written in Hanja, from both a full dataset of outdated Korean translation and a small dataset of more recently translated contemporary Korean and English.

Machine Translation Translation

KOLD: Korean Offensive Language Dataset

1 code implementation • 23 May 2022 • Younghoon Jeong, Juhyun Oh, Jaimeen Ahn, Jongwon Lee, Jihyung Moon, Sungjoon Park, Alice Oh

Recent directions for offensive language detection are hierarchical modeling, identifying the type and the target of offensive language, and interpretability with offensive span annotation and prediction.

Classification

Models and Benchmarks for Representation Learning of Partially Observed Subgraphs

1 code implementation • 1 Sep 2022 • Dongkwan Kim, Jiho Jin, Jaimeen Ahn, Alice Oh

Subgraphs are rich substructures in graphs, and their nodes and edges can be partially observed in real-world tasks.

Representation Learning

Ranking-Enhanced Unsupervised Sentence Representation Learning

1 code implementation • 9 Sep 2022 • Yeon Seonwoo, Guoyin Wang, Changmin Seo, Sajal Choudhary, Jiwei Li, Xiang Li, Puyang Xu, Sunghyun Park, Alice Oh

In this work, we show that the semantic meaning of a sentence is also determined by nearest-neighbor sentences that are similar to the input sentence.

Contrastive Learning Data Augmentation +5

HUE: Pretrained Model and Dataset for Understanding Hanja Documents of Ancient Korea

1 code implementation • Findings (NAACL) 2022 • Haneul Yoo, Jiho Jin, Juhee Son, JinYeong Bak, Kyunghyun Cho, Alice Oh

Historical records in Korea before the 20th century were primarily written in Hanja, an extinct language based on Chinese characters and not understood by modern Korean or Chinese speakers.

named-entity-recognition Named Entity Recognition +3

Rethinking Annotation: Can Language Learners Contribute?

no code implementations • 13 Oct 2022 • Haneul Yoo, Rifki Afina Putri, Changyoon Lee, Youngin Lee, So-Yeon Ahn, Dongyeop Kang, Alice Oh

Researchers have traditionally recruited native speakers to provide annotations for widely used benchmark datasets.

Machine Reading Comprehension named-entity-recognition +4

IDK-MRC: Unanswerable Questions for Indonesian Machine Reading Comprehension

1 code implementation • 25 Oct 2022 • Rifki Afina Putri, Alice Oh

Machine Reading Comprehension (MRC) has become one of the essential tasks in Natural Language Understanding (NLU) as it is often included in several NLU benchmarks (Liang et al., 2020; Wilie et al., 2020).

Machine Reading Comprehension Natural Language Understanding +1

Towards standardizing Korean Grammatical Error Correction: Datasets and Annotation

1 code implementation • 25 Oct 2022 • Soyoung Yoon, Sungjoon Park, Gyuwan Kim, Junhee Cho, Kihyo Park, Gyutae Kim, Minjoon Seo, Alice Oh

We show that the model trained with our datasets significantly outperforms the currently used statistical Korean GEC system (Hanspell) on a wider range of error types, demonstrating the diversity and usefulness of the datasets.

Attribute Grammatical Error Correction

CS1QA: A Dataset for Assisting Code-based Question Answering in an Introductory Programming Course

1 code implementation • NAACL 2022 • Changyoon Lee, Yeon Seonwoo, Alice Oh

We introduce CS1QA, a dataset for code-based question answering in the programming education domain.

Question Answering

SQuARe: A Large-Scale Dataset of Sensitive Questions and Acceptable Responses Created Through Human-Machine Collaboration

1 code implementation • 28 May 2023 • Hwaran Lee, Seokhee Hong, Joonsuk Park, Takyoung Kim, Meeyoung Cha, Yejin Choi, Byoung Pil Kim, Gunhee Kim, Eun-Ju Lee, Yong Lim, Alice Oh, Sangchul Park, Jung-Woo Ha

The potential social harms that large language models pose, such as generating offensive content and reinforcing biases, are steeply rising.

Response Generation

KoBBQ: Korean Bias Benchmark for Question Answering

no code implementations • 31 Jul 2023 • Jiho Jin, Jiseon Kim, Nayeon Lee, Haneul Yoo, Alice Oh, Hwaran Lee

In this paper, we present KoBBQ, a Korean bias benchmark dataset, and we propose a general framework that addresses considerations for cultural adaptation of a dataset.

Question Answering

Exploring Cross-Cultural Differences in English Hate Speech Annotations: From Dataset Construction to Analysis

1 code implementation • 31 Aug 2023 • Nayeon Lee, Chani Jung, Junho Myung, Jiho Jin, Jose Camacho-Collados, Juho Kim, Alice Oh

To address this, we introduce CREHate, a CRoss-cultural English Hate speech dataset.

Hate Speech Detection Transfer Learning

Learning from Teaching Assistants to Program with Subgoals: Exploring the Potential for AI Teaching Assistants

no code implementations • 19 Sep 2023 • Changyoon Lee, Junho Myung, Jieun Han, Jiho Jin, Alice Oh

To compare the learners' interaction and perception of the AI and human TAs, we conducted a between-subject study with 20 novice programming learners.

ChEDDAR: Student-ChatGPT Dialogue in EFL Writing Education

no code implementations • 23 Sep 2023 • Jieun Han, Haneul Yoo, Junho Myung, Minsun Kim, Tak Yeon Lee, So-Yeon Ahn, Alice Oh

We analyze students' usage patterns and perceptions regarding generative AI with respect to their intent and satisfaction.

Intent Detection Task-Oriented Dialogue Systems

FABRIC: Automated Scoring and Feedback Generation for Essays

no code implementations • 8 Oct 2023 • Jieun Han, Haneul Yoo, Junho Myung, Minsun Kim, Hyunseung Lim, Yoonsu Kim, Tak Yeon Lee, Hwajung Hong, Juho Kim, So-Yeon Ahn, Alice Oh

The second component is CASE, a Corruption-based Augmentation Strategy for Essays, with which we can improve the accuracy of the baseline model by 45.44%.

Automated Essay Scoring

Time-Aware Representation Learning for Time-Sensitive Question Answering

1 code implementation • 19 Oct 2023 • Jungbin Son, Alice Oh

The model is trained to extract the answer span from the sentence that is both correct in time and context.

Question Answering Representation Learning +1

The Generative AI Paradox on Evaluation: What It Can Solve, It May Not Evaluate

no code implementations • 9 Feb 2024 • Juhyun Oh, Eunsu Kim, Inha Cha, Alice Oh

This paper explores the assumption that Large Language Models (LLMs) skilled in generation tasks are equally adept as evaluators.

Question Answering TriviaQA

Can LLM Generate Culturally Relevant Commonsense QA Data? Case Study in Indonesian and Sundanese

1 code implementation • 27 Feb 2024 • Rifki Afina Putri, Faiz Ghifari Haznitrama, Dea Adhista, Alice Oh

Large Language Models (LLMs) are increasingly being used to generate synthetic data for training and evaluating models.

General Knowledge Question Answering

Multi-FAct: Assessing Multilingual LLMs' Multi-Regional Knowledge using FActScore

1 code implementation • 28 Feb 2024 • Sheikh Shafayat, Eunsu Kim, Juhyun Oh, Alice Oh

Large Language Models (LLMs) are prone to factuality hallucination, generating text that contradicts established knowledge.

Hallucination

CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in Korean

1 code implementation • 11 Mar 2024 • Eunsu Kim, Juyoung Suk, Philhoon Oh, Haneul Yoo, James Thorne, Alice Oh

Despite the rapid development of large language models (LLMs) for the Korean language, there remains an obvious lack of benchmark datasets that test the requisite Korean cultural and linguistic knowledge.

Hate Speech Detection

RECIPE4U: Student-ChatGPT Interaction Dataset in EFL Writing Education

no code implementations • 13 Mar 2024 • Jieun Han, Haneul Yoo, Junho Myung, Minsun Kim, Tak Yeon Lee, So-Yeon Ahn, Alice Oh

RECIPE4U includes comprehensive records of these interactions, including conversation logs, students' intent, students' self-rated satisfaction, and students' essay edit histories.

Intent Detection Task-Oriented Dialogue Systems

BEnQA: A Question Answering and Reasoning Benchmark for Bengali and English

1 code implementation • 16 Mar 2024 • Sheikh Shafayat, H M Quamran Hasan, Minhajur Rahman Chowdhury Mahim, Rifki Afina Putri, James Thorne, Alice Oh

In this study, we introduce BEnQA, a dataset comprising parallel Bengali and English exam questions for middle and high school levels in Bangladesh.

Question Answering

Why Knowledge Distillation Amplifies Gender Bias and How to Mitigate from the Perspective of DistilBERT

no code implementations • NAACL (GeBNLP) 2022 • Jaimeen Ahn, Hwaran Lee, JinHwa Kim, Alice Oh

Knowledge distillation is widely used to transfer the language understanding of a large model to a smaller model. However, after knowledge distillation, it was found that the smaller model is more biased by gender compared to the source large model. This paper studies what causes gender bias to increase after the knowledge distillation process. Moreover, we suggest applying a variant of the mixup on knowledge distillation, which is used to increase generalizability during the distillation process, not for augmentation. By doing so, we can significantly reduce the gender bias amplification after knowledge distillation. We also conduct an experiment on the GLUE benchmark to demonstrate that even if the mixup is applied, it does not have a significant adverse effect on the model’s performance.

Knowledge Distillation

Virtual Knowledge Graph Construction for Zero-Shot Domain-Specific Document Retrieval

1 code implementation • COLING 2022 • Yeon Seonwoo, Seunghyun Yoon, Franck Dernoncourt, Trung Bui, Alice Oh

We conduct three experiments: 1) domain-specific document retrieval, 2) a comparison of our virtual knowledge graph construction method with previous approaches, and 3) an ablation study on each component of our virtual knowledge graph.

Domain Adaptation graph construction +2

Suicidal Risk Detection for Military Personnel

no code implementations • EMNLP 2020 • Sungjoon Park, Kiwoong Park, Jaimeen Ahn, Alice Oh

We analyze social media for detecting the suicidal risk of military personnel, which is especially crucial for countries with compulsory military service such as the Republic of Korea.

Ethics
