1 code implementation • EMNLP (MRL) 2021 • Jongin Kim, Nayoung Choi, Seunghyun Lim, Jungwhan Kim, Soojin Chung, Hyunsoo Woo, Min Song, Jinho D. Choi
This paper presents an English-Korean parallel dataset that collects 381K news articles, of which 1,400, comprising 10K sentences, are manually labeled for cross-lingual named entity recognition (NER).
no code implementations • EMNLP 2020 • Changmao Li, Elaine Fisher, Rebecca Thomas, Steve Pittard, Vicki Hertzberg, Jinho D. Choi
Given this dataset, novel transformer-based classification models are developed for two tasks: the first task takes a resume and classifies it to a CRC level (T1), and the second task takes both a resume and a job description to apply and predicts if the application is suited to the job (T2).
no code implementations • COLING 2022 • Daniil Huryn, William M. Hutsell, Jinho D. Choi
Our best method is applied to posts from those 10 subreddits for the creation of a corpus comprising 10,098 dialogues (3.3M tokens), 570 of which are compared against dialogues in three other datasets, Blended Skill Talk, Daily Dialogue, and Topical Chat.
1 code implementation • LREC (LAW) 2022 • Yuxin Ji, Gregor Williamson, Jinho D. Choi
All code for this paper, including our automatic annotation tool, is made publicly available.
1 code implementation • LREC (LAW) 2022 • Angela Cao, Gregor Williamson, Jinho D. Choi
We present a scheme for annotating causal language in various genres of text.
no code implementations • CRAC (ACL) 2021 • Sooyoun Han, Sumin Seo, Minji Kang, Jongin Kim, Nayoung Choi, Min Song, Jinho D. Choi
This paper presents a new corpus and annotation guideline for a novel coreference resolution task on fictional texts, and analyzes its unique characteristics.
no code implementations • EMNLP (ACL) 2021 • Jin Zhao, Nianwen Xue, Jens Van Gysel, Jinho D. Choi
We present UMR-Writer, a web-based application for annotating Uniform Meaning Representations (UMR), a graph-based, cross-linguistically applicable semantic representation developed recently to support the development of interpretable natural language applications that require deep semantic analysis of texts.
1 code implementation • 5 Feb 2023 • Han He, Jinho D. Choi
Sequence-to-Sequence (S2S) models have achieved remarkable success on various text generation tasks.
1 code implementation • 18 Dec 2022 • Sarah E. Finch, James D. Finch, Jinho D. Choi
There has been great recent advancement in human-computer chat.
no code implementations • *SEM (NAACL) 2022 • Liyan Xu, Jinho D. Choi
This paper suggests a direction of coreference resolution for online decoding on actively generated input such as dialogue, where the model accepts an utterance and its past context, then finds mentions in the current utterance as well as their referents, upon each dialogue turn.
no code implementations • NAACL 2022 • Liyan Xu, Jinho D. Choi
We target document-level relation extraction in an end-to-end setting, where the model needs to jointly perform mention extraction, coreference resolution (COREF), and relation extraction (RE) at once, and is evaluated in an entity-centric way.
Ranked #3 on Joint Entity and Relation Extraction on DocRED
2 code implementations • 6 Dec 2021 • Kaustubh D. Dhole, Varun Gangal, Sebastian Gehrmann, Aadesh Gupta, Zhenhao Li, Saad Mahamood, Abinaya Mahendiran, Simon Mille, Ashish Shrivastava, Samson Tan, Tongshuang Wu, Jascha Sohl-Dickstein, Jinho D. Choi, Eduard Hovy, Ondrej Dusek, Sebastian Ruder, Sajant Anand, Nagender Aneja, Rabin Banjade, Lisa Barthe, Hanna Behnke, Ian Berlot-Attwell, Connor Boyle, Caroline Brun, Marco Antonio Sobrevilla Cabezudo, Samuel Cahyawijaya, Emile Chapuis, Wanxiang Che, Mukund Choudhary, Christian Clauss, Pierre Colombo, Filip Cornell, Gautier Dagan, Mayukh Das, Tanay Dixit, Thomas Dopierre, Paul-Alexis Dray, Suchitra Dubey, Tatiana Ekeinhor, Marco Di Giovanni, Tanya Goyal, Rishabh Gupta, Louanes Hamla, Sang Han, Fabrice Harel-Canada, Antoine Honore, Ishan Jindal, Przemyslaw K. Joniak, Denis Kleyko, Venelin Kovatchev, Kalpesh Krishna, Ashutosh Kumar, Stefan Langer, Seungjae Ryan Lee, Corey James Levinson, Hualou Liang, Kaizhao Liang, Zhexiong Liu, Andrey Lukyanenko, Vukosi Marivate, Gerard de Melo, Simon Meoni, Maxime Meyer, Afnan Mir, Nafise Sadat Moosavi, Niklas Muennighoff, Timothy Sum Hon Mun, Kenton Murray, Marcin Namysl, Maria Obedkova, Priti Oli, Nivranshu Pasricha, Jan Pfister, Richard Plant, Vinay Prabhu, Vasile Pais, Libo Qin, Shahab Raji, Pawan Kumar Rajpoot, Vikas Raunak, Roy Rinberg, Nicolas Roberts, Juan Diego Rodriguez, Claude Roux, Vasconcellos P. H. S., Ananya B. Sai, Robin M. Schmidt, Thomas Scialom, Tshephisho Sefara, Saqib N. Shamsi, Xudong Shen, Haoyue Shi, Yiwen Shi, Anna Shvets, Nick Siegel, Damien Sileo, Jamie Simon, Chandan Singh, Roman Sitelew, Priyank Soni, Taylor Sorensen, William Soto, Aman Srivastava, KV Aditya Srivatsa, Tony Sun, Mukund Varma T, A Tabassum, Fiona Anting Tan, Ryan Teehan, Mo Tiwari, Marie Tolkiehn, Athena Wang, Zijian Wang, Gloria Wang, Zijie J. Wang, Fuxuan Wei, Bryan Wilie, Genta Indra Winata, Xinyi Wu, Witold Wydmański, Tianbao Xie, Usama Yaseen, Michael A. Yee, Jing Zhang, Yue Zhang
Data augmentation is an important component in the robustness evaluation of models in natural language processing (NLP) and in enhancing the diversity of the data they are trained on.
1 code implementation • 1 Dec 2021 • Liyan Xu, Xuchao Zhang, Bo Zong, Yanchi Liu, Wei Cheng, Jingchao Ni, Haifeng Chen, Liang Zhao, Jinho D. Choi
We target the task of cross-lingual Machine Reading Comprehension (MRC) in the direct zero-shot setting, by incorporating syntactic features from Universal Dependencies (UD), and the key features we use are the syntactic relations within each sentence.
no code implementations • 31 Oct 2021 • Sarah E. Finch, James D. Finch, Daniil Huryn, William Hutsell, Xiaoyuan Huang, Han He, Jinho D. Choi
In the third and final stage, our bot selects a small subset of predicates and translates them into an English response.
no code implementations • EMNLP (NLP4ConvAI) 2021 • James D. Finch, Sarah E. Finch, Jinho D. Choi
Improving user experience of a dialogue system often requires intensive developer effort to read conversation logs, run statistical analyses, and intuit the relative importance of system shortcomings.
1 code implementation • EMNLP (LAW, DMR) 2021 • Gregor Williamson, Patrick Elliott, Yuxin Ji, Jinho D. Choi
We adopt a scope node from the literature and provide an explicit multidimensional semantics utilizing Cooper storage which allows us to derive the de re and de dicto scope readings as well as intermediate scope readings which prove difficult for accounts without a scope node.
1 code implementation • 20 Sep 2021 • Jinho D. Choi, Gregor Williamson
This demonstration paper presents StreamSide, an open-source toolkit for annotating multiple kinds of meaning representations.
1 code implementation • EMNLP 2021 • Han He, Jinho D. Choi
Multi-task learning with transformer encoders (MTL) has emerged as a powerful technique to improve performance on closely-related tasks for both accuracy and efficiency while a question still remains whether or not it would perform as well on tasks that are distinct in nature.
1 code implementation • 8 Sep 2021 • Han He, Liyan Xu, Jinho D. Choi
We introduce ELIT, the Emory Language and Information Toolkit, which is a comprehensive NLP framework providing transformer-based end-to-end models for core tasks with a special focus on memory efficiency while maintaining state-of-the-art accuracy and speed.
no code implementations • ACL (CODI, CRAC) 2021 • Liyan Xu, Jinho D. Choi
We present an effective system adapted from the end-to-end neural coreference resolution model, targeting the task of anaphora resolution in dialogues.
1 code implementation • EMNLP 2021 • Liyan Xu, Xuchao Zhang, Xujiang Zhao, Haifeng Chen, Feng Chen, Jinho D. Choi
Recent multilingual pre-trained language models have achieved remarkable zero-shot performance, where the model is only finetuned on one source language and directly evaluated on target languages.
1 code implementation • ACL (IWPT) 2021 • Han He, Jinho D. Choi
Coupled with biaffine decoders, transformers have been effectively adapted to text-to-graph transduction and achieved state-of-the-art performance on AMR parsing.
Ranked #14 on AMR Parsing on LDC2017T10
no code implementations • NAACL (SMM4H) 2021 • Payam Karisani, Jinho D. Choi, Li Xiong
Then a classifier is trained on each view to label a set of unlabeled documents to be used as an initializer for a new classifier in the other view.
1 code implementation • 14 Apr 2021 • Jiaying Lu, Jinho D. Choi
Salience Estimation aims to predict term importance in documents.
no code implementations • Asian Chapter of the Association for Computational Linguistics 2020 • Renxuan Albert Li, Ihab Hajjar, Felicia Goldstein, Jinho D. Choi
This paper presents a new dataset, B-SHARP, that can be used to develop NLP models for the detection of Mild Cognitive Impairment (MCI), known as an early sign of Alzheimer's disease.
no code implementations • 5 Nov 2020 • Changmao Li, Elaine Fisher, Rebecca Thomas, Steve Pittard, Vicki Hertzberg, Jinho D. Choi
This paper presents a comprehensive study on resume classification to significantly reduce the time and labor needed to screen an overwhelming number of applications, while improving the selection of suitable candidates.
1 code implementation • EMNLP 2020 • Liyan Xu, Jinho D. Choi
We find that given a high-performing encoder such as SpanBERT, the impact of HOI is negative to marginal, providing a new perspective of HOI to this task.
Ranked #6 on Coreference Resolution on CoNLL 2012
no code implementations • 10 Sep 2020 • Sarah E. Finch, James D. Finch, Ali Ahmadvand, Ingyu Choi, Xiangjue Dong, Ruixiang Qi, Harshita Sahijwani, Sergey Volokhin, Zihan Wang, ZiHao Wang, Jinho D. Choi
Inspired by studies on the overwhelming presence of experience-sharing in human-human conversations, Emora, the social chatbot developed by Emory University, aims to bring such experience-focused interaction to the current field of conversational AI.
no code implementations • SEMEVAL 2020 • Xiangjue Dong, Jinho D. Choi
This paper presents six document classification models using the latest transformer encoders and a high-performing ensemble model for a task of offensive language identification in social media.
no code implementations • WS 2020 • Han He, Jinho D. Choi
Our results show that models using the multilingual encoder outperform ones using the language specific encoders for most languages.
1 code implementation • SIGDIAL (ACL) 2020 • James D. Finch, Jinho D. Choi
This demo paper presents Emora STDM (State Transition Dialogue Manager), a dialogue system development framework that provides novel workflows for rapid prototyping of chat-based dialogue managers as well as collaborative development of complex interactions.
no code implementations • SIGDIAL (ACL) 2020 • Sarah E. Finch, Jinho D. Choi
As conversational AI-based dialogue management has increasingly become a trending topic, the need for a standardized and reliable evaluation procedure grows even more pressing.
no code implementations • WS 2020 • Tae Hwan Oh, Ji Yoon Han, Hyonsu Choe, Seokwon Park, Han He, Jinho D. Choi, Na-Rae Han, Jena D. Hwang, Hansaem Kim
In this paper, we first raise important issues regarding the Penn Korean Universal Treebank (PKT-UD) and address these issues by revising the entire corpus manually with the aim of producing cleaner UD annotations that are more faithful to Korean grammar.
no code implementations • WS 2020 • Xiangjue Dong, Changmao Li, Jinho D. Choi
We present a transformer-based sarcasm detection model that accounts for the context from the entire conversation thread for more robust predictions.
no code implementations • WS 2020 • Liyan Xu, Julien Hogan, Rachel E. Patzer, Jinho D. Choi
This paper presents a reinforcement learning approach to extract noise in long clinical documents for the task of readmission prediction after kidney transplant.
1 code implementation • ACL 2020 • Changmao Li, Jinho D. Choi
We introduce a novel approach to transformers that learns hierarchical representations in multiparty dialogue.
Ranked #3 on Question Answering on FriendsQA
2 code implementations • 21 Nov 2019 • Hang Jiang, Xianzhe Zhang, Jinho D. Choi
Previous works related to automatic personality recognition focus on using traditional classification models with linguistic features.
no code implementations • 5 Nov 2019 • Xinyi Jiang, Zhengzhe Yang, Jinho D. Choi
We hypothesize that not all dimensions are equally important for downstream tasks so that our algorithm can detect unessential dimensions and discard them without hurting the performance.
no code implementations • 2 Nov 2019 • Changmao Li, Tianhao Liu, Jinho D. Choi
According to our analysis, replacing the random data split with a chronological data split reduces test accuracy on the previous single-variable passage completion task from 72% to 34%, which leaves much more room for improvement.
no code implementations • WS 2019 • Zhengzhe Yang, Jinho D. Choi
This paper presents FriendsQA, a challenging question answering dataset that contains 1,222 dialogues and 10,610 open-domain questions, to tackle machine comprehension on everyday conversations.
1 code implementation • 14 Aug 2019 • Han He, Jinho D. Choi
This paper presents new state-of-the-art models for three tasks, part-of-speech tagging, syntactic parsing, and semantic parsing, using the cutting-edge contextualized embedding framework known as BERT.
no code implementations • WS 2019 • Jinho D. Choi, Mengmei Li, Felicia Goldstein, Ihab Hajjar
This paper presents a new task-oriented meaning representation called meta-semantics that is designed to detect patients with early symptoms of Alzheimer's disease by analyzing their language beyond a syntactic or semantic level.
1 code implementation • 31 May 2019 • Bonggun Shin, Hao Yang, Jinho D. Choi
Recent advances in deep learning have increased the demand for neural models in real applications.
Ranked #2 on Sentiment Analysis on MPQA
no code implementations • 31 May 2019 • Bonggun Shin, Julien Hogan, Andrew B. Adams, Raymond J. Lynch, Rachel E. Patzer, Jinho D. Choi
One of the modalities in EHRs, clinical notes, has not been fully explored for these tasks due to its unstructured and inexplicable nature.
no code implementations • WS 2018 • Hiroshi Kanayama, Na-Rae Han, Masayuki Asahara, Jena D. Hwang, Yusuke Miyao, Jinho D. Choi, Yuji Matsumoto
This paper discusses the representation of coordinate structures in the Universal Dependencies framework for two head-final languages, Japanese and Korean.
2 code implementations • COLING 2018 • Ethan Zhou, Jinho D. Choi
To the best of our knowledge, this is the first time that plural mentions are thoroughly analyzed for these two resolution tasks.
no code implementations • NAACL 2018 • Kaixin Ma, Tomasz Jurczyk, Jinho D. Choi
This paper presents a new corpus and a robust deep learning architecture for a task in reading comprehension, passage completion, on multiparty dialog.
no code implementations • SEMEVAL 2018 • Jinho D. Choi, Henry Y. Chen
Character identification is a task of entity linking that finds the global entity of each personal mention in multiparty dialogue.
no code implementations • 6 Jan 2018 • Tomasz Jurczyk, Amit Deshmane, Jinho D. Choi
This paper gives comprehensive analyses of corpora based on Wikipedia for several tasks in question answering.
no code implementations • WS 2017 • Myungha Jang, Jinho D. Choi, James Allan
We view this problem as an information extraction task and build a multiclass classification model identifying unnatural language components into four categories.
no code implementations • 22 Aug 2017 • Bonggun Shin, Falgun H. Chokshi, Timothy Lee, Jinho D. Choi
The electronic health record (EHR) contains a large amount of multi-dimensional and unstructured clinical data of significant operational and research value.
1 code implementation • 14 Aug 2017 • Sayyed M. Zahiri, Jinho D. Choi
While there have been significant advances in detecting emotions from speech and image recognition, emotion detection on text is still under-explored and remains an active research field.
no code implementations • 7 Aug 2017 • Ali Ahmadvand, Jinho D. Choi
In addition, using ISS-MULT could finely improve the MULT method for question answering tasks, and these improvements prove more significant in the answer triggering task.
no code implementations • CONLL 2017 • Henry Y. Chen, Ethan Zhou, Jinho D. Choi
This paper presents a novel approach to character identification, an entity linking task that maps mentions to characters in dialogues from TV show transcripts.
no code implementations • WS 2017 • Tomasz Jurczyk, Jinho D. Choi
This paper tackles a cross-genre document retrieval task, where the queries are in formal writing and the target documents are in conversational writing.
no code implementations • 16 Mar 2017 • Myungha Jang, Jinho D. Choi, James Allan
We view this problem as an information extraction task and build a multiclass classification model identifying unnatural language components into four categories.
no code implementations • WS 2017 • Bonggun Shin, Timothy Lee, Jinho D. Choi
With the advent of word embeddings, lexicons are no longer fully utilized for sentiment analysis although they still provide important features in the traditional setting.
1 code implementation • 27 Jun 2016 • Tomasz Jurczyk, Michael Zhai, Jinho D. Choi
This paper presents a new selection-based question answering dataset, SelQA.
no code implementations • 4 Apr 2016 • Tomasz Jurczyk, Jinho D. Choi
This paper presents a precursory yet novel approach to the question answering task using structural decomposition.
no code implementations • 4 Aug 2014 • Sandeep Ashwini, Jinho D. Choi
We present a novel approach for recognizing what we call targetable named entities; that is, named entities in a targeted set (e.g., movies, books, TV shows).
no code implementations • WS 2013 • Djamé Seddah, Reut Tsarfaty, Sandra Kübler, Marie Candito, Jinho D. Choi, Richárd Farkas, Jennifer Foster, Iakes Goenaga, Koldo Gojenola Galletebeitia, Yoav Goldberg, Spence Green, Nizar Habash, Marco Kuhlmann, Wolfgang Maier, Joakim Nivre, Adam Przepiórkowski, Ryan Roth, Wolfgang Seeker, Yannick Versley, Veronika Vincze, Marcin Woliński, Alina Wróblewska, Eric Villemonte de la Clergerie
no code implementations • 6 Sep 2013 • Jinho D. Choi
This document gives a brief description of Korean data prepared for the SPMRL 2013 shared task.
no code implementations • LREC 2012 • Ashwini Vaidya, Jinho D. Choi, Martha Palmer, Bhuvana Narasimhan
This paper examines both linguistic behavior and practical implication of empty argument insertion in the Hindi PropBank.