no code implementations • ComputEL (ACL) 2022 • Luke Gessler
For decades, researchers in natural language processing and computational linguistics have been developing models and algorithms that aim to serve the needs of language documentation projects.
no code implementations • LREC 2022 • Luke Gessler, Nathan Schneider, Joseph C. Ledford, Austin Blodgett
We present Xposition, an online platform for documenting adpositional semantics across languages in terms of supersenses (Schneider et al., 2018).
no code implementations • LREC (LAW) 2022 • Luke Gessler, Lauren Levine, Amir Zeldes
Large-scale annotation of rich multilayer corpus data is expensive and time-consuming, motivating approaches that integrate high-quality automatic tools with active learning in order to prioritize human labeling of hard cases.
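As a generic illustration of the active-learning idea mentioned above (not the paper's actual pipeline), the sketch below ranks unlabeled items by the entropy of a model's predicted label distribution so that the most uncertain cases are routed to human annotators first; the select_for_annotation helper and the toy probabilities are assumptions chosen only for this example.

    # Generic uncertainty-sampling sketch (an illustration, not the paper's system):
    # rank unlabeled items by the entropy of a model's label distribution and send
    # the most uncertain ones to human annotators first.
    import numpy as np

    def entropy(probs: np.ndarray) -> np.ndarray:
        """Shannon entropy of each row of an (n_items, n_labels) probability matrix."""
        return -(probs * np.log(probs + 1e-12)).sum(axis=1)

    def select_for_annotation(probs: np.ndarray, budget: int) -> np.ndarray:
        """Return indices of the `budget` items the model is least certain about."""
        return np.argsort(-entropy(probs))[:budget]

    # Toy example: three unlabeled items, three candidate labels.
    probs = np.array([
        [0.98, 0.01, 0.01],   # confident -> keep the automatic label
        [0.40, 0.35, 0.25],   # uncertain -> prioritize for human review
        [0.70, 0.20, 0.10],
    ])
    print(select_for_annotation(probs, budget=1))  # [1]

In this framing, confident automatic predictions are accepted as-is, and the annotation budget is spent where the model is least sure.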
no code implementations • 3 Jun 2023 • Tatsuya Aoyama, Shabnam Behzad, Luke Gessler, Lauren Levine, Jessica Lin, Yang Janet Liu, Siyao Peng, Yilun Zhu, Amir Zeldes
We evaluate state-of-the-art NLP systems on GENTLE and find that their performance degrades severely on at least some genres for every task, which indicates GENTLE's utility as an evaluation dataset for NLP systems.
1 code implementation • 22 May 2023 • Luke Gessler
Evaluation datasets are critical resources for measuring the quality of pretrained language models.
1 code implementation • 23 Dec 2022 • Luke Gessler, Amir Zeldes
Transformer language models (TLMs) are critical for most NLP tasks, but they are difficult to create for low-resource languages because of the large amount of pretraining data they require.
1 code implementation • EMNLP (BlackboxNLP) 2021 • Luke Gessler, Nathan Schneider
An important question concerning contextualized word embedding (CWE) models like BERT is how well they can represent different word senses, especially those in the long tail of uncommon senses.
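To make the notion of a contextualized word embedding concrete, here is a minimal sketch (not the paper's evaluation method) of pulling a single word occurrence's BERT vector with the Hugging Face transformers library; the sentence, the target word, and the mean-pooling of subword pieces are assumptions chosen just for the example.

    # Minimal sketch (not the paper's method): extract one contextualized word
    # embedding from BERT with Hugging Face transformers, mean-pooling subword pieces.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    sentence = "She sat on the bank of the river."
    words = sentence.split()
    target_idx = words.index("bank")

    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]           # (seq_len, 768)

    # word_ids() maps each subword position back to its source word.
    piece_idxs = [i for i, w in enumerate(enc.word_ids()) if w == target_idx]
    embedding = hidden[piece_idxs].mean(dim=0)                # (768,)
    print(embedding.shape)

Comparing such vectors for the same word form in different contexts is one common way to probe how a CWE model separates word senses.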
1 code implementation • EMNLP (DISRPT) 2021 • Luke Gessler, Shabnam Behzad, Yang Janet Liu, Siyao Peng, Yilun Zhu, Amir Zeldes
This paper describes our submission to the DISRPT2021 Shared Task on Discourse Unit Segmentation, Connective Detection, and Relation Classification.
no code implementations • COLING (LAW) 2020 • Luke Gessler, Shira Wein, Nathan Schneider
Prepositional supersense annotation is time-consuming and requires expert training.
1 code implementation • LREC 2020 • Luke Gessler, Siyao Peng, Yang Liu, Yilun Zhu, Shabnam Behzad, Amir Zeldes
We present a freely available, genre-balanced English web corpus totaling 4M tokens and featuring a large number of high-quality automatic annotation layers, including dependency trees, non-named entity annotations, coreference resolution, and discourse trees in Rhetorical Structure Theory.
no code implementations • LREC 2020 • Graham Neubig, Shruti Rijhwani, Alexis Palmer, Jordan MacKenzie, Hilaria Cruz, Xinjian Li, Matthew Lee, Aditi Chaudhary, Luke Gessler, Steven Abney, Shirley Anugrah Hayati, Antonios Anastasopoulos, Olga Zamaraeva, Emily Prud'hommeaux, Jennette Child, Sara Child, Rebecca Knowles, Sarah Moeller, Jeffrey Micher, Yiyuan Li, Sydney Zink, Mengzhou Xia, Roshan S Sharma, Patrick Littell
Despite recent advances in natural language processing and other language technology, the application of such technology to language documentation and conservation has been limited.
1 code implementation • ACL 2020 • Aryaman Arora, Luke Gessler, Nathan Schneider
Hindi grapheme-to-phoneme (G2P) conversion is mostly trivial, with one exception: whether a schwa represented in the orthography is pronounced or unpronounced (deleted).
no code implementations • WS 2019 • Mitchell Abrams, Luke Gessler, Matthew Marge
We present B. Rex, a dialogue agent for book recommendations.
1 code implementation • WS 2019 • Luke Gessler, Yang Liu, Amir Zeldes
This paper presents a new system for open-ended discourse relation signal annotation in the framework of Rhetorical Structure Theory (RST), implemented on top of an online tool for RST annotation.