Search Results for author: Luke Gessler

Found 19 papers, 9 papers with code

Midas Loop: A Prioritized Human-in-the-Loop Annotation Tool for Large Scale Multilayer Data

no code implementations · LREC (LAW) 2022 · Luke Gessler, Lauren Levine, Amir Zeldes

Large-scale annotation of rich multilayer corpus data is expensive and time-consuming, motivating approaches that integrate high-quality automatic tools with active learning in order to prioritize human labeling of hard cases.

Active Learning · Management +3
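The prioritization idea in the abstract — route the examples the automatic tools are least confident about to human annotators first — can be sketched as follows. This is a minimal hypothetical illustration, not Midas Loop's actual algorithm; `prioritize` and its inputs are invented for the example.

```python
# Confidence-based prioritization sketch for human-in-the-loop annotation
# (illustrative only, not the paper's implementation): items the automatic
# tagger is least confident about are sent to human annotators first.

def prioritize(items, confidences, budget):
    """Return the `budget` items with the lowest model confidence."""
    ranked = sorted(zip(items, confidences), key=lambda pair: pair[1])
    return [item for item, _ in ranked[:budget]]

queue = prioritize(["tok1", "tok2", "tok3", "tok4"],
                   [0.99, 0.42, 0.87, 0.61],
                   budget=2)
print(queue)  # ['tok2', 'tok4'] — the two lowest-confidence items
```

Under a fixed annotation budget, this concentrates human effort on the hard cases while high-confidence automatic labels are accepted as-is.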

Closing the NLP Gap: Documentary Linguistics and NLP Need a Shared Software Infrastructure

no code implementations · ComputEL (ACL) 2022 · Luke Gessler

For decades, researchers in natural language processing and computational linguistics have been developing models and algorithms that aim to serve the needs of language documentation projects.

Xposition: An Online Multilingual Database of Adpositional Semantics

no code implementations · LREC 2022 · Luke Gessler, Nathan Schneider, Joseph C. Ledford, Austin Blodgett

We present Xposition, an online platform for documenting adpositional semantics across languages in terms of supersenses (Schneider et al., 2018).

TAMS: Translation-Assisted Morphological Segmentation

no code implementations · 21 Mar 2024 · Enora Rice, Ali Marashian, Luke Gessler, Alexis Palmer, Katharina von der Wense

Canonical morphological segmentation is the process of analyzing words into the standard (aka underlying) forms of their constituent morphemes.

Segmentation · Translation
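The distinction the abstract draws — recovering standard underlying morpheme forms rather than merely splitting the surface string — can be shown with a toy example. The lookup table below is a hypothetical stand-in for illustration, not the TAMS model:

```python
# Toy contrast between surface and canonical morphological segmentation.
# Canonical segmentation restores underlying forms (e.g. "flies" -> fly + s),
# whereas a surface split could only yield "flie" + "s".
# The lexicon is an invented example, not the paper's system.

CANONICAL = {
    "flies":   ["fly", "s"],    # surface "flie" restored to underlying "fly"
    "running": ["run", "ing"],  # doubled consonant undone
}

def canonical_segment(word):
    """Return the word's canonical morphemes, or the word itself if unknown."""
    return CANONICAL.get(word, [word])

print(canonical_segment("flies"))  # ['fly', 's']
```

A real canonical segmenter must learn such restorations from data; TAMS additionally uses translations as an auxiliary signal.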

eRST: A Signaled Graph Theory of Discourse Relations and Organization

no code implementations · 20 Mar 2024 · Amir Zeldes, Tatsuya Aoyama, Yang Janet Liu, Siyao Peng, Debopam Das, Luke Gessler

In this article we present Enhanced Rhetorical Structure Theory (eRST), a new theoretical framework for computational discourse analysis, based on an expansion of Rhetorical Structure Theory (RST).

Syntactic Inductive Bias in Transformer Language Models: Especially Helpful for Low-Resource Languages?

1 code implementation · 1 Nov 2023 · Luke Gessler, Nathan Schneider

A line of work on Transformer-based language models such as BERT has attempted to use syntactic inductive bias to enhance the pretraining process, on the theory that building syntactic structure into the training process should reduce the amount of data needed for training.

Inductive Bias

GENTLE: A Genre-Diverse Multilayer Challenge Set for English NLP and Linguistic Evaluation

1 code implementation · 3 Jun 2023 · Tatsuya Aoyama, Shabnam Behzad, Luke Gessler, Lauren Levine, Jessica Lin, Yang Janet Liu, Siyao Peng, Yilun Zhu, Amir Zeldes

We evaluate state-of-the-art NLP systems on GENTLE and find severe performance degradation on at least some genres for every task, indicating GENTLE's utility as an evaluation dataset for NLP systems.

coreference-resolution · Dependency Parsing +2

PrOnto: Language Model Evaluations for 859 Languages

1 code implementation · 22 May 2023 · Luke Gessler

Evaluation datasets are critical resources for measuring the quality of pretrained language models.

Language Modelling

MicroBERT: Effective Training of Low-resource Monolingual BERTs through Parameter Reduction and Multitask Learning

1 code implementation · 23 Dec 2022 · Luke Gessler, Amir Zeldes

Transformer language models (TLMs) are critical for most NLP tasks, but they are difficult to create for low-resource languages because of how much pretraining data they require.

Dependency Parsing · Language Modelling +3
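To see why parameter reduction matters for low-resource pretraining, a back-of-the-envelope parameter count is useful. The formula below is a rough approximation (embeddings plus attention and feed-forward weights, ignoring biases and layer norms), and the "micro" settings are illustrative assumptions, not MicroBERT's exact configuration:

```python
# Rough transformer parameter count: embeddings + per-layer attention
# (4 * h^2 for Q/K/V/output projections) + feed-forward (2 * ff_mult * h^2).
# Biases, layer norms, and position embeddings are ignored for simplicity.

def approx_params(vocab, hidden, layers, ff_mult=4):
    embeddings = vocab * hidden
    per_layer = 4 * hidden * hidden + 2 * ff_mult * hidden * hidden
    return embeddings + layers * per_layer

# BERT-base-like vs. an illustrative "micro" configuration.
base = approx_params(vocab=30_000, hidden=768, layers=12)
micro = approx_params(vocab=8_000, hidden=128, layers=3)
print(f"base ~ {base / 1e6:.0f}M params, micro ~ {micro / 1e6:.1f}M params")
# base ~ 108M params, micro ~ 1.6M params
```

Shrinking hidden size, depth, and vocabulary cuts the model by roughly two orders of magnitude, which is what makes pretraining feasible on small corpora.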

BERT Has Uncommon Sense: Similarity Ranking for Word Sense BERTology

1 code implementation · EMNLP (BlackboxNLP) 2021 · Luke Gessler, Nathan Schneider

An important question concerning contextualized word embedding (CWE) models like BERT is how well they can represent different word senses, especially those in the long tail of uncommon senses.

Retrieval
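The general setup being probed — rank other occurrences of a word by the similarity of their contextualized embeddings to a query occurrence, and check whether the nearest neighbors share its sense — can be sketched with toy vectors. The vectors and labels below are invented stand-ins for real BERT embeddings, and this is the generic nearest-neighbor setup, not the paper's exact evaluation protocol:

```python
# Nearest-neighbor sense ranking with contextualized word embeddings.
# Toy 2-d vectors stand in for BERT embeddings of "bank" in context.
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_by_similarity(query_vec, candidates):
    """candidates: list of (label, vector) pairs; most similar first."""
    return sorted(candidates, key=lambda c: cosine(query_vec, c[1]), reverse=True)

query = [1.0, 0.1]  # hypothetical river-bank context
ranked = rank_by_similarity(query, [
    ("bank_finance", [0.1, 1.0]),
    ("bank_river",   [0.9, 0.2]),
])
print([label for label, _ in ranked])  # ['bank_river', 'bank_finance']
```

If a model represents senses well, occurrences of the same sense should dominate the top of the ranking — the property the paper measures, especially for rare senses.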

AMALGUM -- A Free, Balanced, Multilayer English Web Corpus

1 code implementation · LREC 2020 · Luke Gessler, Siyao Peng, Yang Liu, Yilun Zhu, Shabnam Behzad, Amir Zeldes

We present a freely available, genre-balanced English web corpus totaling 4M tokens and featuring a large number of high-quality automatic annotation layers, including dependency trees, non-named entity annotations, coreference resolution, and discourse trees in Rhetorical Structure Theory.

coreference-resolution

Supervised Grapheme-to-Phoneme Conversion of Orthographic Schwas in Hindi and Punjabi

1 code implementation · ACL 2020 · Aryaman Arora, Luke Gessler, Nathan Schneider

Hindi grapheme-to-phoneme (G2P) conversion is mostly trivial, with one exception: whether a schwa represented in the orthography is pronounced or unpronounced (deleted).
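The best-known heuristic for the phenomenon the abstract describes is that a word-final inherent schwa is deleted in pronunciation. The sketch below is a rule-based baseline in that spirit, operating on transliterated syllables; it is a hypothetical illustration, not the paper's supervised classifier, which handles the much harder word-internal cases:

```python
# Baseline heuristic for Hindi schwa deletion (illustrative only):
# drop the word-final inherent schwa. Input is a list of transliterated
# consonant+vowel units, e.g. Devanagari "kamala" as ['ka', 'ma', 'la'].

def delete_final_schwa(syllables):
    """Return the pronounced form with a word-final inherent schwa dropped."""
    if syllables and syllables[-1].endswith("a"):
        syllables = syllables[:-1] + [syllables[-1][:-1]]
    return "".join(syllables)

print(delete_final_schwa(["ka", "ma", "la"]))  # 'kamal'
```

This baseline gets the final position right but says nothing about medial schwas (e.g. whether the second schwa in a four-syllable word survives), which is where supervised learning pays off.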

A Discourse Signal Annotation System for RST Trees

1 code implementation · WS 2019 · Luke Gessler, Yang Liu, Amir Zeldes

This paper presents a new system for open-ended discourse relation signal annotation in the framework of Rhetorical Structure Theory (RST), implemented on top of an online tool for RST annotation.
