no code implementations • LREC 2012 • Paul Felt, Eric Ringger, Kevin Seppi, Kristian Heal, Robbie Haertel, Deryle Lonsdale
Manual annotation of large textual corpora can be cost-prohibitive, especially for rare and under-resourced languages.
no code implementations • LREC 2014 • Paul Felt, Eric Ringger, Kevin Seppi, Kristian Heal
We describe an under-studied problem in language resource management: that of providing automatic assistance to annotators working in exploratory settings.
no code implementations • LREC 2014 • Paul Felt, Robbie Haertel, Eric Ringger, Kevin Seppi
We introduce MomResp, a model that incorporates information from both natural data clusters and annotations from multiple annotators to infer ground-truth labels and annotator reliability for the document classification task.
no code implementations • LREC 2014 • Kevin Black, Eric Ringger, Paul Felt, Kevin Seppi, Kristian Heal, Deryle Lonsdale
The task of corpus-dictionary linkage (CDL) is to annotate each word in a corpus with a link to an appropriate dictionary entry that documents the sense and usage of the word.
no code implementations • COLING 2016 • Jeffrey Lund, Paul Felt, Kevin Seppi, Eric Ringger
Probabilistic models are a useful means for analyzing large text corpora.
no code implementations • COLING 2016 • Paul Felt, Eric Ringger, Kevin Seppi
In modern text annotation projects, crowdsourced annotations are often aggregated using item response models or by majority vote.
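Majority vote, the simpler of the two aggregation strategies mentioned above, can be sketched in a few lines (this is an illustrative baseline only, not the item response models the paper studies; the example labels are made up):

```python
from collections import Counter

def majority_vote(annotations):
    """Aggregate crowdsourced labels per item by majority vote.

    annotations: dict mapping item id -> list of labels from annotators.
    Ties are broken by first-seen label, since Counter preserves
    insertion order among equally common elements.
    """
    return {item: Counter(labels).most_common(1)[0][0]
            for item, labels in annotations.items()}

votes = {
    "doc1": ["sports", "sports", "politics"],
    "doc2": ["politics", "politics", "politics"],
}
print(majority_vote(votes))  # {'doc1': 'sports', 'doc2': 'politics'}
```

Item response models improve on this by weighting annotators according to their inferred reliability rather than counting every vote equally.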
no code implementations • ACL 2017 • Jeffrey Lund, Connor Cook, Kevin Seppi, Jordan Boyd-Graber
We propose combinations of words as anchors, going beyond existing single-word anchor algorithms, an approach we call "Tandem Anchors".
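The core idea of combining words into one anchor can be sketched as follows. This is a simplified illustration, not the paper's implementation: the tiny co-occurrence matrix is invented, and the element-wise average is just one of several combination functions one might use (the paper explores alternatives such as the harmonic mean):

```python
import numpy as np

# Hypothetical row-normalized word co-occurrence matrix over a toy vocabulary.
vocab = ["ball", "game", "team", "election", "vote"]
Q = np.array([
    [0.40, 0.30, 0.20, 0.05, 0.05],
    [0.30, 0.35, 0.25, 0.05, 0.05],
    [0.20, 0.25, 0.45, 0.05, 0.05],
    [0.05, 0.05, 0.05, 0.50, 0.35],
    [0.05, 0.05, 0.05, 0.35, 0.50],
])

def tandem_anchor(words, combine=np.mean):
    """Combine several words' co-occurrence rows into one pseudo-anchor row."""
    rows = Q[[vocab.index(w) for w in words]]
    return combine(rows, axis=0)

# A single rare word makes a brittle anchor; averaging several related
# words yields a more robust pseudo-word to anchor a "sports" topic.
anchor = tandem_anchor(["ball", "game", "team"])
```

The combined row can then be fed to a standard anchor-words topic recovery algorithm in place of a single word's row.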
no code implementations • COLING 2018 • Paul Felt, Eric Ringger, Jordan Boyd-Graber, Kevin Seppi
Annotated corpora enable supervised machine learning and data analysis.
no code implementations • EMNLP 2018 • Jeffrey Lund, Stephen Cowley, Wilson Fearn, Emily Hales, Kevin Seppi
We propose Labeled Anchors, an interactive and supervised topic model based on the anchor words algorithm (Arora et al., 2013).
no code implementations • 23 Oct 2018 • Brandon Schoenfeld, Christophe Giraud-Carrier, Mason Poggemann, Jarom Christensen, Kevin Seppi
Much of the work in metalearning has focused on classifier selection, combined more recently with hyperparameter optimization, with little concern for data preprocessing.
no code implementations • ACL 2019 • Jeffrey Lund, Piper Armstrong, Wilson Fearn, Stephen Cowley, Courtni Byun, Jordan Boyd-Graber, Kevin Seppi
Topic models are typically evaluated with respect to the global topic distributions that they generate, using metrics such as coherence, but without regard to local (token-level) topic assignments.
no code implementations • NAACL 2019 • Jeffrey Lund, Piper Armstrong, Wilson Fearn, Stephen Cowley, Emily Hales, Kevin Seppi
Cross-referencing, which links passages of text to other related passages, can be a valuable study aid for facilitating comprehension of a text.
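The task itself can be illustrated with a naive baseline: link each passage to its nearest neighbor by cosine similarity over bag-of-words counts. This sketch is only meant to make the task concrete; it is not the paper's model, and the sample passages are arbitrary:

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words count vector for a passage."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

passages = [
    "love thy neighbor as thyself",
    "thou shalt love the lord thy god",
    "render unto caesar what is caesar's",
]
vecs = [bow(p) for p in passages]
# Link each passage to its most similar other passage.
links = {
    i: max((k for k in range(len(vecs)) if k != i),
           key=lambda k: cosine(vecs[i], vecs[k]))
    for i in range(len(vecs))
}
```

Lexical overlap alone misses paraphrased or thematically related passages, which is why learned models are needed for this task.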
no code implementations • ACL 2019 • Varun Kumar, Alison Smith-Renner, Leah Findlater, Kevin Seppi, Jordan Boyd-Graber
To address the lack of comparative evaluation of Human-in-the-Loop Topic Modeling (HLTM) systems, we implement and evaluate three contrasting HLTM approaches using simulation experiments.
2 code implementations • IJCNLP 2019 • Orion Weller, Kevin Seppi
These experiments show that this method outperforms all previous work done on these tasks, with an F-measure of 93.1% for the Puns dataset and 98.6% on the Short Jokes dataset.
no code implementations • LREC 2020 • Orion Weller, Kevin Seppi
We also introduce this dataset as a task for future work, where models learn to predict the level of humor in a joke.
no code implementations • ACL 2020 • Orion Weller, Jordan Hildebrandt, Ilya Reznik, Christopher Challis, E. Shannon Tass, Quinn Snell, Kevin Seppi
Predicting reading time has been a subject of much previous work, focusing on how different words affect human processing, as measured by reading time.
no code implementations • WS 2020 • Orion Weller, Nancy Fulda, Kevin Seppi
Understanding and identifying humor has been increasingly popular, as seen by the number of datasets created to study humor.
1 code implementation • NAACL 2021 • Wilson Fearn, Orion Weller, Kevin Seppi
Text classification is a significant branch of natural language processing, and has many applications including document classification and sentiment analysis.
1 code implementation • ACL 2022 • Orion Weller, Kevin Seppi, Matt Gardner
We find that there is a simple heuristic for when to use one of these techniques over the other: pairwise MTL is better than STILTs when the target task has fewer instances than the supporting task and vice versa.
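The heuristic stated above is simple enough to express directly in code (the function name and example counts are ours, for illustration):

```python
def choose_transfer_strategy(n_target, n_support):
    """Pick a transfer-learning setup per the paper's stated heuristic:
    pairwise multi-task learning (MTL) when the target task has fewer
    training instances than the supporting task, STILTs otherwise."""
    return "pairwise MTL" if n_target < n_support else "STILTs"

print(choose_transfer_strategy(2_000, 100_000))   # small target task
print(choose_transfer_strategy(100_000, 2_000))   # large target task
```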