no code implementations • LREC 2012 • Paul Felt, Eric Ringger, Kevin Seppi, Kristian Heal, Robbie Haertel, Deryle Lonsdale
Manual annotation of large textual corpora can be cost-prohibitive, especially for rare and under-resourced languages.
no code implementations • LREC 2014 • Paul Felt, Eric Ringger, Kevin Seppi, Kristian Heal
We describe an under-studied problem in language resource management: that of providing automatic assistance to annotators working in exploratory settings.
no code implementations • LREC 2014 • Paul Felt, Robbie Haertel, Eric Ringger, Kevin Seppi
We introduce MomResp, a model that incorporates information from both natural data clusters and annotations from multiple annotators to infer ground-truth labels and annotator reliability for the document classification task.
no code implementations • LREC 2014 • Kevin Black, Eric Ringger, Paul Felt, Kevin Seppi, Kristian Heal, Deryle Lonsdale
The task of corpus-dictionary linkage (CDL) is to annotate each word in a corpus with a link to an appropriate dictionary entry that documents the sense and usage of the word.
no code implementations • COLING 2016 • Jeffrey Lund, Paul Felt, Kevin Seppi, Eric Ringger
Probabilistic models are a useful means for analyzing large text corpora.
no code implementations • COLING 2016 • Paul Felt, Eric Ringger, Kevin Seppi
In modern text annotation projects, crowdsourced annotations are often aggregated using item response models or by majority vote.
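Majority vote, the simpler of the two aggregation strategies mentioned above, can be sketched in a few lines (this is an illustrative baseline only, not the item response models the paper studies; the example labels are made up):

```python
from collections import Counter

def majority_vote(annotations):
    """Aggregate crowdsourced labels per item by majority vote.

    annotations: dict mapping item id -> list of labels from annotators.
    Ties are broken by first-seen label, since Counter preserves
    insertion order among equally common elements.
    """
    return {item: Counter(labels).most_common(1)[0][0]
            for item, labels in annotations.items()}

votes = {
    "doc1": ["sports", "sports", "politics"],
    "doc2": ["politics", "politics", "politics"],
}
print(majority_vote(votes))  # {'doc1': 'sports', 'doc2': 'politics'}
```

Item response models improve on this by weighting annotators according to their inferred reliability rather than counting every vote equally.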
no code implementations • ACL 2017 • Jeffrey Lund, Connor Cook, Kevin Seppi, Jordan Boyd-Graber
We propose combinations of words as anchors, going beyond existing single-word anchor algorithms, an approach we call "Tandem Anchors".
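The core idea of combining words into one anchor can be sketched as follows. This is a simplified illustration, not the paper's implementation: the tiny co-occurrence matrix is invented, and the element-wise average is just one of several combination functions one might use (the paper explores alternatives such as the harmonic mean):

```python
import numpy as np

# Hypothetical row-normalized word co-occurrence matrix over a toy vocabulary.
vocab = ["ball", "game", "team", "election", "vote"]
Q = np.array([
    [0.40, 0.30, 0.20, 0.05, 0.05],
    [0.30, 0.35, 0.25, 0.05, 0.05],
    [0.20, 0.25, 0.45, 0.05, 0.05],
    [0.05, 0.05, 0.05, 0.50, 0.35],
    [0.05, 0.05, 0.05, 0.35, 0.50],
])

def tandem_anchor(words, combine=np.mean):
    """Combine several words' co-occurrence rows into one pseudo-anchor row."""
    rows = Q[[vocab.index(w) for w in words]]
    return combine(rows, axis=0)

# A single rare word makes a brittle anchor; averaging several related
# words yields a more robust pseudo-word to anchor a "sports" topic.
anchor = tandem_anchor(["ball", "game", "team"])
```

The combined row can then be fed to a standard anchor-words topic recovery algorithm in place of a single word's row.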
no code implementations • COLING 2018 • Paul Felt, Eric Ringger, Jordan Boyd-Graber, Kevin Seppi
Annotated corpora enable supervised machine learning and data analysis.
no code implementations • EMNLP 2018 • Jeffrey Lund, Stephen Cowley, Wilson Fearn, Emily Hales, Kevin Seppi
We propose Labeled Anchors, an interactive and supervised topic model based on the anchor words algorithm (Arora et al., 2013).
no code implementations • 23 Oct 2018 • Brandon Schoenfeld, Christophe Giraud-Carrier, Mason Poggemann, Jarom Christensen, Kevin Seppi
Much of the work in metalearning has focused on classifier selection, combined more recently with hyperparameter optimization, with little concern for data preprocessing.
no code implementations • ACL 2019 • Jeffrey Lund, Piper Armstrong, Wilson Fearn, Stephen Cowley, Courtni Byun, Jordan Boyd-Graber, Kevin Seppi
Topic models are typically evaluated with respect to the global topic distributions that they generate, using metrics such as coherence, but without regard to local (token-level) topic assignments.
no code implementations • NAACL 2019 • Jeffrey Lund, Piper Armstrong, Wilson Fearn, Stephen Cowley, Emily Hales, Kevin Seppi
Cross-referencing, which links passages of text to other related passages, can be a valuable study aid for facilitating comprehension of a text.
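The task itself can be illustrated with a naive baseline: link each passage to its nearest neighbor by cosine similarity over bag-of-words counts. This sketch is only meant to make the task concrete; it is not the paper's model, and the sample passages are arbitrary:

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words count vector for a passage."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

passages = [
    "love thy neighbor as thyself",
    "thou shalt love the lord thy god",
    "render unto caesar what is caesar's",
]
vecs = [bow(p) for p in passages]
# Link each passage to its most similar other passage.
links = {
    i: max((k for k in range(len(vecs)) if k != i),
           key=lambda k: cosine(vecs[i], vecs[k]))
    for i in range(len(vecs))
}
```

Lexical overlap alone misses paraphrased or thematically related passages, which is why learned models are needed for this task.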
no code implementations • ACL 2019 • Varun Kumar, Alison Smith-Renner, Leah Findlater, Kevin Seppi, Jordan Boyd-Graber
To address the lack of comparative evaluation of Human-in-the-Loop Topic Modeling (HLTM) systems, we implement and evaluate three contrasting HLTM approaches using simulation experiments.
2 code implementations • IJCNLP 2019 • Orion Weller, Kevin Seppi
These experiments show that this method outperforms all previous work done on these tasks, with an F-measure of 93.1% for the Puns dataset and 98.6% on the Short Jokes dataset.
no code implementations • LREC 2020 • Orion Weller, Kevin Seppi
We also introduce this dataset as a task for future work, where models learn to predict the level of humor in a joke.
no code implementations • ACL 2020 • Orion Weller, Jordan Hildebrandt, Ilya Reznik, Christopher Challis, E. Shannon Tass, Quinn Snell, Kevin Seppi
Predicting reading time has been a subject of much previous work, focusing on how different words affect human processing, as measured by reading time.
no code implementations • WS 2020 • Orion Weller, Nancy Fulda, Kevin Seppi
Understanding and identifying humor has been increasingly popular, as seen by the number of datasets created to study humor.
1 code implementation • NAACL 2021 • Wilson Fearn, Orion Weller, Kevin Seppi
Text classification is a significant branch of natural language processing, and has many applications including document classification and sentiment analysis.
1 code implementation • ACL 2022 • Orion Weller, Kevin Seppi, Matt Gardner
We find that there is a simple heuristic for when to use one of these techniques over the other: pairwise MTL is better than STILTs when the target task has fewer instances than the supporting task and vice versa.
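The heuristic stated above is simple enough to express directly in code (the function name and example counts are ours, for illustration):

```python
def choose_transfer_strategy(n_target, n_support):
    """Pick a transfer-learning setup per the paper's stated heuristic:
    pairwise multi-task learning (MTL) when the target task has fewer
    training instances than the supporting task, STILTs otherwise."""
    return "pairwise MTL" if n_target < n_support else "STILTs"

print(choose_transfer_strategy(2_000, 100_000))   # small target task
print(choose_transfer_strategy(100_000, 2_000))   # large target task
```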