no code implementations • RANLP 2021 • James R. Hull, Valerie Novak, C. Anton Rytting, Paul Rodrigues, Victor M. Frank, Matthew Swahn
Feature engineering is an important step in classical NLP pipelines, but machine learning engineers may not be aware of the signals to look for when processing foreign-language text.
no code implementations • LREC 2022 • C. Anton Rytting, Valerie Novak, James R. Hull, Victor M. Frank, Paul Rodrigues, Jarrett G. W. Lee, Laurel Miller-Sims
We believe this to be the first publicly available dataset associating demographic and personality trait data with Russian-language social media content, the first paper to describe the collection of Dark Triad scores with texts across multiple Russian-language social media platforms, and, to a limited extent, the first publicly available dataset linking personality traits to author content across several different social media sites.
no code implementations • 12 Jul 2021 • Evan Williams, Paul Rodrigues, Sieu Tran
Twitter training and test data were provided in English, Arabic, Spanish, Turkish, and Bulgarian.
no code implementations • 5 Sep 2020 • Evan Williams, Paul Rodrigues, Valerie Novak
We used BERT and RoBERTa models to identify claims in social media text that a professional fact-checker should review, and to rank them in priority order for the fact-checker.
no code implementations • WS 2012 • Michael Bloodgood, Peng Ye, Paul Rodrigues, David Zajic, David Doermann
We investigate combining methods and show that using random forests is a promising approach.
no code implementations • 29 Oct 2014 • Paul Rodrigues, David Zajic, David Doermann, Michael Bloodgood, Peng Ye
Dictionaries are often developed using tools that save to Extensible Markup Language (XML)-based standards.
no code implementations • 28 Oct 2014 • David Zajic, Michael Maxwell, David Doermann, Paul Rodrigues, Michael Bloodgood
We describe a paradigm for combining manual and automatic error correction of noisy structured lexicographic data.
no code implementations • LREC 2012 • Paul Rodrigues, C. Anton Rytting
Responses that differed from the stimuli were considered typographical or spelling errors and were added to an error corpus.