1 code implementation • EACL (AdaptNLP) 2021 • Xiaolei Huang, Michael J. Paul, Robin Burke, Franck Dernoncourt, Mark Dredze
In this study, we treat user interests as domains and empirically examine how user language varies across this user factor in three English social media datasets.
no code implementations • ACL 2020 • Mozhi Zhang, Yoshinari Fujinuma, Michael J. Paul, Jordan Boyd-Graber
Cross-lingual word embeddings (CLWE) are often evaluated on bilingual lexicon induction (BLI).
2 code implementations • LREC 2020 • Xiaolei Huang, Linzi Xing, Franck Dernoncourt, Michael J. Paul
Existing research on fairness evaluation of document classification models mainly uses synthetic monolingual data without ground truth for author demographic attributes.
1 code implementation • IJCNLP 2019 • Linzi Xing, Michael J. Paul, Giuseppe Carenini
Probabilistic topic models such as latent Dirichlet allocation (LDA) are popularly used with Bayesian inference methods such as Gibbs sampling to learn posterior distributions over topic model parameters.
no code implementations • WS 2019 • Davy Weissenbacher, Abeed Sarker, Arjun Magge, Ashlynn Daughton, Karen O'Connor, Michael J. Paul, Graciela Gonzalez-Hernandez
We present the Social Media Mining for Health Shared Tasks, co-located with ACL 2019 in Florence, which address these challenges for health monitoring and surveillance using state-of-the-art techniques for processing noisy, real-world, and substantially creative language expressions from social media users.
1 code implementation • ACL 2019 • Xiaolei Huang, Michael J. Paul
Language usage can change across periods of time, but document classifier models are usually trained and tested on corpora spanning multiple years without considering temporal variations.
1 code implementation • ACL 2019 • Yoshinari Fujinuma, Jordan Boyd-Graber, Michael J. Paul
Cross-lingual word embeddings encode the meaning of words from different languages into a shared low-dimensional space.
1 code implementation • SEMEVAL 2019 • Xiaolei Huang, Michael J. Paul
Language use varies across different demographic factors, such as gender, age, and geographic location.
no code implementations • NAACL 2019 • Shudong Hao, Michael J. Paul
We introduce a theoretical analysis of crosslingual transfer in probabilistic topic models.
no code implementations • CL 2020 • Shudong Hao, Michael J. Paul
Probabilistic topic modeling is a popular choice as the first step of crosslingual tasks to enable knowledge transfer and extract multilingual features.
no code implementations • WS 2018 • Davy Weissenbacher, Abeed Sarker, Michael J. Paul, Graciela Gonzalez-Hernandez
The goals of the SMM4H shared tasks are to release annotated, health-related social media datasets to the research community and to compare the performance of natural language processing and machine learning systems on tasks involving these datasets.
no code implementations • COLING 2018 • Shudong Hao, Michael J. Paul
Multilingual topic models enable crosslingual tasks by extracting consistent topics from multilingual corpora.
1 code implementation • ACL 2018 • Xiaolei Huang, Michael J. Paul
Many corpora span broad periods of time.
no code implementations • 11 Jun 2018 • Shudong Hao, Michael J. Paul
Multilingual topic models enable crosslingual tasks by extracting consistent topics from multilingual corpora.
no code implementations • NAACL 2018 • Shudong Hao, Jordan Boyd-Graber, Michael J. Paul
Multilingual topic models enable document analysis across languages through coherent multilingual summaries of the data.
no code implementations • WS 2017 • Linzi Xing, Michael J. Paul
Low-dimensional vector representations of social media users can benefit applications like recommendation systems and user attribute inference.
no code implementations • CONLL 2017 • Michael J. Paul
This paper proposes a matching technique for learning causal associations between word features and class labels in document classification.
no code implementations • TACL 2015 • Michael J. Paul, Mark Dredze
We introduce Sprite, a family of topic models that incorporates structure into model priors as a function of underlying components.