1 code implementation • 26 Jun 2023 • Doug Beeferman, Nabeel Gillani
Analyzing open-ended survey responses is a crucial yet challenging task for social scientists, non-profit organizations, and educational institutions, as they often face the trade-off between obtaining rich data and the burden of reading and coding textual responses.
no code implementations • 14 Mar 2023 • Nabeel Gillani, Doug Beeferman, Christine Vega-Pourheydarian, Cassandra Overney, Pascal Van Hentenryck, Deb Roy
Most US school districts draw "attendance boundaries" to define catchment areas that assign students to schools near their homes, often recapitulating neighborhood demographic segregation in schools.
1 code implementation • COLING 2022 • Hang Jiang, Doug Beeferman, Brandon Roy, Deb Roy
As political attitudes have diverged ideologically in the United States, political speech has diverged linguistically.
1 code implementation • LREC 2022 • Hang Jiang, Yining Hua, Doug Beeferman, Deb Roy
We release the dataset and make both the Stanza pipeline and BERTweet-based models available "off-the-shelf" for use in future Tweet NLP research.
Ranked #3 on Dependency Parsing on Tweebank
no code implementations • 12 Dec 2021 • Hang Jiang, Doug Beeferman, Weiquan Mao, Deb Roy
TDT systems aim to cluster a corpus of news articles by event, and in that context, stories that describe the same event are likely to have been written at around the same time.
no code implementations • 12 Oct 2021 • Doug Beeferman, Hang Jiang
The essential task of Topic Detection and Tracking (TDT) is to organize a collection of news media into clusters of stories that pertain to the same real-world event.
1 code implementation • 16 Jul 2019 • Doug Beeferman, William Brannon, Deb Roy
We introduce RadioTalk, a corpus of speech recognition transcripts sampled from talk radio broadcasts in the United States between October of 2018 and March of 2019.