Search Results for author: David Jurgens

Found 65 papers, 24 papers with code

Condolence and Empathy in Online Communities

no code implementations • EMNLP 2020 • Naitian Zhou, David Jurgens

Here, we develop computational tools to create a massive dataset of 11.4M expressions of distress and 2.8M corresponding offerings of condolence in order to examine the dynamics of condolence online.

SemEval-2022 Task 8: Multilingual news article similarity

no code implementations • SemEval (NAACL) 2022 • Xi Chen, Ali Zeynali, Chico Camargo, Fabian Flöck, Devin Gaffney, Przemyslaw Grabowicz, Scott Hale, David Jurgens, Mattia Samory

Thousands of new news articles appear daily in outlets in different languages.

Learning PyTorch Through A Neural Dependency Parsing Exercise

no code implementations • NAACL (TeachingNLP) 2021 • David Jurgens

Dependency parsing is increasingly the popular parsing formalism in practice.

Dependency Parsing

Classification without (Proper) Representation: Political Heterogeneity in Social Media and Its Implications for Classification and Behavioral Analysis

no code implementations • Findings (ACL) 2022 • Kenan Alkiek, Bohan Zhang, David Jurgens

Reddit is home to a broad spectrum of political activity, and users signal their political affiliations in multiple ways—from self-declarations to community participation.

Classification

Learning about Word Vector Representations and Deep Learning through Implementing Word2vec

no code implementations • NAACL (TeachingNLP) 2021 • David Jurgens

Word vector representations are an essential part of an NLP curriculum.

The subtle language of exclusion: Identifying the Toxic Speech of Trans-exclusionary Radical Feminists

no code implementations • NAACL (WOAH) 2022 • Christina Lu, David Jurgens

Toxic language can take many forms, from explicit hate speech to more subtle microaggressions.

Ethics

When it Rains, it Pours: Modeling Media Storms and the News Ecosystem

1 code implementation • 4 Dec 2023 • Benjamin Litterer, David Jurgens, Dallas Card

Most events in the world receive at most brief coverage by the news media.

Aligning with Whom? Large Language Models Have Gender and Racial Biases in Subjective NLP Tasks

1 code implementation • 16 Nov 2023 • Huaman Sun, Jiaxin Pei, MinJe Choi, David Jurgens

We find that for both tasks, model predictions are closer to the labels from White and female participants.

You don't need a personality test to know these models are unreliable: Assessing the Reliability of Large Language Models on Psychometric Instruments

1 code implementation • 16 Nov 2023 • Bangzhao Shu, Lechen Zhang, MinJe Choi, Lavinia Dunagan, Lajanugen Logeswaran, Moontae Lee, Dallas Card, David Jurgens

The versatility of Large Language Models (LLMs) on natural language understanding tasks has made them popular for research in social sciences.

Natural Language Understanding, Negation

Is "A Helpful Assistant" the Best Role for Large Language Models? A Systematic Evaluation of Social Roles in System Prompts

1 code implementation • 16 Nov 2023 • Mingqian Zheng, Jiaxin Pei, David Jurgens

Commercial AI systems commonly define the role of the LLM in system prompts.

Social Meme-ing: Measuring Linguistic Variation in Memes

1 code implementation • 15 Nov 2023 • Naitian Zhou, David Jurgens, David Bamman

Much work in the space of NLP has used computational methods to explore sociolinguistic variation in text.

RCT Rejection Sampling for Causal Estimation Evaluation

1 code implementation • 27 Jul 2023 • Katherine A. Keith, Sergey Feldman, David Jurgens, Jonathan Bragg, Rohit Bhattacharya

We contribute a new sampling algorithm, which we call RCT rejection sampling, and provide theoretical guarantees that causal identification holds in the observational data to allow for valid comparisons to the ground-truth RCT.

Causal Identification

Exploring Linguistic Style Matching in Online Communities: The Role of Social Context and Conversation Dynamics

no code implementations • 6 Jul 2023 • Aparna Ananthasubramaniam, Hong Chen, Jason Yan, Kenan Alkiek, Jiaxin Pei, Agrima Seth, Lavinia Dunagan, MinJe Choi, Benjamin Litterer, David Jurgens

Linguistic style matching (LSM) in conversations can be reflective of several aspects of social influence such as power or persuasion.

Your spouse needs professional help: Determining the Contextual Appropriateness of Messages through Modeling Social Relationships

1 code implementation • 6 Jul 2023 • David Jurgens, Agrima Seth, Jackson Sargent, Athena Aghighi, Michael Geraci

We introduce a new dataset of contextually-situated judgments of appropriateness and show that large language models can readily incorporate relationship information to accurately identify appropriateness in a given context.

When Do Annotator Demographics Matter? Measuring the Influence of Annotator Demographics with the POPQUORN Dataset

1 code implementation • 12 Jun 2023 • Jiaxin Pei, David Jurgens

Further, our work shows that backgrounds not previously considered in NLP (e.g., education) are meaningful and should be considered.

Question Answering

Do LLMs Understand Social Knowledge? Evaluating the Sociability of Large Language Models with SocKET Benchmark

1 code implementation • 24 May 2023 • MinJe Choi, Jiaxin Pei, Sagar Kumar, Chang Shu, David Jurgens

Large language models (LLMs) have been shown to perform well at a variety of syntactic, discourse, and reasoning tasks.

Bridging Nations: Quantifying the Role of Multilinguals in Communication on Social Media

1 code implementation • 7 Apr 2023 • Julia Mendelsohn, Sayan Ghosh, David Jurgens, Ceren Budak

Social media enables the rapid spread of many kinds of information, from memes to social movements.

Causal Inference

POTATO: The Portable Text Annotation Tool

1 code implementation • 16 Dec 2022 • Jiaxin Pei, Aparna Ananthasubramaniam, Xingyao Wang, Naitian Zhou, Jackson Sargent, Apostolos Dedeloudis, David Jurgens

We present POTATO, the Portable text annotation tool, a free, fully open-sourced annotation system that 1) supports labeling many types of text and multimodal data; 2) offers easy-to-configure features to maximize the productivity of both deployers and annotators (convenient templates for common ML/NLP tasks, active learning, keypress shortcuts, keyword highlights, tooltips); and 3) supports a high degree of customization (editable UI, inserting pre-screening questions, attention and qualification tests).

Active Learning, text annotation

264

A Critical Reflection and Forward Perspective on Empathy and Natural Language Processing

no code implementations • 29 Oct 2022 • Allison Lahnala, Charles Welch, David Jurgens, Lucie Flek

We review the state of research on empathy in natural language processing and identify the following issues: (1) empathy definitions are absent or abstract, which (2) leads to low construct validity and reproducibility.

Modeling Information Change in Science Communication with Semantically Matched Paraphrases

no code implementations • 24 Oct 2022 • Dustin Wright, Jiaxin Pei, David Jurgens, Isabelle Augenstein

Whether the media faithfully communicate scientific information has long been a core issue to the science community.

Fact Checking, Retrieval

SemEval 2023 Task 9: Multilingual Tweet Intimacy Analysis

no code implementations • 3 Oct 2022 • Jiaxin Pei, Vítor Silva, Maarten Bos, Yozon Liu, Leonardo Neves, David Jurgens, Francesco Barbieri

We propose MINT, a new Multilingual INTimacy analysis dataset covering 13,372 tweets in 10 languages including English, French, Spanish, Italian, Portuguese, Korean, Dutch, Chinese, Hindi, and Arabic.

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

3 code implementations • 9 Jun 2022 • Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, Ambrose Slone, Ameet Rahane, Anantharaman S. Iyer, Anders Andreassen, Andrea Madotto, Andrea Santilli, Andreas Stuhlmüller, Andrew Dai, Andrew La, Andrew Lampinen, Andy Zou, Angela Jiang, Angelica Chen, Anh Vuong, Animesh Gupta, Anna Gottardi, Antonio Norelli, Anu Venkatesh, Arash Gholamidavoodi, Arfa Tabassum, Arul Menezes, Arun Kirubarajan, Asher Mullokandov, Ashish Sabharwal, Austin Herrick, Avia Efrat, Aykut Erdem, Ayla Karakaş, B. Ryan Roberts, Bao Sheng Loe, Barret Zoph, Bartłomiej Bojanowski, Batuhan Özyurt, Behnam Hedayatnia, Behnam Neyshabur, Benjamin Inden, Benno Stein, Berk Ekmekci, Bill Yuchen Lin, Blake Howald, Bryan Orinion, Cameron Diao, Cameron Dour, Catherine Stinson, Cedrick Argueta, César Ferri Ramírez, Chandan Singh, Charles Rathkopf, Chenlin Meng, Chitta Baral, Chiyu Wu, Chris Callison-Burch, Chris Waites, Christian Voigt, Christopher D. Manning, Christopher Potts, Cindy Ramirez, Clara E. Rivera, Clemencia Siro, Colin Raffel, Courtney Ashcraft, Cristina Garbacea, Damien Sileo, Dan Garrette, Dan Hendrycks, Dan Kilman, Dan Roth, Daniel Freeman, Daniel Khashabi, Daniel Levy, Daniel Moseguí González, Danielle Perszyk, Danny Hernandez, Danqi Chen, Daphne Ippolito, Dar Gilboa, David Dohan, David Drakard, David Jurgens, Debajyoti Datta, Deep Ganguli, Denis Emelin, Denis Kleyko, Deniz Yuret, Derek Chen, Derek Tam, Dieuwke Hupkes, Diganta Misra, Dilyar Buzan, Dimitri Coelho Mollo, Diyi Yang, Dong-Ho Lee, Dylan Schrader, Ekaterina Shutova, Ekin Dogus Cubuk, Elad Segal, Eleanor Hagerman, Elizabeth Barnes, Elizabeth Donoway, Ellie Pavlick, Emanuele Rodola, Emma Lam, Eric Chu, Eric Tang, Erkut Erdem, Ernie Chang, Ethan A. Chi, Ethan Dyer, Ethan Jerzak, Ethan Kim, Eunice Engefu Manyasi, Evgenii Zheltonozhskii, Fanyue Xia, Fatemeh Siar, Fernando Martínez-Plumed, Francesca Happé, Francois Chollet, Frieda Rong, Gaurav Mishra, Genta Indra Winata, Gerard de Melo, Germán Kruszewski, Giambattista Parascandolo, Giorgio Mariani, Gloria Wang, Gonzalo Jaimovitch-López, Gregor Betz, Guy Gur-Ari, Hana Galijasevic, Hannah Kim, Hannah Rashkin, Hannaneh Hajishirzi, Harsh Mehta, Hayden Bogar, Henry Shevlin, Hinrich Schütze, Hiromu Yakura, Hongming Zhang, Hugh Mee Wong, Ian Ng, Isaac Noble, Jaap Jumelet, Jack Geissinger, Jackson Kernion, Jacob Hilton, Jaehoon Lee, Jaime Fernández Fisac, James B. Simon, James Koppel, James Zheng, James Zou, Jan Kocoń, Jana Thompson, Janelle Wingfield, Jared Kaplan, Jarema Radom, Jascha Sohl-Dickstein, Jason Phang, Jason Wei, Jason Yosinski, Jekaterina Novikova, Jelle Bosscher, Jennifer Marsh, Jeremy Kim, Jeroen Taal, Jesse Engel, Jesujoba Alabi, Jiacheng Xu, Jiaming Song, Jillian Tang, Joan Waweru, John Burden, John Miller, John U. Balis, Jonathan Batchelder, Jonathan Berant, Jörg Frohberg, Jos Rozen, Jose Hernandez-Orallo, Joseph Boudeman, Joseph Guerr, Joseph Jones, Joshua B. Tenenbaum, Joshua S. Rule, Joyce Chua, Kamil Kanclerz, Karen Livescu, Karl Krauth, Karthik Gopalakrishnan, Katerina Ignatyeva, Katja Markert, Kaustubh D. Dhole, Kevin Gimpel, Kevin Omondi, Kory Mathewson, Kristen Chiafullo, Ksenia Shkaruta, Kumar Shridhar, Kyle McDonell, Kyle Richardson, Laria Reynolds, Leo Gao, Li Zhang, Liam Dugan, Lianhui Qin, Lidia Contreras-Ochando, Louis-Philippe Morency, Luca Moschella, Lucas Lam, Lucy Noble, Ludwig Schmidt, Luheng He, Luis Oliveros Colón, Luke Metz, Lütfi Kerem Şenel, Maarten Bosma, Maarten Sap, Maartje ter Hoeve, Maheen Farooqi, Manaal Faruqui, Mantas Mazeika, Marco Baturan, Marco Marelli, Marco Maru, Maria Jose Ramírez Quintana, Marie Tolkiehn, Mario Giulianelli, Martha Lewis, Martin Potthast, Matthew L. Leavitt, Matthias Hagen, Mátyás Schubert, Medina Orduna Baitemirova, Melody Arnaud, Melvin McElrath, Michael A. Yee, Michael Cohen, Michael Gu, Michael Ivanitskiy, Michael Starritt, Michael Strube, Michał Swędrowski, Michele Bevilacqua, Michihiro Yasunaga, Mihir Kale, Mike Cain, Mimee Xu, Mirac Suzgun, Mitch Walker, Mo Tiwari, Mohit Bansal, Moin Aminnaseri, Mor Geva, Mozhdeh Gheini, Mukund Varma T, Nanyun Peng, Nathan A. Chi, Nayeon Lee, Neta Gur-Ari Krakover, Nicholas Cameron, Nicholas Roberts, Nick Doiron, Nicole Martinez, Nikita Nangia, Niklas Deckers, Niklas Muennighoff, Nitish Shirish Keskar, Niveditha S. Iyer, Noah Constant, Noah Fiedel, Nuan Wen, Oliver Zhang, Omar Agha, Omar Elbaghdadi, Omer Levy, Owain Evans, Pablo Antonio Moreno Casares, Parth Doshi, Pascale Fung, Paul Pu Liang, Paul Vicol, Pegah Alipoormolabashi, Peiyuan Liao, Percy Liang, Peter Chang, Peter Eckersley, Phu Mon Htut, Pinyu Hwang, Piotr Miłkowski, Piyush Patil, Pouya Pezeshkpour, Priti Oli, Qiaozhu Mei, Qing Lyu, Qinlang Chen, Rabin Banjade, Rachel Etta Rudolph, Raefer Gabriel, Rahel Habacker, Ramon Risco, Raphaël Millière, Rhythm Garg, Richard Barnes, Rif A. Saurous, Riku Arakawa, Robbe Raymaekers, Robert Frank, Rohan Sikand, Roman Novak, Roman Sitelew, Ronan LeBras, Rosanne Liu, Rowan Jacobs, Rui Zhang, Ruslan Salakhutdinov, Ryan Chi, Ryan Lee, Ryan Stovall, Ryan Teehan, Rylan Yang, Sahib Singh, Saif M. Mohammad, Sajant Anand, Sam Dillavou, Sam Shleifer, Sam Wiseman, Samuel Gruetter, Samuel R. Bowman, Samuel S. Schoenholz, Sanghyun Han, Sanjeev Kwatra, Sarah A. Rous, Sarik Ghazarian, Sayan Ghosh, Sean Casey, Sebastian Bischoff, Sebastian Gehrmann, Sebastian Schuster, Sepideh Sadeghi, Shadi Hamdan, Sharon Zhou, Shashank Srivastava, Sherry Shi, Shikhar Singh, Shima Asaadi, Shixiang Shane Gu, Shubh Pachchigar, Shubham Toshniwal, Shyam Upadhyay, Shyamolima Debnath, Siamak Shakeri, Simon Thormeyer, Simone Melzi, Siva Reddy, Sneha Priscilla Makini, Soo-Hwan Lee, Spencer Torene, Sriharsha Hatwar, Stanislas Dehaene, Stefan Divic, Stefano Ermon, Stella Biderman, Stephanie Lin, Stephen Prasad, Steven T. Piantadosi, Stuart M. Shieber, Summer Misherghi, Svetlana Kiritchenko, Swaroop Mishra, Tal Linzen, Tal Schuster, Tao Li, Tao Yu, Tariq Ali, Tatsu Hashimoto, Te-Lin Wu, Théo Desbordes, Theodore Rothschild, Thomas Phan, Tianle Wang, Tiberius Nkinyili, Timo Schick, Timofei Kornev, Titus Tunduny, Tobias Gerstenberg, Trenton Chang, Trishala Neeraj, Tushar Khot, Tyler Shultz, Uri Shaham, Vedant Misra, Vera Demberg, Victoria Nyamai, Vikas Raunak, Vinay Ramasesh, Vinay Uday Prabhu, Vishakh Padmakumar, Vivek Srikumar, William Fedus, William Saunders, William Zhang, Wout Vossen, Xiang Ren, Xiaoyu Tong, Xinran Zhao, Xinyi Wu, Xudong Shen, Yadollah Yaghoobzadeh, Yair Lakretz, Yangqiu Song, Yasaman Bahri, Yejin Choi, Yichi Yang, Yiding Hao, Yifu Chen, Yonatan Belinkov, Yu Hou, Yufang Hou, Yuntao Bai, Zachary Seid, Zhuoye Zhao, Zijian Wang, Zijie J. Wang, ZiRui Wang, Ziyi Wu

BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models.

Common Sense Reasoning, Math

2,650

An Attention-Based Model for Predicting Contextual Informativeness and Curriculum Learning Applications

1 code implementation • 21 Apr 2022 • Sungjin Nam, David Jurgens, Gwen Frishkoff, Kevyn Collins-Thompson

Second, we show how our model identifies key contextual elements in a sentence that are likely to contribute most to a reader's understanding of the target word.

Informativeness, Sentence

ByT5 model for massively multilingual grapheme-to-phoneme conversion

1 code implementation • 6 Apr 2022 • Jian Zhu, Cong Zhang, David Jurgens

In this study, we tackle massively multilingual grapheme-to-phoneme conversion through implementing G2P models based on ByT5.

250

Networks and Identity Drive Geographic Properties of the Diffusion of Linguistic Innovation

no code implementations • 10 Feb 2022 • Aparna Ananthasubramaniam, David Jurgens, Daniel M. Romero

In this study, we show that demographic identity and network topology are both required to model the diffusion of innovation, as they play complementary roles in producing its spatial properties.

Cultural Vocal Bursts Intensity Prediction

Detecting Community Sensitive Norm Violations in Online Conversations

1 code implementation • Findings (EMNLP) 2021 • Chan Young Park, Julia Mendelsohn, Karthik Radhakrishnan, Kinjal Jain, Tushar Kanakagiri, David Jurgens, Yulia Tsvetkov

Online platforms and communities establish their own norms that govern what behavior is acceptable within the community.

Phone-to-audio alignment without text: A Semi-supervised Approach

1 code implementation • 8 Oct 2021 • Jian Zhu, Cong Zhang, David Jurgens

The task of phone-to-audio alignment has many applications in speech research.

Contrastive Learning

251

Measuring Sentence-Level and Aspect-Level (Un)certainty in Science Communications

no code implementations • EMNLP 2021 • Jiaxin Pei, David Jurgens

Here, we introduce a new study of certainty that models both the level and the aspects of certainty in scientific findings.

Sentence

An animated picture says at least a thousand words: Selecting Gif-based Replies in Multimodal Dialog

1 code implementation • Findings (EMNLP) 2021 • Xingyao Wang, David Jurgens

Online conversations include more than just text.

Ranked #1 on Multimodal GIF Dialog on GIF Reply Dataset

Multimodal GIF Dialog

Using Sociolinguistic Variables to Reveal Changing Attitudes Towards Sexuality and Gender

no code implementations • EMNLP 2021 • Sky CH-Wang, David Jurgens

The linguistic choices in each variable allow us to study increased rates of acceptances of gay marriage and gender equality, respectively.

Idiosyncratic but not Arbitrary: Learning Idiolects in Online Registers Reveals Distinctive yet Consistent Individual Styles

1 code implementation • EMNLP 2021 • Jian Zhu, David Jurgens

Furthermore, we provide a description of idiolects through measuring inter- and intra-author variation, showing that variation in idiolects is often distinctive yet consistent.

HamiltonDinggg at SemEval-2021 Task 5: Investigating Toxic Span Detection using RoBERTa Pre-training

no code implementations • SEMEVAL 2021 • Huiyang Ding, David Jurgens

In this paper, we demonstrate our system for detecting toxic spans, which includes expanding the toxic training set with Local Interpretable Model-Agnostic Explanations (LIME), fine-tuning RoBERTa model for detection, and error analysis.

Toxic Spans Detection

MultiCite: Modeling realistic citations requires moving beyond the single-sentence single-label setting

1 code implementation • NAACL 2022 • Anne Lauscher, Brandon Ko, Bailey Kuehl, Sophie Johnson, David Jurgens, Arman Cohan, Kyle Lo

In our work, we address this research gap by proposing a novel framework for CCA as a document-level context extraction and labeling task.

Sentence, text-classification

Detecting Cross-Geographic Biases in Toxicity Modeling on Social Media

no code implementations • WNUT (ACL) 2021 • Sayan Ghosh, Dylan Baker, David Jurgens, Vinodkumar Prabhakaran

Online social media platforms increasingly rely on Natural Language Processing (NLP) techniques to detect abusive content at scale in order to mitigate the harms it causes to their users.

Bias Detection

Modeling Framing in Immigration Discourse on Social Media

1 code implementation • NAACL 2021 • Julia Mendelsohn, Ceren Budak, David Jurgens

The framing of political issues can influence policy and public opinion.

Cultural Vocal Bursts Intensity Prediction

The structure of online social networks modulates the rate of lexical change

1 code implementation • NAACL 2021 • Jian Zhu, David Jurgens

Using Poisson regression and survival analysis, our study demonstrates that the community's network structure plays a significant role in lexical change.

Survival Analysis

Conversations Gone Alright: Quantifying and Predicting Prosocial Outcomes in Online Conversations

1 code implementation • 16 Feb 2021 • Jiajun Bao, Junjie Wu, Yiming Zhang, Eshwar Chandrasekharan, David Jurgens

Online conversations can go in many directions: some turn out poorly due to antisocial behavior, while others turn out positively to the benefit of all.

Audrey: A Personalized Open-Domain Conversational Bot

no code implementations • 11 Nov 2020 • Chung Hoon Hong, Yuan Liang, Sagnik Sinha Roy, Arushi Jain, Vihang Agarwal, Ryan Draves, Zhizhuo Zhou, William Chen, Yujian Liu, Martha Miracky, Lily Ge, Nikola Banovic, David Jurgens

Conversational Intelligence requires that a person engage on informational, personal and relational levels.

Natural Language Understanding

Quantifying Intimacy in Language

no code implementations • EMNLP 2020 • Jiaxin Pei, David Jurgens

Intimacy is a fundamental aspect of how we relate to others in social settings.

Finding Microaggressions in the Wild: A Case for Locating Elusive Phenomena in Social Media Posts

no code implementations • IJCNLP 2019 • Luke Breitfeller, Emily Ahn, David Jurgens, Yulia Tsvetkov

Microaggressions are subtle, often veiled, manifestations of human biases.

Active Learning

Wetin dey with these comments? Modeling Sociolinguistic Factors Affecting Code-switching Behavior in Nigerian Online Discussions

no code implementations • ACL 2019 • Innocent Ndubuisi-Obi, Sayan Ghosh, David Jurgens

Multilingual individuals code switch between languages as a part of a complex communication process.

A Just and Comprehensive Strategy for Using NLP to Address Online Abuse

no code implementations • ACL 2019 • David Jurgens, Eshwar Chandrasekharan, Libby Hemphill

Online abusive behavior affects millions and the NLP community has attempted to mitigate this problem by developing technologies to detect abuse.

Position

Demographic Inference and Representative Population Estimates from Multilingual Social Media Data

no code implementations • 15 May 2019 • Zijian Wang, Scott A. Hale, David Adelani, Przemyslaw A. Grabowicz, Timo Hartmann, Fabian Flöck, David Jurgens

In a large experiment over multilingual heterogeneous European regions, we show that our demographic inference and bias correction together allow for more accurate estimates of populations and make a significant step towards representative social sensing in downstream applications with multilingual social media.

Attribute

Still out there: Modeling and Identifying Russian Troll Accounts on Twitter

2 code implementations • 31 Jan 2019 • Jane Im, Eshwar Chandrasekharan, Jackson Sargent, Paige Lighthammer, Taylor Denby, Ankit Bhargava, Libby Hemphill, David Jurgens, Eric Gilbert

In this work, we: 1) develop machine learning models that predict whether a Twitter account is a Russian troll within a set of 170K control accounts; and, 2) demonstrate that it is possible to use this model to find active accounts on Twitter still likely acting on behalf of the Russian state.

Social and Information Networks, Computers and Society

It's going to be okay: Measuring Access to Support in Online Communities

no code implementations • EMNLP 2018 • Zijian Wang, David Jurgens

People use online platforms to seek out support for their informational and emotional needs.

RtGender: A Corpus for Studying Differential Responses to Gender

no code implementations • LREC 2018 • Rob Voigt, David Jurgens, Vinodkumar Prabhakaran, Dan Jurafsky, Yulia Tsvetkov

Text Generation

Measuring the Evolution of a Scientific Field through Citation Frames

1 code implementation • TACL 2018 • David Jurgens, Srijan Kumar, Raine Hoover, Dan McFarland, Dan Jurafsky

Ranked #4 on Sentence Classification on SciCite

Citation Intent Classification, Sentence Classification

Incorporating Dialectal Variability for Socially Equitable Language Identification

no code implementations • ACL 2017 • David Jurgens, Yulia Tsvetkov, Dan Jurafsky

Language identification (LID) is a critical first step for processing multilingual text.

Language Identification

Citation Classification for Behavioral Analysis of a Scientific Field

no code implementations • 2 Sep 2016 • David Jurgens, Srijan Kumar, Raine Hoover, Dan McFarland, Dan Jurafsky

Citations are an important indicator of the state of a scientific field, reflecting how authors frame their work, and influencing uptake by future scholars.

Classification, General Classification

SemEval-2016 Task 14: Semantic Taxonomy Enrichment

no code implementations • SEMEVAL 2016 • David Jurgens, Mohammad Taher Pilehvar

Information Retrieval, Semantic Textual Similarity

Annotating Characters in Literary Corpora: A Scheme, the CHARLES Tool, and an Annotated Novel

no code implementations • LREC 2016 • Hardik Vala, Stefan Dimitrov, David Jurgens, Andrew Piper, Derek Ruths

To address the latter problem, this work presents three contributions: (1) a comprehensive scheme for manually resolving mentions to characters in texts.