no code implementations • Findings (EMNLP) 2021 • Jesse Dodge, Suchin Gururangan, Dallas Card, Roy Schwartz, Noah A. Smith
We find that the two biased estimators lead to the fewest incorrect conclusions, which hints at the importance of minimizing variance and MSE.
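To make the bias/variance trade-off concrete, here is a small simulation sketch comparing a biased and an unbiased estimator of the expected maximum validation score by bias, variance, and MSE. The score distribution, sample sizes, and the two estimators are illustrative assumptions, not necessarily the setup studied in the paper.

```python
# Illustrative simulation only: the score distribution, sample sizes, and the
# two estimators below are assumptions, not the paper's exact setup.
from itertools import combinations
import numpy as np

rng = np.random.default_rng(0)
n, m, trials = 5, 15, 1000   # draws in the max, observed runs, replications

# Monte Carlo "ground truth" for E[max of n draws] from a hypothetical score distribution
true_val = rng.beta(8, 2, size=(200_000, n)).max(axis=1).mean()

def ecdf_estimator(scores, n):
    """Plug-in estimator via the empirical CDF (with-replacement view); biased."""
    v = np.sort(scores)
    cdf_pow = (np.arange(1, len(v) + 1) / len(v)) ** n
    return float(np.dot(v, np.diff(np.concatenate(([0.0], cdf_pow)))))

def subset_estimator(scores, n):
    """Average of the max over all n-subsets (without replacement); unbiased."""
    return float(np.mean([max(c) for c in combinations(scores, n)]))

estimates = {"ecdf (biased)": [], "subsets (unbiased)": []}
for _ in range(trials):
    sample = rng.beta(8, 2, size=m)
    estimates["ecdf (biased)"].append(ecdf_estimator(sample, n))
    estimates["subsets (unbiased)"].append(subset_estimator(sample, n))

for name, vals in estimates.items():
    vals = np.asarray(vals)
    bias, var = vals.mean() - true_val, vals.var()
    print(f"{name:20s} bias={bias:+.4f}  var={var:.5f}  mse={bias**2 + var:.5f}")
```

Running this shows the usual pattern: the biased plug-in estimator can trade a small systematic error for lower variance, which is the kind of trade-off the finding above points to.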
1 code implementation • 12 Nov 2024 • Benjamin Litterer, David Jurgens, Dallas Card
Podcasts provide highly diverse content to a massive listener base through a unique on-demand modality.
1 code implementation • 19 Jun 2024 • Julia Mendelsohn, Maya Vijan, Dallas Card, Ceren Budak
Social media enables activists to directly communicate with the public and provides a space for movement leaders, participants, bystanders, and opponents to collectively construct and contest narratives.
1 code implementation • 4 Dec 2023 • Benjamin Litterer, David Jurgens, Dallas Card
Most events in the world receive at most brief coverage by the news media.
1 code implementation • 16 Nov 2023 • Bangzhao Shu, Lechen Zhang, MinJe Choi, Lavinia Dunagan, Lajanugen Logeswaran, Moontae Lee, Dallas Card, David Jurgens
The versatility of Large Language Models (LLMs) on natural language understanding tasks has made them popular for research in social sciences.
1 code implementation • 5 Sep 2023 • Dallas Card
Measuring semantic change has thus far remained a task where methods using contextual embeddings have struggled to improve upon simpler techniques relying only on static word vectors.
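For context, a common static-vector baseline of the kind referred to here scores change as cosine distance after aligning embeddings from two time periods with orthogonal Procrustes. The sketch below uses random placeholder vectors and a made-up vocabulary; it is not the contextual-embedding method proposed in the paper.

```python
# Minimal sketch of a standard static-vector baseline (placeholder data; not
# the contextual-embedding method proposed in the paper).
import numpy as np

def procrustes_align(A, B):
    """Orthogonally rotate rows of A into the space of B (shared vocabulary order)."""
    U, _, Vt = np.linalg.svd(A.T @ B)
    return A @ (U @ Vt)

rng = np.random.default_rng(0)
vocab = ["gay", "broadcast", "cell"]            # hypothetical shared vocabulary
emb_t1 = rng.normal(size=(len(vocab), 50))      # placeholder vectors, earlier period
emb_t2 = rng.normal(size=(len(vocab), 50))      # placeholder vectors, later period

aligned = procrustes_align(emb_t1, emb_t2)
for word, a, b in zip(vocab, aligned, emb_t2):
    cosine = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    print(f"{word:10s} change score (1 - cosine) = {1 - cosine:.3f}")
```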
2 code implementations • ACL 2022 • Kaitlyn Zhou, Kawin Ethayarajh, Dallas Card, Dan Jurafsky
Cosine similarity of contextual embeddings is used in many NLP tasks (e.g., QA, IR, MT) and metrics (e.g., BERTScore).
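As a concrete illustration of the quantity in question, the sketch below computes cosine similarity between BERT contextual embeddings of the same word in two sentences. The model choice, example sentences, and single-token lookup are assumptions for illustration only.

```python
# Illustrative only: model, sentences, and the single-token lookup are assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def contextual_vector(sentence, word):
    """Return the final-layer embedding of `word` (assumed to be a single token)."""
    enc = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]   # (seq_len, dim)
    position = enc.input_ids[0].tolist().index(tok.convert_tokens_to_ids(word))
    return hidden[position]

a = contextual_vector("the river bank was muddy", "bank")
b = contextual_vector("she deposited the check at the bank", "bank")
print(torch.cosine_similarity(a, b, dim=0).item())
```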
1 code implementation • Findings (ACL) 2022 • Junshen K. Chen, Dallas Card, Dan Jurafsky
Off-the-shelf models are widely used by computational social science researchers to measure properties of text, such as sentiment.
no code implementations • 25 Jan 2022 • Suchin Gururangan, Dallas Card, Sarah K. Dreier, Emily K. Gade, Leroy Z. Wang, Zeyu Wang, Luke Zettlemoyer, Noah A. Smith
Language models increasingly rely on massive web dumps for diverse text data.
2 code implementations • 16 Aug 2021 • Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, Aditi Raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks.
1 code implementation • NeurIPS 2021 • Abeba Birhane, Pratyusha Kalluri, Dallas Card, William Agnew, Ravit Dotan, Michelle Bao
We present extensive textual evidence and identify key themes in the definitions and operationalization of these values.
1 code implementation • Findings (EMNLP) 2020 • Yiwei Luo, Dallas Card, Dan Jurafsky
We release our stance dataset, model, and lexicons of framing devices for future work on opinion-framing and the automatic detection of GW stance.
1 code implementation • NAACL 2021 • Reid Pryzant, Dallas Card, Dan Jurafsky, Victor Veitch, Dhanya Sridhar
Second, in practice, we only have access to noisy proxies for the linguistic properties of interest -- e.g., predictions from classifiers and lexicons.
2 code implementations • EMNLP 2020 • Dallas Card, Peter Henderson, Urvashi Khandelwal, Robin Jia, Kyle Mahowald, Dan Jurafsky
Despite its importance to experimental design, statistical power (the probability that, given a real effect, an experiment will reject the null hypothesis) has largely been ignored by the NLP community.
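As a rough illustration of the definition in parentheses, the simulation below estimates power for detecting a hypothetical one-point accuracy gap between two systems on 2,000 test examples, using a sign test on their disagreements. The effect size, test-set size, and the independence of the two systems' errors are all assumptions for the example, not numbers from the paper.

```python
# Toy power simulation: accuracies, test-set size, and the independence of the
# two systems' errors are assumptions, not numbers from the paper.
import numpy as np
from scipy.stats import binomtest

rng = np.random.default_rng(0)
n_items, acc_a, acc_b, n_sims = 2000, 0.85, 0.86, 500

rejections = 0
for _ in range(n_sims):
    correct_a = rng.random(n_items) < acc_a        # per-example correctness, system A
    correct_b = rng.random(n_items) < acc_b        # system B (errors assumed independent)
    b_only = int(np.sum(correct_b & ~correct_a))   # B right where A is wrong
    a_only = int(np.sum(correct_a & ~correct_b))
    disagreements = a_only + b_only
    # sign test on disagreements: under H0, either direction is equally likely
    if disagreements and binomtest(b_only, disagreements, 0.5).pvalue < 0.05:
        rejections += 1

print(f"estimated power at alpha=0.05: {rejections / n_sims:.2f}")
```

In this toy setting the estimated power comes out well below the conventional 0.8 target, which is the kind of underpowered comparison the paper warns about.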
no code implementations • 2 Jan 2020 • Dallas Card, Noah A. Smith
In this paper we provide a consequentialist critique of common definitions of fairness within machine learning, as well as a machine learning perspective on consequentialism.
4 code implementations • IJCNLP 2019 • Jesse Dodge, Suchin Gururangan, Dallas Card, Roy Schwartz, Noah A. Smith
Research in natural language processing proceeds, in part, by demonstrating that new models achieve superior performance (e.g., accuracy) on held-out test data, compared to previous results.
no code implementations • ACL 2019 • Maarten Sap, Dallas Card, Saadia Gabriel, Yejin Choi, Noah A. Smith
We investigate how annotators' insensitivity to differences in dialect can lead to racial bias in automatic hate speech detection models, potentially amplifying harm against minority populations.
1 code implementation • ACL 2019 • Suchin Gururangan, Tam Dang, Dallas Card, Noah A. Smith
We accompany this paper with code to pretrain and use VAMPIRE embeddings in downstream tasks.
2 code implementations • 6 Nov 2018 • Dallas Card, Michael Zhang, Noah A. Smith
Recent advances in deep learning have achieved impressive gains in classification accuracy on a variety of types of data, including images and text.
no code implementations • NAACL 2018 • Dallas Card, Noah A. Smith
Estimating label proportions in a target corpus is a type of measurement that is useful for answering certain types of social-scientific questions.
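For concreteness, here is a minimal sketch of two standard baselines for this task, classify-and-count versus averaging predicted probabilities, using made-up classifier outputs; neither is presented here as the paper's method.

```python
# Toy example with simulated classifier outputs; not the estimators from the paper.
import numpy as np

rng = np.random.default_rng(0)
# hypothetical predicted probabilities P(label = 1) on a target corpus
probs = rng.beta(2, 5, size=10_000)

cc = float(np.mean(probs > 0.5))   # classify and count: threshold, then count
pcc = float(probs.mean())          # probabilistic classify and count: average probabilities
print(f"classify-and-count = {cc:.3f}   probabilistic classify-and-count = {pcc:.3f}")
```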
3 code implementations • ACL 2018 • Dallas Card, Chenhao Tan, Noah A. Smith
Most real-world document collections involve various types of metadata, such as author, source, and date, and yet the most commonly-used approaches to modeling text corpora ignore this information.
1 code implementation • ACL 2017 • Chenhao Tan, Dallas Card, Noah A. Smith
Combining two statistics --- cooccurrence within documents and prevalence correlation over time --- our approach reveals a number of different ways in which ideas can cooperate and compete.
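To make the two statistics concrete, the sketch below computes document-level cooccurrence (via PMI, one reasonable choice) and prevalence correlation over time for a single pair of "ideas", using simulated indicators and timestamps; the specifics are illustrative rather than the paper's exact formulation.

```python
# Illustrative computation with simulated data; PMI and Pearson correlation are
# assumed stand-ins for the paper's exact statistics.
import numpy as np

rng = np.random.default_rng(0)
n_docs = 5000
years = rng.integers(2000, 2010, size=n_docs)   # document timestamps
idea_a = rng.random(n_docs) < 0.2               # doc mentions idea A
idea_b = rng.random(n_docs) < 0.3               # doc mentions idea B

# cooccurrence within documents, measured here as pointwise mutual information
p_a, p_b, p_ab = idea_a.mean(), idea_b.mean(), (idea_a & idea_b).mean()
pmi = np.log(p_ab / (p_a * p_b))

# prevalence correlation over time: correlate yearly mention rates
yrs = np.unique(years)
prev_a = np.array([idea_a[years == y].mean() for y in yrs])
prev_b = np.array([idea_b[years == y].mean() for y in yrs])
correlation = np.corrcoef(prev_a, prev_b)[0, 1]

print(f"PMI = {pmi:.3f}   prevalence correlation = {correlation:.3f}")
```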