1 code implementation • NAACL (NUSE) 2021 • Li Lucy, David Bamman
Using topic modeling and lexicon-based word similarity, we find that stories generated by GPT-3 exhibit many known gender stereotypes.
no code implementations • 15 Oct 2024 • David Bamman, Kent K. Chang, Li Lucy, Naitian Zhou
In this work, we survey the way in which classification is used as a sensemaking practice in cultural analytics, and assess where large language models can fit into this landscape.
1 code implementation • 8 Aug 2024 • Li Lucy, Tal August, Rose E. Wang, Luca Soldaini, Courtney Allison, Kyle Lo
To ensure that math curriculum is grade-appropriate and aligns with critical skills or concepts in accordance with educational standards, pedagogical experts can spend months carefully reviewing published math problems.
1 code implementation • 31 Jan 2024 • Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Harsh Jha, Sachin Kumar, Li Lucy, Xinxi Lyu, Nathan Lambert, Ian Magnusson, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Abhilasha Ravichander, Kyle Richardson, Zejiang Shen, Emma Strubell, Nishant Subramani, Oyvind Tafjord, Pete Walsh, Luke Zettlemoyer, Noah A. Smith, Hannaneh Hajishirzi, Iz Beltagy, Dirk Groeneveld, Jesse Dodge, Kyle Lo
As a result, it is challenging to conduct and advance scientific research on language modeling, such as understanding how training data impacts model capabilities and limitations.
1 code implementation • 12 Jan 2024 • Li Lucy, Suchin Gururangan, Luca Soldaini, Emma Strubell, David Bamman, Lauren F. Klein, Jesse Dodge
The abilities of large language models (LLMs) are drawn from their pretraining data, and model development begins with data curation.
no code implementations • 23 Oct 2023 • Li Lucy, Su Lin Blodgett, Milad Shokouhi, Hanna Wallach, Alexandra Olteanu
Fairness-related assumptions about what constitutes appropriate NLG system behavior range from invariance, where systems are expected to behave identically for social groups, to adaptation, where behaviors should instead vary across them.
1 code implementation • 19 Dec 2022 • Li Lucy, Jesse Dodge, David Bamman, Katherine A. Keith
Scholarly text is often laden with jargon, or specialized language that can facilitate efficient in-group communication within fields but hinder understanding for out-groups.
1 code implementation • 21 Oct 2022 • Li Lucy, Divya Tadimeti, David Bamman
A common paradigm for identifying semantic differences across social and temporal contexts is the use of static word embeddings and their distances.
1 code implementation • 12 Feb 2021 • Li Lucy, David Bamman
Much previous work characterizing language variation across Internet social groups has focused on the types of words used by these groups.
1 code implementation • WS 2019 • Li Lucy, Julia Mendelsohn
We analyze gendered communities defined in three different ways: text, users, and sentiment.
1 code implementation • WS 2017 • Li Lucy, Jon Gauthier
Distributional word representation methods exploit word co-occurrences to build compact vector encodings of words.