no code implementations • 29 Feb 2024 • Karina Halevy, Anna Sotnikova, Badr AlKhamissi, Syrielle Montariol, Antoine Bosselut
We introduce a novel benchmark dataset, Seesaw-CF, for measuring bias-related harms of model editing and conduct the first in-depth investigation of how different weight-editing methods impact model bias.
1 code implementation • 12 Dec 2023 • Yang Trista Cao, Anna Sotnikova, Jieyu Zhao, Linda X. Zou, Rachel Rudinger, Hal Daumé III
We evaluate human stereotypes and stereotypical associations manifested in multilingual large language models such as mBERT, mT5, and ChatGPT.
no code implementations • 8 Aug 2022 • S. Travis Waller, Moeid Qurashi, Anna Sotnikova, Lavina Karva, Sai Chand
This paper examines the impact of the ongoing disruption on traffic behavior using analytics as well as zonal-based network models.
1 code implementation • NAACL 2022 • Yang Trista Cao, Anna Sotnikova, Hal Daumé III, Rachel Rudinger, Linda Zou
NLP models trained on text have been shown to reproduce human stereotypes, which can magnify harms to marginalized groups when systems are deployed at scale.