Bias detection is the task of detecting and measuring racism, sexism, and other discriminatory behavior in a model (Source: https://stereoset.mit.edu/)
The package includes a suite of bias mitigation methods that aim to reduce discriminatory behavior in the model.
Since pretrained language models are trained on large-scale real-world data, they are known to capture stereotypical biases.
Ranked #1 on Bias Detection on StereoSet
The paper proposes summarized attribution-based post-hoc explanations for the detection and identification of bias in data.
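The snippet above mentions attribution-based post-hoc explanations for bias detection but gives no details. A minimal sketch of the general idea follows: for a linear scorer, per-feature attributions are simply weight times input (input-times-gradient), and averaging them over a dataset can surface demographic terms with outsized influence. The vocabulary, weights, and threshold here are all hypothetical toy values, not the paper's method.

```python
import numpy as np

# Toy vocabulary and hypothetical learned weights for a linear scorer.
vocab = ["engineer", "nurse", "he", "she", "the"]
weights = np.array([0.9, -0.7, 0.6, -0.6, 0.0])

def attributions(x):
    # For a linear model, input-times-gradient reduces to weight * feature.
    return weights * x

def summarize(X):
    """Mean per-feature attribution over a dataset.

    Large-magnitude entries for demographic terms ("he"/"she") suggest
    the model's score depends on them, flagging a potential bias.
    """
    return np.mean([attributions(x) for x in X], axis=0)

# Two bag-of-words examples: "the engineer ... he" and "the nurse ... she".
X = np.array([[1, 0, 1, 0, 1],
              [0, 1, 0, 1, 1]], dtype=float)

summary = summarize(X)
flagged = [w for w, a in zip(vocab, summary) if abs(a) > 0.25]
print(flagged)  # includes the gendered pronouns alongside the occupations
```

In practice the same aggregation step applies to attributions from any post-hoc explainer (gradients, SHAP, etc.), not just a linear model.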
We find that some domains are markedly more prone to gender bias than others, and that the categories of gender bias present also vary across the sets of word embeddings.
Subjective bias detection is critical for applications like propaganda detection, content recommendation, sentiment analysis, and bias neutralization.
Ranked #1 on Bias Detection on Wiki Neutrality Corpus
To address these drawbacks, we formalize a method for automating the selection of interesting PDPs and extend PDPs beyond showing single features to show the model response along arbitrary directions, for example in raw feature space or a latent space arising from some generative model.
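Extending a partial dependence plot (PDP) beyond a single feature to an arbitrary direction can be sketched as follows: instead of fixing one feature on a grid, shift every example along a chosen direction vector and average the model's predictions at each step. The model, data, and grid below are toy assumptions for illustration, not the paper's implementation.

```python
import numpy as np

# Hypothetical model: any callable mapping an (n, d) feature matrix to scores.
def model(X):
    return X[:, 0] ** 2 + 0.5 * X[:, 1]

def directional_pdp(model, X, direction, grid=np.linspace(-2, 2, 21)):
    """Partial dependence of `model` along an arbitrary direction.

    For each step t on the grid, every row of X is shifted by t * direction
    and the predictions are averaged. With a one-hot `direction` this
    recovers the flavor of a classic single-feature PDP; any other vector
    (e.g. a latent-space direction from a generative model) works the same way.
    """
    direction = np.asarray(direction, dtype=float)
    direction = direction / np.linalg.norm(direction)  # unit-length direction
    return np.array([model(X + t * direction).mean() for t in grid])

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
pd_curve = directional_pdp(model, X, direction=[1.0, 0.0])
print(pd_curve.shape)  # one averaged prediction per grid point
```

Since the toy model is quadratic in the first feature, the curve along that axis is convex: the endpoints of the grid sit well above the values near zero shift.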
Our analysis shows that the use of acoustic signal helped to improve bias detection by more than 6% absolute over using text and metadata only.
We propose a multilingual method for the extraction of biased sentences from Wikipedia, and use it to create corpora in Bulgarian, French and English.