46 papers with code • 4 benchmarks • 6 datasets
Bias detection is the task of detecting and measuring racism, sexism, and other discriminatory behavior in a model (Source: https://stereoset.mit.edu/).
To address these drawbacks, we formalize a method for automating the selection of interesting partial dependence plots (PDPs) and extend PDPs beyond single features to show the model response along arbitrary directions, for example in raw feature space or in a latent space arising from some generative model.
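To make "model response along arbitrary directions" concrete, here is a minimal sketch that evaluates a model along a straight line through feature space starting from one point. It is not the paper's method: it does not do any automated selection of plots, and `directional_dependence`, `toy_model`, and the toy inputs are hypothetical names introduced only for illustration.

```python
import numpy as np

def directional_dependence(model, x0, direction, ts):
    """Evaluate `model` at x0 + t * d for each t in `ts`, where d is the
    unit vector along `direction`. Returns one response per step."""
    x0 = np.asarray(x0, dtype=float)
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)  # normalize so t is a distance in feature space
    steps = np.asarray(ts, dtype=float)
    points = x0[None, :] + steps[:, None] * d[None, :]
    return np.array([model(p) for p in points])

# Hypothetical toy model: a fixed linear function of two features.
def toy_model(x):
    return 2.0 * x[0] - x[1]

ts = np.linspace(-1.0, 1.0, 5)
curve = directional_dependence(toy_model, x0=[0.0, 0.0], direction=[1.0, 1.0], ts=ts)
```

Plotting `curve` against `ts` gives a dependence curve along the diagonal direction rather than along a single feature axis. The same function could be applied to a latent space by passing a `model` that decodes the latent point before predicting, which is an assumption about usage rather than something stated in the source.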
Since pretrained language models are trained on large real-world data, they are known to capture stereotypical biases.
Hence, we also contribute a new, large Swedish bias-labelled dataset (2 million samples), translated from the English version, and train the state-of-the-art mT5 model on it.
However, besides the intrinsic problems with the analogy task as a bias detection tool, in this paper we show that a series of issues in how analogies have been implemented and used might have yielded a distorted picture of bias in word embeddings.
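As an illustration of how an implementation detail can change what an analogy query reports, the sketch below answers "a is to b as c is to ?" by nearest cosine neighbor to b - a + c, with a switch for whether the three query words may be returned. Treating this exclusion rule as one of the issues the snippet refers to is an assumption made here for illustration. The `analogy` function and the four toy vectors are hypothetical, not taken from any real embedding.

```python
import numpy as np

def analogy(emb, a, b, c, exclude_inputs=True):
    """Return the vocabulary word most cosine-similar to b - a + c.
    If exclude_inputs is True, a, b and c are removed from the candidates."""
    unit = {w: v / np.linalg.norm(v) for w, v in emb.items()}
    target = unit[b] - unit[a] + unit[c]
    target = target / np.linalg.norm(target)
    candidates = [w for w in unit if not (exclude_inputs and w in (a, b, c))]
    return max(candidates, key=lambda w: float(unit[w] @ target))

# Hypothetical toy embedding (made-up 3-dimensional vectors).
toy = {
    "man": np.array([1.0, 0.0, 0.1]),
    "woman": np.array([0.0, 1.0, 0.1]),
    "doctor": np.array([0.6, 0.5, 1.0]),
    "nurse": np.array([0.3, 0.8, 1.0]),
}

constrained = analogy(toy, "man", "doctor", "woman", exclude_inputs=True)
unconstrained = analogy(toy, "man", "doctor", "woman", exclude_inputs=False)
```

On this toy data the two settings return different words: the unconstrained query returns one of its own inputs, while the constrained query is forced to pick a different word. A result that looks biased under the constrained setting may therefore partly reflect the exclusion rule rather than the embedding alone.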
Measuring Gender Bias in Word Embeddings across Domains and Discovering New Gender Bias Word Categories
We find that some domains are definitely more prone to gender bias than others, and that the categories of gender bias present also vary for each set of word embeddings.
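One common way to quantify gender bias in a set of word embeddings is to project each word onto a gender direction built from definitional pairs. The sketch below shows that idea only. It is not necessarily the metric used in the paper, and the function names, word pairs, and 2-dimensional vectors are hypothetical.

```python
import numpy as np

def gender_direction(emb, pairs):
    """Unit vector obtained by averaging the differences of (male, female) pairs."""
    diffs = [emb[m] - emb[f] for m, f in pairs]
    d = np.mean(diffs, axis=0)
    return d / np.linalg.norm(d)

def projection_bias(emb, word, direction):
    """Cosine between a word vector and the gender direction.
    Positive values lean toward the male side of the pairs, negative toward the female side."""
    v = emb[word]
    return float(v @ direction / np.linalg.norm(v))

# Hypothetical toy embedding (made-up 2-dimensional vectors).
toy_emb = {
    "he": np.array([1.0, 0.0]),
    "she": np.array([-1.0, 0.0]),
    "engineer": np.array([0.5, 1.0]),
    "teacher": np.array([-0.5, 1.0]),
    "table": np.array([0.0, 1.0]),
}

direction = gender_direction(toy_emb, [("he", "she")])
scores = {w: projection_bias(toy_emb, w, direction) for w in ("engineer", "teacher", "table")}
```

Computing such scores separately for embeddings trained on different corpora would give one way to compare domains, under the assumption that the same definitional pairs are meaningful in each.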
We propose a multilingual method for the extraction of biased sentences from Wikipedia, and use it to create corpora in Bulgarian, French and English.
Predicting the Leading Political Ideology of YouTube Channels Using Acoustic, Textual, and Metadata Information
Our analysis shows that using the acoustic signal improved bias detection by more than 6% absolute over using text and metadata only.
My Approach = Your Apparatus? Entropy-Based Topic Modeling on Multiple Domain-Specific Text Collections
Comparative text mining extends from genre analysis and political bias detection to the revelation of cultural and geographic differences, through to the search for prior art across patents and scientific papers.
Subjective bias detection is critical for applications like propaganda detection, content recommendation, sentiment analysis, and bias neutralization.
Towards explainable classifiers using the counterfactual approach -- global explanations for discovering bias in data
The paper proposes summarized attribution-based post-hoc explanations for the detection and identification of bias in data.