Search Results for author: Mohit Sudhakar

Found 1 paper, 0 papers with code

Simple Text Detoxification by Identifying a Linear Toxic Subspace in Language Model Embeddings

no code implementations • 15 Dec 2021 • Andrew Wang, Mohit Sudhakar, Yangfeng Ji

We hypothesize the existence of a low-dimensional toxic subspace in the latent space of pre-trained language models, the existence of which suggests that toxic features follow some underlying pattern and are thus removable.

Abusive Language, Language Modelling, +1
