The PS-Eval Dataset is a suite of polysemous and monosemous contexts extracted and filtered from the WiC dataset. It aims to evaluate the ability of Sparse Autoencoders (SAEs) to disentangle polysemantic activations into monosemantic features within large language models (LLMs).
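As a rough illustration of the polysemous/monosemous split described above, the sketch below partitions WiC-style sentence pairs by whether the target word keeps the same sense in both contexts. The `WicPair` type, its field names, the `split_by_sense` helper, and the two example records are all hypothetical and invented for this illustration. The actual PS-Eval extraction and filtering criteria are not specified here and may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WicPair:
    """Hypothetical record for one WiC-style example (field names are assumptions)."""
    word: str          # target word shared by both contexts
    context_a: str     # first sentence containing the target word
    context_b: str     # second sentence containing the target word
    same_sense: bool   # True if the word has the same meaning in both contexts


def split_by_sense(pairs):
    """Partition pairs into (polysemous, monosemous) lists by the sense label."""
    polysemous = [p for p in pairs if not p.same_sense]
    monosemous = [p for p in pairs if p.same_sense]
    return polysemous, monosemous


# Invented toy records, not taken from WiC or PS-Eval.
examples = [
    WicPair("bank", "She sat on the bank of the river.",
            "He deposited cash at the bank.", same_sense=False),
    WicPair("run", "I run every morning.",
            "They run along the beach.", same_sense=True),
]

poly_pairs, mono_pairs = split_by_sense(examples)
```

Under these assumptions, an SAE that separates senses well would be expected to activate different features for the target word across the two contexts of a polysemous pair, and overlapping features across a monosemous pair.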