no code implementations • 19 Apr 2024 • Qinyuan Wu, Mohammad Aflah Khan, Soumi Das, Vedant Nanda, Bishwamittra Ghosh, Camila Kolling, Till Speicher, Laurent Bindschaedler, Krishna P. Gummadi, Evimaria Terzi
We propose an approach for estimating the latent knowledge embedded inside large language models (LLMs).
no code implementations • 30 May 2023 • Camila Kolling, Till Speicher, Vedant Nanda, Mariya Toneva, Krishna P. Gummadi
Concretely, we show how PNKA can be leveraged to develop a deeper understanding of (a) the input examples that are likely to be misclassified, (b) the concepts encoded by (individual) neurons in a layer, and (c) the effects of fairness interventions on learned representations.
1 code implementation • 23 Jun 2022 • Vedant Nanda, Till Speicher, Camila Kolling, John P. Dickerson, Krishna P. Gummadi, Adrian Weller
Our work offers a new view on robustness by using another reference NN to define the set of perturbations a given NN should be invariant to, thus generalizing the reliance on a reference "human NN" to any NN.
no code implementations • 1 Jul 2020 • Vedant Nanda, Till Speicher, John P. Dickerson, Krishna P. Gummadi, Muhammad Bilal Zafar
Our framework defines a large number of concepts that the DNN explanations could be based on and performs the explanation-conformity check at test time to assess prediction robustness.
no code implementations • 2 Jul 2018 • Till Speicher, Hoda Heidari, Nina Grgic-Hlaca, Krishna P. Gummadi, Adish Singla, Adrian Weller, Muhammad Bilal Zafar
Further, our work reveals overlooked tradeoffs between different fairness notions: using our proposed measures, the overall individual-level unfairness of an algorithm can be decomposed into a between-group and a within-group component.
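Decompositions of this kind are a standard property of inequality indices such as the generalized entropy family: the Theil index (generalized entropy with α = 1) splits exactly into a within-group term plus a between-group term. As a minimal illustrative sketch (function names and the choice of the Theil index are assumptions, not necessarily the exact measure used in the paper):

```python
import math

def theil_index(benefits):
    """Theil index (generalized entropy, alpha=1) of a positive benefit vector."""
    n = len(benefits)
    mu = sum(benefits) / n
    return sum((b / mu) * math.log(b / mu) for b in benefits) / n

def theil_decomposition(groups):
    """Split the overall Theil index into within-group and between-group parts.

    `groups` maps a group label to that group's list of individual benefits.
    The two parts sum exactly to the Theil index of the pooled population.
    """
    all_b = [b for bs in groups.values() for b in bs]
    n = len(all_b)
    mu = sum(all_b) / n
    # Within-group term: group Theil indices, weighted by population and mean benefit share.
    within = sum(
        (len(bs) / n) * ((sum(bs) / len(bs)) / mu) * theil_index(bs)
        for bs in groups.values()
    )
    # Between-group term: Theil index computed as if everyone received their group's mean.
    between = theil_index(
        [sum(bs) / len(bs) for bs in groups.values() for _ in bs]
    )
    return within, between
```

For example, if every individual inside a group receives the same benefit, the within-group term vanishes and all measured unfairness is between-group.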
1 code implementation • 13 Apr 2014 • Rene Pickhardt, Thomas Gottron, Martin Körner, Paul Georg Wagner, Till Speicher, Steffen Staab
In an extensive empirical experiment over English text corpora, we demonstrate that our generalized language models lead to a substantial reduction of perplexity, between 3.1% and 12.7%, in comparison to traditional language models using modified Kneser-Ney smoothing.
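Perplexity, the metric compared here, is the exponentiated average negative log-likelihood per token, so a lower value means the model assigns higher probability to held-out text. A minimal sketch of the computation and of the relative-reduction figure quoted above (helper names are illustrative):

```python
import math

def perplexity(log_probs):
    """Perplexity from per-token natural-log probabilities:
    exp of the negative mean log-likelihood."""
    return math.exp(-sum(log_probs) / len(log_probs))

def relative_reduction(pp_baseline, pp_model):
    """Fractional perplexity reduction of a model over a baseline,
    e.g. 0.127 corresponds to the paper's reported 12.7%."""
    return (pp_baseline - pp_model) / pp_baseline
```

For instance, a model that assigns uniform probability 1/4 to every token has perplexity exactly 4, and a drop from a baseline perplexity of 100 to 90 is a 10% relative reduction.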