no code implementations • 11 Oct 2023 • Hannah Rose Kirk, Andrew M. Bean, Bertie Vidgen, Paul Röttger, Scott A. Hale
Human feedback is increasingly used to steer the behaviours of Large Language Models (LLMs).
no code implementations • 11 Oct 2023 • Andrew M. Bean, Karolina Korgul, Felix Krones, Robert McCraith, Adam Mahdi
For each question, we score each model on top-1 accuracy and the distribution of assigned probabilities.
no code implementations • 15 Sep 2023 • Khyati Khandelwal, Manuel Tonneau, Andrew M. Bean, Hannah Rose Kirk, Scott A. Hale
In this paper, we quantify stereotypical bias in popular LLMs according to an Indian-centric frame and compare bias levels between the Indian and Western contexts.