Search Results for author: Jonathan Stray

Found 4 papers, 0 papers with code

Summon a Demon and Bind it: A Grounded Theory of LLM Red Teaming in the Wild

no code implementations • 10 Nov 2023 • Nanna Inie, Jonathan Stray, Leon Derczynski

As a result, this paper presents a grounded theory of how and why people attack large language models: LLM red teaming in the wild.

What are you optimizing for? Aligning Recommender Systems with Human Values

no code implementations • 22 Jul 2021 • Jonathan Stray, Ivan Vendrov, Jeremy Nixon, Steven Adler, Dylan Hadfield-Menell

We describe cases where real recommender systems were modified in the service of various human values such as diversity, fairness, well-being, time well spent, and factual accuracy.

Fairness, Recommendation Systems

Designing Recommender Systems to Depolarize

no code implementations • 11 Jul 2021 • Jonathan Stray

Polarization is implicated in the erosion of democracy and the progression to violence, which makes the polarization properties of large algorithmic content selection systems (recommender systems) a matter of concern for peace and security.

Recommendation Systems
