Search Results for author: Avital Balwit

Found 3 papers, 0 papers with code

Aligned with Whom? Direct and social goals for AI systems

No code implementations | 9 May 2022 | Anton Korinek, Avital Balwit

As artificial intelligence (AI) becomes more powerful and widespread, the AI alignment problem - how to ensure that AI systems pursue the goals that we want them to pursue - has garnered growing attention.

Truthful AI: Developing and governing AI that does not lie

No code implementations | 13 Oct 2021 | Owain Evans, Owen Cotton-Barratt, Lukas Finnveden, Adam Bales, Avital Balwit, Peter Wills, Luca Righetti, William Saunders

Establishing norms or laws of AI truthfulness will require significant work to: (1) identify clear truthfulness standards; (2) create institutions that can judge adherence to those standards; and (3) develop AI systems that are robustly truthful.
