Search Results for author: Evan Ryan Gunter

Found 1 paper, 0 papers with code

Quantifying stability of non-power-seeking in artificial agents

no code implementations • 7 Jan 2024 • Evan Ryan Gunter, Yevgeny Liokumovich, Victoria Krakovna

In our first case of interest--near-optimal policies--we use a bisimulation metric on MDPs to prove that small perturbations won't make the agent take longer to shut down.
