no code implementations • 11 Sep 2023 • Anna Brandenberger, Cassandra Marcussen, Elchanan Mossel, Madhu Sudan
We study processes of societal knowledge accumulation, where the validity of a new unit of knowledge depends both on the correctness of its derivation and on the validity of the units it depends on.
no code implementations • 3 Nov 2020 • Gavin McCracken, Colin Daniels, Rosie Zhao, Anna Brandenberger, Prakash Panangaden, Doina Precup
Policy gradient methods are extensively used in reinforcement learning as a way to optimize expected return.