Search Results for author: Koen Holtman

Found 5 papers, 1 paper with code

Demanding and Designing Aligned Cognitive Architectures

no code implementations • 19 Dec 2021 • Koen Holtman

To support this move, we define several AI cognitive architectures that combine reward maximization with other technical elements designed to improve alignment.

BIG-bench Machine Learning

ML Agent Safety Mechanisms based on Counterfactual Planning

no code implementations • NeurIPS 2021 • Koen Holtman

The key step in counterfactual planning is to use the agent's machine learning system to construct a counterfactual world model, designed to be different from the real world the agent is in.

BIG-bench Machine Learning, counterfactual (+1 more)

Counterfactual Planning in AGI Systems

no code implementations • 29 Jan 2021 • Koen Holtman

The key step in counterfactual planning is to use an AGI machine learning system to construct a counterfactual world model, designed to be different from the real world the system is in.

BIG-bench Machine Learning, counterfactual (+1 more)

AGI Agent Safety by Iteratively Improving the Utility Function

no code implementations • 10 Jul 2020 • Koen Holtman

We present an AGI safety layer that creates a special dedicated input terminal to support the iterative improvement of an AGI agent's utility function.

Mathematical Proofs

Corrigibility with Utility Preservation

1 code implementation • 5 Aug 2019 • Koen Holtman

The results in this paper were obtained by concurrently developing an AGI agent simulator, an agent model, and proofs.
