Search Results for author: Koen Holtman

Found 5 papers, 1 paper with code

Demanding and Designing Aligned Cognitive Architectures

no code implementations • 19 Dec 2021 • Koen Holtman

To support this move, we define several AI cognitive architectures that combine reward maximization with other technical elements designed to improve alignment.

BIG-bench Machine Learning


ML Agent Safety Mechanisms based on Counterfactual Planning

no code implementations • NeurIPS 2021 • Koen Holtman

The key step in counterfactual planning is to use the agent's machine learning system to construct a counterfactual world model, designed to be different from the real world the agent is in.

BIG-bench Machine Learning counterfactual +1


Counterfactual Planning in AGI Systems

no code implementations • 29 Jan 2021 • Koen Holtman

The key step in counterfactual planning is to use an AGI machine learning system to construct a counterfactual world model, designed to be different from the real world the system is in.

BIG-bench Machine Learning counterfactual +1


AGI Agent Safety by Iteratively Improving the Utility Function

no code implementations • 10 Jul 2020 • Koen Holtman

We present an AGI safety layer that creates a special dedicated input terminal to support the iterative improvement of an AGI agent's utility function.

Mathematical Proofs


Corrigibility with Utility Preservation

1 code implementation • 5 Aug 2019 • Koen Holtman

The results in this paper were obtained by concurrently developing an AGI agent simulator, an agent model, and proofs.

