Search Results for author: Luciano Cavalcante Siebert

Found 4 papers, 0 papers with code

Explaining Learned Reward Functions with Counterfactual Trajectories

no code implementations • 7 Feb 2024 • Jan Wehner, Frans Oliehoek, Luciano Cavalcante Siebert

Finally, we measure how informative the generated explanations are to a proxy-human model by training it on CTEs.

counterfactual

Reflective Hybrid Intelligence for Meaningful Human Control in Decision-Support Systems

no code implementations • 12 Jul 2023 • Catholijn M. Jonker, Luciano Cavalcante Siebert, Pradeep K. Murukannaiah

With the growing capabilities and pervasiveness of AI systems, societies must collectively choose between reduced human autonomy, endangered democracies and limited human rights, and AI that is aligned to human and social values, nurturing collaboration, resilience, knowledge and ethical behaviour.

Philosophy

Meaningful human control: actionable properties for AI system development

no code implementations • 25 Nov 2021 • Luciano Cavalcante Siebert, Maria Luce Lupetti, Evgeni Aizenberg, Niek Beckers, Arkady Zgonnikov, Herman Veluwenkamp, David Abbink, Elisa Giaccardi, Geert-Jan Houben, Catholijn M. Jonker, Jeroen van den Hoven, Deborah Forster, Reginald L. Lagendijk

The concept of meaningful human control has been proposed to address responsibility gaps and mitigate them by establishing conditions that enable a proper attribution of responsibility for humans; however, clear requirements for researchers, designers, and engineers do not yet exist, making the development of AI-based systems that remain under meaningful human control challenging.

Improving Confidence in the Estimation of Values and Norms

no code implementations • 2 Apr 2020 • Luciano Cavalcante Siebert, Rijk Mercuur, Virginia Dignum, Jeroen van den Hoven, Catholijn Jonker

This paper analyses to what extent an AA is able to estimate the values and norms of a simulated human agent (SHA) based on its actions in the ultimatum game.

counterfactual
