Effects of dynamic goal prior specification on grid-world agent trajectories

Abstract

Aligning artificially intelligent agents with human intentions is a significant challenge. Traditional reinforcement learning methods often result in agents exploiting their reward functions, leading to undesired behaviour. We conduct an empirical analysis of goal prior specification in an expected free energy minimizing grid-world agent, investigating how specific qualities of the goal prior parameters affect the agent’s trajectories. We present a protocol for studying “laziness”, “vagueness”, “indifference”, and “dynamicity”. Our findings show that laziness and vagueness have a large effect on behaviour, whereas indifference and dynamicity have only a small effect. These findings offer engineers insight into how to set goal priors.

Publication
International Workshop on Active Inference
Wouter M. Kouw
Assistant Professor