In this way, we are able to capture the common structure of the instances and their interactions with the solver, and produce good choices of penalty parameters with fewer calls to the QUBO solver.
Scaling decision-theoretic planning to large multiagent systems is challenging due to uncertainty and partial observability in the environment.
Decentralized (PO)MDPs provide an expressive framework for sequential decision making in a multiagent system.
A key characteristic of the domains of interest is that the interactions between individuals are anonymous, i.e., the outcome of an interaction (competing for demand) depends only on the number of agents involved and not on their identities.
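To make the anonymity property concrete, the following minimal sketch assumes a hypothetical setting in which agents choose zones and the agents in a zone split that zone's demand equally. The names `zone_counts`, `per_agent_payoff`, the zone labels, and the equal-split rule are illustrative assumptions, not part of the model described here. The point is only that the payoff is a function of per-zone counts, so relabeling agents leaves it unchanged.

```python
from collections import Counter


def zone_counts(assignment):
    """Reduce an agent -> zone assignment to the number of agents per zone."""
    return Counter(assignment.values())


def per_agent_payoff(counts, demand):
    """Hypothetical equal-split rule: each agent in a zone serves at most one
    unit of demand, and demand is shared evenly among the agents present."""
    return {zone: min(1.0, demand[zone] / n) for zone, n in counts.items() if n > 0}


# Illustrative data (assumed): two zones and three agents.
demand = {"A": 2, "B": 1}
assignment = {"taxi1": "A", "taxi2": "B", "taxi3": "B"}
# Same counts per zone, but different agents occupy each zone.
swapped = {"taxi1": "B", "taxi2": "A", "taxi3": "B"}

payoff = per_agent_payoff(zone_counts(assignment), demand)
payoff_swapped = per_agent_payoff(zone_counts(swapped), demand)
```

Because both assignments place one agent in zone A and two in zone B, the two payoff tables are identical; a planner exploiting anonymity can therefore reason over count vectors rather than over joint agent identities.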
In this work, we address their efficiency issues by proposing local GPs to learn from and make predictions for correlated subsets of data.
Thus, in this paper, our focus is on providing a scalable method for solving RCPSP/max problems with durational uncertainty.