The evolutionary learning is also employed to fine-tune the parameters of the models.
When these problems are extended to multiobjective ones, it becomes difficult for the existing DRL approaches to flexibly and efficiently deal with multiple subproblems determined by weight decomposition of objectives.
Each subproblem is modeled with a pointer network and the model is trained with reinforcement learning.
However, the fact is, the state of an instance is changed according to the decision that the model made at different construction steps, and the node features should be updated correspondingly.
Mobile sequential recommendation was originally designed to find a promising route for a single taxicab.
The talent scheduling problem is a simplified version of the real-world film shooting problem, which aims to determine a shooting sequence so as to minimize the total cost of the actors involved.