Semi-Supervised Policy Initialization for Playing Games with Language Hints

NAACL 2021  ·  Tsu-Jui Fu, William Yang Wang ·

Using natural language as a hint can supply an additional reward for playing sparse-reward games. Achieving a goal should involve several different hints, while the given hints are usually incomplete. Those unmentioned latent hints still rely on the sparse reward signal, and make the learning process difficult. In this paper, we propose semi-supervised initialization (SSI) that allows the agent to learn from various possible hints before training under different tasks. Experiments show that SSI not only helps to learn faster (\textbf{1.2x}) but also has a higher success rate (\textbf{11{\%}} relative improvement) of the final policy.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here