no code implementations • 18 Nov 2019 • Wen-Yen Chang, Wen-Huan Chiang, Shao-Hao Lu, Tingfan Wu, Min Sun
Last but not least, we investigate the generalization of the HAL policy learned on MNIST dataset by directly applying it on MNIST-M. We show that the agent can generalize and outperform directly-learned policy under constrained labeled sets.