no code implementations • 14 May 2021 • Ming Liang Ang, Eloise Y. Y. Lim, Joel Q. L. Chang
The multi-armed bandit (MAB) problem is a ubiquitous decision-making problem that exemplifies the exploration-exploitation tradeoff.
1 code implementation • ICLR 2021 • Abdul Fatir Ansari, Ming Liang Ang, Harold Soh
We introduce Discriminator Gradient flow (DGflow), a new technique that improves generated samples via the gradient flow of entropy-regularized f-divergences between the real and the generated data distributions.
Ranked #1 on Text Generation on One Billion Word