no code implementations • 7 Dec 2016 • Tong Che, Yan-ran Li, Athul Paul Jacob, Yoshua Bengio, Wenjie Li
Although Generative Adversarial Networks achieve state-of-the-art results on a variety of generative tasks, they are regarded as highly unstable and prone to miss modes.
6 code implementations • 27 Feb 2017 • R. Devon Hjelm, Athul Paul Jacob, Tong Che, Adam Trischler, Kyunghyun Cho, Yoshua Bengio
We introduce a method for training GANs with discrete data that uses the estimated difference measure from the discriminator to compute importance weights for generated samples, thus providing a policy gradient for training the generator.
no code implementations • ICLR 2018 • R. Devon Hjelm, Athul Paul Jacob, Adam Trischler, Gerry Che, Kyunghyun Cho, Yoshua Bengio
We introduce a method for training GANs with discrete data that uses the estimated difference measure from the discriminator to compute importance weights for generated samples, thus providing a policy gradient for training the generator.
2 code implementations • ACL 2018 • Yikang Shen, Zhouhan Lin, Athul Paul Jacob, Alessandro Sordoni, Aaron Courville, Yoshua Bengio
In this work, we propose a novel constituency parsing scheme.
no code implementations • WS 2018 • Athul Paul Jacob, Zhouhan Lin, Aless Sordoni, ro, Yoshua Bengio
We propose a hierarchical model for sequential data that learns a tree on-the-fly, i. e. while reading the sequence.
no code implementations • NAACL 2021 • Athul Paul Jacob, Mike Lewis, Jacob Andreas
When intelligent agents communicate to accomplish shared goals, how do these goals shape the agents' language?
no code implementations • 14 Dec 2021 • Athul Paul Jacob, David J. Wu, Gabriele Farina, Adam Lerer, Hengyuan Hu, Anton Bakhtin, Jacob Andreas, Noam Brown
We consider the task of building strong but human-like policies in multi-agent decision-making problems, given examples of human behavior.
1 code implementation • 11 Oct 2022 • Anton Bakhtin, David J Wu, Adam Lerer, Jonathan Gray, Athul Paul Jacob, Gabriele Farina, Alexander H Miller, Noam Brown
We then show that DiL-piKL can be extended into a self-play reinforcement learning algorithm we call RL-DiL-piKL that provides a model of human play while simultaneously training an agent that responds well to this human model.
1 code implementation • Science 2022 • Anton Bakhtin, Noam Brown, Emily Dinan, Gabriele Farina, Colin Flaherty, Daniel Fried, Andrew Goff, Jonathan Gray, Hengyan Hu, Athul Paul Jacob, Mojtaba Komeili, Karthik Konath, Minae Kwon, Adam Lerer, Mike Lewis, Alexander H. Miller, Sash Mitts, Aditya Renduchintala, Stephen Roller, Dirk Rowe, Weiyan Shi, Joe Spisak, Alexander Wei, David Wu, Hugh Zhang, Markus Zijlstra
Despite much progress in training AI systems to imitate human language, building agents that use language to communicate intentionally with humans in interactive environments remains a major challenge.
no code implementations • 22 Nov 2022 • Weiyan Shi, Emily Dinan, Adi Renduchintala, Daniel Fried, Athul Paul Jacob, Zhou Yu, Mike Lewis
Existing approaches built separate classifiers to detect nonsense in dialogues.
no code implementations • 13 Oct 2023 • Athul Paul Jacob, Yikang Shen, Gabriele Farina, Jacob Andreas
When applied to question answering and other text generation tasks, language models (LMs) may be queried generatively (by sampling answers from their output distribution) or discriminatively (by using them to score or rank a set of candidate outputs).
no code implementations • 16 Nov 2023 • Athul Paul Jacob, Gabriele Farina, Jacob Andreas
We present a model of pragmatic language understanding, where utterances are produced and understood by searching for regularized equilibria of signaling games.
no code implementations • 7 Dec 2023 • Athul Paul Jacob, Abhishek Gupta, Jacob Andreas
We study the problem of modeling a population of agents pursuing unknown goals subject to unknown computational constraints.