no code implementations • 10 Nov 2022 • Leonard Adolphs, Tianyu Gao, Jing Xu, Kurt Shuster, Sainbayar Sukhbaatar, Jason Weston
Standard language model training employs gold human documents or human-human interaction data, and treats all training data as positive examples.
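(The paper's premise is training with negative as well as positive examples. A common generic way to use negatives is an unlikelihood-style token penalty; the PyTorch sketch below illustrates that idea under this assumption and is not the paper's exact loss.)

```python
import torch
import torch.nn.functional as F

def lm_loss_with_negatives(logits, targets, is_positive):
    """Cross-entropy on positive sequences, unlikelihood-style penalty on negatives.

    logits:      (batch, seq, vocab) model outputs
    targets:     (batch, seq) gold token ids
    is_positive: (batch,) bool, True if the sequence is a positive example
    """
    log_probs = F.log_softmax(logits, dim=-1)
    tok_logp = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)  # (batch, seq)
    pos_mask = is_positive.float().unsqueeze(-1)
    # Positive sequences: standard negative log-likelihood.
    pos_loss = -(tok_logp * pos_mask).sum()
    # Negative sequences: push token probabilities down via log(1 - p).
    p = tok_logp.exp().clamp(max=1 - 1e-6)
    neg_loss = -(torch.log1p(-p) * (1.0 - pos_mask)).sum()
    return (pos_loss + neg_loss) / targets.numel()
```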
1 code implementation • 21 Oct 2022 • Leonard Adolphs, Michelle Chen Huebscher, Christian Buck, Sertan Girgin, Olivier Bachem, Massimiliano Ciaramita, Thomas Hofmann
Neural retrieval models have superseded classic bag-of-words methods such as BM25 as the retrieval framework of choice.
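(For contrast with the neural retrievers studied in the paper, BM25 is a simple bag-of-words term-weighting scheme; a minimal reference sketch:)

```python
import math
from collections import Counter

def bm25_score(query_terms, doc_terms, doc_freq, n_docs, avg_len, k1=1.5, b=0.75):
    """Classic BM25: sum of IDF-weighted, length-normalized term frequencies.

    Neural retrievers replace this with a dot product of learned query and
    document embeddings.
    """
    tf = Counter(doc_terms)
    score = 0.0
    for term in query_terms:
        if term not in tf:
            continue
        df = doc_freq[term]
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        norm = tf[term] * (k1 + 1) / (
            tf[term] + k1 * (1 - b + b * len(doc_terms) / avg_len)
        )
        score += idf * norm
    return score
```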
1 code implementation • 24 Mar 2022 • Kurt Shuster, Mojtaba Komeili, Leonard Adolphs, Stephen Roller, Arthur Szlam, Jason Weston
We show that, used as a dialogue model, SeeKeR outperforms the state-of-the-art BlenderBot 2 (Chen et al., 2021) on open-domain knowledge-grounded conversations with the same number of parameters, in terms of consistency, knowledge, and per-turn engagingness.
no code implementations • Findings (ACL) 2022 • Shehzaad Dhuliawala, Leonard Adolphs, Rajarshi Das, Mrinmaya Sachan
We show that calibrating such complex systems which contain discrete retrieval and deep reading components is challenging and current calibration techniques fail to scale to these settings.
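(Temperature scaling is a standard calibration baseline of the kind such work evaluates; the sketch below fits a single scalar temperature on held-out scores. It is a generic baseline, not the paper's proposed method.)

```python
import numpy as np
from scipy.optimize import minimize_scalar

def fit_temperature(logits, labels):
    """Fit one scalar temperature T by minimizing NLL on held-out data.

    logits: (n, n_classes) uncalibrated scores; labels: (n,) gold indices.
    """
    def nll(t):
        z = logits / t
        z = z - z.max(axis=1, keepdims=True)  # numerical stability
        log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(len(labels)), labels].mean()
    return minimize_scalar(nll, bounds=(0.05, 10.0), method="bounded").x
```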
no code implementations • 9 Nov 2021 • Leonard Adolphs, Kurt Shuster, Jack Urbanek, Arthur Szlam, Jason Weston
Large language models can produce fluent dialogue but often hallucinate factual inaccuracies.
no code implementations • 1 Sep 2021 • Leonard Adolphs, Benjamin Boerschinger, Christian Buck, Michelle Chen Huebscher, Massimiliano Ciaramita, Lasse Espeholt, Thomas Hofmann, Yannic Kilcher, Sascha Rothe, Pier Giuseppe Sessa, Lierni Sestorain Saralegui
This paper presents the first successful steps in designing search agents that learn meta-strategies for iterative query refinement in information-seeking tasks.
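(The control flow of such an agent is an observe-rewrite-search loop; the sketch below shows that loop with hypothetical `search_fn` and `refine_fn` interfaces. In the paper the rewriter is a learned policy; any query rewriter fits this skeleton.)

```python
def refine_search(question, search_fn, refine_fn, n_steps=5):
    """Iteratively rewrite a query based on the results gathered so far.

    search_fn(query) -> list of result snippets       (hypothetical interface)
    refine_fn(question, query, results) -> new query  (e.g. a learned policy
                                                       that adds/removes terms)
    """
    query, seen = question, []
    for _ in range(n_steps):
        results = search_fn(query)
        seen.extend(results)
        query = refine_fn(question, query, results)
    return seen
```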
1 code implementation • 4 Aug 2021 • Leonard Adolphs, Shehzaad Dhuliawala, Thomas Hofmann
We apply this approach of querying by example to the LAMA probe and obtain substantial improvements of up to 37.8% for BERT-large on the T-REx data when providing only 10 demonstrations, even outperforming a baseline that queries the model with up to 40 paraphrases of the question.
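(Querying by example amounts to prefixing the cloze query with filled-in demonstrations of the same relation. A minimal sketch with Hugging Face's fill-mask pipeline; the demonstrations and prompt format here are illustrative, not the paper's exact template.)

```python
from transformers import pipeline

# Demonstrations sharing the "capital of" relation (illustrative).
demos = [("France", "Paris"), ("Japan", "Tokyo"), ("Canada", "Ottawa")]

fill = pipeline("fill-mask", model="bert-large-cased")
prompt = " ".join(f"The capital of {c} is {a}." for c, a in demos)
prompt += f" The capital of Kenya is {fill.tokenizer.mask_token}."
print(fill(prompt, top_k=3))  # top predictions for the masked slot
```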
no code implementations • 25 Sep 2019 • Leonard Adolphs, Jonas Kohler, Aurelien Lucchi
We investigate the use of ellipsoidal trust region constraints for second-order optimization of neural networks.
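(The ellipsoidal trust-region subproblem is min_s g^T s + 0.5 s^T H s subject to ||diag(d) s|| <= r. The NumPy sketch below is a generic solver via bisection on the Lagrange multiplier, assuming H + lam*diag(d)^2 stays positive definite, e.g. H PSD and d > 0; it is not the paper's exact algorithm.)

```python
import numpy as np

def ellipsoidal_tr_step(g, H, d, radius, iters=50):
    """Minimize g^T s + 0.5 s^T H s subject to ||diag(d) s|| <= radius."""
    D2 = np.diag(d ** 2)
    def step(lam):
        # s(lam) = -(H + lam * D^2)^{-1} g
        return np.linalg.solve(H + lam * D2, -g)
    try:
        s = step(0.0)
        if np.linalg.norm(d * s) <= radius:
            return s  # unconstrained Newton step already inside the region
    except np.linalg.LinAlgError:
        pass
    # Bisect on lam until s(lam) lies on the ellipsoid boundary.
    lo, hi = 0.0, 1.0
    while np.linalg.norm(d * step(hi)) > radius:
        hi *= 2.0
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        if np.linalg.norm(d * step(lam)) > radius:
            lo = lam
        else:
            hi = lam
    return step(hi)
```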
no code implementations • 4 Sep 2019 • Leonard Adolphs, Thomas Hofmann
We, however, consider the task of designing an agent that does not just succeed in a single game but performs well across a whole family of games sharing the same theme.
no code implementations • 22 May 2019 • Jonas Kohler, Leonard Adolphs, Aurelien Lucchi
We investigate the use of regularized Newton methods with adaptive norms for optimizing neural networks.
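(One step of a regularized Newton method with an adaptive diagonal norm, sketched below; using an RMSProp-style running average of squared gradients as the metric is an assumption in the spirit of the paper's adaptive norms, not its exact update.)

```python
import numpy as np

def adaptive_reg_newton_step(g, H, v, sigma, eps=1e-8):
    """s = -(H + sigma * D)^{-1} g with adaptive metric D = diag(sqrt(v)).

    v is a running average of squared gradients, updated elsewhere as
    v = beta * v + (1 - beta) * g**2.
    """
    D = np.diag(np.sqrt(v) + eps)
    return np.linalg.solve(H + sigma * D, -g)
```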
1 code implementation • 15 May 2018 • Leonard Adolphs, Hadi Daneshmand, Aurelien Lucchi, Thomas Hofmann
Gradient-based optimization methods are the most popular choice for finding local optima in both classical minimization and saddle point problems.
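(A classic illustration of why plain gradient methods struggle on saddle point problems: on the bilinear objective f(x, y) = x*y, simultaneous gradient descent-ascent spirals outward instead of converging to the saddle at the origin, motivating curvature-aware methods like the paper's.)

```python
# Simultaneous gradient descent-ascent on f(x, y) = x * y.
x, y, lr = 1.0, 1.0, 0.1
for t in range(100):
    gx, gy = y, x                      # df/dx and df/dy for f = x*y
    x, y = x - lr * gx, y + lr * gy    # descend in x, ascend in y
print(x, y)  # iterates diverge: x^2 + y^2 grows by (1 + lr^2) each step
```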