no code implementations • WMT (EMNLP) 2021 • Farhad Akhbardeh, Arkady Arkhangorodsky, Magdalena Biesialska, Ondřej Bojar, Rajen Chatterjee, Vishrav Chaudhary, Marta R. Costa-Jussa, Cristina España-Bonet, Angela Fan, Christian Federmann, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Leonie Harter, Kenneth Heafield, Christopher Homan, Matthias Huck, Kwabena Amponsah-Kaakyire, Jungo Kasai, Daniel Khashabi, Kevin Knight, Tom Kocmi, Philipp Koehn, Nicholas Lourie, Christof Monz, Makoto Morishita, Masaaki Nagata, Ajay Nagesh, Toshiaki Nakazawa, Matteo Negri, Santanu Pal, Allahsera Auguste Tapo, Marco Turchi, Valentin Vydrin, Marcos Zampieri
This paper presents the results of the news translation task, the multilingual low-resource translation for Indo-European languages, the triangular translation task, and the automatic post-editing task organised as part of the Conference on Machine Translation (WMT) 2021. In the news task, participants were asked to build machine translation systems for any of 10 language pairs, to be evaluated on test sets consisting mainly of news stories.
1 code implementation • 8 Nov 2024 • Mayee F. Chen, Michael Y. Hu, Nicholas Lourie, Kyunghyun Cho, Christopher Ré
Finally, we leverage the insights from our framework to derive a new online method named Aioli, which directly estimates the mixing law parameters throughout training and uses them to dynamically adjust proportions.
no code implementations • 30 May 2024 • Siavash Golkar, Alberto Bietti, Mariel Pettee, Michael Eickenberg, Miles Cranmer, Keiya Hirashima, Geraud Krawezik, Nicholas Lourie, Michael McCabe, Rudy Morel, Ruben Ohana, Liam Holden Parker, Bruno Régaldo-Saint Blancard, Kyunghyun Cho, Shirley Ho
Transformers have revolutionized machine learning across diverse domains, yet understanding their behavior remains crucial, particularly in high-stakes applications.
1 code implementation • 16 Nov 2023 • Nicholas Lourie, Kyunghyun Cho, He He
We present the first method to construct valid confidence bands for tuning curves.
1 code implementation • 24 Mar 2021 • Nicholas Lourie, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi
First, we propose a new multitask benchmark, RAINBOW, to promote research on commonsense models that generalize well over multiple tasks and datasets.
Ranked #1 on Question Answering on SIQA
2 code implementations • 17 Jan 2021 • Daniel Khashabi, Gabriel Stanovsky, Jonathan Bragg, Nicholas Lourie, Jungo Kasai, Yejin Choi, Noah A. Smith, Daniel S. Weld
While often assumed a gold standard, effective human evaluation of text generation remains an important, open area for research.
1 code implementation • EMNLP 2020 • Orion Weller, Nicholas Lourie, Matt Gardner, Matthew E. Peters
Typically, machine learning systems solve new tasks by training on thousands of examples.
6 code implementations • EMNLP 2020 • Swabha Swayamdipta, Roy Schwartz, Nicholas Lourie, Yizhong Wang, Hannaneh Hajishirzi, Noah A. Smith, Yejin Choi
Experiments across four datasets show that these model-dependent measures reveal three distinct regions in the data map, each with pronounced characteristics.
1 code implementation • 20 Aug 2020 • Nicholas Lourie, Ronan Le Bras, Yejin Choi
As AI systems become an increasing part of people's everyday lives, it becomes ever more important that they understand people's ethical norms.
4 code implementations • NAACL 2019 • Alon Talmor, Jonathan Herzig, Nicholas Lourie, Jonathan Berant
To investigate question answering with prior knowledge, we present CommonsenseQA: a challenging new dataset for commonsense question answering.
Ranked #31 on Common Sense Reasoning on CommonsenseQA (using extra training data)
2 code implementations • 31 Oct 2018 • Maarten Sap, Ronan LeBras, Emily Allaway, Chandra Bhagavatula, Nicholas Lourie, Hannah Rashkin, Brendan Roof, Noah A. Smith, Yejin Choi
We present ATOMIC, an atlas of everyday commonsense reasoning, organized through 877k textual descriptions of inferential knowledge.
1 code implementation • 6 Apr 2018 • Noah Siegel, Nicholas Lourie, Russell Power, Waleed Ammar
Non-textual components such as charts, diagrams and tables provide key information in many scientific documents, but the lack of large labeled datasets has impeded the development of data-driven methods for scientific figure extraction.