no code implementations • *SEM (NAACL) 2022 • Ronen Tamari, Kyle Richardson, Noam Kahlon, Aviad Sar-Shalom, Nelson F. Liu, Reut Tsarfaty, Dafna Shahaf
However, the main synthetic resource for story understanding, the bAbI benchmark, lacks such a systematic mechanism for controllable task generation.
4 code implementations • 6 Jul 2023 • Nelson F. Liu, Kevin Lin, John Hewitt, Ashwin Paranjape, Michele Bevilacqua, Fabio Petroni, Percy Liang
While recent language models have the ability to take long contexts as input, relatively little is known about how well they use longer context.
1 code implementation • 23 May 2023 • Nelson F. Liu, Kenton Lee, Kristina Toutanova
Internet links enable users to deepen their understanding of a topic by providing convenient access to related information.
1 code implementation • 19 Apr 2023 • Nelson F. Liu, Tianyi Zhang, Percy Liang
Generative search engines directly generate responses to user queries, along with in-line citations.
no code implementations • 12 Oct 2022 • Nelson F. Liu, Ananya Kumar, Percy Liang, Robin Jia
Recent results in image classification and extractive question answering have observed that pre-trained models trained on less in-distribution data have better out-of-distribution performance.
1 code implementation • Findings (EMNLP) 2021 • Michael Kranzlein, Nelson F. Liu, Nathan Schneider
For interpreting the behavior of a probabilistic model, it is useful to measure a model's calibration: the extent to which it produces reliable confidence scores.
1 code implementation • RepL4NLP (ACL) 2022 • Zhengxuan Wu, Nelson F. Liu, Christopher Potts
There is growing evidence that pretrained language models improve task-specific fine-tuning not just for the languages seen in pretraining, but also for new languages and even non-linguistic data.
no code implementations • 1 Feb 2021 • Nelson F. Liu, Tony Lee, Robin Jia, Percy Liang
Do question answering (QA) modeling improvements (e.g., choice of architecture and training procedure) hold consistently across the diverse landscape of QA benchmarks?
no code implementations • 1 Oct 2020 • Matt Gardner, Yoav Artzi, Victoria Basmova, Jonathan Berant, Ben Bogin, Sihao Chen, Pradeep Dasigi, Dheeru Dua, Yanai Elazar, Ananth Gottumukkala, Nitish Gupta, Hanna Hajishirzi, Gabriel Ilharco, Daniel Khashabi, Kevin Lin, Jiangming Liu, Nelson F. Liu, Phoebe Mulcaire, Qiang Ning, Sameer Singh, Noah A. Smith, Sanjay Subramanian, Reut Tsarfaty, Eric Wallace, Ally Zhang, Ben Zhou
Unfortunately, when a dataset has systematic gaps (e.g., annotation artifacts), these evaluations are misleading: a model can learn simple decision rules that perform well on the test set but do not capture a dataset's intended capabilities.
2 code implementations • ACL (MWE) 2021 • Nelson F. Liu, Daniel Hershcovich, Michael Kranzlein, Nathan Schneider
In lexical semantics, full-sentence segmentation and segment labeling of various phenomena are generally treated separately, despite their interdependence.
Ranked #1 on Natural Language Understanding on STREUSLE
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Matt Gardner, Yoav Artzi, Victoria Basmova, Jonathan Berant, Ben Bogin, Sihao Chen, Pradeep Dasigi, Dheeru Dua, Yanai Elazar, Ananth Gottumukkala, Nitish Gupta, Hanna Hajishirzi, Gabriel Ilharco, Daniel Khashabi, Kevin Lin, Jiangming Liu, Nelson F. Liu, Phoebe Mulcaire, Qiang Ning, Sameer Singh, Noah A. Smith, Sanjay Subramanian, Reut Tsarfaty, Eric Wallace, Ally Zhang, Ben Zhou
Unfortunately, when a dataset has systematic gaps (e.g., annotation artifacts), these evaluations are misleading: a model can learn simple decision rules that perform well on the test set but do not capture a dataset's intended capabilities.
1 code implementation • IJCNLP 2019 • Pradeep Dasigi, Nelson F. Liu, Ana Marasović, Noah A. Smith, Matt Gardner
Machine comprehension of texts longer than a single sentence often requires coreference resolution.
1 code implementation • ACL 2019 • Robert Logan, Nelson F. Liu, Matthew E. Peters, Matt Gardner, Sameer Singh
Modeling human language requires the ability to not only generate fluent text but also encode factual knowledge.
1 code implementation • 17 Jun 2019 • Robert L. Logan IV, Nelson F. Liu, Matthew E. Peters, Matt Gardner, Sameer Singh
Modeling human language requires the ability to not only generate fluent text but also encode factual knowledge.
no code implementations • NAACL 2019 • Nelson F. Liu, Roy Schwartz, Noah A. Smith
Several datasets have recently been constructed to expose brittleness in models trained on existing benchmarks.
no code implementations • NAACL 2019 • Nelson F. Liu, Matt Gardner, Yonatan Belinkov, Matthew E. Peters, Noah A. Smith
Contextual word representations derived from large-scale neural language models are successful across a diverse set of NLP tasks, suggesting that they encode useful and transferable features of language.
no code implementations • 16 Aug 2018 • Nelson F. Liu, Jonathan May, Michael Pust, Kevin Knight
Most statistical machine translation systems cannot translate words that are unseen in the training data.
no code implementations • WS 2018 • Nelson F. Liu, Gina-Anne Levow, Noah A. Smith
We introduce a simple method for extracting non-arbitrary form-meaning representations from a collection of semantic vectors.
no code implementations • WS 2018 • Nelson F. Liu, Omer Levy, Roy Schwartz, Chenhao Tan, Noah A. Smith
While recurrent neural networks have found success in a variety of natural language processing applications, they are general models of sequential data.
no code implementations • WS 2017 • Johannes Welbl, Nelson F. Liu, Matt Gardner
With this method we have assembled SciQ, a dataset of 13.7K multiple-choice science exam questions (dataset available at http://allenai.org/data.html).