Search Results for author: Hanna Hajishirzi

Found 5 papers, 4 papers with code

What's In My Big Data?

1 code implementation 31 Oct 2023 Yanai Elazar, Akshita Bhagia, Ian Magnusson, Abhilasha Ravichander, Dustin Schwenk, Alane Suhr, Pete Walsh, Dirk Groeneveld, Luca Soldaini, Sameer Singh, Hanna Hajishirzi, Noah A. Smith, Jesse Dodge

We open-source WIMBD's code and artifacts to provide a standard set of evaluations for new text-based corpora and to encourage more analyses and transparency around them: github.com/allenai/wimbd.

Benchmarking

GenEval: An Object-Focused Framework for Evaluating Text-to-Image Alignment

1 code implementation NeurIPS 2023 Dhruba Ghosh, Hanna Hajishirzi, Ludwig Schmidt

Recent breakthroughs in diffusion models, multimodal pretraining, and efficient finetuning have led to an explosion of text-to-image generative models.

Attribute Object +2

Elaboration-Generating Commonsense Question Answering at Scale

1 code implementation 2 Sep 2022 Wenya Wang, Vivek Srikumar, Hanna Hajishirzi, Noah A. Smith

In question answering requiring common sense, language models (e.g., GPT-3) have been used to generate text expressing background knowledge that helps improve performance.

Common Sense Reasoning Question Answering

Evaluating NLP Models via Contrast Sets

no code implementations 1 Oct 2020 Matt Gardner, Yoav Artzi, Victoria Basmova, Jonathan Berant, Ben Bogin, Sihao Chen, Pradeep Dasigi, Dheeru Dua, Yanai Elazar, Ananth Gottumukkala, Nitish Gupta, Hanna Hajishirzi, Gabriel Ilharco, Daniel Khashabi, Kevin Lin, Jiangming Liu, Nelson F. Liu, Phoebe Mulcaire, Qiang Ning, Sameer Singh, Noah A. Smith, Sanjay Subramanian, Reut Tsarfaty, Eric Wallace, A. Zhang, Ben Zhou

Unfortunately, when a dataset has systematic gaps (e.g., annotation artifacts), these evaluations are misleading: a model can learn simple decision rules that perform well on the test set but do not capture a dataset's intended capabilities.

Reading Comprehension Sentiment Analysis +1
