no code implementations • 18 Mar 2025 • Jacob Eisenstein, Reza Aghajani, Adam Fisch, Dheeru Dua, Fantine Huot, Mirella Lapata, Vicky Zayats, Jonathan Berant
To be helpful assistants, AI agents must be aware of their own capabilities and limitations.
1 code implementation • 19 Jun 2024 • Jinhyuk Lee, Anthony Chen, Zhuyun Dai, Dheeru Dua, Devendra Singh Sachan, Michael Boratko, Yi Luan, Sébastien M. R. Arnold, Vincent Perot, Siddharth Dalmia, Hexiang Hu, Xudong Lin, Panupong Pasupat, Aida Amini, Jeremy R. Cole, Sebastian Riedel, Iftekhar Naim, Ming-Wei Chang, Kelvin Guu
Long-context language models (LCLMs) have the potential to revolutionize our approach to tasks traditionally reliant on external tools like retrieval systems or databases.
no code implementations • 20 Dec 2022 • Dheeru Dua, Emma Strubell, Sameer Singh, Pat Verga
Recent advances in open-domain question answering (ODQA) have demonstrated impressive accuracy on standard Wikipedia-style benchmarks.
no code implementations • 8 Dec 2022 • Dheeru Dua, Shivanshu Gupta, Sameer Singh, Matt Gardner
The intermediate supervision is typically manually written, which can be expensive to collect.
no code implementations • NAACL 2022 • Dheeru Dua, Shruti Bhosale, Vedanuj Goswami, James Cross, Mike Lewis, Angela Fan
Multi-task learning with an unbalanced data distribution skews model learning towards high-resource tasks, especially when model capacity is fixed and fully shared across all tasks.
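One standard mitigation for this imbalance (shown here as a generic illustration, not necessarily the paper's method) is temperature-scaled sampling, which flattens the task distribution so low-resource tasks are seen more often. A minimal sketch:

```python
import numpy as np

def temperature_sampling_weights(dataset_sizes, temperature=5.0):
    """Per-task sampling probabilities: temperature=1 reproduces
    size-proportional sampling (which starves low-resource tasks);
    higher temperatures flatten the distribution toward uniform."""
    sizes = np.asarray(dataset_sizes, dtype=float)
    weights = sizes ** (1.0 / temperature)
    return weights / weights.sum()

# One high-resource task and two low-resource tasks.
print(temperature_sampling_weights([1_000_000, 10_000, 5_000]))
# Proportional sampling would give roughly [0.985, 0.010, 0.005];
# at T=5 the low-resource tasks receive a far larger share.
```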
no code implementations • EMNLP 2021 • Dheeru Dua, Cicero Nogueira dos Santos, Patrick Ng, Ben Athiwaratkun, Bing Xiang, Matt Gardner, Sameer Singh
Compositional reasoning tasks, such as multi-hop question answering, require making latent decisions to arrive at the final answer, given a question.
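A toy sketch of those latent decisions in a two-hop question, with an invented fact table standing in for a retriever and reader (names and facts are hypothetical):

```python
# Two-hop toy example: "Who directed the film in which X starred?"
# The bridge entity (the film) is a latent decision that receives
# no direct supervision -- only the final answer does.
facts = {
    "film starring X": "Film Y",
    "director of Film Y": "Director Z",
}

def answer_two_hop(hop1_query, hop2_template):
    bridge = facts[hop1_query]                   # latent decision
    return facts[hop2_template.format(bridge)]   # final, supervised answer

print(answer_two_hop("film starring X", "director of {}"))  # Director Z
```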
no code implementations • EMNLP 2021 • Dheeru Dua, Pradeep Dasigi, Sameer Singh, Matt Gardner
When training most modern reading comprehension models, all the questions associated with a context are treated as independent of each other.
no code implementations • EMNLP 2020 • Qiang Ning, Hao Wu, Pradeep Dasigi, Dheeru Dua, Matt Gardner, Robert L. Logan IV, Ana Marasović, Zhen Nie
High-quality and large-scale data are key to success for AI systems.
no code implementations • 1 Oct 2020 • Matt Gardner, Yoav Artzi, Victoria Basmova, Jonathan Berant, Ben Bogin, Sihao Chen, Pradeep Dasigi, Dheeru Dua, Yanai Elazar, Ananth Gottumukkala, Nitish Gupta, Hanna Hajishirzi, Gabriel Ilharco, Daniel Khashabi, Kevin Lin, Jiangming Liu, Nelson F. Liu, Phoebe Mulcaire, Qiang Ning, Sameer Singh, Noah A. Smith, Sanjay Subramanian, Reut Tsarfaty, Eric Wallace, Ally Zhang, Ben Zhou
Unfortunately, when a dataset has systematic gaps (e.g., annotation artifacts), these evaluations are misleading: a model can learn simple decision rules that perform well on the test set but do not capture a dataset's intended capabilities.
no code implementations • ACL 2020 • Ananth Gottumukkala, Dheeru Dua, Sameer Singh, Matt Gardner
Building general reading comprehension systems capable of solving multiple datasets at the same time is a recent aspirational goal in the research community.
no code implementations • ACL 2020 • Dheeru Dua, Sameer Singh, Matt Gardner
Complex compositional reading comprehension datasets require making latent sequential decisions that are learned via supervision from the final answer.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Matt Gardner, Yoav Artzi, Victoria Basmova, Jonathan Berant, Ben Bogin, Sihao Chen, Pradeep Dasigi, Dheeru Dua, Yanai Elazar, Ananth Gottumukkala, Nitish Gupta, Hanna Hajishirzi, Gabriel Ilharco, Daniel Khashabi, Kevin Lin, Jiangming Liu, Nelson F. Liu, Phoebe Mulcaire, Qiang Ning, Sameer Singh, Noah A. Smith, Sanjay Subramanian, Reut Tsarfaty, Eric Wallace, Ally Zhang, Ben Zhou
Unfortunately, when a dataset has systematic gaps (e.g., annotation artifacts), these evaluations are misleading: a model can learn simple decision rules that perform well on the test set but do not capture a dataset's intended capabilities.
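A minimal sketch of the contrast-set idea, using invented data: each example is paired with a small, meaning-changing perturbation, and a model is credited only when it labels both versions correctly.

```python
# Invented contrast pair: (original, label, perturbation, label).
contrast_pairs = [
    ("A gripping, well-acted thriller.", "positive",
     "A dull, poorly-acted thriller.", "negative"),
]

def contrast_consistency(model, pairs):
    """Fraction of pairs where the model labels BOTH versions correctly."""
    hits = sum(model(o) == yo and model(p) == yp
               for o, yo, p, yp in pairs)
    return hits / len(pairs)

# A decision rule keyed on the word "thriller" handles the original
# example but scores zero on contrast consistency.
keyword_model = lambda text: "positive" if "thriller" in text else "negative"
print(contrast_consistency(keyword_model, contrast_pairs))  # 0.0
```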
no code implementations • 29 Dec 2019 • Dheeru Dua, Ananth Gottumukkala, Alon Talmor, Sameer Singh, Matt Gardner
Many diverse reading comprehension datasets have recently been introduced to study various phenomena in natural language, ranging from simple paraphrase matching and entity typing to entity tracking and understanding the implications of the context.
no code implementations • WS 2019 • Dheeru Dua, Ananth Gottumukkala, Alon Talmor, Sameer Singh, Matt Gardner
Many diverse reading comprehension datasets have recently been introduced to study various phenomena in natural language, ranging from simple paraphrase matching and entity typing to entity tracking and understanding the implications of the context.
no code implementations • NAACL 2019 • Jun Seok Kang, Robert L. Logan IV, Zewei Chu, Yang Chen, Dheeru Dua, Kevin Gimpel, Sameer Singh, Niranjan Balasubramanian
Given a sentence about a target entity, the task is to automatically generate a post-modifier phrase that provides contextually relevant information about the entity (for example, expanding "Barack Obama" into "Barack Obama, the former U.S. president,").
3 code implementations • NAACL 2019 • Dheeru Dua, Yizhong Wang, Pradeep Dasigi, Gabriel Stanovsky, Sameer Singh, Matt Gardner
We introduce a new English reading comprehension benchmark, DROP, which requires Discrete Reasoning Over the content of Paragraphs.
Ranked #14 on Question Answering on DROP Test
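An invented DROP-style example showing why discrete reasoning is needed: the answer must be computed from numbers in the passage rather than extracted as a span.

```python
import re

passage = ("The Broncos scored on a 12-yard touchdown pass "
           "and later added a 35-yard field goal.")
question = "How many yards longer was the field goal than the touchdown pass?"

# The answer (23) never appears as a span in the passage; it must
# be computed from the extracted numbers.
yards = [int(n) for n in re.findall(r"(\d+)-yard", passage)]
print(max(yards) - min(yards))  # 23
```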
1 code implementation • ICLR 2018 • Zhengli Zhao, Dheeru Dua, Sameer Singh
Because of their complex nature, machine learning models are hard to characterize in terms of the ways they can misbehave or be exploited when deployed.
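For illustration only, a generic fast-gradient-sign (FGSM) perturbation sketch in PyTorch; note the paper itself instead searches a GAN's latent space to generate natural adversarial examples, which this does not show.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.03):
    """Perturb input x in the direction that most increases the
    model's loss on the true label y (fast gradient sign method)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    return (x + epsilon * x.grad.sign()).detach()
```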