1 code implementation • 17 Oct 2024 • Shailaja Keyur Sampat, Mutsumi Nakamura, Shankar Kailas, Kartik Aggarwal, Mandy Zhou, Yezhou Yang, Chitta Baral
We show that this benchmark is quite challenging for existing large-scale vision-language models, and we encourage the development of systems with robust visuo-linguistic reasoning capabilities.
1 code implementation • 17 Oct 2024 • Shailaja Keyur Sampat, Yezhou Yang, Chitta Baral
We present baseline results of ActionCOMET over the collected dataset and compare them with the performance of the best existing VQA approaches.
1 code implementation • 17 Oct 2024 • Shailaja Keyur Sampat, Maitreya Patel, Yezhou Yang, Chitta Baral
The ability to learn about new objects from a small amount of visual data, and to produce a convincing linguistic justification for the presence or absence of certain concepts (that collectively compose the object) in novel scenarios, is an important characteristic of human cognition.
1 code implementation • 7 Dec 2022 • Shailaja Keyur Sampat, Pratyay Banerjee, Yezhou Yang, Chitta Baral
'Actions' play a vital role in how humans interact with the world.
no code implementations • 15 Jul 2022 • Shailaja Keyur Sampat, Maitreya Patel, Subhasish Das, Yezhou Yang, Chitta Baral
'Actions' play a vital role in how humans interact with the world and enable them to achieve desired goals.
10 code implementations • 16 Apr 2022 • Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi, Amirreza Mirzaei, Anjana Arunkumar, Arjun Ashok, Arut Selvan Dhanasekaran, Atharva Naik, David Stap, Eshaan Pathak, Giannis Karamanolakis, Haizhi Gary Lai, Ishan Purohit, Ishani Mondal, Jacob Anderson, Kirby Kuznia, Krima Doshi, Maitreya Patel, Kuntal Kumar Pal, Mehrad Moradshahi, Mihir Parmar, Mirali Purohit, Neeraj Varshney, Phani Rohitha Kaza, Pulkit Verma, Ravsehaj Singh Puri, Rushang Karia, Shailaja Keyur Sampat, Savan Doshi, Siddhartha Mishra, Sujan Reddy, Sumanta Patro, Tanay Dixit, Xudong Shen, Chitta Baral, Yejin Choi, Noah A. Smith, Hannaneh Hajishirzi, Daniel Khashabi
This large and diverse collection of tasks enables rigorous benchmarking of cross-task generalization under instructions -- training models to follow instructions on a subset of tasks and evaluating them on the remaining unseen ones.
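The cross-task generalization protocol described above can be sketched as a held-out task split: instruction-tune on one subset of tasks and evaluate on tasks the model has never seen. A minimal sketch, assuming placeholder task identifiers and a hypothetical `split_tasks` helper (neither is from the benchmark itself):

```python
import random

def split_tasks(task_names, num_unseen, seed=0):
    """Partition a task list into seen (training) and unseen (evaluation)
    subsets. A model is trained to follow instructions on the seen tasks
    and evaluated only on the held-out unseen tasks."""
    rng = random.Random(seed)  # fixed seed for a reproducible split
    shuffled = list(task_names)
    rng.shuffle(shuffled)
    unseen = shuffled[:num_unseen]
    seen = shuffled[num_unseen:]
    return seen, unseen

# Placeholder task IDs for illustration only; the real benchmark
# uses a much larger, human-written task collection.
tasks = [f"task_{i:03d}" for i in range(100)]
seen, unseen = split_tasks(tasks, num_unseen=20)
assert not set(seen) & set(unseen)  # evaluation tasks are fully held out
```

The disjointness check is the essential point: because no unseen task contributes any training examples, performance on the unseen split measures generalization from instructions alone rather than memorization of task-specific patterns.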
1 code implementation • NAACL 2021 • Shailaja Keyur Sampat, Akshay Kumar, Yezhou Yang, Chitta Baral
Most existing research on visual question answering (VQA) is limited to information explicitly present in an image or a video.
no code implementations • EACL 2021 • Man Luo, Shailaja Keyur Sampat, Riley Tallman, Yankai Zeng, Manuha Vancha, Akarshan Sajja, Chitta Baral
GQA (Hudson and Manning, 2019) is a dataset for real-world visual reasoning and compositional question answering.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Shailaja Keyur Sampat, Yezhou Yang, Chitta Baral
Understanding images and text together is an important aspect of cognition and of building advanced Artificial Intelligence (AI) systems.