Search Results for author: Chitta Baral

Found 128 papers, 46 papers with code

Deeply Embedded Knowledge Representation & Reasoning For Natural Language Question Answering: A Practitioner’s Perspective

no code implementations EMNLP (spnlp) 2020 Arindam Mitra, Sanjay Narayana, Chitta Baral

Successful application of Knowledge Representation and Reasoning (KR) in Natural Language Understanding (NLU) is largely limited by the availability of a robust and general purpose natural language parser.

Natural Language Understanding Question Answering

To Find Waldo You Need Contextual Cues: Debiasing Who’s Waldo

1 code implementation ACL 2022 Yiran Luo, Pratyay Banerjee, Tejas Gokhale, Yezhou Yang, Chitta Baral

We find that the original Who’s Waldo dataset compiled for this task contains a large number of biased samples that are solvable simply by heuristic methods; for instance, in many cases the first name in the sentence corresponds to the largest bounding box, or the sequence of names in the sentence corresponds to an exact left-to-right order in the image.

Benchmarking Person-centric Visual Grounding +1

Reframing Instructional Prompts to GPTk’s Language

no code implementations Findings (ACL) 2022 Daniel Khashabi, Chitta Baral, Yejin Choi, Hannaneh Hajishirzi

Our experiments compare the zero-shot and few-shot performance of LMs prompted with reframed instructions on 12 NLP tasks across 6 categories.

Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models

1 code implementation 23 Apr 2024 Mihir Parmar, Nisarg Patel, Neeraj Varshney, Mutsumi Nakamura, Man Luo, Santosh Mashetty, Arindam Mitra, Chitta Baral

Existing work investigating this reasoning ability of LLMs has focused only on a couple of inference rules (such as modus ponens and modus tollens) of propositional and first-order logic.

Logical Reasoning Question Answering

Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks

no code implementations 23 Apr 2024 Amir Saeidi, Shivanshu Verma, Chitta Baral

Key observations reveal that alignment methods achieve optimal performance with smaller training data subsets, exhibit limited effectiveness in reasoning tasks yet significantly impact mathematical problem-solving, and employing an instruction-tuned model notably influences truthfulness.

Question Answering

On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation

1 code implementation 12 Apr 2024 Agneet Chatterjee, Tejas Gokhale, Chitta Baral, Yezhou Yang

Recent advances in monocular depth estimation have been made by incorporating natural language as additional guidance.

Monocular Depth Estimation

Getting it Right: Improving Spatial Consistency in Text-to-Image Models

1 code implementation 1 Apr 2024 Agneet Chatterjee, Gabriela Ben Melech Stan, Estelle Aflalo, Sayak Paul, Dhruba Ghosh, Tejas Gokhale, Ludwig Schmidt, Hannaneh Hajishirzi, Vasudev Lal, Chitta Baral, Yezhou Yang

One of the key shortcomings in current text-to-image (T2I) models is their inability to consistently generate images which faithfully follow the spatial relationships specified in the text prompt.

Lost in Translation? Translation Errors and Challenges for Fair Assessment of Text-to-Image Models on Multilingual Concepts

no code implementations 17 Mar 2024 Michael Saxon, Yiran Luo, Sharon Levy, Chitta Baral, Yezhou Yang, William Yang Wang

Benchmarks of the multilingual capabilities of text-to-image (T2I) models compare generated images prompted in a test language to an expected image distribution over a concept set.

Translation

Jailbreaking Proprietary Large Language Models using Word Substitution Cipher

no code implementations 16 Feb 2024 Divij Handa, Advait Chirmule, Bimal Gajera, Chitta Baral

We first present a pilot study on the state-of-the-art LLM, GPT-4, in decoding several safe sentences that have been encrypted using various cryptographic techniques and find that a straightforward word substitution cipher can be decoded most effectively.

$λ$-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space

no code implementations 7 Feb 2024 Maitreya Patel, Sangmin Jung, Chitta Baral, Yezhou Yang

While LDMs offer distinct advantages, P-T2I methods' reliance on the latent space of these diffusion models significantly escalates resource demands, leading to inconsistent results and necessitating numerous iterations for a single desired image.

Concept Alignment Philosophy

The Art of Defending: A Systematic Evaluation and Analysis of LLM Defense Strategies on Safety and Over-Defensiveness

no code implementations 30 Dec 2023 Neeraj Varshney, Pavel Dolin, Agastya Seth, Chitta Baral

As Large Language Models (LLMs) play an increasingly pivotal role in natural language processing applications, their safety concerns become critical areas of NLP research.

ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations

no code implementations 7 Dec 2023 Maitreya Patel, Changhoon Kim, Sheng Cheng, Chitta Baral, Yezhou Yang

The T2I prior model alone adds a billion parameters compared to the Latent Diffusion Models, which increases the computational and high-quality data requirements.

Contrastive Learning

LongBoX: Evaluating Transformers on Long-Sequence Clinical Tasks

1 code implementation 16 Nov 2023 Mihir Parmar, Aakanksha Naik, Himanshu Gupta, Disha Agrawal, Chitta Baral

Assessing these models on long sequences is crucial since prior work in the general domain has demonstrated performance degradation of LLMs on longer texts.

Accelerating LLaMA Inference by Enabling Intermediate Layer Decoding via Instruction Tuning with LITE

no code implementations 28 Oct 2023 Neeraj Varshney, Agneet Chatterjee, Mihir Parmar, Chitta Baral

Large Language Models (LLMs) have achieved remarkable performance across a wide variety of natural language tasks; however, their large size makes their inference slow and computationally expensive.

Semantic Similarity Semantic Textual Similarity +1

TarGEN: Targeted Data Generation with Large Language Models

1 code implementation 27 Oct 2023 Himanshu Gupta, Kevin Scaria, Ujjwala Anantheswaran, Shreyas Verma, Mihir Parmar, Saurabh Arjun Sawant, Chitta Baral, Swaroop Mishra

Finally, when pre-finetuned on our synthetic SuperGLUE dataset, T5-3B yields impressive results on the OpenLLM leaderboard, surpassing the model trained on the Self-Instruct dataset by 4.14% points.

InstructExcel: A Benchmark for Natural Language Instruction in Excel

no code implementations 23 Oct 2023 Justin Payan, Swaroop Mishra, Mukul Singh, Carina Negreanu, Christian Poelitz, Chitta Baral, Subhro Roy, Rasika Chakravarthy, Benjamin Van Durme, Elnaz Nouri

With the evolution of Large Language Models (LLMs) we can solve increasingly more complex NLP tasks across various domains, including spreadsheets.

Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models

no code implementations 2 Oct 2023 Man Luo, Shrinidhi Kumbhar, Ming Shen, Mihir Parmar, Neeraj Varshney, Pratyay Banerjee, Somak Aditya, Chitta Baral

This work strives to understand the proficiency of LLMs in logical reasoning by offering a brief review of the latest progress in this area, with a focus on the logical reasoning datasets, tasks, and the methods adopted to utilize LLMs for reasoning.

Knowledge Distillation Language Modelling +1

Can NLP Models 'Identify', 'Distinguish', and 'Justify' Questions that Don't have a Definitive Answer?

no code implementations 8 Sep 2023 Ayushi Agarwal, Nisarg Patel, Neeraj Varshney, Mihir Parmar, Pavan Mallina, Aryan Bhavin Shah, Srihari Raju Sangaraju, Tirth Patel, Nihar Thakkar, Chitta Baral

Though state-of-the-art (SOTA) NLP systems have achieved remarkable performance on a variety of language understanding tasks, they primarily focus on questions that have a correct and a definitive answer.

Language-Conditioned Change-point Detection to Identify Sub-Tasks in Robotics Domains

no code implementations 1 Sep 2023 Divyanshu Raj, Chitta Baral, Nakul Gopalan

In this work, we present an approach to identify sub-tasks within a demonstrated robot trajectory using language instructions.

Change Point Detection Instruction Following +3

MDDial: A Multi-turn Differential Diagnosis Dialogue Dataset with Reliability Evaluation

1 code implementation 16 Aug 2023 Srija Macherla, Man Luo, Mihir Parmar, Chitta Baral

We introduce a unified score for the ADD system that takes into account the interplay between symptoms and diagnosis.

Natural Language Understanding

ConceptBed: Evaluating Concept Learning Abilities of Text-to-Image Diffusion Models

1 code implementation 7 Jun 2023 Maitreya Patel, Tejas Gokhale, Chitta Baral, Yezhou Yang

To quantify the ability of T2I models in learning and synthesizing novel visual concepts (a.k.a.

Concept Alignment

End-to-end Knowledge Retrieval with Multi-modal Queries

1 code implementation 1 Jun 2023 Man Luo, Zhiyuan Fang, Tejas Gokhale, Yezhou Yang, Chitta Baral

We investigate knowledge retrieval with multi-modal queries, i.e., queries containing information split across image and text inputs, a challenging task that differs from previous work on cross-modal retrieval.

Benchmarking Cross-Modal Retrieval +2

EDM3: Event Detection as Multi-task Text Generation

1 code implementation 25 May 2023 Ujjwala Anantheswaran, Himanshu Gupta, Mihir Parmar, Kuntal Kumar Pal, Chitta Baral

We show that EDM3 helps to learn transferable knowledge that can be leveraged to perform Event Detection and its subtasks concurrently, mitigating the error propagation inherent in pipelined approaches.

Event Detection Sentence +1

Dr.ICL: Demonstration-Retrieved In-context Learning

no code implementations 23 May 2023 Man Luo, Xin Xu, Zhuyun Dai, Panupong Pasupat, Mehran Kazemi, Chitta Baral, Vaiva Imbrasaite, Vincent Y Zhao

In-context learning (ICL), teaching a large language model (LLM) to perform a task with few-shot demonstrations rather than adjusting the model parameters, has emerged as a strong paradigm for using LLMs.

In-Context Learning Language Modelling +2

Instruction Tuned Models are Quick Learners

1 code implementation 17 May 2023 Himanshu Gupta, Saurabh Arjun Sawant, Swaroop Mishra, Mutsumi Nakamura, Arindam Mitra, Santosh Mashetty, Chitta Baral

In the MTL setting, an instruction tuned model trained on only 6% of downstream training data achieves SOTA, while using 100% of the training data results in a 3.69% points improvement (ROUGE-L 74.68) over the previous SOTA.

In-Context Learning Multi-Task Learning +1

A Unified Evaluation Framework for Novelty Detection and Accommodation in NLP with an Instantiation in Authorship Attribution

no code implementations 8 May 2023 Neeraj Varshney, Himanshu Gupta, Eric Robertson, Bing Liu, Chitta Baral

To initiate systematic research in this important area of 'dealing with novelties', we introduce 'NoveltyTask', a multi-stage task to evaluate a system's performance on pipelined novelty 'detection' and 'accommodation' tasks.

Authorship Attribution Novelty Detection

Post-Abstention: Towards Reliably Re-Attempting the Abstained Instances in QA

no code implementations 2 May 2023 Neeraj Varshney, Chitta Baral

Despite remarkable progress made in natural language processing, even the state-of-the-art models often make incorrect predictions.

Prompt-Based Learning for Thread Structure Prediction in Cybersecurity Forums

no code implementations 5 Mar 2023 Kazuaki Kashihara, Kuntal Kumar Pal, Chitta Baral, Robert P Trevino

We propose a method called Next Paragraph Prediction with Instructional Prompting (NPP-IP) to predict thread structures while grounded on the context around posts.

Methods and Mechanisms for Interactive Novelty Handling in Adversarial Environments

no code implementations 28 Feb 2023 Tung Thai, Ming Shen, Mayank Garg, Ayush Kalani, Nakul Vaidya, Utkarsh Soni, Mudit Verma, Sriram Gopalakrishnan, Neeraj Varshney, Chitta Baral, Subbarao Kambhampati, Jivko Sinapov, Matthias Scheutz

Learning to detect, characterize and accommodate novelties is a challenge that agents operating in open-world domains need to address to be able to guarantee satisfactory task performance.

Novelty Detection

Real-Time Visual Feedback to Guide Benchmark Creation: A Human-and-Metric-in-the-Loop Workflow

1 code implementation 9 Feb 2023 Anjana Arunkumar, Swaroop Mishra, Bhavdeep Sachdeva, Chitta Baral, Chris Bryan

In pursuit of creating better benchmarks, we propose VAIDA, a novel benchmark creation paradigm for NLP, that focuses on guiding crowdworkers, an under-explored facet of addressing benchmark idiosyncrasies.

Lexi: Self-Supervised Learning of the UI Language

1 code implementation 23 Jan 2023 Pratyay Banerjee, Shweti Mahajan, Kushal Arora, Chitta Baral, Oriana Riva

Along with text, these resources include visual content such as UI screenshots and images of application icons referenced in the text.

Image Retrieval Language Modelling +2

Benchmarking Spatial Relationships in Text-to-Image Generation

1 code implementation 20 Dec 2022 Tejas Gokhale, Hamid Palangi, Besmira Nushi, Vibhav Vineet, Eric Horvitz, Ece Kamar, Chitta Baral, Yezhou Yang

We investigate the ability of T2I models to generate correct spatial relationships among objects and present VISOR, an evaluation metric that captures how accurately the spatial relationship described in text is generated in the image.

Benchmarking Text-to-Image Generation

Can Open-Domain QA Reader Utilize External Knowledge Efficiently like Humans?

no code implementations 23 Nov 2022 Neeraj Varshney, Man Luo, Chitta Baral

Comparing with the FiD reader, this approach matches its accuracy by utilizing just 18.32% of its reader inference cost and also outperforms it by achieving up to 55.10% accuracy on NQ Open.

Open-Domain Question Answering TriviaQA

A Survey of Parameters Associated with the Quality of Benchmarks in NLP

no code implementations 14 Oct 2022 Swaroop Mishra, Anjana Arunkumar, Chris Bryan, Chitta Baral

Inspired by successful quality indices in several domains such as power, food, and water, we take the first step towards a metric by identifying certain language properties that can represent various possible interactions leading to biases in a benchmark.

Benchmarking

Hardness of Samples Need to be Quantified for a Reliable Evaluation System: Exploring Potential Opportunities with a New Task

no code implementations 14 Oct 2022 Swaroop Mishra, Anjana Arunkumar, Chris Bryan, Chitta Baral

Evaluation of models on benchmarks is unreliable without knowing the degree of sample hardness; this subsequently overestimates the capability of AI systems and limits their adoption in real world applications.

Semantic Textual Similarity STS

Pretrained Transformers Do not Always Improve Robustness

no code implementations 14 Oct 2022 Swaroop Mishra, Bhavdeep Singh Sachdeva, Chitta Baral

Pretrained Transformers (PT) have been shown to improve Out of Distribution (OOD) robustness over traditional models such as Bag of Words (BOW), LSTMs, and Convolutional Neural Networks (CNN) powered by Word2Vec and GloVe embeddings.

Model Cascading: Towards Jointly Improving Efficiency and Accuracy of NLP Systems

no code implementations 11 Oct 2022 Neeraj Varshney, Chitta Baral

Through comprehensive experiments in multiple task settings that differ in the number of models available for cascading (K value), we show that cascading improves both the computational efficiency and the prediction accuracy.

Computational Efficiency

Investigating the Failure Modes of the AUC metric and Exploring Alternatives for Evaluating Systems in Safety Critical Applications

no code implementations 10 Oct 2022 Swaroop Mishra, Anjana Arunkumar, Chitta Baral

We find limitations in AUC; e.g., a model having higher AUC is not always better in performing selective answering.

A Study on the Efficiency and Generalization of Light Hybrid Retrievers

no code implementations 4 Oct 2022 Man Luo, Shashank Jain, Anchit Gupta, Arash Einolghozati, Barlas Oguz, Debojeet Chatterjee, Xilun Chen, Chitta Baral, Peyman Heidari

Driven by this question, we leverage an indexing-efficient dense retriever (i.e., DrBoost) and introduce a LITE retriever that further reduces the memory of DrBoost.

Adversarial Attack Contrastive Learning +1

Reasoning about Actions over Visual and Linguistic Modalities: A Survey

no code implementations 15 Jul 2022 Shailaja Keyur Sampat, Maitreya Patel, Subhasish Das, Yezhou Yang, Chitta Baral

'Actions' play a vital role in how humans interact with the world and enable them to achieve desired goals.

Common Sense Reasoning

BioTABQA: Instruction Learning for Biomedical Table Question Answering

no code implementations 6 Jul 2022 Man Luo, Sharad Saxena, Swaroop Mishra, Mihir Parmar, Chitta Baral

To the best of our knowledge, no TQA datasets exist in the biomedical domain, where tables are frequently used to present information.

Question Answering

Improving Diversity with Adversarially Learned Transformations for Domain Generalization

1 code implementation 15 Jun 2022 Tejas Gokhale, Rushil Anirudh, Jayaraman J. Thiagarajan, Bhavya Kailkhura, Chitta Baral, Yezhou Yang

To be successful in single source domain generalization, maximizing diversity of synthesized domains has emerged as one of the most effective strategies.

Domain Generalization

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

3 code implementations9 Jun 2022 Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, Ambrose Slone, Ameet Rahane, Anantharaman S. Iyer, Anders Andreassen, Andrea Madotto, Andrea Santilli, Andreas Stuhlmüller, Andrew Dai, Andrew La, Andrew Lampinen, Andy Zou, Angela Jiang, Angelica Chen, Anh Vuong, Animesh Gupta, Anna Gottardi, Antonio Norelli, Anu Venkatesh, Arash Gholamidavoodi, Arfa Tabassum, Arul Menezes, Arun Kirubarajan, Asher Mullokandov, Ashish Sabharwal, Austin Herrick, Avia Efrat, Aykut Erdem, Ayla Karakaş, B. Ryan Roberts, Bao Sheng Loe, Barret Zoph, Bartłomiej Bojanowski, Batuhan Özyurt, Behnam Hedayatnia, Behnam Neyshabur, Benjamin Inden, Benno Stein, Berk Ekmekci, Bill Yuchen Lin, Blake Howald, Bryan Orinion, Cameron Diao, Cameron Dour, Catherine Stinson, Cedrick Argueta, César Ferri Ramírez, Chandan Singh, Charles Rathkopf, Chenlin Meng, Chitta Baral, Chiyu Wu, Chris Callison-Burch, Chris Waites, Christian Voigt, Christopher D. Manning, Christopher Potts, Cindy Ramirez, Clara E. 
Rivera, Clemencia Siro, Colin Raffel, Courtney Ashcraft, Cristina Garbacea, Damien Sileo, Dan Garrette, Dan Hendrycks, Dan Kilman, Dan Roth, Daniel Freeman, Daniel Khashabi, Daniel Levy, Daniel Moseguí González, Danielle Perszyk, Danny Hernandez, Danqi Chen, Daphne Ippolito, Dar Gilboa, David Dohan, David Drakard, David Jurgens, Debajyoti Datta, Deep Ganguli, Denis Emelin, Denis Kleyko, Deniz Yuret, Derek Chen, Derek Tam, Dieuwke Hupkes, Diganta Misra, Dilyar Buzan, Dimitri Coelho Mollo, Diyi Yang, Dong-Ho Lee, Dylan Schrader, Ekaterina Shutova, Ekin Dogus Cubuk, Elad Segal, Eleanor Hagerman, Elizabeth Barnes, Elizabeth Donoway, Ellie Pavlick, Emanuele Rodola, Emma Lam, Eric Chu, Eric Tang, Erkut Erdem, Ernie Chang, Ethan A. Chi, Ethan Dyer, Ethan Jerzak, Ethan Kim, Eunice Engefu Manyasi, Evgenii Zheltonozhskii, Fanyue Xia, Fatemeh Siar, Fernando Martínez-Plumed, Francesca Happé, Francois Chollet, Frieda Rong, Gaurav Mishra, Genta Indra Winata, Gerard de Melo, Germán Kruszewski, Giambattista Parascandolo, Giorgio Mariani, Gloria Wang, Gonzalo Jaimovitch-López, Gregor Betz, Guy Gur-Ari, Hana Galijasevic, Hannah Kim, Hannah Rashkin, Hannaneh Hajishirzi, Harsh Mehta, Hayden Bogar, Henry Shevlin, Hinrich Schütze, Hiromu Yakura, Hongming Zhang, Hugh Mee Wong, Ian Ng, Isaac Noble, Jaap Jumelet, Jack Geissinger, Jackson Kernion, Jacob Hilton, Jaehoon Lee, Jaime Fernández Fisac, James B. Simon, James Koppel, James Zheng, James Zou, Jan Kocoń, Jana Thompson, Janelle Wingfield, Jared Kaplan, Jarema Radom, Jascha Sohl-Dickstein, Jason Phang, Jason Wei, Jason Yosinski, Jekaterina Novikova, Jelle Bosscher, Jennifer Marsh, Jeremy Kim, Jeroen Taal, Jesse Engel, Jesujoba Alabi, Jiacheng Xu, Jiaming Song, Jillian Tang, Joan Waweru, John Burden, John Miller, John U. Balis, Jonathan Batchelder, Jonathan Berant, Jörg Frohberg, Jos Rozen, Jose Hernandez-Orallo, Joseph Boudeman, Joseph Guerr, Joseph Jones, Joshua B. Tenenbaum, Joshua S. 
Rule, Joyce Chua, Kamil Kanclerz, Karen Livescu, Karl Krauth, Karthik Gopalakrishnan, Katerina Ignatyeva, Katja Markert, Kaustubh D. Dhole, Kevin Gimpel, Kevin Omondi, Kory Mathewson, Kristen Chiafullo, Ksenia Shkaruta, Kumar Shridhar, Kyle McDonell, Kyle Richardson, Laria Reynolds, Leo Gao, Li Zhang, Liam Dugan, Lianhui Qin, Lidia Contreras-Ochando, Louis-Philippe Morency, Luca Moschella, Lucas Lam, Lucy Noble, Ludwig Schmidt, Luheng He, Luis Oliveros Colón, Luke Metz, Lütfi Kerem Şenel, Maarten Bosma, Maarten Sap, Maartje ter Hoeve, Maheen Farooqi, Manaal Faruqui, Mantas Mazeika, Marco Baturan, Marco Marelli, Marco Maru, Maria Jose Ramírez Quintana, Marie Tolkiehn, Mario Giulianelli, Martha Lewis, Martin Potthast, Matthew L. Leavitt, Matthias Hagen, Mátyás Schubert, Medina Orduna Baitemirova, Melody Arnaud, Melvin McElrath, Michael A. Yee, Michael Cohen, Michael Gu, Michael Ivanitskiy, Michael Starritt, Michael Strube, Michał Swędrowski, Michele Bevilacqua, Michihiro Yasunaga, Mihir Kale, Mike Cain, Mimee Xu, Mirac Suzgun, Mitch Walker, Mo Tiwari, Mohit Bansal, Moin Aminnaseri, Mor Geva, Mozhdeh Gheini, Mukund Varma T, Nanyun Peng, Nathan A. Chi, Nayeon Lee, Neta Gur-Ari Krakover, Nicholas Cameron, Nicholas Roberts, Nick Doiron, Nicole Martinez, Nikita Nangia, Niklas Deckers, Niklas Muennighoff, Nitish Shirish Keskar, Niveditha S. Iyer, Noah Constant, Noah Fiedel, Nuan Wen, Oliver Zhang, Omar Agha, Omar Elbaghdadi, Omer Levy, Owain Evans, Pablo Antonio Moreno Casares, Parth Doshi, Pascale Fung, Paul Pu Liang, Paul Vicol, Pegah Alipoormolabashi, Peiyuan Liao, Percy Liang, Peter Chang, Peter Eckersley, Phu Mon Htut, Pinyu Hwang, Piotr Miłkowski, Piyush Patil, Pouya Pezeshkpour, Priti Oli, Qiaozhu Mei, Qing Lyu, Qinlang Chen, Rabin Banjade, Rachel Etta Rudolph, Raefer Gabriel, Rahel Habacker, Ramon Risco, Raphaël Millière, Rhythm Garg, Richard Barnes, Rif A. 
Saurous, Riku Arakawa, Robbe Raymaekers, Robert Frank, Rohan Sikand, Roman Novak, Roman Sitelew, Ronan LeBras, Rosanne Liu, Rowan Jacobs, Rui Zhang, Ruslan Salakhutdinov, Ryan Chi, Ryan Lee, Ryan Stovall, Ryan Teehan, Rylan Yang, Sahib Singh, Saif M. Mohammad, Sajant Anand, Sam Dillavou, Sam Shleifer, Sam Wiseman, Samuel Gruetter, Samuel R. Bowman, Samuel S. Schoenholz, Sanghyun Han, Sanjeev Kwatra, Sarah A. Rous, Sarik Ghazarian, Sayan Ghosh, Sean Casey, Sebastian Bischoff, Sebastian Gehrmann, Sebastian Schuster, Sepideh Sadeghi, Shadi Hamdan, Sharon Zhou, Shashank Srivastava, Sherry Shi, Shikhar Singh, Shima Asaadi, Shixiang Shane Gu, Shubh Pachchigar, Shubham Toshniwal, Shyam Upadhyay, Shyamolima, Debnath, Siamak Shakeri, Simon Thormeyer, Simone Melzi, Siva Reddy, Sneha Priscilla Makini, Soo-Hwan Lee, Spencer Torene, Sriharsha Hatwar, Stanislas Dehaene, Stefan Divic, Stefano Ermon, Stella Biderman, Stephanie Lin, Stephen Prasad, Steven T. Piantadosi, Stuart M. Shieber, Summer Misherghi, Svetlana Kiritchenko, Swaroop Mishra, Tal Linzen, Tal Schuster, Tao Li, Tao Yu, Tariq Ali, Tatsu Hashimoto, Te-Lin Wu, Théo Desbordes, Theodore Rothschild, Thomas Phan, Tianle Wang, Tiberius Nkinyili, Timo Schick, Timofei Kornev, Titus Tunduny, Tobias Gerstenberg, Trenton Chang, Trishala Neeraj, Tushar Khot, Tyler Shultz, Uri Shaham, Vedant Misra, Vera Demberg, Victoria Nyamai, Vikas Raunak, Vinay Ramasesh, Vinay Uday Prabhu, Vishakh Padmakumar, Vivek Srikumar, William Fedus, William Saunders, William Zhang, Wout Vossen, Xiang Ren, Xiaoyu Tong, Xinran Zhao, Xinyi Wu, Xudong Shen, Yadollah Yaghoobzadeh, Yair Lakretz, Yangqiu Song, Yasaman Bahri, Yejin Choi, Yichi Yang, Yiding Hao, Yifu Chen, Yonatan Belinkov, Yu Hou, Yufang Hou, Yuntao Bai, Zachary Seid, Zhuoye Zhao, Zijian Wang, Zijie J. Wang, ZiRui Wang, Ziyi Wu

BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models.

Common Sense Reasoning Math +1

Is a Question Decomposition Unit All We Need?

1 code implementation 25 May 2022 Pruthvi Patel, Swaroop Mishra, Mihir Parmar, Chitta Baral

Large Language Models (LMs) have achieved state-of-the-art performance on many Natural Language Processing (NLP) benchmarks.

Let the Model Decide its Curriculum for Multitask Learning

no code implementations DeepLo 2022 Neeraj Varshney, Swaroop Mishra, Chitta Baral

Curriculum learning strategies in prior multi-task learning approaches arrange datasets in a difficulty hierarchy either based on human perception or by exhaustively searching the optimal arrangement.

Multi-Task Learning

Don't Blame the Annotator: Bias Already Starts in the Annotation Instructions

no code implementations 1 May 2022 Mihir Parmar, Swaroop Mishra, Mor Geva, Chitta Baral

In this work, we hypothesize that annotators pick up on patterns in the crowdsourcing instructions, which bias them to write many similar examples that are then over-represented in the collected data.

In-BoXBART: Get Instructions into Biomedical Multi-Task Learning

2 code implementations Findings (NAACL) 2022 Mihir Parmar, Swaroop Mishra, Mirali Purohit, Man Luo, M. Hassan Murad, Chitta Baral

Recently, instructional prompts have shown significant improvement towards multi-task generalization; however, the effect of instructional prompts and Multi-Task Learning (MTL) has not been systematically studied in the biomedical domain.

Few-Shot Learning Multi-Task Learning

To Find Waldo You Need Contextual Cues: Debiasing Who's Waldo

1 code implementation 30 Mar 2022 Yiran Luo, Pratyay Banerjee, Tejas Gokhale, Yezhou Yang, Chitta Baral

We find that the original Who's Waldo dataset compiled for this task contains a large number of biased samples that are solvable simply by heuristic methods; for instance, in many cases the first name in the sentence corresponds to the largest bounding box, or the sequence of names in the sentence corresponds to an exact left-to-right order in the image.

Benchmarking Person-centric Visual Grounding +1

How Many Data Samples is an Additional Instruction Worth?

1 code implementation 17 Mar 2022 Ravsehaj Singh Puri, Swaroop Mishra, Mihir Parmar, Chitta Baral

However, they can write alternate instructions to represent an instruction task.

Choose Your QA Model Wisely: A Systematic Study of Generative and Extractive Readers for Question Answering

no code implementations SpaNLP (ACL) 2022 Man Luo, Kazuma Hashimoto, Semih Yavuz, Zhiwei Liu, Chitta Baral, Yingbo Zhou

Among several interesting findings, it is important to highlight that (1) the generative readers perform better in long context QA, (2) the extractive readers perform better in short context while also showing better out-of-domain generalization, and (3) the encoder of encoder-decoder PrLMs (e.g., T5) turns out to be a strong extractive reader and outperforms the standard choice of encoder-only PrLMs (e.g., RoBERTa).

Domain Generalization Multi-Task Learning +1

ILDAE: Instance-Level Difficulty Analysis of Evaluation Data

1 code implementation ACL 2022 Neeraj Varshney, Swaroop Mishra, Chitta Baral

Knowledge of questions' difficulty level helps a teacher in several ways, such as estimating students' potential quickly by asking carefully selected questions and improving quality of examination by modifying trivial and hard questions.

Improving Biomedical Information Retrieval with Neural Retrievers

no code implementations 19 Jan 2022 Man Luo, Arindam Mitra, Tejas Gokhale, Chitta Baral

We show that BM25 and our method can complement each other, and a simple hybrid model leads to further gains in the large corpus setting.

Biomedical Information Retrieval Information Retrieval +4

Unsupervised Natural Language Inference Using PHL Triplet Generation

1 code implementation Findings (ACL) 2022 Neeraj Varshney, Pratyay Banerjee, Tejas Gokhale, Chitta Baral

Transformer-based models achieve impressive performance on numerous Natural Language Inference (NLI) benchmarks when trained on respective training datasets.

Natural Language Inference Sentence

A Bayesian Approach for Medical Inquiry and Disease Inference in Automated Differential Diagnosis

1 code implementation 15 Oct 2021 Hong Guan, Chitta Baral

Unlike previous work that simulates data from given probabilities and uses ML algorithms on them, we directly use the Quick Medical Reference (QMR) belief network, and apply Bayesian inference in the inference phase and Bayesian experimental design in the inquiry phase.

Bayesian Inference Experimental Design

Semantically Distributed Robust Optimization for Vision-and-Language Inference

1 code implementation Findings (ACL) 2022 Tejas Gokhale, Abhishek Chaudhary, Pratyay Banerjee, Chitta Baral, Yezhou Yang

Analysis of vision-and-language models has revealed their brittleness under linguistic phenomena such as paraphrasing, negation, textual entailment, and word substitutions with synonyms or antonyms.

Data Augmentation Natural Language Inference +2

A Simple Approach to Jointly Rank Passages and Select Relevant Sentences in the OBQA Context

no code implementations NAACL (ACL) 2022 Man Luo, Shuguang Chen, Chitta Baral

Furthermore, we propose consistency and similarity constraints to promote the correlation and interaction between passage ranking and sentence selection. The experiments demonstrate that our framework can achieve competitive results with previous systems and outperform the baseline by 28% in terms of exact matching of relevant sentences on the HotpotQA dataset.

Passage Ranking Question Answering +1

Reframing Instructional Prompts to GPTk's Language

no code implementations 16 Sep 2021 Swaroop Mishra, Daniel Khashabi, Chitta Baral, Yejin Choi, Hannaneh Hajishirzi

Our experiments compare the zero-shot and few-shot performance of LMs prompted with reframed instructions on 12 NLP tasks across 6 categories.

Few-Shot Learning Question Generation +1

Investigating Numeracy Learning Ability of a Text-to-Text Transfer Model

1 code implementation Findings (EMNLP) 2021 Kuntal Kumar Pal, Chitta Baral

Some possible reasons can be the tokenizers and pre-training objectives which are not specifically designed to learn and preserve numeracy.

Transfer Learning

Weakly-Supervised Visual-Retriever-Reader for Knowledge-based Question Answering

1 code implementation EMNLP 2021 Man Luo, Yankai Zeng, Pratyay Banerjee, Chitta Baral

The visual retriever aims to retrieve relevant knowledge, and the visual reader seeks to predict answers based on given knowledge.

Question Answering Retrieval +1

Weakly Supervised Relative Spatial Reasoning for Visual Question Answering

no code implementations ICCV 2021 Pratyay Banerjee, Tejas Gokhale, Yezhou Yang, Chitta Baral

In this work, we evaluate the faithfulness of V&L models to such geometric understanding, by formulating the prediction of pair-wise relative locations of objects as a classification as well as a regression task.

Question Answering Visual Question Answering +1

Interviewer-Candidate Role Play: Towards Developing Real-World NLP Systems

no code implementations 1 Jul 2021 Neeraj Varshney, Swaroop Mishra, Chitta Baral

However, our task leaves a significant challenge for NLP researchers to further improve OOD performance at each stage.

Natural Language Inference

Commonsense Reasoning with Implicit Knowledge in Natural Language

no code implementations AKBC 2021 Pratyay Banerjee, Swaroop Mishra, Kuntal Kumar Pal, Arindam Mitra, Chitta Baral

Two common approaches to this are (i) Use of well-structured commonsense present in knowledge graphs, and (ii) Use of progressively larger transformer language models.

Knowledge Graphs

Constructing Flow Graphs from Procedural Cybersecurity Texts

1 code implementation Findings (ACL) 2021 Kuntal Kumar Pal, Kazuaki Kashihara, Pratyay Banerjee, Swaroop Mishra, Ruoyu Wang, Chitta Baral

We must read the whole text to identify the relevant information or identify the instruction flows to complete a task, which is prone to failures.

Sentence Sentence Embeddings

Unsupervised Pronoun Resolution via Masked Noun-Phrase Prediction

no code implementations ACL 2021 Ming Shen, Pratyay Banerjee, Chitta Baral

In this work, we propose Masked Noun-Phrase Prediction (MNPP), a pre-training strategy to tackle pronoun resolution in a fully unsupervised setting.

Cross-Task Generalization via Natural Language Crowdsourcing Instructions

3 code implementations ACL 2022 Swaroop Mishra, Daniel Khashabi, Chitta Baral, Hannaneh Hajishirzi

Using this meta-dataset, we measure cross-task generalization by training models on seen tasks and measuring generalization to the remaining unseen ones.

Question Answering

Variable Name Recovery in Decompiled Binary Code using Constrained Masked Language Modeling

no code implementations 23 Mar 2021 Pratyay Banerjee, Kuntal Kumar Pal, Fish Wang, Chitta Baral

Inspired by recent advances in natural language processing, we propose a novel solution to infer variable names in decompiled code based on Masked Language Modeling, Byte-Pair Encoding, and neural architectures such as Transformers and BERT.

Language Modelling Masked Language Modeling

Self-Supervised Test-Time Learning for Reading Comprehension

no code implementations NAACL 2021 Pratyay Banerjee, Tejas Gokhale, Chitta Baral

Recent work on unsupervised question answering has shown that models can be trained with procedurally generated question-answer pairs and can achieve performance competitive with supervised methods.

Question Answering Reading Comprehension

Can Transformers Reason About Effects of Actions?

no code implementations 17 Dec 2020 Pratyay Banerjee, Chitta Baral, Man Luo, Arindam Mitra, Kuntal Pal, Tran C. Son, Neeraj Varshney

A recent work has shown that transformers are able to "reason" with facts and rules in a limited setting where the rules are natural language expressions of conjunctions of conditions implying a conclusion.

Common Sense Reasoning Question Answering

WeaQA: Weak Supervision via Captions for Visual Question Answering

no code implementations Findings (ACL) 2021 Pratyay Banerjee, Tejas Gokhale, Yezhou Yang, Chitta Baral

Methodologies for training visual question answering (VQA) models assume the availability of datasets with human-annotated Image-Question-Answer (I-Q-A) triplets.

Question Answering Visual Question Answering

Attribute-Guided Adversarial Training for Robustness to Natural Perturbations

3 code implementations 3 Dec 2020 Tejas Gokhale, Rushil Anirudh, Bhavya Kailkhura, Jayaraman J. Thiagarajan, Chitta Baral, Yezhou Yang

While this deviation may not be exactly known, its broad characterization is specified a priori, in terms of attributes.

Attribute

MUTANT: A Training Paradigm for Out-of-Distribution Generalization in Visual Question Answering

2 code implementations EMNLP 2020 Tejas Gokhale, Pratyay Banerjee, Chitta Baral, Yezhou Yang

In this paper, we present MUTANT, a training paradigm that exposes the model to perceptually similar, yet semantically distinct mutations of the input, to improve OOD generalization, such as the VQA-CP challenge.

Out-of-Distribution Generalization Question Answering +1

Multi-Perspective Semantic Information Retrieval

no code implementations 3 Sep 2020 Samarth Rawal, Chitta Baral

Information Retrieval (IR) is the task of obtaining pieces of data (such as documents or snippets of text) that are relevant to a particular query or need from a large repository of information.

Information Retrieval Retrieval +1

Towards Improving Selective Prediction Ability of NLP Systems

no code implementations RepL4NLP (ACL) 2022 Neeraj Varshney, Swaroop Mishra, Chitta Baral

In (IID, OOD) settings, we show that the representations learned by our calibrator result in an improvement of (15.81%, 5.64%) and (6.19%, 13.9%) over 'MaxProb' -- a selective prediction baseline -- on NLI and DD tasks respectively.

Natural Language Inference

DQI: A Guide to Benchmark Evaluation

no code implementations 10 Aug 2020 Swaroop Mishra, Anjana Arunkumar, Bhavdeep Sachdeva, Chris Bryan, Chitta Baral

A 'state of the art' model A surpasses humans in a benchmark B, but fails on similar benchmarks C, D, and E. What does B have that the other benchmarks do not?

Our Evaluation Metric Needs an Update to Encourage Generalization

no code implementations 14 Jul 2020 Swaroop Mishra, Anjana Arunkumar, Chris Bryan, Chitta Baral

In order to stop the inflation in model performance -- and thus overestimation in AI systems' capabilities -- we propose a simple and novel evaluation metric, WOOD Score, that encourages generalization during evaluation.

Towards Question Format Independent Numerical Reasoning: A Set of Prerequisite Tasks

no code implementations 18 May 2020 Swaroop Mishra, Arindam Mitra, Neeraj Varshney, Bhavdeep Sachdeva, Chitta Baral

However, there exists a strong need for a benchmark which can evaluate the abilities of models, in performing question format independent numerical reasoning, as (i) the numerical reasoning capabilities we want to teach are not controlled by question formats, (ii) for numerical reasoning technology to have the best possible application, it must be able to process language and reason in a way that is not exclusive to a single format, task, dataset or domain.

Natural Language Inference Question Answering +1

DQI: Measuring Data Quality in NLP

1 code implementation 2 May 2020 Swaroop Mishra, Anjana Arunkumar, Bhavdeep Sachdeva, Chris Bryan, Chitta Baral

The data creation paradigm consists of several data visualizations to help data creators (i) understand the quality of data and (ii) visualize the impact of the created data instance on the overall quality.

Active Learning Benchmarking

Knowledge Fusion and Semantic Knowledge Ranking for Open Domain Question Answering

no code implementations 7 Apr 2020 Pratyay Banerjee, Chitta Baral

Open Domain Question Answering requires systems to retrieve external knowledge and perform multi-hop reasoning by composing knowledge spread over multiple sentences.

Information Retrieval Open-Domain Question Answering +1

Natural Language QA Approaches using Reasoning with External Knowledge

no code implementations 6 Mar 2020 Chitta Baral, Pratyay Banerjee, Kuntal Kumar Pal, Arindam Mitra

The challenges inspired by Winograd's councilmen example, and recent developments such as the Rebooting AI book, various NLQA datasets, research on knowledge acquisition in the NLQA context, and their use in various NLQA models have brought the issue of NLQA using "reasoning" with external knowledge to the forefront.

Question Answering

VQA-LOL: Visual Question Answering under the Lens of Logic

no code implementations ECCV 2020 Tejas Gokhale, Pratyay Banerjee, Chitta Baral, Yezhou Yang

We propose our Lens of Logic (LOL) model which uses question-attention and logic-attention to understand logical connectives in the question, and a novel Fréchet-Compatibility Loss, which ensures that the answers of the component questions and the composed question are consistent with the inferred logical operation.

Negation Question Answering +2

Imitation Learning of Robot Policies by Combining Language, Vision and Demonstration

no code implementations 26 Nov 2019 Simon Stepputtis, Joseph Campbell, Mariano Phielipp, Chitta Baral, Heni Ben Amor

In this work we propose a novel end-to-end imitation learning approach which combines natural language, vision, and motion information to produce an abstract representation of a task, which in turn is used to synthesize specific motion controllers at run-time.

Imitation Learning

Knowledge Guided Named Entity Recognition for BioMedical Text

no code implementations 10 Nov 2019 Pratyay Banerjee, Kuntal Kumar Pal, Murthy Devarakonda, Chitta Baral

In this work, we formulate the NER task as a multi-answer knowledge guided QA task (KGQA) which helps to predict entities only by assigning B, I and O tags without associating entity types with the tags.

named-entity-recognition Named Entity Recognition +2

Imitation Learning of Robot Policies using Language, Vision and Motion

no code implementations 25 Sep 2019 Simon Stepputtis, Joseph Campbell, Mariano Phielipp, Chitta Baral, Heni Ben Amor

In this work we propose a novel end-to-end imitation learning approach which combines natural language, vision, and motion information to produce an abstract representation of a task, which in turn can be used to synthesize specific motion controllers at run-time.

Imitation Learning

How Additional Knowledge can Improve Natural Language Commonsense Question Answering?

no code implementations 19 Sep 2019 Arindam Mitra, Pratyay Banerjee, Kuntal Kumar Pal, Swaroop Mishra, Chitta Baral

Recently several datasets have been proposed to encourage research in Question Answering domains where commonsense knowledge is expected to play an important role.

Language Modelling Multiple-choice +1

A Generate-Validate Approach to Answering Questions about Qualitative Relationships

no code implementations 9 Aug 2019 Arindam Mitra, Chitta Baral, Aurgho Bhattacharjee, Ishan Shrivastava

Qualitative relationships describe how increasing or decreasing one property (e.g., altitude) affects another (e.g., temperature).

Question Answering Transfer Learning

Identification of Adverse Drug Reaction Mentions in Tweets -- SMM4H Shared Task 2019

no code implementations WS 2019 Samarth Rawal, Siddharth Rawal, Saadat Anwar, Chitta Baral

Analyzing social media posts can offer insights into a wide range of topics that are commonly discussed online, providing valuable information for studying various health-related phenomena reported online.

Careful Selection of Knowledge to solve Open Book Question Answering

no code implementations ACL 2019 Pratyay Banerjee, Kuntal Kumar Pal, Arindam Mitra, Chitta Baral

Open book question answering is a type of natural language based QA (NLQA) where questions are expected to be answered with respect to a given set of open book facts, and common knowledge about a topic.

Information Retrieval Question Answering +2

Combining Knowledge Hunting and Neural Language Models to Solve the Winograd Schema Challenge

no code implementations ACL 2019 Ashok Prakash, Arpit Sharma, Arindam Mitra, Chitta Baral

Our end-to-end system built in such a manner improves on the accuracy of two of the available language model based approaches by 5.53% and 7.7% respectively.

Language Modelling

Integrating Knowledge and Reasoning in Image Understanding

no code implementations 24 Jun 2019 Somak Aditya, Yezhou Yang, Chitta Baral

Deep learning based data-driven approaches have been successfully applied in various image understanding applications ranging from object recognition, semantic segmentation to visual question answering.

Object Recognition Question Answering +2

Blocksworld Revisited: Learning and Reasoning to Generate Event-Sequences from Image Pairs

no code implementations 28 May 2019 Tejas Gokhale, Shailaja Sampat, Zhiyuan Fang, Yezhou Yang, Chitta Baral

The process of identifying changes or transformations in a scene along with the ability of reasoning about their causes and effects, is a key aspect of intelligence.

Declarative Question Answering over Knowledge Bases containing Natural Language Text with Answer Set Programming

1 code implementation 1 May 2019 Arindam Mitra, Peter Clark, Oyvind Tafjord, Chitta Baral

While in recent years machine learning (ML) based approaches have been the popular approach in developing end-to-end question answering systems, such systems often struggle when additional knowledge is needed to correctly answer the questions.

Logical Reasoning Natural Language Inference +1

Developing and Using Special-Purpose Lexicons for Cohort Selection from Clinical Notes

no code implementations 26 Feb 2019 Samarth Rawal, Ashok Prakash, Soumya Adhya, Sidharth Kulkarni, Saadat Anwar, Chitta Baral, Murthy Devarakonda

To help automate the process, National NLP Clinical Challenges (N2C2) conducted a shared challenge by defining 13 criteria for clinical trial cohort selection and by providing training and test datasets.

BIG-bench Machine Learning

Spatial Knowledge Distillation to aid Visual Reasoning

no code implementations 10 Dec 2018 Somak Aditya, Rudra Saha, Yezhou Yang, Chitta Baral

We propose a framework that combines recent advances in knowledge distillation (teacher-student framework), relational reasoning and probabilistic logical languages to incorporate such knowledge in existing neural networks for the task of Visual Question Answering.

Knowledge Distillation Question Answering +3

Explicit Reasoning over End-to-End Neural Architectures for Visual Question Answering

no code implementations 23 Mar 2018 Somak Aditya, Yezhou Yang, Chitta Baral

Here we adopt Visual Question Answering (VQA) as an example task, where a system is expected to answer a question in natural language about an image.

Question Answering Visual Question Answering

From Images to Sentences through Scene Description Graphs using Commonsense Reasoning and Knowledge

no code implementations 10 Nov 2015 Somak Aditya, Yezhou Yang, Chitta Baral, Cornelia Fermuller, Yiannis Aloimonos

Specifically, commonsense reasoning is applied on (a) detections obtained from existing perception methods on given images, (b) a "commonsense" knowledge base constructed using natural language processing of image annotations and (c) lexical ontological knowledge from resources such as WordNet.

image-sentence alignment Sentence

An Action Language for Multi-Agent Domains: Foundations

no code implementations 6 Nov 2015 Chitta Baral, Gregory Gelfond, Enrico Pontelli, Tran Cao Son

It also allows the specification of agents' dynamic awareness of action occurrences, which has future implications on what agents know about the world and other agents' knowledge about the world.

Event-Object Reasoning with Curated Knowledge Bases: Deriving Missing Information

no code implementations 19 Jun 2013 Chitta Baral, Nguyen H. Vo

The broader goal of our research is to formulate answers to why and how questions with respect to knowledge bases, such as AURA.

Encoding Higher Level Extensions of Petri Nets in Answer Set Programming

no code implementations 15 Jun 2013 Saadat Anwar, Chitta Baral, Katsumi Inoue

Answering realistic questions about biological systems and pathways, similar to the ones used by textbooks to test students' understanding of biological systems, is one of our long-term research goals.

Encoding Petri Nets in Answer Set Programming for Simulation Based Reasoning

no code implementations 15 Jun 2013 Saadat Anwar, Chitta Baral, Katsumi Inoue

However, we need to make extensions to the Petri Net model and also reason with multiple simulation runs and parallel state evolutions.
