no code implementations • NAACL (DeeLIO) 2021 • Lisa Bauer, Lingjia Deng, Mohit Bansal
We examine the effect of domain-specific external knowledge variations on deep large scale language model performance.
1 code implementation • COLING 2022 • Danfeng Guo, Arpit Gupta, Sanchit Agarwal, Jiun-Yu Kao, Shuyang Gao, Arijit Biswas, Chien-Wei Lin, Tagyoung Chung, Mohit Bansal
Learning from multimodal data has become a popular research topic in recent years.
1 code implementation • COLING 2022 • Adyasha Maharana, Mohit Bansal
Recent advances in commonsense reasoning have been fueled by the availability of large-scale human annotated datasets.
1 code implementation • NAACL (ACL) 2022 • Xinya Du, Zixuan Zhang, Sha Li, Pengfei Yu, Hongwei Wang, Tuan Lai, Xudong Lin, Ziqi Wang, Iris Liu, Ben Zhou, Haoyang Wen, Manling Li, Darryl Hannan, Jie Lei, Hyounghun Kim, Rotem Dror, Haoyu Wang, Michael Regan, Qi Zeng, Qing Lyu, Charles Yu, Carl Edwards, Xiaomeng Jin, Yizhu Jiao, Ghazaleh Kazeminejad, Zhenhailong Wang, Chris Callison-Burch, Mohit Bansal, Carl Vondrick, Jiawei Han, Dan Roth, Shih-Fu Chang, Martha Palmer, Heng Ji
We introduce RESIN-11, a new schema-guided event extraction&prediction framework that can be applied to a large variety of newsworthy scenarios.
no code implementations • NAACL 2022 • Sha Li, Mahdi Namazifar, Di Jin, Mohit Bansal, Heng Ji, Yang Liu, Dilek Hakkani-Tur
In this work, we propose to automatically convert the background knowledge documents into document semantic graphs and then perform knowledge selection over such graphs.
1 code implementation • NAACL 2022 • Ori Shapira, Ramakanth Pasunuru, Mohit Bansal, Ido Dagan, Yael Amsterdamer
Interactive summarization is a task that facilitates user-guided exploration of information within a document set.
1 code implementation • NAACL 2022 • Adyasha Maharana, Mohit Bansal
Hence, we examine the effect of a human-like easy-to-difficult curriculum during finetuning of language models for commonsense reasoning tasks.
no code implementations • Findings (NAACL) 2022 • Adyasha Maharana, Quan Tran, Franck Dernoncourt, Seunghyun Yoon, Trung Bui, Walter Chang, Mohit Bansal
We construct and present a new multimodal dataset consisting of software instructional livestreams and containing manual annotations for both detailed and abstract procedural intent that enable training and evaluation of joint video and text understanding models.
no code implementations • NAACL (sdp) 2021 • Yash Gupta, Pawan Sasanka Ammanamanchi, Shikha Bordia, Arjun Manoharan, Deepak Mittal, Ramakanth Pasunuru, Manish Shrivastava, Maneesh Singh, Mohit Bansal, Preethi Jyothi
Large pretrained models have seen enormous success in extractive summarization tasks.
1 code implementation • EMNLP 2021 • Ramakanth Pasunuru, Veselin Stoyanov, Mohit Bansal
In this work, we propose a continual few-shot learning (CFL) task, in which a system is challenged with a difficult phenomenon and asked to learn to correct mistakes with only a few (10 to 15) training examples.
1 code implementation • EMNLP 2021 • Yichen Jiang, Mohit Bansal
Motivated by the failure of a Transformer model on the SCAN compositionality challenge (Lake and Baroni, 2018), which requires parsing a command into actions, we propose two auxiliary sequence prediction tasks as additional training supervision.
1 code implementation • EMNLP 2021 • Hyounghun Kim, Jialu Li, Mohit Bansal
In this paper, we explore the Navigation from Dialogue History (NDH) task, which is based on the Cooperative Vision-and-Dialogue Navigation (CVDN) dataset, and present a state-of-the-art model which is built upon Vision-Language transformers.
1 code implementation • EMNLP 2021 • Adyasha Maharana, Mohit Bansal
Such information is even more important for story visualization since its inputs have an explicit narrative structure that needs to be translated into an image sequence (or visual story).
1 code implementation • NAACL (TextGraphs) 2021 • Qi Zeng, Manling Li, Tuan Lai, Heng Ji, Mohit Bansal, Hanghang Tong
Current methods for event representation ignore related events in a corpus-level global context.
no code implementations • ACL (RepL4NLP) 2021 • Han Guo, Ramakanth Pasunuru, Mohit Bansal
Many recalibration methods have been proposed in the literature for quantifying predictive uncertainty and calibrating model outputs, with varying degrees of complexity.
1 code implementation • 6 Mar 2023 • David Wan, Mengwen Liu, Kathleen McKeown, Markus Dreyer, Mohit Bansal
We present a systematic study of the effect of generation techniques such as beam search and nucleus sampling on faithfulness in abstractive summarization.
1 code implementation • 10 Jan 2023 • Peter Hase, Mohit Bansal, Been Kim, Asma Ghandeharioun
In this paper, we find that we can change how a fact is stored in a model by editing weights that are in a different location than where existing methods suggest that the fact is stored.
no code implementations • 16 Dec 2022 • Swarnadeep Saha, Xinyan Velocity Yu, Mohit Bansal, Ramakanth Pasunuru, Asli Celikyilmaz
We propose MURMUR, a neuro-symbolic modular approach to text generation from semi-structured data with multi-step reasoning.
1 code implementation • 15 Dec 2022 • Yan-Bo Lin, Yi-Lin Sung, Jie Lei, Mohit Bansal, Gedas Bertasius
To do so, we propose a latent audio-visual hybrid (LAVISH) adapter that adapts pretrained ViTs to audio-visual tasks by injecting a small number of trainable parameters into every layer of a frozen ViT.
Ranked #1 on
Audio-visual Question Answering
on MUSIC-AVQA
1 code implementation • 9 Dec 2022 • Feng Cheng, Xizi Wang, Jie Lei, David Crandall, Mohit Bansal, Gedas Bertasius
Furthermore, our model also obtains state-of-the-art video question-answering results on ActivityNet-QA, MSRVTT-QA, MSRVTT-MC and TVQA.
2 code implementations • 5 Dec 2022 • Zineng Tang, ZiYi Yang, Guoxin Wang, Yuwei Fang, Yang Liu, Chenguang Zhu, Michael Zeng, Cha Zhang, Mohit Bansal
UDOP leverages the spatial correlation between textual content and document image to model image, text, and layout modalities with one uniform representation.
1 code implementation • 28 Nov 2022 • Yichen Jiang, Xiang Zhou, Mohit Bansal
Recent datasets expose the lack of the systematic generalization ability in standard sequence-to-sequence models.
1 code implementation • 21 Nov 2022 • Zineng Tang, Jaemin Cho, Jie Lei, Mohit Bansal
We present Perceiver-VL, a vision-and-language framework that efficiently handles high-dimensional multimodal inputs such as long videos and text.
1 code implementation • 15 Nov 2022 • Derek Tam, Anisha Mascarenhas, Shiyue Zhang, Sarah Kwan, Mohit Bansal, Colin Raffel
To generate summaries that are factually inconsistent, we generate summaries from a suite of summarization models that we have manually annotated as factually inconsistent.
1 code implementation • 14 Nov 2022 • Swarnadeep Saha, Peter Hase, Nazneen Rajani, Mohit Bansal
We observe that (1) GPT-3 explanations are as grammatical as human explanations regardless of the hardness of the test samples, (2) for easy examples, GPT-3 generates highly supportive explanations but human explanations are more generalizable, and (3) for hard examples, human explanations are significantly better than GPT-3 explanations both in terms of label-supportiveness and generalizability judgements.
1 code implementation • 4 Nov 2022 • David Wan, Mohit Bansal
Current metrics for evaluating factuality for abstractive document summarization have achieved high correlations with human judgment, but they do not account for the vision modality and thus are not adequate for vision-and-language summarization.
1 code implementation • 18 Oct 2022 • Prateek Yadav, Mohit Bansal
Furthermore, we propose a novel KNN-based Knowledge Transfer (KKT) module that dynamically initializes a new task's mask based on previous tasks for improving knowledge transfer.
2 code implementations • 28 Sep 2022 • Zineng Tang, Jaemin Cho, Yixin Nie, Mohit Bansal
In this work, we present the Textless Vision-Language Transformer (TVLT), where homogeneous transformer blocks take raw visual and audio inputs for vision-and-language representation learning with minimal modality-specific design, and do not use text-specific modules such as tokenization or automatic speech recognition (ASR).
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
+7
1 code implementation • 21 Sep 2022 • Swarnadeep Saha, Shiyue Zhang, Peter Hase, Mohit Bansal
We demonstrate that SP-Search effectively represents the generative process behind human summaries using modules that are typically faithful to their intended behavior.
1 code implementation • 13 Sep 2022 • Adyasha Maharana, Darryl Hannan, Mohit Bansal
Hence, we first propose the task of story continuation, where the generated visual story is conditioned on a source image, allowing for better generalization to narratives with new characters.
Ranked #2 on
Story Continuation
on FlintstonesSV
1 code implementation • 8 Sep 2022 • Shiyue Zhang, David Wan, Mohit Bansal
Though extractive summarization is less prone to the common unfaithfulness issues of abstractive summaries, does that mean extractive is equal to faithful?
1 code implementation • 25 Jul 2022 • Yonatan Bitton, Nitzan Bitton Guetta, Ron Yosef, Yuval Elovici, Mohit Bansal, Gabriel Stanovsky, Roy Schwartz
While vision-and-language models perform well on tasks such as visual question answering, they struggle when it comes to basic human commonsense reasoning skills.
Ranked #1 on
Common Sense Reasoning
on WinoGAViL
1 code implementation • NAACL 2022 • Hyounghun Kim, Abhay Zala, Mohit Bansal
Next, a counterfactual imagined scene change (in textual form) is applied, and the model has to predict the new response to the initial question based on this scene change.
1 code implementation • NAACL (ACL) 2022 • Yinuo Hu, Shiyue Zhang, Viji Sathy, A. T. Panter, Mohit Bansal
Ten university professors from diverse departments serve as evaluators of the system and all agree that SETSum helps them interpret SET results more efficiently; and 6 out of 10 instructors prefer our system over the standard static PDF report (while the remaining 4 would like to have both).
1 code implementation • Findings (NAACL) 2022 • Jialu Li, Hao Tan, Mohit Bansal
Empirically, on the Room-Across-Room dataset, we show that our multilingual agent gets large improvements in all metrics over the strong baseline model when generalizing to unseen environments with the cross-lingual language representation and the environment-agnostic visual representation.
1 code implementation • NAACL 2022 • Xiang Zhou, Shiyue Zhang, Mohit Bansal
MPoSM can model arbitrary tag dependency and perform POS induction through the objective of masked POS reconstruction.
1 code implementation • 22 Jun 2022 • Zhuofan Ying, Peter Hase, Mohit Bansal
In this paper, we show that model FI supervision can meaningfully improve VQA model accuracy as well as performance on several Right-for-the-Right-Reason (RRR) metrics by optimizing for four key model objectives: (1) accurate predictions given limited but sufficient information (Sufficiency); (2) max-entropy predictions given no important information (Uncertainty); (3) invariance of predictions to changes in unimportant features (Invariance); and (4) alignment between model FI explanations and human FI explanations (Plausibility).
no code implementations • 15 Jun 2022 • Sha Li, Mahdi Namazifar, Di Jin, Mohit Bansal, Heng Ji, Yang Liu, Dilek Hakkani-Tur
Providing conversation models with background knowledge has been shown to make open-domain dialogues more informative and engaging.
1 code implementation • 13 Jun 2022 • Yi-Lin Sung, Jaemin Cho, Mohit Bansal
LST saves 69% of the memory costs to fine-tune the whole network, while other methods only save 26% of that in similar parameter usages (hence, 2. 7x more memory savings).
1 code implementation • 9 Jun 2022 • Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, Ambrose Slone, Ameet Rahane, Anantharaman S. Iyer, Anders Andreassen, Andrea Madotto, Andrea Santilli, Andreas Stuhlmüller, Andrew Dai, Andrew La, Andrew Lampinen, Andy Zou, Angela Jiang, Angelica Chen, Anh Vuong, Animesh Gupta, Anna Gottardi, Antonio Norelli, Anu Venkatesh, Arash Gholamidavoodi, Arfa Tabassum, Arul Menezes, Arun Kirubarajan, Asher Mullokandov, Ashish Sabharwal, Austin Herrick, Avia Efrat, Aykut Erdem, Ayla Karakaş, B. Ryan Roberts, Bao Sheng Loe, Barret Zoph, Bartłomiej Bojanowski, Batuhan Özyurt, Behnam Hedayatnia, Behnam Neyshabur, Benjamin Inden, Benno Stein, Berk Ekmekci, Bill Yuchen Lin, Blake Howald, Cameron Diao, Cameron Dour, Catherine Stinson, Cedrick Argueta, César Ferri Ramírez, Chandan Singh, Charles Rathkopf, Chenlin Meng, Chitta Baral, Chiyu Wu, Chris Callison-Burch, Chris Waites, Christian Voigt, Christopher D. Manning, Christopher Potts, Cindy Ramirez, Clara E. Rivera, Clemencia Siro, Colin Raffel, Courtney Ashcraft, Cristina Garbacea, Damien Sileo, Dan Garrette, Dan Hendrycks, Dan Kilman, Dan Roth, Daniel Freeman, Daniel Khashabi, Daniel Levy, Daniel Moseguí González, Danielle Perszyk, Danny Hernandez, Danqi Chen, Daphne Ippolito, Dar Gilboa, David Dohan, David Drakard, David Jurgens, Debajyoti Datta, Deep Ganguli, Denis Emelin, Denis Kleyko, Deniz Yuret, Derek Chen, Derek Tam, Dieuwke Hupkes, Diganta Misra, Dilyar Buzan, Dimitri Coelho Mollo, Diyi Yang, Dong-Ho Lee, Ekaterina Shutova, Ekin Dogus Cubuk, Elad Segal, Eleanor Hagerman, Elizabeth Barnes, Elizabeth Donoway, Ellie Pavlick, Emanuele Rodola, Emma Lam, Eric Chu, Eric Tang, Erkut Erdem, Ernie Chang, Ethan A. Chi, Ethan Dyer, Ethan Jerzak, Ethan Kim, Eunice Engefu Manyasi, Evgenii Zheltonozhskii, Fanyue Xia, Fatemeh Siar, Fernando Martínez-Plumed, Francesca Happé, Francois Chollet, Frieda Rong, Gaurav Mishra, Genta Indra Winata, Gerard de Melo, Germán Kruszewski, Giambattista Parascandolo, Giorgio Mariani, Gloria Wang, Gonzalo Jaimovitch-López, Gregor Betz, Guy Gur-Ari, Hana Galijasevic, Hannah Kim, Hannah Rashkin, Hannaneh Hajishirzi, Harsh Mehta, Hayden Bogar, Henry Shevlin, Hinrich Schütze, Hiromu Yakura, Hongming Zhang, Hugh Mee Wong, Ian Ng, Isaac Noble, Jaap Jumelet, Jack Geissinger, Jackson Kernion, Jacob Hilton, Jaehoon Lee, Jaime Fernández Fisac, James B. Simon, James Koppel, James Zheng, James Zou, Jan Kocoń, Jana Thompson, Jared Kaplan, Jarema Radom, Jascha Sohl-Dickstein, Jason Phang, Jason Wei, Jason Yosinski, Jekaterina Novikova, Jelle Bosscher, Jennifer Marsh, Jeremy Kim, Jeroen Taal, Jesse Engel, Jesujoba Alabi, Jiacheng Xu, Jiaming Song, Jillian Tang, Joan Waweru, John Burden, John Miller, John U. Balis, Jonathan Berant, Jörg Frohberg, Jos Rozen, Jose Hernandez-Orallo, Joseph Boudeman, Joseph Jones, Joshua B. Tenenbaum, Joshua S. Rule, Joyce Chua, Kamil Kanclerz, Karen Livescu, Karl Krauth, Karthik Gopalakrishnan, Katerina Ignatyeva, Katja Markert, Kaustubh D. Dhole, Kevin Gimpel, Kevin Omondi, Kory Mathewson, Kristen Chiafullo, Ksenia Shkaruta, Kumar Shridhar, Kyle McDonell, Kyle Richardson, Laria Reynolds, Leo Gao, Li Zhang, Liam Dugan, Lianhui Qin, Lidia Contreras-Ochando, Louis-Philippe Morency, Luca Moschella, Lucas Lam, Lucy Noble, Ludwig Schmidt, Luheng He, Luis Oliveros Colón, Luke Metz, Lütfi Kerem Şenel, Maarten Bosma, Maarten Sap, Maartje ter Hoeve, Maheen Farooqi, Manaal Faruqui, Mantas Mazeika, Marco Baturan, Marco Marelli, Marco Maru, Maria Jose Ramírez Quintana, Marie Tolkiehn, Mario Giulianelli, Martha Lewis, Martin Potthast, Matthew L. Leavitt, Matthias Hagen, Mátyás Schubert, Medina Orduna Baitemirova, Melody Arnaud, Melvin McElrath, Michael A. Yee, Michael Cohen, Michael Gu, Michael Ivanitskiy, Michael Starritt, Michael Strube, Michał Swędrowski, Michele Bevilacqua, Michihiro Yasunaga, Mihir Kale, Mike Cain, Mimee Xu, Mirac Suzgun, Mo Tiwari, Mohit Bansal, Moin Aminnaseri, Mor Geva, Mozhdeh Gheini, Mukund Varma T, Nanyun Peng, Nathan Chi, Nayeon Lee, Neta Gur-Ari Krakover, Nicholas Cameron, Nicholas Roberts, Nick Doiron, Nikita Nangia, Niklas Deckers, Niklas Muennighoff, Nitish Shirish Keskar, Niveditha S. Iyer, Noah Constant, Noah Fiedel, Nuan Wen, Oliver Zhang, Omar Agha, Omar Elbaghdadi, Omer Levy, Owain Evans, Pablo Antonio Moreno Casares, Parth Doshi, Pascale Fung, Paul Pu Liang, Paul Vicol, Pegah Alipoormolabashi, Peiyuan Liao, Percy Liang, Peter Chang, Peter Eckersley, Phu Mon Htut, Pinyu Hwang, Piotr Miłkowski, Piyush Patil, Pouya Pezeshkpour, Priti Oli, Qiaozhu Mei, Qing Lyu, Qinlang Chen, Rabin Banjade, Rachel Etta Rudolph, Raefer Gabriel, Rahel Habacker, Ramón Risco Delgado, Raphaël Millière, Rhythm Garg, Richard Barnes, Rif A. Saurous, Riku Arakawa, Robbe Raymaekers, Robert Frank, Rohan Sikand, Roman Novak, Roman Sitelew, Ronan LeBras, Rosanne Liu, Rowan Jacobs, Rui Zhang, Ruslan Salakhutdinov, Ryan Chi, Ryan Lee, Ryan Stovall, Ryan Teehan, Rylan Yang, Sahib Singh, Saif M. Mohammad, Sajant Anand, Sam Dillavou, Sam Shleifer, Sam Wiseman, Samuel Gruetter, Samuel R. Bowman, Samuel S. Schoenholz, Sanghyun Han, Sanjeev Kwatra, Sarah A. Rous, Sarik Ghazarian, Sayan Ghosh, Sean Casey, Sebastian Bischoff, Sebastian Gehrmann, Sebastian Schuster, Sepideh Sadeghi, Shadi Hamdan, Sharon Zhou, Shashank Srivastava, Sherry Shi, Shikhar Singh, Shima Asaadi, Shixiang Shane Gu, Shubh Pachchigar, Shubham Toshniwal, Shyam Upadhyay, Shyamolima, Debnath, Siamak Shakeri, Simon Thormeyer, Simone Melzi, Siva Reddy, Sneha Priscilla Makini, Soo-Hwan Lee, Spencer Torene, Sriharsha Hatwar, Stanislas Dehaene, Stefan Divic, Stefano Ermon, Stella Biderman, Stephanie Lin, Stephen Prasad, Steven T. Piantadosi, Stuart M. Shieber, Summer Misherghi, Svetlana Kiritchenko, Swaroop Mishra, Tal Linzen, Tal Schuster, Tao Li, Tao Yu, Tariq Ali, Tatsu Hashimoto, Te-Lin Wu, Théo Desbordes, Theodore Rothschild, Thomas Phan, Tianle Wang, Tiberius Nkinyili, Timo Schick, Timofei Kornev, Timothy Telleen-Lawton, Titus Tunduny, Tobias Gerstenberg, Trenton Chang, Trishala Neeraj, Tushar Khot, Tyler Shultz, Uri Shaham, Vedant Misra, Vera Demberg, Victoria Nyamai, Vikas Raunak, Vinay Ramasesh, Vinay Uday Prabhu, Vishakh Padmakumar, Vivek Srikumar, William Fedus, William Saunders, William Zhang, Wout Vossen, Xiang Ren, Xiaoyu Tong, Xinran Zhao, Xinyi Wu, Xudong Shen, Yadollah Yaghoobzadeh, Yair Lakretz, Yangqiu Song, Yasaman Bahri, Yejin Choi, Yichi Yang, Yiding Hao, Yifu Chen, Yonatan Belinkov, Yu Hou, Yufang Hou, Yuntao Bai, Zachary Seid, Zhuoye Zhao, Zijian Wang, Zijie J. Wang, ZiRui Wang, Ziyi Wu
BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models.
2 code implementations • 7 Jun 2022 • Jie Lei, Tamara L. Berg, Mohit Bansal
Training an effective video-and-language model intuitively requires multiple frames as model inputs.
Ranked #2 on
Video Retrieval
on SSv2-template retrieval
1 code implementation • Findings (NAACL) 2022 • Jaemin Cho, Seunghyun Yoon, Ajinkya Kale, Franck Dernoncourt, Trung Bui, Mohit Bansal
Toward more descriptive and distinctive caption generation, we propose using CLIP, a multimodal encoder trained on huge image-text pairs from web, to calculate multimodal similarity and use it as a reward function.
Ranked #26 on
Image Captioning
on COCO Captions
1 code implementation • 22 May 2022 • Zhenhailong Wang, Manling Li, Ruochen Xu, Luowei Zhou, Jie Lei, Xudong Lin, Shuohang Wang, ZiYi Yang, Chenguang Zhu, Derek Hoiem, Shih-Fu Chang, Mohit Bansal, Heng Ji
The goal of this work is to build flexible video-language models that can generalize to various video-to-text tasks from few examples, such as domain-specific captioning, question answering, and future event prediction.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
+5
no code implementations • insights (ACL) 2022 • Hyounghun Kim, Aishwarya Padmakumar, Di Jin, Mohit Bansal, Dilek Hakkani-Tur
Natural language guided embodied task completion is a challenging problem since it requires understanding natural language instructions, aligning them with egocentric visual observations, and choosing appropriate actions to execute in the environment to produce desired changes.
1 code implementation • NAACL 2022 • David Wan, Mohit Bansal
We present FactPEGASUS, an abstractive summarization model that addresses the problem of factuality during pre-training and fine-tuning: (1) We augment the sentence selection strategy of PEGASUS's (Zhang et al., 2020) pre-training objective to create pseudo-summaries that are both important and factual; (2) We introduce three complementary components for fine-tuning.
1 code implementation • 11 May 2022 • Haokun Liu, Derek Tam, Mohammed Muqeeth, Jay Mohta, Tenghao Huang, Mohit Bansal, Colin Raffel
ICL incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made.
Ranked #1 on
Few-Shot Text Classification
on RAFT
1 code implementation • Findings (NAACL) 2022 • Arthur Bražinskas, Ramesh Nallapati, Mohit Bansal, Markus Dreyer
In the same vein, we pre-train the adapters in a query-based manner on customer reviews and then fine-tune them on annotated datasets.
no code implementations • AMTA 2022 • Shiyue Zhang, Vishrav Chaudhary, Naman Goyal, James Cross, Guillaume Wenzek, Mohit Bansal, Francisco Guzman
Since a skewed data distribution is considered to be harmful, a sampling strategy is usually used to balance languages in the corpus.
1 code implementation • ACL 2022 • Shiyue Zhang, Ben Frey, Mohit Bansal
We hope that our work serves not only to inform the NLP community about Cherokee, but also to provide inspiration for future work on endangered languages in general.
1 code implementation • NAACL 2022 • Leonardo F. R. Ribeiro, Mengwen Liu, Iryna Gurevych, Markus Dreyer, Mohit Bansal
Despite recent improvements in abstractive summarization, most current approaches generate summaries that are not factually consistent with the source document, severely restricting their trust and usage in real-world applications.
1 code implementation • ACL 2022 • Swarnadeep Saha, Prateek Yadav, Mohit Bansal
In this work, we study pre-trained language models that generate explanation graphs in an end-to-end manner and analyze their ability to learn the structural constraints and semantics of such graphs.
1 code implementation • 6 Apr 2022 • Yan-Bo Lin, Jie Lei, Mohit Bansal, Gedas Bertasius
We introduce an audiovisual method for long-range text-to-video retrieval.
1 code implementation • CVPR 2022 • Jialu Li, Hao Tan, Mohit Bansal
Training on these edit-augmented environments prevents the agent from overfitting to existing environments and helps generalize better to new, unseen environments.
Ranked #2 on
Vision and Language Navigation
on RxR
(using extra training data)
1 code implementation • 14 Mar 2022 • Archiki Prasad, Peter Hase, Xiang Zhou, Mohit Bansal
In this work, we introduce Gradient-free Instructional Prompt Search (GrIPS), a gradient-free, edit-based search approach for improving task instructions for large language models.
no code implementations • 10 Mar 2022 • Jie Lei, Xinlei Chen, Ning Zhang, Mengjiao Wang, Mohit Bansal, Tamara L. Berg, Licheng Yu
In this work, we propose LoopITR, which combines them in the same network for joint learning.
1 code implementation • 24 Feb 2022 • Hyounghun Kim, Doo Soon Kim, Seunghyun Yoon, Franck Dernoncourt, Trung Bui, Mohit Bansal
To our knowledge, this is the first dataset that provides conversational image search and editing annotations, where the agent holds a grounded conversation with users and helps them to search and edit images according to their requests.
2 code implementations • 8 Feb 2022 • Jaemin Cho, Abhay Zala, Mohit Bansal
In this work, we investigate the visual reasoning capabilities and social biases of different text-to-image models, covering both multimodal transformer language models and diffusion models.
2 code implementations • 20 Dec 2021 • Revanth Gangi Reddy, Xilin Rui, Manling Li, Xudong Lin, Haoyang Wen, Jaemin Cho, Lifu Huang, Mohit Bansal, Avirup Sil, Shih-Fu Chang, Alexander Schwing, Heng Ji
Specifically, the task involves multi-hop questions that require reasoning over image-caption pairs to identify the grounded visual object being referred to and then predicting a span from the news body text to answer the question.
2 code implementations • NAACL 2022 • Ori Ernst, Avi Caciularu, Ori Shapira, Ramakanth Pasunuru, Mohit Bansal, Jacob Goldberger, Ido Dagan
Text clustering methods were traditionally incorporated into multi-document summarization (MDS) as a means for coping with considerable information repetition.
no code implementations • 16 Dec 2021 • Lisa Bauer, Karthik Gopalakrishnan, Spandana Gella, Yang Liu, Mohit Bansal, Dilek Hakkani-Tur
We define three broad classes of task descriptions for these tasks: statement, question, and completion, with numerous lexical variants within each class.
1 code implementation • Findings (EMNLP) 2021 • Yichen Jiang, Mohit Bansal
On examples with a maximum source and target length of 30 from De-En, WMT'16 English-Romanian, and WMT'21 English-Chinese translation tasks, our learned order outperforms all heuristic generation orders on four out of six tasks.
1 code implementation • CVPR 2022 • Yi-Lin Sung, Jaemin Cho, Mohit Bansal
Our results demonstrate that training the adapter with the weight-sharing technique (4. 18% of total parameters for image-text tasks and 3. 39% for video-text tasks) can match the performance of fine-tuning the entire model.
1 code implementation • 8 Dec 2021 • Yixin Nie, Linjie Li, Zhe Gan, Shuohang Wang, Chenguang Zhu, Michael Zeng, Zicheng Liu, Mohit Bansal, Lijuan Wang
Based on this, we ask an even bolder question: can we have an all-MLP architecture for VL modeling, where both VL fusion and the vision encoder are replaced with MLPs?
1 code implementation • NeurIPS 2021 • Jie Lei, Tamara Berg, Mohit Bansal
Each video in the dataset is annotated with: (1) a human-written free-form NL query, (2) relevant moments in the video w. r. t.
1 code implementation • 26 Nov 2021 • Peter Hase, Mona Diab, Asli Celikyilmaz, Xian Li, Zornitsa Kozareva, Veselin Stoyanov, Mohit Bansal, Srinivasan Iyer
In this paper, we discuss approaches to detecting when models have beliefs about the world, and we improve on methods for updating model beliefs to be more truthful, with a focus on methods based on learned optimizers or hypernetworks.
1 code implementation • 1 Nov 2021 • Prateek Yadav, Peter Hase, Mohit Bansal
Current approaches try to optimize for the cost incurred by users when adopting a recourse, but they assume that all users share the same cost function.
1 code implementation • 21 Oct 2021 • Adyasha Maharana, Mohit Bansal
Prior work in this domain has shown that there is ample room for improvement in the generated image sequence in terms of visual quality, consistency and relevance.
1 code implementation • 30 Sep 2021 • Yichen Jiang, Mohit Bansal
Motivated by the failure of a Transformer model on the SCAN compositionality challenge (Lake and Baroni, 2018), which requires parsing a command into actions, we propose two auxiliary sequence prediction tasks that track the progress of function and argument semantics, as additional training supervision.
1 code implementation • EMNLP (ACL) 2021 • Eran Hirsch, Alon Eirew, Ori Shapira, Avi Caciularu, Arie Cattan, Ori Ernst, Ramakanth Pasunuru, Hadar Ronen, Mohit Bansal, Ido Dagan
We introduce iFacetSum, a web application for exploring topical document sets.
1 code implementation • EMNLP 2021 • Shiyue Zhang, Mohit Bansal
In this work, we propose flexible semiautomatic to automatic summary evaluation metrics, following the Pyramid human evaluation method.
1 code implementation • ACL 2021 • Zineng Tang, Shiyue Zhang, Hyounghun Kim, Mohit Bansal
Recent years have witnessed various types of generative models for natural language generation (NLG), especially RNNs or transformer based sequence-to-sequence models, as well as variational autoencoder (VAE) and generative adversarial network (GAN) based models.
no code implementations • ACL 2021 • Yi Fung, Christopher Thomas, Revanth Gangi Reddy, Sandeep Polisetty, Heng Ji, Shih-Fu Chang, Kathleen McKeown, Mohit Bansal, Avi Sil
To defend against machine-generated fake news, an effective mechanism is urgently needed.
1 code implementation • ACL 2021 • Jie Lei, Tamara L. Berg, Mohit Bansal
We introduce mTVR, a large-scale multilingual video moment retrieval dataset, containing 218K English and Chinese queries from 21. 8K TV show video clips.
1 code implementation • ACL 2021 • Shiyue Zhang, Asli Celikyilmaz, Jianfeng Gao, Mohit Bansal
Furthermore, we find that widely used automatic evaluation metrics (ROUGE, BERTScore) are weakly correlated with human judgments on this email thread summarization task.
Ranked #1 on
Email Thread Summarization
on EmailSum (short)
2 code implementations • ACL 2021 • Shiyue Zhang, Benjamin Frey, Mohit Bansal
The quantitative evaluation demonstrates that our backbone translation models achieve state-of-the-art translation performance and our quality estimation well correlates with both BLEU and human judgment.
2 code implementations • 20 Jul 2021 • Jie Lei, Tamara L. Berg, Mohit Bansal
Each video in the dataset is annotated with: (1) a human-written free-form NL query, (2) relevant moments in the video w. r. t.
Ranked #1 on
Moment Retrieval
on Charades-STA
4 code implementations • 13 Jul 2021 • Sheng Shen, Liunian Harold Li, Hao Tan, Mohit Bansal, Anna Rohrbach, Kai-Wei Chang, Zhewei Yao, Kurt Keutzer
Most existing Vision-and-Language (V&L) models rely on pre-trained visual encoders, using a relatively small set of manually-annotated data (as compared to web-crawled data), to perceive the visual world.
Ranked #4 on
Vision and Language Navigation
on RxR
(using extra training data)
1 code implementation • NeurIPS 2021 • Zineng Tang, Jaemin Cho, Hao Tan, Mohit Bansal
We train a multi-modal teacher model on a video-text dataset, and then transfer its knowledge to a student language model with a text dataset.
1 code implementation • 21 Jun 2021 • Hao Tan, Jie Lei, Thomas Wolf, Mohit Bansal
Unlike language, where the text tokens are more independent, neighboring video tokens typically have strong correlations (e. g., consecutive video frames usually look very similar), and hence uniformly masking individual tokens will make the task too trivial to learn useful representations.
Ranked #5 on
Action Recognition
on Diving-48
no code implementations • Joint Conference on Lexical and Computational Semantics 2021 • Duccio Pappadopulo, Lisa Bauer, Marco Farina, Ozan İrsoy, Mohit Bansal
In this paper, we apply DAG-LSTMs to the conversation disentanglement task.
no code implementations • 14 Jun 2021 • Jiaao Chen, Derek Tam, Colin Raffel, Mohit Bansal, Diyi Yang
NLP has achieved great progress in the past decade through the use of neural models and large labeled datasets.
1 code implementation • 8 Jun 2021 • Linjie Li, Jie Lei, Zhe Gan, Licheng Yu, Yen-Chun Chen, Rohit Pillai, Yu Cheng, Luowei Zhou, Xin Eric Wang, William Yang Wang, Tamara Lee Berg, Mohit Bansal, Jingjing Liu, Lijuan Wang, Zicheng Liu
Most existing video-and-language (VidL) research focuses on a single dataset, or multiple datasets of a single task.
1 code implementation • NAACL 2021 • Swarnadeep Saha, Prateek Yadav, Mohit Bansal
In order to jointly learn from all proof graphs and exploit the correlations between multiple proofs for a question, we pose this task as a set generation problem over structured output spaces where each proof is represented as a directed graph.
1 code implementation • NAACL 2021 • Yichen Jiang, Asli Celikyilmaz, Paul Smolensky, Paul Soulos, Sudha Rao, Hamid Palangi, Roland Fernandez, Caitlin Smith, Mohit Bansal, Jianfeng Gao
On several syntactic and semantic probing tasks, we demonstrate the emergent structural information in the role vectors and improved syntactic interpretability in the TPR layer outputs.
1 code implementation • NAACL 2021 • Zineng Tang, Jie Lei, Mohit Bansal
Second, to alleviate the temporal misalignment issue, our method incorporates an entropy minimization-based constrained attention loss, to encourage the model to automatically focus on the correct caption from a pool of candidate ASR captions.
1 code implementation • NAACL 2021 • Ramakanth Pasunuru, Mengwen Liu, Mohit Bansal, Sujith Ravi, Markus Dreyer
We also show improvements in a transfer-only setup on the DUC-2004 dataset.
1 code implementation • NeurIPS 2021 • Peter Hase, Harry Xie, Mohit Bansal
In this paper, we study several under-explored dimensions of FI explanations, providing conceptual and empirical improvements for this form of explanation.
1 code implementation • NAACL 2021 • Ori Shapira, Ramakanth Pasunuru, Hadar Ronen, Mohit Bansal, Yael Amsterdamer, Ido Dagan
In this paper, we develop an end-to-end evaluation framework for interactive summarization, focusing on expansion-based interaction, which considers the accumulating information along a user session.
1 code implementation • NAACL 2021 • Adyasha Maharana, Darryl Hannan, Mohit Bansal
Therefore, we also provide an exploration of evaluation metrics for the model, focused on aspects of the generated frames such as the presence/quality of generated characters, the relevance to captions, and the diversity of the generated images.
1 code implementation • EACL 2021 • Lisa Bauer, Mohit Bansal
For knowledge integration to yield peak performance, it is critical to select a knowledge graph (KG) that is well-aligned with the given task's objective.
no code implementations • EACL 2021 • Xiang Zhou, Heba Elfardy, Christos Christodoulopoulos, Thomas Butler, Mohit Bansal
Using the observations and experimental results, we provide practical suggestions on how to create more reliable datasets for the unreliable news detection task.
1 code implementation • NAACL 2021 • Jialu Li, Hao Tan, Mohit Bansal
One key challenge in this task is to ground instructions with the current visual information that the agent perceives.
1 code implementation • Findings (ACL) 2022 • Xiang Zhou, Yixin Nie, Mohit Bansal
We introduce distributed NLI, a new NLU task with a goal to predict the distribution of human judgements for natural language inference.
1 code implementation • EMNLP 2021 • Swarnadeep Saha, Prateek Yadav, Lisa Bauer, Mohit Bansal
Recent commonsense-reasoning tasks are typically discriminative in nature, where a model answers a multiple-choice question for a certain context.
no code implementations • NAACL 2021 • Douwe Kiela, Max Bartolo, Yixin Nie, Divyansh Kaushik, Atticus Geiger, Zhengxuan Wu, Bertie Vidgen, Grusha Prasad, Amanpreet Singh, Pratik Ringshia, Zhiyi Ma, Tristan Thrush, Sebastian Riedel, Zeerak Waseem, Pontus Stenetorp, Robin Jia, Mohit Bansal, Christopher Potts, Adina Williams
We introduce Dynabench, an open-source platform for dynamic dataset creation and model benchmarking.
1 code implementation • 4 Apr 2021 • Hyounghun Kim, Abhay Zala, Graham Burri, Mohit Bansal
During the correctional-captioning task, models must generate descriptions of how to move from the current to target pose image, whereas in the retrieval task, models should select the correct target pose given the initial pose and correctional description.
2 code implementations • EMNLP 2021 • Derek Tam, Rakesh R Menon, Mohit Bansal, Shashank Srivastava, Colin Raffel
However, PET uses task-specific unlabeled data.
1 code implementation • 2 Mar 2021 • Ramakanth Pasunuru, Asli Celikyilmaz, Michel Galley, Chenyan Xiong, Yizhe Zhang, Mohit Bansal, Jianfeng Gao
The progress in Query-focused Multi-Document Summarization (QMDS) has been limited by the lack of sufficient largescale high-quality training datasets.
no code implementations • 2 Mar 2021 • Ramakanth Pasunuru, David Rosenberg, Gideon Mann, Mohit Bansal
Since these are sequence models, we must choose an ordering of the objects in the graphics programs for likelihood training.
1 code implementation • CVPR 2021 • Jie Lei, Linjie Li, Luowei Zhou, Zhe Gan, Tamara L. Berg, Mohit Bansal, Jingjing Liu
Experiments on text-to-video retrieval and video question answering on six datasets demonstrate that ClipBERT outperforms (or is on par with) existing methods that exploit full-length videos, suggesting that end-to-end learning with just a few sparsely sampled clips is often more accurate than using densely extracted offline features from full-length videos, proving the proverbial less-is-more principle.
Ranked #17 on
Visual Question Answering (VQA)
on MSRVTT-QA
(using extra training data)
1 code implementation • 4 Feb 2021 • Jaemin Cho, Jie Lei, Hao Tan, Mohit Bansal
On 7 popular vision-and-language benchmarks, including visual question answering, referring expression comprehension, visual commonsense reasoning, most of which have been previously modeled as discriminative tasks, our generative approach (with a single unified architecture) reaches comparable performance to recent task-specific state-of-the-art vision-and-language models.
Ranked #3 on
Image Captioning
on nocaps val
1 code implementation • LNLS (ACL) 2022 • Peter Hase, Mohit Bansal
In order to carefully control important properties of the data and explanations, we introduce a synthetic dataset for experiments, and we also make use of three existing datasets with explanations: e-SNLI, TACRED, and SemEval.
2 code implementations • NAACL 2021 • Karan Goel, Nazneen Rajani, Jesse Vig, Samson Tan, Jason Wu, Stephan Zheng, Caiming Xiong, Mohit Bansal, Christopher Ré
Despite impressive performance on standard benchmarks, deep neural networks are often brittle when deployed in real-world systems.
1 code implementation • EMNLP 2021 • Han Guo, Nazneen Fatema Rajani, Peter Hase, Mohit Bansal, Caiming Xiong
With the availability of the fast influence functions, we demonstrate their usefulness in four applications.
no code implementations • EMNLP (BlackboxNLP) 2021 • Grusha Prasad, Yixin Nie, Mohit Bansal, Robin Jia, Douwe Kiela, Adina Williams
Given the increasingly prominent role NLP models (will) play in our lives, it is important for human expectations of model behavior to align with actual model behavior.
no code implementations • ACL 2021 • Yixin Nie, Mary Williamson, Mohit Bansal, Douwe Kiela, Jason Weston
To quantify how well natural language understanding models can capture consistency in a general conversation, we introduce the DialoguE COntradiction DEtection task (DECODE) and a new conversational dataset containing both human-human and human-bot contradictory dialogues.
no code implementations • EMNLP 2020 • Ramakanth Pasunuru, Han Guo, Mohit Bansal
Further, it is important to consider using a dynamic combination and curriculum of metric rewards that flexibly changes over time.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Hyounghun Kim, Abhay Zala, Graham Burri, Hao Tan, Mohit Bansal
During this task, the agent (similar to a PokeMON GO player) is asked to find and collect different target objects one-by-one by navigating based on natural language instructions in a complex, realistic outdoor environment, but then also ARRAnge the collected objects part-by-part in an egocentric grid-layout environment.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Yichen Jiang, Shikha Bordia, Zheng Zhong, Charles Dognin, Maneesh Singh, Mohit Bansal
We introduce HoVer (HOppy VERification), a dataset for many-hop evidence extraction and fact verification.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Ramakanth Pasunuru, Mohit Bansal
Architecture search is the automatic process of designing the model or cell structure that is optimal for the given dataset or task.
1 code implementation • EMNLP 2020 • Swarnadeep Saha, Yixin Nie, Mohit Bansal
Reasoning about conjuncts in conjunctive sentences is important for a deeper understanding of conjunctions in English and also how their usages and semantics differ from conjunctive and disjunctive boolean logic.
1 code implementation • EMNLP 2020 • Jie Lei, Licheng Yu, Tamara L. Berg, Mohit Bansal
Given a video with aligned dialogue, people can often infer what is more likely to happen next.
1 code implementation • EMNLP 2020 • Hao Tan, Mohit Bansal
We find that the main reason hindering this exploration is the large divergence in magnitude and distributions between the visually-grounded language datasets and pure-language corpora.
1 code implementation • EMNLP 2020 • Shiyue Zhang, Benjamin Frey, Mohit Bansal
To help save this endangered language, we introduce ChrEn, a Cherokee-English parallel dataset, to facilitate machine translation research between Cherokee and English.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Peter Hase, Shiyue Zhang, Harry Xie, Mohit Bansal
We provide code for the experiments in this paper at https://github. com/peterbhase/LAS-NL-Explanations
1 code implementation • EMNLP 2020 • Yixin Nie, Xiang Zhou, Mohit Bansal
Analysis reveals that: (1) high human disagreement exists in a noticeable amount of examples in these datasets; (2) the state-of-the-art models lack the ability to recover the distribution over human labels; (3) models achieve near-perfect accuracy on the subset of data with a high level of human agreement, whereas they can barely beat a random guess on the data with low levels of human agreement, which compose most of the common errors made by state-of-the-art models on the evaluation sets.
2 code implementations • EMNLP 2020 • Swarnadeep Saha, Sayan Ghosh, Shashank Srivastava, Mohit Bansal
First, PROVER generates proofs with an accuracy of 87%, while retaining or improving performance on the QA task, compared to RuleTakers (up to 6% improvement on zero-shot evaluation).
no code implementations • 17 Sep 2020 • Ori Shapira, Ramakanth Pasunuru, Hadar Ronen, Mohit Bansal, Yael Amsterdamer, Ido Dagan
Allowing users to interact with multi-document summarizers is a promising direction towards improving and customizing summary results.
1 code implementation • CoNLL (EMNLP) 2021 • Ori Ernst, Ori Shapira, Ramakanth Pasunuru, Michael Lepioshkin, Jacob Goldberger, Mohit Bansal, Ido Dagan
Aligning sentences in a reference summary with their counterparts in source documents was shown as a useful auxiliary summarization task, notably for generating training data for salience detection.
no code implementations • WS 2020 • Yixin Nie, Lisa Bauer, Mohit Bansal
Automatic fact checking is an important task motivated by the need for detecting and preventing the spread of misinformation across the web.
1 code implementation • ACL 2020 • Hyounghun Kim, Zineng Tang, Mohit Bansal
Moreover, our model is also comprised of dual-level attention (word/object and frame level), multi-head self/cross-integration for different sources (video and dense captions), and gates which pass more relevant information to the classifier.
1 code implementation • ACL 2020 • Jie Lei, Li-Wei Wang, Yelong Shen, Dong Yu, Tamara L. Berg, Mohit Bansal
Generating multi-sentence descriptions for videos is one of the most challenging captioning tasks due to its high requirements for not only visual relevance but also discourse-based coherence across the sentences in the paragraph.
Ranked #5 on
Video Captioning
on ActivityNet Captions
1 code implementation • ACL 2020 • Xiang Zhou, Mohit Bansal
While deep learning models are making fast progress on the task of Natural Language Inference, recent studies have also shown that these models achieve high accuracy by exploiting several dataset biases, and without deep understanding of the language semantics.
1 code implementation • 6 May 2020 • Yubo Zhang, Hao Tan, Mohit Bansal
Vision-and-Language Navigation (VLN) requires an agent to follow natural-language instructions, explore the given environments, and reach the desired target locations.
1 code implementation • ACL 2020 • Peter Hase, Mohit Bansal
Through two kinds of simulation tests involving text and tabular data, we evaluate five explanations methods: (1) LIME, (2) Anchor, (3) Decision Boundary, (4) a Prototype model, and (5) a Composite approach that combines explanations from each method.
1 code implementation • EMNLP 2020 • Xiang Zhou, Yixin Nie, Hao Tan, Mohit Bansal
For the first question, we conduct a thorough empirical study over analysis sets and find that in addition to the unstable final performance, the instability exists all along the training curve.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Adyasha Maharana, Mohit Bansal
In this work, we present several effective adversaries and automated data augmentation policy search methods with the goal of making reading comprehension models more robust to adversarial evaluation, but also improving generalization to the source domain as well as new domains and languages.
2 code implementations • ECCV 2020 • Jie Lei, Licheng Yu, Tamara L. Berg, Mohit Bansal
The queries are also labeled with query types that indicate whether each of them is more related to video or subtitle or both, allowing for in-depth analysis of the dataset and the methods that built on top of it.
Ranked #2 on
Video Retrieval
on TVR
1 code implementation • 22 Jan 2020 • Darryl Hannan, Akshay Jain, Mohit Bansal
By analyzing this model, we investigate which words in the question are indicative of the modality.
no code implementations • 17 Jan 2020 • Hyounghun Kim, Hao Tan, Mohit Bansal
The Visual Dialog task requires a model to exploit both image and conversational context information to generate the next response to the dialogue.
no code implementations • 15 Jan 2020 • Tong Niu, Mohit Bansal
In our work, we build dialogue models that are dynamically aware of what utterances or tokens are dull without any feature-engineering.
no code implementations • 13 Jan 2020 • Han Guo, Ramakanth Pasunuru, Mohit Bansal
Next, we develop a DistanceNet model which uses these distance measures, or a mixture of these distance measures, as an additional loss function to be minimized jointly with the task's loss function, so as to achieve better unsupervised domain adaptation.
2 code implementations • ACL 2020 • Yixin Nie, Adina Williams, Emily Dinan, Mohit Bansal, Jason Weston, Douwe Kiela
We introduce a new large-scale NLI benchmark dataset, collected via an iterative, adversarial human-and-model-in-the-loop procedure.
1 code implementation • IJCNLP 2019 • Tong Niu, Mohit Bansal
Automatic data augmentation (AutoAugment) (Cubuk et al., 2019) searches for optimal perturbation policies via a controller trained using performance rewards of a sampled policy on the target task, hence reducing data-level model bias.
2 code implementations • IJCNLP 2019 • Yixin Nie, Songhe Wang, Mohit Bansal
In this work, we give general guidelines on system design for MRS by proposing a simple yet effective pipeline system with special consideration on hierarchical semantic retrieval at both paragraph and sentence level, and their potential effects on the downstream task.
Ranked #48 on
Question Answering
on HotpotQA
1 code implementation • IJCNLP 2019 • Shiyue Zhang, Mohit Bansal
Second, since the traditional evaluation metrics (e. g., BLEU) often fall short in evaluating the quality of generated questions, we propose a QA-based evaluation method which measures the QG model's ability to mimic human annotators in generating QA training data.
1 code implementation • IJCNLP 2019 • Yichen Jiang, Mohit Bansal
Multi-hop QA requires a model to connect multiple pieces of evidence scattered in a long context to answer the question.
6 code implementations • IJCNLP 2019 • Hao Tan, Mohit Bansal
In LXMERT, we build a large-scale Transformer model that consists of three encoders: an object relationship encoder, a language encoder, and a cross-modality encoder.
Ranked #1 on
Visual Question Answering (VQA)
on VizWiz 2018
1 code implementation • ACL 2019 • Hao Tan, Franck Dernoncourt, Zhe Lin, Trung Bui, Mohit Bansal
To push forward the research in this direction, we first introduce a new language-guided image editing dataset that contains a large number of real image pairs with corresponding editing instructions.
1 code implementation • ACL 2019 • Yichen Jiang, Mohit Bansal
After adversarial training, the baseline's performance improves but is still limited on the adversarial evaluation.