1 code implementation • 22 Jun 2022 • Sebastian Gehrmann, Abhik Bhattacharjee, Abinaya Mahendiran, Alex Wang, Alexandros Papangelis, Aman Madaan, Angelina McMillan-Major, Anna Shvets, Ashish Upadhyay, Bingsheng Yao, Bryan Wilie, Chandra Bhagavatula, Chaobin You, Craig Thomson, Cristina Garbacea, Dakuo Wang, Daniel Deutsch, Deyi Xiong, Di Jin, Dimitra Gkatzia, Dragomir Radev, Elizabeth Clark, Esin Durmus, Faisal Ladhak, Filip Ginter, Genta Indra Winata, Hendrik Strobelt, Hiroaki Hayashi, Jekaterina Novikova, Jenna Kanerva, Jenny Chim, Jiawei Zhou, Jordan Clive, Joshua Maynez, João Sedoc, Juraj Juraska, Kaustubh Dhole, Khyathi Raghavi Chandu, Laura Perez-Beltrachini, Leonardo F. R. Ribeiro, Lewis Tunstall, Li Zhang, Mahima Pushkarna, Mathias Creutz, Michael White, Mihir Sanjay Kale, Moussa Kamal Eddine, Nico Daheim, Nishant Subramani, Ondrej Dusek, Paul Pu Liang, Pawan Sasanka Ammanamanchi, Qi Zhu, Ratish Puduppully, Reno Kriz, Rifat Shahriyar, Ronald Cardenas, Saad Mahamood, Salomey Osei, Samuel Cahyawijaya, Sanja Štajner, Sebastien Montella, Shailza Jolly, Simon Mille, Tahmid Hasan, Tianhao Shen, Tosin Adewumi, Vikas Raunak, Vipul Raheja, Vitaly Nikolaev, Vivian Tsai, Yacine Jernite, Ying Xu, Yisi Sang, Yixin Liu, Yufang Hou
This problem is especially pertinent in natural language generation, which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims.
no code implementations • 13 Dec 2021 • Shailza Jolly, Pepa Atanasova, Isabelle Augenstein
In addition, we show the applicability of our approach in a completely unsupervised setting.
1 code implementation • 6 Dec 2021 • Shailza Jolly, Zi Xuan Zhang, Andreas Dengel, Lili Mou
To this end, we propose a search-and-learning approach that leverages pretrained language models but inserts the missing slots to improve the semantic coverage.
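To make the search step concrete, below is a minimal sketch of one plausible realization: slot values missing from a generated sentence are inserted at whichever position a pretrained language model scores as most fluent. The GPT-2 scorer, the greedy per-slot loop, and the substring check for slot coverage are illustrative assumptions, not the paper's exact algorithm.

```python
# Sketch of the "search" step in a search-and-learning approach to slot
# insertion: for each slot value missing from a generated sentence, try every
# insertion position and keep the one the pretrained LM scores as most fluent.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def lm_score(sentence: str) -> float:
    """Negative log-likelihood under GPT-2 (lower = more fluent)."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    return loss.item()

def insert_missing_slots(sentence: str, slot_values: list[str]) -> str:
    """Greedily insert each missing slot value at its most fluent position."""
    for value in slot_values:
        if value.lower() in sentence.lower():
            continue  # slot already covered by the generator
        words = sentence.split()
        candidates = [
            " ".join(words[:i] + [value] + words[i:])
            for i in range(len(words) + 1)
        ]
        sentence = min(candidates, key=lm_score)  # search: best insertion point
    return sentence

print(insert_missing_slots("The restaurant serves Italian food.",
                           ["Bibimbap House", "riverside"]))
```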
1 code implementation • NAACL 2021 • Shailza Jolly, Sandro Pezzelle, Moin Nabi
We propose EASE, a simple diagnostic tool for Visual Question Answering (VQA) that quantifies the difficulty of an (image, question) sample.
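As a rough illustration of the idea, the sketch below scores a VQA sample's difficulty by how much the human annotator answers attached to it disagree. This normalized answer-entropy proxy is an assumption for illustration, not necessarily the exact score EASE computes.

```python
# Sketch of an answer-diversity difficulty score in the spirit of EASE:
# samples whose annotator answers disagree are treated as harder.
import math
from collections import Counter

def answer_entropy(annotator_answers: list[str]) -> float:
    """Normalized Shannon entropy of annotator answers, in [0, 1]."""
    counts = Counter(a.strip().lower() for a in annotator_answers)
    n = sum(counts.values())
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    max_entropy = math.log2(len(annotator_answers))
    return entropy / max_entropy if max_entropy > 0 else 0.0

# Full agreement -> easy (0.0); full disagreement -> hard (1.0).
print(answer_entropy(["red"] * 10))                          # 0.0
print(answer_entropy(["red", "blue", "2", "yes", "no",
                      "cat", "dog", "left", "right", "1"]))  # 1.0
```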
no code implementations • ACL (GEM) 2021 • Sebastian Gehrmann, Tosin Adewumi, Karmanya Aggarwal, Pawan Sasanka Ammanamanchi, Aremu Anuoluwapo, Antoine Bosselut, Khyathi Raghavi Chandu, Miruna Clinciu, Dipanjan Das, Kaustubh D. Dhole, Wanyu Du, Esin Durmus, Ondřej Dušek, Chris Emezue, Varun Gangal, Cristina Garbacea, Tatsunori Hashimoto, Yufang Hou, Yacine Jernite, Harsh Jhamtani, Yangfeng Ji, Shailza Jolly, Mihir Kale, Dhruv Kumar, Faisal Ladhak, Aman Madaan, Mounica Maddela, Khyati Mahajan, Saad Mahamood, Bodhisattwa Prasad Majumder, Pedro Henrique Martins, Angelina McMillan-Major, Simon Mille, Emiel van Miltenburg, Moin Nadeem, Shashi Narayan, Vitaly Nikolaev, Rubungo Andre Niyongabo, Salomey Osei, Ankur Parikh, Laura Perez-Beltrachini, Niranjan Ramesh Rao, Vikas Raunak, Juan Diego Rodriguez, Sashank Santhanam, João Sedoc, Thibault Sellam, Samira Shaikh, Anastasia Shimorina, Marco Antonio Sobrevilla Cabezudo, Hendrik Strobelt, Nishant Subramani, Wei Xu, Diyi Yang, Akhila Yerukola, Jiawei Zhou
We introduce GEM, a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics.
Ranked #1 on Extreme Summarization on GEM-XSum (tasks: Abstractive Text Summarization, Cross-Lingual Abstractive Summarization, +5)
no code implementations • COLING 2020 • Shailza Jolly, Tobias Falke, Caglar Tirkaz, Daniil Sorokin
Recent progress through advanced neural models has pushed the performance of task-oriented dialog systems to almost perfect accuracy on existing benchmark datasets for intent classification and slot labeling.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Shailza Jolly, Shubham Kapoor
However, these models fail to perform well on rephrasings of a question, which raises important questions such as: "Are these models robust to linguistic variations?"
no code implementations • LANTERN (COLING) 2020 • Stanislav Frolov, Shailza Jolly, Jörn Hees, Andreas Dengel
We create additional training samples by concatenating question and answer (QA) pairs and employ a standard VQA model to provide the T2I model with an auxiliary learning signal.
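A minimal sketch of the augmentation step described above: each question-answer pair attached to an image becomes an extra pseudo-caption for the text-to-image model. The joining template and lower-casing here are illustrative assumptions, not the paper's exact preprocessing.

```python
# Sketch of QA-concatenation augmentation: turn each (question, answer) pair
# into an additional caption-like training string for a T2I model.
def qa_to_caption(question: str, answer: str) -> str:
    """Concatenate a QA pair into a single pseudo-caption."""
    return f"{question.rstrip('?').strip()} {answer.strip().lower()}"

vqa_annotations = [
    ("What animal is on the couch?", "A cat"),
    ("What color is the couch?", "Brown"),
]
extra_captions = [qa_to_caption(q, a) for q, a in vqa_annotations]
print(extra_captions)
# ['What animal is on the couch a cat', 'What color is the couch brown']
```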
1 code implementation • 26 Mar 2020 • Shailza Jolly, Sebastian Palacio, Joachim Folz, Federico Raue, Joern Hees, Andreas Dengel
In recent years, progress in the Visual Question Answering (VQA) field has largely been driven by public challenges and large datasets.
no code implementations • 12 Sep 2018 • Shailza Jolly, Sandro Pezzelle, Tassilo Klein, Andreas Dengel, Moin Nabi
We show that our metric is effective in providing a more fine-grained evaluation both on the quantitative and qualitative level.
no code implementations • 25 Aug 2018 • Shailza Jolly, Brian Kenji Iwana, Ryohei Kuroki, Seiichi Uchida
We use LRP to explain the pixel-wise contributions of book cover design and highlight the design elements contributing towards particular genres.
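For context, the sketch below shows the core Layer-wise Relevance Propagation (LRP) computation for a single dense layer using the common epsilon rule: relevance arriving at the outputs is redistributed to the inputs in proportion to each input's contribution. Applied layer by layer back to the input, this yields the pixel-wise contribution maps referred to above; the layer shapes and epsilon value are illustrative.

```python
# Sketch of the LRP epsilon rule for one dense layer: relevance R_k at each
# output is redistributed to inputs in proportion to z_jk = a_j * w_jk.
import numpy as np

def lrp_dense_epsilon(a: np.ndarray, W: np.ndarray, R_out: np.ndarray,
                      eps: float = 1e-6) -> np.ndarray:
    """Backpropagate relevance through a dense layer (inputs a, weights W)."""
    z = a[:, None] * W                             # contributions z_jk, (in, out)
    zk = z.sum(axis=0)                             # pre-activations z_k, (out,)
    denom = zk + eps * np.where(zk >= 0, 1.0, -1.0)  # epsilon-stabilized
    return (z / denom) @ R_out                     # R_j = sum_k (z_jk / z_k) R_k

a = np.array([1.0, 2.0, 0.5])        # layer inputs (activations)
W = np.random.randn(3, 4)            # weights: 3 inputs -> 4 outputs
R_out = np.abs(np.random.randn(4))   # relevance arriving at the outputs
R_in = lrp_dense_epsilon(a, W, R_out)
print(R_in.sum(), R_out.sum())       # relevance is (approximately) conserved
```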