Search Results for author: Yufang Hou

Found 53 papers, 22 papers with code

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

3 code implementations • 9 Jun 2022 • Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, Ambrose Slone, Ameet Rahane, Anantharaman S. Iyer, Anders Andreassen, Andrea Madotto, Andrea Santilli, Andreas Stuhlmüller, Andrew Dai, Andrew La, Andrew Lampinen, Andy Zou, Angela Jiang, Angelica Chen, Anh Vuong, Animesh Gupta, Anna Gottardi, Antonio Norelli, Anu Venkatesh, Arash Gholamidavoodi, Arfa Tabassum, Arul Menezes, Arun Kirubarajan, Asher Mullokandov, Ashish Sabharwal, Austin Herrick, Avia Efrat, Aykut Erdem, Ayla Karakaş, B. Ryan Roberts, Bao Sheng Loe, Barret Zoph, Bartłomiej Bojanowski, Batuhan Özyurt, Behnam Hedayatnia, Behnam Neyshabur, Benjamin Inden, Benno Stein, Berk Ekmekci, Bill Yuchen Lin, Blake Howald, Bryan Orinion, Cameron Diao, Cameron Dour, Catherine Stinson, Cedrick Argueta, César Ferri Ramírez, Chandan Singh, Charles Rathkopf, Chenlin Meng, Chitta Baral, Chiyu Wu, Chris Callison-Burch, Chris Waites, Christian Voigt, Christopher D. Manning, Christopher Potts, Cindy Ramirez, Clara E. Rivera, Clemencia Siro, Colin Raffel, Courtney Ashcraft, Cristina Garbacea, Damien Sileo, Dan Garrette, Dan Hendrycks, Dan Kilman, Dan Roth, Daniel Freeman, Daniel Khashabi, Daniel Levy, Daniel Moseguí González, Danielle Perszyk, Danny Hernandez, Danqi Chen, Daphne Ippolito, Dar Gilboa, David Dohan, David Drakard, David Jurgens, Debajyoti Datta, Deep Ganguli, Denis Emelin, Denis Kleyko, Deniz Yuret, Derek Chen, Derek Tam, Dieuwke Hupkes, Diganta Misra, Dilyar Buzan, Dimitri Coelho Mollo, Diyi Yang, Dong-Ho Lee, Dylan Schrader, Ekaterina Shutova, Ekin Dogus Cubuk, Elad Segal, Eleanor Hagerman, Elizabeth Barnes, Elizabeth Donoway, Ellie Pavlick, Emanuele Rodola, Emma Lam, Eric Chu, Eric Tang, Erkut Erdem, Ernie Chang, Ethan A. Chi, Ethan Dyer, Ethan Jerzak, Ethan Kim, Eunice Engefu Manyasi, Evgenii Zheltonozhskii, Fanyue Xia, Fatemeh Siar, Fernando Martínez-Plumed, Francesca Happé, Francois Chollet, Frieda Rong, Gaurav Mishra, Genta Indra Winata, Gerard de Melo, Germán Kruszewski, Giambattista Parascandolo, Giorgio Mariani, Gloria Wang, Gonzalo Jaimovitch-López, Gregor Betz, Guy Gur-Ari, Hana Galijasevic, Hannah Kim, Hannah Rashkin, Hannaneh Hajishirzi, Harsh Mehta, Hayden Bogar, Henry Shevlin, Hinrich Schütze, Hiromu Yakura, Hongming Zhang, Hugh Mee Wong, Ian Ng, Isaac Noble, Jaap Jumelet, Jack Geissinger, Jackson Kernion, Jacob Hilton, Jaehoon Lee, Jaime Fernández Fisac, James B. Simon, James Koppel, James Zheng, James Zou, Jan Kocoń, Jana Thompson, Janelle Wingfield, Jared Kaplan, Jarema Radom, Jascha Sohl-Dickstein, Jason Phang, Jason Wei, Jason Yosinski, Jekaterina Novikova, Jelle Bosscher, Jennifer Marsh, Jeremy Kim, Jeroen Taal, Jesse Engel, Jesujoba Alabi, Jiacheng Xu, Jiaming Song, Jillian Tang, Joan Waweru, John Burden, John Miller, John U. Balis, Jonathan Batchelder, Jonathan Berant, Jörg Frohberg, Jos Rozen, Jose Hernandez-Orallo, Joseph Boudeman, Joseph Guerr, Joseph Jones, Joshua B. Tenenbaum, Joshua S. Rule, Joyce Chua, Kamil Kanclerz, Karen Livescu, Karl Krauth, Karthik Gopalakrishnan, Katerina Ignatyeva, Katja Markert, Kaustubh D. Dhole, Kevin Gimpel, Kevin Omondi, Kory Mathewson, Kristen Chiafullo, Ksenia Shkaruta, Kumar Shridhar, Kyle McDonell, Kyle Richardson, Laria Reynolds, Leo Gao, Li Zhang, Liam Dugan, Lianhui Qin, Lidia Contreras-Ochando, Louis-Philippe Morency, Luca Moschella, Lucas Lam, Lucy Noble, Ludwig Schmidt, Luheng He, Luis Oliveros Colón, Luke Metz, Lütfi Kerem Şenel, Maarten Bosma, Maarten Sap, Maartje ter Hoeve, Maheen Farooqi, Manaal Faruqui, Mantas Mazeika, Marco Baturan, Marco Marelli, Marco Maru, Maria Jose Ramírez Quintana, Marie Tolkiehn, Mario Giulianelli, Martha Lewis, Martin Potthast, Matthew L. Leavitt, Matthias Hagen, Mátyás Schubert, Medina Orduna Baitemirova, Melody Arnaud, Melvin McElrath, Michael A. Yee, Michael Cohen, Michael Gu, Michael Ivanitskiy, Michael Starritt, Michael Strube, Michał Swędrowski, Michele Bevilacqua, Michihiro Yasunaga, Mihir Kale, Mike Cain, Mimee Xu, Mirac Suzgun, Mitch Walker, Mo Tiwari, Mohit Bansal, Moin Aminnaseri, Mor Geva, Mozhdeh Gheini, Mukund Varma T, Nanyun Peng, Nathan A. Chi, Nayeon Lee, Neta Gur-Ari Krakover, Nicholas Cameron, Nicholas Roberts, Nick Doiron, Nicole Martinez, Nikita Nangia, Niklas Deckers, Niklas Muennighoff, Nitish Shirish Keskar, Niveditha S. Iyer, Noah Constant, Noah Fiedel, Nuan Wen, Oliver Zhang, Omar Agha, Omar Elbaghdadi, Omer Levy, Owain Evans, Pablo Antonio Moreno Casares, Parth Doshi, Pascale Fung, Paul Pu Liang, Paul Vicol, Pegah Alipoormolabashi, Peiyuan Liao, Percy Liang, Peter Chang, Peter Eckersley, Phu Mon Htut, Pinyu Hwang, Piotr Miłkowski, Piyush Patil, Pouya Pezeshkpour, Priti Oli, Qiaozhu Mei, Qing Lyu, Qinlang Chen, Rabin Banjade, Rachel Etta Rudolph, Raefer Gabriel, Rahel Habacker, Ramon Risco, Raphaël Millière, Rhythm Garg, Richard Barnes, Rif A. Saurous, Riku Arakawa, Robbe Raymaekers, Robert Frank, Rohan Sikand, Roman Novak, Roman Sitelew, Ronan LeBras, Rosanne Liu, Rowan Jacobs, Rui Zhang, Ruslan Salakhutdinov, Ryan Chi, Ryan Lee, Ryan Stovall, Ryan Teehan, Rylan Yang, Sahib Singh, Saif M. Mohammad, Sajant Anand, Sam Dillavou, Sam Shleifer, Sam Wiseman, Samuel Gruetter, Samuel R. Bowman, Samuel S. Schoenholz, Sanghyun Han, Sanjeev Kwatra, Sarah A. Rous, Sarik Ghazarian, Sayan Ghosh, Sean Casey, Sebastian Bischoff, Sebastian Gehrmann, Sebastian Schuster, Sepideh Sadeghi, Shadi Hamdan, Sharon Zhou, Shashank Srivastava, Sherry Shi, Shikhar Singh, Shima Asaadi, Shixiang Shane Gu, Shubh Pachchigar, Shubham Toshniwal, Shyam Upadhyay, Shyamolima Debnath, Siamak Shakeri, Simon Thormeyer, Simone Melzi, Siva Reddy, Sneha Priscilla Makini, Soo-Hwan Lee, Spencer Torene, Sriharsha Hatwar, Stanislas Dehaene, Stefan Divic, Stefano Ermon, Stella Biderman, Stephanie Lin, Stephen Prasad, Steven T. Piantadosi, Stuart M. Shieber, Summer Misherghi, Svetlana Kiritchenko, Swaroop Mishra, Tal Linzen, Tal Schuster, Tao Li, Tao Yu, Tariq Ali, Tatsu Hashimoto, Te-Lin Wu, Théo Desbordes, Theodore Rothschild, Thomas Phan, Tianle Wang, Tiberius Nkinyili, Timo Schick, Timofei Kornev, Titus Tunduny, Tobias Gerstenberg, Trenton Chang, Trishala Neeraj, Tushar Khot, Tyler Shultz, Uri Shaham, Vedant Misra, Vera Demberg, Victoria Nyamai, Vikas Raunak, Vinay Ramasesh, Vinay Uday Prabhu, Vishakh Padmakumar, Vivek Srikumar, William Fedus, William Saunders, William Zhang, Wout Vossen, Xiang Ren, Xiaoyu Tong, Xinran Zhao, Xinyi Wu, Xudong Shen, Yadollah Yaghoobzadeh, Yair Lakretz, Yangqiu Song, Yasaman Bahri, Yejin Choi, Yichi Yang, Yiding Hao, Yifu Chen, Yonatan Belinkov, Yu Hou, Yufang Hou, Yuntao Bai, Zachary Seid, Zhuoye Zhao, Zijian Wang, Zijie J. Wang, ZiRui Wang, Ziyi Wu

BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models.

Common Sense Reasoning Math +1

2,650

Paper
Code

Identification of Tasks, Datasets, Evaluation Metrics, and Numeric Scores for Scientific Leaderboards Construction

1 code implementation • ACL 2019 • Yufang Hou, Charles Jochim, Martin Gleize, Francesca Bonin, Debasis Ganguly

While the fast-paced inception of novel tasks and new datasets helps foster active research in a community towards interesting directions, keeping track of the abundance of research activity in different areas on different datasets is likely to become increasingly difficult.

Ranked #2 on Scientific Results Extraction on NLP-TDMS (Exp, arXiv only)

Scientific Results Extraction

Paper
Code

TDMSci: A Specialized Corpus for Scientific Literature Entity Tagging of Tasks Datasets and Metrics

1 code implementation • EACL 2021 • Yufang Hou, Charles Jochim, Martin Gleize, Francesca Bonin, Debasis Ganguly

Tasks, Datasets and Evaluation Metrics are important concepts for understanding experimental scientific papers.

Data Augmentation

Paper
Code

D2S: Document-to-Slide Generation Via Query-Based Text Summarization

1 code implementation • NAACL 2021 • Edward Sun, Yufang Hou, Dakuo Wang, Yunfeng Zhang, Nancy X. R. Wang

Presentations are critical for communication in all areas of our lives, yet the creation of slide decks is often tedious and time-consuming.

Benchmarking Long Form Question Answering +1

Paper
Code

Ensembling Graph Predictions for AMR Parsing

1 code implementation • NeurIPS 2021 • Hoang Thanh Lam, Gabriele Picco, Yufang Hou, Young-suk Lee, Lam M. Nguyen, Dzung T. Phan, Vanessa López, Ramon Fernandez Astudillo

In many machine learning tasks, models are trained to predict structured data such as graphs.

Ranked #2 on AMR Parsing on LDC2020T02 (using extra training data)

AMR Parsing

Paper
Code

Fantastic Questions and Where to Find Them: FairytaleQA -- An Authentic Dataset for Narrative Comprehension

1 code implementation • 26 Mar 2022 • Ying Xu, Dakuo Wang, Mo Yu, Daniel Ritchie, Bingsheng Yao, Tongshuang Wu, Zheng Zhang, Toby Jia-Jun Li, Nora Bradford, Branda Sun, Tran Bao Hoang, Yisi Sang, Yufang Hou, Xiaojuan Ma, Diyi Yang, Nanyun Peng, Zhou Yu, Mark Warschauer

Through benchmarking with QG models, we show that the QG model trained on FairytaleQA is capable of asking high-quality and more diverse questions.

Ranked #1 on Question Generation on FairytaleQA

Benchmarking Question Answering +2

Paper
Code

End-to-End Construction of NLP Knowledge Graph

1 code implementation • Findings (ACL) 2021 • Ishani Mondal, Yufang Hou, Charles Jochim

Paper
Code

Educational Question Generation of Children Storybooks via Question Type Distribution Learning and Event-Centric Summarization

1 code implementation • ACL 2022 • Zhenjie Zhao, Yufang Hou, Dakuo Wang, Mo Yu, Chengzhong Liu, Xiaojuan Ma

Generating educational questions of fairytales or storybooks is vital for improving children's literacy ability.

Question Answering Question Generation +1

Paper
Code

Bridging Anaphora Resolution as Question Answering

1 code implementation • ACL 2020 • Yufang Hou

Most previous studies on bridging anaphora resolution (Poesio et al., 2004; Hou et al., 2013b; Hou, 2018a) use the pairwise model to tackle the problem and assume that the gold mention information is given.

Bridging Anaphora Resolution Question Answering +1

Paper
Code

Fine-grained Information Status Classification Using Discourse Context-Aware BERT

1 code implementation • COLING 2020 • Yufang Hou

Previous work on bridging anaphora recognition (Hou et al., 2013a) casts the problem as a subtask of learning fine-grained information status (IS).

General Classification

Paper
Code

End-to-end Neural Information Status Classification

1 code implementation • Findings (EMNLP) 2021 • Yufang Hou

In this paper, we propose an end-to-end neural approach for information status classification.

Classification

Paper
Code

Missing Counter-Evidence Renders NLP Fact-Checking Unrealistic for Misinformation

1 code implementation • 25 Oct 2022 • Max Glockner, Yufang Hou, Iryna Gurevych

In our analysis, we show that, by design, existing NLP task definitions for fact-checking cannot refute misinformation as professional fact-checkers do for the majority of claims.

Fact Checking Misinformation

Paper
Code

Matching Pairs: Attributing Fine-Tuned Models to their Pre-Trained Large Language Models

1 code implementation • 15 Jun 2023 • Myles Foley, Ambrish Rawat, Taesung Lee, Yufang Hou, Gabriele Picco, Giulio Zizzo

The wide applicability and adaptability of generative large language models (LLMs) has enabled their rapid adoption.

Paper
Code

HAConvGNN: Hierarchical Attention Based Convolutional Graph Neural Network for Code Documentation Generation in Jupyter Notebooks

2 code implementations • Findings (EMNLP) 2021 • Xuye Liu, Dakuo Wang, April Wang, Yufang Hou, Lingfei Wu

Jupyter notebook allows data scientists to write machine learning code together with its documentation in cells.

Code Documentation Generation Code Summarization +1

Paper
Code

On the Role of Summary Content Units in Text Summarization Evaluation

1 code implementation • 2 Apr 2024 • Marcel Nawrath, Agnieszka Nowak, Tristan Ratz, Danilo C. Walenta, Juri Opitz, Leonardo F. R. Ribeiro, João Sedoc, Daniel Deutsch, Simon Mille, Yixin Liu, Lining Zhang, Sebastian Gehrmann, Saad Mahamood, Miruna Clinciu, Khyathi Chandu, Yufang Hou

At the heart of the Pyramid evaluation method for text summarization lie human written summary content units (SCUs).

Natural Language Inference Sentence +1

Paper
Code

CiteBench: A benchmark for Scientific Citation Text Generation

1 code implementation • 19 Dec 2022 • Martin Funkquist, Ilia Kuznetsov, Yufang Hou, Iryna Gurevych

To address this challenge, we propose CiteBench: a benchmark for citation text generation that unifies multiple diverse datasets and enables standardized evaluation of citation text generation models across task designs and domains.

Text Generation

Paper
Code

'Don't Get Too Technical with Me': A Discourse Structure-Based Framework for Science Journalism

1 code implementation • 23 Oct 2023 • Ronald Cardenas, Bingsheng Yao, Dakuo Wang, Yufang Hou

Science journalism refers to the task of reporting technical findings of a scientific paper as a less technical news article to the general public audience.

Paper
Code

Constrained Multi-Task Learning for Bridging Resolution

1 code implementation • ACL 2022 • Hideo Kobayashi, Yufang Hou, Vincent Ng

We examine the extent to which supervised bridging resolvers can be improved without employing additional labeled bridging data by proposing a novel constrained multi-task learning framework for bridging resolution, within which we (1) design cross-task consistency constraints to guide the learning process; (2) pre-train the entity coreference model in the multi-task framework on the large amount of publicly available coreference data; and (3) integrate prior knowledge encoded in rule-based resolvers.

Multi-Task Learning

Paper
Code

Needle in a Haystack: An Analysis of High-Agreement Workers on MTurk for Summarization

1 code implementation • 20 Dec 2022 • Lining Zhang, Simon Mille, Yufang Hou, Daniel Deutsch, Elizabeth Clark, Yixin Liu, Saad Mahamood, Sebastian Gehrmann, Miruna Clinciu, Khyathi Chandu, João Sedoc

To prevent the costly and inefficient use of resources on low-quality annotations, we want a method for creating a pool of dependable annotators who can effectively complete difficult tasks, such as evaluating automatic summarization.

Paper
Code

Employing Argumentation Knowledge Graphs for Neural Argument Generation

1 code implementation • ACL 2021 • Khalid Al Khatib, Lukas Trautner, Henning Wachsmuth, Yufang Hou, Benno Stein

Generating high-quality arguments, while being challenging, may benefit a wide range of downstream applications, such as writing assistants and argument search engines.

Knowledge Graphs Text Generation

Paper
Code

Enhanced Word Representations for Bridging Anaphora Resolution

no code implementations • NAACL 2018 • Yufang Hou

Most current models of word representations (e.g., GloVe) have successfully captured fine-grained semantics.

Bridging Anaphora Resolution Semantic Similarity +2

Paper

A Deterministic Algorithm for Bridging Anaphora Resolution

no code implementations • EMNLP 2018 • Yufang Hou

Additionally, we further improve the results for bridging anaphora resolution reported in Hou (2018) by combining our simple deterministic approach with Hou et al. (2013b)'s best system MLN II.

Bridging Anaphora Resolution Word Embeddings

Paper

Unrestricted Bridging Resolution

no code implementations • CL 2018 • Yufang Hou, Katja Markert, Michael Strube

The second stage, bridging antecedent selection, finds the antecedents for all predicted bridging anaphors.

General Classification

Paper

Will it Blend? Blending Weak and Strong Labeled Data in a Neural Network for Argumentation Mining

no code implementations • ACL 2018 • Eyal Shnarch, Carlos Alzate, Lena Dankin, Martin Gleize, Yufang Hou, Leshem Choshen, Ranit Aharonov, Noam Slonim

We propose a methodology to blend high quality but scarce strong labeled data with noisy but abundant weak labeled data during the training of neural networks.

Information Retrieval Natural Language Understanding +3

Paper

Argumentation Quality Assessment: Theory vs. Practice

no code implementations • ACL 2017 • Henning Wachsmuth, Nona Naderi, Ivan Habernal, Yufang Hou, Graeme Hirst, Iryna Gurevych, Benno Stein

Argumentation quality is viewed differently in argumentation theory and in practical assessment approaches.

Argument Mining

Paper

Computational Argumentation Quality Assessment in Natural Language

no code implementations • EACL 2017 • Henning Wachsmuth, Nona Naderi, Yufang Hou, Yonatan Bilu, Vinodkumar Prabhakaran, Tim Alberdingk Thijm, Graeme Hirst, Benno Stein

Research on computational argumentation faces the problem of how to automatically assess the quality of an argument or argumentation.

Paper

Know Who Your Friends Are: Understanding Social Connections from Unstructured Text

no code implementations • NAACL 2018 • Léa Deleris, Francesca Bonin, Elizabeth Daly, Stéphane Deparis, Yufang Hou, Charles Jochim, Yassine Lassoued, Killian Levacher

Having an understanding of interpersonal relationships is helpful in many contexts.

Paper

Argument Relation Classification Using a Joint Inference Model

no code implementations • WS 2017 • Yufang Hou, Charles Jochim

In this paper, we address the problem of argument relation classification where argument units are from different texts.

Argument Mining Classification +4

Paper

Incremental Fine-grained Information Status Classification Using Attention-based LSTMs

no code implementations • COLING 2016 • Yufang Hou

Information status plays an important role in discourse processing.

Classification Common Sense Reasoning +1

Paper

Collective Classification for Fine-grained Information Status

no code implementations • ACL 2012 • Katja Markert, Yufang Hou, Michael Strube

Classification Coreference Resolution +1

Paper

Global Inference for Bridging Anaphora Resolution

no code implementations • NAACL 2013 • Yufang Hou, Katja Markert, Michael Strube

Bridging Anaphora Resolution Coreference Resolution +1

Paper

A Rule-Based System for Unrestricted Bridging Resolution: Recognizing Bridging Anaphora and Finding Links to Antecedents

no code implementations • EMNLP 2014 • Yufang Hou, Katja Markert, Michael Strube

Coreference Resolution

Paper

Cascading Collective Classification for Bridging Anaphora Recognition using a Rich Linguistic Feature Set

no code implementations • EMNLP 2013 • Yufang Hou, Katja Markert, Michael Strube

General Classification

Paper

Analyzing Sentiment in Classical Chinese Poetry

no code implementations • WS 2015 • Yufang Hou, Anette Frank

Sentiment Analysis

Paper

Extracting Factual Min/Max Age Information from Clinical Trial Studies

no code implementations • WS 2019 • Yufang Hou, Debasis Ganguly, Lea A. Deleris, Francesca Bonin

Population age information is an essential characteristic of clinical trials.

Passage Retrieval Question Answering +1

Paper

Fine-grained Information Status Classification Using Discourse Context-Aware Self-Attention

no code implementations • 13 Aug 2019 • Yufang Hou

Previous work on bridging anaphora recognition (Hou et al., 2013a) casts the problem as a subtask of learning fine-grained information status (IS).

General Classification

Paper

A Summarization System for Scientific Documents

no code implementations • IJCNLP 2019 • Shai Erera, Michal Shmueli-Scheuer, Guy Feigenblat, Ora Peled Nakash, Odellia Boni, Haggai Roitman, Doron Cohen, Bar Weiner, Yosi Mass, Or Rivlin, Guy Lev, Achiya Jerbi, Jonathan Herzig, Yufang Hou, Charles Jochim, Martin Gleize, Francesca Bonin, David Konopnicki

We present a novel system providing summaries for Computer Science publications.

Paper

Corpus Wide Argument Mining -- a Working Solution

no code implementations • 25 Nov 2019 • Liat Ein-Dor, Eyal Shnarch, Lena Dankin, Alon Halfon, Benjamin Sznajder, Ariel Gera, Carlos Alzate, Martin Gleize, Leshem Choshen, Yufang Hou, Yonatan Bilu, Ranit Aharonov, Noam Slonim

One of the main tasks in argument mining is the retrieval of argumentative content pertaining to a given topic.

Argument Mining Retrieval +1

Paper

HBCP Corpus: A New Resource for the Analysis of Behavioural Change Intervention Reports

no code implementations • LREC 2020 • Francesca Bonin, Martin Gleize, Ailbhe Finnerty, Candice Moore, Charles Jochim, Emma Norris, Yufang Hou, Alison J. Wright, Debasis Ganguly, Emily Hayes, Silje Zink, Alessandra Pascale, Pol Mac Aonghusa, Susan Michie

Due to the fast pace at which research reports in behaviour change are published, researchers, consultants and policymakers would benefit from more automatic ways to process these reports.

Paper

The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics

no code implementations • ACL (GEM) 2021 • Sebastian Gehrmann, Tosin Adewumi, Karmanya Aggarwal, Pawan Sasanka Ammanamanchi, Aremu Anuoluwapo, Antoine Bosselut, Khyathi Raghavi Chandu, Miruna Clinciu, Dipanjan Das, Kaustubh D. Dhole, Wanyu Du, Esin Durmus, Ondřej Dušek, Chris Emezue, Varun Gangal, Cristina Garbacea, Tatsunori Hashimoto, Yufang Hou, Yacine Jernite, Harsh Jhamtani, Yangfeng Ji, Shailza Jolly, Mihir Kale, Dhruv Kumar, Faisal Ladhak, Aman Madaan, Mounica Maddela, Khyati Mahajan, Saad Mahamood, Bodhisattwa Prasad Majumder, Pedro Henrique Martins, Angelina McMillan-Major, Simon Mille, Emiel van Miltenburg, Moin Nadeem, Shashi Narayan, Vitaly Nikolaev, Rubungo Andre Niyongabo, Salomey Osei, Ankur Parikh, Laura Perez-Beltrachini, Niranjan Ramesh Rao, Vikas Raunak, Juan Diego Rodriguez, Sashank Santhanam, João Sedoc, Thibault Sellam, Samira Shaikh, Anastasia Shimorina, Marco Antonio Sobrevilla Cabezudo, Hendrik Strobelt, Nishant Subramani, Wei Xu, Diyi Yang, Akhila Yerukola, Jiawei Zhou

We introduce GEM, a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics.

Ranked #1 on Extreme Summarization on GEM-XSum

Abstractive Text Summarization Cross-Lingual Abstractive Summarization +5

Paper

Probing for Bridging Inference in Transformer Language Models

1 code implementation • NAACL 2021 • Onkar Pandit, Yufang Hou

We probe pre-trained transformer language models for bridging inference.

Bridging Anaphora Resolution Cloze Test

Paper
Code

End-to-End NLP Knowledge Graph Construction

no code implementations • 2 Jun 2021 • Ishani Mondal, Yufang Hou, Charles Jochim

This paper studies the end-to-end construction of an NLP Knowledge Graph (KG) from scientific papers.

Graph Construction

Paper

Overview of the 2021 Key Point Analysis Shared Task

no code implementations • EMNLP (ArgMining) 2021 • Roni Friedman, Lena Dankin, Yufang Hou, Ranit Aharonov, Yoav Katz, Noam Slonim

We describe the 2021 Key Point Analysis (KPA-2021) shared task on key point analysis that we organized as a part of the 8th Workshop on Argument Mining (ArgMining 2021) at EMNLP 2021.

Argument Mining Text Summarization

Paper

Argument Mining for Scholarly Document Processing: Taking Stock and Looking Ahead

no code implementations • NAACL (sdp) 2021 • Khalid Al Khatib, Tirthankar Ghosal, Yufang Hou, Anita de Waard, Dayne Freitag

Argument mining targets structures in natural language related to interpretation and persuasion which are central to scientific communication.

Argument Mining

Paper

Finding Sub-task Structure with Natural Language Instruction

no code implementations • LNLS (ACL) 2022 • Ryokan Ri, Yufang Hou, Radu Marinescu, Akihiro Kishimoto

When mapping a natural language instruction to a sequence of actions, it is often useful to identify sub-tasks in the instruction.

Segmentation

Paper

Fantastic Questions and Where to Find Them: FairytaleQA – An Authentic Dataset for Narrative Comprehension

no code implementations • ACL 2022 • Ying Xu, Dakuo Wang, Mo Yu, Daniel Ritchie, Bingsheng Yao, Tongshuang Wu, Zheng Zhang, Toby Li, Nora Bradford, Branda Sun, Tran Hoang, Yisi Sang, Yufang Hou, Xiaojuan Ma, Diyi Yang, Nanyun Peng, Zhou Yu, Mark Warschauer

Through benchmarking with QG models, we show that the QG model trained on FairytaleQA is capable of asking high-quality and more diverse questions.

Benchmarking Question Answering +2

Paper

GEMv2: Multilingual NLG Benchmarking in a Single Line of Code

no code implementations • 22 Jun 2022 • Sebastian Gehrmann, Abhik Bhattacharjee, Abinaya Mahendiran, Alex Wang, Alexandros Papangelis, Aman Madaan, Angelina McMillan-Major, Anna Shvets, Ashish Upadhyay, Bingsheng Yao, Bryan Wilie, Chandra Bhagavatula, Chaobin You, Craig Thomson, Cristina Garbacea, Dakuo Wang, Daniel Deutsch, Deyi Xiong, Di Jin, Dimitra Gkatzia, Dragomir Radev, Elizabeth Clark, Esin Durmus, Faisal Ladhak, Filip Ginter, Genta Indra Winata, Hendrik Strobelt, Hiroaki Hayashi, Jekaterina Novikova, Jenna Kanerva, Jenny Chim, Jiawei Zhou, Jordan Clive, Joshua Maynez, João Sedoc, Juraj Juraska, Kaustubh Dhole, Khyathi Raghavi Chandu, Laura Perez-Beltrachini, Leonardo F. R. Ribeiro, Lewis Tunstall, Li Zhang, Mahima Pushkarna, Mathias Creutz, Michael White, Mihir Sanjay Kale, Moussa Kamal Eddine, Nico Daheim, Nishant Subramani, Ondrej Dusek, Paul Pu Liang, Pawan Sasanka Ammanamanchi, Qi Zhu, Ratish Puduppully, Reno Kriz, Rifat Shahriyar, Ronald Cardenas, Saad Mahamood, Salomey Osei, Samuel Cahyawijaya, Sanja Štajner, Sebastien Montella, Shailza Jolly, Simon Mille, Tahmid Hasan, Tianhao Shen, Tosin Adewumi, Vikas Raunak, Vipul Raheja, Vitaly Nikolaev, Vivian Tsai, Yacine Jernite, Ying Xu, Yisi Sang, Yixin Liu, Yufang Hou

This problem is especially pertinent in natural language generation which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims.

Benchmarking Text Generation

Paper

NECE: Narrative Event Chain Extraction Toolkit

no code implementations • 17 Aug 2022 • Guangxuan Xu, Paulina Toro Isaza, Moshi Li, Akintoye Oloko, Bingsheng Yao, Cassia Sanctos, Aminat Adebiyi, Yufang Hou, Nanyun Peng, Dakuo Wang

To understand a narrative, it is essential to comprehend the temporal event flows, especially those associated with main characters; however, this can be challenging with lengthy and unstructured narrative texts.

Question Answering

Paper

End-to-End Neural Bridging Resolution

no code implementations • COLING 2022 • Hideo Kobayashi, Yufang Hou, Vincent Ng

The state of bridging resolution research is rather unsatisfactory: not only are state-of-the-art resolvers evaluated in unrealistic settings, but the neural models underlying these resolvers are weaker than those used for entity coreference resolution.

Coreference Resolution

Paper

A Diachronic Analysis of Paradigm Shifts in NLP Research: When, How, and Why?

no code implementations • 22 May 2023 • Aniket Pramanick, Yufang Hou, Saif M. Mohammad, Iryna Gurevych

In this study, we propose a systematic framework for analyzing the evolution of research topics in a scientific field using causal discovery and inference techniques.

Causal Discovery

Paper

Are Fairy Tales Fair? Analyzing Gender Bias in Temporal Narrative Event Chains of Children's Fairy Tales

no code implementations • 26 May 2023 • Paulina Toro Isaza, Guangxuan Xu, Akintoye Oloko, Yufang Hou, Nanyun Peng, Dakuo Wang

Social biases and stereotypes are embedded in our culture in part through their presence in our stories, as evidenced by the rich history of humanities and social science literature analyzing such biases in children's stories.

Paper

How to Handle Different Types of Out-of-Distribution Scenarios in Computational Argumentation? A Comprehensive and Fine-Grained Field Study

no code implementations • 15 Sep 2023 • Andreas Waldis, Yufang Hou, Iryna Gurevych

Our findings challenge the previously asserted general superiority of in-context learning (ICL) for OOD.

Argument Mining In-Context Learning +2

Paper

Dive into the Chasm: Probing the Gap between In- and Cross-Topic Generalization

1 code implementation • 2 Feb 2024 • Andreas Waldis, Yufang Hou, Iryna Gurevych

Pre-trained language models (LMs) perform well in In-Topic setups, where training and testing data come from the same topics.

Paper
Code
