Search Results for author: Vera Demberg

Found 76 papers, 3 papers with code

A practical perspective on connective generation

no code implementations CODI 2021 Frances Yung, Merel Scholman, Vera Demberg

In the current contribution, we analyse whether a sophisticated connective generation module is necessary to select a connective, or whether this can be solved with simple methods (such as random choice between connectives that are known to express a given relation, or usage of a generic language model).

Language Modelling Text Generation

Design Choices in Crowdsourcing Discourse Relation Annotations: The Effect of Worker Selection and Training

no code implementations LREC 2022 Merel Scholman, Valentina Pyatkin, Frances Yung, Ido Dagan, Reut Tsarfaty, Vera Demberg

The current contribution studies the effect of worker selection and training on the agreement on implicit relation labels between workers and gold labels, for both the DC and the QA method.

Comparison of methods for explicit discourse connective identification across various domains

no code implementations CODI 2021 Merel Scholman, Tianai Dong, Frances Yung, Vera Demberg

Existing parse methods use varying approaches to identify explicit discourse connectives, but their performance has not been consistently evaluated in comparison to each other, nor have they been evaluated consistently on text other than newspaper articles.

Semi-automatic discourse annotation in a low-resource language: Developing a connective lexicon for Nigerian Pidgin

no code implementations CODI 2021 Marian Marchal, Merel Scholman, Vera Demberg

The lexicon shows that the majority of Nigerian Pidgin connectives are borrowed from its English lexifier, but that there are also some connectives that are unique to Nigerian Pidgin.

Label distributions help implicit discourse relation classification

no code implementations COLING (CODI, CRAC) 2022 Frances Yung, Kaveri Anuranjana, Merel Scholman, Vera Demberg

Implicit discourse relations can convey more than one relation sense, but much of the research on discourse relations has focused on single relation senses.

Classification Implicit Discourse Relation Classification

Improving Zero-Shot Multilingual Text Generation via Iterative Distillation

no code implementations COLING 2022 Ernie Chang, Alex Marin, Vera Demberg

The demand for multilingual dialogue systems often requires a costly labeling process, where human translators derive utterances in low resource languages from resource rich language annotation.

Knowledge Distillation Text Generation

Establishing Annotation Quality in Multi-label Annotations

no code implementations COLING 2022 Marian Marchal, Merel Scholman, Frances Yung, Vera Demberg

In many linguistic fields requiring annotated data, multiple interpretations of a single item are possible.

Programmable Annotation with Diversed Heuristics and Data Denoising

no code implementations COLING 2022 Ernie Chang, Alex Marin, Vera Demberg

To this end, we propose a novel data programming framework that can jointly construct labeled data for language generation and understanding tasks – by allowing the annotators to modify an automatically-inferred alignment rule set between sequence labels and text, instead of writing rules from scratch.

Denoising Text Generation

Logic-Guided Message Generation from Raw Real-Time Sensor Data

no code implementations LREC 2022 Ernie Chang, Alisa Kovtunova, Stefan Borgwardt, Vera Demberg, Kathryn Chapman, Hui-Syuan Yeh

We find that formulating the task as an end-to-end problem leads to two major challenges in content selection – the sensor data is both redundant and diverse across environments, thereby making it hard for the encoders to select and reason on the data.

Text Generation

DiscoGeM: A Crowdsourced Corpus of Genre-Mixed Implicit Discourse Relations

1 code implementation LREC 2022 Merel Scholman, Tianai Dong, Frances Yung, Vera Demberg

Both the corpus and the dataset can facilitate a multitude of applications and research purposes, for example to function as training data to improve the performance of automatic discourse relation parsers, as well as facilitate research into non-connective signals of discourse relations.

Relation Classification

Visual Writing Prompts: Character-Grounded Story Generation with Curated Image Sequences

no code implementations20 Jan 2023 Xudong Hong, Asad Sayeed, Khushboo Mehra, Vera Demberg, Bernt Schiele

The image sequences are aligned with a total of 12K stories which were collected via crowdsourcing given the image sequences and a set of grounded characters from the corresponding image sequence.

Story Generation

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

1 code implementation9 Jun 2022 Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, Ambrose Slone, Ameet Rahane, Anantharaman S. Iyer, Anders Andreassen, Andrea Madotto, Andrea Santilli, Andreas Stuhlmüller, Andrew Dai, Andrew La, Andrew Lampinen, Andy Zou, Angela Jiang, Angelica Chen, Anh Vuong, Animesh Gupta, Anna Gottardi, Antonio Norelli, Anu Venkatesh, Arash Gholamidavoodi, Arfa Tabassum, Arul Menezes, Arun Kirubarajan, Asher Mullokandov, Ashish Sabharwal, Austin Herrick, Avia Efrat, Aykut Erdem, Ayla Karakaş, B. Ryan Roberts, Bao Sheng Loe, Barret Zoph, Bartłomiej Bojanowski, Batuhan Özyurt, Behnam Hedayatnia, Behnam Neyshabur, Benjamin Inden, Benno Stein, Berk Ekmekci, Bill Yuchen Lin, Blake Howald, Cameron Diao, Cameron Dour, Catherine Stinson, Cedrick Argueta, César Ferri Ramírez, Chandan Singh, Charles Rathkopf, Chenlin Meng, Chitta Baral, Chiyu Wu, Chris Callison-Burch, Chris Waites, Christian Voigt, Christopher D. Manning, Christopher Potts, Cindy Ramirez, Clara E. Rivera, Clemencia Siro, Colin Raffel, Courtney Ashcraft, Cristina Garbacea, Damien Sileo, Dan Garrette, Dan Hendrycks, Dan Kilman, Dan Roth, Daniel Freeman, Daniel Khashabi, Daniel Levy, Daniel Moseguí González, Danielle Perszyk, Danny Hernandez, Danqi Chen, Daphne Ippolito, Dar Gilboa, David Dohan, David Drakard, David Jurgens, Debajyoti Datta, Deep Ganguli, Denis Emelin, Denis Kleyko, Deniz Yuret, Derek Chen, Derek Tam, Dieuwke Hupkes, Diganta Misra, Dilyar Buzan, Dimitri Coelho Mollo, Diyi Yang, Dong-Ho Lee, Ekaterina Shutova, Ekin Dogus Cubuk, Elad Segal, Eleanor Hagerman, Elizabeth Barnes, Elizabeth Donoway, Ellie Pavlick, Emanuele Rodola, Emma Lam, Eric Chu, Eric Tang, Erkut Erdem, Ernie Chang, Ethan A. Chi, Ethan Dyer, Ethan Jerzak, Ethan Kim, Eunice Engefu Manyasi, Evgenii Zheltonozhskii, Fanyue Xia, Fatemeh Siar, Fernando Martínez-Plumed, Francesca Happé, Francois Chollet, Frieda Rong, Gaurav Mishra, Genta Indra Winata, Gerard de Melo, Germán Kruszewski, Giambattista Parascandolo, Giorgio Mariani, Gloria Wang, Gonzalo Jaimovitch-López, Gregor Betz, Guy Gur-Ari, Hana Galijasevic, Hannah Kim, Hannah Rashkin, Hannaneh Hajishirzi, Harsh Mehta, Hayden Bogar, Henry Shevlin, Hinrich Schütze, Hiromu Yakura, Hongming Zhang, Hugh Mee Wong, Ian Ng, Isaac Noble, Jaap Jumelet, Jack Geissinger, Jackson Kernion, Jacob Hilton, Jaehoon Lee, Jaime Fernández Fisac, James B. Simon, James Koppel, James Zheng, James Zou, Jan Kocoń, Jana Thompson, Jared Kaplan, Jarema Radom, Jascha Sohl-Dickstein, Jason Phang, Jason Wei, Jason Yosinski, Jekaterina Novikova, Jelle Bosscher, Jennifer Marsh, Jeremy Kim, Jeroen Taal, Jesse Engel, Jesujoba Alabi, Jiacheng Xu, Jiaming Song, Jillian Tang, Joan Waweru, John Burden, John Miller, John U. Balis, Jonathan Berant, Jörg Frohberg, Jos Rozen, Jose Hernandez-Orallo, Joseph Boudeman, Joseph Jones, Joshua B. Tenenbaum, Joshua S. Rule, Joyce Chua, Kamil Kanclerz, Karen Livescu, Karl Krauth, Karthik Gopalakrishnan, Katerina Ignatyeva, Katja Markert, Kaustubh D. Dhole, Kevin Gimpel, Kevin Omondi, Kory Mathewson, Kristen Chiafullo, Ksenia Shkaruta, Kumar Shridhar, Kyle McDonell, Kyle Richardson, Laria Reynolds, Leo Gao, Li Zhang, Liam Dugan, Lianhui Qin, Lidia Contreras-Ochando, Louis-Philippe Morency, Luca Moschella, Lucas Lam, Lucy Noble, Ludwig Schmidt, Luheng He, Luis Oliveros Colón, Luke Metz, Lütfi Kerem Şenel, Maarten Bosma, Maarten Sap, Maartje ter Hoeve, Maheen Farooqi, Manaal Faruqui, Mantas Mazeika, Marco Baturan, Marco Marelli, Marco Maru, Maria Jose Ramírez Quintana, Marie Tolkiehn, Mario Giulianelli, Martha Lewis, Martin Potthast, Matthew L. Leavitt, Matthias Hagen, Mátyás Schubert, Medina Orduna Baitemirova, Melody Arnaud, Melvin McElrath, Michael A. Yee, Michael Cohen, Michael Gu, Michael Ivanitskiy, Michael Starritt, Michael Strube, Michał Swędrowski, Michele Bevilacqua, Michihiro Yasunaga, Mihir Kale, Mike Cain, Mimee Xu, Mirac Suzgun, Mo Tiwari, Mohit Bansal, Moin Aminnaseri, Mor Geva, Mozhdeh Gheini, Mukund Varma T, Nanyun Peng, Nathan Chi, Nayeon Lee, Neta Gur-Ari Krakover, Nicholas Cameron, Nicholas Roberts, Nick Doiron, Nikita Nangia, Niklas Deckers, Niklas Muennighoff, Nitish Shirish Keskar, Niveditha S. Iyer, Noah Constant, Noah Fiedel, Nuan Wen, Oliver Zhang, Omar Agha, Omar Elbaghdadi, Omer Levy, Owain Evans, Pablo Antonio Moreno Casares, Parth Doshi, Pascale Fung, Paul Pu Liang, Paul Vicol, Pegah Alipoormolabashi, Peiyuan Liao, Percy Liang, Peter Chang, Peter Eckersley, Phu Mon Htut, Pinyu Hwang, Piotr Miłkowski, Piyush Patil, Pouya Pezeshkpour, Priti Oli, Qiaozhu Mei, Qing Lyu, Qinlang Chen, Rabin Banjade, Rachel Etta Rudolph, Raefer Gabriel, Rahel Habacker, Ramón Risco Delgado, Raphaël Millière, Rhythm Garg, Richard Barnes, Rif A. Saurous, Riku Arakawa, Robbe Raymaekers, Robert Frank, Rohan Sikand, Roman Novak, Roman Sitelew, Ronan LeBras, Rosanne Liu, Rowan Jacobs, Rui Zhang, Ruslan Salakhutdinov, Ryan Chi, Ryan Lee, Ryan Stovall, Ryan Teehan, Rylan Yang, Sahib Singh, Saif M. Mohammad, Sajant Anand, Sam Dillavou, Sam Shleifer, Sam Wiseman, Samuel Gruetter, Samuel R. Bowman, Samuel S. Schoenholz, Sanghyun Han, Sanjeev Kwatra, Sarah A. Rous, Sarik Ghazarian, Sayan Ghosh, Sean Casey, Sebastian Bischoff, Sebastian Gehrmann, Sebastian Schuster, Sepideh Sadeghi, Shadi Hamdan, Sharon Zhou, Shashank Srivastava, Sherry Shi, Shikhar Singh, Shima Asaadi, Shixiang Shane Gu, Shubh Pachchigar, Shubham Toshniwal, Shyam Upadhyay, Shyamolima, Debnath, Siamak Shakeri, Simon Thormeyer, Simone Melzi, Siva Reddy, Sneha Priscilla Makini, Soo-Hwan Lee, Spencer Torene, Sriharsha Hatwar, Stanislas Dehaene, Stefan Divic, Stefano Ermon, Stella Biderman, Stephanie Lin, Stephen Prasad, Steven T. Piantadosi, Stuart M. Shieber, Summer Misherghi, Svetlana Kiritchenko, Swaroop Mishra, Tal Linzen, Tal Schuster, Tao Li, Tao Yu, Tariq Ali, Tatsu Hashimoto, Te-Lin Wu, Théo Desbordes, Theodore Rothschild, Thomas Phan, Tianle Wang, Tiberius Nkinyili, Timo Schick, Timofei Kornev, Timothy Telleen-Lawton, Titus Tunduny, Tobias Gerstenberg, Trenton Chang, Trishala Neeraj, Tushar Khot, Tyler Shultz, Uri Shaham, Vedant Misra, Vera Demberg, Victoria Nyamai, Vikas Raunak, Vinay Ramasesh, Vinay Uday Prabhu, Vishakh Padmakumar, Vivek Srikumar, William Fedus, William Saunders, William Zhang, Wout Vossen, Xiang Ren, Xiaoyu Tong, Xinran Zhao, Xinyi Wu, Xudong Shen, Yadollah Yaghoobzadeh, Yair Lakretz, Yangqiu Song, Yasaman Bahri, Yejin Choi, Yichi Yang, Yiding Hao, Yifu Chen, Yonatan Belinkov, Yu Hou, Yufang Hou, Yuntao Bai, Zachary Seid, Zhuoye Zhao, Zijian Wang, Zijie J. Wang, ZiRui Wang, Ziyi Wu

BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models.

Common Sense Reasoning Memorization

Entity Enhancement for Implicit Discourse Relation Classification in the Biomedical Domain

no code implementations ACL 2021 Wei Shi, Vera Demberg

Implicit discourse relation classification is a challenging task, in particular when the text domain is different from the standard Penn Discourse Treebank (PDTB; Prasad et al., 2008) training corpus domain (Wall Street Journal in 1990s).

Implicit Discourse Relation Classification

Exploring the Potential of Lexical Paraphrases for Mitigating Noise-Induced Comprehension Errors

no code implementations18 Jul 2021 Anupama Chingacham, Vera Demberg, Dietrich Klakow

We evaluate the intelligibility of synonyms in context and find that choosing a lexical unit that is less risky to be misheard than its synonym introduced an average gain in comprehension of 37% at SNR -5 dB and 21% at SNR 0 dB for babble noise.

Speech Synthesis

Time-Aware Ancient Chinese Text Translation and Inference

1 code implementation ACL (LChange) 2021 Ernie Chang, Yow-Ting Shiue, Hui-Syuan Yeh, Vera Demberg

In this paper, we aim to address the challenges surrounding the translation of ancient Chinese text: (1) The linguistic gap due to the difference in eras results in translations that are poor in quality, and (2) most translations are missing the contextual information that is often very crucial to understanding the text.

Translation

Neural Data-to-Text Generation with LM-based Text Augmentation

no code implementations EACL 2021 Ernie Chang, Xiaoyu Shen, Dawei Zhu, Vera Demberg, Hui Su

Our approach automatically augments the data available for training by (i) generating new text samples based on replacing specific values by alternative ones from the same category, (ii) generating new text samples based on GPT-2, and (iii) proposing an automatic method for pairing the new text samples with data samples.

Data-to-Text Generation Text Augmentation

Jointly Improving Language Understanding and Generation with Quality-Weighted Weak Supervision of Automatic Labeling

no code implementations EACL 2021 Ernie Chang, Vera Demberg, Alex Marin

Neural natural language generation (NLG) and understanding (NLU) models are data-hungry and require massive amounts of annotated data to be competitive.

Text Generation

Does the Order of Training Samples Matter? Improving Neural Data-to-Text Generation with Curriculum Learning

no code implementations EACL 2021 Ernie Chang, Hui-Syuan Yeh, Vera Demberg

Efforts have been dedicated to improving text generation systems by changing the order of training samples in a process known as curriculum learning.

Data-to-Text Generation

Diverse and Relevant Visual Storytelling with Scene Graph Embeddings

no code implementations CONLL 2020 Xudong Hong, Rakshith Shetty, Asad Sayeed, Khushboo Mehra, Vera Demberg, Bernt Schiele

A problem in automatically generated stories for image sequences is that they use overly generic vocabulary and phrase structure and fail to match the distributional characteristics of human-generated text.

Story Generation Visual Storytelling

DART: A Lightweight Quality-Suggestive Data-to-Text Annotation Tool

no code implementations COLING 2020 Ernie Chang, Jeriah Caplinger, Alex Marin, Xiaoyu Shen, Vera Demberg

We present a lightweight annotation tool, the Data AnnotatoR Tool (DART), for the general task of labeling structured data with textual descriptions.

Active Learning text annotation

Unsupervised Pidgin Text Generation By Pivoting English Data and Self-Training

no code implementations18 Mar 2020 Ernie Chang, David Ifeoluwa Adelani, Xiaoyu Shen, Vera Demberg

In this work, we develop techniques targeted at bridging the gap between Pidgin English and English in the context of natural language generation.

Data-to-Text Generation Machine Translation +1

Improving Language Generation from Feature-Rich Tree-Structured Data with Relational Graph Convolutional Encoders

no code implementations WS 2019 Xudong Hong, Ernie Chang, Vera Demberg

The Multilingual Surface Realization Shared Task 2019 focuses on generating sentences from lemmatized sets of universal dependency parses with rich features.

Data Augmentation Text Generation

Crowdsourcing Discourse Relation Annotations by a Two-Step Connective Insertion Task

no code implementations WS 2019 Frances Yung, Vera Demberg, Merel Scholman

The perspective of being able to crowd-source coherence relations bears the promise of acquiring annotations for new texts quickly, which could then increase the size and variety of discourse-annotated corpora.

A Hybrid Model for Globally Coherent Story Generation

no code implementations WS 2019 Fangzhou Zhai, Vera Demberg, Pavel Shkadzko, Wei Shi, Asad Sayeed

The model exploits a symbolic text planning module to produce text plans, thus reducing the demand of data; a neural surface realization module then generates fluent text conditioned on the text plan.

Story Generation

Verb-Second Effect on Quantifier Scope Interpretation

no code implementations WS 2019 Asad Sayeed, Matthias Lindemann, Vera Demberg

Sentences like {``}Every child climbed a tree{''} have at least two interpretations depending on the precedence order of the universal quantifier and the indefinite.

Toward Bayesian Synchronous Tree Substitution Grammars for Sentence Planning

no code implementations WS 2018 David M. Howcroft, Dietrich Klakow, Vera Demberg

Developing conventional natural language generation systems requires extensive attention from human experts in order to craft complex sets of sentence planning rules.

Text Generation

Acquiring Annotated Data with Cross-lingual Explicitation for Implicit Discourse Relation Classification

no code implementations WS 2019 Wei Shi, Frances Yung, Vera Demberg

Implicit discourse relation classification is one of the most challenging and important tasks in discourse parsing, due to the lack of connective as strong linguistic cues.

Discourse Parsing General Classification +2

Do Speakers Produce Discourse Connectives Rationally?

no code implementations WS 2018 Frances Yung, Vera Demberg

A number of different discourse connectives can be used to mark the same discourse relation, but it is unclear what factors affect connective choice.

Informativeness

Learning distributed event representations with a multi-task approach

no code implementations SEMEVAL 2018 Xudong Hong, Asad Sayeed, Vera Demberg

Human world knowledge contains information about prototypical events and their participants and locations.

Multi-Task Learning

Improving Variational Encoder-Decoders in Dialogue Generation

no code implementations6 Feb 2018 Xiaoyu Shen, Hui Su, Shuzi Niu, Vera Demberg

Variational encoder-decoders (VEDs) have shown promising results in dialogue generation.

Dialogue Generation

G-TUNA: a corpus of referring expressions in German, including duration information

no code implementations WS 2017 David Howcroft, Jorrig Vogels, Vera Demberg

Corpora of referring expressions elicited from human participants in a controlled environment are an important resource for research on automatic referring expression generation.

Referring Expression Referring expression generation +1

How compatible are our discourse annotations? Insights from mapping RST-DT and PDTB annotations

no code implementations28 Apr 2017 Vera Demberg, Fatemeh Torabi Asr, Merel Scholman

Discourse-annotated corpora are an important resource for the community, but they are often annotated according to different frameworks.

Implicit Relations

A Systematic Study of Neural Discourse Models for Implicit Discourse Relation

no code implementations EACL 2017 Attapol Rutherford, Vera Demberg, Nianwen Xue

Here, we propose neural network models that are based on feedforward and long-short term memory architecture and systematically study the effects of varying structures.

Discourse Parsing

On the Need of Cross Validation for Discourse Relation Classification

no code implementations EACL 2017 Wei Shi, Vera Demberg

The task of implicit discourse relation classification has received increased attention in recent years, including two CoNNL shared tasks on the topic.

Classification General Classification +4

Psycholinguistic Models of Sentence Processing Improve Sentence Readability Ranking

no code implementations EACL 2017 David M. Howcroft, Vera Demberg

While previous research on readability has typically focused on document-level measures, recent work in areas such as natural language generation has pointed out the need of sentence-level readability measures.

Information Retrieval Text Generation +1

From OpenCCG to AI Planning: Detecting Infeasible Edges in Sentence Generation

no code implementations COLING 2016 Maximilian Schwenger, {\'A}lvaro Torralba, Joerg Hoffmann, David M. Howcroft, Vera Demberg

The search space in grammar-based natural language generation tasks can get very large, which is particularly problematic when generating long utterances or paragraphs.

Text Generation

Annotating Discourse Relations in Spoken Language: A Comparison of the PDTB and CCR Frameworks

no code implementations LREC 2016 Ines Rehbein, Merel Scholman, Vera Demberg

In discourse relation annotation, there is currently a variety of different frameworks being used, and most of them have been developed and employed mostly on written data.

Cannot find the paper you are looking for? You can Submit a new open access paper.