Search Results for author: Vera Demberg

Found 87 papers, 8 papers with code

Establishing Annotation Quality in Multi-label Annotations

no code implementations • COLING 2022 • Marian Marchal, Merel Scholman, Frances Yung, Vera Demberg

In many linguistic fields requiring annotated data, multiple interpretations of a single item are possible.

Paper
Add Code

Design Choices in Crowdsourcing Discourse Relation Annotations: The Effect of Worker Selection and Training

no code implementations • LREC 2022 • Merel Scholman, Valentina Pyatkin, Frances Yung, Ido Dagan, Reut Tsarfaty, Vera Demberg

The current contribution studies the effect of worker selection and training on the agreement on implicit relation labels between workers and gold labels, for both the DC and the QA method.

Relation

Paper
Add Code

Improving Zero-Shot Multilingual Text Generation via Iterative Distillation

no code implementations • COLING 2022 • Ernie Chang, Alex Marin, Vera Demberg

The demand for multilingual dialogue systems often requires a costly labeling process, where human translators derive utterances in low resource languages from resource rich language annotation.

Knowledge Distillation Text Generation

Paper
Add Code

Zero-shot Script Parsing

no code implementations • COLING 2022 • Fangzhou Zhai, Vera Demberg, Alexander Koller

Script knowledge is useful to a variety of NLP tasks.

Zero-Shot Learning

Paper
Add Code

Few-Shot Pidgin Text Adaptation via Contrastive Fine-Tuning

no code implementations • COLING 2022 • Ernie Chang, Jesujoba O. Alabi, David Ifeoluwa Adelani, Vera Demberg

The surging demand for multilingual dialogue systems often requires a costly labeling process for each language addition.

Text Generation

Paper
Add Code

Semi-automatic discourse annotation in a low-resource language: Developing a connective lexicon for Nigerian Pidgin

no code implementations • CODI 2021 • Marian Marchal, Merel Scholman, Vera Demberg

The lexicon shows that the majority of Nigerian Pidgin connectives are borrowed from its English lexifier, but that there are also some connectives that are unique to Nigerian Pidgin.

Relation

Paper
Add Code

Barch: an English Dataset of Bar Chart Summaries

no code implementations • LREC 2022 • Iza Škrjanec, Muhammad Salman Edhi, Vera Demberg

This dataset contains 47 charts based on a selection of 18 topics.

Text Generation

Paper
Add Code

DiscoGeM: A Crowdsourced Corpus of Genre-Mixed Implicit Discourse Relations

1 code implementation • LREC 2022 • Merel Scholman, Tianai Dong, Frances Yung, Vera Demberg

Both the corpus and the dataset can serve a range of applications and research purposes, for example as training data to improve the performance of automatic discourse relation parsers, or as a basis for research into non-connective signals of discourse relations.

Relation Relation Classification

Paper
Code

Comparison of methods for explicit discourse connective identification across various domains

no code implementations • CODI 2021 • Merel Scholman, Tianai Dong, Frances Yung, Vera Demberg

Existing parse methods use varying approaches to identify explicit discourse connectives, but their performance has not been consistently evaluated in comparison to each other, nor have they been evaluated consistently on text other than newspaper articles.

Paper
Add Code

A practical perspective on connective generation

no code implementations • CODI 2021 • Frances Yung, Merel Scholman, Vera Demberg

In the current contribution, we analyse whether a sophisticated connective generation module is necessary to select a connective, or whether this can be solved with simple methods (such as random choice between connectives that are known to express a given relation, or usage of a generic language model).

Language Modelling Relation +1

Paper
Add Code

Two-Stage Movie Script Summarization: An Efficient Method For Low-Resource Long Document Summarization

no code implementations • COLING (CreativeSumm) 2022 • Dongqi Pu, Xudong Hong, Pin-Jie Lin, Ernie Chang, Vera Demberg

The Creative Summarization Shared Task at COLING 2022 aspires to generate summaries given long-form texts from creative writing.

8k Decoder +1

Paper
Add Code

Programmable Annotation with Diversed Heuristics and Data Denoising

no code implementations • COLING 2022 • Ernie Chang, Alex Marin, Vera Demberg

To this end, we propose a novel data programming framework that can jointly construct labeled data for language generation and understanding tasks – by allowing the annotators to modify an automatically-inferred alignment rule set between sequence labels and text, instead of writing rules from scratch.

Denoising Text Generation

Paper
Add Code

Logic-Guided Message Generation from Raw Real-Time Sensor Data

no code implementations • LREC 2022 • Ernie Chang, Alisa Kovtunova, Stefan Borgwardt, Vera Demberg, Kathryn Chapman, Hui-Syuan Yeh

We find that formulating the task as an end-to-end problem leads to two major challenges in content selection – the sensor data is both redundant and diverse across environments, thereby making it hard for the encoders to select and reason on the data.

Text Generation

Paper
Add Code

Label distributions help implicit discourse relation classification

no code implementations • COLING (CODI, CRAC) 2022 • Frances Yung, Kaveri Anuranjana, Merel Scholman, Vera Demberg

Implicit discourse relations can convey more than one relation sense, but much of the research on discourse relations has focused on single relation senses.

Classification Implicit Discourse Relation Classification +1

Paper
Add Code

RST-LoRA: A Discourse-Aware Low-Rank Adaptation for Long Document Abstractive Summarization

no code implementations • 1 May 2024 • Dongqi Pu, Vera Demberg

For long document summarization, discourse structure is important to discern the key content of the text and the differences in importance level between sentences.

Abstractive Text Summarization Document Summarization

Paper
Add Code

Modeling Orthographic Variation Improves NLP Performance for Nigerian Pidgin

no code implementations • 28 Apr 2024 • Pin-Jie Lin, Merel Scholman, Muhammed Saeed, Vera Demberg

We test the effect of this data augmentation on two critical NLP tasks: machine translation and sentiment analysis.

Data Augmentation Machine Translation +2

Paper
Add Code

SciNews: From Scholarly Complexities to Public Narratives -- A Dataset for Scientific News Report Generation

no code implementations • 26 Mar 2024 • Dongqi Pu, Yifan Wang, Jia Loy, Vera Demberg

Scientific news reports serve as a bridge, adeptly translating complex research articles into reports that resonate with the broader public.

Text Generation

Paper
Add Code

Prompting Implicit Discourse Relation Annotation

no code implementations • 7 Feb 2024 • Frances Yung, Mansoor Ahmad, Merel Scholman, Vera Demberg

Pre-trained large language models, such as ChatGPT, achieve outstanding performance in various reasoning tasks without supervised training and were found to have outperformed crowdsourcing workers.

Classification Implicit Discourse Relation Classification +3

Paper
Add Code

Improving fit to human reading times via temperature-scaled surprisal

1 code implementation • 15 Nov 2023 • Tong Liu, Iza Škrjanec, Vera Demberg

We propose using temperature-scaled surprisal, a surprisal calculated from shaped probability, as a predictor of human reading times.

Paper
Code

Do large language models and humans have similar behaviors in causal inference with script knowledge?

1 code implementation • 13 Nov 2023 • Xudong Hong, Margarita Ryzhova, Daniel Adrian Biondi, Vera Demberg

However, reading times remain similar when cause A is not explicitly mentioned, indicating that humans can easily infer event B from their script knowledge.

Causal Inference

Paper
Code

Tackling Hallucinations in Neural Chart Summarization

1 code implementation • 1 Aug 2023 • Saad Obaid ul Islam, Iza Škrjanec, Ondřej Dušek, Vera Demberg

Hallucinations in text generation occur when the system produces text that is not grounded in the input.

Natural Language Inference Text Generation

Paper
Code

Revisiting Sample Size Determination in Natural Language Understanding

1 code implementation • 1 Jul 2023 • Ernie Chang, Muhammad Hassan Rashid, Pin-Jie Lin, Changsheng Zhao, Vera Demberg, Yangyang Shi, Vikas Chandra

Knowing exactly how many data points need to be labeled to achieve a certain model performance is a hugely beneficial step towards reducing the overall budgets for annotation.

Active Learning Natural Language Understanding

Paper
Code

ChatGPT vs Human-authored Text: Insights into Controllable Text Summarization and Sentence Style Transfer

no code implementations • 13 Jun 2023 • Dongqi Pu, Vera Demberg

Large-scale language models, like ChatGPT, have garnered significant media attention and stunned the public with their remarkable capacity for generating coherent text from short natural language prompts.

Sentence Style Transfer +1

Paper
Add Code

Incorporating Distributions of Discourse Structure for Long Document Abstractive Summarization

no code implementations • 26 May 2023 • Dongqi Pu, Yifan Wang, Vera Demberg

For text summarization, the role of discourse structure is pivotal in discerning the core content of a text.

Abstractive Text Summarization

Paper
Add Code

Design Choices for Crowdsourcing Implicit Discourse Relations: Revealing the Biases Introduced by Task Design

1 code implementation • 3 Apr 2023 • Valentina Pyatkin, Frances Yung, Merel C. J. Scholman, Reut Tsarfaty, Ido Dagan, Vera Demberg

Disagreement in natural language annotation has mostly been studied from a perspective of biases introduced by the annotators and the annotation frameworks.

Paper
Code

Visual Writing Prompts: Character-Grounded Story Generation with Curated Image Sequences

no code implementations • 20 Jan 2023 • Xudong Hong, Asad Sayeed, Khushboo Mehra, Vera Demberg, Bernt Schiele

The image sequences are aligned with a total of 12K stories which were collected via crowdsourcing given the image sequences and a set of grounded characters from the corresponding image sequence.

Coherence Evaluation Grounded language learning +3

Paper
Add Code

A Data-Driven Investigation of Noise-Adaptive Utterance Generation with Linguistic Modification

no code implementations • 19 Oct 2022 • Anupama Chingacham, Vera Demberg, Dietrich Klakow

In noisy environments, speech can be hard to understand for humans.

Speech Synthesis Text Generation

Paper
Add Code

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

3 code implementations • 9 Jun 2022 • Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, Ambrose Slone, Ameet Rahane, Anantharaman S. Iyer, Anders Andreassen, Andrea Madotto, Andrea Santilli, Andreas Stuhlmüller, Andrew Dai, Andrew La, Andrew Lampinen, Andy Zou, Angela Jiang, Angelica Chen, Anh Vuong, Animesh Gupta, Anna Gottardi, Antonio Norelli, Anu Venkatesh, Arash Gholamidavoodi, Arfa Tabassum, Arul Menezes, Arun Kirubarajan, Asher Mullokandov, Ashish Sabharwal, Austin Herrick, Avia Efrat, Aykut Erdem, Ayla Karakaş, B. Ryan Roberts, Bao Sheng Loe, Barret Zoph, Bartłomiej Bojanowski, Batuhan Özyurt, Behnam Hedayatnia, Behnam Neyshabur, Benjamin Inden, Benno Stein, Berk Ekmekci, Bill Yuchen Lin, Blake Howald, Bryan Orinion, Cameron Diao, Cameron Dour, Catherine Stinson, Cedrick Argueta, César Ferri Ramírez, Chandan Singh, Charles Rathkopf, Chenlin Meng, Chitta Baral, Chiyu Wu, Chris Callison-Burch, Chris Waites, Christian Voigt, Christopher D. Manning, Christopher Potts, Cindy Ramirez, Clara E. Rivera, Clemencia Siro, Colin Raffel, Courtney Ashcraft, Cristina Garbacea, Damien Sileo, Dan Garrette, Dan Hendrycks, Dan Kilman, Dan Roth, Daniel Freeman, Daniel Khashabi, Daniel Levy, Daniel Moseguí González, Danielle Perszyk, Danny Hernandez, Danqi Chen, Daphne Ippolito, Dar Gilboa, David Dohan, David Drakard, David Jurgens, Debajyoti Datta, Deep Ganguli, Denis Emelin, Denis Kleyko, Deniz Yuret, Derek Chen, Derek Tam, Dieuwke Hupkes, Diganta Misra, Dilyar Buzan, Dimitri Coelho Mollo, Diyi Yang, Dong-Ho Lee, Dylan Schrader, Ekaterina Shutova, Ekin Dogus Cubuk, Elad Segal, Eleanor Hagerman, Elizabeth Barnes, Elizabeth Donoway, Ellie Pavlick, Emanuele Rodola, Emma Lam, Eric Chu, Eric Tang, Erkut Erdem, Ernie Chang, Ethan A. Chi, Ethan Dyer, Ethan Jerzak, Ethan Kim, Eunice Engefu Manyasi, Evgenii Zheltonozhskii, Fanyue Xia, Fatemeh Siar, Fernando Martínez-Plumed, Francesca Happé, Francois Chollet, Frieda Rong, Gaurav Mishra, Genta Indra Winata, Gerard de Melo, Germán Kruszewski, Giambattista Parascandolo, Giorgio Mariani, Gloria Wang, Gonzalo Jaimovitch-López, Gregor Betz, Guy Gur-Ari, Hana Galijasevic, Hannah Kim, Hannah Rashkin, Hannaneh Hajishirzi, Harsh Mehta, Hayden Bogar, Henry Shevlin, Hinrich Schütze, Hiromu Yakura, Hongming Zhang, Hugh Mee Wong, Ian Ng, Isaac Noble, Jaap Jumelet, Jack Geissinger, Jackson Kernion, Jacob Hilton, Jaehoon Lee, Jaime Fernández Fisac, James B. Simon, James Koppel, James Zheng, James Zou, Jan Kocoń, Jana Thompson, Janelle Wingfield, Jared Kaplan, Jarema Radom, Jascha Sohl-Dickstein, Jason Phang, Jason Wei, Jason Yosinski, Jekaterina Novikova, Jelle Bosscher, Jennifer Marsh, Jeremy Kim, Jeroen Taal, Jesse Engel, Jesujoba Alabi, Jiacheng Xu, Jiaming Song, Jillian Tang, Joan Waweru, John Burden, John Miller, John U. Balis, Jonathan Batchelder, Jonathan Berant, Jörg Frohberg, Jos Rozen, Jose Hernandez-Orallo, Joseph Boudeman, Joseph Guerr, Joseph Jones, Joshua B. Tenenbaum, Joshua S. Rule, Joyce Chua, Kamil Kanclerz, Karen Livescu, Karl Krauth, Karthik Gopalakrishnan, Katerina Ignatyeva, Katja Markert, Kaustubh D. Dhole, Kevin Gimpel, Kevin Omondi, Kory Mathewson, Kristen Chiafullo, Ksenia Shkaruta, Kumar Shridhar, Kyle McDonell, Kyle Richardson, Laria Reynolds, Leo Gao, Li Zhang, Liam Dugan, Lianhui Qin, Lidia Contreras-Ochando, Louis-Philippe Morency, Luca Moschella, Lucas Lam, Lucy Noble, Ludwig Schmidt, Luheng He, Luis Oliveros Colón, Luke Metz, Lütfi Kerem Şenel, Maarten Bosma, Maarten Sap, Maartje ter Hoeve, Maheen Farooqi, Manaal Faruqui, Mantas Mazeika, Marco Baturan, Marco Marelli, Marco Maru, Maria Jose Ramírez Quintana, Marie Tolkiehn, Mario Giulianelli, Martha Lewis, Martin Potthast, Matthew L. Leavitt, Matthias Hagen, Mátyás Schubert, Medina Orduna Baitemirova, Melody Arnaud, Melvin McElrath, Michael A. Yee, Michael Cohen, Michael Gu, Michael Ivanitskiy, Michael Starritt, Michael Strube, Michał Swędrowski, Michele Bevilacqua, Michihiro Yasunaga, Mihir Kale, Mike Cain, Mimee Xu, Mirac Suzgun, Mitch Walker, Mo Tiwari, Mohit Bansal, Moin Aminnaseri, Mor Geva, Mozhdeh Gheini, Mukund Varma T, Nanyun Peng, Nathan A. Chi, Nayeon Lee, Neta Gur-Ari Krakover, Nicholas Cameron, Nicholas Roberts, Nick Doiron, Nicole Martinez, Nikita Nangia, Niklas Deckers, Niklas Muennighoff, Nitish Shirish Keskar, Niveditha S. Iyer, Noah Constant, Noah Fiedel, Nuan Wen, Oliver Zhang, Omar Agha, Omar Elbaghdadi, Omer Levy, Owain Evans, Pablo Antonio Moreno Casares, Parth Doshi, Pascale Fung, Paul Pu Liang, Paul Vicol, Pegah Alipoormolabashi, Peiyuan Liao, Percy Liang, Peter Chang, Peter Eckersley, Phu Mon Htut, Pinyu Hwang, Piotr Miłkowski, Piyush Patil, Pouya Pezeshkpour, Priti Oli, Qiaozhu Mei, Qing Lyu, Qinlang Chen, Rabin Banjade, Rachel Etta Rudolph, Raefer Gabriel, Rahel Habacker, Ramon Risco, Raphaël Millière, Rhythm Garg, Richard Barnes, Rif A. Saurous, Riku Arakawa, Robbe Raymaekers, Robert Frank, Rohan Sikand, Roman Novak, Roman Sitelew, Ronan LeBras, Rosanne Liu, Rowan Jacobs, Rui Zhang, Ruslan Salakhutdinov, Ryan Chi, Ryan Lee, Ryan Stovall, Ryan Teehan, Rylan Yang, Sahib Singh, Saif M. Mohammad, Sajant Anand, Sam Dillavou, Sam Shleifer, Sam Wiseman, Samuel Gruetter, Samuel R. Bowman, Samuel S. Schoenholz, Sanghyun Han, Sanjeev Kwatra, Sarah A. Rous, Sarik Ghazarian, Sayan Ghosh, Sean Casey, Sebastian Bischoff, Sebastian Gehrmann, Sebastian Schuster, Sepideh Sadeghi, Shadi Hamdan, Sharon Zhou, Shashank Srivastava, Sherry Shi, Shikhar Singh, Shima Asaadi, Shixiang Shane Gu, Shubh Pachchigar, Shubham Toshniwal, Shyam Upadhyay, Shyamolima Debnath, Siamak Shakeri, Simon Thormeyer, Simone Melzi, Siva Reddy, Sneha Priscilla Makini, Soo-Hwan Lee, Spencer Torene, Sriharsha Hatwar, Stanislas Dehaene, Stefan Divic, Stefano Ermon, Stella Biderman, Stephanie Lin, Stephen Prasad, Steven T. Piantadosi, Stuart M. Shieber, Summer Misherghi, Svetlana Kiritchenko, Swaroop Mishra, Tal Linzen, Tal Schuster, Tao Li, Tao Yu, Tariq Ali, Tatsu Hashimoto, Te-Lin Wu, Théo Desbordes, Theodore Rothschild, Thomas Phan, Tianle Wang, Tiberius Nkinyili, Timo Schick, Timofei Kornev, Titus Tunduny, Tobias Gerstenberg, Trenton Chang, Trishala Neeraj, Tushar Khot, Tyler Shultz, Uri Shaham, Vedant Misra, Vera Demberg, Victoria Nyamai, Vikas Raunak, Vinay Ramasesh, Vinay Uday Prabhu, Vishakh Padmakumar, Vivek Srikumar, William Fedus, William Saunders, William Zhang, Wout Vossen, Xiang Ren, Xiaoyu Tong, Xinran Zhao, Xinyi Wu, Xudong Shen, Yadollah Yaghoobzadeh, Yair Lakretz, Yangqiu Song, Yasaman Bahri, Yejin Choi, Yichi Yang, Yiding Hao, Yifu Chen, Yonatan Belinkov, Yu Hou, Yufang Hou, Yuntao Bai, Zachary Seid, Zhuoye Zhao, Zijian Wang, Zijie J. Wang, ZiRui Wang, Ziyi Wu

BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models.

Common Sense Reasoning Math +1

Paper
Code

The SelectGen Challenge: Finding the Best Training Samples for Few-Shot Neural Text Generation

no code implementations • INLG (ACL) 2021 • Ernie Chang, Xiaoyu Shen, Alex Marin, Vera Demberg

We propose a shared task on training instance selection for few-shot neural text generation.

Text Generation

Paper
Add Code

Entity Enhancement for Implicit Discourse Relation Classification in the Biomedical Domain

no code implementations • ACL 2021 • Wei Shi, Vera Demberg

Implicit discourse relation classification is a challenging task, in particular when the text domain is different from the standard Penn Discourse Treebank (PDTB; Prasad et al., 2008) training corpus domain (Wall Street Journal in 1990s).

Implicit Discourse Relation Classification Relation

Paper
Add Code

Exploring the Potential of Lexical Paraphrases for Mitigating Noise-Induced Comprehension Errors

no code implementations • 18 Jul 2021 • Anupama Chingacham, Vera Demberg, Dietrich Klakow

We evaluate the intelligibility of synonyms in context and find that choosing a lexical unit that is less risky to be misheard than its synonym introduced an average gain in comprehension of 37% at SNR -5 dB and 21% at SNR 0 dB for babble noise.

Speech Synthesis

Paper
Add Code

On Training Instance Selection for Few-Shot Neural Text Generation

no code implementations • ACL 2021 • Ernie Chang, Xiaoyu Shen, Hui-Syuan Yeh, Vera Demberg

In this work, we present a study on training instance selection in few-shot neural text generation.

Clustering Data-to-Text Generation +3

Paper
Add Code

Time-Aware Ancient Chinese Text Translation and Inference

1 code implementation • ACL (LChange) 2021 • Ernie Chang, Yow-Ting Shiue, Hui-Syuan Yeh, Vera Demberg

In this paper, we aim to address the challenges surrounding the translation of ancient Chinese text: (1) The linguistic gap due to the difference in eras results in translations that are poor in quality, and (2) most translations are missing the contextual information that is often very crucial to understanding the text.

Translation

Paper
Code

Does the Order of Training Samples Matter? Improving Neural Data-to-Text Generation with Curriculum Learning

no code implementations • EACL 2021 • Ernie Chang, Hui-Syuan Yeh, Vera Demberg

Efforts have been dedicated to improving text generation systems by changing the order of training samples in a process known as curriculum learning.

Data-to-Text Generation

Paper
Add Code

Jointly Improving Language Understanding and Generation with Quality-Weighted Weak Supervision of Automatic Labeling

no code implementations • EACL 2021 • Ernie Chang, Vera Demberg, Alex Marin

Neural natural language generation (NLG) and understanding (NLU) models are data-hungry and require massive amounts of annotated data to be competitive.

Text Generation

Paper
Add Code

Neural Data-to-Text Generation with LM-based Text Augmentation

no code implementations • EACL 2021 • Ernie Chang, Xiaoyu Shen, Dawei Zhu, Vera Demberg, Hui Su

Our approach automatically augments the data available for training by (i) generating new text samples based on replacing specific values by alternative ones from the same category, (ii) generating new text samples based on GPT-2, and (iii) proposing an automatic method for pairing the new text samples with data samples.

Data-to-Text Generation Text Augmentation

Paper
Add Code

Story Generation with Rich Details

no code implementations • COLING 2020 • Fangzhou Zhai, Vera Demberg, Alexander Koller

Automatically generated stories need to be not only coherent, but also interesting.

Informativeness Story Generation

Paper
Add Code

Diverse and Relevant Visual Storytelling with Scene Graph Embeddings

no code implementations • CONLL 2020 • Xudong Hong, Rakshith Shetty, Asad Sayeed, Khushboo Mehra, Vera Demberg, Bernt Schiele

A problem in automatically generated stories for image sequences is that they use overly generic vocabulary and phrase structure and fail to match the distributional characteristics of human-generated text.

Ranked #5 on Visual Storytelling on VIST

Visual Storytelling

Paper
Add Code

DART: A Lightweight Quality-Suggestive Data-to-Text Annotation Tool

no code implementations • COLING 2020 • Ernie Chang, Jeriah Caplinger, Alex Marin, Xiaoyu Shen, Vera Demberg

We present a lightweight annotation tool, the Data AnnotatoR Tool (DART), for the general task of labeling structured data with textual descriptions.

Active Learning text annotation

Paper
Add Code

Unsupervised Pidgin Text Generation By Pivoting English Data and Self-Training

no code implementations • 18 Mar 2020 • Ernie Chang, David Ifeoluwa Adelani, Xiaoyu Shen, Vera Demberg

In this work, we develop techniques targeted at bridging the gap between Pidgin English and English in the context of natural language generation.

Data-to-Text Generation Machine Translation +1

Paper
Add Code

Improving Language Generation from Feature-Rich Tree-Structured Data with Relational Graph Convolutional Encoders

no code implementations • WS 2019 • Xudong Hong, Ernie Chang, Vera Demberg

The Multilingual Surface Realization Shared Task 2019 focuses on generating sentences from lemmatized sets of universal dependency parses with rich features.

Data Augmentation Text Generation

Paper
Add Code

Next Sentence Prediction helps Implicit Discourse Relation Classification within and across Domains

no code implementations • IJCNLP 2019 • Wei Shi, Vera Demberg

Implicit discourse relation classification is one of the most difficult tasks in discourse parsing.

Discourse Parsing General Classification +3

Paper
Add Code

A Hybrid Model for Globally Coherent Story Generation

no code implementations • WS 2019 • Fangzhou Zhai, Vera Demberg, Pavel Shkadzko, Wei Shi, Asad Sayeed

The model exploits a symbolic text planning module to produce text plans, thus reducing the demand of data; a neural surface realization module then generates fluent text conditioned on the text plan.

Story Generation

Paper
Add Code

Crowdsourcing Discourse Relation Annotations by a Two-Step Connective Insertion Task

no code implementations • WS 2019 • Frances Yung, Vera Demberg, Merel Scholman

The perspective of being able to crowd-source coherence relations bears the promise of acquiring annotations for new texts quickly, which could then increase the size and variety of discourse-annotated corpora.

Relation Vocal Bursts Valence Prediction

Paper
Add Code

Verb-Second Effect on Quantifier Scope Interpretation

no code implementations • WS 2019 • Asad Sayeed, Matthias Lindemann, Vera Demberg

Sentences like "Every child climbed a tree" have at least two interpretations depending on the precedence order of the universal quantifier and the indefinite.

World Knowledge

Paper
Add Code

Learning to Explicitate Connectives with Seq2Seq Network for Implicit Discourse Relation Classification

no code implementations • WS 2019 • Wei Shi, Vera Demberg

Implicit discourse relation classification is one of the most difficult steps in discourse parsing.

Classification Discourse Parsing +3

Paper
Add Code

Toward Bayesian Synchronous Tree Substitution Grammars for Sentence Planning

no code implementations • WS 2018 • David M. Howcroft, Dietrich Klakow, Vera Demberg

Developing conventional natural language generation systems requires extensive attention from human experts in order to craft complex sets of sentence planning rules.

Sentence Text Generation

Paper
Add Code

Using Universal Dependencies in cross-linguistic complexity research

no code implementations • WS 2018 • Aleksandrs Berdicevskis, Çağrı Çöltekin, Katharina Ehret, Kilu von Prince, Daniel Ross, Bill Thompson, Chunxiao Yan, Vera Demberg, Gary Lupyan, Taraka Rama, Christian Bentz

We evaluate corpus-based measures of linguistic complexity obtained using Universal Dependencies (UD) treebanks.

Paper
Add Code

Acquiring Annotated Data with Cross-lingual Explicitation for Implicit Discourse Relation Classification

no code implementations • WS 2019 • Wei Shi, Frances Yung, Vera Demberg

Implicit discourse relation classification is one of the most challenging and important tasks in discourse parsing, due to the lack of connectives as strong linguistic cues.

16k Discourse Parsing +5

Paper
Add Code

Do Speakers Produce Discourse Connectives Rationally?

no code implementations • WS 2018 • Frances Yung, Vera Demberg

A number of different discourse connectives can be used to mark the same discourse relation, but it is unclear what factors affect connective choice.

Informativeness

Paper
Add Code

Learning distributed event representations with a multi-task approach

no code implementations • SEMEVAL 2018 • Xudong Hong, Asad Sayeed, Vera Demberg

Human world knowledge contains information about prototypical events and their participants and locations.

Multi-Task Learning World Knowledge

Paper
Add Code

Rollenwechsel-English: a large-scale semantic role corpus

no code implementations • LREC 2018 • Asad Sayeed, Pavel Shkadzko, Vera Demberg

Language Modelling

Paper
Add Code

A vision-grounded dataset for predicting typical locations for verbs

no code implementations • LREC 2018 • Nelson Mukuze, Anna Rohrbach, Vera Demberg, Bernt Schiele

Common Sense Reasoning Image Captioning

Paper
Add Code

Improving Variational Encoder-Decoders in Dialogue Generation

no code implementations • 6 Feb 2018 • Xiaoyu Shen, Hui Su, Shuzi Niu, Vera Demberg

Variational encoder-decoders (VEDs) have shown promising results in dialogue generation.

Dialogue Generation

Paper
Add Code

Using Explicit Discourse Connectives in Translation for Implicit Discourse Relation Classification

no code implementations • IJCNLP 2017 • Wei Shi, Frances Yung, Raphael Rubino, Vera Demberg

Implicit discourse relation recognition is an extremely challenging task due to the lack of indicative connectives.

General Classification Implicit Discourse Relation Classification +4

Paper
Add Code

G-TUNA: a corpus of referring expressions in German, including duration information

no code implementations • WS 2017 • David Howcroft, Jorrig Vogels, Vera Demberg

Corpora of referring expressions elicited from human participants in a controlled environment are an important resource for research on automatic referring expression generation.

Referring Expression Referring expression generation +1

Paper
Add Code

How compatible are our discourse annotations? Insights from mapping RST-DT and PDTB annotations

no code implementations • 28 Apr 2017 • Vera Demberg, Fatemeh Torabi Asr, Merel Scholman

Discourse-annotated corpora are an important resource for the community, but they are often annotated according to different frameworks.

Implicit Relations Relation

Paper
Add Code

On the Need of Cross Validation for Discourse Relation Classification

no code implementations • EACL 2017 • Wei Shi, Vera Demberg

The task of implicit discourse relation classification has received increased attention in recent years, including two CoNLL shared tasks on the topic.

Classification General Classification +5

Paper
Add Code

A Systematic Study of Neural Discourse Models for Implicit Discourse Relation

no code implementations • EACL 2017 • Attapol Rutherford, Vera Demberg, Nianwen Xue

Here, we propose neural network models that are based on feedforward and long short-term memory architectures and systematically study the effects of varying structures.

Discourse Parsing Relation

Paper
Add Code

Crowdsourcing discourse interpretations: On the influence of context and the reliability of a connective insertion task

no code implementations • WS 2017 • Merel Scholman, Vera Demberg

In this paper, we investigate whether crowdsourcing can be used to obtain reliable discourse relation annotations.

Relation

Paper
Add Code

Psycholinguistic Models of Sentence Processing Improve Sentence Readability Ranking

no code implementations • EACL 2017 • David M. Howcroft, Vera Demberg

While previous research on readability has typically focused on document-level measures, recent work in areas such as natural language generation has pointed out the need for sentence-level readability measures.

Information Retrieval Sentence +2

Paper
Add Code

Modeling Semantic Expectation: Using Script Knowledge for Referent Prediction

no code implementations • TACL 2017 • Ashutosh Modi, Ivan Titov, Vera Demberg, Asad Sayeed, Manfred Pinkal

Recent research in psycholinguistics has provided increasing evidence that humans predict upcoming content.

Common Sense Reasoning Referring Expression

Paper
Add Code

From OpenCCG to AI Planning: Detecting Infeasible Edges in Sentence Generation

no code implementations • COLING 2016 • Maximilian Schwenger, Álvaro Torralba, Joerg Hoffmann, David M. Howcroft, Vera Demberg

The search space in grammar-based natural language generation tasks can get very large, which is particularly problematic when generating long utterances or paragraphs.

Sentence Text Generation

Paper
Add Code

Event participant modelling with neural networks

no code implementations • EMNLP 2016 • Ottokar Tilk, Vera Demberg, Asad Sayeed, Dietrich Klakow, Stefan Thater

Language Modelling Machine Translation

Paper
Add Code

How can we adapt generation to the user's cognitive load?

no code implementations • WS 2016 • Vera Demberg

Text Generation

Paper
Add Code

Thematic fit evaluation: an aspect of selectional preferences

no code implementations • WS 2016 • Asad Sayeed, Clayton Greenberg, Vera Demberg

Decision Making

Paper
Add Code

Roleo: Visualising Thematic Fit Spaces on the Web

no code implementations • ACL 2016 • Asad Sayeed, Xudong Hong, Vera Demberg

Word Embeddings

Paper
Add Code

Neural Network Models for Implicit Discourse Relation Classification in English and Chinese without Surface Features

no code implementations • 7 Jun 2016 • Attapol T. Rutherford, Vera Demberg, Nianwen Xue

Inferring implicit discourse relations in natural language text is the most difficult subtask in discourse parsing.

Discourse Parsing General Classification +2

Paper
Add Code

Improving event prediction by representing script participants

no code implementations • NAACL 2016 • Simon Ahrendt, Vera Demberg

Paper
Add Code

LingoTurk: managing crowdsourced tasks for psycholinguistics

no code implementations • NAACL 2016 • Florian Pusse, Asad Sayeed, Vera Demberg

Machine Translation

Paper
Add Code

Annotating Discourse Relations in Spoken Language: A Comparison of the PDTB and CCR Frameworks

no code implementations • LREC 2016 • Ines Rehbein, Merel Scholman, Vera Demberg

In discourse relation annotation, there is currently a variety of different frameworks being used, and most of them have been developed and employed mostly on written data.

Relation