Search Results for author: Barbara Plank

Found 94 papers, 29 papers with code

Biomedical Event Extraction as Sequence Labeling

no code implementations EMNLP 2020 Alan Ramponi, Rob van der Goot, Rosario Lombardo, Barbara Plank

We introduce Biomedical Event Extraction as Sequence Labeling (BeeSL), a joint end-to-end neural information extraction model.

Benchmark Event Extraction +1

NLP North at WNUT-2020 Task 2: Pre-training versus Ensembling for Detection of Informative COVID-19 English Tweets

no code implementations EMNLP (WNUT) 2020 Anders Giovanni Møller, Rob van der Goot, Barbara Plank

With the COVID-19 pandemic raging world-wide since the beginning of the 2020 decade, the need for monitoring systems to track relevant information on social media is vitally important.

Lexical Resources for Low-Resource PoS Tagging in Neural Times

no code implementations WS (NoDaLiDa) 2019 Barbara Plank, Sigrid Klerke

More and more evidence is appearing that integrating symbolic lexical knowledge into neural models aids learning.

Cross-Lingual POS Tagging POS

The Lacunae of Danish Natural Language Processing

no code implementations WS (NoDaLiDa) 2019 Andreas Kirkedal, Barbara Plank, Leon Derczynski, Natalie Schluter

Danish is a North Germanic language spoken principally in Denmark, a country with a long tradition of technological and scientific innovation.

Natural Language Processing

Finding the needle in a haystack: Extraction of Informative COVID-19 Danish Tweets

no code implementations WNUT (ACL) 2021 Benjamin Olsen, Barbara Plank

In this work, we introduce a new dataset of 5,000 tweets for finding informative COVID-19 tweets for Danish.

Resources and Evaluations for Danish Entity Resolution

no code implementations CRAC (ACL) 2021 Maria Barrett, Hieu Lam, Martin Wu, Ophélie Lacroix, Barbara Plank, Anders Søgaard

Automatic coreference resolution is understudied in Danish even though most of the Danish Dependency Treebank (Buch-Kromann, 2003) is annotated with coreference relations.

Coreference Resolution Entity Disambiguation +2

Sort by Structure: Language Model Ranking as Dependency Probing

no code implementations 10 Jun 2022 Max Müller-Eberstein, Rob van der Goot, Barbara Plank

Making an informed choice of pre-trained language model (LM) is critical for performance, yet environmentally costly, and as such widely underexplored.

Computer Vision Language Modelling +2

Experimental Standards for Deep Learning Research: A Natural Language Processing Perspective

1 code implementation 13 Apr 2022 Dennis Ulmer, Elisa Bassignana, Max Müller-Eberstein, Daniel Varab, Mike Zhang, Christian Hardmeier, Barbara Plank

The field of Deep Learning (DL) has undergone explosive growth during the last decade, with a substantial impact on Natural Language Processing (NLP) as well.

Natural Language Processing

Probing for Labeled Dependency Trees

1 code implementation ACL 2022 Max Müller-Eberstein, Rob van der Goot, Barbara Plank

Probing has become an important tool for analyzing representations in Natural Language Processing (NLP).

Dependency Parsing Informativeness +1

Genre as Weak Supervision for Cross-lingual Dependency Parsing

1 code implementation EMNLP 2021 Max Müller-Eberstein, Rob van der Goot, Barbara Plank

Recent work has shown that monolingual masked language models learn to represent data-driven notions of language variation which can be used for domain-targeted training data selection.

Dependency Parsing

Cartography Active Learning

2 code implementations Findings (EMNLP) 2021 Mike Zhang, Barbara Plank

We propose Cartography Active Learning (CAL), a novel Active Learning (AL) algorithm that exploits the behavior of the model on individual instances during training as a proxy to find the most informative instances for labeling.

Active Learning Text Classification

SemEval-2021 Task 12: Learning with Disagreements

no code implementations SEMEVAL 2021 Alexandra Uma, Tommaso Fornaciari, Anca Dumitrache, Tristan Miller, Jon Chamberlain, Barbara Plank, Edwin Simpson, Massimo Poesio

Disagreement between coders is ubiquitous in virtually all datasets annotated with human judgements in both natural language processing and computer vision.

Computer Vision Natural Language Processing

Proceedings of the First Workshop on Weakly Supervised Learning (WeaSuL)

no code implementations 8 Jul 2021 Michael A. Hedderich, Benjamin Roth, Katharina Kann, Barbara Plank, Alex Ratner, Dietrich Klakow

Welcome to WeaSuL 2021, the First Workshop on Weakly Supervised Learning, co-located with ICLR 2021.

On the Effectiveness of Dataset Embeddings in Mono-lingual, Multi-lingual and Zero-shot Conditions

no code implementations EACL (AdaptNLP) 2021 Rob van der Goot, Ahmet Üstün, Barbara Plank

However, it remains unclear in which situations these dataset embeddings are most effective, because they are used in a large variety of settings, languages and tasks.

Dependency Parsing Lemmatization +1

Longitudinal Citation Prediction using Temporal Graph Neural Networks

no code implementations 10 Dec 2020 Andreas Nugaard Holm, Barbara Plank, Dustin Wright, Isabelle Augenstein

Citation count prediction is the task of predicting the number of citations a paper has gained after a period of time.

Citation Prediction

Team DiSaster at SemEval-2020 Task 11: Combining BERT and Hand-crafted Features for Identifying Propaganda Techniques in News

no code implementations SEMEVAL 2020 Anders Kaas, Viktor Torp Thomsen, Barbara Plank

We present an ablation study which shows that even though BERT representations are very powerful also for this task, BERT still benefits from being combined with carefully designed task-specific features.

Buhscitu at SemEval-2020 Task 7: Assessing Humour in Edited News Headlines Using Hand-Crafted Features and Online Knowledge Bases

no code implementations SEMEVAL 2020 Kristian Nørgaard Jensen, Nicolaj Filrup Rasmussen, Thai Wang, Marco Placenti, Barbara Plank

This paper describes a system that aims at assessing humour intensity in edited news headlines as part of the 7th task of SemEval-2020 on "Humor, Emphasis and Sentiment".

Language Modelling

Neural Unsupervised Domain Adaptation in NLP---A Survey

1 code implementation COLING 2020 Alan Ramponi, Barbara Plank

We also revisit the notion of domain, and we uncover a bias in the type of Natural Language Processing tasks which received most attention.

Natural Language Processing Out-of-Distribution Generalization +1

FT Speech: Danish Parliament Speech Corpus

no code implementations 25 May 2020 Andreas Kirkedal, Marija Stepanović, Barbara Plank

A combination of FT Speech with in-domain language data provides comparable results to models trained specifically on Språkbanken, showing that FT Speech transfers well to this data set.

Automatic Speech Recognition speech-recognition

Cross-Domain Evaluation of Edge Detection for Biomedical Event Extraction

no code implementations LREC 2020 Alan Ramponi, Barbara Plank, Rosario Lombardo

Biomedical event extraction is a crucial task in order to automatically extract information from the increasingly growing body of biomedical literature.

Domain Adaptation Edge Detection +1

At a Glance: The Impact of Gaze Aggregation Views on Syntactic Tagging

no code implementations WS 2019 Sigrid Klerke, Barbara Plank

Hence, caution is warranted when using gaze data as signal for NLP, as no single view is robust over tasks, modeling choice and gaze corpus.

Chunking Natural Language Processing +2

The Best of Both Worlds: Lexical Resources To Improve Low-Resource Part-of-Speech Tagging

no code implementations 21 Nov 2018 Barbara Plank, Sigrid Klerke, Željko Agić

In natural language processing, the deep learning revolution has shifted the focus from conventional hand-crafted symbolic representations to dense inputs, which are adequate representations learned automatically from corpora.

Cross-Lingual POS Tagging Natural Language Processing +2

Distant Supervision from Disparate Sources for Low-Resource Part-of-Speech Tagging

1 code implementation EMNLP 2018 Barbara Plank, Željko Agić

We introduce DsDs: a cross-lingual neural part-of-speech tagger that learns from disparate sources of distant supervision, and realistically scales to hundreds of low-resource languages.

Part-Of-Speech Tagging TAG

When Simple n-gram Models Outperform Syntactic Approaches: Discriminating between Dutch and Flemish

no code implementations COLING 2018 Martin Kroon, Masha Medvedeva, Barbara Plank

In this paper we present the results of our participation in the Discriminating between Dutch and Flemish in Subtitles VarDial 2018 shared task.

Predicting Authorship and Author Traits from Keystroke Dynamics

no code implementations WS 2018 Barbara Plank

Written text transmits a good deal of nonverbal information related to the author's identity and social factors, such as age, gender and personality.

Machine Translation

Bleaching Text: Abstract Features for Cross-lingual Gender Prediction

1 code implementation ACL 2018 Rob van der Goot, Nikola Ljubešić, Ian Matroos, Malvina Nissim, Barbara Plank

Gender prediction has typically focused on lexical and social network features, yielding good performance, but making systems highly language-, topic-, and platform-dependent.

Gender Prediction

Strong Baselines for Neural Semi-supervised Learning under Domain Shift

2 code implementations ACL 2018 Sebastian Ruder, Barbara Plank

In this paper, we re-evaluate classic general-purpose bootstrapping approaches in the context of neural networks under domain shifts vs. recent neural approaches and propose a novel multi-task tri-training method that reduces the time and space complexity of classic tri-training.

Domain Adaptation Multi-Task Learning +2

ALL-IN-1: Short Text Classification with One Model for All Languages

1 code implementation 26 Oct 2017 Barbara Plank

We present ALL-IN-1, a simple model for multilingual text classification that does not require any parallel data.

General Classification Multilingual text classification +2

The Power of Character N-grams in Native Language Identification

no code implementations WS 2017 Artur Kulmizev, Bo Blankers, Johannes Bjerva, Malvina Nissim, Gertjan van Noord, Barbara Plank, Martijn Wieling

In this paper, we explore the performance of a linear SVM trained on language independent character features for the NLI Shared Task 2017.

Native Language Identification Text Classification

Learning to select data for transfer learning with Bayesian Optimization

1 code implementation EMNLP 2017 Sebastian Ruder, Barbara Plank

Domain similarity measures can be used to gauge adaptability and select suitable data for transfer learning, but existing approaches define ad hoc measures that are deemed suitable for respective tasks.

Part-Of-Speech Tagging Sentiment Analysis +1

Cross-lingual tagger evaluation without test data

no code implementations EACL 2017 Željko Agić, Barbara Plank, Anders Søgaard

We address the challenge of cross-lingual POS tagger evaluation in absence of manually annotated test data.

POS

When silver glitters more than gold: Bootstrapping an Italian part-of-speech tagger for Twitter

no code implementations 9 Nov 2016 Barbara Plank, Malvina Nissim

We bootstrap a state-of-the-art part-of-speech tagger to tag Italian Twitter data, in the context of the Evalita 2016 PoSTWITA shared task.

TAG

Keystroke dynamics as signal for shallow syntactic parsing

1 code implementation COLING 2016 Barbara Plank

Keystroke dynamics have been extensively used in psycholinguistic and writing research to gain insights into cognitive processing.

CCG Supertagging Chunking +1

Semantic Tagging with Deep Residual Networks

1 code implementation COLING 2016 Johannes Bjerva, Barbara Plank, Johan Bos

We propose a novel semantic tagging task, sem-tagging, tailored for the purpose of multilingual semantic parsing, and present the first tagger using deep residual networks (ResNets).

Part-Of-Speech Tagging POS +1

What to do about non-standard (or non-canonical) language in NLP

no code implementations 28 Aug 2016 Barbara Plank

The solution is not obvious: we cannot control for all factors, and it is not clear how to best go beyond the current practice of training on homogeneous data from a single domain and language.

Natural Language Processing

Multilingual Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Models and Auxiliary Loss

3 code implementations ACL 2016 Barbara Plank, Anders Søgaard, Yoav Goldberg

Bidirectional long short-term memory (bi-LSTM) networks have recently proven successful for various NLP sequence modeling tasks, but little is known about their reliance to input representations, target languages, data set size, and label noise.

Part-Of-Speech Tagging POS

Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures

no code implementations 15 Jan 2016 Raffaella Bernardi, Ruket Cakici, Desmond Elliott, Aykut Erdem, Erkut Erdem, Nazli Ikizler-Cinbis, Frank Keller, Adrian Muscat, Barbara Plank

Automatic description generation from natural images is a challenging problem that has recently received a large amount of interest from the computer vision and natural language processing communities.

Benchmark Computer Vision +1

SenTube: A Corpus for Sentiment Analysis on YouTube Social Media

no code implementations LREC 2014 Olga Uryupina, Barbara Plank, Aliaksei Severyn, Agata Rotondi, Alessandro Moschitti

In this paper we present SenTube -- a dataset of user-generated comments on YouTube videos annotated for information content and sentiment polarity.

Document Classification Informativeness +3

When POS data sets don't add up: Combatting sample bias

no code implementations LREC 2014 Dirk Hovy, Barbara Plank, Anders Søgaard

We present a systematic study of several Twitter POS data sets, the problems of label and data bias, discuss their effects on model performance, and show how to overcome them to learn models that perform well on various test sets, achieving relative error reduction of up to 21%.

Natural Language Processing POS +1
