In this paper, we explore possible improvements to transformer models in a low-resource setting. For this, we investigate different sources of external knowledge and evaluate the performance of our models on in-domain data as well as on transfer datasets specifically designed to assess fine-grained reasoning capabilities.
Concretely, we propose a correction module that is trained to estimate the model's correctness, as well as an iterative prediction update based on the gradient of the prediction.
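The update step can be illustrated with a minimal PyTorch sketch, assuming the correction module is a small feed-forward scorer; all names and hyperparameters here are illustrative, not the paper's exact setup:

```python
import torch

# Minimal sketch of an iterative, gradient-based prediction update.
# `correction_module` (illustrative) scores how wrong the current
# prediction looks; we follow its gradient to reduce that estimate.
def refine_prediction(logits, correction_module, steps=3, lr=0.1):
    pred = logits.clone().detach().requires_grad_(True)
    for _ in range(steps):
        error = correction_module(torch.softmax(pred, dim=-1))
        grad, = torch.autograd.grad(error.sum(), pred)
        pred = (pred - lr * grad).detach().requires_grad_(True)
    return torch.softmax(pred, dim=-1)

correction_module = torch.nn.Sequential(
    torch.nn.Linear(5, 16), torch.nn.ReLU(), torch.nn.Linear(16, 1)
)
refined = refine_prediction(torch.randn(2, 5), correction_module)
```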
To this end, we study the effects of model transfer on sequence labeling across various domains and tasks, and show that our methods based on model similarity and support vector machines can predict promising sources, resulting in performance increases of up to 24 F1 points.
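As a rough illustration of similarity-based source selection (the concrete representations and similarity measures may differ from those used in the paper):

```python
import numpy as np

# Toy sketch: rank candidate source domains by the cosine similarity
# between a representation of the target data and of each source.
# All vectors are random stand-ins for real model representations.
def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
target_repr = rng.normal(size=64)
source_reprs = {"news": rng.normal(size=64),
                "clinical": rng.normal(size=64),
                "legal": rng.normal(size=64)}
ranking = sorted(source_reprs,
                 key=lambda s: cosine(source_reprs[s], target_repr),
                 reverse=True)
print(ranking)  # most promising source first
```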
Second, the different embedding types can form clusters in the common embedding space, preventing the computation of a meaningful average of different embeddings and thus reducing performance.
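The following toy example illustrates the problem, along with one simple remedy, mean-centering each embedding type before averaging; the remedy is an assumption for illustration, not necessarily the paper's solution:

```python
import numpy as np

# Two embedding types sit in distant regions of the shared space,
# so their naive average falls between the clusters.
rng = np.random.default_rng(0)
type_a = rng.normal(loc=5.0, scale=0.1, size=(100, 8))   # one embedding type
type_b = rng.normal(loc=-5.0, scale=0.1, size=(100, 8))  # another type

naive_avg = (type_a + type_b) / 2  # lies between both clusters
# Centering each type first cancels the type-specific offsets.
centered_avg = ((type_a - type_a.mean(0)) + (type_b - type_b.mean(0))) / 2
```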
Deep neural networks and huge language models are becoming omnipresent in natural language applications.
The recognition and normalization of clinical information, such as tumor morphology mentions, is an important but complex process consisting of multiple subtasks.
Simple yet effective data augmentation techniques have been proposed for sentence-level and sentence-pair natural language processing tasks.
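Two such augmentations, random swap and random deletion, can be sketched as follows (purely illustrative):

```python
import random

# Random swap: exchange two token positions a given number of times.
def random_swap(tokens, n=1):
    tokens = tokens[:]
    for _ in range(n):
        i, j = random.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens

# Random deletion: drop each token with probability p.
def random_delete(tokens, p=0.1):
    kept = [t for t in tokens if random.random() > p]
    return kept if kept else tokens  # never delete everything

print(random_swap("simple augmentation can help low resource tasks".split()))
```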
The user study shows that our models increase the users' ability to judge the correctness of the system, and that scores like F1 alone are not enough to estimate a model's usefulness in a practical setting with human users.
Named entity recognition has been extensively studied on English news texts.
Natural language processing has huge potential in the medical domain, which has recently led to a large body of research in this field.
With this paper, we publish our annotation guidelines, as well as our SOFC-Exp corpus consisting of 45 open-access scholarly articles annotated by domain experts.
Recent work showed that embeddings from related languages can improve the performance of sequence tagging, even for monolingual models.
Although temporal tagging is still dominated by rule-based systems, there have been recent attempts to build neural temporal taggers.
Exploiting natural language processing in the clinical domain requires de-identification, i.e., anonymization of personal information in texts.
In particular, we explore different ways of integrating the named entity types of the relation arguments into a neural network for relation classification, including a joint training approach and a structured prediction approach.
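A minimal sketch of one such integration, concatenating entity type embeddings with a sentence representation, is shown below (illustrative, not the paper's exact architecture):

```python
import torch
import torch.nn as nn

# Embeddings of the two arguments' entity types are concatenated with
# a sentence representation before the final relation classifier.
class TypedRelationClassifier(nn.Module):
    def __init__(self, sent_dim=128, n_types=10, type_dim=16, n_relations=5):
        super().__init__()
        self.type_emb = nn.Embedding(n_types, type_dim)
        self.out = nn.Linear(sent_dim + 2 * type_dim, n_relations)

    def forward(self, sent_repr, head_type, tail_type):
        feats = torch.cat(
            [sent_repr, self.type_emb(head_type), self.type_emb(tail_type)],
            dim=-1)
        return self.out(feats)

model = TypedRelationClassifier()
logits = model(torch.randn(4, 128),
               torch.tensor([1, 2, 0, 3]), torch.tensor([4, 1, 2, 0]))
```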
We therefore propose a novel model for satire detection with an adversarial component to control for the confounding variable of publication source.
As a result, comparability of models across tasks is missing and their applicability to new tasks is limited.
We study cross-lingual sequence tagging with little or no labeled data in the target language.
Character-level models of tokens have been shown to be effective at dealing with within-token noise and out-of-vocabulary words.
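A minimal character-level token encoder in the char-CNN style might look as follows (hyperparameters are arbitrary):

```python
import torch
import torch.nn as nn

# Character-level token encoder: because it never looks up a word
# vocabulary, it degrades gracefully on noisy or unseen tokens.
class CharTokenEncoder(nn.Module):
    def __init__(self, n_chars=256, char_dim=32, out_dim=64):
        super().__init__()
        self.emb = nn.Embedding(n_chars, char_dim)
        self.conv = nn.Conv1d(char_dim, out_dim, kernel_size=3, padding=1)

    def forward(self, char_ids):                # (batch, token_len)
        x = self.emb(char_ids).transpose(1, 2)  # (batch, char_dim, len)
        return self.conv(x).max(dim=2).values   # max-pool over characters

token_vecs = CharTokenEncoder()(torch.randint(0, 256, (4, 12)))
```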
In this paper, we demonstrate the importance of coreference resolution for natural language processing, using the TAC Slot Filling shared task as an example.
The experimental results reveal that Brown word clusters, part-of-speech tags and open-class words are the most effective at reducing the perplexity of factored language models on the Mandarin-English Code-Switching corpus SEAME.
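As a toy illustration of how a factor such as a part-of-speech tag can condition next-word prediction, and how perplexity is computed, consider the following sketch; real factored language models use backoff graphs over several factors:

```python
from collections import Counter, defaultdict
import math

# Toy factored bigram model: the next word is predicted from the
# previous word AND its part-of-speech tag.
corpus = [("i", "PRP"), ("eat", "VB"), ("rice", "NN"),
          ("i", "PRP"), ("eat", "VB"), ("noodles", "NN")]
counts = defaultdict(Counter)
for (w1, p1), (w2, _) in zip(corpus, corpus[1:]):
    counts[(w1, p1)][w2] += 1

def prob(prev, word, vocab_size=10):            # add-one smoothing
    c = counts[prev]
    return (c[word] + 1) / (sum(c.values()) + vocab_size)

log_probs = [math.log(prob((w1, p1), w2))
             for (w1, p1), (w2, _) in zip(corpus, corpus[1:])]
perplexity = math.exp(-sum(log_probs) / len(log_probs))
```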
This paper addresses the problem of corpus-level entity typing, i.e., inferring from a large corpus that an entity is a member of a class such as "food" or "artist".
We introduce globally normalized convolutional neural networks for joint entity classification and relation extraction.
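The contrast between local and global normalization can be sketched for a single joint decision as follows (random scores, purely illustrative):

```python
import torch

# Joint decision over two entity classes and one relation.
e1 = torch.randn(3)   # scores for entity classes of argument 1
e2 = torch.randn(3)   # scores for entity classes of argument 2
rel = torch.randn(4)  # scores for relation labels

# Local normalization: each decision normalized independently.
local = torch.softmax(e1, 0), torch.softmax(e2, 0), torch.softmax(rel, 0)

# Global normalization: one softmax over all (e1, e2, rel) triples,
# so entity and relation decisions can trade off against each other.
joint = e1[:, None, None] + e2[None, :, None] + rel[None, None, :]
joint_prob = torch.softmax(joint.flatten(), 0).view(3, 3, 4)
```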
For the second noise type, we propose ways to improve the integration of noisy entity type predictions into relation extraction.
We introduce the first generic text representation model that is completely nonsymbolic, i.e., it does not require the availability of a segmentation or tokenization method that attempts to identify words or other symbolic units in text.
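One simple nonsymbolic baseline in this spirit hashes raw character n-grams into a fixed-size vector, with no tokenizer involved (a sketch, not the paper's exact model):

```python
import numpy as np

# Hash every character n-gram of the raw string into a fixed-size
# vector; no word segmentation or tokenization is performed.
def ngram_hash_vector(text, n=4, dim=1024):
    vec = np.zeros(dim)
    for i in range(len(text) - n + 1):
        vec[hash(text[i:i + n]) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

v = ngram_hash_vector("the model reads raw characters")
```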
This paper investigates two different neural architectures for the task of relation classification: convolutional neural networks and recurrent neural networks.
We address relation classification in the context of slot filling, the task of finding and evaluating fillers like "Steve Jobs" for the slot X in "X founded Apple".