no code implementations • EMNLP (sdp) 2020 • Dennis Aumiller, Satya Almasian, Philip Hausner, Michael Gertz
This work presents the entry by the team from Heidelberg University in the CL-SciSumm 2020 shared task at the Scholarly Document Processing workshop at EMNLP 2020.
1 code implementation • 19 Dec 2024 • Ashish Chouhan, Saifeldin Mandour, Michael Gertz
Exploratory search of large text corpora is essential in domains like biomedical research, where large amounts of research literature are continuously generated.
1 code implementation • 14 Jul 2024 • Satya Almasian, Milena Bruseva, Michael Gertz
Quantitative information plays a crucial role in understanding and interpreting the content of documents.
no code implementations • 5 Apr 2024 • Patricia Lamirande, Eamonn A. Gaffney, Michael Gertz, Philip K. Maini, Jessica R. Crawshaw, Antonello Caruso
In human eyes, we investigated the impact of variability in vitreous cavity size and eccentricity, and in injection location, on drug elimination.
1 code implementation • 24 Mar 2024 • Ashish Chouhan, Michael Gertz
With the increase in legislative documents at the EU, the number of new terms and their definitions is increasing as well.
1 code implementation • 22 May 2023 • Jing Fan, Dennis Aumiller, Michael Gertz
We introduce SRLScore, a reference-free evaluation metric designed with text summarization in mind.
2 code implementations • 15 May 2023 • Satya Almasian, Vivian Kazakova, Philip Göldner, Michael Gertz
Compared to other information extraction approaches, interestingly only a few works exist that describe methods for a proper extraction and representation of quantities in text.
1 code implementation • 17 Jan 2023 • Dennis Aumiller, Jing Fan, Michael Gertz
We attribute poor evaluation quality to a variety of different factors, which are investigated in more detail in this work: A lack of qualitative (and diverse) gold data considered for training, understudied (and untreated) positional biases in some of the existing datasets, and the lack of easily accessible and streamlined pre-processing strategies or analysis tools.
1 code implementation • 4 Jan 2023 • Dennis Aumiller, Michael Gertz
Previous state-of-the-art models for lexical simplification consist of complex pipelines with several components, each of which requires deep technical knowledge and fine-tuned interaction to achieve its full potential.
1 code implementation • 24 Oct 2022 • Dennis Aumiller, Ashish Chouhan, Michael Gertz
Existing summarization datasets come with two main drawbacks: (1) They tend to focus on overly exposed domains, such as news articles or wiki-like texts, and (2) are primarily monolingual, with few multilingual datasets.
2 code implementations • LREC 2022 • Dennis Aumiller, Michael Gertz
Traditionally, Text Simplification is treated as a monolingual translation task where sentences between source texts and their simplified counterparts are aligned for training.
Ranked #1 on Text Summarization on Klexikon
1 code implementation • 30 Sep 2021 • Satya Almasian, Dennis Aumiller, Michael Gertz
By supplementing training resources with weakly labeled data from rule-based systems, our model surpasses previous works in temporal tagging and type classification, especially on rare classes.
Ranked #1 on Temporal Tagging on TempEval-3
1 code implementation • 7 Dec 2020 • Dennis Aumiller, Satya Almasian, Sebastian Lackner, Michael Gertz
The growing complexity of legal cases has led to an increasing interest in legal information retrieval systems that can effectively satisfy user-specific information needs.
1 code implementation • 17 Jul 2019 • Cong Tran, Won-Yong Shin, Andreas Spitz, Michael Gertz
In this paper, we present DeepNC, a novel method for inferring the missing parts of a network based on a deep generative model of graphs.
no code implementations • 29 May 2019 • Andreas Spitz, Satya Almasian, Michael Gertz
The recent introduction of entity-centric implicit network representations of unstructured text offers novel ways for exploring entity relations in document collections and streams efficiently and interactively.
no code implementations • 22 May 2019 • Gloria Feher, Andreas Spitz, Michael Gertz
Word embeddings have gained significant attention as learnable representations of semantic relations between words, and have been shown to improve upon the results of traditional word representations.
1 code implementation • 6 Feb 2019 • Satya Almasian, Andreas Spitz, Michael Gertz
We discuss two distinct approaches to the generation of such embeddings, namely the training of state-of-the-art embeddings on raw-text and annotated versions of the corpus, as well as node embeddings of a co-occurrence graph representation of the annotated corpus.
3 code implementations • Lernen, Wissen, Daten, Analysen 2018 • Erich Schubert, Michael Gertz
Density-based clustering is closely associated with the two algorithms DBSCAN and OPTICS.
no code implementations • EMNLP 2017 • Ludwig Richter, Johanna Geiß, Andreas Spitz, Michael Gertz
Geographic information extraction from textual data sources, called geoparsing, is a key task in text processing and central to subsequent spatial analysis approaches.
1 code implementation • 11 Aug 2017 • Erich Schubert, Andreas Spitz, Michael Weiler, Johanna Geiß, Michael Gertz
We then select keywords based on their significance and construct the word cloud based on the derived affinity.
no code implementations • LREC 2014 • Jannik Strötgen, Thomas Bögel, Julian Zell, Ayser Armiti, Tran Van Canh, Michael Gertz
Thus, references to historic dates are often not well handled by temporal taggers although they frequently occur in narrative-style documents about history, e.g., in many Wikipedia articles.
no code implementations • LREC 2014 • Thomas Bögel, Jannik Strötgen, Michael Gertz
Computational Narratology is an emerging field within the Digital Humanities.
no code implementations • LREC 2012 • Jannik Strötgen, Michael Gertz
Only recently, two temporal annotated corpora of narrative-style documents were developed, and it was shown that a domain shift results in significant challenges for temporal tagging.