Search Results for author: Michael Gertz

Found 25 papers, 14 papers with code

UniHD@CL-SciSumm 2020: Citation Extraction as Search

no code implementations EMNLP (sdp) 2020 Dennis Aumiller, Satya Almasian, Philip Hausner, Michael Gertz

This work presents the entry by the team from Heidelberg University in the CL-SciSumm 2020 shared task at the Scholarly Document Processing workshop at EMNLP 2020.

Re-Ranking

LexDrafter: Terminology Drafting for Legislative Documents using Retrieval Augmented Generation

1 code implementation 24 Mar 2024 Ashish Chouhan, Michael Gertz

With the increase in legislative documents at the EU, the number of new terms and their definitions is increasing as well.

Retrieval

Evaluating Factual Consistency of Texts with Semantic Role Labeling

1 code implementation 22 May 2023 Jing Fan, Dennis Aumiller, Michael Gertz

We introduce SRLScore, a reference-free evaluation metric designed with text summarization in mind.

Semantic Role Labeling Text Generation +1

CQE: A Comprehensive Quantity Extractor

2 code implementations 15 May 2023 Satya Almasian, Vivian Kazakova, Philip Göldner, Michael Gertz

Compared to other information extraction approaches, interestingly only a few works exist that describe methods for a proper extraction and representation of quantities in text.

Dependency Parsing

On the State of German (Abstractive) Text Summarization

1 code implementation 17 Jan 2023 Dennis Aumiller, Jing Fan, Michael Gertz

We attribute poor evaluation quality to a variety of different factors, which are investigated in more detail in this work: A lack of qualitative (and diverse) gold data considered for training, understudied (and untreated) positional biases in some of the existing datasets, and the lack of easily accessible and streamlined pre-processing strategies or analysis tools.

Abstractive Text Summarization Attribute +1

UniHD at TSAR-2022 Shared Task: Is Compute All We Need for Lexical Simplification?

1 code implementation 4 Jan 2023 Dennis Aumiller, Michael Gertz

Previous state-of-the-art models for lexical simplification consist of complex pipelines with several components, each of which requires deep technical knowledge and fine-tuned interaction to achieve its full potential.

Lexical Simplification

EUR-Lex-Sum: A Multi- and Cross-lingual Dataset for Long-form Summarization in the Legal Domain

1 code implementation 24 Oct 2022 Dennis Aumiller, Ashish Chouhan, Michael Gertz

Existing summarization datasets come with two main drawbacks: (1) They tend to focus on overly exposed domains, such as news articles or wiki-like texts, and (2) are primarily monolingual, with few multilingual datasets.

Klexikon: A German Dataset for Joint Summarization and Simplification

2 code implementations LREC 2022 Dennis Aumiller, Michael Gertz

Traditionally, Text Simplification is treated as a monolingual translation task where sentences between source texts and their simplified counterparts are aligned for training.

Text Simplification Text Summarization

BERT got a Date: Introducing Transformers to Temporal Tagging

1 code implementation 30 Sep 2021 Satya Almasian, Dennis Aumiller, Michael Gertz

By supplementing training resources with weakly labeled data from rule-based systems, our model surpasses previous works in temporal tagging and type classification, especially on rare classes.

Classification Language Modelling +4

Structural Text Segmentation of Legal Documents

1 code implementation 7 Dec 2020 Dennis Aumiller, Satya Almasian, Sebastian Lackner, Michael Gertz

The growing complexity of legal cases has led to an increasing interest in legal information retrieval systems that can effectively satisfy user-specific information needs.

Change Detection Passage Retrieval +4

DeepNC: Deep Generative Network Completion

1 code implementation 17 Jul 2019 Cong Tran, Won-Yong Shin, Andreas Spitz, Michael Gertz

In this paper, we present DeepNC, a novel method for inferring the missing parts of a network based on a deep generative model of graphs.

Link Prediction

TopExNet: Entity-Centric Network Topic Exploration in News Streams

no code implementations 29 May 2019 Andreas Spitz, Satya Almasian, Michael Gertz

The recent introduction of entity-centric implicit network representations of unstructured text offers novel ways for exploring entity relations in document collections and streams efficiently and interactively.

Retrieving Multi-Entity Associations: An Evaluation of Combination Modes for Word Embeddings

no code implementations 22 May 2019 Gloria Feher, Andreas Spitz, Michael Gertz

Word embeddings have gained significant attention as learnable representations of semantic relations between words, and have been shown to improve upon the results of traditional word representations.

Retrieval Word Embeddings

Word Embeddings for Entity-annotated Texts

1 code implementation 6 Feb 2019 Satya Almasian, Andreas Spitz, Michael Gertz

We discuss two distinct approaches to the generation of such embeddings, namely the training of state-of-the-art embeddings on raw-text and annotated versions of the corpus, as well as node embeddings of a co-occurrence graph representation of the annotated corpus.

Clustering Entity Embeddings +4

Improving the Cluster Structure Extracted from OPTICS Plots

3 code implementations Lernen, Wissen, Daten, Analysen 2018 Erich Schubert, Michael Gertz

Density-based clustering is closely associated with the two algorithms DBSCAN and OPTICS.

HeidelPlace: An Extensible Framework for Geoparsing

no code implementations EMNLP 2017 Ludwig Richter, Johanna Geiß, Andreas Spitz, Michael Gertz

Geographic information extraction from textual data sources, called geoparsing, is a key task in text processing and central to subsequent spatial analysis approaches.

Semantic Word Clouds with Background Corpus Normalization and t-distributed Stochastic Neighbor Embedding

1 code implementation 11 Aug 2017 Erich Schubert, Andreas Spitz, Michael Weiler, Johanna Geiß, Michael Gertz

We then select keywords based on their significance and construct the word cloud based on the derived affinity.

Extending HeidelTime for Temporal Expressions Referring to Historic Dates

no code implementations LREC 2014 Jannik Strötgen, Thomas Bögel, Julian Zell, Ayser Armiti, Tran Van Canh, Michael Gertz

Thus, references to historic dates are often not well handled by temporal taggers although they frequently occur in narrative-style documents about history, e.g., in many Wikipedia articles.

Document Summarization Machine Translation +1

Temporal Tagging on Different Domains: Challenges, Strategies, and Gold Standards

no code implementations LREC 2012 Jannik Strötgen, Michael Gertz

Only recently, two temporal annotated corpora of narrative-style documents were developed, and it was shown that a domain shift results in significant challenges for temporal tagging.

Temporal Tagging
