Search Results for author: Malte Ostendorff

Found 23 papers, 12 papers with code

Claim Extraction and Law Matching for COVID-19-related Legislation

1 code implementation · LREC 2022 · Niklas Dehio, Malte Ostendorff, Georg Rehm

We investigate an automated approach to extract legal claims from news articles and to match the claims with their corresponding applicable laws.

Legal Reasoning

Semantic Relations between Text Segments for Semantic Storytelling: Annotation Tool - Dataset - Evaluation

no code implementations · LREC 2022 · Michael Raring, Malte Ostendorff, Georg Rehm

Essential to this is the automated processing of text segments extracted from different content resources, identifying both the relevance of a text segment to a topic and its semantic relation to other text segments.


Symmetric Dot-Product Attention for Efficient Training of BERT Language Models

no code implementations · 10 Jun 2024 · Martin Courtois, Malte Ostendorff, Leonhard Hennig, Georg Rehm

In this work, we propose an alternative compatibility function for the self-attention mechanism introduced by the Transformer architecture.

Machine Translation
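The snippet above mentions an alternative compatibility function for self-attention, but does not give its exact formulation. The following is a minimal NumPy sketch of one symmetric variant, in which queries and keys share a single projection matrix so that the score matrix is symmetric by construction; the shapes and the shared-projection choice are illustrative assumptions, not the paper's definitive implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def symmetric_scores(x, w):
    # One shared projection replaces separate query/key matrices, so the
    # compatibility matrix satisfies score(i, j) == score(j, i).
    q = x @ w
    return q @ q.T / np.sqrt(q.shape[-1])

def symmetric_attention(x, w, w_v):
    # Standard attention readout, but over the symmetric score matrix.
    return softmax(symmetric_scores(x, w)) @ (x @ w_v)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))     # 4 tokens, hidden size 8 (toy shapes)
w = rng.normal(size=(8, 8))     # shared query/key projection
w_v = rng.normal(size=(8, 8))   # value projection
out = symmetric_attention(x, w, w_v)
```

Sharing the projection halves the parameter count of the score computation, which is one plausible source of the training-efficiency gains the title refers to.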

Investigating Gender Bias in Turkish Language Models

1 code implementation · 17 Apr 2024 · Orhun Caglidil, Malte Ostendorff, Georg Rehm

However, prior research has primarily focused on the English language, especially in the context of gender bias.


Tokenizer Choice For LLM Training: Negligible or Crucial?

no code implementations · 12 Oct 2023 · Mehdi Ali, Michael Fromm, Klaudia Thellmann, Richard Rutmann, Max Lübbering, Johannes Leveling, Katrin Klug, Jan Ebert, Niclas Doll, Jasper Schulze Buschhoff, Charvi Jain, Alexander Arno Weber, Lena Jurkschat, Hammam Abdelwahab, Chelsea John, Pedro Ortiz Suarez, Malte Ostendorff, Samuel Weinbach, Rafet Sifa, Stefan Kesselheim, Nicolas Flores-Herr

The recent success of Large Language Models (LLMs) has been predominantly driven by curating the training dataset composition, scaling of model architectures and dataset sizes and advancements in pretraining objectives, leaving tokenizer influence as a blind spot.

Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning

no code implementations · 23 Jan 2023 · Malte Ostendorff, Georg Rehm

To address this problem, we introduce a cross-lingual and progressive transfer learning approach, called CLP-Transfer, that transfers models from a source language for which pretrained models are publicly available (e.g., English) to a new target language.

Cross-Lingual Transfer Language Modelling +1
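The snippet above describes transferring a pretrained source-language model to a new target language. As a hedged sketch of the embedding-transfer part of that idea, the toy code below copies pretrained vectors for tokens shared by both vocabularies and freshly initializes the rest; the vocabularies and the random initialization of non-overlapping tokens are illustrative assumptions (CLP-Transfer itself additionally constructs new vectors with help from a smaller target-language model, a step omitted here).

```python
import numpy as np

def init_target_embeddings(src_emb, src_vocab, tgt_vocab, rng):
    """Cross-lingual embedding transfer sketch: tokens that appear in both
    vocabularies keep their pretrained vectors; all others are freshly
    initialized. Simplified relative to CLP-Transfer."""
    dim = src_emb.shape[1]
    tgt_emb = rng.normal(scale=0.02, size=(len(tgt_vocab), dim))
    src_index = {tok: i for i, tok in enumerate(src_vocab)}
    for j, tok in enumerate(tgt_vocab):
        if tok in src_index:          # overlapping token: reuse its vector
            tgt_emb[j] = src_emb[src_index[tok]]
    return tgt_emb

rng = np.random.default_rng(0)
src_vocab = ["the", "model", "##ing", "haus"]   # illustrative vocabularies
tgt_vocab = ["das", "modell", "haus", "##ing"]
src_emb = rng.normal(size=(len(src_vocab), 8))
tgt_emb = init_target_embeddings(src_emb, src_vocab, tgt_vocab, rng)
```

The rest of the transformer weights can simply be copied from the source model, since only the embedding layer is vocabulary-specific.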

Specialized Document Embeddings for Aspect-based Similarity of Research Papers

1 code implementation · 28 Mar 2022 · Malte Ostendorff, Till Blume, Terry Ruas, Bela Gipp, Georg Rehm

We compare and analyze three generic document embeddings, six specialized document embeddings and a pairwise classification baseline in the context of research paper recommendations.

Document Classification Recommendation Systems +1

HiStruct+: Improving Extractive Text Summarization with Hierarchical Structure Information

no code implementations · Findings (ACL) 2022 · Qian Ruan, Malte Ostendorff, Georg Rehm

Using various experimental settings on three datasets (i.e., CNN/DailyMail, PubMed and arXiv), our HiStruct+ model collectively outperforms a strong baseline that differs from our model only in that the hierarchical structure information is not injected.

Extractive Summarization Extractive Text Summarization +2

Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings

1 code implementation · 14 Feb 2022 · Malte Ostendorff, Nils Rethmeier, Isabelle Augenstein, Bela Gipp, Georg Rehm

Learning scientific document representations can be substantially improved through contrastive learning objectives, where the challenge lies in creating positive and negative training samples that encode the desired similarity semantics.

Citation Prediction Contrastive Learning +3
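The snippet above frames scientific document representation learning as a contrastive objective over positive and negative samples. The toy NumPy code below sketches an InfoNCE-style loss in which the positive is drawn from a paper's citation neighborhood and the negatives from outside it; the sampling, embedding size, and temperature are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE: pull the anchor toward its positive sample while pushing
    it away from the negatives (cross-entropy with the positive at index 0)."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    logits = np.array([cos(anchor, positive)] +
                      [cos(anchor, n) for n in negatives]) / temperature
    return -logits[0] + np.log(np.exp(logits).sum())

# Toy setup: the positive comes from the anchor paper's citation
# neighborhood, the negatives from unrelated papers (illustrative only).
rng = np.random.default_rng(0)
anchor = rng.normal(size=16)
cited_neighbor = anchor + 0.1 * rng.normal(size=16)   # nearby: cited paper
unrelated = [rng.normal(size=16) for _ in range(5)]   # far: non-neighbors

loss = info_nce_loss(anchor, cited_neighbor, unrelated)
```

The key design question such an objective raises, and the one the snippet highlights, is how positives and negatives are sampled; here the citation graph supplies that similarity signal.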

A Qualitative Evaluation of User Preference for Link-based vs. Text-based Recommendations of Wikipedia Articles

1 code implementation · 16 Sep 2021 · Malte Ostendorff, Corinna Breitinger, Bela Gipp

We conclude that users of literature recommendation systems can benefit most from hybrid approaches that combine both link- and text-based approaches, where the user's information needs and preferences should control the weighting for the approaches used.

Recommendation Systems

Evaluating Document Representations for Content-based Legal Literature Recommendations

1 code implementation · 28 Apr 2021 · Malte Ostendorff, Elliott Ash, Terry Ruas, Bela Gipp, Julian Moreno-Schneider, Georg Rehm

Simultaneously, legal recommender systems are typically evaluated in small-scale user studies without any publicly available benchmark datasets.

Recommendation Systems Representation Learning +1

Aspect-based Document Similarity for Research Papers

1 code implementation · COLING 2020 · Malte Ostendorff, Terry Ruas, Till Blume, Bela Gipp, Georg Rehm

Our findings motivate future research of aspect-based document similarity and the development of a recommender system based on the evaluated techniques.

Document Classification Recommendation Systems

Contextual Document Similarity for Content-based Literature Recommender Systems

no code implementations · 1 Aug 2020 · Malte Ostendorff

In this doctoral thesis, we explore contextual document similarity measures, i.e., methods that determine document similarity as a triple of two documents and the context of their similarity.

Recommendation Systems

Towards an Open Platform for Legal Information

no code implementations · 27 May 2020 · Malte Ostendorff, Till Blume, Saskia Ostendorff

Recent advances in the area of legal information systems have led to a variety of applications that promise support in processing and accessing legal documents.
