Search Results for author: Jan Philip Wahle

Found 17 papers, 13 papers with code

Citation Amnesia: NLP and Other Academic Fields Are in a Citation Age Recession

1 code implementation • 19 Feb 2024 • Jan Philip Wahle, Terry Ruas, Mohamed Abdalla, Bela Gipp, Saif M. Mohammad

This study examines the tendency to cite older work across 20 fields of study over 43 years (1980-2023).

Text-Guided Image Clustering

1 code implementation • 5 Feb 2024 • Andreas Stephan, Lukas Miklautz, Kevin Sidak, Jan Philip Wahle, Bela Gipp, Claudia Plant, Benjamin Roth

We, therefore, propose Text-Guided Image Clustering, i.e., generating text using image captioning and visual question-answering (VQA) models and subsequently clustering the generated text.

Clustering, Image Captioning, +3

We are Who We Cite: Bridges of Influence Between Natural Language Processing and Other Academic Fields

1 code implementation • 23 Oct 2023 • Jan Philip Wahle, Terry Ruas, Mohamed Abdalla, Bela Gipp, Saif M. Mohammad

We analyzed ~77k NLP papers, ~3.1m citations from NLP papers to other papers, and ~1.8m citations from other papers to NLP papers.

Math

Paraphrase Types for Generation and Detection

1 code implementation • 23 Oct 2023 • Jan Philip Wahle, Bela Gipp, Terry Ruas

Current approaches in paraphrase generation and detection heavily rely on a single general similarity score, ignoring the intricate linguistic properties of language.

Binary Classification, Paraphrase Generation

The Elephant in the Room: Analyzing the Presence of Big Tech in Natural Language Processing Research

1 code implementation • 4 May 2023 • Mohamed Abdalla, Jan Philip Wahle, Terry Ruas, Aurélie Névéol, Fanny Ducel, Saif M. Mohammad, Karën Fort

Recent advances in deep learning methods for natural language processing (NLP) have created new business opportunities and made NLP research critical for industry development.

Paraphrase Detection: Human vs. Machine Content

1 code implementation • 24 Mar 2023 • Jonas Becker, Jan Philip Wahle, Terry Ruas, Bela Gipp

Additionally, we identify four datasets as the most diverse and challenging for paraphrase detection.

A Cohesive Distillation Architecture for Neural Language Models

no code implementations • 12 Jan 2023 • Jan Philip Wahle

We developed two methods to test our hypothesis that efficient architectures can gain knowledge from LMs and extract valuable information from lexical sources.

Knowledge Distillation, Language Modelling, +3

Analyzing Multi-Task Learning for Abstractive Text Summarization

1 code implementation • 26 Oct 2022 • Frederic Kirstein, Jan Philip Wahle, Terry Ruas, Bela Gipp

Further, we find that the choice and combination of task families influence downstream performance more than the training scheme, supporting the use of task families for abstractive text summarization.

Abstractive Text Summarization, Multi-Task Learning, +3

CS-Insights: A System for Analyzing Computer Science Research

2 code implementations • 13 Oct 2022 • Terry Ruas, Jan Philip Wahle, Lennart Küll, Saif M. Mohammad, Bela Gipp

This paper presents CS-Insights, an interactive web application to analyze computer science publications from DBLP through multiple perspectives.

How Large Language Models are Transforming Machine-Paraphrased Plagiarism

3 code implementations • 7 Oct 2022 • Jan Philip Wahle, Terry Ruas, Frederic Kirstein, Bela Gipp

The recent success of large language models for text generation poses a severe threat to academic integrity, as plagiarists can generate realistic paraphrases indistinguishable from original work.

Paraphrase Generation

D3: A Massive Dataset of Scholarly Metadata for Analyzing the State of Computer Science Research

1 code implementation • LREC 2022 • Jan Philip Wahle, Terry Ruas, Saif M. Mohammad, Bela Gipp

We present an initial analysis focused on the volume of computer science research (e.g., number of papers, authors, research activity), trends in topics of interest, and citation patterns.

Testing the Generalization of Neural Language Models for COVID-19 Misinformation Detection

1 code implementation • 15 Nov 2021 • Jan Philip Wahle, Nischal Ashok, Terry Ruas, Norman Meuschke, Tirthankar Ghosal, Bela Gipp

We expect that evaluating a broad spectrum of datasets and models will benefit future research in developing misinformation detection systems.

Misinformation

Incorporating Word Sense Disambiguation in Neural Language Models

2 code implementations • 15 Jun 2021 • Jan Philip Wahle, Terry Ruas, Norman Meuschke, Bela Gipp

We present two supervised (pre-)training methods to incorporate gloss definitions from lexical resources into neural language models (LMs).

Word Sense Disambiguation

Identifying Machine-Paraphrased Plagiarism

2 code implementations • 22 Mar 2021 • Jan Philip Wahle, Terry Ruas, Tomáš Foltýnek, Norman Meuschke, Bela Gipp

Employing paraphrasing tools to conceal plagiarized text is a severe threat to academic integrity.

Text Matching
