Search Results for author: Terry Ruas

Found 30 papers, 24 papers with code

Semantic Feature Structure Extraction From Documents Based on Extended Lexical Chains

no code implementations GWC 2018 Terry Ruas, William Grosky

For our approach, we develop two kinds of lexical chains: (i) a multilevel flexible chain representation of the extracted semantic values, which is used to construct a fixed segmentation of these chains and constituent words in the text; and (ii) a fixed lexical chain obtained directly from the initial semantic representation from a document.

Retrieval Sentence

Citation Amnesia: NLP and Other Academic Fields Are in a Citation Age Recession

1 code implementation 19 Feb 2024 Jan Philip Wahle, Terry Ruas, Mohamed Abdalla, Bela Gipp, Saif M. Mohammad

This study examines the tendency to cite older work across 20 fields of study over 43 years (1980–2023).

The Media Bias Taxonomy: A Systematic Literature Review on the Forms and Automated Detection of Media Bias

1 code implementation 26 Dec 2023 Timo Spinde, Smi Hinterreiter, Fabian Haak, Terry Ruas, Helge Giese, Norman Meuschke, Bela Gipp

However, we have identified a lack of interdisciplinarity in existing projects, and a need for more awareness of the various types of media bias to support methodologically thorough performance evaluations of media bias detection systems.

Bias Detection

Paraphrase Types for Generation and Detection

1 code implementation 23 Oct 2023 Jan Philip Wahle, Bela Gipp, Terry Ruas

Current approaches in paraphrase generation and detection heavily rely on a single general similarity score, ignoring the intricate linguistic properties of language.

Binary Classification Paraphrase Generation

We are Who We Cite: Bridges of Influence Between Natural Language Processing and Other Academic Fields

1 code implementation 23 Oct 2023 Jan Philip Wahle, Terry Ruas, Mohamed Abdalla, Bela Gipp, Saif M. Mohammad

We analyzed ~77k NLP papers, ~3.1m citations from NLP papers to other papers, and ~1.8m citations from other papers to NLP papers.

Math

The Elephant in the Room: Analyzing the Presence of Big Tech in Natural Language Processing Research

1 code implementation 4 May 2023 Mohamed Abdalla, Jan Philip Wahle, Terry Ruas, Aurélie Névéol, Fanny Ducel, Saif M. Mohammad, Karën Fort

Recent advances in deep learning methods for natural language processing (NLP) have created new business opportunities and made NLP research critical for industry development.

Introducing MBIB -- the first Media Bias Identification Benchmark Task and Dataset Collection

1 code implementation 25 Apr 2023 Martin Wessel, Tomáš Horych, Terry Ruas, Akiko Aizawa, Bela Gipp, Timo Spinde

A unified benchmark encourages the development of more robust systems and shifts the current paradigm in media bias detection evaluation towards solutions that tackle not one but multiple media bias types simultaneously.

Bias Detection

Paraphrase Detection: Human vs. Machine Content

1 code implementation 24 Mar 2023 Jonas Becker, Jan Philip Wahle, Terry Ruas, Bela Gipp

Additionally, we identify four datasets as the most diverse and challenging for paraphrase detection.

Analyzing Multi-Task Learning for Abstractive Text Summarization

1 code implementation 26 Oct 2022 Frederic Kirstein, Jan Philip Wahle, Terry Ruas, Bela Gipp

Further, we find that choice and combinations of task families influence downstream performance more than the training scheme, supporting the use of task families for abstractive text summarization.

Abstractive Text Summarization Multi-Task Learning +3

CS-Insights: A System for Analyzing Computer Science Research

2 code implementations 13 Oct 2022 Terry Ruas, Jan Philip Wahle, Lennart Küll, Saif M. Mohammad, Bela Gipp

This paper presents CS-Insights, an interactive web application to analyze computer science publications from DBLP through multiple perspectives.

How Large Language Models are Transforming Machine-Paraphrased Plagiarism

3 code implementations 7 Oct 2022 Jan Philip Wahle, Terry Ruas, Frederic Kirstein, Bela Gipp

The recent success of large language models for text generation poses a severe threat to academic integrity, as plagiarists can generate realistic paraphrases indistinguishable from original work.

Paraphrase Generation

Neural Media Bias Detection Using Distant Supervision With BABE -- Bias Annotations By Experts

1 code implementation 29 Sep 2022 Timo Spinde, Manuel Plank, Jan-David Krieger, Terry Ruas, Bela Gipp, Akiko Aizawa

Fine-tuning and evaluating the model on our proposed supervised data set, we achieve a macro F1-score of 0.804, outperforming existing methods.

Bias Detection Sentence

A Domain-adaptive Pre-training Approach for Language Bias Detection in News

1 code implementation 22 May 2022 Jan-David Krieger, Timo Spinde, Terry Ruas, Juhi Kulshrestha, Bela Gipp

We present DA-RoBERTa, a new state-of-the-art transformer-based model adapted to the media bias domain which identifies sentence-level bias with an F1 score of 0.814.

Bias Detection Decision Making +1

D3: A Massive Dataset of Scholarly Metadata for Analyzing the State of Computer Science Research

1 code implementation LREC 2022 Jan Philip Wahle, Terry Ruas, Saif M. Mohammad, Bela Gipp

We present an initial analysis focused on the volume of computer science research (e.g., number of papers, authors, research activity), trends in topics of interest, and citation patterns.

Specialized Document Embeddings for Aspect-based Similarity of Research Papers

1 code implementation 28 Mar 2022 Malte Ostendorff, Till Blume, Terry Ruas, Bela Gipp, Georg Rehm

We compare and analyze three generic document embeddings, six specialized document embeddings and a pairwise classification baseline in the context of research paper recommendations.

Document Classification Recommendation Systems +1

Detecting Cross-Language Plagiarism using Open Knowledge Graphs

1 code implementation 18 Nov 2021 Johannes Stegmüller, Fabian Bauer-Marquart, Norman Meuschke, Terry Ruas, Moritz Schubotz, Bela Gipp

Identifying cross-language plagiarism is challenging, especially for distant language pairs and sense-for-sense translations.

Knowledge Graphs Machine Translation +1

Testing the Generalization of Neural Language Models for COVID-19 Misinformation Detection

1 code implementation 15 Nov 2021 Jan Philip Wahle, Nischal Ashok, Terry Ruas, Norman Meuschke, Tirthankar Ghosal, Bela Gipp

We expect that evaluating a broad spectrum of datasets and models will benefit future research in developing misinformation detection systems.

Misinformation

Incorporating Word Sense Disambiguation in Neural Language Models

2 code implementations 15 Jun 2021 Jan Philip Wahle, Terry Ruas, Norman Meuschke, Bela Gipp

We present two supervised (pre-)training methods to incorporate gloss definitions from lexical resources into neural language models (LMs).

Word Sense Disambiguation

Evaluating Document Representations for Content-based Legal Literature Recommendations

1 code implementation 28 Apr 2021 Malte Ostendorff, Elliott Ash, Terry Ruas, Bela Gipp, Julian Moreno-Schneider, Georg Rehm

Simultaneously, legal recommender systems are typically evaluated in small-scale user studies without any publicly available benchmark datasets.

Recommendation Systems Representation Learning +1

Identifying Machine-Paraphrased Plagiarism

2 code implementations 22 Mar 2021 Jan Philip Wahle, Terry Ruas, Tomáš Foltýnek, Norman Meuschke, Bela Gipp

Employing paraphrasing tools to conceal plagiarized text is a severe threat to academic integrity.

Text Matching

Enhanced word embeddings using multi-semantic representation through lexical chains

1 code implementation 22 Jan 2021 Terry Ruas, Charles Henrique Porto Ferreira, William Grosky, Fabrício Olivetti de França, Débora Maria Rossi Medeiros

The relationship between words in a sentence often tells us more about the underlying semantic content of a document than its actual words, individually.

Document Classification Sentence +1

Multi-sense embeddings through a word sense disambiguation process

1 code implementation 21 Jan 2021 Terry Ruas, William Grosky, Akiko Aizawa

Natural Language Understanding has seen an increasing number of publications in the last few years, especially after robust word embeddings models became prominent, when they proved themselves able to capture and represent semantic relationships from massive amounts of data.

Natural Language Understanding Word Embeddings +2

Aspect-based Document Similarity for Research Papers

1 code implementation COLING 2020 Malte Ostendorff, Terry Ruas, Till Blume, Bela Gipp, Georg Rehm

Our findings motivate future research of aspect-based document similarity and the development of a recommender system based on the evaluated techniques.

Document Classification Recommendation Systems

Why Machines Cannot Learn Mathematics, Yet

no code implementations 20 May 2019 André Greiner-Petter, Terry Ruas, Moritz Schubotz, Akiko Aizawa, William Grosky, Bela Gipp

Nowadays, Machine Learning (ML) is seen as the universal solution to improve the effectiveness of information retrieval (IR) methods.

BIG-bench Machine Learning Information Retrieval +1
