Search Results for author: Kathrin Blagec

Found 8 papers, 5 papers with code

A global analysis of metrics used for measuring performance in natural language processing

1 code implementation • nlppower (ACL) 2022 • Kathrin Blagec, Georg Dorffner, Milad Moradi, Simon Ott, Matthias Samwald

Our results suggest that the large majority of natural language processing metrics currently in use have properties that may result in an inadequate reflection of a model's performance.

Benchmarking • Machine Translation
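The metric-properties concern above can be made concrete with a small, hypothetical illustration (not taken from the paper): on a class-imbalanced test set, a trivial majority-class predictor achieves high accuracy while completely failing on the minority class, so accuracy alone misrepresents performance.

```python
# Illustrative sketch: why a single metric can misrepresent performance.
# A model that always predicts the majority class scores 95% accuracy
# on this imbalanced set but has zero recall on the positive class.
y_true = [0] * 95 + [1] * 5   # 95% negative, 5% positive (hypothetical data)
y_pred = [0] * 100            # trivial majority-class predictor

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = true_positives / sum(y_true)

print(accuracy)  # 0.95
print(recall)    # 0.0
```

Reporting complementary metrics (e.g. recall, F1) alongside accuracy avoids this failure mode.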

Benchmark datasets driving artificial intelligence development fail to capture the needs of medical professionals

no code implementations • 18 Jan 2022 • Kathrin Blagec, Jakob Kraiger, Wolfgang Frühwirt, Matthias Samwald

Furthermore, there is a lack of systematized meta-information that allows clinical AI researchers to quickly determine accessibility, scope, content and other characteristics of datasets and benchmark datasets relevant to the clinical domain.

A curated, ontology-based, large-scale knowledge graph of artificial intelligence tasks and benchmarks

1 code implementation • 4 Oct 2021 • Kathrin Blagec, Adriano Barbosa-Silva, Simon Ott, Matthias Samwald

Research in artificial intelligence (AI) is addressing a growing number of tasks through a rapidly growing number of models and methodologies.

Neural sentence embedding models for semantic similarity estimation in the biomedical domain

1 code implementation • 1 Oct 2021 • Kathrin Blagec, Hong Xu, Asan Agibetov, Matthias Samwald

BACKGROUND: In this study, we investigated the efficacy of current state-of-the-art neural sentence embedding models for semantic similarity estimation of sentences from biomedical literature.

Semantic Similarity • Semantic Textual Similarity • +4
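The general approach evaluated in this study can be sketched as follows: each sentence is mapped to a fixed-size embedding vector, and similarity is estimated by cosine similarity between the vectors. This is a minimal illustration, not the authors' exact pipeline; the embedding values and sentences below are hypothetical.

```python
# Illustrative sketch: semantic similarity estimation from sentence
# embeddings via cosine similarity. Real systems obtain the embeddings
# from a neural sentence encoder; toy vectors are used here.
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings for two paraphrased biomedical sentences.
emb_1 = [0.8, 0.1, 0.3]   # "The patient was given aspirin."
emb_2 = [0.7, 0.2, 0.4]   # "Aspirin was administered to the patient."

score = cosine_similarity(emb_1, emb_2)
print(round(score, 3))
```

A score near 1.0 indicates the two sentences occupy nearby points in the embedding space, which is taken as a proxy for semantic similarity.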

GPT-3 Models are Poor Few-Shot Learners in the Biomedical Domain

1 code implementation • 6 Sep 2021 • Milad Moradi, Kathrin Blagec, Florian Haberl, Matthias Samwald

However, in-domain pretraining seems not to be sufficient; novel pretraining and few-shot learning strategies are required in the biomedical NLP domain.

Few-Shot Learning • Language Modelling • +1
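The few-shot setup evaluated in work like this can be sketched as prompt construction: a handful of labeled examples are concatenated before the query, and the language model is asked to complete the final label. This is a generic illustration; the task, labels, and sentences below are hypothetical, not taken from the paper.

```python
# Illustrative sketch: building a few-shot prompt for a biomedical
# sentence-classification task. The examples and labels are made up;
# a real evaluation would send the prompt to a language model.
examples = [
    ("Aspirin reduces fever.", "treatment"),
    ("Smoking causes lung cancer.", "cause"),
]
query = "Metformin lowers blood glucose."

# Each demonstration pairs a sentence with its label; the query is
# left open so the model completes the final "Label:" field.
prompt = "\n\n".join(f"Sentence: {s}\nLabel: {l}" for s, l in examples)
prompt += f"\n\nSentence: {query}\nLabel:"
print(prompt)
```

The paper's finding is that this in-context format underperforms fine-tuned models in the biomedical domain, even for in-domain pretrained models.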

Deep learning models are not robust against noise in clinical text

1 code implementation • 27 Aug 2021 • Milad Moradi, Kathrin Blagec, Matthias Samwald

The proposed perturbation methods can be used in performance evaluation tests to assess how robustly clinical NLP models can operate on noisy data, in real-world settings.
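One common family of such perturbation methods is character-level noise injection; the sketch below (a simplified stand-in, not the authors' exact method) swaps adjacent characters at a configurable rate to simulate typos in clinical text.

```python
# Illustrative sketch: a character-level perturbation for robustness
# testing. Adjacent letters are swapped at a given rate, simulating
# typos while preserving the text's length and character multiset.
import random

def perturb(text, rate=0.1, seed=0):
    """Randomly swap adjacent alphabetic characters at the given rate."""
    rng = random.Random(seed)  # seeded for reproducible test sets
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip past the swapped pair
        else:
            i += 1
    return "".join(chars)

clean = "Patient denies chest pain and shortness of breath."
noisy = perturb(clean, rate=0.3)
print(noisy)
```

Evaluating a model on both `clean` and `noisy` versions of a test set quantifies how much performance degrades under realistic input noise.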

A critical analysis of metrics used for measuring progress in artificial intelligence

no code implementations • 6 Aug 2020 • Kathrin Blagec, Georg Dorffner, Milad Moradi, Matthias Samwald

Our results suggest that the large majority of metrics currently in use have properties that may result in an inadequate reflection of a model's performance.

Benchmarking
