Search Results for author: Pius von Däniken

Found 7 papers, 2 papers with code

ZHAW-InIT - Social Media Geolocation at VarDial 2020

no code implementations VarDial (COLING) 2020 Fernando Benites, Manuela Hürlimann, Pius von Däniken, Mark Cieliebak

We describe our approaches for the Social Media Geolocation (SMG) task at the VarDial Evaluation Campaign 2020.

A Measure of the System Dependence of Automated Metrics

no code implementations 4 Dec 2024 Pius von Däniken, Jan Deriu, Mark Cieliebak

Automated metrics for Machine Translation have made significant progress, with the goal of replacing expensive and time-consuming human evaluations.

Machine Translation, Translation

Favi-Score: A Measure for Favoritism in Automated Preference Ratings for Generative AI Evaluation

no code implementations 3 Jun 2024 Pius von Däniken, Jan Deriu, Don Tuggener, Mark Cieliebak

Thus, we propose that preference-based metrics ought to be evaluated on both sign accuracy scores and favoritism.

Text Generation

Correction of Errors in Preference Ratings from Automated Metrics for Text Generation

no code implementations 6 Jun 2023 Jan Deriu, Pius von Däniken, Don Tuggener, Mark Cieliebak

A major challenge in the field of Text Generation is evaluation: Human evaluations are cost-intensive, and automated metrics often display considerable disagreement with human judgments.

Machine Translation, Text Generation, +1

On the Effectiveness of Automated Metrics for Text Generation Systems

no code implementations 24 Oct 2022 Pius von Däniken, Jan Deriu, Don Tuggener, Mark Cieliebak

A major challenge in the field of Text Generation is evaluation because we lack a sound theory that can be leveraged to extract guidelines for evaluation campaigns.

Text Generation
