TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Grammatical Error Correction	CoNLL-2014 Shared Task	T5	F0.5	68.87	# 4
Grammatical Error Correction	Falko-MERLIN	gT5 xxl	F0.5	75.96	# 1
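
Both rows above report F0.5, the F-measure with beta = 0.5, which weights precision more heavily than recall. The following is a minimal sketch of that formula computed from edit-level counts. The function name `f_beta` and the counts in the usage line are hypothetical and are not taken from the paper; official GEC scorers also handle edit extraction and matching, which this sketch does not attempt.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """Return F-beta from true-positive, false-positive and false-negative edit counts.

    With beta < 1 the score favours precision over recall.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing matches
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts, for illustration only: precision 0.75, recall 0.60.
score = f_beta(tp=60, fp=20, fn=40)
print(f"F0.5 = {100 * score:.2f}")
```

Because precision exceeds recall in this made-up example, F0.5 comes out higher than F1 would for the same counts.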

A Simple Recipe for Multilingual Grammatical Error Correction

ACL 2021 · Sascha Rothe, Jonathan Mallinson, Eric Malmi, Sebastian Krause, Aliaksei Severyn

This paper presents a simple recipe for training state-of-the-art multilingual Grammatical Error Correction (GEC) models. The first ingredient is a language-agnostic method for generating a large number of synthetic examples. The second is the use of large-scale multilingual language models (up to 11B parameters). Once fine-tuned on language-specific supervised sets, these models surpass the previous state-of-the-art results on GEC benchmarks in four languages: English, Czech, German and Russian. Having established a new set of baselines for GEC, we make our results easily reproducible and accessible by releasing the cLang-8 dataset. It is produced by using our best model, which we call gT5, to clean the targets of the widely used yet noisy Lang-8 dataset. cLang-8 greatly simplifies typical GEC training pipelines, which are composed of multiple fine-tuning stages: we demonstrate that a single fine-tuning step on cLang-8 with off-the-shelf language models yields further accuracy improvements over the already top-performing gT5 model for English.
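
The abstract does not spell out which corruptions the language-agnostic synthetic data method uses, so the sketch below is only an illustration of the general idea: take clean sentences, apply operations that need no language-specific resources, and keep (corrupted, clean) pairs as training examples. The function names `corrupt` and `make_pairs` and the four operations (drop a token, swap adjacent tokens, drop a character, swap adjacent characters) are assumptions, not the authors' exact procedure.

```python
import random


def corrupt(sentence: str, rng: random.Random) -> str:
    """Apply one randomly chosen corruption that relies only on whitespace tokens."""
    tokens = sentence.split()
    if len(tokens) < 2:
        return sentence  # too short to corrupt in this sketch
    # Assumed operation set; the paper's own set may differ.
    op = rng.choice(["drop_token", "swap_tokens", "drop_char", "swap_chars"])
    i = rng.randrange(len(tokens) - 1)
    if op == "drop_token":
        del tokens[i]
    elif op == "swap_tokens":
        tokens[i], tokens[i + 1] = tokens[i + 1], tokens[i]
    else:
        word = tokens[i]
        if len(word) >= 2:
            j = rng.randrange(len(word) - 1)
            if op == "drop_char":
                tokens[i] = word[:j] + word[j + 1:]
            else:  # swap_chars
                tokens[i] = word[:j] + word[j + 1] + word[j] + word[j + 2:]
    return " ".join(tokens)


def make_pairs(clean_sentences, seed=0):
    """Build (source, target) pairs: corrupted input, original sentence as target."""
    rng = random.Random(seed)
    return [(corrupt(s, rng), s) for s in clean_sentences]


pairs = make_pairs(["She goes to school every day .", "Das ist ein kurzer Satz ."])
for source, target in pairs:
    print(f"{source!r} -> {target!r}")
```

Since nothing here depends on a lexicon or a tagger, the same function can be run over text in any language that is split by whitespace; a seq2seq model would then be trained to map each corrupted source back to its clean target.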