On the Use of BERT for Automated Essay Scoring: Joint Learning of Multi-Scale Essay Representation

NAACL 2022 · Yongjie Wang, Chuan Wang, Ruobing Li, Hui Lin

In recent years, pre-trained models have become dominant in most natural language processing (NLP) tasks. However, in the area of Automated Essay Scoring (AES), pre-trained models such as BERT have not been properly used to outperform other deep learning models such as LSTM. In this paper, we introduce a novel multi-scale essay representation for BERT that can be jointly learned. We also employ multiple losses and transfer learning from out-of-domain essays to further improve performance. Experimental results show that our approach benefits substantially from joint learning of the multi-scale essay representation and achieves a near state-of-the-art result among deep learning models on the ASAP task. Our multi-scale essay representation also generalizes well to the CommonLit Readability Prize dataset, which suggests that the text representation proposed in this paper may be a new and effective choice for long-text tasks.
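The following is a minimal sketch (not the authors' released code) of the core idea described above: compute BERT features for an essay at several scales (the whole document plus fixed-size token segments), concatenate them, and regress a single score. The class name `MultiScaleScorer`, the segment sizes, and the pooling choices are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of a multi-scale BERT essay scorer; scale choices,
# pooling, and the single MSE loss shown here are assumptions, while the
# paper combines multiple losses and adds transfer learning.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizerFast


class MultiScaleScorer(nn.Module):
    def __init__(self, segment_sizes=(64, 128)):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        self.segment_sizes = segment_sizes
        hidden = self.bert.config.hidden_size
        # One feature vector per scale: document scale + each segment scale.
        self.head = nn.Linear(hidden * (1 + len(segment_sizes)), 1)

    def _encode(self, input_ids, attention_mask):
        # [CLS] representation of one chunk of tokens.
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        return out.last_hidden_state[:, 0]

    def forward(self, input_ids, attention_mask):
        feats = [self._encode(input_ids, attention_mask)]  # document scale
        for size in self.segment_sizes:
            # Split the essay into non-overlapping segments of `size` tokens,
            # encode each segment, then mean-pool the segment vectors.
            chunks = input_ids.split(size, dim=1)
            masks = attention_mask.split(size, dim=1)
            seg_vecs = [self._encode(c, m) for c, m in zip(chunks, masks)]
            feats.append(torch.stack(seg_vecs, dim=1).mean(dim=1))
        return self.head(torch.cat(feats, dim=-1)).squeeze(-1)


tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = MultiScaleScorer()
batch = tokenizer(["An example essay ..."], truncation=True,
                  max_length=512, padding="max_length", return_tensors="pt")
score = model(batch["input_ids"], batch["attention_mask"])
loss = nn.MSELoss()(score, torch.tensor([0.8]))  # one of several possible losses
```

The design point the sketch illustrates is that segment-scale features let BERT see local structure that is diluted in a single 512-token pass, while the document-scale [CLS] vector retains the global view; the scales are learned jointly through the shared encoder.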


Datasets

ASAP, CommonLit Readability Prize

Results from the Paper


Task: Automated Essay Scoring · Dataset: ASAP · Model: Tran-BERT-MS-ML-R · Metric: Quadratic Weighted Kappa = 0.791 · Global Rank: #1

Methods

BERT