This paper presents our work in WMT 2021 Quality Estimation (QE) Shared Task.
However, we argue that there are gaps between the predictor and the estimator in both data quality and training objectives, which preclude QE models from benefiting from a large number of parallel corpora more directly.
Modern machine learning algorithms usually involve tuning multiple (from one to thousands) hyperparameters which play a pivotal role in terms of model generalizability.
To address this problem, in this paper, we propose a novel scalable quadruply stochastic gradient algorithm (QSG-S2AUC) for nonlinear semi-supervised AUC optimization.
Specifically, to handle two types of data instances involved in S$^3$VM, TSGS$^3$VM samples a labeled instance and an unlabeled instance as well with the random features in each iteration to compute a triply stochastic gradient.