ACES a dataset consisting of 68 phenomena ranging from simple perturbations at the word/character level to more complex errors based on discourse and real-world knowledge. It can be used to evaluate a wide range of Machine Translation metrics.
Paper | Code | Results | Date | Stars |
---|