LREC 2020 • Marta R. Costa-jussà, Pau Li Lin, Cristina España-Bonet
We introduce GeBioToolkit, a tool for extracting sentence-level multilingual parallel corpora, annotated with document and gender information, from Wikipedia biographies.