Mac-Morpho is a corpus of Brazilian Portuguese texts annotated with part-of-speech tags. Its first version was released in 2003 [1], and since then, two revisions have been made in order to improve the quality of the resource [2, 3]. The corpus is available for download split into train, development and test sections. These are 76%, 4% and 20% of the corpus total, respectively (the reason for the unusual numbers is that the corpus was first split into 80%/20% train/test, and then 5% of the train section was set aside for development). This split was used in [3], and new POS tagging research with Mac-Morpho is encouraged to follow it in order to make consistent comparisons possible.
Paper | Code | Results | Date | Stars |
---|