Rapid Deployment of Phrase Structure Parsing for Related Languages: A Case Study of Insular Scandinavian
This paper presents ongoing work that aims to improve machine parsing of Faroese using a combination of Faroese and Icelandic training data. We show that even with only a relatively small parsed corpus of one language, namely 53,000 words of Faroese, we can obtain better results by adding information about phrase structure from a closely related language with similar syntax. Our experiment uses the Berkeley parser. We demonstrate that adding Icelandic data, without any other modification to the experimental setup, improves the F-measure for Faroese from 75.44% to 78.05% and part-of-speech tagging accuracy from 88.86% to 90.40%.
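To put the reported scores in perspective, the following minimal sketch computes the absolute gain in percentage points and the relative reduction in error for each metric. The function names are illustrative and do not come from the paper, and relative error reduction is a common way of reading such gains rather than a figure the authors report.

```python
# Scores are taken from the abstract; the helper names are hypothetical.

def absolute_gain(before: float, after: float) -> float:
    """Gain in percentage points between two scores on a 0-100 scale."""
    return after - before


def relative_error_reduction(before: float, after: float) -> float:
    """Share of the remaining error (100 - before) removed, in percent."""
    error_before = 100.0 - before
    return (after - before) / error_before * 100.0


scores = {
    "F-measure": (75.44, 78.05),
    "POS tagging accuracy": (88.86, 90.40),
}

for name, (before, after) in scores.items():
    gain = absolute_gain(before, after)
    reduction = relative_error_reduction(before, after)
    print(f"{name}: +{gain:.2f} points, {reduction:.1f}% relative error reduction")
```

Under these definitions, the F-measure gain is 2.61 points and the tagging gain is 1.54 points.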