L1-L2 Parallel Dependency Treebank as Learner Corpus

WS 2017 · John Lee, Keying Li, Herman Leung

This opinion paper proposes the use of a parallel treebank as a learner corpus. We show how an L1-L2 parallel treebank (i.e., parse trees of non-native sentences, aligned to the parse trees of their target hypotheses) can facilitate retrieval of sentences with specific learner errors. We argue for its benefits, in terms of corpus re-use and interoperability, over a conventional learner corpus annotated with error tags. As a proof of concept, we conduct a case study on word-order errors made by learners of Chinese as a foreign language. We report precision and recall in retrieving a range of word-order error categories from L1-L2 tree pairs annotated in the Universal Dependencies framework.
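
To make the retrieval idea concrete, here is a minimal sketch of how a word-order mismatch might be detected in one aligned tree pair. It is an illustration only, not the authors' method: the `Token` type, the `order_mismatches` function, the example sentence, its dependency analysis, and the word alignment are all assumptions made for this sketch. It flags a dependent whose position relative to its head differs between the learner sentence and the target hypothesis.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Token(NamedTuple):
    """One word of a dependency tree (hypothetical, simplified structure)."""
    id: int      # 1-based position in the sentence
    form: str    # surface word
    head: int    # id of the head token, 0 for the root
    deprel: str  # dependency relation label


def order_mismatches(
    learner: List[Token],
    target: List[Token],
    alignment: Dict[int, int],
    deprel: Optional[str] = None,
) -> List[Tuple[str, str, str]]:
    """Return (dependent, head, relation) triples whose linear order
    is reversed between the learner tree and the target tree."""
    learner_by_id = {t.id: t for t in learner}
    target_by_id = {t.id: t for t in target}
    hits = []
    for dep in learner:
        if dep.head == 0:
            continue  # the root has no head to compare against
        if deprel is not None and dep.deprel != deprel:
            continue  # restrict the query to one relation type
        if dep.id not in alignment or dep.head not in alignment:
            continue  # unaligned words cannot be compared
        t_dep, t_head = alignment[dep.id], alignment[dep.head]
        if target_by_id[t_dep].head != t_head:
            continue  # the attachment itself differs; not a pure order change
        if (dep.id < dep.head) != (t_dep < t_head):
            hits.append((dep.form, learner_by_id[dep.head].form, dep.deprel))
    return hits


# Invented example (not from the paper's corpus): the learner places the
# locative phrase after the verb; the target hypothesis places it before.
LEARNER = [
    Token(1, "我", 2, "nsubj"),
    Token(2, "吃饭", 0, "root"),
    Token(3, "在", 4, "case"),
    Token(4, "食堂", 2, "obl"),
]
TARGET = [
    Token(1, "我", 4, "nsubj"),
    Token(2, "在", 3, "case"),
    Token(3, "食堂", 4, "obl"),
    Token(4, "吃饭", 0, "root"),
]
# Learner token id -> target token id (assumed to be given).
ALIGNMENT = {1: 1, 2: 4, 3: 2, 4: 3}

print(order_mismatches(LEARNER, TARGET, ALIGNMENT))
# [('食堂', '吃饭', 'obl')]
```

Because the query is expressed over dependency relations rather than over error tags, the same function can be restricted to a single relation (for example `deprel="obl"`) without any change to the annotation.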
