Multiword Expressions in Child Language

This paper introduces CHILDES-MWE, a resource of English CHILDES corpora automatically annotated with Multiword Expression (MWE) information. It comprises almost 350,000 sentences annotated with more than 70,000 distinct MWEs of various types, drawn from both longitudinal and latitudinal corpora, and can be used for large-scale language acquisition studies of how MWEs feature in child language. Focusing on compound nouns (CNs), we then examine in a longitudinal study whether the distribution and compositionality of CNs differ between child-directed and child-produced sentences across ages. Using additional latitudinal data, we also investigate whether there are further differences in CN usage and compositionality preferences. The results for child-produced sentences reflect the CN distribution and compositionality found in child-directed sentences.

LREC 2016