Defining and Counting Phonological Classes in Cross-linguistic Segment Databases

LREC 2016  ·  Dan Dediu, Scott Moisik ·

Recently, there has been an explosion in the availability of large, good-quality cross-linguistic databases such as WALS (Dryer {\&} Haspelmath, 2013), Glottolog (Hammarstrom et al., 2015) and Phoible (Moran {\&} McCloy, 2014). Databases such as Phoible contain the actual segments used by various languages as they are given in the primary language descriptions. However, this segment-level representation cannot be used directly for analyses that require generalizations over classes of segments that share theoretically interesting features. Here we present a method and the associated R (R Core Team, 2014) code that allows the flexible definition of such meaningful classes and that can identify the sets of segments falling into such a class for any language inventory. The method and its results are important for those interested in exploring cross-linguistic patterns of phonetic and phonological diversity and their relationship to extra-linguistic factors and processes such as climate, economics, history or human genetics.

PDF Abstract LREC 2016 PDF LREC 2016 Abstract
No code implementations yet. Submit your code now

Tasks


Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here