Search Results for author: Kalvin Chang

Found 8 papers, 7 papers with code

WikiHan: A New Comparative Dataset for Chinese Languages

2 code implementations COLING 2022 Kalvin Chang, Chenxuan Cui, Youngmin Kim, David R. Mortensen

Most comparative datasets of Chinese varieties are not digital; however, Wiktionary includes a wealth of transcriptions of words from these varieties.

Decoder

Self-supervised Speech Representations Still Struggle with African American Vernacular English

1 code implementation 26 Aug 2024 Kalvin Chang, Yi-Hui Chou, Jiatong Shi, Hsuan-Ming Chen, Nicole Holliday, Odette Scharenborg, David R. Mortensen

Underperformance of ASR systems for speakers of African American Vernacular English (AAVE) and other marginalized language varieties is a well-documented phenomenon, and one that reinforces the stigmatization of these varieties.

Automatic Speech Recognition (ASR)

Can Large Language Models Code Like a Linguist?: A Case Study in Low Resource Sound Law Induction

no code implementations 18 Jun 2024 Atharva Naik, Kexun Zhang, Nathaniel Robinson, Aravind Mysore, Clayton Marr, Hong Sng Rebecca Byrnes, Anna Cai, Kalvin Chang, David Mortensen

Historical linguists have long written a kind of incompletely formalized "program" that converts reconstructed words in an ancestor language into words in one of its attested descendants; the program consists of a series of ordered string rewrite functions (called sound laws).

Phonotactic Complexity across Dialects

1 code implementation 20 Feb 2024 Ryan Soh-Eun Shim, Kalvin Chang, David R. Mortensen

Received wisdom in linguistic typology holds that if the structure of a language becomes more complex in one dimension, it will simplify in another, building on the assumption that all languages are equally complex (Joseph and Newmeyer, 2012).

Language Modelling

Automating Sound Change Prediction for Phylogenetic Inference: A Tukanoan Case Study

1 code implementation 2 Feb 2024 Kalvin Chang, Nathaniel R. Robinson, Anna Cai, Ting Chen, Annie Zhang, David R. Mortensen

We describe a set of new methods to partially automate linguistic phylogenetic inference given (1) cognate sets with their respective protoforms and sound laws, (2) a mapping from phones to their articulatory features and (3) a typological database of sound changes.

Transformed Protoform Reconstruction

1 code implementation 4 Jul 2023 Young Min Kim, Kalvin Chang, Chenxuan Cui, David Mortensen

We update the RNN-based encoder-decoder model of Meloni et al. (2021) with the state-of-the-art seq2seq model: the Transformer.

Decoder
