Search Results for author: Chenchen Ding

Found 28 papers, 2 papers with code

Multi-Source Cross-Lingual Constituency Parsing

no code implementations ICON 2021 Hour Kaing, Chenchen Ding, Katsuhito Sudoh, Masao Utiyama, Eiichiro Sumita, Satoshi Nakamura

Pretrained multilingual language models have become a key part of cross-lingual transfer for many natural language processing tasks, even those without bilingual information.

Constituency Parsing Cross-Lingual Transfer +1

FeatureBART: Feature Based Sequence-to-Sequence Pre-Training for Low-Resource NMT

no code implementations COLING 2022 Abhisek Chakrabarty, Raj Dabre, Chenchen Ding, Hideki Tanaka, Masao Utiyama, Eiichiro Sumita

In this paper we present FeatureBART, a linguistically motivated sequence-to-sequence monolingual pre-training strategy in which syntactic features such as lemma, part-of-speech and dependency labels are incorporated into the span-prediction-based pre-training framework of BART.
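
A common way to inject such token-level features into a sequence-to-sequence model is factored embeddings: separate embeddings for the surface token, lemma, POS tag and dependency label, summed before the encoder. A minimal PyTorch sketch of that general technique (not FeatureBART's actual architecture; all vocabulary sizes and dimensions below are assumptions for illustration):

```python
import torch
import torch.nn as nn

class FactoredEmbedding(nn.Module):
    """Sum of token, lemma, POS and dependency-label embeddings."""
    def __init__(self, n_tokens, n_lemmas, n_pos, n_dep, dim):
        super().__init__()
        self.tok = nn.Embedding(n_tokens, dim)
        self.lem = nn.Embedding(n_lemmas, dim)
        self.pos = nn.Embedding(n_pos, dim)
        self.dep = nn.Embedding(n_dep, dim)

    def forward(self, tok_ids, lem_ids, pos_ids, dep_ids):
        # All id tensors have shape (batch, seq_len); output is (batch, seq_len, dim).
        return (self.tok(tok_ids) + self.lem(lem_ids)
                + self.pos(pos_ids) + self.dep(dep_ids))

# Assumed sizes, for illustration only.
emb = FactoredEmbedding(n_tokens=32000, n_lemmas=16000, n_pos=20, n_dep=40, dim=512)
tok = torch.randint(32000, (2, 7))
lem = torch.randint(16000, (2, 7))
pos = torch.randint(20, (2, 7))
dep = torch.randint(40, (2, 7))
print(emb(tok, lem, pos, dep).shape)  # torch.Size([2, 7, 512])
```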

LEMMA NMT

Outlier-Aware Training for Low-Bit Quantization of Structural Re-Parameterized Networks

no code implementations 11 Feb 2024 Muqun Niu, Yuan Ren, Boyu Li, Chenchen Ding

Lightweight design of Convolutional Neural Networks (CNNs) requires co-design efforts in the model architectures and compression techniques.

Quantization

A Crucial Parameter for Rank-Frequency Relation in Natural Languages

no code implementations 1 Feb 2024 Chenchen Ding

The form $f \propto r^{-\alpha} \cdot (r+\gamma)^{-\beta}$ has been empirically shown to be more precise than the naïve power law $f \propto r^{-\alpha}$ for modeling the rank-frequency ($r$-$f$) relation of words in natural languages.
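
A minimal sketch of fitting this modified power law to rank-frequency counts by least squares in log space (an illustration, not the paper's estimation procedure; the `ranks`/`freqs` data below is synthetic):

```python
import numpy as np
from scipy.optimize import curve_fit

def log_model(log_r, log_c, alpha, beta, gamma):
    """log f = log C - alpha*log r - beta*log(r + gamma)."""
    r = np.exp(log_r)
    return log_c - alpha * log_r - beta * np.log(r + gamma)

# Synthetic stand-in data: Zipf-like counts for ranks 1..10000 plus noise.
rng = np.random.default_rng(0)
ranks = np.arange(1, 10001, dtype=float)
freqs = 1e6 * ranks**-0.7 * (ranks + 20.0)**-0.4
freqs *= np.exp(rng.normal(0.0, 0.05, ranks.size))

params, _ = curve_fit(
    log_model, np.log(ranks), np.log(freqs),
    p0=[np.log(freqs[0]), 1.0, 0.5, 1.0],
    bounds=([-np.inf, 1e-6, 1e-6, 1e-6], np.inf),  # keep alpha, beta, gamma positive
)
log_c, alpha, beta, gamma = params
print(f"alpha={alpha:.2f} beta={beta:.2f} gamma={gamma:.1f}")
```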

Relation

A Two Parameters Equation for Word Rank-Frequency Relation

no code implementations 2 May 2022 Chenchen Ding

Let $f (\cdot)$ be the absolute frequency of words and $r$ be the rank of words in decreasing order of frequency, then the following function can fit the rank-frequency relation \[ f (r;s, t) = \left(\frac{r_{\tt max}}{r}\right)^{1-s} \left(\frac{r_{\tt max}+t \cdot r_{\tt exp}}{r+t \cdot r_{\tt exp}}\right)^{1+(1+t)s} \] where $r_{\tt max}$ and $r_{\tt exp}$ are the maximum and the expectation of the rank, respectively; $s>0$ and $t>0$ are parameters estimated from data.
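
The formula transcribes directly into code. A minimal sketch with assumed parameter values (note that $f(r_{\tt max}) = 1$ by construction, i.e. the least frequent word is predicted to occur exactly once):

```python
import numpy as np

def rank_freq(r, s, t, r_max, r_exp):
    """f(r; s, t): absolute word frequency predicted for rank r."""
    first = (r_max / r) ** (1.0 - s)
    second = ((r_max + t * r_exp) / (r + t * r_exp)) ** (1.0 + (1.0 + t) * s)
    return first * second

# Assumed values for illustration: a 50k-type vocabulary.
r = np.arange(1, 50001, dtype=float)
f = rank_freq(r, s=0.3, t=0.5, r_max=50000.0, r_exp=120.0)
print(f[:3])   # predicted frequencies of the three most frequent words
print(f[-1])   # f(r_max) = 1.0: the rarest word is predicted to occur once
```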

Relation

Transliteration of Foreign Words in Burmese

no code implementations 7 Oct 2021 Chenchen Ding

This manuscript provides a general description of the transliteration of foreign words in the Burmese language.

Transliteration

Improving Low-Resource NMT through Relevance Based Linguistic Features Incorporation

no code implementations COLING 2020 Abhisek Chakrabarty, Raj Dabre, Chenchen Ding, Masao Utiyama, Eiichiro Sumita

In this study, linguistic knowledge at different levels is incorporated into the neural machine translation (NMT) framework to improve translation quality for language pairs with extremely limited data.

Machine Translation NMT +1

A Three-Parameter Rank-Frequency Relation in Natural Languages

no code implementations ACL 2020 Chenchen Ding, Masao Utiyama, Eiichiro Sumita

We show that the rank-frequency relation in textual data follows $f \propto r^{-\alpha}(r+\gamma)^{-\beta}$, where $f$ is the token frequency and $r$ is the rank by frequency, with ($\alpha$, $\beta$, $\gamma$) as parameters.
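
A minimal sketch of producing the $(r, f)$ pairs this relation describes from raw text (the file name is a placeholder, and whitespace splitting stands in for proper tokenization):

```python
from collections import Counter

# "corpus.txt" is a placeholder path for any plain-text corpus.
with open("corpus.txt", encoding="utf-8") as fp:
    counts = Counter(fp.read().split())

freqs = sorted(counts.values(), reverse=True)
for r, f in enumerate(freqs[:10], start=1):
    print(r, f)  # rank r, token frequency f -- the (r, f) pairs to be fitted
```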

Relation

A Myanmar (Burmese)-English Named Entity Transliteration Dictionary

no code implementations LREC 2020 Aye Myat Mon, Chenchen Ding, Hour Kaing, Khin Mar Soe, Masao Utiyama, Eiichiro Sumita

For the Myanmar (Burmese) language, robust automatic transliteration for borrowed English words is a challenging task because of the complex Myanmar writing system and the lack of data.

Transliteration

Supervised and Unsupervised Machine Translation for Myanmar-English and Khmer-English

no code implementations WS 2019 Benjamin Marie, Hour Kaing, Aye Myat Mon, Chenchen Ding, Atsushi Fujita, Masao Utiyama, Eiichiro Sumita

This paper presents NICT's supervised and unsupervised machine translation systems for the WAT2019 Myanmar-English and Khmer-English translation tasks.

NMT Translation +1

Overview of the 6th Workshop on Asian Translation

no code implementations WS 2019 Toshiaki Nakazawa, Nobushige Doi, Shohei Higashiyama, Chenchen Ding, Raj Dabre, Hideya Mino, Isao Goto, Win Pa Pa, Anoop Kunchukuttan, Yusuke Oda, Shantipriya Parida, Ondřej Bojar, Sadao Kurohashi

This paper presents the results of the shared tasks from the 6th workshop on Asian translation (WAT2019) including Ja↔En, Ja↔Zh scientific paper translation subtasks, Ja↔En, Ja↔Ko, Ja↔En patent translation subtasks, Hi↔En, My↔En, Km↔En, Ta↔En mixed domain subtasks and Ru↔Ja news commentary translation task.

Translation

English-Myanmar Supervised and Unsupervised NMT: NICT's Machine Translation Systems at WAT-2019

no code implementations WS 2019 Rui Wang, Haipeng Sun, Kehai Chen, Chenchen Ding, Masao Utiyama, Eiichiro Sumita

This paper presents NICT's participation (team ID: NICT) in the 6th Workshop on Asian Translation (WAT-2019) shared translation task, specifically the Myanmar (Burmese)-English task in both translation directions.

Language Modelling Machine Translation +2

MY-AKKHARA: A Romanization-based Burmese (Myanmar) Input Method

no code implementations IJCNLP 2019 Chenchen Ding, Masao Utiyama, Eiichiro Sumita

MY-AKKHARA is an input method for Burmese text encoded in the Unicode standard, based on a commonly accepted Latin transcription.
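
A romanization-based input method of this kind can be sketched as a longest-match mapping from Latin key sequences to Unicode code points. The table below is a hypothetical fragment for illustration only; it is not the actual MY-AKKHARA transcription:

```python
# Toy table: Latin key sequences -> Burmese code points.
# NOT the real MY-AKKHARA scheme; entries are illustrative.
TABLE = {
    "k": "\u1000",   # က MYANMAR LETTER KA
    "kh": "\u1001",  # ခ MYANMAR LETTER KHA
    "m": "\u1019",   # မ MYANMAR LETTER MA
    "a": "\u102C",   # ာ MYANMAR VOWEL SIGN AA
    "i": "\u102D",   # ိ MYANMAR VOWEL SIGN I
}

def to_burmese(latin: str) -> str:
    """Longest-match left-to-right conversion of a Latin key sequence."""
    out, i = [], 0
    keys = sorted(TABLE, key=len, reverse=True)  # try longer keys first
    while i < len(latin):
        for key in keys:
            if latin.startswith(key, i):
                out.append(TABLE[key])
                i += len(key)
                break
        else:
            out.append(latin[i])  # pass through unmapped characters
            i += 1
    return "".join(out)

print(to_burmese("kami"))  # ကာမိ under this toy table
```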

Simplified Abugidas

no code implementations ACL 2018 Chenchen Ding, Masao Utiyama, Eiichiro Sumita

An abugida is a writing system where the consonant letters represent syllables with a default vowel and other vowels are denoted by diacritics.
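
The defining property is easy to see in Unicode terms; a short illustration with Devanagari (any abugida would serve):

```python
# A bare consonant letter carries a default vowel; other vowels
# (or vowel suppression) are written with combining marks.
ka = "\u0915"       # क DEVANAGARI LETTER KA: reads "ka" (default vowel a)
ki = ka + "\u093F"  # कि: vowel sign I overrides the default -> "ki"
k  = ka + "\u094D"  # क्: virama suppresses the vowel -> bare "k"
print(ka, ki, k)
```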

Sentence

Similar Southeast Asian Languages: Corpus-Based Case Study on Thai-Laotian and Malay-Indonesian

no code implementations WS 2016 Chenchen Ding, Masao Utiyama, Eiichiro Sumita

This paper illustrates the similarity between Thai and Laotian, and between Malay and Indonesian, based on an investigation of raw parallel data from the Asian Language Treebank.

Machine Translation Translation +1

Khmer Word Segmentation Using Conditional Random Fields

1 code implementation 15 Oct 2015 Vichet Chea, Ye Kyaw Thu, Chenchen Ding, Masao Utiyama, Andrew Finch, Eiichiro Sumita

The trained CRF segmenter was compared empirically to a baseline approach based on maximum matching that used a dictionary extracted from the manually segmented corpus.
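
A minimal sketch of the maximum-matching baseline (greedy longest dictionary match; the toy Latin-script dictionary here stands in for the one extracted from the segmented Khmer corpus):

```python
def max_match(text: str, dictionary: set[str], max_len: int = 10) -> list[str]:
    """Greedy left-to-right segmentation: take the longest dictionary word."""
    words, i = [], 0
    while i < len(text):
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in dictionary or j == i + 1:
                words.append(text[i:j])  # longest match, else a single char
                i = j
                break
    return words

print(max_match("thecatsat", {"the", "cat", "cats", "sat", "at"}))
# ['the', 'cats', 'at'] -- the greedy failure mode a trained CRF can avoid
```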

Segmentation Text Segmentation +1
