Search Results for author: Berlin Chen

Found 90 papers, 3 papers with code

A Study on Contextualized Language Modeling for Machine Reading Comprehension

no code implementations ROCLING 2021 Chin-Ying Wu, Yung-Chang Hsu, Berlin Chen

With the recent breakthroughs in deep learning technologies, research on machine reading comprehension (MRC) has attracted much attention and found versatile applications in many use cases.

Ensemble Learning Language Modelling +1

Building an Enhanced Autoregressive Document Retriever Leveraging Supervised Contrastive Learning

no code implementations ROCLING 2022 Yi-Cheng Wang, Tzu-Ting Yang, Hsin-Wei Wang, Yung-Chang Hsu, Berlin Chen

DSI dramatically simplifies the whole retrieval process by encoding all information about the document collection into the parameter space of a single Transformer model, on top of which DSI can in turn generate the relevant document identities (IDs) in an autoregressive manner in response to a user query.

Contrastive Learning Information Retrieval +1

A Preliminary Study on Automated Speaking Assessment of English as a Second Language (ESL) Students

no code implementations ROCLING 2022 Tzu-I Wu, Tien-Hong Lo, Fu-An Chao, Yao-Ting Sung, Berlin Chen

Due to the surge in global demand for English as a second language (ESL), the development of automated methods for grading speaking proficiency has gained considerable attention.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Automated Speaking Assessment of Conversation Tests with Novel Graph-based Modeling on Spoken Response Coherence

no code implementations 11 Sep 2024 Jiun-Ting Li, Bi-Cheng Yan, Tien-Hong Lo, Yi-Cheng Wang, Yung-Chang Hsu, Berlin Chen

Automated speaking assessment in conversation tests (ASAC) aims to evaluate the overall speaking proficiency of an L2 (second-language) speaker in a setting where an interlocutor interacts with one or more candidates.

Zero-Shot Text-to-Speech as Golden Speech Generator: A Systematic Framework and its Applicability in Automatic Pronunciation Assessment

no code implementations 11 Sep 2024 Tien-Hong Lo, Meng-Ting Tsai, Berlin Chen

Second language (L2) learners can improve their pronunciation by imitating golden speech, especially when the speech aligns with their respective speech characteristics.

Effective Noise-aware Data Simulation for Domain-adaptive Speech Enhancement Leveraging Dynamic Stochastic Perturbation

1 code implementation 3 Sep 2024 Chien-Chun Wang, Li-Wei Chen, Hung-Shin Lee, Berlin Chen, Hsin-Min Wang

Cross-domain speech enhancement (SE) is often faced with severe challenges due to the scarcity of noise and background information in an unseen target domain, leading to a mismatch between training and test conditions.

Speech Enhancement

Optimizing Automatic Speech Assessment: W-RankSim Regularization and Hybrid Feature Fusion Strategies

no code implementations 16 Jun 2024 Chung-Wen Wu, Berlin Chen

To address this challenge, we approach ASA as an ordinal classification task, introducing Weighted Vectors Ranking Similarity (W-RankSim) as a novel regularization technique.

Ordinal Classification

An Effective Automated Speaking Assessment Approach to Mitigating Data Scarcity and Imbalanced Distribution

no code implementations 11 Apr 2024 Tien-Hong Lo, Fu-An Chao, Tzu-I Wu, Yao-Ting Sung, Berlin Chen

Automated speaking assessment (ASA) typically involves automatic speech recognition (ASR) and hand-crafted feature extraction from the ASR transcript of a learner's speech.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

DANCER: Entity Description Augmented Named Entity Corrector for Automatic Speech Recognition

no code implementations 26 Mar 2024 Yi-Cheng Wang, Hsin-Wei Wang, Bi-Cheng Yan, Chi-Han Lin, Berlin Chen

End-to-end automatic speech recognition (E2E ASR) systems often suffer from mistranscription of domain-specific phrases, such as named entities, sometimes leading to catastrophic failures in downstream tasks.

Automatic Speech Recognition Language Modelling +2

Speech-Aware Neural Diarization with Encoder-Decoder Attractor Guided by Attention Constraints

no code implementations 21 Mar 2024 PeiYing Lee, HauYun Guo, Berlin Chen

End-to-End Neural Diarization with Encoder-Decoder based Attractor (EEND-EDA) is an end-to-end neural model for automatic speaker segmentation and labeling.

Decoder

An Effective Mixture-Of-Experts Approach For Code-Switching Speech Recognition Leveraging Encoder Disentanglement

no code implementations 27 Feb 2024 Tzu-Ting Yang, Hsin-Wei Wang, Yi-Cheng Wang, Chi-Han Lin, Berlin Chen

With the massive developments of end-to-end (E2E) neural networks, recent years have witnessed unprecedented breakthroughs in automatic speech recognition (ASR).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Leveraging Language ID to Calculate Intermediate CTC Loss for Enhanced Code-Switching Speech Recognition

no code implementations 15 Dec 2023 Tzu-Ting Yang, Hsin-Wei Wang, Berlin Chen

In recent years, end-to-end speech recognition has emerged as a technology that integrates the acoustic, pronunciation dictionary, and language model components of the traditional Automatic Speech Recognition model.

Automatic Speech Recognition Language Identification +3

Preserving Phonemic Distinctions for Ordinal Regression: A Novel Loss Function for Automatic Pronunciation Assessment

no code implementations 3 Oct 2023 Bi-Cheng Yan, Hsin-Wei Wang, Yi-Cheng Wang, Jiun-Ting Li, Chi-Han Lin, Berlin Chen

Automatic pronunciation assessment (APA) manages to quantify the pronunciation proficiency of a second language (L2) learner in a language.

regression

A Hierarchical Context-aware Modeling Approach for Multi-aspect and Multi-granular Pronunciation Assessment

no code implementations 29 May 2023 Fu-An Chao, Tien-Hong Lo, Tzu-I Wu, Yao-Ting Sung, Berlin Chen

Automatic Pronunciation Assessment (APA) plays a vital role in Computer-assisted Pronunciation Training (CAPT) when evaluating a second language (L2) learner's speaking proficiency.

Automatic Speech Recognition Multi-Task Learning +4

Geometric Learning of Hidden Markov Models via a Method of Moments Algorithm

no code implementations 2 Jul 2022 Berlin Chen, Cyrus Mostajeran, Salem Said

We present a novel algorithm for learning the parameters of hidden Markov models (HMMs) in a geometric setting where the observations take values in Riemannian manifolds.

Effective Cross-Utterance Language Modeling for Conversational Speech Recognition

no code implementations 5 Nov 2021 Bi-Cheng Yan, Hsin-Wei Wang, Shih-Hsuan Chiu, Hsuan-Sheng Chiu, Berlin Chen

Conversational speech typically has loose syntactic structures at the utterance level but simultaneously exhibits topical coherence relations across consecutive utterances.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Exploring Non-Autoregressive End-To-End Neural Modeling For English Mispronunciation Detection And Diagnosis

no code implementations 1 Nov 2021 Hsin-Wei Wang, Bi-Cheng Yan, Hsuan-Sheng Chiu, Yung-Chang Hsu, Berlin Chen

In addition, we design and develop a pronunciation modeling network stacked on top of the NAR E2E models of our method to further boost the effectiveness of MD&D.

Improving End-To-End Modeling for Mispronunciation Detection with Effective Augmentation Mechanisms

no code implementations 17 Oct 2021 Tien-Hong Lo, Yao-Ting Sung, Berlin Chen

Recently, end-to-end (E2E) models, which take spectral vector sequences of L2 (second-language) learners' utterances as input and produce the corresponding phone-level sequences as output, have attracted much research attention in developing mispronunciation detection (MD) systems.

Maximum F1-score training for end-to-end mispronunciation detection and diagnosis of L2 English speech

no code implementations 31 Aug 2021 Bi-Cheng Yan, Shao-Wei Fan Jiang, Fu-An Chao, Berlin Chen

End-to-end (E2E) neural models are increasingly attracting attention as a promising modeling approach for mispronunciation detection and diagnosis (MDD).

Data Augmentation

Cross-domain Single-channel Speech Enhancement Model with Bi-projection Fusion Module for Noise-robust ASR

no code implementations 26 Aug 2021 Fu-An Chao, Jeih-weih Hung, Berlin Chen

In recent decades, many studies have suggested that phase information is crucial for speech enhancement (SE), and time-domain single-channel speech enhancement techniques have shown promise in noise suppression and robust automatic speech recognition (ASR).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

TENET: A Time-reversal Enhancement Network for Noise-robust ASR

1 code implementation 4 Jul 2021 Fu-An Chao, Shao-Wei Fan Jiang, Bi-Cheng Yan, Jeih-weih Hung, Berlin Chen

Due to the unprecedented breakthroughs brought about by deep learning, speech enhancement (SE) techniques have been developed rapidly and play an important role prior to acoustic modeling to mitigate noise effects on speech.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Cross-utterance Reranking Models with BERT and Graph Convolutional Networks for Conversational Speech Recognition

no code implementations 13 Jun 2021 Shih-Hsuan Chiu, Tien-Hong Lo, Fu-An Chao, Berlin Chen

In view of this, we seek in this paper to represent the historical context information of an utterance as graph-structured data so as to distill cross-utterance, global word interaction relationships.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Innovative Bert-based Reranking Language Models for Speech Recognition

no code implementations 11 Apr 2021 Shih-Hsuan Chiu, Berlin Chen

More recently, Bidirectional Encoder Representations from Transformers (BERT) was proposed and has achieved impressive success on many natural language processing (NLP) tasks such as question answering and language understanding, due mainly to its effective pre-training then fine-tuning paradigm as well as strong local contextual modeling ability.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

The NTNU Taiwanese ASR System for Formosa Speech Recognition Challenge 2020

no code implementations IJCLCLP 2021 Fu-An Chao, Tien-Hong Lo, Shi-Yan Weng, Shih-Hsuan Chiu, Yao-Ting Sung, Berlin Chen

This paper describes the NTNU ASR system participating in the Formosa Speech Recognition Challenge 2020 (FSR-2020) supported by the Formosa Speech in the Wild project (FSW).

Data Augmentation Speech Enhancement +3

End-to-End Mispronunciation Detection and Diagnosis From Raw Waveforms

no code implementations 4 Mar 2021 Bi-Cheng Yan, Berlin Chen

Furthermore, our model can achieve mispronunciation detection performance comparable to that of state-of-the-art E2E MDD models that take standard handcrafted acoustic features as input.

Effective Decoder Masking for Transformer Based End-to-End Speech Recognition

no code implementations 27 Oct 2020 Shi-Yan Weng, Berlin Chen

The attention-based encoder-decoder modeling paradigm has achieved promising results on a variety of speech processing tasks such as automatic speech recognition (ASR) and text-to-speech (TTS), among others.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Effective FAQ Retrieval and Question Matching With Unsupervised Knowledge Injection

no code implementations 27 Oct 2020 Wen-Ting Tseng, Tien-Hong Lo, Yung-Chang Hsu, Berlin Chen

To this end, predominant approaches to FAQ retrieval typically rank question-answer pairs by considering either the similarity between the query and a question (q-Q), the relevance between the query and the associated answer of a question (q-A), or combining the clues gathered from the q-Q similarity measure and the q-A relevance measure.

Language Modelling Retrieval +1

An Effective Contextual Language Modeling Framework for Speech Summarization with Augmented Features

no code implementations 1 Jun 2020 Shi-Yan Weng, Tien-Hong Lo, Berlin Chen

Tremendous amounts of multimedia associated with speech information are driving an urgent need to develop efficient and effective automatic summarization methods.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +5

An Effective End-to-End Modeling Approach for Mispronunciation Detection

no code implementations 18 May 2020 Tien-Hong Lo, Shi-Yan Weng, Hsiu-jui Chang, Berlin Chen

Recently, end-to-end (E2E) automatic speech recognition (ASR) systems have garnered tremendous attention because of their great success and unified modeling paradigms in comparison to conventional hybrid DNN-HMM ASR systems.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

The NTNU System at the Interspeech 2020 Non-Native Children's Speech ASR Challenge

no code implementations 18 May 2020 Tien-Hong Lo, Fu-An Chao, Shi-Yan Weng, Berlin Chen

This paper describes the NTNU ASR system participating in the Interspeech 2020 Non-Native Children's Speech ASR Challenge supported by the SIG-CHILD group of ISCA.

Data Augmentation Diversity +1

Looking for ELMo's friends: Sentence-Level Pretraining Beyond Language Modeling

no code implementations ICLR 2019 Samuel R. Bowman, Ellie Pavlick, Edouard Grave, Benjamin Van Durme, Alex Wang, Jan Hula, Patrick Xia, Raghavendra Pappagari, R. Thomas McCoy, Roma Patel, Najoung Kim, Ian Tenney, Yinghui Huang, Katherin Yu, Shuning Jin, Berlin Chen

Work on the problem of contextualized word representation—the development of reusable neural network components for sentence understanding—has recently seen a surge of progress centered on the unsupervised pretraining task of language modeling with methods like ELMo (Peters et al., 2018).

Language Modelling Sentence

Learning to Distill: The Essence Vector Modeling Framework

no code implementations COLING 2016 Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang

The D-EV model not only inherits the advantages of the EV model but also can infer a more robust representation for a given spoken paragraph against imperfect speech recognition.

Denoising Document Embedding +6

Novel Word Embedding and Translation-based Language Modeling for Extractive Speech Summarization

no code implementations 22 Jul 2016 Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang, Hsin-Hsi Chen

Word embedding methods revolve around learning continuous distributed vector representations of words with neural networks, which can capture semantic and/or syntactic cues, and in turn be used to induce similarity measures among words, sentences and documents in context.

Language Modelling Representation Learning +1

Improved Spoken Document Summarization with Coverage Modeling Techniques

no code implementations 20 Jan 2016 Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang

Apart from MMR, there is a dearth of research concentrating on reducing redundancy or increasing diversity for the spoken document summarization task, as far as we are aware.

Diversity Document Summarization +2

Leveraging Word Embeddings for Spoken Document Summarization

no code implementations 14 Jun 2015 Kuan-Yu Chen, Shih-Hung Liu, Hsin-Min Wang, Berlin Chen, Hsin-Hsi Chen

Owing to the rapidly growing multimedia content available on the Internet, extractive spoken document summarization, with the purpose of automatically selecting a set of representative sentences from a spoken document to concisely express the most important theme of the document, has been an active area of research and experimentation.

Document Summarization Sentence +1
