Search Results for author: Anton Ragni

Found 17 papers, 6 papers with code

How Much Context Does My Attention-Based ASR System Need?

1 code implementation • 24 Oct 2023 • Robert Flynn, Anton Ragni

For the task of speech recognition, the use of more than 30 seconds of acoustic context during training is uncommon and under-investigated in the literature.

Speech Recognition

Energy-Based Models For Speech Synthesis

no code implementations • 19 Oct 2023 • Wanli Sun, Zehai Tu, Anton Ragni

It also describes how sampling from EBMs can be performed using Langevin Markov Chain Monte Carlo (MCMC).

Speech Synthesis
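As a loose illustration of the Langevin MCMC sampling this entry refers to (a minimal sketch, not the paper's implementation), one could draw samples from an energy-based model as follows, assuming a hypothetical differentiable `energy` function and PyTorch autograd:

```python
import torch

def langevin_sample(energy, x0, n_steps=100, step_size=0.01):
    """Draw an approximate sample from p(x) ∝ exp(-energy(x)) with Langevin MCMC.

    `energy` is a hypothetical stand-in for a trained EBM: a callable mapping a
    batch of inputs to per-example scalar energies.
    """
    x = x0.clone().requires_grad_(True)
    for _ in range(n_steps):
        grad = torch.autograd.grad(energy(x).sum(), x)[0]
        noise = torch.randn_like(x)
        # Langevin update: a small gradient step down the energy plus Gaussian noise.
        x = (x - 0.5 * step_size * grad + step_size ** 0.5 * noise).detach().requires_grad_(True)
    return x.detach()

# Toy usage: a quadratic energy, i.e. sampling from a standard Gaussian.
samples = langevin_sample(lambda x: 0.5 * (x ** 2).sum(dim=-1), torch.zeros(16, 8))
```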

On the Effectiveness of Speech Self-supervised Learning for Music

no code implementations • 11 Jul 2023 • Yinghao Ma, Ruibin Yuan, Yizhi Li, Ge Zhang, Xingran Chen, Hanzhi Yin, Chenghua Lin, Emmanouil Benetos, Anton Ragni, Norbert Gyenge, Ruibo Liu, Gus Xia, Roger Dannenberg, Yike Guo, Jie Fu

Our findings suggest that training with music data can generally improve performance on MIR tasks, even when models are trained using paradigms designed for speech.

Information Retrieval, Music Information Retrieval +2

Leveraging Cross-Utterance Context For ASR Decoding

no code implementations • 29 Jun 2023 • Robert Flynn, Anton Ragni

While external language models (LMs) are often incorporated into the decoding stage of automated speech recognition systems, these models usually operate with limited context.

Speech Recognition

HERB: Measuring Hierarchical Regional Bias in Pre-trained Language Models

1 code implementation • 5 Nov 2022 • Yizhi Li, Ge Zhang, Bohao Yang, Chenghua Lin, Shi Wang, Anton Ragni, Jie Fu

In addition to verifying the existence of regional bias in LMs, we find that the biases on regional groups can be strongly influenced by the geographical clustering of the groups.

Fairness

Approximate Fixed-Points in Recurrent Neural Networks

no code implementations • 4 Jun 2021 • Zhengxiong Wang, Anton Ragni

Although exact fixed-points inherit the same parallelization and inconsistency issues, this paper shows that approximate fixed-points can be computed in parallel and used consistently in training and inference including tasks such as lattice rescoring.
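The claim about computing approximate fixed-points in parallel can be pictured as a Jacobi-style sweep in which every timestep is updated simultaneously from the previous sweep's hidden states rather than sequentially; the sketch below is one illustrative reading of that idea under assumed tanh-RNN dynamics, not the paper's exact algorithm:

```python
import numpy as np

def parallel_fixed_point_rnn(x, W_in, W_rec, n_sweeps=10):
    """Approximate the hidden states of h_t = tanh(W_in x_t + W_rec h_{t-1})
    by repeated parallel sweeps over all timesteps (illustrative sketch only)."""
    T = x.shape[0]
    hidden_dim = W_rec.shape[0]
    h = np.zeros((T, hidden_dim))
    for _ in range(n_sweeps):
        h_prev = np.vstack([np.zeros((1, hidden_dim)), h[:-1]])  # states shifted by one step
        h = np.tanh(x @ W_in.T + h_prev @ W_rec.T)               # every timestep updated at once
    return h

rng = np.random.default_rng(0)
states = parallel_fixed_point_rnn(rng.normal(size=(20, 4)),
                                  W_in=0.1 * rng.normal(size=(8, 4)),
                                  W_rec=0.1 * rng.normal(size=(8, 8)))
```

After as many sweeps as there are timesteps this iteration reproduces the sequential recurrence exactly; stopping earlier yields approximate states that can be computed in parallel and reused consistently across training and inference.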

Continuous representations of intents for dialogue systems

no code implementations • 8 May 2021 • Sindre André Jacobsen, Anton Ragni

Finally, this paper will show how the proposed model can be augmented with unseen intents without retraining any of the seen ones.

Intent Detection, Zero-Shot Learning
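One way to picture how continuous intent representations admit unseen intents without retraining (an illustrative sketch assuming cosine-similarity classification, not the paper's exact model) is to classify an utterance against a set of intent embeddings that can simply be extended with new entries:

```python
import numpy as np

def classify_intent(utterance_vec, intent_embeddings):
    """Return the intent whose continuous representation is closest (cosine) to the utterance."""
    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(intent_embeddings, key=lambda name: cosine(utterance_vec, intent_embeddings[name]))

rng = np.random.default_rng(0)
intents = {"book_flight": rng.normal(size=16), "play_music": rng.normal(size=16)}
intents["set_alarm"] = rng.normal(size=16)  # an "unseen" intent added without retraining anything
print(classify_intent(rng.normal(size=16), intents))
```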

Bi-Directional Lattice Recurrent Neural Networks for Confidence Estimation

4 code implementations • 30 Oct 2018 • Qiujia Li, Preben Ness, Anton Ragni, Mark Gales

The standard approach to mitigate errors made by an automatic speech recognition system is to use confidence scores associated with each predicted word.

Automatic Speech Recognition (ASR) +3
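The per-word confidence mechanism described above can be reduced to a toy example: words whose scores fall below a threshold are flagged for correction or rejection (the words, scores, and threshold here are made up purely for illustration):

```python
def flag_low_confidence(words, confidences, threshold=0.7):
    """Return the hypothesis words whose confidence score falls below the threshold."""
    return [(w, c) for w, c in zip(words, confidences) if c < threshold]

hypothesis = ["the", "cat", "sat", "on", "the", "mat"]
scores = [0.95, 0.91, 0.42, 0.88, 0.97, 0.55]
print(flag_low_confidence(hypothesis, scores))  # [('sat', 0.42), ('mat', 0.55)]
```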

Confidence Estimation and Deletion Prediction Using Bidirectional Recurrent Neural Networks

no code implementations • 30 Oct 2018 • Anton Ragni, Qiujia Li, Mark Gales, Yu Wang

These errors are not accounted for by the standard confidence estimation schemes and are hard to rectify in the upstream and downstream processing.

Phonetic and Graphemic Systems for Multi-Genre Broadcast Transcription

no code implementations • 1 Feb 2018 • Yu Wang, Xie Chen, Mark Gales, Anton Ragni, Jeremy Wong

As the combination approaches become more complicated, the difference between the phonetic and graphemic systems further decreases.

Automatic Speech Recognition (ASR) +1

Future Word Contexts in Neural Network Language Models

no code implementations • 18 Aug 2017 • Xie Chen, Xunying Liu, Anton Ragni, Yu Wang, Mark Gales

Instead of using a recurrent unit to capture the complete future word contexts, a feedforward unit is used to model a finite number of succeeding (future) words.

Speech Recognition
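The idea of modelling a fixed window of succeeding words with a feedforward unit, as described in this entry, can be sketched roughly as follows (a minimal PyTorch illustration assuming a GRU history encoder, not the paper's exact architecture):

```python
import torch
import torch.nn as nn

class FutureContextLM(nn.Module):
    """Toy next-word model: a recurrent unit encodes the history, while a feedforward
    unit summarises a fixed number of succeeding (future) words."""

    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128, future_window=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.history_rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.future_ff = nn.Linear(future_window * embed_dim, hidden_dim)
        self.out = nn.Linear(2 * hidden_dim, vocab_size)

    def forward(self, history_ids, future_ids):
        # history_ids: (batch, t) past words; future_ids: (batch, future_window) succeeding words.
        _, h = self.history_rnn(self.embed(history_ids))         # final history state: (1, batch, hidden)
        fut = self.future_ff(self.embed(future_ids).flatten(1))  # feedforward unit over the future window
        return self.out(torch.cat([h[-1], fut], dim=-1))         # logits for the current word

model = FutureContextLM(vocab_size=1000)
logits = model(torch.randint(0, 1000, (4, 10)), torch.randint(0, 1000, (4, 2)))
```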

Incorporating Uncertainty into Deep Learning for Spoken Language Assessment

no code implementations • ACL 2017 • Andrey Malinin, Anton Ragni, Kate Knill, Mark Gales

In experiments conducted on data from the Business Language Testing Service (BULATS), the proposed approach is found to outperform GPs and DNNs with MCD in uncertainty-based rejection whilst achieving comparable grading performance.
