no code implementations • 13 Sep 2023 • Shucong Zhang, Huiyuan Wang, Wei Lin
High-dimensional compositional data are prevalent in many applications.
no code implementations • 11 Sep 2023 • Titouan Parcollet, Ha Nguyen, Solene Evain, Marcely Zanon Boito, Adrien Pupier, Salima Mdhaffar, Hang Le, Sina Alisamir, Natalia Tomashenko, Marco Dinarelli, Shucong Zhang, Alexandre Allauzen, Maximin Coavoux, Yannick Esteve, Mickael Rouvier, Jerome Goulian, Benjamin Lecouteux, Francois Portet, Solange Rossato, Fabien Ringeval, Didier Schwab, Laurent Besacier
Self-supervised learning (SSL) is at the origin of unprecedented improvements in many different domains including computer vision and natural language processing.
1 code implementation • 12 Jul 2023 • Titouan Parcollet, Rogier Van Dalen, Shucong Zhang, Sourav Bhattacharya
Unfortunately, token mixing with self-attention takes quadratic time in the length of the speech utterance, slowing down inference as well as training and increasing memory consumption.
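The quadratic cost mentioned here can be seen directly in a naive self-attention implementation: the score matrix has one entry per pair of frames, so it holds T×T values for a length-T utterance. The sketch below is a minimal illustrative example (single head, identity projections), not the paper's actual model.

```python
import numpy as np

def self_attention(x):
    """Naive single-head self-attention over a length-T sequence.

    x: (T, d) array. The score matrix is (T, T), so time and memory
    grow quadratically with the utterance length T.
    """
    T, d = x.shape
    q, k, v = x, x, x              # identity projections for brevity
    scores = q @ k.T / np.sqrt(d)  # (T, T): the quadratic bottleneck
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax rows
    return weights @ v             # (T, d) mixed representation

# Doubling the utterance length quadruples the score-matrix size.
for T in (100, 200):
    x = np.random.randn(T, 80)     # e.g. 80-dim filterbank frames
    out = self_attention(x)
    print(T, out.shape, "score entries:", T * T)
```

Since every frame attends to every other frame, halving this cost requires changing the token-mixing mechanism itself, which is what linear-time alternatives to self-attention aim to do.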
no code implementations • 8 Nov 2022 • Shucong Zhang, Malcolm Chadwick, Alberto Gil C. P. Ramos, Sourav Bhattacharya
Personalised speech enhancement (PSE), which extracts only the speech of a target user and removes everything else from a recorded audio clip, can potentially improve users' experiences of audio AI modules deployed in the wild.
no code implementations • 11 Mar 2022 • Mohan Li, Shucong Zhang, Catalin Zorila, Rama Doddipatla
In this paper, we propose an online attention mechanism, known as cumulative attention (CA), for streaming Transformer-based automatic speech recognition (ASR).
Automatic Speech Recognition (ASR) +1
no code implementations • 9 Feb 2021 • Shucong Zhang, Cong-Thanh Do, Rama Doddipatla, Erfan Loweimi, Peter Bell, Steve Renals
Although the lower layers of a deep neural network learn features which are transferable across datasets, these layers are not transferable within the same dataset.
Automatic Speech Recognition (ASR) +2
no code implementations • 8 Nov 2020 • Shucong Zhang, Erfan Loweimi, Peter Bell, Steve Renals
Self-attention models such as Transformers, which can capture temporal relationships without being limited by the distance between events, have given competitive speech recognition results.
Automatic Speech Recognition (ASR) +1
1 code implementation • 8 Nov 2020 • Shucong Zhang, Erfan Loweimi, Peter Bell, Steve Renals
To the best of our knowledge, we have achieved state-of-the-art performance for end-to-end Transformer-based models on Switchboard and AMI.
Automatic Speech Recognition (ASR) +1
no code implementations • 28 May 2020 • Shucong Zhang, Erfan Loweimi, Peter Bell, Steve Renals
Recently, self-attention models such as Transformers have given competitive results compared to recurrent neural network systems in speech recognition.
no code implementations • 25 Sep 2019 • Shucong Zhang, Cong-Thanh Do, Rama Doddipatla, Erfan Loweimi, Peter Bell, Steve Renals
Interpreting the top layers as a classifier and the lower layers as a feature extractor, one can hypothesize that unwanted network convergence may occur when the classifier has overfit with respect to the feature extractor.