no code implementations • 7 Feb 2025 • Wolfgang Mack, Ahmed Mustafa, Rafał Łaganowski, Samer Hijazy
Neural codecs, comprising an encoder, quantizer, and decoder, enable signal transmission at exceptionally low bitrates.
no code implementations • 28 Aug 2024 • Muhammad Tahir Rafique, Ahmed Mustafa, Hasan Sajid
The growing demand for road use in urban areas has led to significant traffic congestion, posing challenges that are costly to mitigate through infrastructure expansion alone.
no code implementations • 27 Aug 2024 • Ahmed Mustafa, Muhammad Tahir Rafique, Muhammad Ijlal Baig, Hasan Sajid, Muhammad Jawad Khan, Karam Dad Kallu
This research paper introduces a novel word-level Optical Character Recognition (OCR) model specifically designed for digital Urdu text, leveraging transformer-based architectures and attention mechanisms to address the distinct challenges of Urdu script recognition, including its diverse text styles, fonts, and variations.
2 code implementations • 31 May 2024 • Jean-Marc Valin, Ahmed Mustafa, Jan Büthe
Neural vocoders are now being used in a wide range of speech processing applications.
1 code implementation • 25 Sep 2023 • Jan Büthe, Ahmed Mustafa, Jean-Marc Valin, Karim Helwani, Michael M. Goodwin
Speech codec enhancement methods are designed to remove distortions added by speech codecs.
1 code implementation • 13 Jul 2023 • Jan Büthe, Jean-Marc Valin, Ahmed Mustafa
Classical speech coding uses low-complexity postfilters with zero lookahead to enhance the quality of coded speech, but their effectiveness is limited by their simplicity.
no code implementations • 8 Dec 2022 • Jean-Marc Valin, Jan Büthe, Ahmed Mustafa, Michael Klingbeil
Despite recent advancements in packet loss concealment (PLC) using deep learning techniques, packet loss remains a significant challenge in real-time speech communication.
no code implementations • 8 Dec 2022 • Ahmed Mustafa, Jean-Marc Valin, Jan Büthe, Paris Smaragdis, Mike Goodwin
GAN vocoders are currently one of the state-of-the-art methods for building high-quality neural waveform generative models.
1 code implementation • 11 May 2022 • Jean-Marc Valin, Ahmed Mustafa, Christopher Montgomery, Timothy B. Terriberry, Michael Klingbeil, Paris Smaragdis, Arvindh Krishnaswamy
As deep speech enhancement algorithms have recently demonstrated capabilities greatly surpassing their traditional counterparts for suppressing noise, reverberation and echo, attention is turning to the problem of packet loss concealment (PLC).
no code implementations • 9 Aug 2021 • Ahmed Mustafa, Jan Büthe, Srikanth Korse, Kishan Gupta, Guillaume Fuchs, Nicola Pia
Recently, GAN vocoders have seen rapid progress in speech synthesis, starting to outperform autoregressive models in perceptual quality with much higher generation speed.
2 code implementations • 3 Nov 2020 • Ahmed Mustafa, Nicola Pia, Guillaume Fuchs
In recent years, neural vocoders have surpassed classical speech generation approaches in naturalness and perceptual quality of the synthesized speech.
no code implementations • 1 Jul 2019 • Ahmed Mustafa, Arijit Biswas, Christian Bergler, Julia Schottenhamml, Andreas Maier
Recently, autoregressive deep generative models such as WaveNet and SampleRNN have been used as speech vocoders to scale up the perceptual quality of the reconstructed signals without increasing the coding rate.