1 code implementation • 15 Jan 2024 • Nicolae-Catalin Ristea, Andrei Anghel, Radu Tudor Ionescu
Subsequently, we combine language-specific Bidirectional Encoder Representations from Transformers (BERT) with Wav2Vec2.0 audio features via a novel cascaded cross-modal transformer (CCMT).
Automatic Speech Recognition (ASR) +2
1 code implementation • 22 Sep 2023 • Ross Cutler, Ando Saabas, Tanel Parnamaa, Marju Purin, Evgenii Indenbom, Nicolae-Catalin Ristea, Jegor Gužvin, Hannes Gamper, Sebastian Braun, Robert Aichner
This is the fourth AEC challenge and it is enhanced by adding a second track for personalized acoustic echo cancellation, reducing the algorithmic + buffering latency to 20ms, as well as including a full-band version of AECMOS.
2 code implementations • 14 Sep 2023 • Babak Naderi, Ross Cutler, Nicolae-Catalin Ristea
The commonly used standard ITU-T Rec.
1 code implementation • 6 Sep 2023 • Codrut Rotaru, Nicolae-Catalin Ristea, Radu Tudor Ionescu
We introduce RoDia, the first dataset for Romanian dialect identification from speech.
1 code implementation • 31 Aug 2023 • Neelu Madan, Nicolae-Catalin Ristea, Kamal Nasrollahi, Thomas B. Moeslund, Radu Tudor Ionescu
In this paper, we propose a curriculum learning approach that updates the masking strategy to continually increase the complexity of the self-supervised reconstruction task.
no code implementations • 27 Jul 2023 • Nicolae-Catalin Ristea, Radu Tudor Ionescu
We propose a novel cascaded cross-modal transformer (CCMT) that combines speech and text transcripts to detect customer requests and complaints in phone conversations.
Automatic Speech Recognition (ASR) +1
1 code implementation • CVPR 2024 • Nicolae-Catalin Ristea, Florinel-Alin Croitoru, Radu Tudor Ionescu, Marius Popescu, Fahad Shahbaz Khan, Mubarak Shah
We propose an efficient abnormal event detection model based on a lightweight masked auto-encoder (AE) applied at the video frame level.
Ranked #12 on Anomaly Detection on UCSD Ped2
no code implementations • 13 Jun 2023 • Nicolae-Catalin Ristea, Andrei Anghel, Mihai Datcu
Sea ice is a crucial component of the Earth's climate system and is highly sensitive to changes in temperature and atmospheric conditions.
no code implementations • 5 Jun 2023 • Evgenii Indenbom, Nicolae-Catalin Ristea, Ando Saabas, Tanel Parnamaa, Jegor Guzvin, Ross Cutler
Acoustic echo cancellation (AEC), noise suppression (NS) and dereverberation (DR) are an integral part of modern full-duplex communication systems.
1 code implementation • 28 Nov 2022 • Florinel-Alin Croitoru, Nicolae-Catalin Ristea, Dana Dascalescu, Radu Tudor Ionescu, Fahad Shahbaz Khan, Mubarak Shah
We propose a very fast frame-level model for anomaly detection in video, which learns to detect anomalies by distilling knowledge from multiple highly accurate object-level teacher models.
Ranked #20 on Anomaly Detection on CUHK Avenue
1 code implementation • 25 Sep 2022 • Neelu Madan, Nicolae-Catalin Ristea, Radu Tudor Ionescu, Kamal Nasrollahi, Fahad Shahbaz Khan, Thomas B. Moeslund, Mubarak Shah
In this work, we extend our previous self-supervised predictive convolutional attentive block (SSPCAB) with a 3D masked convolutional layer, a transformer for channel-wise attention, as well as a novel self-supervised objective based on Huber loss.
Ranked #5 on Anomaly Detection on CUHK Avenue
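As a rough illustration of the Huber-based objective mentioned in this entry, the sketch below computes a mean Huber loss between a reconstruction and its target. It shows only the standard Huber definition; the function names, the `delta` default, and the use of flat Python lists are assumptions made for illustration and are not taken from the paper's implementation.

```python
def huber(residual, delta=1.0):
    """Standard Huber penalty: quadratic near zero, linear in the tails."""
    magnitude = abs(residual)
    if magnitude <= delta:
        return 0.5 * magnitude * magnitude
    return delta * (magnitude - 0.5 * delta)


def mean_huber_loss(predicted, target, delta=1.0):
    """Average Huber penalty over paired values (hypothetical helper)."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    if not predicted:
        raise ValueError("inputs must be non-empty")
    total = sum(huber(p - t, delta) for p, t in zip(predicted, target))
    return total / len(predicted)
```

Compared with a squared error, the linear tail makes the loss less sensitive to a few badly reconstructed values, which is the usual motivation for choosing Huber loss.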
1 code implementation • 18 May 2022 • Florinel-Alin Croitoru, Nicolae-Catalin Ristea, Radu Tudor Ionescu, Nicu Sebe
In this work, we propose a novel curriculum learning approach termed Learning Rate Curriculum (LeRaC), which leverages the use of a different learning rate for each layer of a neural network to create a data-agnostic curriculum during the initial training epochs.
Ranked #4 on Speech Emotion Recognition on CREMA-D
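The idea of assigning a different learning rate to each layer during the initial epochs can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' schedule: the log-scale interpolation, the function name `layer_learning_rates`, and the parameters `base_lr`, `min_lr` and `warmup_epochs` are all hypothetical choices made for illustration.

```python
def layer_learning_rates(num_layers, epoch, base_lr=1e-3, min_lr=1e-6,
                         warmup_epochs=5):
    """Return one learning rate per layer (index 0 is closest to the input).

    Assumed schedule: at epoch 0, deeper layers get smaller rates (down to
    min_lr); all rates grow toward base_lr and meet it after warmup_epochs.
    """
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    if warmup_epochs <= 0:
        raise ValueError("warmup_epochs must be positive")
    # Fraction of the curriculum phase completed, clamped to [0, 1].
    progress = min(max(epoch / warmup_epochs, 0.0), 1.0)
    rates = []
    for index in range(num_layers):
        # Relative depth in [0, 1]; a single-layer model has depth 0.
        depth = index / (num_layers - 1) if num_layers > 1 else 0.0
        # Starting rate, interpolated on a log scale between base_lr and min_lr.
        start = base_lr * (min_lr / base_lr) ** depth
        # Current rate, interpolated on a log scale between start and base_lr.
        rates.append(start * (base_lr / start) ** progress)
    return rates
```

In a training loop, the returned list would be used to set the rate of each layer's parameter group at the start of every epoch; once `epoch` reaches `warmup_epochs`, every layer trains with `base_lr`, so the curriculum affects only the initial epochs and does not depend on the data.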
no code implementations • 9 Apr 2022 • Nicolae-Catalin Ristea, Andrei Anghel, Mihai Datcu, Bertrand Chapron
Overall, we encourage the development of data-centric approaches, showing that data preprocessing can bring significant performance improvements over existing deep learning models.
1 code implementation • 8 Apr 2022 • Mariana-Iuliana Georgescu, Radu Tudor Ionescu, Andreea-Iuliana Miron, Olivian Savencu, Nicolae-Catalin Ristea, Nicolae Verga, Fahad Shahbaz Khan
Our attention module uses the convolution operation to perform joint spatial-channel attention on multiple concatenated input tensors, where the kernel (receptive field) size controls the reduction rate of the spatial attention, and the number of convolutional filters controls the reduction rate of the channel attention, respectively.
Ranked #1 on Image Super-Resolution on IXI
1 code implementation • 17 Mar 2022 • Nicolae-Catalin Ristea, Radu Tudor Ionescu, Fahad Shahbaz Khan
Following the successful application of vision transformers in multiple computer vision tasks, these models have drawn the attention of the signal processing community.
Ranked #1 on Time Series Analysis on Speech Commands
4 code implementations • CVPR 2022 • Nicolae-Catalin Ristea, Neelu Madan, Radu Tudor Ionescu, Kamal Nasrollahi, Fahad Shahbaz Khan, Thomas B. Moeslund, Mubarak Shah
Our block is equipped with a loss that minimizes the reconstruction error with respect to the masked area in the receptive field.
Ranked #1 on Anomaly Detection on CUHK Avenue (TBDC metric)
1 code implementation • 12 Oct 2021 • Nicolae-Catalin Ristea, Andreea-Iuliana Miron, Olivian Savencu, Mariana-Iuliana Georgescu, Nicolae Verga, Fahad Shahbaz Khan, Radu Tudor Ionescu
Our neural model can be trained on unpaired images, due to the integration of a multi-level cycle-consistency loss.
no code implementations • 22 Mar 2021 • Nicolae-Catalin Ristea, Radu Tudor Ionescu
Instead of just combining the models, we propose a self-paced ensemble learning scheme in which models learn from each other over several iterations.
Ranked #6 on Speech Emotion Recognition on CREMA-D
no code implementations • 29 Feb 2020 • Nicolae-Catalin Ristea, Liviu Cristian Dutu, Anamaria Radoi
In order to increase the accuracy of the recognition system, we also analyze the speech data and fuse the information coming from both sources, i.e., visual and audio.
1 code implementation • 2 Feb 2020 • Mariana-Iuliana Georgescu, Radu Tudor Ionescu, Nicolae-Catalin Ristea, Nicu Sebe
In order to classify linearly non-separable data, neurons are typically organized into multi-layer neural networks that are equipped with at least one hidden layer.
Ranked #8 on Speech Emotion Recognition on CREMA-D