Audio
142 benchmarks • 83 tasks • 200 datasets • 1691 papers with code
340 benchmarks
3546 papers with code
Text Classification
164 benchmarks
1224 papers with code
Graph Classification
74 benchmarks
435 papers with code
Audio Classification
29 benchmarks
164 papers with code
Medical Image Classification
10 benchmarks
156 papers with code
See all 21 tasks
2D Semantic Segmentation
Image Segmentation
9 benchmarks
1811 papers with code
Text Style Transfer
3 benchmarks
91 papers with code
Scene Parsing
66 benchmarks
80 papers with code
2D Semantic Segmentation
108 benchmarks
41 papers with code
Reflection Removal
5 benchmarks
32 papers with code
See all 16 tasks
Speech Recognition
Speech Recognition
175 benchmarks
1251 papers with code
Automatic Speech Recognition (ASR)
13 benchmarks
557 papers with code
Visual Speech Recognition
10 benchmarks
49 papers with code
Robust Speech Recognition
24 papers with code
Target Speaker Extraction
15 papers with code
See all 12 tasks
Few-Shot Learning
Few-Shot Learning
66 benchmarks
1183 papers with code
One-Shot Learning
1 benchmark
99 papers with code
Few-Shot Semantic Segmentation
17 benchmarks
89 papers with code
Cross-Domain Few-Shot
10 benchmarks
68 papers with code
Unsupervised Few-Shot Learning
13 papers with code
See all 13 tasks
2D Classification
Style Transfer
3 benchmarks
706 papers with code
Neural Rendering
176 papers with code
Voice Conversion
3 benchmarks
162 papers with code
Neural Network Compression
1 benchmark
77 papers with code
Cell Detection
4 benchmarks
56 papers with code
See all 22 tasks
Text-to-Image Generation
15 benchmarks
394 papers with code
Conformal Prediction
200 papers with code
Text Simplification
11 benchmarks
124 papers with code
Self-Supervised Image Classification
3 benchmarks
93 papers with code
Music Source Separation
3 benchmarks
56 papers with code
See all 17 tasks
Emotion Recognition
Emotion Recognition
58 benchmarks
538 papers with code
Speech Emotion Recognition
18 benchmarks
123 papers with code
Emotion Recognition in Conversation
12 benchmarks
79 papers with code
Multimodal Emotion Recognition
3 benchmarks
68 papers with code
Emotion-Cause Pair Extraction
2 benchmarks
21 papers with code
See all 13 tasks
Speech Synthesis
Speech Synthesis
17 benchmarks
334 papers with code
Expressive Speech Synthesis
13 papers with code
Emotional Speech Synthesis
6 papers with code
text-to-speech translation
2 papers with code
Speech Synthesis - Assamese
1 benchmark
1 paper with code
See all 16 tasks
Accented Speech Recognition
Speech Synthesis
17 benchmarks
334 papers with code
Speech Enhancement
Speech Enhancement
24 benchmarks
251 papers with code
Bandwidth Extension
6 benchmarks
19 papers with code
Speech Dereverberation
5 benchmarks
19 papers with code
Packet Loss Concealment
4 papers with code
Speech Intelligibility Evaluation
DeepFake Detection
DeepFake Detection
10 benchmarks
190 papers with code
Synthetic Speech Detection
9 papers with code
Human Detection of Deepfakes
1 paper with code
Multimodal Forgery Detection
1 benchmark
1 paper with code
Audio Classification
Audio Classification
29 benchmarks
164 papers with code
Environmental Sound Classification
3 benchmarks
24 papers with code
Audio Multiple Target Classification
1 paper with code
Semi-supervised Audio Classification
1 paper with code
Language Identification
Language Identification
7 benchmarks
134 papers with code
Dialect Identification
33 papers with code
Native Language Identification
1 benchmark
5 papers with code
Music Generation
Music Generation
1 benchmark
161 papers with code
Music Performance Rendering
5 papers with code
Music Texture Transfer
1 paper with code
Voice Conversion
Voice Conversion
3 benchmarks
162 papers with code
Audio Generation
Audio Generation
7 benchmarks
90 papers with code
Voice Cloning
27 papers with code
Audio Super-Resolution
4 benchmarks
14 papers with code
Room Impulse Response (RIR)
14 papers with code
Video-to-Sound Generation
Text-To-Speech Synthesis
Text-To-Speech Synthesis
7 benchmarks
100 papers with code
Prosody Prediction
1 benchmark
3 papers with code
Zero-Shot Multi-Speaker TTS
3 papers with code
Sound Event Detection
Sound Event Detection
5 benchmarks
90 papers with code
Audio Signal Processing
blind source separation
47 papers with code
Audio Signal Processing
24 papers with code
Audio Compression
16 papers with code
Audio Effects Modeling
2 papers with code
Audio Source Separation
Audio Source Separation
8 benchmarks
52 papers with code
Target Sound Extraction
4 benchmarks
8 papers with code
Directional Hearing
2 benchmarks
1 paper with code
Single-Label Target Sound Extraction
Audio captioning
Audio captioning
5 benchmarks
50 papers with code
Retrieval-augmented Few-shot In-context Audio Captioning
1 benchmark
5 papers with code
Zero-shot Audio Captioning
2 benchmarks
3 papers with code
Sound Classification
Sound Classification
54 papers with code
Audio Tagging
Audio Tagging
1 benchmark
43 papers with code
Acoustic Scene Classification
Acoustic Scene Classification
5 benchmarks
41 papers with code
Sound Source Localization
Sound Source Localization
37 papers with code
Sound Event Localization and Detection
Sound Event Localization and Detection
5 benchmarks
32 papers with code
Environmental Sound Classification
Environmental Sound Classification
3 benchmarks
24 papers with code
Self-Supervised Sound Classification
1 paper with code
Instrument Recognition
Instrument Recognition
3 benchmarks
23 papers with code
Text-to-Music Generation
Text-to-Music Generation
2 benchmarks
19 papers with code
Audio Super-Resolution
Audio Super-Resolution
4 benchmarks
14 papers with code
Online Beat Tracking
Inference Optimization
14 papers with code
Voice Anti-spoofing
Voice Anti-spoofing
3 benchmarks
14 papers with code
Audio inpainting
Audio inpainting
12 papers with code
Beat Tracking
Beat Tracking
15 benchmarks
12 papers with code
Direction of Arrival Estimation
Direction of Arrival Estimation
1 benchmark
12 papers with code
Instance Search
Instance Search
9 papers with code
Audio Fingerprint
1 paper with code
Audio Denoising
Audio Denoising
3 benchmarks
10 papers with code
Audio-Visual Synchronization
Audio-Visual Synchronization
9 papers with code
Downbeat Tracking
Downbeat Tracking
13 benchmarks
8 papers with code
Chord Recognition
Chord Recognition
7 papers with code
Audio Quality Assessment
Audio Quality Assessment
1 benchmark
6 papers with code
Visually Guided Sound Source Separation
Visually Guided Sound Source Separation
5 papers with code
Vowel Classification
Vowel Classification
5 papers with code
Audio Effects Modeling
Pitch control
3 papers with code
Timbre Interpolation
1 paper with code
Audio declipping
Audio declipping
4 papers with code
Bird Classification
Bird Audio Detection
3 papers with code
Bird Classification
Bird Species Classification With Audio-Visual Data
Music Compression
Music Compression
3 papers with code
Hearing Aid and device processing
Cadenza 1 - Task 1 - Headphone
1 benchmark
1 paper with code
Cadenza 1 - Task 2 - In Car
1 benchmark
1 paper with code
Hearing Aid and device processing
Audio Signal Recognition
Audio Signal Recognition
1 paper with code
Gunshot Detection
1 paper with code
Music Quality Assessment
Music Quality Assessment
2 papers with code
Text to Audio Retrieval
audio moment retrieval
2 papers with code
fake voice detection
fake voice detection
1 benchmark
2 papers with code
Acoustic Novelty Detection
Acoustic Novelty Detection
1 benchmark
1 paper with code
Audio Dequantization
Audio Dequantization
1 paper with code
Directional Hearing
Real-time Directional Hearing
1 benchmark
1 paper with code
Semi-Supervised Audio Regression
Semi-Supervised Audio Regression
1 paper with code
Shooter Localization
Shooter Localization
1 paper with code
Soundscape evaluation
Soundscape evaluation
1 paper with code
Speaker Orientation
Speaker Orientation
1 paper with code
Target Sound Extraction
Streaming Target Sound Extraction
1 benchmark
1 paper with code
Active Speaker Localization
Active Speaker Localization
Synthetic Song Detection
Synthetic Song Detection
Video/Text-to-Audio Generation
Video/Text-to-Audio Generation