Self-Supervised Audio Classification
7 papers with code • 2 benchmarks • 1 dataset
Most implemented papers
ATST: Audio Representation Learning with Teacher-Student Transformer
Self-supervised learning (SSL) learns representations from a large amount of unlabeled data and then transfers that knowledge to a specific problem with a limited amount of labeled data.
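A common ingredient of teacher-student SSL methods like ATST is updating the teacher as an exponential moving average (EMA) of the student, so the student is trained to match a slowly evolving target. A minimal sketch of that update (the parameter dict layout and momentum value are illustrative assumptions, not ATST's actual code):

```python
import numpy as np

def ema_update(teacher, student, momentum=0.9):
    """Move each teacher parameter a fraction (1 - momentum) toward
    the corresponding student parameter (EMA teacher update)."""
    return {name: momentum * teacher[name] + (1 - momentum) * student[name]
            for name in teacher}

# Toy parameters: the teacher starts at zero and drifts toward the student.
student = {"w": np.array([1.0, 2.0])}
teacher = {"w": np.array([0.0, 0.0])}
teacher = ema_update(teacher, student, momentum=0.9)
# teacher["w"] is now 0.9 * [0, 0] + 0.1 * [1, 2] = [0.1, 0.2]
```

Because the teacher changes slowly, its outputs provide stable targets, which helps avoid the representation collapse that plagues naive self-distillation.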
Putting An End to End-to-End: Gradient-Isolated Learning of Representations
We propose a novel deep learning method for local self-supervised representation learning that requires neither labels nor end-to-end backpropagation, instead exploiting the natural ordering of the data.
Self-Supervised Learning by Cross-Modal Audio-Video Clustering
To the best of our knowledge, XDC is the first self-supervised learning method that outperforms large-scale fully-supervised pretraining for action recognition on the same architecture.
Audio-Visual Instance Discrimination with Cross-Modal Agreement
Our method uses contrastive learning for cross-modal discrimination of video from audio and vice-versa.
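Cross-modal contrastive objectives of this kind typically score each audio clip against every video clip in the batch, treating the matching pair as the positive and all other pairs as negatives. A minimal symmetric InfoNCE sketch, assuming precomputed embedding matrices and a temperature of 0.1 (both illustrative choices, not the paper's exact setup):

```python
import numpy as np

def cross_modal_nce(audio, video, temperature=0.1):
    """Symmetric InfoNCE over a batch of (audio, video) embedding pairs.
    Row i of `audio` and row i of `video` come from the same clip."""
    a = audio / np.linalg.norm(audio, axis=1, keepdims=True)
    v = video / np.linalg.norm(video, axis=1, keepdims=True)
    logits = a @ v.T / temperature          # (batch, batch) similarities
    targets = np.arange(len(a))             # positives lie on the diagonal

    def ce(l):
        # Row-wise cross-entropy against the diagonal targets.
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[targets, targets].mean()

    # Average the audio-to-video and video-to-audio directions.
    return 0.5 * (ce(logits) + ce(logits.T))

# Perfectly aligned, mutually orthogonal embeddings give a near-zero loss.
aligned = np.eye(4)
loss = cross_modal_nce(aligned, aligned)
```

The symmetric form ensures both encoders receive a gradient signal, so neither modality degenerates into a trivial representation.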
Self-Supervised MultiModal Versatile Networks
In particular, we explore how best to combine the modalities, such that fine-grained representations of the visual and audio modalities can be maintained, whilst also integrating text into a common embedding.
Broaden Your Views for Self-Supervised Video Learning
Most successful self-supervised learning methods are trained to align the representations of two independent views from the data.
Self-Supervised Audio-Visual Representation Learning with Relaxed Cross-Modal Synchronicity
We present CrissCross, a self-supervised framework for learning audio-visual representations.