TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Spoken Command Recognition	Speech Command v2	COLA	Accuracy	95.5	# 4
Speaker Identification	VoxCeleb1	COLA	Top-1 (%)	37.7	# 9
Speaker Identification	VoxCeleb1	COLA	Accuracy	37.7	# 9

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/contrastive-learning-of-general-purpose-audio/spoken-command-recognition-on-speech-command)](https://paperswithcode.com/sota/spoken-command-recognition-on-speech-command?p=contrastive-learning-of-general-purpose-audio)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/contrastive-learning-of-general-purpose-audio/speaker-identification-on-voxceleb1)](https://paperswithcode.com/sota/speaker-identification-on-voxceleb1?p=contrastive-learning-of-general-purpose-audio)`

Contrastive Learning of General-Purpose Audio Representations

21 Oct 2020 · Aaqib Saeed, David Grangier, Neil Zeghidour

We introduce COLA, a self-supervised pre-training approach for learning a general-purpose representation of audio. Our approach is based on contrastive learning: it learns a representation which assigns high similarity to audio segments extracted from the same recording while assigning lower similarity to segments from different recordings. We build on top of recent advances in contrastive learning for computer vision and reinforcement learning to design a lightweight, easy-to-implement self-supervised model of audio. We pre-train embeddings on the large-scale Audioset database and transfer these representations to 9 diverse classification tasks, including speech, music, animal sounds, and acoustic scenes. We show that despite its simplicity, our method significantly outperforms previous self-supervised systems. We furthermore conduct ablation studies to identify key design choices and release a library to pre-train and fine-tune COLA models.
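The contrastive objective described above can be illustrated with a small sketch. This is not the released COLA library: the cosine similarity, the `temperature` value, and the function names `contrastive_loss` and `sample_segment_pair` are assumptions chosen for illustration, and the encoder that would produce the embeddings is omitted. The sketch only shows the general idea that two segments from the same recording form a positive pair, while segments from the other recordings in a batch act as negatives.

```python
import math
import random


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (an assumed choice)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(anchors, positives, temperature=0.2):
    """Mean cross-entropy over a batch of segment-pair embeddings.

    anchors[i] and positives[i] are embeddings of two segments from
    recording i; positives[j] with j != i serve as negatives for anchor i.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        # Similarity of anchor i to every candidate segment in the batch.
        logits = [cosine_similarity(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over the candidates.
        peak = max(logits)
        log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Negative log-probability of picking the same-recording segment.
        total += log_norm - logits[i]
    return total / len(anchors)


def sample_segment_pair(recording, segment_len, rng):
    """Draw two random fixed-length segments from one recording."""
    starts = [rng.randrange(len(recording) - segment_len + 1) for _ in range(2)]
    return tuple(recording[s:s + segment_len] for s in starts)
```

With toy two-dimensional embeddings, the loss is small when each anchor is most similar to its own recording's segment and large when the pairing is swapped, which is the behavior the objective is meant to reward.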

Code

google-research/google-research official

32,743

SarthakYadav/audax

Tasks

CoLA

Contrastive Learning

Speaker Identification

Spoken Command Recognition

Datasets

VoxCeleb1

AudioSet

Results from the Paper

Ranked #4 on Spoken Command Recognition on Speech Command v2

Methods

COLA • Contrastive Learning
