LRS2 (Lip Reading Sentences 2)

Introduced by Chung et al. in Lip Reading Sentences in the Wild

The Oxford-BBC Lip Reading Sentences 2 (LRS2) dataset is one of the largest publicly available datasets for sentence-level lip reading in the wild. It consists mainly of news and talk shows from BBC programs. Each sentence is up to 100 characters long. The training, validation and test sets are divided according to broadcast date. The dataset is challenging because it contains thousands of speakers without speaker labels and large variation in head pose. The pre-training set contains 96,318 utterances, the training set 45,839, the validation set 1,082 and the test set 1,242.
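To put the split sizes quoted above in proportion, the following minimal sketch sums them and computes each split's share of the total. The names `SPLIT_SIZES` and `split_shares` are illustrative and do not come from any LRS2 tooling. The counts are copied from the description and have not been checked against the dataset files.

```python
# Utterance counts per split, copied from the dataset description.
# The dictionary keys are illustrative labels, not official split names.
SPLIT_SIZES = {
    "pretrain": 96_318,
    "train": 45_839,
    "val": 1_082,
    "test": 1_242,
}


def split_shares(sizes):
    """Return each split's share of the total utterance count."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


TOTAL_UTTERANCES = sum(SPLIT_SIZES.values())  # 144,481 utterances in all
SHARES = split_shares(SPLIT_SIZES)

if __name__ == "__main__":
    for name, count in SPLIT_SIZES.items():
        print(f"{name:>8}: {count:>7,} utterances ({SHARES[name]:.1%})")
    print(f"{'total':>8}: {TOTAL_UTTERANCES:>7,} utterances")
```

The pre-training set is the largest split by a wide margin, while the validation and test sets together hold only a small fraction of the utterances.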

Source: Audio-visual Recognition of Overlapped speech for the LRS2 dataset

Benchmarks

Task	Dataset Variant	Best Model
Lipreading	LRS2	CTC/Attention
Audio-Visual Speech Recognition	LRS2	CTC/Attention
Automatic Speech Recognition (ASR)	LRS2	CTC/Attention
Speech Separation	LRS2	TDFNet-small
Unconstrained Lip-synchronization	LRS2	Wav2Lip + ViT + MARLIN
Visual Speech Recognition	LRS2	VTP with more data
Image Manipulation	LRS2	TPS
Speech Recognition	LRS2	RAVEn Large
Visual Keyword Spotting	LRS2	Transpotter