The Easy Communications (EasyCom) dataset is a world-first dataset designed to help mitigate the cocktail party effect from an augmented-reality (AR)-motivated, multi-sensor egocentric world view. The dataset contains AR glasses egocentric multi-channel microphone array audio, wide field-of-view RGB video, speech source pose, headset microphone audio, annotated voice activity, speech transcriptions, head and face bounding boxes, and source identification labels. We have created and are releasing this dataset to facilitate research in multi-modal AR solutions to the cocktail party problem.
15 PAPERS • 4 BENCHMARKS
A View From Somewhere (AVFS) is a dataset of 638,180 face similarity judgments over 4,921 faces. Each judgment identifies the odd-one-out (i.e., least similar) face in a triplet of faces and is accompanied by both the identifier and the demographic attributes of the annotator who made the judgment.
1 PAPER • NO BENCHMARKS YET