Seeing voices and hearing voices: learning discriminative embeddings using cross-modal self-supervision

29 Apr 2020 · Soo-Whan Chung, Hong Goo Kang, Joon Son Chung

The goal of this work is to train discriminative cross-modal embeddings without access to manually annotated data. Recent advances in self-supervised learning have shown that effective representations can be learnt from natural cross-modal synchrony...
