$f$-Mutual Information Contrastive Learning

29 Sep 2021 · Guojun Zhang, Yiwei Lu, Sun Sun, Hongyu Guo, YaoLiang Yu

Self-supervised contrastive learning is an emerging field due to its power in providing good data representations. This learning paradigm widely adopts the InfoNCE loss, which is closely connected to maximizing mutual information. In this work, we propose the $f$-Mutual Information Contrastive Learning framework ($f$-MICL), which directly maximizes the $f$-divergence-based generalization of mutual information. We theoretically prove that, under mild assumptions, $f$-MICL naturally attains alignment of positive pairs and uniformity of the data representations, the two main factors behind the success of contrastive learning. We further provide theoretical guidance on designing the similarity function and choosing effective $f$-divergences for $f$-MICL. Using several benchmark tasks from both vision and natural language, we empirically verify that our novel method outperforms or performs on par with state-of-the-art strategies.
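For context, a minimal sketch of the quantity the abstract refers to (not the paper's exact training loss): the $f$-mutual information replaces the KL divergence in ordinary mutual information with a general $f$-divergence, and the standard Fenchel-conjugate variational bound turns it into an objective maximized over a critic $T$; the assumption here is only that such a critic is built from a similarity score on learned representations, with positive pairs drawn from the joint and negative pairs from the product of marginals.

```latex
% f-mutual information: mutual information with the KL divergence
% replaced by a general f-divergence D_f (f convex, f(1) = 0).
\[
  I_f(X;Y) \;=\; D_f\!\bigl(P_{XY}\,\|\,P_X \otimes P_Y\bigr),
  \qquad
  D_f(P\|Q) \;=\; \mathbb{E}_{Q}\!\left[f\!\left(\frac{\mathrm{d}P}{\mathrm{d}Q}\right)\right].
\]
% Fenchel-conjugate variational lower bound (f^* is the convex conjugate of f),
% maximized over a critic T; positive pairs follow the joint P_{XY},
% negative pairs the product of marginals P_X x P_Y.
\[
  I_f(X;Y) \;\ge\; \sup_{T}\;
  \mathbb{E}_{P_{XY}}\bigl[T(x,y)\bigr]
  \;-\; \mathbb{E}_{P_X \otimes P_Y}\bigl[f^{*}\bigl(T(x,y)\bigr)\bigr].
\]
% With f(t) = t log t this reduces to the usual (KL-based) mutual information,
% for which InfoNCE is a standard lower bound.
```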
