1 code implementation • ECCV 2020 • Valentina Sanguineti, Pietro Morerio, Niccolò Pozzetti, Danilo Greco, Marco Cristani, Vittorio Murino
However, since 2D planar microphone arrays are cumbersome and not as widespread as ordinary microphones, we propose that the richer information content of acoustic images can be distilled, through a self-supervised learning scheme, into more powerful audio and visual feature representations.