no code implementations • 29 Nov 2023 • Juntao Zhang, Yuehuai Liu, Yu-Wing Tai, Chi-Keung Tang
Specifically, C3Net first aligns the conditions from multi-modalities to the same semantic latent space using modality-specific encoders based on contrastive training.