no code implementations • 1 Dec 2023 • Ying Nie, wei he, Kai Han, Yehui Tang, Tianyu Guo, Fanyi Du, Yunhe Wang
Moreover, based on the observation that the accuracy of CLIP model does not increase correspondingly as the parameters of text encoder increase, an extra objective of masked language modeling (MLM) is leveraged for maximizing the potential of the shortened text encoder.