no code implementations • 11 Dec 2023 • Jaeseok Byun, Dohoon Kim, Taesup Moon
We consider the critical issue of false negatives in Vision-Language Pre-training (VLP), a challenge that arises from the inherent many-to-many correspondence of image-text pairs in large-scale web-crawled datasets.
1 code implementation • 8 Aug 2022 • Jaeseok Byun, Taebaek Hwang, Jianlong Fu, Taesup Moon
In contrast to the mainstream VLP methods, we highlight that two routinely applied steps during pre-training have crucial impact on the performance of the pre-trained model: in-batch hard negative sampling for image-text matching (ITM) and assigning the large masking probability for the masked language modeling (MLM).
1 code implementation • CVPR 2021 • Jaeseok Byun, Sungmin Cha, Taesup Moon
To that end, we propose Fast Blind Image Denoiser (FBI-Denoiser) for Poisson-Gaussian noise, which consists of two neural network models; 1) PGE-Net that estimates Poisson-Gaussian noise parameters 2000 times faster than the conventional methods and 2) FBI-Net that realizes a much more efficient BSN for pixelwise affine denoiser in terms of the number of parameters and inference speed.