2 code implementations • 1 Apr 2024 • Zhiqiu Lin, Deepak Pathak, Baiqi Li, Jiayao Li, Xide Xia, Graham Neubig, Pengchuan Zhang, Deva Ramanan
For instance, the widely-used CLIPScore measures the alignment between a (generated) image and text prompt, but it fails to produce reliable scores for complex prompts involving compositions of objects, attributes, and relations.
no code implementations • 31 Mar 2023 • Ximeng Sun, Pengchuan Zhang, Peizhao Zhang, Hardik Shah, Kate Saenko, Xide Xia
We transfer the knowledge from the pre-trained CLIP-ViTL/14 model to a ViT-B/32 model, with only 40M public images and 28. 4M unpaired public sentences.
no code implementations • ICCV 2023 • Ximeng Sun, Pengchuan Zhang, Peizhao Zhang, Hardik Shah, Kate Saenko, Xide Xia
In this paper, we introduce a new distillation mechanism (DIME-FM) that allows us to transfer the knowledge contained in large VLFMs to smaller, customized foundation models using a relatively small amount of inexpensive, unpaired images and sentences.
no code implementations • 20 Oct 2020 • Xide Xia, Tianfan Xue, Wei-Sheng Lai, Zheng Sun, Abby Chang, Brian Kulis, Jiawen Chen
We present a novel algorithm for transferring artistic styles of semantically meaningful local regions of an image onto local regions of a target video while preserving its photorealism.
3 code implementations • ECCV 2020 • Xide Xia, Meng Zhang, Tianfan Xue, Zheng Sun, Hui Fang, Brian Kulis, Jiawen Chen
Photorealistic style transfer is the task of transferring the artistic style of an image onto a content target, producing a result that is plausibly taken with a camera.
2 code implementations • NeurIPS 2020 • Ali Siahkamari, Xide Xia, Venkatesh Saligrama, David Castanon, Brian Kulis
Bregman divergences generalize measures such as the squared Euclidean distance and the KL divergence, and arise throughout many areas of machine learning.
3 code implementations • ICCV 2019 • Xingchao Peng, Qinxun Bai, Xide Xia, Zijun Huang, Kate Saenko, Bo wang
Conventional unsupervised domain adaptation (UDA) assumes that training data are sampled from a single domain.
11 code implementations • 22 Nov 2017 • Xide Xia, Brian Kulis
While significant attention has been recently focused on designing supervised deep semantic segmentation algorithms for vision tasks, there are many domains in which sufficient supervised pixel-level labels are difficult to obtain.