no code implementations • 14 Mar 2024 • Yu-Chu Yu, Chi-Pin Huang, Jr-Jen Chen, Kai-Po Chang, Yung-Hsuan Lai, Fu-En Yang, Yu-Chiang Frank Wang
Large-scale vision-language models (VLMs) have shown a strong zero-shot generalization capability on unseen-domain data.
1 code implementation • 12 Dec 2023 • I-Jieh Liu, Ci-Siang Lin, Fu-En Yang, Yu-Chiang Frank Wang
Nevertheless, it remains challenging for FL to handle user heterogeneity in local data distributions in real-world scenarios, and this issue becomes even more severe in multi-label image classification.
no code implementations • 29 Nov 2023 • Chi-Pin Huang, Kai-Po Chang, Chung-Ting Tsai, Yung-Hsuan Lai, Fu-En Yang, Yu-Chiang Frank Wang
The former prevents the model from producing images associated with the target concept for any paraphrased or learned prompts, while the latter preserves its ability to generate images with non-target concepts.
no code implementations • ICCV 2023 • Fu-En Yang, Chien-Yi Wang, Yu-Chiang Frank Wang
To leverage robust representations from large-scale models while enabling efficient model personalization for heterogeneous clients, we propose a novel personalized FL framework of client-specific Prompt Generation (pFedPG). pFedPG learns to deploy a personalized prompt generator at the server, producing client-specific visual prompts that efficiently adapt frozen backbones to local data distributions.
no code implementations • 19 Feb 2023 • Yuan-Chia Cheng, Zu-Yun Shiau, Fu-En Yang, Yu-Chiang Frank Wang
In this paper, we present a learning framework of Tendency-and-Assignment Explainer (TAX), designed to offer interpretability at the annotator and assignment levels.
1 code implementation • 30 Aug 2022 • Cheng-Yen Hsieh, Chih-Jung Chang, Fu-En Yang, Yu-Chiang Frank Wang
In particular, we present a cross-scale patch-level correlation learning scheme in SS-PRL, which allows the model to aggregate and associate information learned across patch scales.
no code implementations • 27 Dec 2021 • Yuan-Chia Cheng, Ci-Siang Lin, Fu-En Yang, Yu-Chiang Frank Wang
Few-shot classification aims to carry out classification given only a few labeled examples for the categories of interest.
1 code implementation • NeurIPS 2021 • Fu-En Yang, Yuan-Chia Cheng, Zu-Yun Shiau, Yu-Chiang Frank Wang
Domain generalization (DG) aims to transfer the learning task from a single or multiple source domains to unseen target domains.
no code implementations • 2 Nov 2021 • Yuan-Hao Lee, Fu-En Yang, Yu-Chiang Frank Wang
Few-shot semantic segmentation addresses the learning task in which only a few images with ground-truth pixel-level labels are available for the novel classes of interest.
1 code implementation • CVPR 2021 • Cheng-Fu Yang, Wan-Cyuan Fan, Fu-En Yang, Yu-Chiang Frank Wang
To better exploit the text input, so that implicit objects or relationships can be properly inferred during layout generation, we propose a LayoutTransformer Network (LT-Net) in this paper.
no code implementations • 26 Feb 2021 • Fu-En Yang, Jing-Cheng Chang, Yuan-Hao Lee, Yu-Chiang Frank Wang
Generating videos with content and motion variations is a challenging task in computer vision.
no code implementations • 1 Jan 2021 • Cheng-Fu Yang, Wan-Cyuan Fan, Fu-En Yang, Yu-Chiang Frank Wang
In the areas of machine learning and computer vision, text-to-image synthesis aims at producing image outputs given the input text.
no code implementations • 21 Oct 2020 • Jia-Wei Yan, Ci-Siang Lin, Fu-En Yang, Yu-Jhe Li, Yu-Chiang Frank Wang
Learning interpretable and interpolatable latent representations has been an emerging research direction, allowing researchers to understand and utilize the derived latent space for further applications such as visual synthesis or recognition.
no code implementations • 25 Apr 2018 • Yu-Jhe Li, Fu-En Yang, Yen-Cheng Liu, Yu-Ying Yeh, Xiaofei Du, Yu-Chiang Frank Wang
Person re-identification (Re-ID) aims at recognizing the same person from images taken across different cameras.
Ranked #19 on Unsupervised Domain Adaptation on Duke to Market