1 code implementation • 16 Mar 2023 • Zipeng Xu, Songlong Xing, Enver Sangineto, Nicu Sebe
However, directly using CLIP to guide style transfer leads to undesirable artifacts (mainly written words and unrelated visual entities) spread over the image.
1 code implementation • ICCV 2023 • Zipeng Xu, Enver Sangineto, Nicu Sebe
Despite the progress made in the style transfer task, most previous works focus on transferring only relatively simple features like color or texture, while missing more abstract concepts such as overall art expression or painter-specific traits.
1 code implementation • 16 Mar 2022 • Duo Zheng, Fandong Meng, Qingyi Si, Hairun Fan, Zipeng Xu, Jie Zhou, Fangxiang Feng, Xiaojie Wang
Visual dialog has witnessed great progress since various vision-oriented goals were introduced into the conversation, notably GuessWhich and GuessWhat, where the image is visible to only one of the questioner and the answerer, or to both of them, respectively.
1 code implementation • CVPR 2022 • Zipeng Xu, Tianwei Lin, Hao Tang, Fu Li, Dongliang He, Nicu Sebe, Radu Timofte, Luc van Gool, Errui Ding
We propose a novel framework, i.e., Predict, Prevent, and Evaluate (PPE), for disentangled text-driven image manipulation that requires little manual annotation while being applicable to a wide variety of manipulations.
1 code implementation • Findings (EMNLP) 2021 • Duo Zheng, Zipeng Xu, Fandong Meng, Xiaojie Wang, Jiaan Wang, Jie Zhou
To enhance the VD Questioner: 1) we propose a Related entity enhanced Questioner (ReeQ) that generates questions under the guidance of related entities and learns an entity-based questioning strategy from human dialogs; 2) we propose an Augmented Guesser (AugG) that is strong and is specifically optimized for the VD setting.
1 code implementation • 12 Jul 2021 • Zipeng Xu, Fandong Meng, Xiaojie Wang, Duo Zheng, Chenxu Lv, Jie Zhou
In reinforcement learning, it is crucial to represent states and to assign rewards based on the state transitions caused by actions.
1 code implementation • 1 Oct 2020 • Zipeng Xu, Fangxiang Feng, Xiaojie Wang, Yushu Yang, Huixing Jiang, Zhongyuan Wang
In this paper, we propose an Answer-Driven Visual State Estimator (ADVSE) to impose the effects of different answers on visual states.