1 code implementation • 16 Jun 2024 • Xiaoxiao Ma, Mohan Zhou, Tao Liang, Yalong Bai, Tiejun Zhao, Huaian Chen, Yi Jin
We present STAR, a text-to-image model that employs a scale-wise auto-regressive paradigm.
no code implementations • 25 Jan 2024 • Mohan Zhou, Yalong Bai, Qing Yang, Tiejun Zhao
The ability to fine-tune generative models for text-to-image generation tasks is crucial, particularly given the complexity involved in accurately interpreting and visualizing textual inputs.
no code implementations • 20 Jul 2023 • Mohan Zhou, Yalong Bai, Wei Zhang, Ting Yao, Tiejun Zhao, Tao Mei
In this paper, we propose a novel learning-based evaluation metric named Preference Score (PS) for fitting human preference according to the quantitative evaluations across different dimensions.
no code implementations • 5 Jul 2023 • Mohan Zhou, Yalong Bai, Wei Zhang, Ting Yao, Tiejun Zhao
Based on ViCo and ViCo-X, we define three novel tasks targeting interaction modeling during face-to-face conversation: 1) responsive listening head generation, making listeners respond actively to the speaker with non-verbal signals, 2) expressive talking head generation, guiding speakers to be aware of listeners' behaviors, and 3) conversational head generation, integrating the talking and listening abilities in one interlocutor.
no code implementations • 21 Jun 2023 • Mohan Zhou, Yalong Bai, Wei Zhang, Ting Yao, Tiejun Zhao, Tao Mei
Dynamically synthesizing talking speech that actively responds to a listening head is critical during face-to-face interaction.
no code implementations • 27 Dec 2021 • Mohan Zhou, Yalong Bai, Wei Zhang, Ting Yao, Tiejun Zhao, Tao Mei
Automatically synthesizing listening behavior that actively responds to a talking head is critical to applications such as digital humans, virtual agents, and social robots.
1 code implementation • 26 Jul 2021 • Yalong Bai, Mohan Zhou, Wei Zhang, Bowen Zhou, Tao Mei
Experimental results on ImageNet demonstrate the compatibility and effectiveness on a much wider range of augmentations, while consuming fewer parameters and lower computational costs at inference time.
2 code implementations • CVPR 2020 • Mohan Zhou, Yalong Bai, Wei Zhang, Tiejun Zhao, Tao Mei
Specifically, we first propose an object-extent learning module for localizing the object according to the visual patterns shared among the instances in the same category.
Ranked #18 on Fine-Grained Image Classification on CUB-200-2011