Search Results for author: Xichen Pan

Found 2 papers, 2 papers with code

Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models

1 code implementation 20 Nov 2022 Xichen Pan, Pengda Qin, Yuhong Li, Hui Xue, Wenhu Chen

Conditioned diffusion models have demonstrated state-of-the-art text-to-image synthesis capacity.

Image Generation, Story Visualization

Leveraging Unimodal Self-Supervised Learning for Multimodal Audio-Visual Speech Recognition

1 code implementation ACL 2022 Xichen Pan, Peiyu Chen, Yichen Gong, Helong Zhou, Xinbing Wang, Zhouhan Lin

In particular, audio and visual front-ends are trained on large-scale unimodal datasets, then we integrate components of both front-ends into a larger multimodal framework which learns to recognize parallel audio-visual data into characters through a combination of CTC and seq2seq decoding.

Audio-Visual Speech Recognition, Automatic Speech Recognition (+5 more)
