Search Results for author: Maozong Zheng

Found 2 papers, 0 papers with code

ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer

no code implementations22 May 2023 Huadai Liu, Rongjie Huang, Xuan Lin, Wenqiang Xu, Maozong Zheng, Hong Chen, Jinzheng He, Zhou Zhao

To mitigate the data scarcity in learning visual acoustic information, we 1) introduce a self-supervised learning framework to enhance both the visual-text encoder and denoiser decoder; 2) leverage the diffusion transformer scalable in terms of parameters and capacity to learn visual scene information.

Denoising Self-Supervised Learning

AntPivot: Livestream Highlight Detection via Hierarchical Attention Mechanism

no code implementations10 Jun 2022 Yang Zhao, Xuan Lin, Wenqiang Xu, Maozong Zheng, Zhengyong Liu, Zhou Zhao

In recent days, streaming technology has greatly promoted the development in the field of livestream.

Highlight Detection

Cannot find the paper you are looking for? You can Submit a new open access paper.