Search Results for author: Huiguo He

Found 5 papers, 2 papers with code

MobileVidFactory: Automatic Diffusion-Based Social Media Video Generation for Mobile Devices from Text

no code implementations • 31 Jul 2023 • Junchen Zhu, Huan Yang, Wenjing Wang, Huiguo He, Zixi Tuo, Yongsheng Yu, Wen-Huang Cheng, Lianli Gao, Jingkuan Song, Jianlong Fu, Jiebo Luo

In the basic generation, we take advantage of the pretrained image diffusion model, and adapt it to a high-quality open-domain vertical video generator for mobile devices.

Video Generation

Paper
Add Code

Learning Profitable NFT Image Diffusions via Multiple Visual-Policy Guided Reinforcement Learning

no code implementations • 20 Jun 2023 • Huiguo He, Tianfu Wang, Huan Yang, Jianlong Fu, Nicholas Jing Yuan, Jian Yin, Hongyang Chao, Qi Zhang

The proposed framework consists of a large language model (LLM), a diffusion-based image generator, and a series of visual rewards by design.

Attribute Image Generation +3

Paper
Add Code

MovieFactory: Automatic Movie Creation from Text using Large Generative Models for Language and Images

no code implementations • 12 Jun 2023 • Junchen Zhu, Huan Yang, Huiguo He, Wenjing Wang, Zixi Tuo, Wen-Huang Cheng, Lianli Gao, Jingkuan Song, Jianlong Fu

To generate videos, we extend the capabilities of a pretrained text-to-image diffusion model through a two-stage process.

Retrieval

Paper
Add Code

Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation

1 code implementation • 18 May 2023 • Wenjing Wang, Huan Yang, Zixi Tuo, Huiguo He, Junchen Zhu, Jianlong Fu, Jiaying Liu

Moreover, to fully unlock model capabilities for high-quality video generation and promote the development of the field, we curate a large-scale and open-source video dataset called HD-VG-130M.

Ranked #1 on Text-to-Video Generation on WebVid

Text-to-Image Generation Text-to-Video Generation +2

Paper
Code

MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation

1 code implementation • CVPR 2023 • Ludan Ruan, Yiyang Ma, Huan Yang, Huiguo He, Bei Liu, Jianlong Fu, Nicholas Jing Yuan, Qin Jin, Baining Guo

To generate joint audio-video pairs, we propose a novel Multi-Modal Diffusion model (i. e., MM-Diffusion), with two-coupled denoising autoencoders.

Denoising FAD +1

329

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.