Search Results for author: Wendi Zheng

Found 12 papers, 9 papers with code

ZeroFlow: Overcoming Catastrophic Forgetting is Easier than You Think

no code implementations • 2 Jan 2025 • Tao Feng, Wei Li, Didi Zhu, Hangjie Yuan, Wendi Zheng, Dan Zhang, Jie Tang

This work provides essential insights and tools for advancing forward pass methods to overcome forgetting.

Continual Learning

DreamPolish: Domain Score Distillation With Progressive Geometry Generation

no code implementations • 3 Nov 2024 • Yean Cheng, Ziqi Cai, Ming Ding, Wendi Zheng, Shiyu Huang, Yuxiao Dong, Jie Tang, Boxin Shi

We introduce DreamPolish, a text-to-3D generation model that excels in producing refined geometry and high-quality textures.

3D Generation, Text to 3D (+1 more)

CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer

1 code implementation • 12 Aug 2024 • Zhuoyi Yang, Jiayan Teng, Wendi Zheng, Ming Ding, Shiyu Huang, Jiazheng Xu, Yuanming Yang, Wenyi Hong, Xiaohan Zhang, Guanyu Feng, Da Yin, Xiaotao Gu, Yuxuan Zhang, Weihan Wang, Yean Cheng, Ting Liu, Bin Xu, Yuxiao Dong, Jie Tang

We present CogVideoX, a large-scale text-to-video generation model based on a diffusion transformer, which can generate 10-second continuous videos aligned with the text prompt, with a frame rate of 16 fps and a resolution of 768 * 1360 pixels.

Text-to-Video Generation, Video Alignment (+2 more)

Inf-DiT: Upsampling Any-Resolution Image with Memory-Efficient Diffusion Transformer

1 code implementation • 7 May 2024 • Zhuoyi Yang, Heyang Jiang, Wenyi Hong, Jiayan Teng, Wendi Zheng, Yuxiao Dong, Ming Ding, Jie Tang

However, due to a quadratic increase in memory when generating ultra-high-resolution images (e.g., 4096*4096), the resolution of generated images is often limited to 1024*1024.

Image Generation, Super-Resolution

MACRec: a Multi-Agent Collaboration Framework for Recommendation

2 code implementations • 23 Feb 2024 • Zhefan Wang, Yuanqing Yu, Wendi Zheng, Weizhi Ma, Min Zhang

LLM-based agents have gained considerable attention for their decision-making skills and ability to handle complex tasks.

Conversational Recommendation, Decision Making (+2 more)

CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers

1 code implementation • 29 May 2022 • Wenyi Hong, Ming Ding, Wendi Zheng, Xinghan Liu, Jie Tang

Large-scale pretrained transformers have created milestones in text (GPT-3) and text-to-image (DALL-E and CogView) generation.

Text-to-Video Generation, Video Generation

CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers

1 code implementation • 28 Apr 2022 • Ming Ding, Wendi Zheng, Wenyi Hong, Jie Tang

The development of transformer-based text-to-image models is impeded by their slow generation and complexity for high-resolution images.

Language Modeling, Language Modelling (+2 more)

CogView: Mastering Text-to-Image Generation via Transformers

4 code implementations • NeurIPS 2021 • Ming Ding, Zhuoyi Yang, Wenyi Hong, Wendi Zheng, Chang Zhou, Da Yin, Junyang Lin, Xu Zou, Zhou Shao, Hongxia Yang, Jie Tang

Text-to-Image generation in the general domain has long been an open problem, which requires both a powerful generative model and cross-modal understanding.

Ranked #53 on Text-to-Image Generation on MS COCO (using extra training data)

Super-Resolution, Zero-Shot Text-to-Image Generation
