Search Results for author: Zekai Zhang

Found 11 papers, 6 papers with code

Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model

3 code implementations 14 Feb 2025 Guoqing Ma, Haoyang Huang, Kun Yan, Liangyu Chen, Nan Duan, Shengming Yin, Changyi Wan, Ranchen Ming, Xiaoniu Song, Xing Chen, Yu Zhou, Deshan Sun, Deyu Zhou, Jian Zhou, Kaijun Tan, Kang An, Mei Chen, Wei Ji, Qiling Wu, Wen Sun, Xin Han, Yanan Wei, Zheng Ge, Aojie Li, Bin Wang, Bizhu Huang, Bo wang, Brian Li, Changxing Miao, Chen Xu, Chenfei Wu, Chenguang Yu, Dapeng Shi, Dingyuan Hu, Enle Liu, Gang Yu, Ge Yang, Guanzhe Huang, Gulin Yan, Haiyang Feng, Hao Nie, Haonan Jia, Hanpeng Hu, Hanqi Chen, Haolong Yan, Heng Wang, Hongcheng Guo, Huilin Xiong, Huixin Xiong, Jiahao Gong, Jianchang Wu, Jiaoren Wu, Jie Wu, Jie Yang, Jiashuai Liu, Jiashuo Li, Jingyang Zhang, Junjing Guo, Junzhe Lin, Kaixiang Li, Lei Liu, Lei Xia, Liang Zhao, Liguo Tan, Liwen Huang, Liying Shi, Ming Li, Mingliang Li, Muhua Cheng, Na Wang, Qiaohui Chen, Qinglin He, Qiuyan Liang, Quan Sun, Ran Sun, Rui Wang, Shaoliang Pang, Shiliang Yang, Sitong Liu, SiQi Liu, Shuli Gao, Tiancheng Cao, Tianyu Wang, Weipeng Ming, Wenqing He, Xu Zhao, Xuelin Zhang, Xianfang Zeng, Xiaojia Liu, Xuan Yang, Yaqi Dai, Yanbo Yu, Yang Li, Yineng Deng, Yingming Wang, Yilei Wang, Yuanwei Lu, Yu Chen, Yu Luo, Yuchu Luo, Yuhe Yin, Yuheng Feng, Yuxiang Yang, Zecheng Tang, Zekai Zhang, Zidong Yang, Binxing Jiao, Jiansheng Chen, Jing Li, Shuchang Zhou, Xiangyu Zhang, Xinhao Zhang, Yibo Zhu, Heung-Yeung Shum, Daxin Jiang

We present Step-Video-T2V, a state-of-the-art text-to-video pre-trained model with 30B parameters and the ability to generate videos up to 204 frames in length.

Video Generation, Video Reconstruction

Diffusion Models Learn Low-Dimensional Distributions via Subspace Clustering

1 code implementation 4 Sep 2024 Peng Wang, Huijie Zhang, Zekai Zhang, Siyi Chen, Yi Ma, Qing Qu

Remarkably, these models can achieve this even with a small number of training samples despite a large image dimension, circumventing the curse of dimensionality.

Clustering, Denoising

RO-SVD: A Reconfigurable Hardware Copyright Protection Framework for AIGC Applications

no code implementations 17 Jun 2024 Zhuoheng Ran, Muhammad A. A. Abdelgawad, Zekai Zhang, Ray C. C. Cheung, Hong Yan

The dramatic surge in the utilisation of generative artificial intelligence (GenAI) underscores the need for a secure and efficient mechanism to responsibly manage, use and disseminate multi-dimensional data generated by artificial intelligence (AI).

Management

E-Bench: Towards Evaluating the Ease-of-Use of Large Language Models

no code implementations 16 Jun 2024 Zhenyu Zhang, Bingguang Hao, Jinpeng Li, Zekai Zhang, Dongyan Zhao

Most large language models (LLMs) are sensitive to prompts, and a synonymous expression or a typo may lead to unexpected results from the model.

StyleChat: Learning Recitation-Augmented Memory in LLMs for Stylized Dialogue Generation

no code implementations 18 Mar 2024 Jinpeng Li, Zekai Zhang, Quan Tu, Xin Cheng, Dongyan Zhao, Rui Yan

Furthermore, although many prompt-based methods have been proposed to accomplish specific tasks, their performance in complex real-world scenarios involving a wide variety of dialog styles still requires further enhancement.

Dialogue Generation

PPTC-R benchmark: Towards Evaluating the Robustness of Large Language Models for PowerPoint Task Completion

1 code implementation 6 Mar 2024 Zekai Zhang, Yiduo Guo, Yaobo Liang, Dongyan Zhao, Nan Duan

The growing dependence on Large Language Models (LLMs) for finishing user instructions necessitates a comprehensive understanding of their robustness to complex task completion in real-world situations.

Sentence

StrokeNUWA: Tokenizing Strokes for Vector Graphic Synthesis

no code implementations 30 Jan 2024 Zecheng Tang, Chenfei Wu, Zekai Zhang, Mingheng Ni, Shengming Yin, Yu Liu, Zhengyuan Yang, Lijuan Wang, Zicheng Liu, Juntao Li, Nan Duan

To leverage LLMs for visual synthesis, traditional methods convert raster image information into discrete grid tokens through specialized visual modules, while disrupting the model's ability to capture the true semantic representation of visual scenes.

Vector Graphics

PPTC Benchmark: Evaluating Large Language Models for PowerPoint Task Completion

1 code implementation 3 Nov 2023 Yiduo Guo, Zekai Zhang, Yaobo Liang, Dongyan Zhao, Nan Duan

Recent evaluations of Large Language Models (LLMs) have centered around testing their zero-shot/few-shot capabilities for basic natural language tasks and their ability to translate instructions into tool APIs.

ResDiff: Combining CNN and Diffusion Model for Image Super-Resolution

1 code implementation 15 Mar 2023 Shuyao Shang, Zhengyang Shan, Guangxing Liu, LunQian Wang, XingHua Wang, Zekai Zhang, Jinglin Zhang

Adapting the Diffusion Probabilistic Model (DPM) for direct image super-resolution is wasteful, given that a simple Convolutional Neural Network (CNN) can recover the main low-frequency content.

Image Super-Resolution

LeaF: Learning Frames for 4D Point Cloud Sequence Understanding

no code implementations ICCV 2023 Yunze Liu, Junyu Chen, Zekai Zhang, Jingwei Huang, Li Yi

With such frames, we can factorize geometry and motion to facilitate a feature-space geometric reconstruction for more effective 4D learning.

Descriptive
