Search Results for author: Xinyuan Chen

Found 24 papers, 15 papers with code

LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models

2 code implementations 26 Sep 2023 Yaohui Wang, Xinyuan Chen, Xin Ma, Shangchen Zhou, Ziqi Huang, Yi Wang, Ceyuan Yang, Yinan He, Jiashuo Yu, Peiqing Yang, Yuwei Guo, Tianxing Wu, Chenyang Si, Yuming Jiang, Cunjian Chen, Chen Change Loy, Bo Dai, Dahua Lin, Yu Qiao, Ziwei Liu

To this end, we propose LaVie, an integrated video generation framework that operates on cascaded video latent diffusion models, comprising a base T2V model, a temporal interpolation model, and a video super-resolution model.

Text-to-Video Generation Video Generation +1

VBench: Comprehensive Benchmark Suite for Video Generative Models

1 code implementation 29 Nov 2023 Ziqi Huang, Yinan He, Jiashuo Yu, Fan Zhang, Chenyang Si, Yuming Jiang, Yuanhan Zhang, Tianxing Wu, Qingyang Jin, Nattapol Chanpaisit, Yaohui Wang, Xinyuan Chen, LiMin Wang, Dahua Lin, Yu Qiao, Ziwei Liu

We will open-source VBench, including all prompts, evaluation methods, generated videos, and human preference annotations, and also include more video generation models in VBench to drive forward the field of video generation.

Image Generation Video Generation

DG-Font: Deformable Generative Networks for Unsupervised Font Generation

1 code implementation CVPR 2021 Yangchen Xie, Xinyuan Chen, Li Sun, Yue Lu

Font generation is a challenging problem especially for some writing systems that consist of a large number of characters and has attracted a lot of attention in recent years.

Font Generation Image-to-Image Translation

DGFont++: Robust Deformable Generative Networks for Unsupervised Font Generation

1 code implementation 30 Dec 2022 Xinyuan Chen, Yangchen Xie, Li Sun, Yue Lu

Moreover, we introduce contrastive self-supervised learning to learn a robust style representation for fonts by understanding the similarity and dissimilarities of fonts.

Font Generation Self-Supervised Learning +1

SinSR: Diffusion-Based Image Super-Resolution in a Single Step

1 code implementation 23 Nov 2023 YuFei Wang, Wenhan Yang, Xinyuan Chen, Yaohui Wang, Lanqing Guo, Lap-Pui Chau, Ziwei Liu, Yu Qiao, Alex C. Kot, Bihan Wen

Extensive experiments conducted on synthetic and real-world datasets demonstrate that the proposed method can achieve comparable or even superior performance compared to both previous SOTA methods and the teacher model, in just one sampling step, resulting in a remarkable up to x10 speedup for inference.

Image Super-Resolution

Diff-Font: Diffusion Model for Robust One-Shot Font Generation

1 code implementation 12 Dec 2022 Haibin He, Xinyuan Chen, Chaoyue Wang, Juhua Liu, Bo Du, DaCheng Tao, Yu Qiao

Specifically, a large stroke-wise dataset is constructed, and a stroke-wise diffusion model is proposed to preserve the structure and the completion of each generated character.

Font Generation

Brush Your Text: Synthesize Any Scene Text on Images via Diffusion Model

1 code implementation 19 Dec 2023 Lingjun Zhang, Xinyuan Chen, Yaohui Wang, Yue Lu, Yu Qiao

To tackle this problem, we propose Diff-Text, which is a training-free scene text generation framework for any language.

Text Generation Text-to-Image Generation

Cross Attention Based Style Distribution for Controllable Person Image Synthesis

1 code implementation 1 Aug 2022 Xinyue Zhou, Mingyu Yin, Xinyuan Chen, Li Sun, Changxin Gao, Qingli Li

In this paper, we propose a cross attention based style distribution module that computes between the source semantic styles and target pose for pose transfer.

Pose Transfer Virtual Try-on

Long-Term Rhythmic Video Soundtracker

1 code implementation 2 May 2023 Jiashuo Yu, Yaohui Wang, Xinyuan Chen, Xiao Sun, Yu Qiao

To this end, we present Long-Term Rhythmic Video Soundtracker (LORIS), a novel framework to synthesize long-term conditional waveforms.

ConditionVideo: Training-Free Condition-Guided Text-to-Video Generation

1 code implementation 11 Oct 2023 Bo Peng, Xinyuan Chen, Yaohui Wang, Chaochao Lu, Yu Qiao

In this work, we introduce ConditionVideo, a training-free approach to text-to-video generation based on the provided condition, video, and input text, by leveraging the power of off-the-shelf text-to-image generation methods (e.g., Stable Diffusion).

Text-to-Image Generation Text-to-Video Generation +1

LEO: Generative Latent Image Animator for Human Video Synthesis

5 code implementations 6 May 2023 Yaohui Wang, Xin Ma, Xinyuan Chen, Antitza Dantcheva, Bo Dai, Yu Qiao

Our key idea is to represent motion as a sequence of flow maps in the generation process, which inherently isolate motion from appearance.

Disentanglement Video Editing

Gated-GAN: Adversarial Gated Networks for Multi-Collection Style Transfer

2 code implementations 4 Apr 2019 Xinyuan Chen, Chang Xu, Xiaokang Yang, Li Song, DaCheng Tao

We propose adversarial gated networks (Gated GAN) to transfer multiple styles in a single model.

Style Transfer

Vlogger: Make Your Dream A Vlog

1 code implementation 17 Jan 2024 Shaobin Zhuang, Kunchang Li, Xinyuan Chen, Yaohui Wang, Ziwei Liu, Yu Qiao, Yali Wang

More importantly, Vlogger can generate over 5-minute vlogs from open-world descriptions, without loss of video coherence on script and actor.

Language Modelling Large Language Model +1

S-OHEM: Stratified Online Hard Example Mining for Object Detection

no code implementations 5 May 2017 Minne Li, Zhaoning Zhang, Hao Yu, Xinyuan Chen, Dongsheng Li

S-OHEM exploits OHEM with stratified sampling, a widely-adopted sampling technique, to choose the training examples according to this influence during hard example mining, and thus enhance the performance of object detectors.

Object Detection

OCR-RTPS: An OCR-based real-time positioning system for the valet parking

no code implementations 8 Dec 2022 Zizhang Wu, Xinyuan Chen, Jizheng Wang, Xiaoquan Wang, Yuanzhu Gan, Muqing Fang, Tianhao Xu

Obtaining the position of ego-vehicle is a crucial prerequisite for automatic control and path planning in the field of autonomous driving.

Autonomous Driving Optical Character Recognition (OCR) +1

Hierarchical Diffusion Autoencoders and Disentangled Image Manipulation

no code implementations 24 Apr 2023 Zeyu Lu, Chengyue Wu, Xinyuan Chen, Yaohui Wang, Lei Bai, Yu Qiao, Xihui Liu

To mitigate those limitations, we propose Hierarchical Diffusion Autoencoders (HDAE) that exploit the fine-grained-to-abstract and low-level-to-high-level feature hierarchy for the latent space of diffusion models.

Image Generation Image Manipulation +1

PPD: A New Valet Parking Pedestrian Fisheye Dataset for Autonomous Driving

no code implementations 20 Sep 2023 Zizhang Wu, Xinyuan Chen, Fan Song, Yuanzhu Gan, Tianhao Xu, Jian Pu, Rui Tang

In this paper, we present the Parking Pedestrian Dataset (PPD), a large-scale fisheye dataset to support research dealing with real-world pedestrians, especially with occlusions and diverse postures.

Autonomous Driving Data Augmentation +1

EpiDiff: Enhancing Multi-View Synthesis via Localized Epipolar-Constrained Diffusion

no code implementations 11 Dec 2023 Zehuan Huang, Hao Wen, Junting Dong, Yaohui Wang, Yangguang Li, Xinyuan Chen, Yan-Pei Cao, Ding Liang, Yu Qiao, Bo Dai, Lu Sheng

Generating multiview images from a single view facilitates the rapid generation of a 3D mesh conditioned on a single image.

SSIM

A flexible Bayesian g-formula for causal survival analyses with time-dependent confounding

no code implementations 4 Feb 2024 Xinyuan Chen, Liangyuan Hu, Fan Li

To enhance the traditional parametric g-formula approach, we developed a more adaptable Bayesian g-formula estimator.

Causal Inference Dimensionality Reduction +1
