Search Results for author: Ran Yi

Found 62 papers, 25 papers with code

Pinco: Position-induced Consistent Adapter for Diffusion Transformer in Foreground-conditioned Inpainting

no code implementations 5 Dec 2024 Guangben Lu, Yuzhen Du, Zhimin Sun, Ran Yi, Yifan Qi, Yizhe Tang, Tianyi Wang, Lizhuang Ma, Fangyuan Zou

Foreground-conditioned inpainting aims to seamlessly fill the background region of an image by utilizing the provided foreground subject and a text description.

Image Inpainting Position

MV-Adapter: Multi-view Consistent Image Generation Made Easy

no code implementations 4 Dec 2024 Zehuan Huang, Yuan-Chen Guo, Haoran Wang, Ran Yi, Lizhuang Ma, Yan-Pei Cao, Lu Sheng

To efficiently model the 3D geometric knowledge within the adapter, we introduce innovative designs that include duplicated self-attention layers and parallel attention architecture, enabling the adapter to inherit the powerful priors of the pre-trained models to model the novel 3D knowledge.

3D Generation

SuperMat: Physically Consistent PBR Material Estimation at Interactive Rates

no code implementations 26 Nov 2024 Yijia Hong, Yuan-Chen Guo, Ran Yi, Yulong Chen, Yan-Pei Cao, Lizhuang Ma

We present SuperMat, a single-step framework that achieves high-quality material decomposition with one-step inference.

Computational Efficiency Denoising

Textual Decomposition Then Sub-motion-space Scattering for Open-Vocabulary Motion Generation

no code implementations 6 Nov 2024 Ke Fan, Jiangning Zhang, Ran Yi, Jingyu Gong, Yabiao Wang, Yating Wang, Xin Tan, Chengjie Wang, Lizhuang Ma

For Textual Decomposition, we design a fine-grained description conversion algorithm, and combine it with the generalization ability of a large language model to convert any given motion text into atomic texts.

Large Language Model Motion Generation

Rectified Diffusion Guidance for Conditional Generation

no code implementations 24 Oct 2024 Mengfei Xia, Nan Xue, Yujun Shen, Ran Yi, Tieliang Gong, Yong-Jin Liu

Classifier-Free Guidance (CFG), which combines the conditional and unconditional score functions with two coefficients summing to one, serves as a practical technique for diffusion model sampling.

Denoising

AttentionPainter: An Efficient and Adaptive Stroke Predictor for Scene Painting

no code implementations 21 Oct 2024 Yizhe Tang, Yue Wang, Teng Hu, Ran Yi, Xin Tan, Lizhuang Ma, Yu-Kun Lai, Paul L. Rosin

Stroke-based Rendering (SBR) aims to decompose an input image into a sequence of parameterized strokes, which can be rendered into a painting that resembles the input image.

Reinforcement Learning

Emphasizing Semantic Consistency of Salient Posture for Speech-Driven Gesture Generation

no code implementations 17 Oct 2024 Fengqi Liu, Hexiang Wang, Jingyu Gong, Ran Yi, Qianyu Zhou, Xuequan Lu, Jiangbo Lu, Lizhuang Ma

Specifically, we first learn a joint manifold space for the individual representation of audio and body pose to exploit the inherent semantic association between two modalities, and propose to enforce semantic consistency via a consistency loss.

Gesture Generation

AdR-Gaussian: Accelerating Gaussian Splatting with Adaptive Radius

no code implementations 13 Sep 2024 Xinzhe Wang, Ran Yi, Lizhuang Ma

In order to accelerate Gaussian splatting, we propose AdR-Gaussian, which moves part of serial culling in the Render stage into the earlier Preprocess stage to enable parallel culling, employs an adaptive radius to narrow the rendering pixel range for each Gaussian, and introduces a load balancing method to minimize thread waiting time during pixel-parallel rendering.

SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation

no code implementations 10 Sep 2024 Teng Hu, Jiangning Zhang, Ran Yi, Hongrui Huang, Yabiao Wang, Lizhuang Ma

Inspired by model pruning which lightens large pre-trained models by removing unimportant parameters, we propose a novel model fine-tuning method to make full use of these ineffective parameters and enable the pre-trained model with new task-specified capabilities.

Video Generation

PVP-Recon: Progressive View Planning via Warping Consistency for Sparse-View Surface Reconstruction

no code implementations 9 Sep 2024 Sheng Ye, Yuze He, Matthieu Lin, Jenny Sheng, Ruoyu Fan, Yiheng Han, Yubin Hu, Ran Yi, Yu-Hui Wen, Yong-Jin Liu, Wenping Wang

Neural implicit representations have revolutionized dense multi-view surface reconstruction, yet their performance significantly diminishes with sparse input views.

Surface Reconstruction

Portrait3D: 3D Head Generation from Single In-the-wild Portrait Image

no code implementations 24 Jun 2024 Jinkun Hao, Junshu Tang, Jiangning Zhang, Ran Yi, Yijia Hong, Moran Li, Weijian Cao, Yating Wang, Lizhuang Ma

We then use the canny map, ID features of the portrait image, and a pre-trained text-to-normal/depth diffusion model to generate ID-aware geometry supervision, and 3D-GAN inversion is employed to generate ID-aware geometry initialization.

Texture Synthesis

SuperSVG: Superpixel-based Scalable Vector Graphics Synthesis

1 code implementation CVPR 2024 Teng Hu, Ran Yi, Baihong Qian, Jiangning Zhang, Paul L. Rosin, Yu-Kun Lai

Then, we propose a two-stage self-training framework, where a coarse-stage model is employed to reconstruct the main structure and a refinement-stage model is used for enriching the details.

Superpixels Vector Graphics

GSTalker: Real-time Audio-Driven Talking Face Generation via Deformable Gaussian Splatting

no code implementations 29 Apr 2024 Bo Chen, Shoukang Hu, Qi Chen, Chenpeng Du, Ran Yi, Yanmin Qian, Xie Chen

We present GSTalker, a 3D audio-driven talking face generation model with Gaussian Splatting for both fast training (40 minutes) and real-time rendering (125 FPS) with a 3 to 5 minute video for training material, in comparison with previous 2D and 3D NeRF-based modeling frameworks which require hours of training and seconds of rendering per frame.

Talking Face Generation

MotionMaster: Training-free Camera Motion Transfer For Video Generation

no code implementations 24 Apr 2024 Teng Hu, Jiangning Zhang, Ran Yi, Yating Wang, Hongrui Huang, Jieyu Weng, Yabiao Wang, Lizhuang Ma

Furthermore, we propose a few-shot camera motion disentanglement method to extract the common camera motion from multiple videos with similar camera motions, which employs a window-based clustering technique to extract the common features in temporal attention maps of multiple videos.

Disentanglement Motion Disentanglement +2

Learning Topology Uniformed Face Mesh by Volume Rendering for Multi-view Reconstruction

no code implementations 8 Apr 2024 Yating Wang, Ran Yi, Ke Fan, Jinkun Hao, Jiangbo Lu, Lizhuang Ma

Our goal is to leverage the superiority of neural volume rendering into multi-view reconstruction of face mesh with consistent topology.

3D Reconstruction Face Reconstruction +1

SDPose: Tokenized Pose Estimation via Circulation-Guide Self-Distillation

1 code implementation CVPR 2024 Sichen Chen, Yingyi Zhang, Siming Huang, Ran Yi, Ke Fan, Ruixin Zhang, Peixian Chen, Jun Wang, Shouhong Ding, Lizhuang Ma

To mitigate the problem of under-fitting, we design a transformer module named Multi-Cycled Transformer (MCT) based on multiple-cycled forwards to more fully exploit the potential of small model parameters.

Edge-computing Pose Estimation

Continuous Piecewise-Affine Based Motion Model for Image Animation

1 code implementation 17 Jan 2024 Hexiang Wang, Fengqi Liu, Qianyu Zhou, Ran Yi, Xin Tan, Lizhuang Ma

To address this issue, we propose to model motion from the source image to the driving frame in highly-expressive diffeomorphism spaces.

Image Animation

Towards More Accurate Diffusion Model Acceleration with A Timestep Tuner

no code implementations CVPR 2024 Mengfei Xia, Yujun Shen, Changsong Lei, Yu Zhou, Deli Zhao, Ran Yi, Wenping Wang, Yong-Jin Liu

A diffusion model which is formulated to produce an image using thousands of denoising steps usually suffers from a slow inference speed.

Denoising

Re-thinking Data Availability Attacks Against Deep Neural Networks

no code implementations CVPR 2024 Bin Fang, Bo Li, Shuang Wu, Shouhong Ding, Ran Yi, Lizhuang Ma

In this paper we re-examine the existing availability attack methods and propose a novel two-stage min-max-min optimization paradigm to generate robust unlearnable noise.

Plasticine3D: 3D Non-Rigid Editing with Text Guidance by Multi-View Embedding Optimization

no code implementations 15 Dec 2023 Yige Chen, Teng Hu, Yizhe Tang, Siyuan Chen, Ang Chen, Ran Yi

With the help of Score Distillation Sampling (SDS) and the rapid development of neural 3D representations, some methods have been proposed to perform 3D editing such as adding additional geometries, or overwriting textures.

3D Generation Text to 3D

AnomalyDiffusion: Few-Shot Anomaly Image Generation with Diffusion Model

1 code implementation 10 Dec 2023 Teng Hu, Jiangning Zhang, Ran Yi, Yuzhen Du, Xu Chen, Liang Liu, Yabiao Wang, Chengjie Wang

Existing anomaly inspection methods are limited in their performance due to insufficient anomaly data.

Image Generation

SMaRt: Improving GANs with Score Matching Regularity

no code implementations 30 Nov 2023 Mengfei Xia, Yujun Shen, Ceyuan Yang, Ran Yi, Wenping Wang, Yong-Jin Liu

In this work, we revisit the mathematical foundations of GANs, and theoretically reveal that the native adversarial loss for GAN training is insufficient to fix the problem of subsets with positive Lebesgue measure of the generated data manifold lying out of the real data manifold.

valid

Towards More Accurate Diffusion Model Acceleration with A Timestep Aligner

no code implementations 14 Oct 2023 Mengfei Xia, Yujun Shen, Changsong Lei, Yu Zhou, Ran Yi, Deli Zhao, Wenping Wang, Yong-Jin Liu

By viewing the generation of diffusion models as a discretized integrating process, we argue that the quality drop is partly caused by applying an inaccurate integral direction to a timestep interval.

Denoising

MMPI: a Flexible Radiance Field Representation by Multiple Multi-plane Images Blending

no code implementations 30 Sep 2023 Yuze He, Peng Wang, Yubin Hu, Wang Zhao, Ran Yi, Yong-Jin Liu, Wenping Wang

In this paper, we explore the potential of MPI and show that MPI can synthesize high-quality novel views of complex scenes with diverse camera distributions and view directions, which are not only limited to simple forward-facing scenes.

Autonomous Driving Novel View Synthesis

Contrastive Pseudo Learning for Open-World DeepFake Attribution

1 code implementation ICCV 2023 Zhimin Sun, Shen Chen, Taiping Yao, Bangjie Yin, Ran Yi, Shouhong Ding, Lizhuang Ma

The challenge in sourcing attribution for forgery faces has gained widespread attention due to the rapid development of generative techniques.

DeepFake Detection Face Swapping +1

Toward High Quality Facial Representation Learning

1 code implementation 7 Sep 2023 Yue Wang, Jinlong Peng, Jiangning Zhang, Ran Yi, Liang Liu, Yabiao Wang, Chengjie Wang

To improve the facial representation quality, we use feature map of a pre-trained visual backbone as a supervision item and use a partially pre-trained decoder for mask image modeling.

Contrastive Learning Decoder +3

Phasic Content Fusing Diffusion Model with Directional Distribution Consistency for Few-Shot Model Adaption

1 code implementation ICCV 2023 Teng Hu, Jiangning Zhang, Liang Liu, Ran Yi, Siqi Kou, Haokun Zhu, Xu Chen, Yabiao Wang, Chengjie Wang, Lizhuang Ma

To address these problems, we propose a novel phasic content fusing few-shot diffusion model with directional distribution consistency loss, which targets different learning objectives at distinct training stages of the diffusion model.

Domain Adaptation

Stroke-based Neural Painting and Stylization with Dynamically Predicted Painting Region

2 code implementations 7 Sep 2023 Teng Hu, Ran Yi, Haokun Zhu, Liang Liu, Jinlong Peng, Yabiao Wang, Chengjie Wang, Lizhuang Ma

To solve the problem, we propose Compositional Neural Painter, a novel stroke-based rendering framework which dynamically predicts the next painting region based on the current canvas, instead of dividing the image plane uniformly into painting regions.

Style Transfer

RFENet: Towards Reciprocal Feature Evolution for Glass Segmentation

1 code implementation 12 Jul 2023 Ke Fan, Changan Wang, Yabiao Wang, Chengjie Wang, Ran Yi, Lizhuang Ma

Glass-like objects are widespread in daily life but remain intractable to be segmented for most existing methods.

Semantic Segmentation

Towards Generalizable Data Protection With Transferable Unlearnable Examples

no code implementations 18 May 2023 Bin Fang, Bo Li, Shuang Wu, Tianyi Zheng, Shouhong Ding, Ran Yi, Lizhuang Ma

One of the crucial factors contributing to this success has been the access to an abundance of high-quality data for constructing machine learning models.

Re-thinking Data Availability Attacks Against Deep Neural Networks

no code implementations 18 May 2023 Bin Fang, Bo Li, Shuang Wu, Ran Yi, Shouhong Ding, Lizhuang Ma

The unauthorized use of personal data for commercial purposes and the clandestine acquisition of private data for training machine learning models continue to raise concerns.

Instance-Aware Domain Generalization for Face Anti-Spoofing

1 code implementation CVPR 2023 Qianyu Zhou, Ke-Yue Zhang, Taiping Yao, Xuequan Lu, Ran Yi, Shouhong Ding, Lizhuang Ma

To address these issues, we propose a novel perspective for DG FAS that aligns features on the instance level without the need for domain labels.

Domain Generalization Face Anti-Spoofing +1

Towards Artistic Image Aesthetics Assessment: a Large-scale Dataset and a New Method

1 code implementation CVPR 2023 Ran Yi, Haoyuan Tian, Zhihao Gu, Yu-Kun Lai, Paul L. Rosin

To fill the gap in the field of artistic image aesthetics assessment (AIAA), we first introduce a large-scale AIAA dataset: Boldbrush Artistic Image Dataset (BAID), which consists of 60,337 artistic images covering various art forms, with more than 360,000 votes from online users.

Multimodal Industrial Anomaly Detection via Hybrid Fusion

1 code implementation CVPR 2023 Yue Wang, Jinlong Peng, Jiangning Zhang, Ran Yi, Yabiao Wang, Chengjie Wang

2D-based Industrial Anomaly Detection has been widely discussed; however, multimodal industrial anomaly detection based on 3D point clouds and RGB images still has many untouched fields.

Ranked #4 on RGB+3D Anomaly Detection and Segmentation on MVTEC 3D-AD (using extra training data)

Contrastive Learning RGB+3D Anomaly Detection and Segmentation

PCKRF: Point Cloud Completion and Keypoint Refinement With Fusion Data for 6D Pose Estimation

1 code implementation 7 Oct 2022 Yiheng Han, Irvin Haozhe Zhan, Long Zeng, Yu-Ping Wang, Ran Yi, MinJing Yu, Matthieu Gaetan Lin, Jenny Sheng, Yong-Jin Liu

In this paper, we propose Point Cloud Completion and Keypoint Refinement with Fusion Data (PCKRF), a new pose refinement pipeline for 6D pose estimation.

6D Pose Estimation Point Cloud Completion +1

Adaptive Mixture of Experts Learning for Generalizable Face Anti-Spoofing

no code implementations 20 Jul 2022 Qianyu Zhou, Ke-Yue Zhang, Taiping Yao, Ran Yi, Shouhong Ding, Lizhuang Ma

Existing DG-based FAS approaches always capture the domain-invariant features for generalizing on the various unseen domains.

Domain Generalization Face Anti-Spoofing +1

Generative Domain Adaptation for Face Anti-Spoofing

no code implementations 20 Jul 2022 Qianyu Zhou, Ke-Yue Zhang, Taiping Yao, Ran Yi, Kekai Sheng, Shouhong Ding, Lizhuang Ma

Most existing UDA FAS methods typically fit the trained models to the target domain via aligning the distribution of semantic high-level features.

Domain Adaptation Face Anti-Spoofing

Dynamic Neural Textures: Generating Talking-Face Videos with Continuously Controllable Expressions

no code implementations 13 Apr 2022 Zipeng Ye, Zhiyao Sun, Yu-Hui Wen, Yanan Sun, Tian Lv, Ran Yi, Yong-Jin Liu

In this paper, we propose a method to generate talking-face videos with continuously controllable expressions in real-time.

Video Generation

LAKe-Net: Topology-Aware Point Cloud Completion by Localizing Aligned Keypoints

1 code implementation CVPR 2022 Junshu Tang, Zhijun Gong, Ran Yi, Yuan Xie, Lizhuang Ma

An asymmetric keypoint locator, including an unsupervised multi-scale keypoint detector and a complete keypoint generator, is proposed for localizing aligned keypoints from complete and partial point clouds.

Point Cloud Completion

CtlGAN: Few-shot Artistic Portraits Generation with Contrastive Transfer Learning

no code implementations 16 Mar 2022 Yue Wang, Ran Yi, Luying Li, Ying Tai, Chengjie Wang, Lizhuang Ma

We propose a new encoder which embeds real faces into Z+ space and propose a dual-path training strategy to better cope with the adapted decoder and eliminate the artifacts.

Decoder Image-to-Image Translation +2

Quality Metric Guided Portrait Line Drawing Generation from Unpaired Training Data

1 code implementation 8 Feb 2022 Ran Yi, Yong-Jin Liu, Yu-Kun Lai, Paul L. Rosin

In this paper, we propose a novel method to automatically transform face photos to portrait drawings using unpaired training data with two new features; i.e., our method can (1) learn to generate high quality portrait drawings in multiple styles using a single network and (2) generate portrait drawings in a "new style" unseen in the training data.

CFNet: Learning Correlation Functions for One-Stage Panoptic Segmentation

no code implementations 13 Jan 2022 Yifeng Chen, Wenqing Chu, Fangfang Wang, Ying Tai, Ran Yi, Zhenye Gan, Liang Yao, Chengjie Wang, Xi Li

Recently, there is growing attention on one-stage panoptic segmentation methods which aim to segment instances and stuff jointly within a fully convolutional pipeline efficiently.

Instance Segmentation Panoptic Segmentation +1

Exploiting Fine-grained Face Forgery Clues via Progressive Enhancement Learning

no code implementations 28 Dec 2021 Qiqi Gu, Shen Chen, Taiping Yao, Yang Chen, Shouhong Ding, Ran Yi

The progressive enhancement process facilitates the learning of discriminative features with fine-grained face forgery clues.

Domain Adaptive Semantic Segmentation via Regional Contrastive Consistency Regularization

1 code implementation 11 Oct 2021 Qianyu Zhou, Chuyun Zhuang, Ran Yi, Xuequan Lu, Lizhuang Ma

In this paper, we propose a novel and fully end-to-end trainable approach, called regional contrastive consistency regularization (RCCR) for domain adaptive semantic segmentation.

Semantic Segmentation Synthetic-to-Real Translation +1

NPRportrait 1.0: A Three-Level Benchmark for Non-Photorealistic Rendering of Portraits

no code implementations 1 Sep 2020 Paul L. Rosin, Yu-Kun Lai, David Mould, Ran Yi, Itamar Berger, Lars Doyle, Seungyong Lee, Chuan Li, Yong-Jin Liu, Amir Semmo, Ariel Shamir, Minjung Son, Holger Winnemoller

Despite the recent upsurge of activity in image-based non-photorealistic rendering (NPR), and in particular portrait image stylisation, due to the advent of neural style transfer, the state of performance evaluation in this field is limited, especially compared to the norms in the computer vision and machine learning communities.

Style Transfer

Unpaired Portrait Drawing Generation via Asymmetric Cycle Mapping

1 code implementation CVPR 2020 Ran Yi, Yong-Jin Liu, Yu-Kun Lai, Paul L. Rosin

We observe that due to the significant imbalance of information richness between photos and drawings, existing unpaired transfer methods such as CycleGAN tend to embed invisible reconstruction information indiscriminately in the whole drawings, leading to important facial features partially missing in drawings.

3D-CariGAN: An End-to-End Solution to 3D Caricature Generation from Face Photos

1 code implementation 15 Mar 2020 Zipeng Ye, Mengfei Xia, Yanan Sun, Ran Yi, MinJing Yu, Juyong Zhang, Yu-Kun Lai, Yong-Jin Liu

The most challenging issue for our system is that the source domain of face photos (characterized by normal 2D faces) is significantly different from the target domain of 3D caricatures (characterized by 3D exaggerated face shapes and textures).

Caricature

Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose

1 code implementation 24 Feb 2020 Ran Yi, Zipeng Ye, Juyong Zhang, Hujun Bao, Yong-Jin Liu

In this paper, we address this problem by proposing a deep neural network model that takes an audio signal A of a source person and a very short video V of a target person as input, and outputs a synthesized high-quality talking face video with personalized head pose (making use of the visual information in V), expression and lip synchronization (by considering both A and V).

3D Face Animation Video Generation

A Configuration-Space Decomposition Scheme for Learning-based Collision Checking

no code implementations 17 Nov 2019 Yiheng Han, Wang Zhao, Jia Pan, Zipeng Ye, Ran Yi, Yong-Jin Liu

Motion planning for robots of high degrees-of-freedom (DOFs) is an important problem in robotics with sampling-based methods in configuration space C as one popular solution.

BIG-bench Machine Learning Motion Planning +1

APDrawingGAN: Generating Artistic Portrait Drawings From Face Photos With Hierarchical GANs

6 code implementations CVPR 2019 Ran Yi, Yong-Jin Liu, Yu-Kun Lai, Paul L. Rosin

Moreover, artists tend to use different strategies to draw different facial features and the lines drawn are only loosely related to obvious image features.

Image Stylization

Content-Sensitive Supervoxels via Uniform Tessellations on Video Manifolds

no code implementations CVPR 2018 Ran Yi, Yong-Jin Liu, Yu-Kun Lai

We propose an efficient Lloyd-like method with a splitting-merging scheme to compute a uniform tessellation on M, which induces the CSS in X. Theoretically our method has a good competitive ratio O(1).
