no code implementations • 5 Dec 2024 • Guangben Lu, Yuzhen Du, Zhimin Sun, Ran Yi, Yifan Qi, Yizhe Tang, Tianyi Wang, Lizhuang Ma, Fangyuan Zou
Foreground-conditioned inpainting aims to seamlessly fill the background region of an image by utilizing the provided foreground subject and a text description.
no code implementations • 4 Dec 2024 • Zehuan Huang, Yuan-Chen Guo, Haoran Wang, Ran Yi, Lizhuang Ma, Yan-Pei Cao, Lu Sheng
To efficiently model the 3D geometric knowledge within the adapter, we introduce innovative designs that include duplicated self-attention layers and parallel attention architecture, enabling the adapter to inherit the powerful priors of the pre-trained models to model the novel 3D knowledge.
no code implementations • 26 Nov 2024 • Yijia Hong, Yuan-Chen Guo, Ran Yi, Yulong Chen, Yan-Pei Cao, Lizhuang Ma
We present SuperMat, a single-step framework that achieves high-quality material decomposition with one-step inference.
no code implementations • 6 Nov 2024 • Ke Fan, Jiangning Zhang, Ran Yi, Jingyu Gong, Yabiao Wang, Yating Wang, Xin Tan, Chengjie Wang, Lizhuang Ma
For Textual Decomposition, we design a fine-grained description conversion algorithm, and combine it with the generalization ability of a large language model to convert any given motion text into atomic texts.
no code implementations • 24 Oct 2024 • Mengfei Xia, Nan Xue, Yujun Shen, Ran Yi, Tieliang Gong, Yong-Jin Liu
Classifier-Free Guidance (CFG), which combines the conditional and unconditional score functions with two coefficients summing to one, serves as a practical technique for diffusion model sampling.
no code implementations • 21 Oct 2024 • Yizhe Tang, Yue Wang, Teng Hu, Ran Yi, Xin Tan, Lizhuang Ma, Yu-Kun Lai, Paul L. Rosin
Stroke-based Rendering (SBR) aims to decompose an input image into a sequence of parameterized strokes, which can be rendered into a painting that resembles the input image.
no code implementations • 17 Oct 2024 • Fengqi Liu, Hexiang Wang, Jingyu Gong, Ran Yi, Qianyu Zhou, Xuequan Lu, Jiangbo Lu, Lizhuang Ma
Specifically, we first learn a joint manifold space for the individual representation of audio and body pose to exploit the inherent semantic association between two modalities, and propose to enforce semantic consistency via a consistency loss.
no code implementations • 13 Sep 2024 • Xinzhe Wang, Ran Yi, Lizhuang Ma
In order to accelerate Gaussian splatting, we propose AdR-Gaussian, which moves part of serial culling in Render stage into the earlier Preprocess stage to enable parallel culling, employing adaptive radius to narrow the rendering pixel range for each Gaussian, and introduces a load balancing method to minimize thread waiting time during the pixel-parallel rendering.
no code implementations • 10 Sep 2024 • Teng Hu, Jiangning Zhang, Ran Yi, Hongrui Huang, Yabiao Wang, Lizhuang Ma
Inspired by model pruning which lightens large pre-trained models by removing unimportant parameters, we propose a novel model fine-tuning method to make full use of these ineffective parameters and enable the pre-trained model with new task-specified capabilities.
no code implementations • 9 Sep 2024 • Sheng Ye, Yuze He, Matthieu Lin, Jenny Sheng, Ruoyu Fan, Yiheng Han, Yubin Hu, Ran Yi, Yu-Hui Wen, Yong-Jin Liu, Wenping Wang
Neural implicit representations have revolutionized dense multi-view surface reconstruction, yet their performance significantly diminishes with sparse input views.
no code implementations • 24 Jun 2024 • Jinkun Hao, Junshu Tang, Jiangning Zhang, Ran Yi, Yijia Hong, Moran Li, Weijian Cao, Yating Wang, Lizhuang Ma
We then use the canny map, ID features of the portrait image, and a pre-trained text-to-normal/depth diffusion model to generate ID-aware geometry supervision, and 3D-GAN inversion is employed to generate ID-aware geometry initialization.
1 code implementation • CVPR 2024 • Teng Hu, Ran Yi, Baihong Qian, Jiangning Zhang, Paul L. Rosin, Yu-Kun Lai
Then, we propose a two-stage self-training framework, where a coarse-stage model is employed to reconstruct the main structure and a refinement-stage model is used for enriching the details.
no code implementations • 4 Jun 2024 • Chengjie Wang, Haokun Zhu, Jinlong Peng, Yue Wang, Ran Yi, Yunsheng Wu, Lizhuang Ma, Jiangning Zhang
Existing industrial anomaly detection methods primarily concentrate on unsupervised learning with pristine RGB images.
no code implementations • 24 May 2024 • Ke Fan, Junshu Tang, Weijian Cao, Ran Yi, Moran Li, Jingyu Gong, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Lizhuang Ma
Text-to-motion synthesis is a crucial task in computer vision.
Ranked #5 on Motion Synthesis on InterHuman
no code implementations • 29 Apr 2024 • Bo Chen, Shoukang Hu, Qi Chen, Chenpeng Du, Ran Yi, Yanmin Qian, Xie Chen
We present GStalker, a 3D audio-driven talking face generation model with Gaussian Splatting for both fast training (40 minutes) and real-time rendering (125 FPS) with a 3$\sim$5 minute video for training material, in comparison with previous 2D and 3D NeRF-based modeling frameworks which require hours of training and seconds of rendering per frame.
no code implementations • 24 Apr 2024 • Teng Hu, Jiangning Zhang, Ran Yi, Yating Wang, Hongrui Huang, Jieyu Weng, Yabiao Wang, Lizhuang Ma
Furthermore, we propose a few-shot camera motion disentanglement method to extract the common camera motion from multiple videos with similar camera motions, which employs a window-based clustering technique to extract the common features in temporal attention maps of multiple videos.
no code implementations • 8 Apr 2024 • Yating Wang, Ran Yi, Ke Fan, Jinkun Hao, Jiangbo Lu, Lizhuang Ma
Our goal is to leverage the superiority of neural volume rendering into multi-view reconstruction of face mesh with consistent topology.
1 code implementation • CVPR 2024 • Sichen Chen, Yingyi Zhang, Siming Huang, Ran Yi, Ke Fan, Ruixin Zhang, Peixian Chen, Jun Wang, Shouhong Ding, Lizhuang Ma
To mitigate the problem of under-fitting, we design a transformer module named Multi-Cycled Transformer(MCT) based on multiple-cycled forwards to more fully exploit the potential of small model parameters.
1 code implementation • 17 Jan 2024 • Hexiang Wang, Fengqi Liu, Qianyu Zhou, Ran Yi, Xin Tan, Lizhuang Ma
To address this issue, we propose to model motion from the source image to the driving frame in highly-expressive diffeomorphism spaces.
no code implementations • CVPR 2024 • Mengfei Xia, Yujun Shen, Changsong Lei, Yu Zhou, Deli Zhao, Ran Yi, Wenping Wang, Yong-Jin Liu
A diffusion model which is formulated to produce an image using thousands of denoising steps usually suffers from a slow inference speed.
no code implementations • CVPR 2024 • Bin Fang, Bo Li, Shuang Wu, Shouhong Ding, Ran Yi, Lizhuang Ma
In this paper we re-examine the existing availability attack methods and propose a novel two-stage min-max-min optimization paradigm to generate robust unlearnable noise.
no code implementations • 23 Dec 2023 • Changsong Lei, Mengfei Xia, Shaofeng Wang, Yaqian Liang, Ran Yi, Yuhui Wen, YongJin Liu
To address this challenge, we propose a general tooth arrangement neural network based on the diffusion probabilistic model.
no code implementations • 15 Dec 2023 • Yige Chen, Teng Hu, Yizhe Tang, Siyuan Chen, Ang Chen, Ran Yi
With the help of Score Distillation Sampling (SDS) and the rapid development of neural 3D representations, some methods have been proposed to perform 3D editing such as adding additional geometries, or overwriting textures.
1 code implementation • 10 Dec 2023 • Teng Hu, Jiangning Zhang, Ran Yi, Yuzhen Du, Xu Chen, Liang Liu, Yabiao Wang, Chengjie Wang
Existing anomaly inspection methods are limited in their performance due to insufficient anomaly data.
no code implementations • 30 Nov 2023 • Mengfei Xia, Yujun Shen, Ceyuan Yang, Ran Yi, Wenping Wang, Yong-Jin Liu
In this work, we revisit the mathematical foundations of GANs, and theoretically reveal that the native adversarial loss for GAN training is insufficient to fix the problem of subsets with positive Lebesgue measure of the generated data manifold lying out of the real data manifold.
no code implementations • 9 Nov 2023 • Haokun Zhu, Juang Ian Chong, Teng Hu, Ran Yi, Yu-Kun Lai, Paul L. Rosin
Vector graphics are widely used in graphical designs and have received more and more attention.
no code implementations • 14 Oct 2023 • Mengfei Xia, Yujun Shen, Changsong Lei, Yu Zhou, Ran Yi, Deli Zhao, Wenping Wang, Yong-Jin Liu
By viewing the generation of diffusion models as a discretized integrating process, we argue that the quality drop is partly caused by applying an inaccurate integral direction to a timestep interval.
1 code implementation • 4 Oct 2023 • Yuze He, Yushi Bai, Matthieu Lin, Wang Zhao, Yubin Hu, Jenny Sheng, Ran Yi, Juanzi Li, Yong-Jin Liu
Recent methods in text-to-3D leverage powerful pretrained diffusion models to optimize NeRF.
no code implementations • 30 Sep 2023 • Yuze He, Peng Wang, Yubin Hu, Wang Zhao, Ran Yi, Yong-Jin Liu, Wenping Wang
In this paper, we explore the potential of MPI and show that MPI can synthesize high-quality novel views of complex scenes with diverse camera distributions and view directions, which are not only limited to simple forward-facing scenes.
1 code implementation • ICCV 2023 • Zhimin Sun, Shen Chen, Taiping Yao, Bangjie Yin, Ran Yi, Shouhong Ding, Lizhuang Ma
The challenge in sourcing attribution for forgery faces has gained widespread attention due to the rapid development of generative techniques.
1 code implementation • 7 Sep 2023 • Yue Wang, Jinlong Peng, Jiangning Zhang, Ran Yi, Liang Liu, Yabiao Wang, Chengjie Wang
To improve the facial representation quality, we use feature map of a pre-trained visual backbone as a supervision item and use a partially pre-trained decoder for mask image modeling.
1 code implementation • ICCV 2023 • Teng Hu, Jiangning Zhang, Liang Liu, Ran Yi, Siqi Kou, Haokun Zhu, Xu Chen, Yabiao Wang, Chengjie Wang, Lizhuang Ma
To address these problems, we propose a novel phasic content fusing few-shot diffusion model with directional distribution consistency loss, which targets different learning objectives at distinct training stages of the diffusion model.
2 code implementations • 7 Sep 2023 • Teng Hu, Ran Yi, Haokun Zhu, Liang Liu, Jinlong Peng, Yabiao Wang, Chengjie Wang, Lizhuang Ma
To solve the problem, we propose Compositional Neural Painter, a novel stroke-based rendering framework which dynamically predicts the next painting region based on the current canvas, instead of dividing the image plane uniformly into painting regions.
1 code implementation • ICCV 2023 • Zhiwei Zhang, Zhizhong Zhang, Qian Yu, Ran Yi, Yuan Xie, Lizhuang Ma
3D panoptic segmentation is a challenging perception task that requires both semantic segmentation and instance segmentation.
1 code implementation • 12 Jul 2023 • Ke Fan, Changan Wang, Yabiao Wang, Chengjie Wang, Ran Yi, Lizhuang Ma
Glass-like objects are widespread in daily life but remain intractable to be segmented for most existing methods.
no code implementations • 18 May 2023 • Bin Fang, Bo Li, Shuang Wu, Tianyi Zheng, Shouhong Ding, Ran Yi, Lizhuang Ma
One of the crucial factors contributing to this success has been the access to an abundance of high-quality data for constructing machine learning models.
no code implementations • 18 May 2023 • Bin Fang, Bo Li, Shuang Wu, Ran Yi, Shouhong Ding, Lizhuang Ma
The unauthorized use of personal data for commercial purposes and the clandestine acquisition of private data for training machine learning models continue to raise concerns.
1 code implementation • CVPR 2023 • Qianyu Zhou, Ke-Yue Zhang, Taiping Yao, Xuequan Lu, Ran Yi, Shouhong Ding, Lizhuang Ma
To address these issues, we propose a novel perspective for DG FAS that aligns features on the instance level without the need for domain labels.
1 code implementation • CVPR 2023 • Ran Yi, Haoyuan Tian, Zhihao Gu, Yu-Kun Lai, Paul L. Rosin
To fill the gap in the field of artistic image aesthetics assessment (AIAA), we first introduce a large-scale AIAA dataset: Boldbrush Artistic Image Dataset (BAID), which consists of 60, 337 artistic images covering various art forms, with more than 360, 000 votes from online users.
2 code implementations • ICCV 2023 • Junshu Tang, Tengfei Wang, Bo Zhang, Ting Zhang, Ran Yi, Lizhuang Ma, Dong Chen
In this work, we investigate the problem of creating high-fidelity 3D content from only a single image.
1 code implementation • CVPR 2023 • Yue Wang, Jinlong Peng, Jiangning Zhang, Ran Yi, Yabiao Wang, Chengjie Wang
2D-based Industrial Anomaly Detection has been widely discussed, however, multimodal industrial anomaly detection based on 3D point clouds and RGB images still has many untouched fields.
Ranked #4 on RGB+3D Anomaly Detection and Segmentation on MVTEC 3D-AD (using extra training data)
Contrastive Learning RGB+3D Anomaly Detection and Segmentation
2 code implementations • ICCV 2023 • Zhihao Gu, Liang Liu, Xu Chen, Ran Yi, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Annan Shu, Guannan Jiang, Lizhuang Ma
Specifically, we first propose a normality recall memory (NR Memory) to strengthen the normality of student-generated features by recalling the stored normal information.
Ranked #14 on Anomaly Detection on MVTec AD
1 code implementation • 7 Oct 2022 • Yiheng Han, Irvin Haozhe Zhan, Long Zeng, Yu-Ping Wang, Ran Yi, MinJing Yu, Matthieu Gaetan Lin, Jenny Sheng, Yong-Jin Liu
In this paper, we propose Point Cloud Completion and Keypoint Refinement with Fusion Data (PCKRF), a new pose refinement pipeline for 6D pose estimation.
no code implementations • 20 Jul 2022 • Qianyu Zhou, Ke-Yue Zhang, Taiping Yao, Ran Yi, Shouhong Ding, Lizhuang Ma
Existing DG-based FAS approaches always capture the domain-invariant features for generalizing on the various unseen domains.
no code implementations • 20 Jul 2022 • Qianyu Zhou, Ke-Yue Zhang, Taiping Yao, Ran Yi, Kekai Sheng, Shouhong Ding, Lizhuang Ma
Most existing UDA FAS methods typically fit the trained models to the target domain via aligning the distribution of semantic high-level features.
no code implementations • 13 Apr 2022 • Zipeng Ye, Zhiyao Sun, Yu-Hui Wen, Yanan sun, Tian Lv, Ran Yi, Yong-Jin Liu
In this paper, we propose a method to generate talking-face videos with continuously controllable expressions in real-time.
1 code implementation • CVPR 2022 • Junshu Tang, Zhijun Gong, Ran Yi, Yuan Xie, Lizhuang Ma
An asymmetric keypoint locator, including an unsupervised multi-scale keypoint detector and a complete keypoint generator, is proposed for localizing aligned keypoints from complete and partial point clouds.
no code implementations • 16 Mar 2022 • Yue Wang, Ran Yi, Luying Li, Ying Tai, Chengjie Wang, Lizhuang Ma
We propose a new encoder which embeds real faces into Z+ space and proposes a dual-path training strategy to better cope with the adapted decoder and eliminate the artifacts.
1 code implementation • 8 Feb 2022 • Ran Yi, Yong-Jin Liu, Yu-Kun Lai, Paul L. Rosin
In this paper, we propose a novel method to automatically transform face photos to portrait drawings using unpaired training data with two new features; i. e., our method can (1) learn to generate high quality portrait drawings in multiple styles using a single network and (2) generate portrait drawings in a "new style" unseen in the training data.
no code implementations • 16 Jan 2022 • Zipeng Ye, Mengfei Xia, Ran Yi, Juyong Zhang, Yu-Kun Lai, Xuwei Huang, Guoxin Zhang, Yong-Jin Liu
In this paper, we present a dynamic convolution kernel (DCK) strategy for convolutional neural networks.
no code implementations • 13 Jan 2022 • Yifeng Chen, Wenqing Chu, Fangfang Wang, Ying Tai, Ran Yi, Zhenye Gan, Liang Yao, Chengjie Wang, Xi Li
Recently, there is growing attention on one-stage panoptic segmentation methods which aim to segment instances and stuff jointly within a fully convolutional pipeline efficiently.
1 code implementation • CVPR 2022 • Shaohua Guo, Liang Liu, Zhenye Gan, Yabiao Wang, Wuhao Zhang, Chengjie Wang, Guannan Jiang, Wei zhang, Ran Yi, Lizhuang Ma, Ke Xu
The huge burden of computation and memory are two obstacles in ultra-high resolution image segmentation.
no code implementations • 28 Dec 2021 • Qiqi Gu, Shen Chen, Taiping Yao, Yang Chen, Shouhong Ding, Ran Yi
The progressive enhancement process facilitates the learning of discriminative features with fine-grained face forgery clues.
1 code implementation • 11 Oct 2021 • Qianyu Zhou, Chuyun Zhuang, Ran Yi, Xuequan Lu, Lizhuang Ma
In this paper, we propose a novel and fully end-to-end trainable approach, called regional contrastive consistency regularization (RCCR) for domain adaptive semantic segmentation.
no code implementations • 1 Mar 2021 • Ran Yi, Yang Zhou, Xin Wang, Zhiyuan Liu, Xiaotian Li, Bin Ran
This paper presents an infrastructure assisted constrained connected automated vehicles (CAVs) trajectory optimization method on curved roads.
no code implementations • 1 Sep 2020 • Paul L. Rosin, Yu-Kun Lai, David Mould, Ran Yi, Itamar Berger, Lars Doyle, Seungyong Lee, Chuan Li, Yong-Jin Liu, Amir Semmo, Ariel Shamir, Minjung Son, Holger Winnemoller
Despite the recent upsurge of activity in image-based non-photorealistic rendering (NPR), and in particular portrait image stylisation, due to the advent of neural style transfer, the state of performance evaluation in this field is limited, especially compared to the norms in the computer vision and machine learning communities.
1 code implementation • CVPR 2020 • Ran Yi, Yong-Jin Liu, Yu-Kun Lai, Paul L. Rosin
We observe that due to the significant imbalance of information richness between photos and drawings, existing unpaired transfer methods such as CycleGAN tends to embed invisible reconstruction information indiscriminately in the whole drawings, leading to important facial features partially missing in drawings.
1 code implementation • 15 Mar 2020 • Zipeng Ye, Mengfei Xia, Yanan sun, Ran Yi, MinJing Yu, Juyong Zhang, Yu-Kun Lai, Yong-Jin Liu
The most challenging issue for our system is that the source domain of face photos (characterized by normal 2D faces) is significantly different from the target domain of 3D caricatures (characterized by 3D exaggerated face shapes and textures).
1 code implementation • 24 Feb 2020 • Ran Yi, Zipeng Ye, Juyong Zhang, Hujun Bao, Yong-Jin Liu
In this paper, we address this problem by proposing a deep neural network model that takes an audio signal A of a source person and a very short video V of a target person as input, and outputs a synthesized high-quality talking face video with personalized head pose (making use of the visual information in V), expression and lip synchronization (by considering both A and V).
no code implementations • 17 Nov 2019 • Yiheng Han, Wang Zhao, Jia Pan, Zipeng Ye, Ran Yi, Yong-Jin Liu
Motion planning for robots of high degrees-of-freedom (DOFs) is an important problem in robotics with sampling-based methods in configuration space C as one popular solution.
6 code implementations • CVPR 2019 • Ran Yi, Yong-Jin Liu, Yu-Kun Lai, Paul L. Rosin
Moreover, artists tend to use different strategies to draw different facial features and the lines drawn are only loosely related to obvious image features.
no code implementations • CVPR 2018 • Ran Yi, Yong-Jin Liu, Yu-Kun Lai
We propose an efficient Lloyd-like method with a splitting-merging scheme to compute a uniform tessellation on M, which induces the CSS in X. Theoretically our method has a good competitive ratio O(1).