Search Results for author: Yichao Yan

Found 50 papers, 15 papers with code

Disentangled Clothed Avatar Generation with Layered Representation

no code implementations 8 Jan 2025 Weitian Zhang, Sijing Wu, Manwen Liao, Yichao Yan

In this paper, we propose LayerAvatar, the first feed-forward diffusion-based method for generating component-disentangled clothed avatars.

Multimodal Latent Diffusion Model for Complex Sewing Pattern Generation

no code implementations 19 Dec 2024 Shengqi Liu, Yuhao Cheng, Zhuo Chen, Xingyu Ren, Wenhan Zhu, Lincheng Li, Mengxiao Bi, Xiaokang Yang, Yichao Yan

To learn the sewing pattern distribution in the latent space, we design a two-step training strategy to inject the multi-modal conditions, i.e., body shapes, text prompts, and garment sketches, into a diffusion model, ensuring the generated garments are body-suited and detail-controlled.

MotionBank: A Large-scale Video Motion Benchmark with Disentangled Rule-based Annotations

1 code implementation 17 Oct 2024 Liang Xu, Shaoyang Hua, Zili Lin, Yifan Liu, Feipeng Ma, Yichao Yan, Xin Jin, Xiaokang Yang, Wenjun Zeng

The ultimate goal of LMM is to serve as a foundation model for versatile motion-related tasks, e.g., human motion generation, with interpretability and generalizability.

Caption Generation Motion Generation

MMHead: Towards Fine-grained Multi-modal 3D Facial Animation

no code implementations 10 Oct 2024 Sijing Wu, Yunhao Li, Yichao Yan, Huiyu Duan, Ziwei Liu, Guangtao Zhai

To fill this gap, we first construct a large-scale multi-modal 3D facial animation dataset, MMHead, which consists of 49 hours of 3D facial motion sequences, speech audios, and rich hierarchical text annotations.

Motion Generation text annotation +1

PostEdit: Posterior Sampling for Efficient Zero-Shot Image Editing

1 code implementation 7 Oct 2024 Feng Tian, Yixuan Li, Yichao Yan, Shanyan Guan, Yanhao Ge, Xiaokang Yang

Conversely, inversion-free methods lack theoretical support for background similarity, as they circumvent the issue of maintaining initial features to achieve efficiency.

Revealing Directions for Text-guided 3D Face Editing

no code implementations 7 Oct 2024 Zhuo Chen, Yichao Yan, Shengqi Liu, Yuhao Cheng, Weiming Zhao, Lincheng Li, Mengxiao Bi, Xiaokang Yang

Experiments demonstrate the effectiveness and generalization of our Face Clan for various pre-trained GANs.

Attribute Denoising

AniSDF: Fused-Granularity Neural Surfaces with Anisotropic Encoding for High-Fidelity 3D Reconstruction

no code implementations 2 Oct 2024 Jingnan Gao, Zhuo Chen, Yichao Yan, Xiaokang Yang

Extensive experiments demonstrate that our method boosts the quality of SDF-based methods by a great scale in both geometry reconstruction and novel-view synthesis.

3D Reconstruction Novel View Synthesis

HIMO: A New Benchmark for Full-Body Human Interacting with Multiple Objects

no code implementations 17 Jul 2024 Xintao Lv, Liang Xu, Yichao Yan, Xin Jin, Congsheng Xu, Shuwen Wu, Yifan Liu, Lincheng Li, Mengxiao Bi, Wenjun Zeng, Xiaokang Yang

Thus, we propose HIMO, a large-scale MoCap dataset of full-body human interacting with multiple objects, containing 3.3K 4D HOI sequences and 4.08M 3D HOI frames.

Benchmarking Human-Object Interaction Detection +1

Multi-times Monte Carlo Rendering for Inter-reflection Reconstruction

no code implementations 8 Jul 2024 Tengjie Zhu, Zhuo Chen, Jingnan Gao, Yichao Yan, Xiaokang Yang

Inverse rendering methods have achieved remarkable performance in reconstructing high-fidelity 3D objects with disentangled geometries, materials, and environmental light.

Disentanglement Inverse Rendering +1

Topo4D: Topology-Preserving Gaussian Splatting for High-Fidelity 4D Head Capture

no code implementations 1 Jun 2024 Xuanchen Li, Yuhao Cheng, Xingyu Ren, Haozhe Jia, Di Xu, Wenhan Zhu, Yichao Yan

To simplify this process, we propose Topo4D, a novel framework for automatic geometry and texture generation, which optimizes densely aligned 4D heads and 8K texture maps directly from calibrated multi-view time-series images.

8k Face Reconstruction +2

$E^{3}$Gen: Efficient, Expressive and Editable Avatars Generation

no code implementations 29 May 2024 Weitian Zhang, Yichao Yan, Yunhui Liu, Xingdong Sheng, Xiaokang Yang

Extensive experiments demonstrate that our method achieves superior performance in avatar generation and enables expressive full-body pose control and editing.

IPAD: Industrial Process Anomaly Detection Dataset

no code implementations 23 Apr 2024 Jinfan Liu, Yichao Yan, Junjie Li, Weiming Zhao, Pengzhi Chu, Xingdong Sheng, Yunhui Liu, Xiaokang Yang

Video anomaly detection (VAD) is a challenging task aiming to recognize anomalies in video frames, and existing large-scale VAD research primarily focuses on road traffic and human activity scenes.

Anomaly Detection Video Anomaly Detection +1

Infusion: Preventing Customized Text-to-Image Diffusion from Overfitting

no code implementations 22 Apr 2024 Weili Zeng, Yichao Yan, Qi Zhu, Zhuo Chen, Pengzhi Chu, Weiming Zhao, Xiaokang Yang

Text-to-image (T2I) customization aims to create images that embody specific visual concepts delineated in textual descriptions.

Rethinking Clothes Changing Person ReID: Conflicts, Synthesis, and Optimization

no code implementations 19 Apr 2024 Junjie Li, Guanshuo Wang, Fufu Yu, Yichao Yan, Qiong Jia, Shouhong Ding, Xingdong Sheng, Yunhui Liu, Xiaokang Yang

However, such improvement sacrifices the performance under the standard protocol, caused by the inner conflict between standard and CC.

Clothes Changing Person Re-Identification

3D-Aware Face Editing via Warping-Guided Latent Direction Learning

no code implementations CVPR 2024 Yuhao Cheng, Zhuo Chen, Xingyu Ren, Wenhan Zhu, Zhengqin Xu, Di Xu, Changpeng Yang, Yichao Yan

To address the problem of distortion caused by tri-plane warping, we train a warp-aware encoder to project the warped face onto a standardized latent space.

Attribute Facial Editing

Inter-X: Towards Versatile Human-Human Interaction Analysis

1 code implementation CVPR 2024 Liang Xu, Xintao Lv, Yichao Yan, Xin Jin, Shuwen Wu, Congsheng Xu, Yifan Liu, Yizhou Zhou, Fengyun Rao, Xingdong Sheng, Yunhui Liu, Wenjun Zeng, Xiaokang Yang

We also equip Inter-X with versatile annotations of more than 34K fine-grained human part-level textual descriptions, semantic interaction categories, interaction order, and the relationship and personality of the subjects.

Motion Synthesis

SingingHead: A Large-scale 4D Dataset for Singing Head Animation

no code implementations 7 Dec 2023 Sijing Wu, Yunhao Li, Weitian Zhang, Jun Jia, Yucheng Zhu, Yichao Yan, Guangtao Zhai, Xiaokang Yang

Singing, as a common facial movement second only to talking, can be regarded as a universal language across ethnicities and cultures, and plays an important role in emotional communication, art, and entertainment.

Portrait Animation Rhythm

EvaSurf: Efficient View-Aware Implicit Textured Surface Reconstruction

no code implementations 16 Nov 2023 Jingnan Gao, Zhuo Chen, Yichao Yan, Bowen Pan, Zhe Wang, Jiangjing Lyu, Xiaokang Yang

In our method, we first employ an efficient surface-based model with a multi-view supervision module to ensure accurate mesh reconstruction.

3D Reconstruction NeRF +1

Generalizable Person Search on Open-world User-Generated Video Content

no code implementations 16 Oct 2023 Junjie Li, Guanshuo Wang, Yichao Yan, Fufu Yu, Qiong Jia, Jie Qin, Shouhong Ding, Xiaokang Yang

Person search is a challenging task that involves detecting and retrieving individuals from a large set of un-cropped scene images.

Domain Generalization Person Search

Directional Texture Editing for 3D Models

no code implementations 26 Sep 2023 Shengqi Liu, Zhuo Chen, Jingnan Gao, Yichao Yan, Wenhan Zhu, Jiangjing Lyu, Xiaokang Yang

However, the inherent complexity of 3D models and the ambiguous text description lead to the challenge in this task.

3D Object Editing

HyperStyle3D: Text-Guided 3D Portrait Stylization via Hypernetworks

no code implementations 19 Apr 2023 Zhuo Chen, Xudong Xu, Yichao Yan, Ye Pan, Wenhan Zhu, Wayne Wu, Bo Dai, Xiaokang Yang

While the use of 3D-aware GANs bypasses the requirement of 3D data, we further alleviate the necessity of style images with the CLIP model being the stylization guidance.

Attribute

GANHead: Towards Generative Animatable Neural Head Avatars

no code implementations CVPR 2023 Sijing Wu, Yichao Yan, Yunhao Li, Yuhao Cheng, Wenhan Zhu, Ke Gao, Xiaobo Li, Guangtao Zhai

To bring digital avatars into people's lives, it is highly demanded to efficiently generate complete, realistic, and animatable head avatars.

Head3D: Complete 3D Head Generation via Tri-plane Feature Distillation

no code implementations 28 Mar 2023 Yuhao Cheng, Yichao Yan, Wenhan Zhu, Ye Pan, Bowen Pan, Xiaokang Yang

Head generation with diverse identities is an important task in computer vision and computer graphics, widely used in multimedia applications.

3D-Aware Face Swapping

no code implementations CVPR 2023 Yixuan Li, Chao Ma, Yichao Yan, Wenhan Zhu, Xiaokang Yang

To achieve this, we take advantage of the strong geometry and texture prior of 3D human faces, where the 2D faces are projected into the latent space of a 3D generative model.

Attribute Face Swapping

Domain Adaptive Person Search

2 code implementations 25 Jul 2022 Junjie Li, Yichao Yan, Guanshuo Wang, Fufu Yu, Qiong Jia, Shouhong Ding

In this paper, we take a further step and present Domain Adaptive Person Search (DAPS), which aims to generalize the model from a labeled source domain to the unlabeled target domain.

Pedestrian Detection Person Re-Identification +1

ActFormer: A GAN-based Transformer towards General Action-Conditioned 3D Human Motion Generation

no code implementations ICCV 2023 Liang Xu, Ziyang Song, Dongliang Wang, Jing Su, Zhicheng Fang, Chenjing Ding, Weihao Gan, Yichao Yan, Xin Jin, Xiaokang Yang, Wenjun Zeng, Wei Wu

We present a GAN-based Transformer for general action-conditioned 3D human motion generation, including not only single-person actions but also multi-person interactive actions.

Motion Generation

DialogueNeRF: Towards Realistic Avatar Face-to-Face Conversation Video Generation

no code implementations 15 Mar 2022 Yichao Yan, Zanwei Zhou, Zi Wang, Jingnan Gao, Xiaokang Yang

In this paper, we propose a novel unified framework based on neural radiance field (NeRF) to address this task.

NeRF Talking Head Generation +1

A Coding Framework and Benchmark towards Low-Bitrate Video Understanding

1 code implementation 6 Feb 2022 Yuan Tian, Guo Lu, Yichao Yan, Guangtao Zhai, Li Chen, Zhiyong Gao

The framework is optimized by ensuring that a transportation-efficient semantic representation of the video is preserved w.r.t.

Video Compression Video Understanding

DFA-NeRF: Personalized Talking Head Generation via Disentangled Face Attributes Neural Rendering

no code implementations 3 Jan 2022 Shunyu Yao, RuiZhe Zhong, Yichao Yan, Guangtao Zhai, Xiaokang Yang

Specifically, neural radiance field takes lip movements features and personalized attributes as two disentangled conditions, where lip movements are directly predicted from the audio inputs to achieve lip-synchronized generation.

NeRF Neural Rendering +1

MovieNet-PS: A Large-Scale Person Search Dataset in the Wild

1 code implementation 5 Dec 2021 Jie Qin, Peng Zheng, Yichao Yan, Rong Quan, Xiaogang Cheng, Bingbing Ni

Person search aims to jointly localize and identify a query person from natural, uncropped images, which has been actively studied over the past few years.

Person Search

TAL: Two-stream Adaptive Learning for Generalizable Person Re-identification

no code implementations 29 Nov 2021 Yichao Yan, Junjie Li, Shengcai Liao, Jie Qin, Bingbing Ni, Xiaokang Yang

In the meantime, we design an adaptive BN layer in the domain-invariant stream, to approximate the statistics of various unseen domains.

Domain Generalization Generalizable Person Re-identification +1

Efficient Person Search: An Anchor-Free Approach

4 code implementations 1 Sep 2021 Yichao Yan, Jinpeng Li, Jie Qin, Shengcai Liao, Xiaokang Yang

Third, by investigating the advantages of both anchor-based and anchor-free models, we further augment AlignPS with an ROI-Align head, which significantly improves the robustness of re-id features while still keeping our model highly efficient.

Person Search

EAN: Event Adaptive Network for Enhanced Action Recognition

1 code implementation 22 Jul 2021 Yuan Tian, Yichao Yan, Guangtao Zhai, Guodong Guo, Zhiyong Gao

In this paper, we propose a unified action recognition framework to investigate the dynamic nature of video content by introducing the following designs.

Action Recognition

Learning Multi-Granular Hypergraphs for Video-Based Person Re-Identification

1 code implementation CVPR 2020 Yichao Yan, Jie Qin, Jiaxin Chen, Li Liu, Fan Zhu, Ying Tai, Ling Shao

In each hypergraph, different temporal granularities are captured by hyperedges that connect a set of graph nodes (i.e., part-based features) across different temporal ranges.

Video-Based Person Re-Identification

Learning Multi-Attention Context Graph for Group-Based Re-Identification

1 code implementation 29 Apr 2021 Yichao Yan, Jie Qin, Bingbing Ni, Jiaxin Chen, Li Liu, Fan Zhu, Wei-Shi Zheng, Xiaokang Yang, Ling Shao

Extensive experiments on the novel dataset as well as three existing datasets clearly demonstrate the effectiveness of the proposed framework for both group-based re-id tasks.

Person Re-Identification

Anchor-Free Person Search

1 code implementation CVPR 2021 Yichao Yan, Jinpeng Li, Jie Qin, Song Bai, Shengcai Liao, Li Liu, Fan Zhu, Ling Shao

Person search aims to simultaneously localize and identify a query person from realistic, uncropped images, which can be regarded as the unified task of pedestrian detection and person re-identification (re-id).

Pedestrian Detection Person Re-Identification +1

Pose Transferrable Person Re-Identification

no code implementations CVPR 2018 Jinxian Liu, Bingbing Ni, Yichao Yan, Peng Zhou, Shuo Cheng, Jianguo Hu

On the other hand, in addition to the conventional discriminator of GAN (i.e., to distinguish between REAL/FAKE samples), we propose a novel guider sub-network which encourages the generated sample (i.e., with novel pose) towards better satisfying the ReID loss (i.e., cross-entropy ReID loss, triplet ReID loss).

Person Re-Identification Triplet

Image Matching via Loopy RNN

no code implementations 10 Jun 2017 Donghao Luo, Bingbing Ni, Yichao Yan, Xiaokang Yang

Towards this end, we propose a novel loopy recurrent neural network (Loopy RNN), which is capable of aggregating relationship information of two input images in a progressive/iterative manner and outputting the consolidated matching score in the final iteration.

Depth Structure Preserving Scene Image Generation

no code implementations 1 Jun 2017 Wendong Zhang, Bingbing Ni, Yichao Yan, Jingwei Xu, Xiaokang Yang

The key to automatically generating natural scene images is to properly arrange the various spatial elements, especially in the depth direction.

Image Generation Scene Generation

Predicting Human Interaction via Relative Attention Model

no code implementations 26 May 2017 Yichao Yan, Bingbing Ni, Xiaokang Yang

Predicting human interaction is challenging as the on-going activity has to be inferred based on a partially observed video.

model

Person Re-Identification via Recurrent Feature Aggregation

1 code implementation 23 Jan 2017 Yichao Yan, Bingbing Ni, Zhichao Song, Chao Ma, Yan Yan, Xiaokang Yang

We address the person re-identification problem by effectively exploiting a globally discriminative feature representation from a sequence of tracked human regions/patches.

Patch Matching Person Re-Identification
