no code implementations • 17 Feb 2025 • Jingnan Gao, Weizhe Liu, Weixuan Sun, Senbo Wang, Xibin Song, Taizhang Shang, Shenzhou Chen, Hongdong Li, Xiaokang Yang, Yichao Yan, Pan Ji
In this paper, we introduce MARS, a novel approach for 3D shape detailization.
no code implementations • 8 Jan 2025 • Weitian Zhang, Sijing Wu, Manwen Liao, Yichao Yan
In this paper, we propose LayerAvatar, the first feed-forward diffusion-based method for generating component-disentangled clothed avatars.
no code implementations • 19 Dec 2024 • Shengqi Liu, Yuhao Cheng, Zhuo Chen, Xingyu Ren, Wenhan Zhu, Lincheng Li, Mengxiao Bi, Xiaokang Yang, Yichao Yan
To learn the sewing pattern distribution in the latent space, we design a two-step training strategy to inject the multi-modal conditions, i.e., body shapes, text prompts, and garment sketches, into a diffusion model, ensuring the generated garments are body-suited and detail-controlled.
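As a rough illustration of this kind of multi-modal conditioning (not the authors' code; all module names and dimensions are assumptions), the three conditions can be projected into a shared space and fed to the denoiser alongside the noisy latent and timestep:

```python
# Minimal sketch: fuse body-shape, text, and sketch embeddings into one
# conditioning vector for a diffusion denoiser. Dimensions are illustrative.
import torch
import torch.nn as nn

class ConditionFusion(nn.Module):
    def __init__(self, body_dim=10, text_dim=768, sketch_dim=512, cond_dim=256):
        super().__init__()
        self.body_proj = nn.Linear(body_dim, cond_dim)
        self.text_proj = nn.Linear(text_dim, cond_dim)
        self.sketch_proj = nn.Linear(sketch_dim, cond_dim)

    def forward(self, body, text, sketch):
        # Sum the three projected modalities into one conditioning vector.
        return self.body_proj(body) + self.text_proj(text) + self.sketch_proj(sketch)

class Denoiser(nn.Module):
    def __init__(self, latent_dim=64, cond_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + cond_dim + 1, 512), nn.SiLU(),
            nn.Linear(512, latent_dim),
        )

    def forward(self, z_t, t, cond):
        # Predict noise from the noisy sewing-pattern latent, timestep, and condition.
        t_emb = t.float().unsqueeze(-1) / 1000.0
        return self.net(torch.cat([z_t, cond, t_emb], dim=-1))
```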
1 code implementation • 17 Oct 2024 • Liang Xu, Shaoyang Hua, Zili Lin, Yifan Liu, Feipeng Ma, Yichao Yan, Xin Jin, Xiaokang Yang, Wenjun Zeng
The ultimate goal of LMM is to serve as a foundation model for versatile motion-related tasks, e.g., human motion generation, with interpretability and generalizability.
no code implementations • 10 Oct 2024 • Sijing Wu, Yunhao Li, Yichao Yan, Huiyu Duan, Ziwei Liu, Guangtao Zhai
To fill this gap, we first construct a large-scale multi-modal 3D facial animation dataset, MMHead, which consists of 49 hours of 3D facial motion sequences, speech audio, and rich hierarchical text annotations.
1 code implementation • 7 Oct 2024 • Feng Tian, Yixuan Li, Yichao Yan, Shanyan Guan, Yanhao Ge, Xiaokang Yang
Conversely, inversion-free methods lack theoretical support for background similarity, as they circumvent the issue of maintaining initial features to achieve efficiency.
no code implementations • 7 Oct 2024 • Zhuo Chen, Yichao Yan, Shengqi Liu, Yuhao Cheng, Weiming Zhao, Lincheng Li, Mengxiao Bi, Xiaokang Yang
Experiments demonstrate the effectiveness and generalization of our Face Clan for various pre-trained GANs.
no code implementations • 2 Oct 2024 • Jingnan Gao, Zhuo Chen, Yichao Yan, Xiaokang Yang
Extensive experiments demonstrate that our method substantially improves the quality of SDF-based methods in both geometry reconstruction and novel-view synthesis.
no code implementations • 17 Jul 2024 • Xintao Lv, Liang Xu, Yichao Yan, Xin Jin, Congsheng Xu, Shuwen Wu, Yifan Liu, Lincheng Li, Mengxiao Bi, Wenjun Zeng, Xiaokang Yang
Thus, we propose HIMO, a large-scale MoCap dataset of full-body humans interacting with multiple objects, containing 3.3K 4D HOI sequences and 4.08M 3D HOI frames.
no code implementations • 8 Jul 2024 • Tengjie Zhu, Zhuo Chen, Jingnan Gao, Yichao Yan, Xiaokang Yang
Inverse rendering methods have achieved remarkable performance in reconstructing high-fidelity 3D objects with disentangled geometries, materials, and environmental light.
no code implementations • 1 Jun 2024 • Xuanchen Li, Yuhao Cheng, Xingyu Ren, Haozhe Jia, Di Xu, Wenhan Zhu, Yichao Yan
To simplify this process, we propose Topo4D, a novel framework for automatic geometry and texture generation, which optimizes densely aligned 4D heads and 8K texture maps directly from calibrated multi-view time-series images.
no code implementations • 29 May 2024 • Weitian Zhang, Yichao Yan, Yunhui Liu, Xingdong Sheng, Xiaokang Yang
Extensive experiments demonstrate that our method achieves superior performance in avatar generation and enables expressive full-body pose control and editing.
no code implementations • 23 Apr 2024 • Jinfan Liu, Yichao Yan, Junjie Li, Weiming Zhao, Pengzhi Chu, Xingdong Sheng, Yunhui Liu, Xiaokang Yang
Video anomaly detection (VAD) is a challenging task aiming to recognize anomalies in video frames, and existing large-scale VAD research primarily focuses on road traffic and human activity scenes.
no code implementations • 22 Apr 2024 • Weili Zeng, Yichao Yan, Qi Zhu, Zhuo Chen, Pengzhi Chu, Weiming Zhao, Xiaokang Yang
Text-to-image (T2I) customization aims to create images that embody specific visual concepts delineated in textual descriptions.
no code implementations • 19 Apr 2024 • Junjie Li, Guanshuo Wang, Fufu Yu, Yichao Yan, Qiong Jia, Shouhong Ding, Xingdong Sheng, Yunhui Liu, Xiaokang Yang
However, such improvement sacrifices the performance under the standard protocol, caused by the inherent conflict between the standard and cloth-changing (CC) protocols.
no code implementations • CVPR 2024 • Xingyu Ren, Jiankang Deng, Yuhao Cheng, Jia Guo, Chao Ma, Yichao Yan, Wenhan Zhu, Xiaokang Yang
We first learn a high-quality prior for facial reflectance.
1 code implementation • CVPR 2024 • Liang Xu, Yizhou Zhou, Yichao Yan, Xin Jin, Wenhan Zhu, Fengyun Rao, Xiaokang Yang, Wenjun Zeng
Humans constantly interact with their surrounding environments.
1 code implementation • 11 Mar 2024 • Weixia Zhang, Chengguang Zhu, Jingnan Gao, Yichao Yan, Guangtao Zhai, Xiaokang Yang
However, performance evaluation research lags behind the development of talking head generation techniques.
no code implementations • CVPR 2024 • Yuhao Cheng, Zhuo Chen, Xingyu Ren, Wenhan Zhu, Zhengqin Xu, Di Xu, Changpeng Yang, Yichao Yan
To address the problem of distortion caused by tri-plane warping, we train a warp-aware encoder to project the warped face onto a standardized latent space.
1 code implementation • CVPR 2024 • Liang Xu, Xintao Lv, Yichao Yan, Xin Jin, Shuwen Wu, Congsheng Xu, Yifan Liu, Yizhou Zhou, Fengyun Rao, Xingdong Sheng, Yunhui Liu, Wenjun Zeng, Xiaokang Yang
We also equip Inter-X with versatile annotations of more than 34K fine-grained human part-level textual descriptions, semantic interaction categories, interaction order, and the relationship and personality of the subjects.
no code implementations • 7 Dec 2023 • Sijing Wu, Yunhao Li, Weitian Zhang, Jun Jia, Yucheng Zhu, Yichao Yan, Guangtao Zhai, Xiaokang Yang
Singing, a common facial movement second only to talking, can be regarded as a universal language across ethnicities and cultures, and plays an important role in emotional communication, art, and entertainment.
no code implementations • 16 Nov 2023 • Jingnan Gao, Zhuo Chen, Yichao Yan, Bowen Pan, Zhe Wang, Jiangjing Lyu, Xiaokang Yang
In our method, we first employ an efficient surface-based model with a multi-view supervision module to ensure accurate mesh reconstruction.
no code implementations • 16 Oct 2023 • Junjie Li, Guanshuo Wang, Yichao Yan, Fufu Yu, Qiong Jia, Jie Qin, Shouhong Ding, Xiaokang Yang
Person search is a challenging task that involves detecting and retrieving individuals from a large set of uncropped scene images.
no code implementations • 26 Sep 2023 • Shengqi Liu, Zhuo Chen, Jingnan Gao, Yichao Yan, Wenhan Zhu, Jiangjing Lyu, Xiaokang Yang
However, the inherent complexity of 3D models and the ambiguity of text descriptions make this task challenging.
no code implementations • 19 Apr 2023 • Zhuo Chen, Xudong Xu, Yichao Yan, Ye Pan, Wenhan Zhu, Wayne Wu, Bo Dai, Xiaokang Yang
While the use of 3D-aware GANs bypasses the requirement for 3D data, we further remove the need for style images by using the CLIP model as the stylization guidance.
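A minimal sketch of CLIP-based stylization guidance under stated assumptions: the style is given by a text prompt, and the loss pushes the generator's rendering toward that prompt in CLIP space. The prompt and the stand-in renderer are illustrative; only the `clip` package calls are real.

```python
# Sketch: text-driven style loss via CLIP image-text similarity.
import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)
tokens = clip.tokenize(["a portrait in the style of an oil painting"]).to(device)
with torch.no_grad():
    text_feat = model.encode_text(tokens)
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)

def clip_style_loss(rendered):
    # `rendered`: (B, 3, 224, 224) images already normalized for CLIP input.
    img_feat = model.encode_image(rendered)
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    # Minimizing this maximizes cosine similarity to the style prompt.
    return 1.0 - (img_feat @ text_feat.T).mean()
```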
no code implementations • CVPR 2023 • Sijing Wu, Yichao Yan, Yunhao Li, Yuhao Cheng, Wenhan Zhu, Ke Gao, Xiaobo Li, Guangtao Zhai
To bring digital avatars into people's lives, there is a high demand for efficiently generating complete, realistic, and animatable head avatars.
no code implementations • 28 Mar 2023 • Yuhao Cheng, Yichao Yan, Wenhan Zhu, Ye Pan, Bowen Pan, Xiaokang Yang
Head generation with diverse identities is an important task in computer vision and computer graphics, widely used in multimedia applications.
no code implementations • CVPR 2023 • Yixuan Li, Chao Ma, Yichao Yan, Wenhan Zhu, Xiaokang Yang
To achieve this, we take advantage of the strong geometry and texture prior of 3D human faces, where the 2D faces are projected into the latent space of a 3D generative model.
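One common way to realize such a projection is optimization-based inversion; the sketch below is a generic version under assumed interfaces (`generator.render` and `generator.mean_latent` are hypothetical), not the paper's implementation:

```python
# Sketch: project a 2D face into a pretrained 3D generator's latent space
# by directly optimizing the latent code against a reconstruction loss.
import torch
import torch.nn.functional as F

def invert_face(target, generator, cam, steps=500, lr=0.01):
    w = generator.mean_latent().clone().requires_grad_(True)  # hypothetical helper
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        # generator.render(w, cam) is an assumed interface: latent + camera -> image.
        loss = F.mse_loss(generator.render(w, cam), target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return w.detach()  # latent code whose rendering matches the 2D face
```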
no code implementations • CVPR 2023 • Xingyu Ren, Jiankang Deng, Chao Ma, Yichao Yan, Xiaokang Yang
Our key insight is that intrinsic semantic attributes such as race, skin color, and age can constrain the albedo map.
2 code implementations • 25 Jul 2022 • Junjie Li, Yichao Yan, Guanshuo Wang, Fufu Yu, Qiong Jia, Shouhong Ding
In this paper, we take a further step and present Domain Adaptive Person Search (DAPS), which aims to generalize the model from a labeled source domain to the unlabeled target domain.
no code implementations • ICCV 2023 • Liang Xu, Ziyang Song, Dongliang Wang, Jing Su, Zhicheng Fang, Chenjing Ding, Weihao Gan, Yichao Yan, Xin Jin, Xiaokang Yang, Wenjun Zeng, Wei Wu
We present a GAN-based Transformer for general action-conditioned 3D human motion generation, including not only single-person actions but also multi-person interactive actions.
no code implementations • 15 Mar 2022 • Yichao Yan, Zanwei Zhou, Zi Wang, Jingnan Gao, Xiaokang Yang
In this paper, we propose a novel unified framework based on neural radiance field (NeRF) to address this task.
1 code implementation • 6 Feb 2022 • Yuan Tian, Guo Lu, Yichao Yan, Guangtao Zhai, Li Chen, Zhiyong Gao
The framework is optimized by ensuring that a transportation-efficient semantic representation of the video is preserved w.r.t.
no code implementations • 3 Jan 2022 • Shunyu Yao, RuiZhe Zhong, Yichao Yan, Guangtao Zhai, Xiaokang Yang
Specifically, the neural radiance field takes lip movement features and personalized attributes as two disentangled conditions, where lip movements are directly predicted from the audio inputs to achieve lip-synchronized generation.
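A minimal sketch of a NeRF-style MLP with the two disentangled conditions described above; the dimensions and broadcasting scheme are assumptions for illustration:

```python
# Sketch: conditional NeRF MLP taking encoded 3D points plus two
# per-frame condition vectors (lip motion from audio, personal attributes).
import torch
import torch.nn as nn

class ConditionalNeRF(nn.Module):
    def __init__(self, pos_dim=63, lip_dim=64, attr_dim=32, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(pos_dim + lip_dim + attr_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # RGB + density per sampled point
        )

    def forward(self, x, lip, attr):
        # x: (N, pos_dim) encoded points; lip/attr: (1, dim) per-frame conditions.
        lip = lip.expand(x.shape[0], -1)
        attr = attr.expand(x.shape[0], -1)
        return self.mlp(torch.cat([x, lip, attr], dim=-1))
```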
1 code implementation • 5 Dec 2021 • Jie Qin, Peng Zheng, Yichao Yan, Rong Quan, Xiaogang Cheng, Bingbing Ni
Person search aims to jointly localize and identify a query person from natural, uncropped images, which has been actively studied over the past few years.
Ranked #3 on Person Search on CUHK-SYSU
no code implementations • 29 Nov 2021 • Yichao Yan, Junjie Li, Shengcai Liao, Jie Qin, Bingbing Ni, Xiaokang Yang
In the meantime, we design an adaptive BN layer in the domain-invariant stream, to approximate the statistics of various unseen domains.
Domain Generalization • Generalizable Person Re-identification • +1
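One plausible reading of such an adaptive BN layer, sketched under assumptions (the blend ratio and the test-time recipe are illustrative, not the paper's exact design):

```python
# Sketch: a BatchNorm variant that, at test time, blends stored
# source-domain statistics with statistics of the current unseen-domain batch.
import torch
import torch.nn as nn

class AdaptiveBN2d(nn.BatchNorm2d):
    def __init__(self, num_features, blend=0.5):
        super().__init__(num_features)
        self.blend = blend  # assumed mixing ratio between source and test stats

    def forward(self, x):
        if self.training:
            return super().forward(x)
        mean = x.mean(dim=(0, 2, 3))
        var = x.var(dim=(0, 2, 3), unbiased=False)
        mean = self.blend * self.running_mean + (1 - self.blend) * mean
        var = self.blend * self.running_var + (1 - self.blend) * var
        x = (x - mean[None, :, None, None]) / torch.sqrt(var[None, :, None, None] + self.eps)
        return x * self.weight[None, :, None, None] + self.bias[None, :, None, None]
```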
4 code implementations • 1 Sep 2021 • Yichao Yan, Jinpeng Li, Jie Qin, Shengcai Liao, Xiaokang Yang
Third, by investigating the advantages of both anchor-based and anchor-free models, we further augment AlignPS with an ROI-Align head, which significantly improves the robustness of re-id features while still keeping our model highly efficient.
Ranked #4 on Person Search on PRW
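For reference, `torchvision.ops.roi_align` is the standard building block behind an ROI-Align head; the sketch below shows how pooled re-id features could be extracted from a detector feature map (shapes and pooling size are illustrative):

```python
# Sketch: pool per-person re-id features from a backbone feature map.
import torch
from torchvision.ops import roi_align

features = torch.randn(1, 256, 64, 64)               # backbone feature map
boxes = torch.tensor([[0, 10.0, 12.0, 40.0, 60.0]])  # (batch_idx, x1, y1, x2, y2)
# spatial_scale maps box coordinates from image space (e.g., 512x512)
# down to feature-map space (64x64).
roi_feats = roi_align(features, boxes, output_size=(14, 14),
                      spatial_scale=64 / 512, aligned=True)
reid_embedding = roi_feats.mean(dim=(2, 3))          # (1, 256) pooled descriptor
```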
1 code implementation • 22 Jul 2021 • Yuan Tian, Yichao Yan, Guangtao Zhai, Guodong Guo, Zhiyong Gao
In this paper, we propose a unified action recognition framework to investigate the dynamic nature of video content by introducing the following designs.
Ranked #15 on Action Recognition on Something-Something V1
no code implementations • 10 Jul 2021 • Jinpeng Li, Yichao Yan, Shengcai Liao, Xiaokang Yang, Ling Shao
Transformers have demonstrated great potential in computer vision tasks.
3 code implementations • 19 Jun 2021 • Yichao Yan, Jinpeng Li, Shengcai Liao, Jie Qin, Bingbing Ni, Xiaokang Yang, Ling Shao
This paper makes the first attempt to address weakly supervised person search with only bounding box annotations.
1 code implementation • CVPR 2020 • Yichao Yan, Jie Qin, Jiaxin Chen, Li Liu, Fan Zhu, Ying Tai, Ling Shao
In each hypergraph, different temporal granularities are captured by hyperedges that connect a set of graph nodes (i.e., part-based features) across different temporal ranges.
Ranked #6 on Person Re-Identification on iLIDS-VID
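A toy sketch of one way to form such temporal hyperedges over part-based node features; the grouping scheme (non-overlapping spans of 2, 4, and 8 frames) is an assumption for clarity, not the paper's exact construction:

```python
# Sketch: build hyperedges that join the same body part's features
# across temporal ranges of different lengths (one granularity per span).
import torch

def temporal_hyperedges(part_feats, spans=(2, 4, 8)):
    # part_feats: (T, P, D) = frames x body parts x feature dim
    T, P, _ = part_feats.shape
    edges = []
    for s in spans:
        for start in range(0, T - s + 1, s):
            for p in range(P):
                # One hyperedge joins part p's nodes across s consecutive frames.
                edges.append(part_feats[start:start + s, p])
    return edges  # each entry is the node set of one hyperedge

feats = torch.randn(8, 4, 256)            # 8 frames, 4 parts, 256-d features
print(len(temporal_hyperedges(feats)))    # 16 + 8 + 4 = 28 hyperedges
```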
1 code implementation • 29 Apr 2021 • Yichao Yan, Jie Qin, Bingbing Ni, Jiaxin Chen, Li Liu, Fan Zhu, Wei-Shi Zheng, Xiaokang Yang, Ling Shao
Extensive experiments on the novel dataset as well as three existing datasets clearly demonstrate the effectiveness of the proposed framework for both group-based re-id tasks.
1 code implementation • CVPR 2021 • Yichao Yan, Jinpeng Li, Jie Qin, Song Bai, Shengcai Liao, Li Liu, Fan Zhu, Ling Shao
Person search aims to simultaneously localize and identify a query person from realistic, uncropped images, which can be regarded as the unified task of pedestrian detection and person re-identification (re-id).
Ranked #10 on Person Search on CUHK-SYSU
no code implementations • CVPR 2019 • Yichao Yan, Qiang Zhang, Bingbing Ni, Wendong Zhang, Minghao Xu, Xiaokang Yang
Person re-identification has achieved great progress with deep convolutional neural networks.
no code implementations • CVPR 2018 • Jinxian Liu, Bingbing Ni, Yichao Yan, Peng Zhou, Shuo Cheng, Jianguo Hu
On the other hand, in addition to the conventional discriminator of GAN (i.e., to distinguish between REAL/FAKE samples), we propose a novel guider sub-network which encourages the generated sample (i.e., with novel pose) towards better satisfying the ReID loss (i.e., cross-entropy ReID loss, triplet ReID loss).
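The two ReID losses named here are standard; a minimal sketch with stock PyTorch losses (shapes and the 751-ID classifier are illustrative; how the guider sub-network wires them in is the paper's design):

```python
# Sketch: the cross-entropy and triplet ReID losses as stock PyTorch calls.
import torch
import torch.nn as nn

ce_loss = nn.CrossEntropyLoss()                  # cross-entropy ReID loss over IDs
triplet_loss = nn.TripletMarginLoss(margin=0.3)  # triplet ReID loss

feats = torch.randn(8, 128)           # embeddings of generated samples
logits = torch.randn(8, 751)          # ID classifier outputs (751 IDs is illustrative)
labels = torch.randint(0, 751, (8,))
anchor, positive, negative = feats[:2], feats[2:4], feats[4:6]

loss = ce_loss(logits, labels) + triplet_loss(anchor, positive, negative)
```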
no code implementations • 4 Jul 2017 • Yichao Yan, Jingwei Xu, Bingbing Ni, Xiaokang Yang
This work makes the first attempt to generate an articulated human motion sequence from a single image.
Ranked #2 on Gesture-to-Gesture Translation on NTU Hand Digit
no code implementations • 10 Jun 2017 • Donghao Luo, Bingbing Ni, Yichao Yan, Xiaokang Yang
Towards this end, we propose a novel loopy recurrent neural network (Loopy RNN), which is capable of aggregating relationship information of two input images in a progressive/iterative manner and outputting the consolidated matching score in the final iteration.
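A minimal sketch of the iterative idea with a GRU cell: the pair representation is refined over a fixed number of loops, and the matching score is read out at the final iteration (architecture details are assumptions):

```python
# Sketch: recurrent cell that progressively aggregates a pair of image
# features and outputs a consolidated matching score at the last loop.
import torch
import torch.nn as nn

class IterativeMatcher(nn.Module):
    def __init__(self, feat_dim=512, hidden=256, iters=3):
        super().__init__()
        self.hidden, self.iters = hidden, iters
        self.cell = nn.GRUCell(feat_dim * 2, hidden)
        self.score = nn.Linear(hidden, 1)

    def forward(self, feat_a, feat_b):
        pair = torch.cat([feat_a, feat_b], dim=-1)
        h = pair.new_zeros(pair.shape[0], self.hidden)
        for _ in range(self.iters):
            h = self.cell(pair, h)  # progressively refine the pair representation
        return torch.sigmoid(self.score(h))  # matching score from the final iteration
```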
no code implementations • 1 Jun 2017 • Wendong Zhang, Bingbing Ni, Yichao Yan, Jingwei Xu, Xiaokang Yang
Key to automatically generating natural scene images is the proper arrangement of various spatial elements, especially in the depth direction.
no code implementations • 26 May 2017 • Yichao Yan, Bingbing Ni, Xiaokang Yang
Predicting human interaction is challenging as the ongoing activity has to be inferred from a partially observed video.
1 code implementation • 23 Jan 2017 • Yichao Yan, Bingbing Ni, Zhichao Song, Chao Ma, Yan Yan, Xiaokang Yang
We address the person re-identification problem by effectively exploiting a globally discriminative feature representation from a sequence of tracked human regions/patches.