Search Results for author: Xun Cao

Found 39 papers, 15 papers with code

MMVP: A Multimodal MoCap Dataset with Vision and Pressure Sensors

no code implementations 26 Mar 2024 He Zhang, Shenghao Ren, Haolei Yuan, Jianhui Zhao, Fan Li, Shuangpeng Sun, Zhenghao Liang, Tao Yu, Qiu Shen, Xun Cao

To validate the dataset, we propose an RGBD-P SMPL fitting method and also a monocular-video-based baseline framework, VP-MoCap, for human motion capture.

Translation

STAG4D: Spatial-Temporal Anchored Generative 4D Gaussians

no code implementations 22 Mar 2024 Yifei Zeng, Yanqin Jiang, Siyu Zhu, Yuanxun Lu, Youtian Lin, Hao Zhu, Weiming Hu, Xun Cao, Yao Yao

Recent progress in pre-trained diffusion models and 3D generation has spurred interest in 4D content creation.

3D Generation

Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance

1 code implementation 21 Mar 2024 Shenhao Zhu, Junming Leo Chen, Zuozhuo Dai, Yinghui Xu, Xun Cao, Yao Yao, Hao Zhu, Siyu Zhu

In this study, we introduce a methodology for human image animation by leveraging a 3D human parametric model within a latent diffusion framework to enhance shape alignment and motion guidance in current human generative techniques.

Animated GIF Generation Image Animation +1

FINER: Flexible spectral-bias tuning in Implicit NEural Representation by Variable-periodic Activation Functions

no code implementations 5 Dec 2023 Zhen Liu, Hao Zhu, Qi Zhang, Jingde Fu, Weibing Deng, Zhan Ma, Yanwen Guo, Xun Cao

Implicit Neural Representation (INR), which utilizes a neural network to map coordinate inputs to corresponding attributes, is causing a revolution in the field of signal processing.

VividTalk: One-Shot Audio-Driven Talking Head Generation Based on 3D Hybrid Prior

no code implementations 4 Dec 2023 Xusen Sun, Longhao Zhang, Hao Zhu, Peng Zhang, Bang Zhang, Xinya Ji, Kangneng Zhou, Daiheng Gao, Liefeng Bo, Xun Cao

Audio-driven talking head generation has drawn much attention in recent years, and many efforts have been made in lip-sync, expressive facial expressions, natural head pose generation, and high video quality.

Talking Head Generation

Direct2.5: Diverse Text-to-3D Generation via Multi-view 2.5D Diffusion

no code implementations 27 Nov 2023 Yuanxun Lu, Jingyang Zhang, Shiwei Li, Tian Fang, David McKinnon, Yanghai Tsin, Long Quan, Xun Cao, Yao Yao

The multi-view 2.5D diffusion directly models the structural distribution of 3D data, while still maintaining the strong generalization ability of the original 2D diffusion model, filling the gap between 2D diffusion-based and direct 3D diffusion-based methods for 3D content generation.

3D Generation Text to 3D

Relightable 3D Gaussian: Real-time Point Cloud Relighting with BRDF Decomposition and Ray Tracing

no code implementations 27 Nov 2023 Jian Gao, Chun Gu, Youtian Lin, Hao Zhu, Xun Cao, Li Zhang, Yao Yao

We present a novel differentiable point-based rendering framework for material and lighting decomposition from multi-view images, enabling editing, ray-tracing, and real-time relighting of the 3D point cloud.

BRDF estimation Lighting Estimation

RefConv: Re-parameterized Refocusing Convolution for Powerful ConvNets

1 code implementation 16 Oct 2023 Zhicheng Cai, Xiaohan Ding, Qiu Shen, Xun Cao

We propose Re-parameterized Refocusing Convolution (RefConv) as a replacement for regular convolutional layers, which is a plug-and-play module to improve the performance without any inference costs.

Image Classification object-detection +2

Tracking Anything in Heart All at Once

no code implementations 4 Oct 2023 Chengkang Shen, Hao Zhu, You Zhou, Yu Liu, Si Yi, Lili Dong, Weipeng Zhao, David J. Brady, Xun Cao, Zhan Ma, Yi Lin

Myocardial motion tracking stands as an essential clinical tool in the prevention and detection of Cardiovascular Diseases (CVDs), the foremost cause of death globally.

Motion Estimation

Aperture Diffraction for Compact Snapshot Spectral Imaging

1 code implementation ICCV 2023 Tao Lv, Hao Ye, Quan Yuan, Zhan Shi, Yibo Wang, Shuming Wang, Xun Cao

We demonstrate a compact, cost-effective snapshot spectral imaging system named Aperture Diffraction Imaging Spectrometer (ADIS), which consists only of an imaging lens with an ultra-thin orthogonal aperture mask and a mosaic filter sensor, requiring no additional physical footprint compared to common RGB cameras.

RHINO: Regularizing the Hash-based Implicit Neural Representation

no code implementations 22 Sep 2023 Hao Zhu, Fengyi Liu, Qi Zhang, Xun Cao, Zhan Ma

This connection ensures a seamless backpropagation of gradients from the network's output back to the input coordinates, thereby enhancing regularization.

Anti-Aliased Neural Implicit Surfaces with Encoding Level of Detail

no code implementations 19 Sep 2023 Yiyu Zhuang, Qi Zhang, Ying Feng, Hao Zhu, Yao Yao, Xiaoyu Li, Yan-Pei Cao, Ying Shan, Xun Cao

Drawing inspiration from voxel-based representations with the level of detail (LoD), we introduce a multi-scale tri-plane-based scene representation that is capable of capturing the LoD of the signed distance function (SDF) and the space radiance.

Surface Reconstruction

AvatarBooth: High-Quality and Customizable 3D Human Avatar Generation

no code implementations 16 Jun 2023 Yifei Zeng, Yuanxun Lu, Xinya Ji, Yao Yao, Hao Zhu, Xun Cao

Unlike previous approaches that can only synthesize avatars based on simple text descriptions, our method enables the creation of personalized avatars from casually captured face or body images, while still supporting text-based model generation and editing.

Text to 3D

High-Fidelity 3D Face Generation from Natural Language Descriptions

1 code implementation CVPR 2023 Menghua Wu, Hao Zhu, Linjia Huang, Yiyu Zhuang, Yuanxun Lu, Xun Cao

Synthesizing high-quality 3D face models from natural language descriptions is very valuable for many applications, including avatar creation, virtual reality, and telepresence.

Descriptive Face Generation +4

NeAI: A Pre-convoluted Representation for Plug-and-Play Neural Ambient Illumination

no code implementations 18 Apr 2023 Yiyu Zhuang, Qi Zhang, Xuan Wang, Hao Zhu, Ying Feng, Xiaoyu Li, Ying Shan, Xun Cao

Recent advances in implicit neural representation have demonstrated the ability to recover detailed geometry and material from multi-view images.

Disorder-invariant Implicit Neural Representation

no code implementations 3 Apr 2023 Hao Zhu, Shaowen Xie, Zhen Liu, Fengyi Liu, Qi Zhang, You Zhou, Yi Lin, Zhan Ma, Xun Cao

However, the expressive power of INR is limited by the spectral bias in the network training.

Attribute Retrieval

EEG Opto-processor: epileptic seizure detection using diffractive photonic computing units

no code implementations 9 Dec 2022 Tao Yan, Maoqi Zhang, Sen Wan, Kaifeng Shang, Haiou Zhang, Xun Cao, Xing Lin, Qionghai Dai

Here, we propose the EEG opto-processor based on diffractive photonic computing units (DPUs) to effectively process the extracranial and intracranial EEG signals and perform epileptic seizure detection.

Brain Computer Interface Edge-computing +2

DINER: Disorder-Invariant Implicit Neural Representation

no code implementations CVPR 2023 Shaowen Xie, Hao Zhu, Zhen Liu, Qi Zhang, You Zhou, Xun Cao, Zhan Ma

Implicit neural representation (INR) characterizes the attributes of a signal as a function of corresponding coordinates which emerges as a sharp weapon for solving inverse problems.

Retrieval

Explore Spatio-temporal Aggregation for Insubstantial Object Detection: Benchmark Dataset and Baseline

1 code implementation CVPR 2022 Kailai Zhou, Yibo Wang, Tao Lv, Yunqian Li, Linsen Chen, Qiu Shen, Xun Cao

We take on a rarely explored task named Insubstantial Object Detection (IOD), which aims to localize objects with the following characteristics: (1) amorphous shape with indistinct boundary; (2) similarity to surroundings; (3) absence in color.

object-detection Object Detection

EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model

no code implementations 30 May 2022 Xinya Ji, Hang Zhou, Kaisiyuan Wang, Qianyi Wu, Wayne Wu, Feng Xu, Xun Cao

Although significant progress has been made in audio-driven talking face generation, existing methods either neglect facial emotion or cannot be applied to arbitrary subjects.

Talking Face Generation

Detailed Facial Geometry Recovery from Multi-View Images by Learning an Implicit Function

1 code implementation 4 Jan 2022 Yunze Xiao, Hao Zhu, Haotian Yang, Zhengyu Diao, Xiangju Lu, Xun Cao

By fitting a 3D morphable model from multi-view images, the features of multiple images are extracted and aggregated in the mesh-attached UV space, which makes the implicit function more effective in recovering detailed facial shape.

MoFaNeRF: Morphable Facial Neural Radiance Field

1 code implementation 4 Dec 2021 Yiyu Zhuang, Hao Zhu, Xusen Sun, Xun Cao

To the best of our knowledge, our work is the first facial parametric model built upon a neural radiance field that can be used in fitting, generation and manipulation.

Image Generation Novel View Synthesis

FaceScape: 3D Facial Dataset and Benchmark for Single-View 3D Face Reconstruction

1 code implementation 1 Nov 2021 Hao Zhu, Haotian Yang, Longwei Guo, Yidi Zhang, Yanru Wang, Mingkai Huang, Menghua Wu, Qiu Shen, Ruigang Yang, Xun Cao

By training on FaceScape data, a novel algorithm is proposed to predict elaborate riggable 3D face models from a single image input.

3D Face Reconstruction 3D Reconstruction

Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation

1 code implementation 22 Sep 2021 Yuanxun Lu, Jinxiang Chai, Xun Cao

The first stage is a deep neural network that extracts deep audio features along with a manifold projection to project the features to the target person's speech space.

Image-to-Image Translation Talking Face Generation +2

Detailed Avatar Recovery from Single Image

no code implementations 6 Aug 2021 Hao Zhu, Xinxin Zuo, Haotian Yang, Sen Wang, Xun Cao, Ruigang Yang

In this paper, we propose a novel learning-based framework that combines the robustness of the parametric model with the flexibility of free-form 3D deformation.

End-to-end Neural Video Coding Using a Compound Spatiotemporal Representation

no code implementations 5 Aug 2021 Haojie Liu, Ming Lu, Zhiqi Chen, Xun Cao, Zhan Ma, Yao Wang

We further design a one-to-many decoder pipeline to generate multiple predictions from the CSTR, including vector-based resampling, adaptive kernel-based resampling, compensation mode selection maps and texture enhancements, and combine them adaptively to achieve more accurate inter prediction.

Motion Compensation MS-SSIM +3

Audio-Driven Emotional Video Portraits

1 code implementation CVPR 2021 Xinya Ji, Hang Zhou, Kaisiyuan Wang, Wayne Wu, Chen Change Loy, Xun Cao, Feng Xu

In this work, we present Emotional Video Portraits (EVP), a system for synthesizing high-quality video portraits with vivid emotional dynamics driven by audio.

Disentanglement Face Generation

Improving Multispectral Pedestrian Detection by Addressing Modality Imbalance Problems

1 code implementation ECCV 2020 Kailai Zhou, Linsen Chen, Xun Cao

Compared with traditional pedestrian detection, we find that multispectral pedestrian detection suffers from modality imbalance problems, which hinder the optimization of the dual-modality network and degrade the performance of the detector.

Computational Efficiency Pedestrian Detection

Neural Video Coding using Multiscale Motion Compensation and Spatiotemporal Context Model

no code implementations 9 Jul 2020 Haojie Liu, Ming Lu, Zhan Ma, Fan Wang, Zhihuang Xie, Xun Cao, Yao Wang

Over the past two decades, traditional block-based video coding has made remarkable progress and spawned a series of well-known standards such as MPEG-4, H.264/AVC and H.265/HEVC.

Motion Compensation MS-SSIM +2

FaceScape: a Large-scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction

1 code implementation CVPR 2020 Haotian Yang, Hao Zhu, Yanru Wang, Mingkai Huang, Qiu Shen, Ruigang Yang, Xun Cao

In this paper, we present a large-scale detailed 3D face dataset, FaceScape, and propose a novel algorithm that is able to predict elaborate riggable 3D face models from a single image input.

Neural Image Compression via Non-Local Attention Optimization and Improved Context Modeling

1 code implementation 11 Oct 2019 Tong Chen, Haojie Liu, Zhan Ma, Qiu Shen, Xun Cao, Yao Wang

This paper proposes a novel Non-Local Attention optimization and Improved Context modeling-based image compression (NLAIC) algorithm, which is built on top of the deep neural network (DNN)-based variational auto-encoder (VAE) structure.

Image Compression MS-SSIM +1

Hyperspectral City V1.0 Dataset and Benchmark

no code implementations 24 Jul 2019 Shaodi You, Erqi Huang, Shuaizhe Liang, Yongrong Zheng, Yunxiang Li, Fan Wang, Sen Lin, Qiu Shen, Xun Cao, Diming Zhang, Yuanjiang Li, Yu Li, Ying Fu, Boxin Shi, Feng Lu, Yinqiang Zheng, Robby T. Tan

This document introduces the background and the usage of the Hyperspectral City Dataset and the benchmark.

Detailed Human Shape Estimation from a Single Image by Hierarchical Mesh Deformation

1 code implementation CVPR 2019 Hao Zhu, Xinxin Zuo, Sen Wang, Xun Cao, Ruigang Yang

This paper presents a novel framework to recover detailed human body shapes from a single image.

Non-local Attention Optimized Deep Image Compression

no code implementations 22 Apr 2019 Haojie Liu, Tong Chen, Peiyao Guo, Qiu Shen, Xun Cao, Yao Wang, Zhan Ma

This paper proposes a novel Non-Local Attention Optimized Deep Image Compression (NLAIC) framework, which is built on top of the popular variational auto-encoder (VAE) structure.

Image Compression MS-SSIM +1

Multispectral Image Intrinsic Decomposition via Subspace Constraint

no code implementations CVPR 2018 Qian Huang, Weixin Zhu, Yang Zhao, Linsen Chen, Yao Wang, Tao Yue, Xun Cao

In this paper, a new Multispectral Image Intrinsic Decomposition model (MIID) is presented to decompose the shading and reflectance from a single multispectral image.

Multispectral Image Intrinsic Decomposition via Low Rank Constraint

no code implementations 24 Feb 2018 Qian Huang, Weixin Zhu, Yang Zhao, Linsen Chen, Yao Wang, Tao Yue, Xun Cao

In this paper, a Low Rank Multispectral Image Intrinsic Decomposition model (LRIID) is presented to decompose the shading and reflectance from a single multispectral image.

Blind Optical Aberration Correction by Exploring Geometric and Visual Priors

no code implementations CVPR 2015 Tao Yue, Jinli Suo, Jue Wang, Xun Cao, Qionghai Dai

Furthermore, by investigating the visual artifacts of aberration degenerated images captured by consumer-level cameras, the non-uniform distribution of sharpness across color channels and the image lattice is exploited as visual priors, resulting in a novel strategy to utilize the guidance from the sharpest channel and local image regions to improve the overall performance and robustness.
