Search Results for author: Chongyang Ma

Found 44 papers, 21 papers with code

InterFusion: Text-Driven Generation of 3D Human-Object Interaction

no code implementations 22 Mar 2024 Sisi Dai, Wenhao Li, Haowen Sun, Haibin Huang, Chongyang Ma, Hui Huang, Kai Xu, Ruizhen Hu

In this study, we tackle the complex task of generating 3D human-object interactions (HOI) from textual descriptions in a zero-shot text-to-3D manner.

3D Generation Human-Object Interaction Detection +2

VRMM: A Volumetric Relightable Morphable Head Model

no code implementations 6 Feb 2024 Haotian Yang, Mingwu Zheng, Chongyang Ma, Yu-Kun Lai, Pengfei Wan, Haibin Huang

In this paper, we introduce the Volumetric Relightable Morphable Model (VRMM), a novel volumetric and parametric facial prior for 3D face modeling.

3D Face Reconstruction Self-Supervised Learning

Direct-a-Video: Customized Video Generation with User-Directed Camera Movement and Object Motion

no code implementations 5 Feb 2024 Shiyuan Yang, Liang Hou, Haibin Huang, Chongyang Ma, Pengfei Wan, Di Zhang, Xiaodong Chen, Jing Liao

In practice, users often desire the ability to control object motion and camera movement independently for customized video creation.

Object Video Generation

CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion

1 code implementation 25 Jan 2024 Nisha Huang, WeiMing Dong, Yuxin Zhang, Fan Tang, Ronghui Li, Chongyang Ma, Xiu Li, Changsheng Xu

Large-scale text-to-image generative models have made impressive strides, showcasing their ability to synthesize a vast array of high-quality images.

Image Generation Style Transfer

I2V-Adapter: A General Image-to-Video Adapter for Diffusion Models

no code implementations 27 Dec 2023 Xun Guo, Mingwu Zheng, Liang Hou, Yuan Gao, Yufan Deng, Pengfei Wan, Di Zhang, Yufan Liu, Weiming Hu, ZhengJun Zha, Haibin Huang, Chongyang Ma

I2V-Adapter adeptly propagates the unnoised input image to subsequent noised frames through a cross-frame attention mechanism, maintaining the identity of the input image without any changes to the pretrained T2V model.

Video Generation
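
As a rough illustration of the mechanism described in the snippet above, here is a minimal PyTorch sketch (not the paper's implementation) of cross-frame attention in which every frame draws its keys and values from the first, unnoised frame, so the input image's identity propagates to the noised frames:

```python
import torch
import torch.nn.functional as F

def cross_frame_attention(frames):
    """frames: (T, N, C) = frames, spatial tokens, channels.
    Queries come from each frame; keys/values come from frame 0 only,
    so later (noised) frames stay anchored to the input image."""
    q = frames
    kv = frames[:1].expand_as(frames)   # frame 0, repeated for all T frames
    return F.scaled_dot_product_attention(q, kv, kv)

x = torch.randn(16, 64, 320)            # 16 frames of 64 tokens each
print(cross_frame_attention(x).shape)   # torch.Size([16, 64, 320])
```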

MotionCrafter: One-Shot Motion Customization of Diffusion Models

1 code implementation 8 Dec 2023 Yuxin Zhang, Fan Tang, Nisha Huang, Haibin Huang, Chongyang Ma, WeiMing Dong, Changsheng Xu

The essence of a video lies in its dynamic motions, including character actions, object movements, and camera movements.

Disentanglement Motion Disentanglement +3

Agents meet OKR: An Object and Key Results Driven Agent System with Hierarchical Self-Collaboration and Self-Evaluation

no code implementations 28 Nov 2023 Yi Zheng, Chongyang Ma, Kanle Shi, Haibin Huang

In this study, we introduce the concept of the OKR-Agent, designed to enhance the capabilities of Large Language Models (LLMs) in task-solving.

Towards Practical Capture of High-Fidelity Relightable Avatars

no code implementations 8 Sep 2023 Haotian Yang, Mingwu Zheng, Wanquan Feng, Haibin Huang, Yu-Kun Lai, Pengfei Wan, Zhongyuan Wang, Chongyang Ma

Specifically, TRAvatar is trained with dynamic image sequences captured in a Light Stage under varying lighting conditions, enabling realistic relighting and real-time animation for avatars in diverse scenes.

3D Keypoint Estimation Using Implicit Representation Learning

no code implementations 20 Jun 2023 Xiangyu Zhu, Dong Du, Haibin Huang, Chongyang Ma, Xiaoguang Han

Inspired by the recent success of advanced implicit representation in reconstruction tasks, we explore the idea of using an implicit field to represent keypoints.

Keypoint Estimation Representation Learning
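
A minimal sketch of the idea of representing keypoints with an implicit field: a toy MLP scores every 3D query point per keypoint, and keypoint locations are read out as score-weighted averages. The architecture and readout here are illustrative assumptions, not the paper's method.

```python
import torch
import torch.nn as nn

class KeypointField(nn.Module):
    """Toy implicit field: map a 3D query point to K keypoint scores."""
    def __init__(self, num_keypoints=10, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_keypoints))

    def forward(self, xyz):                    # xyz: (N, 3) query points
        return self.mlp(xyz)                   # (N, K) score per keypoint

field = KeypointField()
queries = torch.rand(4096, 3) * 2 - 1          # queries in [-1, 1]^3
scores = field(queries).softmax(dim=0)         # normalize over queries
keypoints = scores.t() @ queries               # (K, 3) expected locations
```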

Multi-Modal Face Stylization with a Generative Prior

no code implementations 29 May 2023 Mengtian Li, Yi Dong, Minxuan Lin, Haibin Huang, Pengfei Wan, Chongyang Ma

We also introduce a two-stage training strategy, where we train the encoder in the first stage to align the feature maps with StyleGAN and enable a faithful reconstruction of input faces.

Face Generation
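
For intuition, a minimal sketch of what such a first-stage objective could look like: per-layer encoder feature maps are matched to StyleGAN's, plus a pixel reconstruction term so input faces are reproduced faithfully. The terms and weights are illustrative assumptions, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def stage1_loss(enc_feats, gan_feats, recon, target):
    """Align encoder feature maps with StyleGAN's and reconstruct pixels."""
    align = sum(F.l1_loss(e, g) for e, g in zip(enc_feats, gan_feats))
    return align + F.l1_loss(recon, target)

enc = [torch.randn(1, 64, 32, 32), torch.randn(1, 32, 64, 64)]
gan = [torch.randn_like(f) for f in enc]       # StyleGAN's feature maps
img = torch.randn(1, 3, 256, 256)
loss = stage1_loss(enc, gan, img, torch.randn_like(img))
```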

ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models

3 code implementations 25 May 2023 Yuxin Zhang, WeiMing Dong, Fan Tang, Nisha Huang, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Oliver Deussen, Changsheng Xu

We apply ProSpect in various personalized attribute-aware image generation applications, such as image-guided or text-driven manipulations of materials, style, and layout, achieving previously unattainable results from a single image input without fine-tuning the diffusion models.

Attribute Disentanglement +1

Semi-Weakly Supervised Object Kinematic Motion Prediction

no code implementations CVPR 2023 Gengxin Liu, Qian Sun, Haibin Huang, Chongyang Ma, Yulan Guo, Li Yi, Hui Huang, Ruizhen Hu

First, although 3D datasets with fully annotated motion labels are limited, there are existing large-scale datasets and methods for object part semantic segmentation.

motion prediction Object +3

A Unified Arbitrary Style Transfer Framework via Adaptive Contrastive Learning

1 code implementation 9 Mar 2023 Yuxin Zhang, Fan Tang, WeiMing Dong, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Changsheng Xu

Our framework consists of three key components, i.e., a parallel contrastive learning scheme for style representation and style transfer, a domain enhancement module for effective learning of style distribution, and a generative network for style transfer.

Contrastive Learning Representation Learning +1
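
For reference, a generic InfoNCE-style contrastive loss over style codes, where two views of the same style image form the positive pair; this shows the standard formulation, not the paper's exact parallel scheme.

```python
import torch
import torch.nn.functional as F

def style_contrastive_loss(z_a, z_b, tau=0.1):
    """z_a[i] and z_b[i] are style codes of two views of the same style
    image; all other pairs in the batch act as negatives."""
    z_a, z_b = F.normalize(z_a, dim=1), F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / tau               # (B, B) cosine similarities
    labels = torch.arange(z_a.size(0))         # positives on the diagonal
    return F.cross_entropy(logits, labels)

loss = style_contrastive_loss(torch.randn(8, 128), torch.randn(8, 128))
```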

HairStep: Transfer Synthetic to Real Using Strand and Depth Maps for Single-View 3D Hair Modeling

1 code implementation CVPR 2023 Yujian Zheng, Zirong Jin, Moran Li, Haibin Huang, Chongyang Ma, Shuguang Cui, Xiaoguang Han

We firmly believe an intermediate representation is essential, but we argue that the orientation maps produced by dominant filtering-based methods are sensitive to noise and far from a competent representation.

LitAR: Visually Coherent Lighting for Mobile Augmented Reality

1 code implementation 15 Jan 2023 Yiqin Zhao, Chongyang Ma, Haibin Huang, Tian Guo

In this work, we present the design and implementation of a lighting reconstruction framework called LitAR that enables realistic and visually-coherent rendering.

Inversion-Based Style Transfer with Diffusion Models

1 code implementation CVPR 2023 Yuxin Zhang, Nisha Huang, Fan Tang, Haibin Huang, Chongyang Ma, WeiMing Dong, Changsheng Xu

Our key idea is to learn artistic style directly from a single painting and then guide the synthesis without providing complex textual descriptions.

Denoising Style Transfer +1

DiffStyler: Controllable Dual Diffusion for Text-Driven Image Stylization

1 code implementation 19 Nov 2022 Nisha Huang, Yuxin Zhang, Fan Tang, Chongyang Ma, Haibin Huang, Yong Zhang, WeiMing Dong, Changsheng Xu

Building on the impressive results of arbitrary image-guided style transfer methods, text-driven image stylization has recently been proposed to transfer a natural image into a stylized one according to textual descriptions of the target style provided by the user.

Denoising Image Stylization

Domain Enhanced Arbitrary Image Style Transfer via Contrastive Learning

1 code implementation 19 May 2022 Yuxin Zhang, Fan Tang, WeiMing Dong, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Changsheng Xu

Our framework consists of three key components, i.e., a multi-layer style projector for style code encoding, a domain enhancement module for effective learning of style distribution, and a generative network for image style transfer.

Contrastive Learning Image Stylization +1

Point cloud completion via structured feature maps using a feedback network

no code implementations 17 Feb 2022 Zejia Su, Haibin Huang, Chongyang Ma, Hui Huang, Ruizhen Hu

To efficiently exploit local structures and enhance point distribution uniformity, we propose IFNet, a point upsampling module with a self-correction mechanism that can progressively refine details of the generated dense point cloud.

Point Cloud Completion point cloud upsampling
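
A toy sketch of progressive upsampling with a self-correction step, in the spirit of the feedback mechanism described above; the real IFNet architecture differs.

```python
import torch
import torch.nn as nn

class FeedbackUpsample(nn.Module):
    """Toy step: split each point into two children, then predict a
    per-point offset that corrects the newly generated points."""
    def __init__(self, dim=3, hidden=128):
        super().__init__()
        self.expand = nn.Linear(dim, dim * 2)           # 1 -> 2 children
        self.correct = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))

    def forward(self, pts):                             # pts: (N, 3)
        children = self.expand(pts).reshape(-1, pts.size(1))   # (2N, 3)
        return children + self.correct(children)       # self-correction

step = FeedbackUpsample()
dense = torch.randn(1024, 3)
for _ in range(2):          # progressively refine: 1024 -> 4096 points
    dense = step(dense)
```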

StyTr$^2$: Image Style Transfer With Transformers

3 code implementations CVPR 2022 Yingying Deng, Fan Tang, WeiMing Dong, Chongyang Ma, Xingjia Pan, Lei Wang, Changsheng Xu

The goal of image style transfer is to render an image with artistic features guided by a style reference while maintaining the original content.

Style Transfer

MobRecon: Mobile-Friendly Hand Mesh Reconstruction from Monocular Image

1 code implementation CVPR 2022 Xingyu Chen, Yufeng Liu, Yajiao Dong, Xiong Zhang, Chongyang Ma, Yanmin Xiong, Yuan Zhang, Xiaoyan Guo

In this work, we propose a framework for single-view hand mesh reconstruction, which can simultaneously achieve high reconstruction accuracy, fast inference speed, and temporal coherence.

3D Hand Pose Estimation Position +2

Implicit Neural Deformation for Sparse-View Face Reconstruction

no code implementations 5 Dec 2021 Moran Li, Haibin Huang, Yi Zheng, Mengtian Li, Nong Sang, Chongyang Ma

In this work, we present a new method for 3D face reconstruction from sparse-view RGB images.

3D Face Reconstruction

Scene Synthesis via Uncertainty-Driven Attribute Synchronization

1 code implementation ICCV 2021 Haitao Yang, Zaiwei Zhang, Siming Yan, Haibin Huang, Chongyang Ma, Yi Zheng, Chandrajit Bajaj, QiXing Huang

This task is challenging because 3D scenes exhibit diverse patterns, ranging from continuous ones, such as object sizes and the relative poses between pairs of shapes, to discrete patterns, such as occurrence and co-occurrence of objects with symmetrical relationships.

Attribute

Task-Aware Sampling Layer for Point-Wise Analysis

no code implementations 9 Jul 2021 Yiqun Lin, Lichang Chen, Haibin Huang, Chongyang Ma, Xiaoguang Han, Shuguang Cui

Sampling, grouping, and aggregation are three important components in the multi-scale analysis of point clouds.

Keypoint Detection Point Cloud Completion +1
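
For reference, a compact sketch of the three components named above, with random sampling and k-NN grouping standing in for the paper's learned, task-aware sampling layer:

```python
import torch

def sample_group_aggregate(pts, feats, m=128, k=16):
    """pts: (N, 3) coordinates, feats: (N, C) per-point features.
    Returns m centers with max-pooled neighborhood features."""
    idx = torch.randperm(pts.size(0))[:m]      # sampling (random here; the
    centers = pts[idx]                         # paper learns this step)
    d = torch.cdist(centers, pts)              # (m, N) pairwise distances
    nn_idx = d.topk(k, largest=False).indices  # grouping: k-NN per center
    grouped = feats[nn_idx]                    # (m, k, C)
    return centers, grouped.max(dim=1).values  # aggregation: max pool

c, f = sample_group_aggregate(torch.randn(2048, 3), torch.randn(2048, 32))
```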

HPNet: Deep Primitive Segmentation Using Hybrid Representations

1 code implementation ICCV 2021 Siming Yan, Zhenpei Yang, Chongyang Ma, Haibin Huang, Etienne Vouga, QiXing Huang

This paper introduces HPNet, a novel deep-learning approach for segmenting a 3D shape represented as a point cloud into primitive patches.

Clustering Segmentation

Effective Label Propagation for Discriminative Semi-Supervised Domain Adaptation

no code implementations 4 Dec 2020 Zhiyong Huang, Kekai Sheng, WeiMing Dong, Xing Mei, Chongyang Ma, Feiyue Huang, Dengwen Zhou, Changsheng Xu

For intra-domain propagation, we propose an effective self-training strategy to mitigate the noises in pseudo-labeled target domain data and improve the feature discriminability in the target domain.

Domain Adaptation Image Classification +1
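
A minimal sketch of confidence-based pseudo-label filtering, a standard way to mitigate pseudo-label noise in self-training; the paper's actual strategy may differ.

```python
import torch

def select_pseudo_labels(logits, threshold=0.95):
    """Keep only target-domain samples whose top class probability is
    high enough to be trusted as a pseudo-label."""
    probs = logits.softmax(dim=1)
    conf, labels = probs.max(dim=1)
    keep = conf >= threshold
    return labels[keep], keep                  # labels + boolean mask

pseudo, mask = select_pseudo_labels(torch.randn(256, 10))
```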

Arbitrary Video Style Transfer via Multi-Channel Correlation

no code implementations 17 Sep 2020 Yingying Deng, Fan Tang, Wei-Ming Dong, Haibin Huang, Chongyang Ma, Changsheng Xu

Towards this end, we propose the Multi-Channel Correlation network (MCCNet), which can be trained to fuse the exemplar style features and input content features for efficient style transfer while naturally maintaining the coherence of input videos.

Style Transfer Video Style Transfer
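
As a toy stand-in for the idea of fusing style and content features through per-channel correlation (not the paper's MCCNet), each channel of the normalized content is gated by its correlation with the style features:

```python
import torch
import torch.nn.functional as F

def multi_channel_fuse(content, style):
    """content, style: (B, C, H, W) feature maps of equal size."""
    c = F.instance_norm(content)                       # strip content stats
    corr = (c.flatten(2) * style.flatten(2)).mean(2)   # (B, C) correlation
    gate = torch.sigmoid(corr)[..., None, None]        # per-channel gate
    mu = style.mean(dim=(2, 3), keepdim=True)
    sigma = style.std(dim=(2, 3), keepdim=True)
    return gate * (c * sigma + mu) + (1 - gate) * content

out = multi_channel_fuse(torch.randn(2, 64, 32, 32),
                         torch.randn(2, 64, 32, 32))
```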

Distribution Aligned Multimodal and Multi-Domain Image Stylization

no code implementations 2 Jun 2020 Minxuan Lin, Fan Tang, Wei-Ming Dong, Xiao Li, Chongyang Ma, Changsheng Xu

Currently, there are few methods that can perform both multimodal and multi-domain stylization simultaneously.

Image Stylization

Dynamic Refinement Network for Oriented and Densely Packed Object Detection

1 code implementation CVPR 2020 Xingjia Pan, Yuqiang Ren, Kekai Sheng, Wei-Ming Dong, Haolei Yuan, Xiaowei Guo, Chongyang Ma, Changsheng Xu

However, the detection of oriented and densely packed objects remains challenging for the following inherent reasons: (1) receptive fields of neurons are all axis-aligned and of the same shape, whereas objects are usually of diverse shapes and aligned along various directions; (2) detection models are typically trained with generic knowledge and may not generalize well to handle specific objects at test time; (3) limited datasets hinder development on this task.

feature selection object-detection +2

Revisiting Image Aesthetic Assessment via Self-Supervised Feature Learning

no code implementations 26 Nov 2019 Kekai Sheng, Wei-Ming Dong, Menglei Chai, Guohui Wang, Peng Zhou, Feiyue Huang, Bao-Gang Hu, Rongrong Ji, Chongyang Ma

In this paper, we revisit the problem of image aesthetic assessment from the self-supervised feature learning perspective.

LGM-Net: Learning to Generate Matching Networks for Few-Shot Learning

1 code implementation 15 May 2019 Huaiyu Li, Wei-Ming Dong, Xing Mei, Chongyang Ma, Feiyue Huang, Bao-Gang Hu

The TargetNet module is a neural network for solving a specific task, and the MetaNet module learns to generate functional weights for TargetNet by observing training samples.

Few-Shot Learning
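
A minimal hypernetwork sketch of this idea: a toy MetaNet embeds the support set and emits classifier weights, which a functional TargetNet applies to query samples. Shapes and layer choices are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MetaNet(nn.Module):
    """Toy hypernetwork: support-set embedding -> classifier weights."""
    def __init__(self, feat_dim=64, n_classes=5):
        super().__init__()
        self.gen = nn.Linear(feat_dim, feat_dim * n_classes + n_classes)
        self.feat_dim, self.n_classes = feat_dim, n_classes

    def forward(self, support_feats):          # (S, feat_dim)
        theta = self.gen(support_feats.mean(dim=0))    # task embedding
        split = self.feat_dim * self.n_classes
        return theta[:split].view(self.n_classes, -1), theta[split:]

def target_net(x, W, b):                       # functional "TargetNet"
    return F.linear(x, W, b)                   # task-specific logits

meta = MetaNet()
W, b = meta(torch.randn(25, 64))               # e.g. 5-way 5-shot support
logits = target_net(torch.randn(10, 64), W, b) # classify 10 queries
```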

SiCloPe: Silhouette-Based Clothed People

1 code implementation CVPR 2019 Ryota Natsume, Shunsuke Saito, Zeng Huang, Weikai Chen, Chongyang Ma, Hao Li, Shigeo Morishima

The synthesized silhouettes which are the most consistent with the input segmentation are fed into a deep visual hull algorithm for robust 3D shape prediction.

Generative Adversarial Network Image-to-Image Translation
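
For intuition, a sketch of the classical (non-deep) visual hull that the deep variant builds on: a voxel survives only if it projects inside every silhouette. The projection here is a toy orthographic camera; the paper replaces this carving with a learned predictor.

```python
import torch

def visual_hull(masks, project, res=32):
    """masks: (V, H, W) binary silhouettes; project(v, pts) maps (N, 3)
    world points to (N, 2) integer pixel coordinates for view v."""
    lin = torch.linspace(-1, 1, res)
    pts = torch.stack(torch.meshgrid(lin, lin, lin, indexing="ij"),
                      dim=-1).reshape(-1, 3)
    occ = torch.ones(pts.size(0), dtype=torch.bool)
    for v in range(masks.size(0)):
        uv = project(v, pts)
        occ &= masks[v, uv[:, 1], uv[:, 0]].bool()   # inside silhouette v?
    return occ.reshape(res, res, res)

mask = torch.ones(1, 128, 128)                       # one all-on silhouette
proj = lambda v, p: ((p[:, :2] * 0.5 + 0.5) * 127).long()
vol = visual_hull(mask, proj)                        # (32, 32, 32) booleans
```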

Gourmet Photography Dataset for Aesthetic Assessment of Food Images

1 code implementation SIGGRAPH Asia 2018 Kekai Sheng, Wei-Ming Dong, Haibin Huang, Chongyang Ma, Bao-Gang Hu

In this study, we present the Gourmet Photography Dataset (GPD), which is the first large-scale dataset for aesthetic assessment of food photographs.

Attention-based Multi-Patch Aggregation for Image Aesthetic Assessment

1 code implementation ACM Multimedia Conference 2018 Kekai Sheng, Wei-Ming Dong, Chongyang Ma, Xing Mei, Feiyue Huang, Bao-Gang Hu

Aggregation structures with explicit information, such as image attributes and scene semantics, are effective and popular in intelligent systems for assessing the aesthetics of visual data.

Aesthetics Quality Assessment

Deep Volumetric Video From Very Sparse Multi-View Performance Capture

no code implementations ECCV 2018 Zeng Huang, Tianye Li, Weikai Chen, Yajie Zhao, Jun Xing, Chloe LeGendre, Linjie Luo, Chongyang Ma, Hao Li

We present a deep learning-based volumetric capture approach for performance capture using a passive and highly sparse multi-view capture system.

Surface Reconstruction

Deep Generative Modeling for Scene Synthesis via Hybrid Representations

no code implementations 6 Aug 2018 Zaiwei Zhang, Zhenpei Yang, Chongyang Ma, Linjie Luo, Alexander Huth, Etienne Vouga, Qi-Xing Huang

We show a principled way to train this model by combining discriminator losses for both a 3D object arrangement representation and a 2D image-based representation.
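
A minimal sketch of the combined generator objective: one discriminator critiques the 3D object arrangement, another a 2D image-based rendering, here with a non-saturating GAN loss. Purely illustrative; the paper's losses and representations are richer.

```python
import torch
import torch.nn.functional as F

def combined_generator_loss(d3d_fake, d2d_fake):
    """d3d_fake / d2d_fake: discriminator logits on generated scenes,
    from the 3D arrangement critic and the 2D image critic."""
    loss_3d = F.binary_cross_entropy_with_logits(
        d3d_fake, torch.ones_like(d3d_fake))
    loss_2d = F.binary_cross_entropy_with_logits(
        d2d_fake, torch.ones_like(d2d_fake))
    return loss_3d + loss_2d

loss = combined_generator_loss(torch.randn(4, 1), torch.randn(4, 1))
```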

Image Retargetability

no code implementations 12 Feb 2018 Fan Tang, Wei-Ming Dong, Yiping Meng, Chongyang Ma, Fuzhang Wu, Xinrui Li, Tong-Yee Lee

In this work, we introduce the notion of image retargetability to describe how well a particular image can be handled by content-aware image retargeting.

Image Retargeting

Unconstrained Realtime Facial Performance Capture

no code implementations CVPR 2015 Pei-Lun Hsieh, Chongyang Ma, Jihun Yu, Hao Li

We introduce a realtime facial tracking system specifically designed for performance capture in unconstrained settings using a consumer-level RGB-D sensor.
