Search Results for author: Chongyang Ma

Found 44 papers, 21 papers with code

InterFusion: Text-Driven Generation of 3D Human-Object Interaction

no code implementations 22 Mar 2024 Sisi Dai, Wenhao Li, Haowen Sun, Haibin Huang, Chongyang Ma, Hui Huang, Kai Xu, Ruizhen Hu

In this study, we tackle the complex task of generating 3D human-object interactions (HOI) from textual descriptions in a zero-shot text-to-3D manner.

3D Generation Human-Object Interaction Detection +2

VRMM: A Volumetric Relightable Morphable Head Model

no code implementations 6 Feb 2024 Haotian Yang, Mingwu Zheng, Chongyang Ma, Yu-Kun Lai, Pengfei Wan, Haibin Huang

In this paper, we introduce the Volumetric Relightable Morphable Model (VRMM), a novel volumetric and parametric facial prior for 3D face modeling.

3D Face Reconstruction Self-Supervised Learning

Direct-a-Video: Customized Video Generation with User-Directed Camera Movement and Object Motion

no code implementations 5 Feb 2024 Shiyuan Yang, Liang Hou, Haibin Huang, Chongyang Ma, Pengfei Wan, Di Zhang, Xiaodong Chen, Jing Liao

In practice, users often desire the ability to control object motion and camera movement independently for customized video creation.

Object Video Generation

CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion

1 code implementation 25 Jan 2024 Nisha Huang, WeiMing Dong, Yuxin Zhang, Fan Tang, Ronghui Li, Chongyang Ma, Xiu Li, Changsheng Xu

Large-scale text-to-image generative models have made impressive strides, showcasing their ability to synthesize a vast array of high-quality images.

Image Generation Style Transfer

I2V-Adapter: A General Image-to-Video Adapter for Diffusion Models

no code implementations 27 Dec 2023 Xun Guo, Mingwu Zheng, Liang Hou, Yuan Gao, Yufan Deng, Pengfei Wan, Di Zhang, Yufan Liu, Weiming Hu, ZhengJun Zha, Haibin Huang, Chongyang Ma

I2V-Adapter adeptly propagates the unnoised input image to subsequent noised frames through a cross-frame attention mechanism, maintaining the identity of the input image without any changes to the pretrained T2V model.

Video Generation
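
As a rough illustration of the mechanism described in the snippet above, here is a minimal PyTorch sketch (not the paper's implementation) of cross-frame attention in which every frame draws its keys and values from the first, unnoised frame, so the input image's identity propagates to the noised frames:

```python
import torch
import torch.nn.functional as F

def cross_frame_attention(frames):
    """frames: (T, N, C) = frames, spatial tokens, channels.
    Queries come from each frame; keys/values come from frame 0 only,
    so later (noised) frames stay anchored to the input image."""
    q = frames
    kv = frames[:1].expand_as(frames)   # frame 0, repeated for all T frames
    return F.scaled_dot_product_attention(q, kv, kv)

x = torch.randn(16, 64, 320)            # 16 frames of 64 tokens each
print(cross_frame_attention(x).shape)   # torch.Size([16, 64, 320])
```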

MotionCrafter: One-Shot Motion Customization of Diffusion Models

1 code implementation 8 Dec 2023 Yuxin Zhang, Fan Tang, Nisha Huang, Haibin Huang, Chongyang Ma, WeiMing Dong, Changsheng Xu

The essence of a video lies in its dynamic motions, including character actions, object movements, and camera movements.

Disentanglement Motion Disentanglement +3

Agents meet OKR: An Object and Key Results Driven Agent System with Hierarchical Self-Collaboration and Self-Evaluation

no code implementations 28 Nov 2023 Yi Zheng, Chongyang Ma, Kanle Shi, Haibin Huang

In this study, we introduce the concept of the OKR-Agent, designed to enhance the capabilities of Large Language Models (LLMs) in task-solving.

Towards Practical Capture of High-Fidelity Relightable Avatars

no code implementations 8 Sep 2023 Haotian Yang, Mingwu Zheng, Wanquan Feng, Haibin Huang, Yu-Kun Lai, Pengfei Wan, Zhongyuan Wang, Chongyang Ma

Specifically, TRAvatar is trained with dynamic image sequences captured in a Light Stage under varying lighting conditions, enabling realistic relighting and real-time animation for avatars in diverse scenes.

3D Keypoint Estimation Using Implicit Representation Learning

no code implementations 20 Jun 2023 Xiangyu Zhu, Dong Du, Haibin Huang, Chongyang Ma, Xiaoguang Han

Inspired by the recent success of advanced implicit representation in reconstruction tasks, we explore the idea of using an implicit field to represent keypoints.

Keypoint Estimation Representation Learning
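
A minimal sketch of the idea of representing keypoints with an implicit field: a toy MLP scores every 3D query point per keypoint, and keypoint locations are read out as score-weighted averages. The architecture and readout here are illustrative assumptions, not the paper's method.

```python
import torch
import torch.nn as nn

class KeypointField(nn.Module):
    """Toy implicit field: map a 3D query point to K keypoint scores."""
    def __init__(self, num_keypoints=10, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_keypoints))

    def forward(self, xyz):                    # xyz: (N, 3) query points
        return self.mlp(xyz)                   # (N, K) score per keypoint

field = KeypointField()
queries = torch.rand(4096, 3) * 2 - 1          # queries in [-1, 1]^3
scores = field(queries).softmax(dim=0)         # normalize over queries
keypoints = scores.t() @ queries               # (K, 3) expected locations
```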

Multi-Modal Face Stylization with a Generative Prior

no code implementations 29 May 2023 Mengtian Li, Yi Dong, Minxuan Lin, Haibin Huang, Pengfei Wan, Chongyang Ma

We also introduce a two-stage training strategy, where we train the encoder in the first stage to align the feature maps with StyleGAN and enable a faithful reconstruction of input faces.

Face Generation
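
For intuition, a minimal sketch of what such a first-stage objective could look like: per-layer encoder feature maps are matched to StyleGAN's, plus a pixel reconstruction term so input faces are reproduced faithfully. The terms and weights are illustrative assumptions, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def stage1_loss(enc_feats, gan_feats, recon, target):
    """Align encoder feature maps with StyleGAN's and reconstruct pixels."""
    align = sum(F.l1_loss(e, g) for e, g in zip(enc_feats, gan_feats))
    return align + F.l1_loss(recon, target)

enc = [torch.randn(1, 64, 32, 32), torch.randn(1, 32, 64, 64)]
gan = [torch.randn_like(f) for f in enc]       # StyleGAN's feature maps
img = torch.randn(1, 3, 256, 256)
loss = stage1_loss(enc, gan, img, torch.randn_like(img))
```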

ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models

3 code implementations 25 May 2023 Yuxin Zhang, WeiMing Dong, Fan Tang, Nisha Huang, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Oliver Deussen, Changsheng Xu

We apply ProSpect in various personalized attribute-aware image generation applications, such as image-guided or text-driven manipulations of materials, style, and layout, achieving previously unattainable results from a single image input without fine-tuning the diffusion models.

Attribute Disentanglement +1

Semi-Weakly Supervised Object Kinematic Motion Prediction

no code implementations CVPR 2023 Gengxin Liu, Qian Sun, Haibin Huang, Chongyang Ma, Yulan Guo, Li Yi, Hui Huang, Ruizhen Hu

First, although 3D datasets with fully annotated motion labels are limited, there are existing large-scale datasets and methods for object part semantic segmentation.

motion prediction Object +3

A Unified Arbitrary Style Transfer Framework via Adaptive Contrastive Learning

1 code implementation 9 Mar 2023 Yuxin Zhang, Fan Tang, WeiMing Dong, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Changsheng Xu

Our framework consists of three key components, i.e., a parallel contrastive learning scheme for style representation and style transfer, a domain enhancement module for effective learning of style distribution, and a generative network for style transfer.

Contrastive Learning Representation Learning +1
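
For reference, a generic InfoNCE-style contrastive loss over style codes, where two views of the same style image form the positive pair; this shows the standard formulation, not the paper's exact parallel scheme.

```python
import torch
import torch.nn.functional as F

def style_contrastive_loss(z_a, z_b, tau=0.1):
    """z_a[i] and z_b[i] are style codes of two views of the same style
    image; all other pairs in the batch act as negatives."""
    z_a, z_b = F.normalize(z_a, dim=1), F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / tau               # (B, B) cosine similarities
    labels = torch.arange(z_a.size(0))         # positives on the diagonal
    return F.cross_entropy(logits, labels)

loss = style_contrastive_loss(torch.randn(8, 128), torch.randn(8, 128))
```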

HairStep: Transfer Synthetic to Real Using Strand and Depth Maps for Single-View 3D Hair Modeling

1 code implementation CVPR 2023 Yujian Zheng, Zirong Jin, Moran Li, Haibin Huang, Chongyang Ma, Shuguang Cui, Xiaoguang Han

We firmly believe an intermediate representation is essential, but we argue that the orientation maps produced by dominant filtering-based methods are sensitive to noise and far from a competent representation.

LitAR: Visually Coherent Lighting for Mobile Augmented Reality

1 code implementation 15 Jan 2023 Yiqin Zhao, Chongyang Ma, Haibin Huang, Tian Guo

In this work, we present the design and implementation of a lighting reconstruction framework called LitAR that enables realistic and visually-coherent rendering.

Inversion-Based Style Transfer with Diffusion Models

1 code implementation CVPR 2023 Yuxin Zhang, Nisha Huang, Fan Tang, Haibin Huang, Chongyang Ma, WeiMing Dong, Changsheng Xu

Our key idea is to learn artistic style directly from a single painting and then guide the synthesis without providing complex textual descriptions.

Denoising Style Transfer +1

DiffStyler: Controllable Dual Diffusion for Text-Driven Image Stylization

1 code implementation 19 Nov 2022 Nisha Huang, Yuxin Zhang, Fan Tang, Chongyang Ma, Haibin Huang, Yong Zhang, WeiMing Dong, Changsheng Xu

Building on the impressive results of arbitrary image-guided style transfer methods, text-driven image stylization has recently been proposed to transfer a natural image into a stylized one according to textual descriptions of the target style provided by the user.

Denoising Image Stylization

Domain Enhanced Arbitrary Image Style Transfer via Contrastive Learning

1 code implementation 19 May 2022 Yuxin Zhang, Fan Tang, WeiMing Dong, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Changsheng Xu

Our framework consists of three key components, i.e., a multi-layer style projector for style code encoding, a domain enhancement module for effective learning of style distribution, and a generative network for image style transfer.

Contrastive Learning Image Stylization +1

Point cloud completion via structured feature maps using a feedback network

no code implementations 17 Feb 2022 Zejia Su, Haibin Huang, Chongyang Ma, Hui Huang, Ruizhen Hu

To efficiently exploit local structures and enhance point distribution uniformity, we propose IFNet, a point upsampling module with a self-correction mechanism that can progressively refine details of the generated dense point cloud.

Point Cloud Completion point cloud upsampling
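
A toy sketch of progressive upsampling with a self-correction step, in the spirit of the feedback mechanism described above; the real IFNet architecture differs.

```python
import torch
import torch.nn as nn

class FeedbackUpsample(nn.Module):
    """Toy step: split each point into two children, then predict a
    per-point offset that corrects the newly generated points."""
    def __init__(self, dim=3, hidden=128):
        super().__init__()
        self.expand = nn.Linear(dim, dim * 2)           # 1 -> 2 children
        self.correct = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))

    def forward(self, pts):                             # pts: (N, 3)
        children = self.expand(pts).reshape(-1, pts.size(1))   # (2N, 3)
        return children + self.correct(children)       # self-correction

step = FeedbackUpsample()
dense = torch.randn(1024, 3)
for _ in range(2):          # progressively refine: 1024 -> 4096 points
    dense = step(dense)
```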

StyTr$^2$: Image Style Transfer With Transformers

3 code implementations CVPR 2022 Yingying Deng, Fan Tang, WeiMing Dong, Chongyang Ma, Xingjia Pan, Lei Wang, Changsheng Xu

The goal of image style transfer is to render an image with artistic features guided by a style reference while maintaining the original content.

Style Transfer

MobRecon: Mobile-Friendly Hand Mesh Reconstruction from Monocular Image

1 code implementation CVPR 2022 Xingyu Chen, Yufeng Liu, Yajiao Dong, Xiong Zhang, Chongyang Ma, Yanmin Xiong, Yuan Zhang, Xiaoyan Guo

In this work, we propose a framework for single-view hand mesh reconstruction, which can simultaneously achieve high reconstruction accuracy, fast inference speed, and temporal coherence.

3D Hand Pose Estimation Position +2

Implicit Neural Deformation for Sparse-View Face Reconstruction

no code implementations 5 Dec 2021 Moran Li, Haibin Huang, Yi Zheng, Mengtian Li, Nong Sang, Chongyang Ma

In this work, we present a new method for 3D face reconstruction from sparse-view RGB images.

3D Face Reconstruction

Scene Synthesis via Uncertainty-Driven Attribute Synchronization

1 code implementation ICCV 2021 Haitao Yang, Zaiwei Zhang, Siming Yan, Haibin Huang, Chongyang Ma, Yi Zheng, Chandrajit Bajaj, QiXing Huang

This task is challenging because 3D scenes exhibit diverse patterns, ranging from continuous ones, such as object sizes and the relative poses between pairs of shapes, to discrete patterns, such as occurrence and co-occurrence of objects with symmetrical relationships.

Attribute

Task-Aware Sampling Layer for Point-Wise Analysis

no code implementations 9 Jul 2021 Yiqun Lin, Lichang Chen, Haibin Huang, Chongyang Ma, Xiaoguang Han, Shuguang Cui

Sampling, grouping, and aggregation are three important components in the multi-scale analysis of point clouds.

Keypoint Detection Point Cloud Completion +1
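
For reference, a compact sketch of the three components named above, with random sampling and k-NN grouping standing in for the paper's learned, task-aware sampling layer:

```python
import torch

def sample_group_aggregate(pts, feats, m=128, k=16):
    """pts: (N, 3) coordinates, feats: (N, C) per-point features.
    Returns m centers with max-pooled neighborhood features."""
    idx = torch.randperm(pts.size(0))[:m]      # sampling (random here; the
    centers = pts[idx]                         # paper learns this step)
    d = torch.cdist(centers, pts)              # (m, N) pairwise distances
    nn_idx = d.topk(k, largest=False).indices  # grouping: k-NN per center
    grouped = feats[nn_idx]                    # (m, k, C)
    return centers, grouped.max(dim=1).values  # aggregation: max pool

c, f = sample_group_aggregate(torch.randn(2048, 3), torch.randn(2048, 32))
```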

HPNet: Deep Primitive Segmentation Using Hybrid Representations

1 code implementation ICCV 2021 Siming Yan, Zhenpei Yang, Chongyang Ma, Haibin Huang, Etienne Vouga, QiXing Huang

This paper introduces HPNet, a novel deep-learning approach for segmenting a 3D shape represented as a point cloud into primitive patches.

Clustering Segmentation

Effective Label Propagation for Discriminative Semi-Supervised Domain Adaptation

no code implementations 4 Dec 2020 Zhiyong Huang, Kekai Sheng, WeiMing Dong, Xing Mei, Chongyang Ma, Feiyue Huang, Dengwen Zhou, Changsheng Xu

For intra-domain propagation, we propose an effective self-training strategy to mitigate the noises in pseudo-labeled target domain data and improve the feature discriminability in the target domain.

Domain Adaptation Image Classification +1
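
A minimal sketch of confidence-based pseudo-label filtering, a standard way to mitigate pseudo-label noise in self-training; the paper's actual strategy may differ.

```python
import torch

def select_pseudo_labels(logits, threshold=0.95):
    """Keep only target-domain samples whose top class probability is
    high enough to be trusted as a pseudo-label."""
    probs = logits.softmax(dim=1)
    conf, labels = probs.max(dim=1)
    keep = conf >= threshold
    return labels[keep], keep                  # labels + boolean mask

pseudo, mask = select_pseudo_labels(torch.randn(256, 10))
```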

Arbitrary Video Style Transfer via Multi-Channel Correlation

no code implementations 17 Sep 2020 Yingying Deng, Fan Tang, Wei-Ming Dong, Haibin Huang, Chongyang Ma, Changsheng Xu

Towards this end, we propose the Multi-Channel Correlation network (MCCNet), which can be trained to fuse the exemplar style features and input content features for efficient style transfer while naturally maintaining the coherence of input videos.

Style Transfer Video Style Transfer
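
As a toy stand-in for the idea of fusing style and content features through per-channel correlation (not the paper's MCCNet), each channel of the normalized content is gated by its correlation with the style features:

```python
import torch
import torch.nn.functional as F

def multi_channel_fuse(content, style):
    """content, style: (B, C, H, W) feature maps of equal size."""
    c = F.instance_norm(content)                       # strip content stats
    corr = (c.flatten(2) * style.flatten(2)).mean(2)   # (B, C) correlation
    gate = torch.sigmoid(corr)[..., None, None]        # per-channel gate
    mu = style.mean(dim=(2, 3), keepdim=True)
    sigma = style.std(dim=(2, 3), keepdim=True)
    return gate * (c * sigma + mu) + (1 - gate) * content

out = multi_channel_fuse(torch.randn(2, 64, 32, 32),
                         torch.randn(2, 64, 32, 32))
```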

Distribution Aligned Multimodal and Multi-Domain Image Stylization

no code implementations 2 Jun 2020 Minxuan Lin, Fan Tang, Wei-Ming Dong, Xiao Li, Chongyang Ma, Changsheng Xu

Currently, there are few methods that can perform both multimodal and multi-domain stylization simultaneously.

Image Stylization

Dynamic Refinement Network for Oriented and Densely Packed Object Detection

1 code implementation CVPR 2020 Xingjia Pan, Yuqiang Ren, Kekai Sheng, Wei-Ming Dong, Haolei Yuan, Xiaowei Guo, Chongyang Ma, Changsheng Xu

However, the detection of oriented and densely packed objects remains challenging for the following inherent reasons: (1) receptive fields of neurons are all axis-aligned and of the same shape, whereas objects are usually of diverse shapes and aligned along various directions; (2) detection models are typically trained with generic knowledge and may not generalize well to handle specific objects at test time; (3) limited datasets hinder development on this task.

feature selection object-detection +2

Revisiting Image Aesthetic Assessment via Self-Supervised Feature Learning

no code implementations 26 Nov 2019 Kekai Sheng, Wei-Ming Dong, Menglei Chai, Guohui Wang, Peng Zhou, Feiyue Huang, Bao-Gang Hu, Rongrong Ji, Chongyang Ma

In this paper, we revisit the problem of image aesthetic assessment from the self-supervised feature learning perspective.

LGM-Net: Learning to Generate Matching Networks for Few-Shot Learning

1 code implementation 15 May 2019 Huaiyu Li, Wei-Ming Dong, Xing Mei, Chongyang Ma, Feiyue Huang, Bao-Gang Hu

The TargetNet module is a neural network for solving a specific task, and the MetaNet module learns to generate functional weights for TargetNet by observing training samples.

Few-Shot Learning
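
A minimal hypernetwork sketch of this idea: a toy MetaNet embeds the support set and emits classifier weights, which a functional TargetNet applies to query samples. Shapes and layer choices are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MetaNet(nn.Module):
    """Toy hypernetwork: support-set embedding -> classifier weights."""
    def __init__(self, feat_dim=64, n_classes=5):
        super().__init__()
        self.gen = nn.Linear(feat_dim, feat_dim * n_classes + n_classes)
        self.feat_dim, self.n_classes = feat_dim, n_classes

    def forward(self, support_feats):          # (S, feat_dim)
        theta = self.gen(support_feats.mean(dim=0))    # task embedding
        split = self.feat_dim * self.n_classes
        return theta[:split].view(self.n_classes, -1), theta[split:]

def target_net(x, W, b):                       # functional "TargetNet"
    return F.linear(x, W, b)                   # task-specific logits

meta = MetaNet()
W, b = meta(torch.randn(25, 64))               # e.g. 5-way 5-shot support
logits = target_net(torch.randn(10, 64), W, b) # classify 10 queries
```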

SiCloPe: Silhouette-Based Clothed People

1 code implementation CVPR 2019 Ryota Natsume, Shunsuke Saito, Zeng Huang, Weikai Chen, Chongyang Ma, Hao Li, Shigeo Morishima

The synthesized silhouettes which are the most consistent with the input segmentation are fed into a deep visual hull algorithm for robust 3D shape prediction.

Generative Adversarial Network Image-to-Image Translation
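
For intuition, a sketch of the classical (non-deep) visual hull that the deep variant builds on: a voxel survives only if it projects inside every silhouette. The projection here is a toy orthographic camera; the paper replaces this carving with a learned predictor.

```python
import torch

def visual_hull(masks, project, res=32):
    """masks: (V, H, W) binary silhouettes; project(v, pts) maps (N, 3)
    world points to (N, 2) integer pixel coordinates for view v."""
    lin = torch.linspace(-1, 1, res)
    pts = torch.stack(torch.meshgrid(lin, lin, lin, indexing="ij"),
                      dim=-1).reshape(-1, 3)
    occ = torch.ones(pts.size(0), dtype=torch.bool)
    for v in range(masks.size(0)):
        uv = project(v, pts)
        occ &= masks[v, uv[:, 1], uv[:, 0]].bool()   # inside silhouette v?
    return occ.reshape(res, res, res)

mask = torch.ones(1, 128, 128)                       # one all-on silhouette
proj = lambda v, p: ((p[:, :2] * 0.5 + 0.5) * 127).long()
vol = visual_hull(mask, proj)                        # (32, 32, 32) booleans
```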

Gourmet Photography Dataset for Aesthetic Assessment of Food Images

1 code implementation SIGGRAPH Asia 2018 Kekai Sheng, Wei-Ming Dong, Haibin Huang, Chongyang Ma, Bao-Gang Hu

In this study, we present the Gourmet Photography Dataset (GPD), which is the first large-scale dataset for aesthetic assessment of food photographs.

Attention-based Multi-Patch Aggregation for Image Aesthetic Assessment

1 code implementation ACM Multimedia Conference 2018 Kekai Sheng, Wei-Ming Dong, Chongyang Ma, Xing Mei, Feiyue Huang, Bao-Gang Hu

Aggregation structures with explicit information, such as image attributes and scene semantics, are effective and popular in intelligent systems for assessing the aesthetics of visual data.

Aesthetics Quality Assessment

Deep Volumetric Video From Very Sparse Multi-View Performance Capture

no code implementations ECCV 2018 Zeng Huang, Tianye Li, Weikai Chen, Yajie Zhao, Jun Xing, Chloe LeGendre, Linjie Luo, Chongyang Ma, Hao Li

We present a deep learning-based volumetric capture approach for performance capture using a passive and highly sparse multi-view capture system.

Surface Reconstruction

Deep Generative Modeling for Scene Synthesis via Hybrid Representations

no code implementations 6 Aug 2018 Zaiwei Zhang, Zhenpei Yang, Chongyang Ma, Linjie Luo, Alexander Huth, Etienne Vouga, Qi-Xing Huang

We show a principled way to train this model by combining discriminator losses for both a 3D object arrangement representation and a 2D image-based representation.
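
A minimal sketch of the combined generator objective: one discriminator critiques the 3D object arrangement, another a 2D image-based rendering, here with a non-saturating GAN loss. Purely illustrative; the paper's losses and representations are richer.

```python
import torch
import torch.nn.functional as F

def combined_generator_loss(d3d_fake, d2d_fake):
    """d3d_fake / d2d_fake: discriminator logits on generated scenes,
    from the 3D arrangement critic and the 2D image critic."""
    loss_3d = F.binary_cross_entropy_with_logits(
        d3d_fake, torch.ones_like(d3d_fake))
    loss_2d = F.binary_cross_entropy_with_logits(
        d2d_fake, torch.ones_like(d2d_fake))
    return loss_3d + loss_2d

loss = combined_generator_loss(torch.randn(4, 1), torch.randn(4, 1))
```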

Image Retargetability

no code implementations 12 Feb 2018 Fan Tang, Wei-Ming Dong, Yiping Meng, Chongyang Ma, Fuzhang Wu, Xinrui Li, Tong-Yee Lee

In this work, we introduce the notion of image retargetability to describe how well a particular image can be handled by content-aware image retargeting.

Image Retargeting

Unconstrained Realtime Facial Performance Capture

no code implementations CVPR 2015 Pei-Lun Hsieh, Chongyang Ma, Jihun Yu, Hao Li

We introduce a realtime facial tracking system specifically designed for performance capture in unconstrained settings using a consumer-level RGB-D sensor.
