no code implementations • 15 Apr 2024 • Chi Wang, Junming Huang, Rong Zhang, Qi Wang, Haotian Yang, Haibin Huang, Chongyang Ma, Weiwei Xu
SDS boosts GANs with more generative modes, while GANs promote more efficient optimization of SDS.
no code implementations • 22 Mar 2024 • Sisi Dai, Wenhao Li, Haowen Sun, Haibin Huang, Chongyang Ma, Hui Huang, Kai Xu, Ruizhen Hu
In this study, we tackle the complex task of generating 3D human-object interactions (HOI) from textual descriptions in a zero-shot text-to-3D manner.
no code implementations • 6 Feb 2024 • Haotian Yang, Mingwu Zheng, Chongyang Ma, Yu-Kun Lai, Pengfei Wan, Haibin Huang
In this paper, we introduce the Volumetric Relightable Morphable Model (VRMM), a novel volumetric and parametric facial prior for 3D face modeling.
no code implementations • 5 Feb 2024 • Shiyuan Yang, Liang Hou, Haibin Huang, Chongyang Ma, Pengfei Wan, Di Zhang, Xiaodong Chen, Jing Liao
In practice, users often desire the ability to control object motion and camera movement independently for customized video creation.
1 code implementation • 25 Jan 2024 • Nisha Huang, WeiMing Dong, Yuxin Zhang, Fan Tang, Ronghui Li, Chongyang Ma, Xiu Li, Changsheng Xu
Large-scale text-to-image generative models have made impressive strides, showcasing their ability to synthesize a vast array of high-quality images.
no code implementations • 27 Dec 2023 • Xun Guo, Mingwu Zheng, Liang Hou, Yuan Gao, Yufan Deng, Pengfei Wan, Di Zhang, Yufan Liu, Weiming Hu, ZhengJun Zha, Haibin Huang, Chongyang Ma
I2V-Adapter adeptly propagates the unnoised input image to subsequent noised frames through a cross-frame attention mechanism, maintaining the identity of the input image without any changes to the pretrained T2V model.
1 code implementation • 8 Dec 2023 • Yuxin Zhang, Fan Tang, Nisha Huang, Haibin Huang, Chongyang Ma, WeiMing Dong, Changsheng Xu
The essence of a video lies in its dynamic motions, including character actions, object movements, and camera movements.
no code implementations • 28 Nov 2023 • Yi Zheng, Chongyang Ma, Kanle Shi, Haibin Huang
In this study, we introduce the concept of OKR-Agent, designed to enhance the capabilities of Large Language Models (LLMs) in task-solving.
no code implementations • 8 Sep 2023 • Haotian Yang, Mingwu Zheng, Wanquan Feng, Haibin Huang, Yu-Kun Lai, Pengfei Wan, Zhongyuan Wang, Chongyang Ma
Specifically, TRAvatar is trained with dynamic image sequences captured in a Light Stage under varying lighting conditions, enabling realistic relighting and real-time animation for avatars in diverse scenes.
no code implementations • 20 Jun 2023 • Xiangyu Zhu, Dong Du, Haibin Huang, Chongyang Ma, Xiaoguang Han
Inspired by the recent success of advanced implicit representation in reconstruction tasks, we explore the idea of using an implicit field to represent keypoints.
no code implementations • 29 May 2023 • Mengtian Li, Yi Dong, Minxuan Lin, Haibin Huang, Pengfei Wan, Chongyang Ma
We also introduce a two-stage training strategy, where we train the encoder in the first stage to align the feature maps with StyleGAN and enable a faithful reconstruction of input faces.
3 code implementations • 25 May 2023 • Yuxin Zhang, WeiMing Dong, Fan Tang, Nisha Huang, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Oliver Deussen, Changsheng Xu
We apply ProSpect in various personalized attribute-aware image generation applications, such as image-guided or text-driven manipulations of materials, style, and layout, achieving previously unattainable results from a single image input without fine-tuning the diffusion models.
no code implementations • CVPR 2023 • Gengxin Liu, Qian Sun, Haibin Huang, Chongyang Ma, Yulan Guo, Li Yi, Hui Huang, Ruizhen Hu
First, although 3D datasets with fully annotated motion labels are limited, there are existing large-scale datasets and methods for object part semantic segmentation.
1 code implementation • 9 Mar 2023 • Yuxin Zhang, Fan Tang, WeiMing Dong, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Changsheng Xu
Our framework consists of three key components, i.e., a parallel contrastive learning scheme for style representation and style transfer, a domain enhancement module for effective learning of style distribution, and a generative network for style transfer.
1 code implementation • CVPR 2023 • Yujian Zheng, Zirong Jin, Moran Li, Haibin Huang, Chongyang Ma, Shuguang Cui, Xiaoguang Han
We firmly believe an intermediate representation is essential, but we argue that orientation maps produced by the dominant filtering-based methods are sensitive to uncertain noise and far from a competent representation.
1 code implementation • 15 Jan 2023 • Yiqin Zhao, Chongyang Ma, Haibin Huang, Tian Guo
In this work, we present the design and implementation of a lighting reconstruction framework called LitAR that enables realistic and visually coherent rendering.
1 code implementation • CVPR 2023 • Yuxin Zhang, Nisha Huang, Fan Tang, Haibin Huang, Chongyang Ma, WeiMing Dong, Changsheng Xu
Our key idea is to learn artistic style directly from a single painting and then guide the synthesis without providing complex textual descriptions.
1 code implementation • 19 Nov 2022 • Nisha Huang, Yuxin Zhang, Fan Tang, Chongyang Ma, Haibin Huang, Yong Zhang, WeiMing Dong, Changsheng Xu
Despite the impressive results of arbitrary image-guided style transfer methods, text-driven image stylization has recently been proposed for transferring a natural image into a stylized one according to textual descriptions of the target style provided by the user.
1 code implementation • NeurIPS 2023 • Liang Hou, Qi Cao, Yige Yuan, Songtao Zhao, Chongyang Ma, Siyuan Pan, Pengfei Wan, Zhongyuan Wang, HuaWei Shen, Xueqi Cheng
Training generative adversarial networks (GANs) with limited data is challenging because the discriminator is prone to overfitting.
1 code implementation • 19 May 2022 • Yuxin Zhang, Fan Tang, WeiMing Dong, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Changsheng Xu
Our framework consists of three key components, i.e., a multi-layer style projector for style code encoding, a domain enhancement module for effective learning of style distribution, and a generative network for image style transfer.
no code implementations • 17 Feb 2022 • Zejia Su, Haibin Huang, Chongyang Ma, Hui Huang, Ruizhen Hu
To efficiently exploit local structures and enhance point distribution uniformity, we propose IFNet, a point upsampling module with a self-correction mechanism that can progressively refine details of the generated dense point cloud.
3 code implementations • CVPR 2022 • Yingying Deng, Fan Tang, WeiMing Dong, Chongyang Ma, Xingjia Pan, Lei Wang, Changsheng Xu
The goal of image style transfer is to render an image with artistic features guided by a style reference while maintaining the original content.
1 code implementation • CVPR 2022 • Xingyu Chen, Yufeng Liu, Yajiao Dong, Xiong Zhang, Chongyang Ma, Yanmin Xiong, Yuan Zhang, Xiaoyan Guo
In this work, we propose a framework for single-view hand mesh reconstruction, which can simultaneously achieve high reconstruction accuracy, fast inference speed, and temporal coherence.
Ranked #7 on 3D Hand Pose Estimation on DexYCB
no code implementations • 5 Dec 2021 • Moran Li, Haibin Huang, Yi Zheng, Mengtian Li, Nong Sang, Chongyang Ma
In this work, we present a new method for 3D face reconstruction from sparse-view RGB images.
1 code implementation • ICCV 2021 • Haitao Yang, Zaiwei Zhang, Siming Yan, Haibin Huang, Chongyang Ma, Yi Zheng, Chandrajit Bajaj, QiXing Huang
This task is challenging because 3D scenes exhibit diverse patterns, ranging from continuous ones, such as object sizes and the relative poses between pairs of shapes, to discrete patterns, such as occurrence and co-occurrence of objects with symmetrical relationships.
no code implementations • 9 Jul 2021 • Yiqun Lin, Lichang Chen, Haibin Huang, Chongyang Ma, Xiaoguang Han, Shuguang Cui
Sampling, grouping, and aggregation are three important components in the multi-scale analysis of point clouds.
4 code implementations • 30 May 2021 • Yingying Deng, Fan Tang, WeiMing Dong, Chongyang Ma, Xingjia Pan, Lei Wang, Changsheng Xu
The goal of image style transfer is to render an image with artistic features guided by a style reference while maintaining the original content.
1 code implementation • ICCV 2021 • Siming Yan, Zhenpei Yang, Chongyang Ma, Haibin Huang, Etienne Vouga, QiXing Huang
This paper introduces HPNet, a novel deep-learning approach for segmenting a 3D shape represented as a point cloud into primitive patches.
1 code implementation • CVPR 2021 • Xingyu Chen, Yufeng Liu, Chongyang Ma, Jianlong Chang, Huayan Wang, Tian Chen, Xiaoyan Guo, Pengfei Wan, Wen Zheng
In the root-relative mesh recovery task, we exploit semantic relations among joints to generate a 3D mesh from the extracted 2D cues.
no code implementations • 4 Dec 2020 • Zhiyong Huang, Kekai Sheng, WeiMing Dong, Xing Mei, Chongyang Ma, Feiyue Huang, Dengwen Zhou, Changsheng Xu
For intra-domain propagation, we propose an effective self-training strategy to mitigate the noise in pseudo-labeled target domain data and improve feature discriminability in the target domain.
no code implementations • 17 Sep 2020 • Yingying Deng, Fan Tang, Wei-Ming Dong, Haibin Huang, Chongyang Ma, Changsheng Xu
Towards this end, we propose Multi-Channel Correction network (MCCNet), which can be trained to fuse the exemplar style features and input content features for efficient style transfer while naturally maintaining the coherence of input videos.
no code implementations • ECCV 2020 • Tian Chen, Shijie An, Yuan Zhang, Chongyang Ma, Huayan Wang, Xiaoyan Guo, Wen Zheng
Monocular depth estimation plays a crucial role in 3D recognition and understanding.
no code implementations • 2 Jun 2020 • Minxuan Lin, Fan Tang, Wei-Ming Dong, Xiao Li, Chongyang Ma, Changsheng Xu
Currently, there are few methods that can perform both multimodal and multi-domain stylization simultaneously.
1 code implementation • CVPR 2020 • Xingjia Pan, Yuqiang Ren, Kekai Sheng, Wei-Ming Dong, Haolei Yuan, Xiaowei Guo, Chongyang Ma, Changsheng Xu
However, the detection of oriented and densely packed objects remains challenging for the following inherent reasons: (1) receptive fields of neurons are all axis-aligned and of the same shape, whereas objects are usually of diverse shapes and align along various directions; (2) detection models are typically trained with generic knowledge and may not generalize well to handle specific objects at test time; (3) the limited dataset hinders development on this task.
no code implementations • 26 Nov 2019 • Kekai Sheng, Wei-Ming Dong, Menglei Chai, Guohui Wang, Peng Zhou, Feiyue Huang, Bao-Gang Hu, Rongrong Ji, Chongyang Ma
In this paper, we revisit the problem of image aesthetic assessment from the self-supervised feature learning perspective.
1 code implementation • 15 May 2019 • Huaiyu Li, Wei-Ming Dong, Xing Mei, Chongyang Ma, Feiyue Huang, Bao-Gang Hu
The TargetNet module is a neural network for solving a specific task and the MetaNet module aims at learning to generate functional weights for TargetNet by observing training samples.
no code implementations • CVPR 2019 • Seonghyeon Nam, Chongyang Ma, Menglei Chai, William Brendel, Ning Xu, Seon Joo Kim
Time-lapse videos usually contain visually appealing content but are often difficult and costly to create.
1 code implementation • CVPR 2019 • Ryota Natsume, Shunsuke Saito, Zeng Huang, Weikai Chen, Chongyang Ma, Hao Li, Shigeo Morishima
The synthesized silhouettes which are the most consistent with the input segmentation are fed into a deep visual hull algorithm for robust 3D shape prediction.
1 code implementation • SIGGRAPH Asia 2018 • Kekai Sheng, Wei-Ming Dong, Haibin Huang, Chongyang Ma, Bao-Gang Hu
In this study, we present the Gourmet Photography Dataset (GPD), which is the first large-scale dataset for aesthetic assessment of food photographs.
1 code implementation • ACM Multimedia Conference 2018 • Kekai Sheng, Wei-Ming Dong, Chongyang Ma, Xing Mei, Feiyue Huang, Bao-Gang Hu
Aggregation structures with explicit information, such as image attributes and scene semantics, are effective and popular for intelligent systems for assessing aesthetics of visual data.
Ranked #1 on Aesthetics Quality Assessment on AVA
no code implementations • ECCV 2018 • Zeng Huang, Tianye Li, Weikai Chen, Yajie Zhao, Jun Xing, Chloe LeGendre, Linjie Luo, Chongyang Ma, Hao Li
We present a deep learning-based volumetric capture approach for performance capture using a passive and highly sparse multi-view capture system.
no code implementations • 6 Aug 2018 • Zaiwei Zhang, Zhenpei Yang, Chongyang Ma, Linjie Luo, Alexander Huth, Etienne Vouga, Qi-Xing Huang
We show a principled way to train this model by combining discriminator losses for both a 3D object arrangement representation and a 2D image-based representation.
no code implementations • 12 Feb 2018 • Fan Tang, Wei-Ming Dong, Yiping Meng, Chongyang Ma, Fuzhang Wu, Xinrui Li, Tong-Yee Lee
In this work, we introduce the notion of image retargetability to describe how well a particular image can be handled by content-aware image retargeting.
no code implementations • CVPR 2015 • Pei-Lun Hsieh, Chongyang Ma, Jihun Yu, Hao Li
We introduce a realtime facial tracking system specifically designed for performance capture in unconstrained settings using a consumer-level RGB-D sensor.