no code implementations • 9 Dec 2024 • Yuzhu Ji, Chuanxia Zheng, Tat-Jen Cham
To address the limitations, and considering the importance of both appearance and geometry for motion transfer, we propose a unified framework that combines multi-scale feature warping and neural texture mapping to recover better 2D appearance and 2.5D geometry, partly by exploiting information from DensePose while adapting to its inherently limited accuracy.
1 code implementation • 7 Nov 2024 • Yuedong Chen, Chuanxia Zheng, Haofei Xu, Bohan Zhuang, Andrea Vedaldi, Tat-Jen Cham, Jianfei Cai
To evaluate MVSplat360's performance, we introduce a new benchmark using the challenging DL3DV-10K dataset, where MVSplat360 achieves superior visual quality compared to state-of-the-art methods on wide-sweeping or even 360° NVS tasks.
no code implementations • 25 Aug 2024 • Brandon Smart, Chuanxia Zheng, Iro Laina, Victor Adrian Prisacariu
In this paper, we introduce Splatt3R, a pose-free, feed-forward method for in-the-wild 3D reconstruction and novel view synthesis from stereo pairs.
no code implementations • 8 Aug 2024 • Ruining Li, Chuanxia Zheng, Christian Rupprecht, Andrea Vedaldi
We present Puppet-Master, an interactive video generative model that can serve as a motion prior for part-level dynamics.
1 code implementation • 6 Jun 2024 • Stanislaw Szymanowicz, Eldar Insafutdinov, Chuanxia Zheng, Dylan Campbell, João F. Henriques, Christian Rupprecht, Andrea Vedaldi
In this paper, we propose Flash3D, a method for scene reconstruction and novel view synthesis from a single image which is both very generalisable and efficient.
no code implementations • 22 Mar 2024 • Ruining Li, Chuanxia Zheng, Christian Rupprecht, Andrea Vedaldi
We introduce DragAPart, a method that, given an image and a set of drags as input, generates a new image of the same object that responds to the action of the drags.
no code implementations • 21 Mar 2024 • Tianhao Wu, Chuanxia Zheng, Tat-Jen Cham, Qianyi Wu
3D decomposition/segmentation still remains a challenge as large-scale 3D annotated data is not readily available.
1 code implementation • 21 Mar 2024 • Yuedong Chen, Haofei Xu, Chuanxia Zheng, Bohan Zhuang, Marc Pollefeys, Andreas Geiger, Tat-Jen Cham, Jianfei Cai
We introduce MVSplat, an efficient model that, given sparse multi-view images as input, predicts clean feed-forward 3D Gaussians.
Ranked #1 on Generalizable Novel View Synthesis on ACID
1 code implementation • CVPR 2024 • Guanqi Zhan, Chuanxia Zheng, Weidi Xie, Andrew Zisserman
In contrast, we use 3D data to establish an automatic pipeline to determine authentic ground truth amodal masks for partially occluded objects in real images.
1 code implementation • CVPR 2024 • Chuanxia Zheng, Andrea Vedaldi
Similar to Zero-1-to-3, we start from a pre-trained 2D image generator for generalization, and fine-tune it for NVS.
no code implementations • CVPR 2024 • Minghui Hu, Jianbin Zheng, Chuanxia Zheng, Chaoyue Wang, DaCheng Tao, Tat-Jen Cham
By integrating a compact network and incorporating an additional simple yet effective step during inference, OMS elevates image fidelity and harmonizes the dichotomy between training and inference, while preserving original model parameters.
1 code implementation • 10 Oct 2023 • Guanqi Zhan, Chuanxia Zheng, Weidi Xie, Andrew Zisserman
To this end, we make the following contributions: (i) We introduce a general and lightweight protocol to evaluate whether features of an off-the-shelf large vision model encode a number of physical 'properties' of the 3D scene, by training discriminative classifiers on the features for these properties.
1 code implementation • ICCV 2023 • Chuanxia Zheng, Andrea Vedaldi
Vector Quantisation (VQ) is experiencing a comeback in machine learning, where it is increasingly used in representation learning.
no code implementations • 6 Jul 2023 • Tianhao Wu, Chuanxia Zheng, Tat-Jen Cham
Generating complete 360-degree panoramas from narrow field of view images is ongoing research as omnidirectional RGB data is not readily available.
no code implementations • 1 Jun 2023 • Minghui Hu, Jianbin Zheng, Daqing Liu, Chuanxia Zheng, Chaoyue Wang, DaCheng Tao, Tat-Jen Cham
In this work, we propose Cocktail, a pipeline to mix various modalities into one embedding, amalgamated with a generalized ControlNet (gControlNet), a controllable normalisation (ControlNorm), and a spatial guidance sampling method, to actualize multi-modal and spatially-refined control for text-conditional diffusion models.
1 code implementation • 24 Apr 2023 • Yuedong Chen, Haofei Xu, Qianyi Wu, Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
The key to our approach lies in the explicitly modeled correspondence matching information, so as to provide the geometry prior to the prediction of NeRF color and density for volume rendering.
no code implementations • 12 Feb 2023 • Tung-Long Vuong, Trung Le, He Zhao, Chuanxia Zheng, Mehrtash Harandi, Jianfei Cai, Dinh Phung
Learning deep discrete latent representations offers a promise of better symbolic and summarized abstractions that are more useful to subsequent downstream tasks.
1 code implementation • 27 Nov 2022 • Minghui Hu, Chuanxia Zheng, Heliang Zheng, Tat-Jen Cham, Chaoyue Wang, Zuopeng Yang, DaCheng Tao, Ponnuthurai N. Suganthan
Recently developed discrete diffusion models perform extraordinarily well in the text-to-image task, showing significant promise for handling multi-modality signals.
2 code implementations • 19 Sep 2022 • Chuanxia Zheng, Long Tung Vuong, Jianfei Cai, Dinh Phung
Although two-stage Vector Quantized (VQ) generative models allow for synthesizing high-fidelity and high-resolution images, their quantization operator encodes similar patches within an image into the same index, resulting in a repeated artifact for similar adjacent regions using existing decoder architectures.
Ranked #4 on Image Reconstruction on ImageNet
1 code implementation • 20 Jul 2022 • Qianyi Wu, Xian Liu, Yuedong Chen, Kejie Li, Chuanxia Zheng, Jianfei Cai, Jianmin Zheng
This paper proposes a novel framework, ObjectSDF, to build an object-compositional neural implicit representation with high fidelity in 3D reconstruction and object representation.
no code implementations • 5 Apr 2022 • Chuanxia Zheng, Guoxian Song, Tat-Jen Cham, Jianfei Cai, Dinh Phung, Linjie Luo
In this work, we present a novel framework for pluralistic image completion that can achieve both high quality and diversity at much faster inference speed.
1 code implementation • 21 Mar 2022 • Yuedong Chen, Qianyi Wu, Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
In light of recent advances in NeRF-based 3D-aware generative models, we introduce a new task, Semantic-to-NeRF translation, that aims to reconstruct a 3D scene modelled by NeRF, conditioned on one single-view semantic mask as input.
Ranked #1 on 3D-Aware Image Synthesis on CelebAMask-HQ
no code implementations • 23 Feb 2022 • Chuanxia Zheng
The goal of this thesis is to present my research contributions towards solving various visual synthesis and generation tasks, comprising image translation, image completion, and completed scene decomposition.
1 code implementation • ACM Transactions on Graphics 2021 • Guoxian Song, Linjie Luo, Jing Liu, Wan-Chun Ma, Chun-Pong Lai, Chuanxia Zheng, Tat-Jen Cham
While substantial progress has been made in automated stylization, generating high quality stylistic portraits is still a challenge, and even the recent popular Toonify suffers from several artifacts when used on real input images.
1 code implementation • 12 Apr 2021 • Chuanxia Zheng, Duy-Son Dao, Guoxian Song, Tat-Jen Cham, Jianfei Cai
In this work, we propose a higher-level scene understanding system to tackle both visible and invisible parts of objects and backgrounds in a given scene.
1 code implementation • CVPR 2022 • Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai, Dinh Phung
Bridging global context interactions correctly is important for high-fidelity image completion with large masks.
Ranked #3 on Image Inpainting on FFHQ 512 x 512
2 code implementations • CVPR 2021 • Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
We propose a novel spatially-correlative loss that is simple, efficient and yet effective for preserving scene structure consistency while supporting large appearance changes during unpaired image-to-image (I2I) translation.
no code implementations • ICCV 2021 • Yujun Cai, Yiwei Wang, Yiheng Zhu, Tat-Jen Cham, Jianfei Cai, Junsong Yuan, Jun Liu, Chuanxia Zheng, Sijie Yan, Henghui Ding, Xiaohui Shen, Ding Liu, Nadia Magnenat Thalmann
Notably, by considering this problem as a conditional generation process, we estimate a parametric distribution of the missing regions based on the input conditions, from which to sample and synthesize the full motion series.
1 code implementation • CVPR 2019 • Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
In this paper, we present an approach for pluralistic image completion -- the task of generating multiple and diverse plausible solutions for image completion.
1 code implementation • ECCV 2018 • Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
Current methods for single-image depth estimation use training datasets with real image-depth pairs or stereo pairs, which are not easy to acquire.