no code implementations • ECCV 2020 • Yujun Cai, Lin Huang, Yiwei Wang, Tat-Jen Cham, Jianfei Cai, Junsong Yuan, Jun Liu, Xu Yang, Yiheng Zhu, Xiaohui Shen, Ding Liu, Jing Liu, Nadia Magnenat Thalmann
Finally, in order to incorporate a general motion space for high-quality prediction, we build a memory-based dictionary that preserves the global motion patterns in the training data to guide the predictions.
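A minimal, hypothetical sketch of such a memory-based dictionary, written as an attention-style lookup over learned motion prototypes; the MotionMemory name, slot count, and feature size are assumptions for illustration, not the paper's actual design:

```python
# Hypothetical sketch (not the authors' code): a memory bank of motion
# prototypes is queried by attention, so predictions are guided by global
# motion patterns seen during training. Names and sizes are assumptions.
import torch
import torch.nn as nn

class MotionMemory(nn.Module):
    def __init__(self, num_slots=512, dim=128):
        super().__init__()
        # learnable dictionary of motion prototypes (keys and values)
        self.keys = nn.Parameter(torch.randn(num_slots, dim))
        self.values = nn.Parameter(torch.randn(num_slots, dim))

    def forward(self, query):                                # query: (B, dim)
        attn = torch.softmax(query @ self.keys.t() / query.shape[-1] ** 0.5, dim=-1)
        return attn @ self.values                            # (B, dim) retrieved motion prior

memory = MotionMemory()
print(memory(torch.randn(4, 128)).shape)                     # torch.Size([4, 128])
```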
1 code implementation • 7 Aug 2024 • Zhe Jun Tang, Tat-Jen Cham
Instead of optimising a single outgoing radiance parameter, 3iGS enhances 3DGS view-dependent effects by expressing the outgoing radiance as a function of a local illumination field and Bidirectional Reflectance Distribution Function (BRDF) features.
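The following is a hedged sketch of that idea: each Gaussian's outgoing radiance is computed as a function of a local illumination feature (queried at the Gaussian's position) and a per-Gaussian BRDF feature, rather than a single baked colour. The RadianceHead module, its dimensions, and the element-wise modulation are assumptions, not 3iGS's actual tensorial-illumination implementation.

```python
# Minimal, assumed illustration of radiance = f(local illumination, BRDF feature, view dir).
import torch
import torch.nn as nn

class RadianceHead(nn.Module):
    def __init__(self, feat_dim=16, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid())              # RGB in [0, 1]

    def forward(self, brdf_feat, illum_feat, view_dir):
        # modulate the BRDF features by the local illumination field
        shaded = brdf_feat * illum_feat
        return self.mlp(torch.cat([shaded, view_dir], dim=-1))

head = RadianceHead()
rgb = head(torch.rand(1000, 16), torch.rand(1000, 16), torch.randn(1000, 3))
print(rgb.shape)                                             # torch.Size([1000, 3])
```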
no code implementations • 21 Mar 2024 • Tianhao Wu, Chuanxia Zheng, Tat-Jen Cham, Qianyi Wu
3D decomposition/segmentation remains a challenge, as large-scale annotated 3D data is not readily available.
1 code implementation • 21 Mar 2024 • Yuedong Chen, Haofei Xu, Chuanxia Zheng, Bohan Zhuang, Marc Pollefeys, Andreas Geiger, Tat-Jen Cham, Jianfei Cai
We introduce MVSplat, an efficient model that, given sparse multi-view images as input, predicts clean feed-forward 3D Gaussians.
Ranked #1 on Generalizable Novel View Synthesis on ACID
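For the MVSplat entry above, here is a rough, assumed illustration of a feed-forward Gaussian prediction head: a 1x1 convolution predicts per-pixel depth and Gaussian parameters, and pixels are unprojected with the camera intrinsics to obtain 3D means. The GaussianHead name and parameter layout are hypothetical, not the authors' implementation.

```python
# Hedged sketch: per-pixel Gaussian parameters predicted in a single forward pass.
import torch
import torch.nn as nn

class GaussianHead(nn.Module):
    def __init__(self, feat_ch=32):
        super().__init__()
        # per pixel: depth(1) + opacity(1) + scale(3) + rotation quat(4) + rgb(3)
        self.head = nn.Conv2d(feat_ch, 12, 1)

    def forward(self, feats, K_inv):                         # feats: (B, C, H, W)
        b, _, h, w = feats.shape
        out = self.head(feats)
        depth = out[:, :1].exp()                             # positive depth
        ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
        pix = torch.stack([xs, ys, torch.ones_like(xs)], -1).float()   # (H, W, 3)
        rays = pix.view(-1, 3) @ K_inv.T                     # camera rays per pixel
        centers = rays.unsqueeze(0) * depth.view(b, -1, 1)   # (B, HW, 3) Gaussian means
        return centers, out[:, 1:]                           # means + remaining parameters

head = GaussianHead()
centers, params = head(torch.rand(1, 32, 16, 16), torch.eye(3))
print(centers.shape, params.shape)                           # (1, 256, 3) (1, 11, 16, 16)
```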
1 code implementation • 29 Feb 2024 • Jianbin Zheng, Minghui Hu, Zhongyi Fan, Chaoyue Wang, Changxing Ding, DaCheng Tao, Tat-Jen Cham
Consequently, we introduce Trajectory Consistency Distillation (TCD), which encompasses a trajectory consistency function and strategic stochastic sampling.
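Below is a hypothetical sketch, in the spirit of that description, of a sampler built around a trajectory consistency function: model(x, t, s) is assumed to map a noisy sample at time t directly to an estimate at an earlier time s, and gamma controls how much noise is re-injected between jumps. Neither these names nor the linear schedule come from the paper.

```python
# Assumed consistency-style sampler: jump along the trajectory, then re-inject noise.
import torch

@torch.no_grad()
def sample(model, shape, timesteps, gamma=0.3, device="cpu"):
    x = torch.randn(shape, device=device)                    # start from pure noise
    for t, s in zip(timesteps[:-1], timesteps[1:]):          # t > s, decreasing to 0
        x_s = model(x, t, s)                                 # direct jump to time s
        if s > 0:
            # stochastic step: perturb back toward an intermediate noise level
            x = x_s + gamma * s * torch.randn_like(x_s)
        else:
            x = x_s
    return x

dummy = lambda x, t, s: x * (s / max(t, 1e-8))               # stand-in denoiser for shape checks
out = sample(dummy, (1, 3, 8, 8), timesteps=[1.0, 0.5, 0.25, 0.0])
print(out.shape)                                             # torch.Size([1, 3, 8, 8])
```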
no code implementations • CVPR 2024 • Minghui Hu, Jianbin Zheng, Chuanxia Zheng, Chaoyue Wang, DaCheng Tao, Tat-Jen Cham
By integrating a compact network and incorporating an additional simple yet effective step during inference, OMS elevates image fidelity and harmonizes the dichotomy between training and inference, while preserving original model parameters.
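A hedged sketch of the "additional simple step" idea as described: a compact network adjusts the initial pure-noise latent before the frozen diffusion sampler is run, so inference better matches training. The OneMoreStep module, its architecture, and the residual form are assumptions.

```python
# Assumed illustration: correct the starting latent, then run the unchanged sampler.
import torch
import torch.nn as nn

class OneMoreStep(nn.Module):
    def __init__(self, channels=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 64, 3, padding=1), nn.SiLU(),
            nn.Conv2d(64, channels, 3, padding=1))

    def forward(self, noise_latent):
        # predict a correction so the latent matches what the frozen model saw in training
        return noise_latent + self.net(noise_latent)

oms = OneMoreStep()
z = torch.randn(1, 4, 64, 64)                                # initial pure-noise latent
z_corrected = oms(z)                                         # hand z_corrected to the usual sampler
print(z_corrected.shape)                                     # torch.Size([1, 4, 64, 64])
```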
no code implementations • 6 Jul 2023 • Tianhao Wu, Chuanxia Zheng, Tat-Jen Cham
Generating complete 360-degree panoramas from narrow field-of-view images remains an open research problem, as omnidirectional RGB data is not readily available.
no code implementations • 1 Jun 2023 • Minghui Hu, Jianbin Zheng, Daqing Liu, Chuanxia Zheng, Chaoyue Wang, DaCheng Tao, Tat-Jen Cham
In this work, we propose Cocktail, a pipeline to mix various modalities into one embedding, amalgamated with a generalized ControlNet (gControlNet), a controllable normalisation (ControlNorm), and a spatial guidance sampling method, to actualize multi-modal and spatially-refined control for text-conditional diffusion models.
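As a rough illustration of mixing several control modalities into one embedding, the sketch below projects each control map (e.g. sketch, pose, segmentation) with its own lightweight encoder and sums the results. This projection-and-sum fusion is an assumption made for clarity, not the paper's gControlNet/ControlNorm design.

```python
# Assumed modality-mixing sketch: one encoder per control signal, fused into one embedding.
import torch
import torch.nn as nn

class ModalityMixer(nn.Module):
    def __init__(self, in_channels=(1, 3, 20), dim=64):
        super().__init__()
        # one lightweight encoder per modality (e.g. sketch, pose, segmentation)
        self.encoders = nn.ModuleList(
            nn.Conv2d(c, dim, 3, padding=1) for c in in_channels)

    def forward(self, controls):                             # list of (B, C_i, H, W) maps
        feats = [enc(x) for enc, x in zip(self.encoders, controls)]
        return torch.stack(feats, dim=0).sum(0)              # one fused (B, dim, H, W) embedding

mixer = ModalityMixer()
fused = mixer([torch.rand(1, 1, 32, 32), torch.rand(1, 3, 32, 32), torch.rand(1, 20, 32, 32)])
print(fused.shape)                                           # torch.Size([1, 64, 32, 32])
```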
1 code implementation • 24 Apr 2023 • Yuedong Chen, Haofei Xu, Qianyi Wu, Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
The key to our approach lies in the explicitly modeled correspondence matching information, which provides a geometry prior for predicting the NeRF color and density used in volume rendering.
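A minimal sketch of that idea, under assumed interfaces: features sampled from two source views at the same 3D point are compared with cosine similarity, and the similarity is fed to the NeRF head alongside the positional encoding as a geometry cue.

```python
# Assumed illustration: cross-view feature agreement conditions the NeRF prediction.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MatchConditionedHead(nn.Module):
    def __init__(self, pos_dim=63, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(pos_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, 4))                            # RGB + density

    def forward(self, pos_enc, feat_view1, feat_view2):
        # high cross-view similarity suggests the point lies on a real surface
        match = F.cosine_similarity(feat_view1, feat_view2, dim=-1, eps=1e-6)
        return self.mlp(torch.cat([pos_enc, match.unsqueeze(-1)], dim=-1))

head = MatchConditionedHead()
out = head(torch.randn(4096, 63), torch.randn(4096, 32), torch.randn(4096, 32))
print(out.shape)                                             # torch.Size([4096, 4])
```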
1 code implementation • CVPR 2023 • Zhe Jun Tang, Tat-Jen Cham, Haiyu Zhao
Our method, which we call ABLE-NeRF, significantly reduces `blurry' glossy surfaces in rendering and produces realistic translucent surfaces that are lacking in prior art.
1 code implementation • 27 Nov 2022 • Minghui Hu, Chuanxia Zheng, Heliang Zheng, Tat-Jen Cham, Chaoyue Wang, Zuopeng Yang, DaCheng Tao, Ponnuthurai N. Suganthan
The recently developed discrete diffusion models perform extraordinarily well on the text-to-image task, showing significant promise for handling multi-modal signals.
no code implementations • 13 Jul 2022 • Bo Hu, Tat-Jen Cham
Our key idea is to model the spatio-temporal relations among participants in a manner that is robust to error accumulation during frame-wise inference and prediction.
no code implementations • 5 Apr 2022 • Chuanxia Zheng, Guoxian Song, Tat-Jen Cham, Jianfei Cai, Dinh Phung, Linjie Luo
In this work, we present a novel framework for pluralistic image completion that achieves both high quality and diversity at a much faster inference speed.
1 code implementation • 21 Mar 2022 • Yuedong Chen, Qianyi Wu, Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
In light of recent advances in NeRF-based 3D-aware generative models, we introduce a new task, Semantic-to-NeRF translation, that aims to reconstruct a 3D scene modelled by NeRF, conditioned on a single-view semantic mask as input.
Ranked #1 on 3D-Aware Image Synthesis on CelebAMask-HQ
no code implementations • CVPR 2022 • Minghui Hu, Yujie Wang, Tat-Jen Cham, Jianfei Yang, P. N. Suganthan
We show that with the help of a content-rich discrete visual codebook from VQ-VAE, the discrete diffusion model can also generate high-fidelity images with global context, which compensates for the deficiency of the classical autoregressive model along pixel space.
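To illustrate the discrete-diffusion side of this setup, here is a simplified, assumed example of absorbing-state corruption over VQ-VAE token indices: tokens are progressively replaced by a special [MASK] token, and a transformer would be trained to reverse that process. The linear schedule and token layout are simplifications, not the paper's exact formulation.

```python
# Assumed forward (corruption) process of a mask-based discrete diffusion over VQ tokens.
import torch

def corrupt_tokens(tokens, t, num_steps, mask_id):
    # probability of masking grows linearly with the diffusion step t
    p = (t + 1) / num_steps
    mask = torch.rand_like(tokens, dtype=torch.float) < p
    return torch.where(mask, torch.full_like(tokens, mask_id), tokens)

codebook_size, mask_id = 1024, 1024
tokens = torch.randint(0, codebook_size, (2, 16 * 16))       # VQ-VAE indices of an image
noisy = corrupt_tokens(tokens, t=50, num_steps=100, mask_id=mask_id)
print((noisy == mask_id).float().mean().item())              # roughly 0.51 of tokens masked
```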
1 code implementation • 26 Jul 2021 • Yuedong Chen, Xu Yang, Tat-Jen Cham, Jianfei Cai
In this work, we scrutinize this problem from the perspective of causal inference, where such a dataset characteristic is termed a confounder that misleads the system into learning spurious correlations.
1 code implementation • ACM Transactions on Graphics 2021 • Guoxian Song, Linjie Luo, Jing Liu, Wan-Chun Ma, Chun-Pong Lai, Chuanxia Zheng, Tat-Jen Cham
While substantial progress has been made in automated stylization, generating high quality stylistic portraits is still a challenge, and even the recent popular Toonify suffers from several artifacts when used on real input images.
1 code implementation • 12 Apr 2021 • Chuanxia Zheng, Duy-Son Dao, Guoxian Song, Tat-Jen Cham, Jianfei Cai
In this work, we propose a higher-level scene understanding system to tackle both visible and invisible parts of objects and backgrounds in a given scene.
1 code implementation • CVPR 2022 • Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai, Dinh Phung
Bridging global context interactions correctly is important for high-fidelity image completion with large masks.
Ranked #2 on Image Inpainting on FFHQ 512 x 512
2 code implementations • CVPR 2021 • Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
We propose a novel spatially-correlative loss that is simple, efficient and yet effective for preserving scene structure consistency while supporting large appearance changes during unpaired image-to-image (I2I) translation.
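A hedged sketch of a spatially-correlative structure loss along these lines: scene structure is represented by the self-similarity pattern of features, and the loss compares those patterns between the input and the translated image rather than comparing the features themselves. The feature extractor and patch scheme used in the paper may differ.

```python
# Assumed illustration: compare feature self-similarity maps, not raw features.
import torch
import torch.nn.functional as F

def self_similarity(feat):                                   # feat: (B, C, H, W)
    b, c, h, w = feat.shape
    f = F.normalize(feat.flatten(2), dim=1)                  # (B, C, HW), unit-norm channels
    return torch.bmm(f.transpose(1, 2), f)                   # (B, HW, HW) cosine self-similarity

def spatially_correlative_loss(feat_src, feat_tgt):
    # structure is preserved when the two self-similarity maps agree
    return F.l1_loss(self_similarity(feat_src), self_similarity(feat_tgt))

loss = spatially_correlative_loss(torch.rand(2, 64, 16, 16), torch.rand(2, 64, 16, 16))
print(loss.item())
```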
no code implementations • ICCV 2021 • Yujun Cai, Yiwei Wang, Yiheng Zhu, Tat-Jen Cham, Jianfei Cai, Junsong Yuan, Jun Liu, Chuanxia Zheng, Sijie Yan, Henghui Ding, Xiaohui Shen, Ding Liu, Nadia Magnenat Thalmann
Notably, by considering this problem as a conditional generation process, we estimate a parametric distribution of the missing regions based on the input conditions, from which to sample and synthesize the full motion series.
no code implementations • 6 Mar 2020 • Yuedong Chen, Guoxian Song, Zhiwen Shao, Jianfei Cai, Tat-Jen Cham, Jianming Zheng
Automatic facial action unit (AU) recognition has attracted great attention but remains a challenging task, as subtle changes in local facial muscles are difficult to capture thoroughly.
no code implementations • 27 Nov 2019 • Guoxian Song, Jianmin Zheng, Jianfei Cai, Tat-Jen Cham
While the problem of estimating shapes and diffuse reflectances of human faces from images has been extensively studied, there is relatively less work done on recovering the specular albedo.
no code implementations • 6 May 2019 • Lu Sheng, Jianfei Cai, Tat-Jen Cham, Vladimir Pavlovic, King Ngi Ngan
In this paper, we propose a generative framework that unifies depth-based 3D facial pose tracking and on-the-fly face model adaptation in unconstrained scenarios with heavy occlusions and arbitrary facial expression variations.
1 code implementation • 25 Mar 2019 • Zhiwen Shao, Jianfei Cai, Tat-Jen Cham, Xuequan Lu, Lizhuang Ma
Due to the combination of source AU-related information and target AU-free information, the latent feature domain with transferred source labels can be learned by maximizing the target-domain AU detection performance.
1 code implementation • CVPR 2019 • Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
In this paper, we present an approach for pluralistic image completion -- the task of generating multiple and diverse plausible solutions for image completion.
no code implementations • 1 Mar 2019 • Bo Hu, Jianfei Cai, Tat-Jen Cham, Junsong Yuan
Previous spatial-temporal action localization methods commonly follow the pipeline of object detection to estimate bounding boxes and labels of actions.
no code implementations • 21 Jan 2019 • Guoxian Song, Jianfei Cai, Tat-Jen Cham, Jianmin Zheng, Juyong Zhang, Henry Fuchs
Teleconferencing or telepresence based on virtual reality (VR) head-mounted display (HMD) devices is a very interesting and promising application, since HMDs can provide an immersive experience for users.
1 code implementation • ECCV 2018 • Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
Current methods for single-image depth estimation use training datasets with real image-depth pairs or stereo pairs, which are not easy to acquire.
Ranked #3 on Depth Estimation on eBDtheque
no code implementations • 21 Feb 2018 • Zhilei Liu, Guoxian Song, Jianfei Cai, Tat-Jen Cham, Juyong Zhang
Employing deep learning-based approaches for fine-grained facial expression analysis, such as those involving the estimation of Action Unit (AU) intensities, is difficult due to the lack of a large-scale dataset of real faces with sufficiently diverse AU labels for training.
no code implementations • CVPR 2017 • Lu Sheng, Jianfei Cai, Tat-Jen Cham, Vladimir Pavlovic, King Ngi Ngan
We consider the problem of depth-based robust 3D facial pose tracking under unconstrained scenarios with heavy occlusions and arbitrary facial expression variations.
no code implementations • CVPR 2016 • Anran Wang, Jianfei Cai, Jiwen Lu, Tat-Jen Cham
While convolutional neural networks (CNNs) have been excellent for object recognition, the greater spatial variability in scene images typically means that standard full-image CNN features are suboptimal for scene classification.
no code implementations • ICCV 2015 • Anran Wang, Jianfei Cai, Jiwen Lu, Tat-Jen Cham
We first construct deep CNN layers for color and depth separately, and then connect them with our carefully designed multi-modal layers, which fuse color and depth information by enforcing a common part to be shared by features of different modalities.
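As a hypothetical illustration of this fusion scheme, the sketch below gives colour and depth their own CNN branches, splits each branch's feature into a common part and a modality-specific part, and penalises disagreement between the common parts. All sizes and the split-in-half scheme are assumptions.

```python
# Assumed sketch: two-branch RGB-D network with an enforced shared (common) feature part.
import torch
import torch.nn as nn
import torch.nn.functional as F

def branch(in_ch, out_dim=128):
    return nn.Sequential(
        nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, out_dim))

rgb_net, depth_net = branch(3), branch(1)
rgb_feat = rgb_net(torch.rand(2, 3, 64, 64))
depth_feat = depth_net(torch.rand(2, 1, 64, 64))

# first half = common part shared by both modalities, second half = modality-specific
common_rgb, specific_rgb = rgb_feat.chunk(2, dim=1)
common_depth, specific_depth = depth_feat.chunk(2, dim=1)
share_loss = F.mse_loss(common_rgb, common_depth)            # encourage the shared part to agree
fused = torch.cat([common_rgb, specific_rgb, specific_depth], dim=1)
print(fused.shape, share_loss.item())                        # torch.Size([2, 192]) ...
```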
no code implementations • 10 Jul 2015 • Hai X. Pham, Chongyu Chen, Luc N. Dao, Vladimir Pavlovic, Jianfei Cai, Tat-Jen Cham
We introduce a novel robust hybrid 3D face tracking framework from RGBD video streams, which is capable of tracking head pose and facial actions without pre-calibration or intervention from a user.
no code implementations • CVPR 2014 • Di Xu, Qi Duan, Jianming Zheng, Juyong Zhang, Jianfei Cai, Tat-Jen Cham
As a result, our approach is robust and stable, and is able to efficiently recover high-quality surface details even when starting with a coarse MVS reconstruction.