no code implementations • ECCV 2020 • Lvmin Zhang, Chengze Li, Yi Ji, Chunping Liu, Tien-Tsin Wong
We show that partially "erasing" the appearance preservation facilitates adequate image smoothing.
1 code implementation • 3 Sep 2024 • Wangbo Yu, Jinbo Xing, Li Yuan, WenBo Hu, Xiaoyu Li, Zhipeng Huang, Xiangjun Gao, Tien-Tsin Wong, Ying Shan, Yonghong Tian
Our method takes advantage of the powerful generation capabilities of video diffusion models and the coarse 3D clues offered by point-based representations to generate high-quality video frames with precise camera pose control.
no code implementations • 28 May 2024 • Jinbo Xing, Hanyuan Liu, Menghan Xia, Yong Zhang, Xintao Wang, Ying Shan, Tien-Tsin Wong
We introduce ToonCrafter, a novel approach that transcends traditional correspondence-based cartoon video interpolation, paving the way for generative interpolation.
no code implementations • 21 May 2024 • Jianan Li, Tao Huang, Qingxu Zhu, Tien-Tsin Wong
To attain plausible and realistic interaction motions, our method explicitly introduces physical constraints.
no code implementations • 13 Mar 2024 • Jian Lin, Xueting Liu, Chengze Li, Minshan Xie, Tien-Tsin Wong
Unfortunately, no existing method is tailored for automatic manga screening, probably due to the difficulty of generating high-quality shaded high-frequency screentones.
no code implementations • 24 Nov 2023 • Minshan Xie, Hanyuan Liu, Chengze Li, Tien-Tsin Wong
However, they struggle to generate videos with both highly detailed appearance and temporal consistency.
no code implementations • 21 Nov 2023 • Yuxin Liu, Minshan Xie, Hanyuan Liu, Tien-Tsin Wong
In this paper, we propose a synchronized multi-view diffusion approach that allows the diffusion processes from different views to reach a consensus of the generated content early in the process, and hence ensures the texture consistency.
1 code implementation • 18 Oct 2023 • Jinbo Xing, Menghan Xia, Yong Zhang, Haoxin Chen, Wangbo Yu, Hanyuan Liu, Xintao Wang, Tien-Tsin Wong, Ying Shan
Animating a still image offers an engaging visual experience.
1 code implementation • 7 Sep 2023 • Jiatai Lin, Guoqiang Han, Xuemiao Xu, Changhong Liang, Tien-Tsin Wong, C. L. Philip Chen, Zaiyi Liu, Chu Han
Class activation mapping (CAM), a visualization technique for interpreting deep learning models, is now commonly used for weakly supervised semantic segmentation (WSSS) and object localization (WSOL).
Object Localization, Weakly Supervised Semantic Segmentation
no code implementations • 14 Jun 2023 • Cheuk-Kit Lau, Menghan Xia, Tien-Tsin Wong
Furthermore, to tackle the conflicts between the blue-noise quality and restoration accuracy in our novel base method, we propose a predictor-embedded approach to offload predictable information from the network, which in our case is the luminance information resembled by the halftone pattern.
no code implementations • 7 Jun 2023 • Minshan Xie, Chengze Li, Tien-Tsin Wong
To overcome these limitations, we propose a novel interpretable representation of screentones that disentangles their intensity and type features, enabling better recognition and synthesis of screentones.
no code implementations • 2 Jun 2023 • Hanyuan Liu, Minshan Xie, Jinbo Xing, Chengze Li, Tien-Tsin Wong
In this paper, we present ColorDiffuser, an adaptation of a pre-trained text-to-image latent diffusion model for video colorization.
no code implementations • 1 Jun 2023 • Jinbo Xing, Menghan Xia, Yuxin Liu, Yuechen Zhang, Yong Zhang, Yingqing He, Hanyuan Liu, Haoxin Chen, Xiaodong Cun, Xintao Wang, Ying Shan, Tien-Tsin Wong
Our method, dubbed Make-Your-Video, involves joint-conditional video generation using a Latent Diffusion Model that is pre-trained for still image synthesis and then promoted for video generation with the introduction of temporal modules.
1 code implementation • 21 Apr 2023 • Hanyuan Liu, Jinbo Xing, Minshan Xie, Chengze Li, Tien-Tsin Wong
Our key idea is to exploit the color prior knowledge in the pre-trained T2I diffusion model for realistic and diverse colorization.
1 code implementation • CVPR 2023 • Jinbo Xing, Menghan Xia, Yuechen Zhang, Xiaodong Cun, Jue Wang, Tien-Tsin Wong
In this paper, we propose to cast speech-driven facial animation as a code query task in a finite proxy space of the learned codebook, which effectively promotes the vividness of the generated motions by reducing the cross-modal mapping uncertainty.
Ranked #4 on 3D Face Animation on BEAT2
1 code implementation • SIGGRAPH 2022 • Menghan Xia, WenBo Hu, Tien-Tsin Wong, Jue Wang
Our key insight is that several carefully located anchors could approximately represent the color distribution of an image, and conditioned on the anchor colors, we can predict the image color in a deterministic manner by utilizing internal correlation.
1 code implementation • 18 Mar 2022 • Luyang Luo, Dunyuan Xu, Hao Chen, Tien-Tsin Wong, Pheng-Ann Heng
Deep learning models were frequently reported to learn from shortcuts like dataset biases.
no code implementations • 7 Mar 2022 • Minshan Xie, Menghan Xia, Xueting Liu, Tien-Tsin Wong
Fortunately, the rescaled manga shares the same region-wise screentone correspondences with the original manga, which enables us to simplify the screentone synthesis problem as an anchor-based proposals selection and rearrangement problem.
1 code implementation • 28 Feb 2022 • Ruihui Li, Xianzhi Li, Tien-Tsin Wong, Chi-Wing Fu
To achieve a learnable self-embedding scheme, we design a novel framework with two jointly-trained networks: one to encode the input point set into its self-embedded sparse point set and the other to leverage the embedded information for inverting the original point set back.
no code implementations • 29 Jan 2022 • Jinbo Xing, WenBo Hu, Tien-Tsin Wong
In this paper, we propose a scale-Arbitrary Invertible image Downscaling Network (AIDN), to natively downscale HR images with arbitrary scale factors.
no code implementations • CVPR 2022 • Hanyuan Liu, Chengze Li, Xueting Liu, Tien-Tsin Wong
While humans can intuitively recognize dashed curves from disjoint curve segments based on the law of continuity in Gestalt psychology, it is extremely difficult for computers to model the Gestalt law of continuity and recognize the dashed curves since high-level semantic understanding is needed for this task.
no code implementations • 9 Oct 2021 • Zhuming Zhang, Menghan Xia, Xueting Liu, Chengze Li, Tien-Tsin Wong
In this paper, we propose an invertible tone mapping method that converts the multi-exposure HDR to a true LDR (8-bit per color channel) and reserves the capability to accurately restore the original HDR from this invertible LDR.
1 code implementation • 16 Jul 2021 • WenBo Hu, Changgong Zhang, Fangneng Zhan, Lei Zhang, Tien-Tsin Wong
Based on this representation, we further propose a spatial-temporal conditional directed graph convolution to leverage varying non-local dependence for different poses by conditioning the graph topology on input poses.
Ranked #17 on 3D Human Pose Estimation on Human3.6M
no code implementations • CVPR 2021 • Lvmin Zhang, Chengze Li, Edgar Simo-Serra, Yi Ji, Tien-Tsin Wong, Chunping Liu
We present a deep learning framework for user-guided line art flat filling that can compute the "influence areas" of the user color scribbles, i.e., the areas where the user scribbles should propagate and influence.
1 code implementation • CVPR 2021 • Minshan Xie, Menghan Xia, Tien-Tsin Wong
First, we predict the target resolution from the degraded manga via the Scale Estimation Network (SE-Net) with a spatial voting scheme.
1 code implementation • CVPR 2021 • WenBo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia, Tien-Tsin Wong
Via the BPM, complementary 2D and 3D information can interact with each other in multiple architectural levels, such that advantages in these two visual domains can be combined for better scene recognition.
Ranked #19 on Semantic Segmentation on ScanNet
no code implementations • 21 Mar 2021 • Menghan Xia, Jose Echevarria, Minshan Xie, Tien-Tsin Wong
Light fields are 4D scene representations typically structured as arrays of views, or several directional samples per pixel in a single view.
1 code implementation • ICCV 2021 • Menghan Xia, WenBo Hu, Xueting Liu, Tien-Tsin Wong
Existing halftoning algorithms usually drop colors and fine details when dithering color images with binary dot patterns, which makes it extremely difficult to recover the original information.
no code implementations • 9 Dec 2020 • Menghan Xia, Yi Wang, Chu Han, Tien-Tsin Wong
Noise Incentive Block (NIB), which serves as a generic plug-in for any CNN generation model.
1 code implementation • 3 Sep 2020 • Wenbo Hu, Menghan Xia, Chi-Wing Fu, Tien-Tsin Wong
This paper presents the idea of mono-nizing binocular videos and a framework to effectively realize it.
Image and Video Processing, Graphics
no code implementations • 17 Sep 2018 • Zhuming Zhang, Xinghong Hu, Xueting Liu, Tien-Tsin Wong
However, the existing research lacks the binocular perception study and is unable to generate the optimal binocular pair that presents the most visual content.
2 code implementations • 1 Aug 2017 • Haichao Zhu, Xueting Liu, Xiangyu Mao, Tien-Tsin Wong
Interlacing is a widely used technique, for television broadcast and video recording, to double the perceived frame rate without increasing the bandwidth.
Ranked #11 on Video Deinterlacing on MSU Deinterlacer Benchmark
3 code implementations • 27 Oct 2015 • Guofeng Zhang, Hao-Min Liu, Zilong Dong, Jiaya Jia, Tien-Tsin Wong, Hujun Bao
Our framework consists of steps of solving the feature 'dropout' problem when indistinctive structures, noise or large image distortion exist, and of rapidly recognizing and joining common features located in different subsequences.