Search Results for author: Kwan-Yee K. Wong

Found 40 papers, 22 papers with code

DreamAvatar: Text-and-Shape Guided 3D Human Avatar Generation via Diffusion Models

1 code implementation 3 Apr 2023 Yukang Cao, Yan-Pei Cao, Kai Han, Ying Shan, Kwan-Yee K. Wong

We present DreamAvatar, a text-and-shape guided framework for generating high-quality 3D human avatars with controllable poses.

Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models

1 code implementation NeurIPS 2023 Shihao Zhao, Dongdong Chen, Yen-Chun Chen, Jianmin Bao, Shaozhe Hao, Lu Yuan, Kwan-Yee K. Wong

Text-to-Image diffusion models have made tremendous progress over the past two years, enabling the generation of highly realistic images based on open-domain text descriptions.

Progressive Semantic-Aware Style Transformation for Blind Face Restoration

1 code implementation CVPR 2021 Chaofeng Chen, Xiaoming Li, Lingbo Yang, Xianhui Lin, Lei Zhang, Kwan-Yee K. Wong

Compared with previous networks, the proposed PSFR-GAN makes full use of the semantic (parsing maps) and pixel (LQ images) space information from different scales of input pairs.

Blind Face Restoration Face Parsing +2

ViCo: Plug-and-play Visual Condition for Personalized Text-to-image Generation

1 code implementation 1 Jun 2023 Shaozhe Hao, Kai Han, Shihao Zhao, Kwan-Yee K. Wong

Personalized text-to-image generation using diffusion models has recently emerged and garnered significant interest.

Text-to-Image Generation

Learning Spatial Attention for Face Super-Resolution

1 code implementation 2 Dec 2020 Chaofeng Chen, Dihong Gong, Hao Wang, Zhifeng Li, Kwan-Yee K. Wong

Visualization of the attention maps shows that our spatial attention network can capture the key face structures well even for very low resolution faces (e.g., $16\times16$).

Face Parsing Image Super-Resolution +2

Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation

1 code implementation 12 Mar 2024 Shihao Zhao, Shaozhe Hao, Bojia Zi, Huaizhe Xu, Kwan-Yee K. Wong

In this paper, we explore this objective and propose LaVi-Bridge, a pipeline that enables the integration of diverse pre-trained language models and generative vision models for text-to-image generation.

Language Modelling Text-to-Image Generation

Self-calibrating Deep Photometric Stereo Networks

1 code implementation CVPR 2019 Guan-Ying Chen, Kai Han, Boxin Shi, Yasuyuki Matsushita, Kwan-Yee K. Wong

This paper proposes an uncalibrated photometric stereo method for non-Lambertian scenes based on deep learning.

PS-FCN: A Flexible Learning Framework for Photometric Stereo

1 code implementation ECCV 2018 Guan-Ying Chen, Kai Han, Kwan-Yee K. Wong

This paper addresses the problem of photometric stereo for non-Lambertian surfaces.

Deep Photometric Stereo for Non-Lambertian Surfaces

1 code implementation 26 Jul 2020 Guan-Ying Chen, Kai Han, Boxin Shi, Yasuyuki Matsushita, Kwan-Yee K. Wong

To deal with the uncalibrated scenario where light directions are unknown, we introduce a new convolutional network, named LCNet, to estimate light directions from input images.

Semi-Supervised Learning for Face Sketch Synthesis in the Wild

1 code implementation 12 Dec 2018 Chaofeng Chen, Wei Liu, Xiao Tan, Kwan-Yee K. Wong

Instead of supervising the network with ground truth sketches, we first perform patch matching in feature space between the input photo and photos in a small reference set of photo-sketch pairs.

Face Sketch Synthesis Patch Matching

Blind Image Super-resolution with Elaborate Degradation Modeling on Noise and Kernel

1 code implementation CVPR 2022 Zongsheng Yue, Qian Zhao, Jianwen Xie, Lei Zhang, Deyu Meng, Kwan-Yee K. Wong

To address the above issues, this paper proposes a model-based blind SISR method under the probabilistic framework, which elaborately models image degradation from the perspectives of noise and blur kernel.

Image Super-Resolution

Weakly-Supervised Spatio-Temporally Grounding Natural Sentence in Video

1 code implementation ACL 2019 Zhenfang Chen, Lin Ma, Wenhan Luo, Kwan-Yee K. Wong

In this paper, we address a novel task, namely weakly-supervised spatio-temporally grounding natural sentence in video.

object-detection Sentence +1

Face Sketch Synthesis with Style Transfer using Pyramid Column Feature

1 code implementation 18 Sep 2020 Chaofeng Chen, Xiao Tan, Kwan-Yee K. Wong

We utilize a fully convolutional neural network (FCNN) to create the content image, and propose a style transfer approach to introduce textures and shadings based on a newly proposed pyramid column feature.

Face Sketch Synthesis Style Transfer

SCNet: Learning Semantic Correspondence

1 code implementation ICCV 2017 Kai Han, Rafael S. Rezende, Bumsub Ham, Kwan-Yee K. Wong, Minsu Cho, Cordelia Schmid, Jean Ponce

This paper addresses the problem of establishing semantic correspondences between images depicting different instances of the same object or scene category.

Semantic correspondence

A Unified Framework for Masked and Mask-Free Face Recognition via Feature Rectification

1 code implementation 15 Feb 2022 Shaozhe Hao, Chaofeng Chen, Zhenfang Chen, Kwan-Yee K. Wong

We introduce rectification blocks to rectify features extracted by a state-of-the-art recognition model, in both spatial and channel dimensions, to minimize the distance between a masked face and its mask-free counterpart in the rectified feature space.

Face Recognition

Learning Transparent Object Matting

1 code implementation 25 Jul 2019 Guan-Ying Chen, Kai Han, Kwan-Yee K. Wong

In this paper, we formulate transparent object matting as a refractive flow estimation problem, and propose a deep learning framework, called TOM-Net, for learning the refractive flow.

Image Matting Object +1

PLACE: Adaptive Layout-Semantic Fusion for Semantic Image Synthesis

1 code implementation 4 Mar 2024 Zhengyao Lv, Yuxiang Wei, WangMeng Zuo, Kwan-Yee K. Wong

Extensive experiments demonstrate that our approach performs favorably in terms of visual quality, semantic consistency, and layout alignment.

Image Generation

Fixed Viewpoint Mirror Surface Reconstruction under an Uncalibrated Camera

1 code implementation 23 Jan 2021 Kai Han, Miaomiao Liu, Dirk Schnieders, Kwan-Yee K. Wong

This paper addresses the problem of mirror surface reconstruction, and proposes a solution based on observing the reflections of a moving reference plane on the mirror surface.

Surface Reconstruction

CiPR: An Efficient Framework with Cross-instance Positive Relations for Generalized Category Discovery

1 code implementation 14 Apr 2023 Shaozhe Hao, Kai Han, Kwan-Yee K. Wong

GCD considers the open-world problem of automatically clustering a partially labelled dataset, in which the unlabelled data may contain instances from both novel categories and labelled classes.

Clustering Contrastive Learning +1

SAFE: Scale Aware Feature Encoder for Scene Text Recognition

no code implementations 17 Jan 2019 Wei Liu, Chaofeng Chen, Kwan-Yee K. Wong

We propose a novel scale aware feature encoder (SAFE) that is designed specifically for encoding characters with different scales.

Scene Text Recognition

A Fixed Viewpoint Approach for Dense Reconstruction of Transparent Objects

no code implementations CVPR 2015 Kai Han, Kwan-Yee K. Wong, Miaomiao Liu

In this paper, we develop a fixed viewpoint approach for dense surface reconstruction of transparent objects based on refraction of light.

Object Surface Reconstruction +1

Mirror Surface Reconstruction Under an Uncalibrated Camera

no code implementations CVPR 2016 Kai Han, Kwan-Yee K. Wong, Dirk Schnieders, Miaomiao Liu

Unlike previous approaches which require tedious work to calibrate the camera, our method can recover both the camera intrinsics and extrinsics together with the mirror surface from reflections of the reference plane under at least three unknown distinct poses.

Surface Reconstruction

Look Closer to Ground Better: Weakly-Supervised Temporal Grounding of Sentence in Video

no code implementations 25 Jan 2020 Zhenfang Chen, Lin Ma, Wenhan Luo, Peng Tang, Kwan-Yee K. Wong

In this paper, we study the problem of weakly-supervised temporal grounding of sentence in video.

Sentence

Char-Net: A Character-Aware Neural Network for Distorted Scene Text Recognition

no code implementations AAAI 2018 Wei Liu, Chaofeng Chen, Kwan-Yee K. Wong

Unlike previous work which employed a global spatial transformer network to rectify the entire distorted text image, we take an approach of detecting and rectifying individual characters.

Scene Text Recognition

What is Learned in Deep Uncalibrated Photometric Stereo?

no code implementations ECCV 2020 Guan-Ying Chen, Michael Waechter, Boxin Shi, Kwan-Yee K. Wong, Yasuyuki Matsushita

Based on this insight, we propose a guided calibration network, named GCNet, that explicitly leverages object shape and shading information for improved lighting estimation.

Lighting Estimation Surface Normal Estimation

Dense Reconstruction of Transparent Objects by Altering Incident Light Paths Through Refraction

no code implementations 20 May 2021 Kai Han, Kwan-Yee K. Wong, Miaomiao Liu

We present a simple setup that allows us to alter the incident light paths before light rays enter the object by immersing the object partially in a liquid, and develop a method for recovering the object surface through reconstructing and triangulating such incident light paths.

Object Surface Reconstruction +1

JIFF: Jointly-aligned Implicit Face Function for High Quality Single View Clothed Human Reconstruction

no code implementations CVPR 2022 Yukang Cao, GuanYing Chen, Kai Han, Wenqi Yang, Kwan-Yee K. Wong

In this paper, we focus on improving the quality of face in the reconstruction and propose a novel Jointly-aligned Implicit Face Function (JIFF) that combines the merits of the implicit function based approach and model based approach.

3D Human Reconstruction Face Model +1

PS-NeRF: Neural Inverse Rendering for Multi-view Photometric Stereo

no code implementations 23 Jul 2022 Wenqi Yang, GuanYing Chen, Chaofeng Chen, Zhenfang Chen, Kwan-Yee K. Wong

It then jointly optimizes the surface normals, spatially-varying BRDFs, and lights based on a shadow-aware differentiable rendering layer.

Inverse Rendering Neural Rendering

S$^3$-NeRF: Neural Reflectance Field from Shading and Shadow under a Single Viewpoint

no code implementations 17 Oct 2022 Wenqi Yang, GuanYing Chen, Chaofeng Chen, Zhenfang Chen, Kwan-Yee K. Wong

Different from existing single-view methods which can only recover a 2.5D scene representation (i.e., a normal / depth map for the visible surface), our method learns a neural reflectance field to represent the 3D geometry and BRDFs of a scene.

Novel View Synthesis

SeSDF: Self-evolved Signed Distance Field for Implicit 3D Clothed Human Reconstruction

no code implementations CVPR 2023 Yukang Cao, Kai Han, Kwan-Yee K. Wong

We propose a flexible framework which, by leveraging the parametric SMPL-X model, can take an arbitrary number of input images to reconstruct a clothed human model under an uncalibrated setting.

Semi-supervised Cycle-GAN for face photo-sketch translation in the wild

no code implementations 18 Jul 2023 Chaofeng Chen, Wei Liu, Xiao Tan, Kwan-Yee K. Wong

Experiments show that SCG achieves competitive performance on public benchmarks and superior results on photos in the wild.

Translation

RIGID: Recurrent GAN Inversion and Editing of Real Face Videos

no code implementations ICCV 2023 Yangyang Xu, Shengfeng He, Kwan-Yee K. Wong, Ping Luo

In this paper, we propose a unified recurrent framework, named Recurrent vIdeo GAN Inversion and eDiting (RIGID), to explicitly and simultaneously enforce temporally coherent GAN inversion and facial editing of real videos.

Attribute Facial Editing +1

Guide3D: Create 3D Avatars from Text and Image Guidance

no code implementations 18 Aug 2023 Yukang Cao, Yan-Pei Cao, Kai Han, Ying Shan, Kwan-Yee K. Wong

To this end, we introduce Guide3D, a zero-shot text-and-image-guided generative model for 3D avatar generation based on diffusion models.

Text to 3D Text-to-Image Generation

DiffusionMat: Alpha Matting as Sequential Refinement Learning

no code implementations 22 Nov 2023 Yangyang Xu, Shengfeng He, Wenqi Shao, Kwan-Yee K. Wong, Yu Qiao, Ping Luo

In this paper, we introduce DiffusionMat, a novel image matting framework that employs a diffusion model for the transition from coarse to refined alpha mattes.

Denoising Image Matting

MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation

no code implementations 14 Jan 2024 Jiaqi Chen, Bingqian Lin, Ran Xu, Zhenhua Chai, Xiaodan Liang, Kwan-Yee K. Wong

Embodied agents equipped with GPT as their brain have exhibited extraordinary decision-making and generalization abilities across various tasks.

Decision Making Vision and Language Navigation
