Search Results for author: Xin Yu

Found 58 papers, 21 papers with code

Pseudo-Label Guided Multi-Contrast Generalization for Non-Contrast Organ-Aware Segmentation

no code implementations 12 May 2022 Ho Hin Lee, Yucheng Tang, Riqiang Gao, Qi Yang, Xin Yu, Shunxing Bao, James G. Terry, J. Jeffrey Carr, Yuankai Huo, Bennett A. Landman

In this paper, we propose a novel unsupervised approach that leverages pairwise contrast-enhanced CT (CECT) context to compute non-contrast segmentation without ground-truth labels.

Video Demoireing with Relation-Based Temporal Consistency

no code implementations 6 Apr 2022 Peng Dai, Xin Yu, Lan Ma, Baoheng Zhang, Jia Li, Wenbo Li, Jiajun Shen, Xiaojuan Qi

Moire patterns, appearing as color distortions, severely degrade image and video quality when filming a screen with digital cameras.


Accurate 3-DoF Camera Geo-Localization via Ground-to-Satellite Image Matching

no code implementations 26 Mar 2022 Yujiao Shi, Xin Yu, Liu Liu, Dylan Campbell, Piotr Koniusz, Hongdong Li

We address the problem of ground-to-satellite image geo-localization, that is, estimating the camera latitude, longitude and orientation (azimuth angle) by matching a query image captured at the ground level against a large-scale database with geotagged satellite images.

Image Retrieval

The Combinatorial Brain Surgeon: Pruning Weights That Cancel One Another in Neural Networks

no code implementations 9 Mar 2022 Xin Yu, Thiago Serra, Srikumar Ramalingam, Shandian Zhe

We propose a tractable heuristic for solving the combinatorial extension of OBS, in which we select weights for simultaneous removal, as well as a systematic update of the remaining weights.

GaitStrip: Gait Recognition via Effective Strip-based Feature Representations and Multi-Level Framework

no code implementations 8 Mar 2022 Ming Wang, Beibei Lin, Xianda Guo, Lincheng Li, Zheng Zhu, Jiande Sun, Shunli Zhang, Xin Yu

ECM consists of the Spatial-Temporal feature extractor (ST), the Frame-Level feature extractor (FL) and SPB, and has two obvious advantages: First, each branch focuses on a specific representation, which can be used to improve the robustness of the network.

Frame, Gait Recognition

Gait Recognition with Mask-based Regularization

no code implementations 8 Mar 2022 Chuanfu Shen, Beibei Lin, Shunli Zhang, George Q. Huang, Shiqi Yu, Xin Yu

Also, we design an Inception-like ReverseMask Block, which has three branches composed of a global branch, a feature dropping branch, and a feature scaling branch.

Gait Recognition

Characterizing Renal Structures with 3D Block Aggregate Transformers

no code implementations 4 Mar 2022 Xin Yu, Yucheng Tang, Yinchi Zhou, Riqiang Gao, Qi Yang, Ho Hin Lee, Thomas Li, Shunxing Bao, Yuankai Huo, Zhoubing Xu, Thomas A. Lasko, Richard G. Abramson, Bennett A. Landman

Efficiently quantifying renal structures can provide distinct spatial context and facilitate biomarker discovery for kidney morphology.

Learning Implicit Body Representations from Double Diffusion Based Neural Radiance Fields

no code implementations 23 Dec 2021 Guangming Yao, Hongzhi Wu, Yi Yuan, Lincheng Li, Kun Zhou, Xin Yu

In this paper, we present a novel double diffusion based neural radiance field, dubbed DD-NeRF, to reconstruct human body geometry and render the human body appearance in novel views from a sparse set of images.

Novel View Synthesis

One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning

no code implementations 6 Dec 2021 Suzhen Wang, Lincheng Li, Yu Ding, Xin Yu

Hence, we propose a novel one-shot talking face generation framework by exploring consistent correlations between audio and visual motions from a specific speaker and then transferring audio-driven motion fields to a reference image.

Talking Face Generation

Joint 3D Human Shape Recovery and Pose Estimation from a Single Image with Bilayer Graph

1 code implementation 16 Oct 2021 Xin Yu, Jeroen van Baar, Siheng Chen

We use a coarse graph, derived from a dense graph, to estimate the human's 3D pose, and the dense graph to estimate the 3D shape.

Pose Estimation

RGB-D Saliency Detection via Cascaded Mutual Information Minimization

1 code implementation ICCV 2021 Jing Zhang, Deng-Ping Fan, Yuchao Dai, Xin Yu, Yiran Zhong, Nick Barnes, Ling Shao

In this paper, we introduce a novel multi-stage cascaded learning framework via mutual information minimization to "explicitly" model the multi-modal information between RGB image and depth data.

Saliency Detection

FDA: Feature Decomposition and Aggregation for Robust Airway Segmentation

no code implementations 7 Sep 2021 Minghui Zhang, Xin Yu, Hanxiao Zhang, Hao Zheng, Weihao Yu, Hong Pan, Xiangran Cai, Yun Gu

Compared to other state-of-the-art transfer learning methods, our method accurately segmented more bronchi in the noisy CT scans.

Transfer Learning

PR-RRN: Pairwise-Regularized Residual-Recursive Networks for Non-rigid Structure-from-Motion

no code implementations ICCV 2021 Haitian Zeng, Yuchao Dai, Xin Yu, Xiaohan Wang, Yi Yang

As NRSfM is a highly under-constrained problem, we propose two new pairwise regularizations to further regularize the reconstruction.

Pro-UIGAN: Progressive Face Hallucination from Occluded Thumbnails

no code implementations 2 Aug 2021 Yang Zhang, Xin Yu, Xiaobo Lu, Ping Liu

Specifically, we design a novel cross-modal transformer module for facial priors estimation, in which an input face and its landmark features are formulated as queries and keys, respectively.

Face Alignment, Face Hallucination, +2

Audio2Head: Audio-driven One-shot Talking-head Generation with Natural Head Motion

no code implementations 20 Jul 2021 Suzhen Wang, Lincheng Li, Yu Ding, Changjie Fan, Xin Yu

As this keypoint based representation models the motions of facial regions, head, and backgrounds integrally, our method can better constrain the spatial and temporal consistency of the generated videos.

Image Generation, Talking Head Generation

Removing Raindrops and Rain Streaks in One Go

1 code implementation CVPR 2021 Ruijie Quan, Xin Yu, Yuanzhi Liang, Yi Yang

First, we propose a complementary cascaded network architecture, namely CCN, to remove rain streaks and raindrops in a unified framework.

Neural Architecture Search, Rain Removal

VidFace: A Full-Transformer Solver for Video FaceHallucination with Unaligned Tiny Snapshots

1 code implementation 31 May 2021 Yuan Gan, Yawei Luo, Xin Yu, Bang Zhang, Yi Yang

In this paper, we investigate the task of hallucinating an authentic high-resolution (HR) human face from multiple low-resolution (LR) video snapshots.

Face Hallucination

VTNet: Visual Transformer Network for Object Goal Navigation

no code implementations ICLR 2021 Heming Du, Xin Yu, Liang Zheng

In this paper, we introduce a Visual Transformer Network (VTNet) for learning informative visual representation in navigation.

DSC-PoseNet: Learning 6DoF Object Pose Estimation via Dual-scale Consistency

no code implementations CVPR 2021 Zongxin Yang, Xin Yu, Yi Yang

In the first step, the framework learns to segment objects from real and synthetic data in a weakly-supervised fashion, and the segmentation masks will act as a prior for pose estimation.

Pose Estimation

Self-Supervised Visibility Learning for Novel View Synthesis

1 code implementation CVPR 2021 Yujiao Shi, Hongdong Li, Xin Yu

We then warp and aggregate source view pixels to synthesize a novel view based on the estimated source-view visibility and target-view depth.

Novel View Synthesis

Super-Resolving Cross-Domain Face Miniatures by Peeking at One-Shot Exemplar

no code implementations ICCV 2021 Peike Li, Xin Yu, Yi Yang

By iteratively updating the latent representations and our decoder, our DAP-FSR will be adapted to the target domain, thus achieving authentic and high-quality upsampled HR faces.


Geometry-Guided Street-View Panorama Synthesis from Satellite Imagery

1 code implementation 2 Mar 2021 Yujiao Shi, Dylan Campbell, Xin Yu, Hongdong Li

Specifically, we observe that when a 3D point in the real world is visible in both views, there is a deterministic mapping between the projected points in the two-view images given the height information of this 3D point.

Image Generation

Scaling Up Exact Neural Network Compression by ReLU Stability

1 code implementation NeurIPS 2021 Thiago Serra, Xin Yu, Abhinav Kumar, Srikumar Ramalingam

We can compress a rectifier network while exactly preserving its underlying functionality with respect to a given input domain if some of its neurons are stable.

Neural Network Compression
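
The stability condition mentioned in the entry above can be illustrated with a minimal sketch. It assumes a single first-layer ReLU neuron whose inputs lie in an axis-aligned box, and it uses plain interval arithmetic to bound the pre-activation; this is a simplification chosen for illustration and is not claimed to be the procedure used in the paper. The function names `preactivation_bounds` and `classify_neuron` are hypothetical.

```python
def preactivation_bounds(weights, bias, lower, upper):
    """Bound w.x + b over the box lower <= x <= upper (interval arithmetic)."""
    lo = hi = bias
    for w, l, u in zip(weights, lower, upper):
        if w >= 0:
            lo += w * l   # a non-negative weight is minimised at the lower bound
            hi += w * u
        else:
            lo += w * u   # a negative weight is minimised at the upper bound
            hi += w * l
    return lo, hi


def classify_neuron(weights, bias, lower, upper):
    """Label a ReLU neuron by the sign of its pre-activation over the box."""
    lo, hi = preactivation_bounds(weights, bias, lower, upper)
    if hi <= 0:
        return "stably inactive"  # output is always 0 on this domain
    if lo >= 0:
        return "stably active"    # ReLU acts as the identity on this domain
    return "unstable"             # both signs occur inside the box


box_lower, box_upper = [0, 0], [1, 1]
labels = [
    classify_neuron([1, -1], -3, box_lower, box_upper),
    classify_neuron([1, 1], 0.5, box_lower, box_upper),
    classify_neuron([1, -1], 0, box_lower, box_upper),
]
```

Under these assumptions, a stably inactive neuron contributes nothing on the given domain and a stably active one behaves linearly, which is what makes compression without changing the network's function plausible there.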

Iterative Optimisation with an Innovation CNN for Pose Refinement

no code implementations 22 Jan 2021 Gerard Kennedy, Zheyu Zhuang, Xin Yu, Robert Mahony

Object pose estimation from a single RGB image is a challenging problem due to variable lighting conditions and viewpoint changes.

Pose Estimation

PSTNet: Point Spatio-Temporal Convolution on Point Cloud Sequences

no code implementations ICLR 2021 Hehe Fan, Xin Yu, Yuhang Ding, Yi Yang, Mohan Kankanhalli

Then, a spatial convolution is employed to capture the local structure of points in the 3D space, and a temporal convolution is used to model the dynamics of the spatial regions along the time dimension.

3D Action Recognition, Semantic Segmentation

RFNet: Region-Aware Fusion Network for Incomplete Multi-Modal Brain Tumor Segmentation

no code implementations ICCV 2021 Yuhang Ding, Xin Yu, Yi Yang

In this work, we propose a Region-aware Fusion Network (RFNet) that is able to exploit different combinations of multi-modal data adaptively and effectively for tumor segmentation.

Brain Tumor Segmentation, Tumor Segmentation

Uncertainty-Aware Deep Calibrated Salient Object Detection

no code implementations 10 Dec 2020 Jing Zhang, Yuchao Dai, Xin Yu, Mehrtash Harandi, Nick Barnes, Richard Hartley

Existing deep neural network based salient object detection (SOD) methods mainly focus on pursuing high network accuracy.

Object Detection, Salient Object Detection

Gait Recognition via Effective Global-Local Feature Representation and Local Temporal Aggregation

no code implementations ICCV 2021 Beibei Lin, Shunli Zhang, Xin Yu

Towards this goal, we take advantage of both global visual information and local region details and develop a Global and Local Feature Extractor (GLFE).

Frame, Gait Recognition

Mapping of Sparse 3D Data using Alternating Projection

no code implementations 4 Oct 2020 Siddhant Ranade, Xin Yu, Shantnu Kakkar, Pedro Miraldo, Srikumar Ramalingam

We propose a novel technique to register sparse 3D scans in the absence of texture.

Learning Object Relation Graph and Tentative Policy for Visual Navigation

1 code implementation ECCV 2020 Heming Du, Xin Yu, Liang Zheng

Aiming to improve these two components, this paper proposes three complementary techniques, object relation graph (ORG), trial-driven imitation learning (IL), and a memory-augmented tentative policy network (TPN).

Imitation Learning, Representation Learning, +1

Where am I looking at? Joint Location and Orientation Estimation by Cross-View Matching

1 code implementation CVPR 2020 Yujiao Shi, Xin Yu, Dylan Campbell, Hongdong Li

Cross-view geo-localization is the problem of estimating the position and orientation (latitude, longitude and azimuth angle) of a camera at ground level given a large-scale database of geo-tagged aerial (e.g., satellite) images.

Transferring Cross-domain Knowledge for Video Sign Language Recognition

no code implementations CVPR 2020 Dongxu Li, Xin Yu, Chenchen Xu, Lars Petersson, Hongdong Li

To this end, we extract news signs using a base WSLR model, and then design a classifier jointly trained on news and isolated signs to coarsely align these two domain features.

Sign Language Recognition

Copy and Paste GAN: Face Hallucination from Shaded Thumbnails

no code implementations CVPR 2020 Yang Zhang, Ivor Tsang, Yawei Luo, Changhui Hu, Xiaobo Lu, Xin Yu

This paper proposes a Copy and Paste Generative Adversarial Network (CPGAN) to recover authentic high-resolution (HR) face images while compensating for low and non-uniform illumination.

Face Hallucination

6DoF Object Pose Estimation via Differentiable Proxy Voting Loss

no code implementations 10 Feb 2020 Xin Yu, Zheyu Zhuang, Piotr Koniusz, Hongdong Li

In this paper, we aim to reduce such errors by incorporating the distances between pixels and keypoints into our objective.

Pose Estimation

Face Hallucination with Finishing Touches

no code implementations 9 Feb 2020 Yang Zhang, Ivor W. Tsang, Jun Li, Ping Liu, Xiaobo Lu, Xin Yu

The coarse-level FHnet generates a frontal coarse HR face and then the fine-level FHnet makes use of the facial component appearance prior, i.e., fine-grained facial components, to attain a frontal HR face image with authentic details.

Face Hallucination, Face Recognition

Spatial-Aware Feature Aggregation for Image based Cross-View Geo-Localization

1 code implementation NeurIPS 2019 Yujiao Shi, Liu Liu, Xin Yu, Hongdong Li

The first step is to apply a regular polar transform to warp an aerial image such that its domain is closer to that of a ground-view panorama.
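
The polar warping step described in the entry above can be sketched as follows. This is a generic nearest-neighbour polar resampling written for illustration, not the paper's exact formulation: the function name `polar_warp`, the output size parameters `out_h` and `out_w`, and the convention that the top output row corresponds to the largest radius are all assumptions.

```python
import math


def polar_warp(aerial, out_h, out_w):
    """Resample a square image (list of rows) onto a polar grid.

    Output columns index the azimuth angle and output rows index the
    radius, with the top row farthest from the image centre (assumed
    convention). Sampling is nearest-neighbour.
    """
    size = len(aerial)
    centre = (size - 1) / 2.0
    warped = []
    for i in range(out_h):
        # Radius shrinks linearly from the image border to the centre.
        radius = centre * (out_h - 1 - i) / max(out_h - 1, 1)
        row = []
        for j in range(out_w):
            theta = 2.0 * math.pi * j / out_w
            y = centre - radius * math.cos(theta)
            x = centre + radius * math.sin(theta)
            # Clamp to the image so rounding never indexes out of range.
            yi = min(max(int(round(y)), 0), size - 1)
            xi = min(max(int(round(x)), 0), size - 1)
            row.append(aerial[yi][xi])
        warped.append(row)
    return warped


# Toy 5x5 "aerial image" whose pixel value encodes its position.
aerial = [[r * 5 + c for c in range(5)] for r in range(5)]
warped = polar_warp(aerial, out_h=3, out_w=4)
```

In this toy setup, each output column corresponds to one viewing direction from the image centre, which is the sense in which the warped aerial image is laid out more like a panorama.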

Word-level Deep Sign Language Recognition from Video: A New Large-scale Dataset and Methods Comparison

2 code implementations 24 Oct 2019 Dongxu Li, Cristian Rodriguez Opazo, Xin Yu, Hongdong Li

Based on this new large-scale dataset, we are able to experiment with several deep learning methods for word-level sign recognition and evaluate their performance in large-scale scenarios.

Action Classification, Hand Gesture Recognition, +2

Optimal Feature Transport for Cross-View Image Geo-Localization

1 code implementation 11 Jul 2019 Yujiao Shi, Xin Yu, Liu Liu, Tong Zhang, Hongdong Li

This paper proposes a novel Cross-View Feature Transport (CVFT) technique to explicitly establish cross-view domain transfer that facilitates feature alignment between ground and aerial images.

Image-Based Localization, Metric Learning

Can generalised relative pose estimation solve sparse 3D registration?

no code implementations 13 Jun 2019 Siddhant Ranade, Xin Yu, Shantnu Kakkar, Pedro Miraldo, Srikumar Ramalingam

In contrast to correspondence based methods, we take a different viewpoint and formulate the sparse 3D registration problem based on the constraints from the intersection of line segments from adjacent scans.

Pose Estimation

SOSNet: Second Order Similarity Regularization for Local Descriptor Learning

1 code implementation CVPR 2019 Yurun Tian, Xin Yu, Bin Fan, Fuchao Wu, Huub Heijnen, Vassileios Balntas

Despite the fact that Second Order Similarity (SOS) has been used with significant success in tasks such as graph matching and clustering, it has not been exploited for learning local descriptors.

Graph Matching

Identity-preserving Face Recovery from Stylized Portraits

no code implementations 7 Apr 2019 Fatemeh Shiri, Xin Yu, Fatih Porikli, Richard Hartley, Piotr Koniusz

We develop an Identity-preserving Face Recovery from Portraits (IFRP) method that utilizes a Style Removal network (SRN) and a Discriminative Network (DN).

Recovering Faces from Portraits with Auxiliary Facial Attributes

no code implementations 7 Apr 2019 Fatemeh Shiri, Xin Yu, Fatih Porikli, Richard Hartley, Piotr Koniusz

Our method can recover high-quality photorealistic faces from unaligned portraits while preserving the identity of the face images, and it can also reconstruct a photorealistic face image with a desired set of attributes.

High Frame Rate Video Reconstruction based on an Event Camera

1 code implementation 12 Mar 2019 Liyuan Pan, Richard Hartley, Cedric Scheerlinck, Miaomiao Liu, Xin Yu, Yuchao Dai

Based on the abundant event data alongside low-frame-rate, easily blurred images, we propose a simple yet effective approach to reconstruct high-quality, high-frame-rate sharp videos.

Frame, Video Generation, +1

Bringing a Blurry Frame Alive at High Frame-Rate with an Event Camera

1 code implementation CVPR 2019 Liyuan Pan, Cedric Scheerlinck, Xin Yu, Richard Hartley, Miaomiao Liu, Yuchao Dai

In this paper, we propose a simple and effective approach, the Event-based Double Integral (EDI) model, to reconstruct a high frame-rate, sharp video from a single blurry frame and its event data.

Frame, Video Generation

Face Super-resolution Guided by Facial Component Heatmaps

no code implementations ECCV 2018 Xin Yu, Basura Fernando, Bernard Ghanem, Fatih Porikli, Richard Hartley

State-of-the-art face super-resolution methods use deep convolutional neural networks to learn a mapping between low-resolution (LR) facial patterns and their corresponding high-resolution (HR) counterparts by exploring local information.

Face Hallucination, Super-Resolution

VLASE: Vehicle Localization by Aggregating Semantic Edges

1 code implementation 6 Jul 2018 Xin Yu, Sagar Chaturvedi, Chen Feng, Yuichi Taguchi, Teng-Yok Lee, Clinton Fernandes, Srikumar Ramalingam

In this paper, we propose VLASE, a framework to use semantic edge features from images to achieve on-road localization.

Image Retrieval

Super-Resolving Very Low-Resolution Face Images With Supplementary Attributes

no code implementations CVPR 2018 Xin Yu, Basura Fernando, Richard Hartley, Fatih Porikli

An LR input contains low-frequency facial components of its HR version while its residual face image defined as the difference between the HR ground-truth and interpolated LR images contains the missing high-frequency facial details.

Face Hallucination, Super-Resolution

Learning Strict Identity Mappings in Deep Residual Networks

1 code implementation CVPR 2018 Xin Yu, Zhiding Yu, Srikumar Ramalingam

A family of super deep networks, referred to as residual networks or ResNet, achieved record-beating performance in various visual tasks such as image recognition, object detection, and semantic segmentation.

Object Detection, Semantic Segmentation

Face Destylization

no code implementations 5 Feb 2018 Fatemeh Shiri, Xin Yu, Fatih Porikli, Piotr Koniusz

To enforce the destylized faces to be similar to authentic face images, we employ a discriminative network, which consists of convolutional and fully connected layers.

Style Transfer

Identity-preserving Face Recovery from Portraits

no code implementations 8 Jan 2018 Fatemeh Shiri, Xin Yu, Fatih Porikli, Richard Hartley, Piotr Koniusz

In this paper, we present a new Identity-preserving Face Recovery from Portraits (IFRP) to recover latent photorealistic faces from unaligned stylized portraits.
