Search Results for author: Xin Yu

Found 113 papers, 54 papers with code

Learning Efficient Unsupervised Satellite Image-based Building Damage Detection

1 code implementation 4 Dec 2023 Yiyun Zhang, Zijian Wang, Yadan Luo, Xin Yu, Zi Huang

Existing Building Damage Detection (BDD) methods always require labour-intensive pixel-level annotations of buildings and their conditions, hence largely limiting their applications.

Text-Guided 3D Face Synthesis -- From Generation to Editing

no code implementations 1 Dec 2023 Yunjie Wu, Yapeng Meng, Zhipeng Hu, Lincheng Li, Haoqian Wu, Kun Zhou, Weiwei Xu, Xin Yu

In the editing stage, we first employ a pre-trained diffusion model to update facial geometry or texture based on the texts.

Face Generation Texture Synthesis

Functional Bayesian Tucker Decomposition for Continuous-indexed Tensor Data

no code implementations 8 Nov 2023 Shikai Fang, Xin Yu, Zheng Wang, Shibo Li, Mike Kirby, Shandian Zhe

To generalize Tucker decomposition to such scenarios, we propose Functional Bayesian Tucker Decomposition (FunBaT).

Gaussian Processes

Text-to-3D with Classifier Score Distillation

no code implementations 30 Oct 2023 Xin Yu, Yuan-Chen Guo, Yangguang Li, Ding Liang, Song-Hai Zhang, Xiaojuan Qi

In this paper, we re-evaluate the role of classifier-free guidance in score distillation and discover a surprising finding: the guidance alone is enough for effective text-to-3D generation tasks.

Text to 3D Texture Synthesis

Towards Open World Active Learning for 3D Object Detection

1 code implementation 16 Oct 2023 Zhuoxiao Chen, Yadan Luo, Zixin Wang, Zijian Wang, Xin Yu, Zi Huang

To seek effective solutions, we investigate a more practical yet challenging research task: Open World Active Learning for 3D Object Detection (OWAL-3D), aiming at selecting a small number of 3D boxes to annotate while maximizing detection performance on both known and unknown classes.

3D Object Detection Active Learning +2

CBARF: Cascaded Bundle-Adjusting Neural Radiance Fields from Imperfect Camera Poses

no code implementations 15 Oct 2023 Hongyu Fu, Xin Yu, Lincheng Li, Li Zhang

Existing volumetric neural rendering techniques, such as Neural Radiance Fields (NeRF), face limitations in synthesizing high-quality novel views when the camera poses of input images are imperfect.

3D Reconstruction Neural Rendering +1

Divide and Ensemble: Progressively Learning for the Unknown

no code implementations 9 Oct 2023 Hu Zhang, Xin Shen, Heming Du, Huiqiang Chen, Chen Liu, Hongwei Sheng, Qingzheng Xu, MD Wahiduzzaman Khan, Qingtao Yu, Tianqing Zhu, Scott Chapman, Zi Huang, Xin Yu

In the wheat nutrient deficiencies classification challenge, we present the DividE and EnseMble (DEEM) method for progressive test data predictions.


DeformUX-Net: Exploring a 3D Foundation Backbone for Medical Image Segmentation with Depthwise Deformable Convolution

1 code implementation 30 Sep 2023 Ho Hin Lee, Quan Liu, Qi Yang, Xin Yu, Shunxing Bao, Yuankai Huo, Bennett A. Landman

We hypothesize that deformable convolution can be an exploratory alternative to combine all advantages from the previous operators, providing long-range dependency, adaptive spatial aggregation and computational efficiency as a foundation backbone.

Image Segmentation Medical Image Segmentation +1

Multi-Resolution Active Learning of Fourier Neural Operators

no code implementations 29 Sep 2023 Shibo Li, Xin Yu, Wei Xing, Mike Kirby, Akil Narayan, Shandian Zhe

Fourier Neural Operator (FNO) is a popular operator learning framework, which not only achieves the state-of-the-art performance in many tasks, but also is highly efficient in training and prediction.

Active Learning LEMMA +2

Deep conditional generative models for longitudinal single-slice abdominal computed tomography harmonization

1 code implementation 17 Sep 2023 Xin Yu, Qi Yang, Yucheng Tang, Riqiang Gao, Shunxing Bao, Leon Y. Cai, Ho Hin Lee, Yuankai Huo, Ann Zenobia Moore, Luigi Ferrucci, Bennett A. Landman

We further evaluate our method's capability to harmonize longitudinal positional variation on 1033 subjects from the Baltimore Longitudinal Study of Aging (BLSA) dataset, which contains longitudinal single abdominal slices, and confirmed that our method can harmonize the slice positional variance in terms of visceral fat area.

Computed Tomography (CT)

Enhancing Hierarchical Transformers for Whole Brain Segmentation with Intracranial Measurements Integration

1 code implementation 8 Sep 2023 Xin Yu, Yucheng Tang, Qi Yang, Ho Hin Lee, Shunxing Bao, Yuankai Huo, Bennett A. Landman

Subsequently, the model is finetuned with 45 T1w 3D volumes from Open Access Series Imaging Studies (OASIS) where both 133 whole brain classes and TICV/PFV labels are available.

Brain Segmentation Segmentation

When 3D Bounding-Box Meets SAM: Point Cloud Instance Segmentation with Weak-and-Noisy Supervision

no code implementations 2 Sep 2023 Qingtao Yu, Heming Du, Chen Liu, Xin Yu

CIP-WPIS leverages pretrained knowledge embedded in the 2D foundation model SAM and 3D geometric prior to achieve accurate point-wise instance labels from the bounding box annotations.

Instance Segmentation Semantic Segmentation

EfficientDreamer: High-Fidelity and Robust 3D Creation via Orthogonal-view Diffusion Prior

1 code implementation 25 Aug 2023 Minda Zhao, Chaoyi Zhao, Xinyue Liang, Lincheng Li, Zeng Zhao, Zhipeng Hu, Changjie Fan, Xin Yu

Specifically, we introduce a novel 2D diffusion model that generates an image consisting of four orthogonal-view sub-images for the given text prompt.

Text to 3D

BAVS: Bootstrapping Audio-Visual Segmentation by Integrating Foundation Knowledge

no code implementations 20 Aug 2023 Chen Liu, Peike Li, Hu Zhang, Lincheng Li, Zi Huang, Dadong Wang, Xin Yu

In a nutshell, our BAVS is designed to eliminate the interference of background noise or off-screen sounds in segmentation by establishing the audio-visual correspondences in an explicit manner.

Audio Classification Segmentation

Audio-Visual Segmentation by Exploring Cross-Modal Mutual Semantics

no code implementations 31 Jul 2023 Chen Liu, Peike Li, Xingqun Qi, Hu Zhang, Lincheng Li, Dadong Wang, Xin Yu

However, we observed that prior arts are prone to segment a certain salient object in a video regardless of the audio information.

Segmentation Semantic Segmentation

Boosting Model Inversion Attacks with Adversarial Examples

no code implementations 24 Jun 2023 Shuai Zhou, Tianqing Zhu, Dayong Ye, Xin Yu, Wanlei Zhou

Hence, in this paper, we propose a new training paradigm for a learning-based model inversion attack that can achieve higher attack accuracy in a black-box setting.

Multi-Contrast Computed Tomography Atlas of Healthy Pancreas

no code implementations 2 Jun 2023 Yinchi Zhou, Ho Hin Lee, Yucheng Tang, Xin Yu, Qi Yang, Shunxing Bao, Jeffrey M. Spraggins, Yuankai Huo, Bennett A. Landman

Briefly, DEEDs affine and non-rigid registration are performed to transfer patient abdominal volumes to a fixed high-resolution atlas template.

Anatomy Computed Tomography (CT)

EmotionGesture: Audio-Driven Diverse Emotional Co-Speech 3D Gesture Generation

no code implementations 30 May 2023 Xingqun Qi, Chen Liu, Lincheng Li, Jie Hou, Haoran Xin, Xin Yu

In this work, we propose EmotionGesture, a novel framework for synthesizing vivid and diverse emotional co-speech 3D gestures from audio.

Gesture Generation

TalkCLIP: Talking Head Generation with Text-Guided Expressive Speaking Styles

no code implementations 1 Apr 2023 Yifeng Ma, Suzhen Wang, Yu Ding, Bowen Ma, Tangjie Lv, Changjie Fan, Zhipeng Hu, Zhidong Deng, Xin Yu

In this work, we propose an expression-controllable one-shot talking head method, dubbed TalkCLIP, where the expression in a speech is specified by the natural language.

2D Semantic Segmentation task 3 (25 classes) Talking Head Generation

Low-frequency Image Deep Steganography: Manipulate the Frequency Distribution to Hide Secrets with Tenacious Robustness

no code implementations 23 Mar 2023 Huajie Chen, Tianqing Zhu, Yuan Zhao, Bo Liu, Xin Yu, Wanlei Zhou

By avoiding high-frequency artifacts and manipulating the frequency distribution of the embedded feature map, LIDS achieves improved robustness against attacks that distort the high-frequency components of container images.

Retrieval Specificity

Diverse 3D Hand Gesture Prediction from Body Dynamics by Bilateral Hand Disentanglement

1 code implementation CVPR 2023 Xingqun Qi, Chen Liu, Muyi Sun, Lincheng Li, Changjie Fan, Xin Yu

Considering the asymmetric gestures and motions of two hands, we introduce a Spatial-Residual Memory (SRM) module to model spatial interaction between the body and each hand by residual learning.


Attacking Cooperative Multi-Agent Reinforcement Learning by Adversarial Minority Influence

1 code implementation 7 Feb 2023 Simin Li, Jun Guo, Jingqiao Xiu, Pu Feng, Xin Yu, Aishan Liu, Wenjun Wu, Xianglong Liu

To achieve maximum deviation in victim policies under complex agent-wise interactions, our unilateral attack aims to characterize and maximize the impact of the adversary on the victims.

Continuous Control reinforcement-learning +4

Exploring Active 3D Object Detection from a Generalization Perspective

1 code implementation 23 Jan 2023 Yadan Luo, Zhuoxiao Chen, Zijian Wang, Xin Yu, Zi Huang, Mahsa Baktashmotlagh

To alleviate the high annotation cost in LiDAR-based 3D object detection, active learning is a promising solution that learns to select only a small portion of unlabeled data to annotate, without compromising model performance.

3D Object Detection Active Learning +2

Getting Away with More Network Pruning: From Sparsity to Geometry and Linear Regions

no code implementations 19 Jan 2023 Junyang Cai, Khai-Nguyen Nguyen, Nishant Shrestha, Aidan Good, Ruisen Tu, Xin Yu, Shandian Zhe, Thiago Serra

One surprising trait of neural networks is the extent to which their connections can be pruned with little to no effect on accuracy.

Network Pruning

StyleTalk: One-shot Talking Head Generation with Controllable Speaking Styles

1 code implementation 3 Jan 2023 Yifeng Ma, Suzhen Wang, Zhipeng Hu, Changjie Fan, Tangjie Lv, Yu Ding, Zhidong Deng, Xin Yu

In a nutshell, we aim to attain a speaking style from an arbitrary reference speaking video and then drive the one-shot portrait to speak with the reference speaking style and another piece of audio.

Talking Face Generation Talking Head Generation

Object-Goal Visual Navigation via Effective Exploration of Relations Among Historical Navigation States

no code implementations CVPR 2023 Heming Du, Lincheng Li, Zi Huang, Xin Yu

In HiNL, we propose a History-aware State Estimation (HaSE) module to alleviate the impacts of dominant historical states on the current state estimation.

valid Visual Navigation

Image Inpainting via Iteratively Decoupled Probabilistic Modeling

2 code implementations 6 Dec 2022 Wenbo Li, Xin Yu, Kun Zhou, Yibing Song, Zhe Lin, Jiaya Jia

To achieve high-quality results with low computational cost, we present a novel pixel spread model (PSM) that iteratively employs decoupled probabilistic modeling, combining the optimization efficiency of GANs with the prediction tractability of probabilistic models.

Denoising Image Inpainting

FlowFace: Semantic Flow-guided Shape-aware Face Swapping

no code implementations 6 Dec 2022 Hao Zeng, Wei Zhang, Changjie Fan, Tangjie Lv, Suzhen Wang, Zhimeng Zhang, Bowen Ma, Lincheng Li, Yu Ding, Xin Yu

Unlike most previous methods that focus on transferring the source inner facial features but neglect facial contours, our FlowFace can transfer both of them to a target face, thus leading to more realistic face swapping.

Face Swapping

Single Slice Thigh CT Muscle Group Segmentation with Domain Adaptation and Self-Training

1 code implementation 30 Nov 2022 Qi Yang, Xin Yu, Ho Hin Lee, Leon Y. Cai, Kaiwen Xu, Shunxing Bao, Yuankai Huo, Ann Zenobia Moore, Sokratis Makrogiannis, Luigi Ferrucci, Bennett A. Landman

The proposed pipeline is effective and robust in extracting muscle groups on 2D single slice CT thigh images. The container is available for public use at https://github.com/MASILab/DA_CT_muscle_seg

Anatomy Computed Tomography (CT) +1

Uncertainty-aware Gait Recognition via Learning from Dirichlet Distribution-based Evidence

no code implementations 15 Nov 2022 Beibei Lin, Chen Liu, Ming Wang, Lincheng Li, Shunli Zhang, Robby T. Tan, Xin Yu

Existing gait recognition frameworks retrieve an identity in the gallery based on the distance between a probe sample and the identities in the gallery.

Gait Recognition Retrieval

Facial Action Units Detection Aided by Global-Local Expression Embedding

no code implementations 25 Oct 2022 Zhipeng Hu, Wei Zhang, Lincheng Li, Yu Ding, Wei Chen, Zhigang Deng, Xin Yu

We find that AUs and facial expressions are highly associated, and existing facial expression datasets often contain a large number of identities.

3D Face Reconstruction

Batch Multi-Fidelity Active Learning with Budget Constraints

no code implementations 23 Oct 2022 Shibo Li, Jeff M. Phillips, Xin Yu, Robert M. Kirby, Shandian Zhe

However, this method only queries at one pair of fidelity and input at a time, and hence has a risk to bring in strongly correlated examples to reduce the learning efficiency.

Active Learning

Adaptive Contrastive Learning with Dynamic Correlation for Multi-Phase Organ Segmentation

1 code implementation 16 Oct 2022 Ho Hin Lee, Yucheng Tang, Han Liu, Yubo Fan, Leon Y. Cai, Qi Yang, Xin Yu, Shunxing Bao, Yuankai Huo, Bennett A. Landman

We evaluate our proposed approach on multi-organ segmentation with both non-contrast CT (NCCT) datasets and the MICCAI 2015 BTCV Challenge contrast-enhanced CT (CECT) datasets.

Computed Tomography (CT) Contrastive Learning +1

Is synthetic data from generative models ready for image recognition?

1 code implementation 14 Oct 2022 Ruifei He, Shuyang Sun, Xin Yu, Chuhui Xue, Wenqing Zhang, Philip Torr, Song Bai, Xiaojuan Qi

Recent text-to-image generation models have shown promising results in generating high-fidelity photo-realistic images.

Transfer Learning

Deep Idempotent Network for Efficient Single Image Blind Deblurring

no code implementations 13 Oct 2022 Yuxin Mao, Zhexiong Wan, Yuchao Dai, Xin Yu

Single image blind deblurring is highly ill-posed as neither the latent sharp image nor the blur kernel is known.

Single-Image Blind Deblurring

Meta Knowledge Condensation for Federated Learning

1 code implementation 29 Sep 2022 Ping Liu, Xin Yu, Joey Tianyi Zhou

In this work, we first introduce a meta knowledge representation method that extracts meta knowledge from distributed clients.

Federated Learning

Reducing Positional Variance in Cross-sectional Abdominal CT Slices with Deep Conditional Generative Models

1 code implementation 28 Sep 2022 Xin Yu, Qi Yang, Yucheng Tang, Riqiang Gao, Shunxing Bao, Leon Y. Cai, Ho Hin Lee, Yuankai Huo, Ann Zenobia Moore, Luigi Ferrucci, Bennett A. Landman

External experiments on 20 subjects from the Baltimore Longitudinal Study of Aging (BLSA) dataset that contains longitudinal single abdominal slices validate that our method can harmonize the slice positional variance in terms of muscle and visceral fat area.

Computed Tomography (CT)

UNesT: Local Spatial Representation Learning with Hierarchical Transformer for Efficient Medical Segmentation

1 code implementation 28 Sep 2022 Xin Yu, Qi Yang, Yinchi Zhou, Leon Y. Cai, Riqiang Gao, Ho Hin Lee, Thomas Li, Shunxing Bao, Zhoubing Xu, Thomas A. Lasko, Richard G. Abramson, Zizhao Zhang, Yuankai Huo, Bennett A. Landman, Yucheng Tang

Transformer-based models, capable of learning better global dependencies, have recently demonstrated exceptional representation learning capabilities in computer vision and medical image analysis.

Brain Segmentation Image Segmentation +3

Longitudinal Variability Analysis on Low-dose Abdominal CT with Deep Learning-based Segmentation

no code implementations 28 Sep 2022 Xin Yu, Yucheng Tang, Qi Yang, Ho Hin Lee, Riqiang Gao, Shunxing Bao, Ann Zenobia Moore, Luigi Ferrucci, Bennett A. Landman

Metabolic health is increasingly implicated as a risk factor across conditions from cardiology to neurology, and efficiency assessment of body composition is critical to quantitatively characterizing these relationships.

Computed Tomography (CT) Segmentation

CVLNet: Cross-View Semantic Correspondence Learning for Video-based Camera Localization

no code implementations 7 Aug 2022 Yujiao Shi, Xin Yu, Shan Wang, Hongdong Li

The critical challenge of this task is to learn a powerful global feature descriptor for the sequential ground-view images while considering its domain alignment with reference satellite images.

Camera Localization Image-Based Localization +1

Instance As Identity: A Generic Online Paradigm for Video Instance Segmentation

1 code implementation 5 Aug 2022 Feng Zhu, Zongxin Yang, Xin Yu, Yi Yang, Yunchao Wei

In this work, we propose a new online VIS paradigm named Instance As Identity (IAI), which models temporal information for both detection and tracking in an efficient way.

Instance Segmentation Semantic Segmentation +1

GaitGL: Learning Discriminative Global-Local Feature Representations for Gait Recognition

2 code implementations 2 Aug 2022 Beibei Lin, Shunli Zhang, Ming Wang, Lincheng Li, Xin Yu

GFR extractor aims to extract contextual information, e.g., the relationship among various body parts, and the mask-based LFR extractor is presented to exploit the detailed posture changes of local regions.

Gait Recognition

Towards Efficient and Scale-Robust Ultra-High-Definition Image Demoireing

1 code implementation 20 Jul 2022 Xin Yu, Peng Dai, Wenbo Li, Lan Ma, Jiajun Shen, Jia Li, Xiaojuan Qi

With the rapid development of mobile devices, modern widely-used mobile phones typically allow users to capture 4K resolution (i.e., ultra-high-definition) images.

Image Enhancement Image Restoration +1

MHR-Net: Multiple-Hypothesis Reconstruction of Non-Rigid Shapes from 2D Views

1 code implementation 19 Jul 2022 Haitian Zeng, Xin Yu, Jiaxu Miao, Yi Yang

We propose MHR-Net, a novel method for recovering Non-Rigid Shapes from Motion (NRSfM).

"Understanding Robustness Lottery": A Geometric Visual Comparative Analysis of Neural Network Pruning Approaches

no code implementations 16 Jun 2022 Zhimin Li, Shusen Liu, Xin Yu, Kailkhura Bhavya, Jie Cao, Diffenderfer James Daniel, Peer-Timo Bremer, Valerio Pascucci

We decomposed and evaluated a set of critical geometric concepts from the common adopted classification loss, and used them to design a visualization system to compare and highlight the impact of pruning on model performance and feature representation.

Network Pruning

Recall Distortion in Neural Network Pruning and the Undecayed Pruning Algorithm

no code implementations 7 Jun 2022 Aidan Good, Jiaqi Lin, Hannah Sieg, Mikey Ferguson, Xin Yu, Shandian Zhe, Jerzy Wieczorek, Thiago Serra

In this work, we study such relative distortions in recall by hypothesizing an intensification effect that is inherent to the model.

Network Pruning

PSTNet: Point Spatio-Temporal Convolution on Point Cloud Sequences

1 code implementation ICLR 2021 Hehe Fan, Xin Yu, Yuhang Ding, Yi Yang, Mohan Kankanhalli

Then, a spatial convolution is employed to capture the local structure of points in the 3D space, and a temporal convolution is used to model the dynamics of the spatial regions along the time dimension.

3D Action Recognition Semantic Segmentation

Pseudo-Label Guided Multi-Contrast Generalization for Non-Contrast Organ-Aware Segmentation

no code implementations 12 May 2022 Ho Hin Lee, Yucheng Tang, Riqiang Gao, Qi Yang, Xin Yu, Shunxing Bao, James G. Terry, J. Jeffrey Carr, Yuankai Huo, Bennett A. Landman

In this paper, we propose a novel unsupervised approach that leverages pairwise contrast-enhanced CT (CECT) context to compute non-contrast segmentation without ground-truth label.

Organ Segmentation Pseudo Label +1

Video Demoireing with Relation-Based Temporal Consistency

1 code implementation CVPR 2022 Peng Dai, Xin Yu, Lan Ma, Baoheng Zhang, Jia Li, Wenbo Li, Jiajun Shen, Xiaojuan Qi

Moire patterns, appearing as color distortions, severely degrade image and video qualities when filming a screen with digital cameras.

Accurate 3-DoF Camera Geo-Localization via Ground-to-Satellite Image Matching

1 code implementation 26 Mar 2022 Yujiao Shi, Xin Yu, Liu Liu, Dylan Campbell, Piotr Koniusz, Hongdong Li

We address the problem of ground-to-satellite image geo-localization, that is, estimating the camera latitude, longitude and orientation (azimuth angle) by matching a query image captured at the ground level against a large-scale database with geotagged satellite images.

Image Retrieval Retrieval

The Combinatorial Brain Surgeon: Pruning Weights That Cancel One Another in Neural Networks

1 code implementation 9 Mar 2022 Xin Yu, Thiago Serra, Srikumar Ramalingam, Shandian Zhe

We propose a tractable heuristic for solving the combinatorial extension of OBS, in which we select weights for simultaneous removal, as well as a systematic update of the remaining weights.

Gait Recognition with Mask-based Regularization

no code implementations 8 Mar 2022 Chuanfu Shen, Beibei Lin, Shunli Zhang, George Q. Huang, Shiqi Yu, Xin Yu

Also, we design an Inception-like ReverseMask Block, which has three branches composed of a global branch, a feature dropping branch, and a feature scaling branch.

Multiview Gait Recognition

GaitStrip: Gait Recognition via Effective Strip-based Feature Representations and Multi-Level Framework

1 code implementation 8 Mar 2022 Ming Wang, Beibei Lin, Xianda Guo, Lincheng Li, Zheng Zhu, Jiande Sun, Shunli Zhang, Xin Yu

ECM consists of the Spatial-Temporal feature extractor (ST), the Frame-Level feature extractor (FL) and SPB, and has two obvious advantages: First, each branch focuses on a specific representation, which can be used to improve the robustness of the network.

Gait Recognition

Characterizing Renal Structures with 3D Block Aggregate Transformers

no code implementations 4 Mar 2022 Xin Yu, Yucheng Tang, Yinchi Zhou, Riqiang Gao, Qi Yang, Ho Hin Lee, Thomas Li, Shunxing Bao, Yuankai Huo, Zhoubing Xu, Thomas A. Lasko, Richard G. Abramson, Bennett A. Landman

Efficiently quantifying renal structures can provide distinct spatial context and facilitate biomarker discovery for kidney morphology.

Learning Implicit Body Representations from Double Diffusion Based Neural Radiance Fields

no code implementations 23 Dec 2021 Guangming Yao, Hongzhi Wu, Yi Yuan, Lincheng Li, Kun Zhou, Xin Yu

In this paper, we present a novel double diffusion based neural radiance field, dubbed DD-NeRF, to reconstruct human body geometry and render the human body appearance in novel views from a sparse set of images.

Novel View Synthesis

One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning

no code implementations 6 Dec 2021 Suzhen Wang, Lincheng Li, Yu Ding, Xin Yu

Hence, we propose a novel one-shot talking face generation framework by exploring consistent correlations between audio and visual motions from a specific speaker and then transferring audio-driven motion fields to a reference image.

Talking Face Generation

Joint 3D Human Shape Recovery and Pose Estimation from a Single Image with Bilayer Graph

1 code implementation 16 Oct 2021 Xin Yu, Jeroen van Baar, Siheng Chen

We use a coarse graph, derived from a dense graph, to estimate the human's 3D pose, and the dense graph to estimate the 3D shape.

3D Human Pose Estimation

RGB-D Saliency Detection via Cascaded Mutual Information Minimization

1 code implementation ICCV 2021 Jing Zhang, Deng-Ping Fan, Yuchao Dai, Xin Yu, Yiran Zhong, Nick Barnes, Ling Shao

In this paper, we introduce a novel multi-stage cascaded learning framework via mutual information minimization to "explicitly" model the multi-modal information between RGB image and depth data.

Saliency Detection Thermal Image Segmentation

FDA: Feature Decomposition and Aggregation for Robust Airway Segmentation

no code implementations 7 Sep 2021 Minghui Zhang, Xin Yu, Hanxiao Zhang, Hao Zheng, Weihao Yu, Hong Pan, Xiangran Cai, Yun Gu

Compared to other state-of-the-art transfer learning methods, our method accurately segmented more bronchi in the noisy CT scans.

Transfer Learning

PR-RRN: Pairwise-Regularized Residual-Recursive Networks for Non-rigid Structure-from-Motion

no code implementations ICCV 2021 Haitian Zeng, Yuchao Dai, Xin Yu, Xiaohan Wang, Yi Yang

As NRSfM is a highly under-constrained problem, we propose two new pairwise regularizations to further regularize the reconstruction.

Pro-UIGAN: Progressive Face Hallucination from Occluded Thumbnails

no code implementations 2 Aug 2021 Yang Zhang, Xin Yu, Xiaobo Lu, Ping Liu

Specifically, we design a novel cross-modal transformer module for facial priors estimation, in which an input face and its landmark features are formulated as queries and keys, respectively.

Face Alignment Face Hallucination +2

Audio2Head: Audio-driven One-shot Talking-head Generation with Natural Head Motion

1 code implementation 20 Jul 2021 Suzhen Wang, Lincheng Li, Yu Ding, Changjie Fan, Xin Yu

As this keypoint based representation models the motions of facial regions, head, and backgrounds integrally, our method can better constrain the spatial and temporal consistency of the generated videos.

Image Generation Talking Head Generation

Removing Raindrops and Rain Streaks in One Go

1 code implementation CVPR 2021 Ruijie Quan, Xin Yu, Yuanzhi Liang, Yi Yang

First, we propose a complementary cascaded network architecture, namely CCN, to remove rain streaks and raindrops in a unified framework.

Neural Architecture Search Rain Removal

VidFace: A Full-Transformer Solver for Video FaceHallucination with Unaligned Tiny Snapshots

1 code implementation 31 May 2021 Yuan Gan, Yawei Luo, Xin Yu, Bang Zhang, Yi Yang

In this paper, we investigate the task of hallucinating an authentic high-resolution (HR) human face from multiple low-resolution (LR) video snapshots.

Face Hallucination

VTNet: Visual Transformer Network for Object Goal Navigation

no code implementations ICLR 2021 Heming Du, Xin Yu, Liang Zheng

In this paper, we introduce a Visual Transformer Network (VTNet) for learning informative visual representation in navigation.

DSC-PoseNet: Learning 6DoF Object Pose Estimation via Dual-scale Consistency

no code implementations CVPR 2021 Zongxin Yang, Xin Yu, Yi Yang

In the first step, the framework learns to segment objects from real and synthetic data in a weakly-supervised fashion, and the segmentation masks will act as a prior for pose estimation.

Pose Estimation

Self-Supervised Visibility Learning for Novel View Synthesis

1 code implementation CVPR 2021 Yujiao Shi, Hongdong Li, Xin Yu

We then warp and aggregate source view pixels to synthesize a novel view based on the estimated source-view visibility and target-view depth.

Novel View Synthesis

Super-Resolving Cross-Domain Face Miniatures by Peeking at One-Shot Exemplar

no code implementations ICCV 2021 Peike Li, Xin Yu, Yi Yang

By iteratively updating the latent representations and our decoder, our DAP-FSR will be adapted to the target domain, thus achieving authentic and high-quality upsampled HR faces.


Geometry-Guided Street-View Panorama Synthesis from Satellite Imagery

1 code implementation 2 Mar 2021 Yujiao Shi, Dylan Campbell, Xin Yu, Hongdong Li

Specifically, we observe that when a 3D point in the real world is visible in both views, there is a deterministic mapping between the projected points in the two-view images given the height information of this 3D point.

Image Generation

Scaling Up Exact Neural Network Compression by ReLU Stability

1 code implementation NeurIPS 2021 Thiago Serra, Xin Yu, Abhinav Kumar, Srikumar Ramalingam

We can compress a rectifier network while exactly preserving its underlying functionality with respect to a given input domain if some of its neurons are stable.

Neural Network Compression

Iterative Optimisation with an Innovation CNN for Pose Refinement

no code implementations 22 Jan 2021 Gerard Kennedy, Zheyu Zhuang, Xin Yu, Robert Mahony

Object pose estimation from a single RGB image is a challenging problem due to variable lighting conditions and viewpoint changes.

Pose Estimation

RFNet: Region-Aware Fusion Network for Incomplete Multi-Modal Brain Tumor Segmentation

1 code implementation ICCV 2021 Yuhang Ding, Xin Yu, Yi Yang

In this work, we propose a Region-aware Fusion Network (RFNet) that is able to exploit different combinations of multi-modal data adaptively and effectively for tumor segmentation.

Brain Tumor Segmentation Segmentation +1

Uncertainty-Aware Deep Calibrated Salient Object Detection

no code implementations 10 Dec 2020 Jing Zhang, Yuchao Dai, Xin Yu, Mehrtash Harandi, Nick Barnes, Richard Hartley

Existing deep neural network based salient object detection (SOD) methods mainly focus on pursuing high network accuracy.

object-detection Object Detection +1

Gait Recognition via Effective Global-Local Feature Representation and Local Temporal Aggregation

no code implementations ICCV 2021 Beibei Lin, Shunli Zhang, Xin Yu

Towards this goal, we take advantage of both global visual information and local region details and develop a Global and Local Feature Extractor (GLFE).

Gait Recognition

Mapping of Sparse 3D Data using Alternating Projection

no code implementations 4 Oct 2020 Siddhant Ranade, Xin Yu, Shantnu Kakkar, Pedro Miraldo, Srikumar Ramalingam

We propose a novel technique to register sparse 3D scans in the absence of texture.

Learning Object Relation Graph and Tentative Policy for Visual Navigation

1 code implementation ECCV 2020 Heming Du, Xin Yu, Liang Zheng

Aiming to improve these two components, this paper proposes three complementary techniques, object relation graph (ORG), trial-driven imitation learning (IL), and a memory-augmented tentative policy network (TPN).

Imitation Learning Representation Learning +1

Where am I looking at? Joint Location and Orientation Estimation by Cross-View Matching

1 code implementation CVPR 2020 Yujiao Shi, Xin Yu, Dylan Campbell, Hongdong Li

Cross-view geo-localization is the problem of estimating the position and orientation (latitude, longitude and azimuth angle) of a camera at ground level given a large-scale database of geo-tagged aerial (e.g., satellite) images.

Transferring Cross-domain Knowledge for Video Sign Language Recognition

no code implementations CVPR 2020 Dongxu Li, Xin Yu, Chenchen Xu, Lars Petersson, Hongdong Li

To this end, we extract news signs using a base WSLR model, and then design a classifier jointly trained on news and isolated signs to coarsely align these two domain features.

Sign Language Recognition

Copy and Paste GAN: Face Hallucination from Shaded Thumbnails

no code implementations CVPR 2020 Yang Zhang, Ivor Tsang, Yawei Luo, Changhui Hu, Xiaobo Lu, Xin Yu

This paper proposes a Copy and Paste Generative Adversarial Network (CPGAN) to recover authentic high-resolution (HR) face images while compensating for low and non-uniform illumination.

Face Hallucination

6DoF Object Pose Estimation via Differentiable Proxy Voting Loss

no code implementations 10 Feb 2020 Xin Yu, Zheyu Zhuang, Piotr Koniusz, Hongdong Li

In this paper, we aim to reduce such errors by incorporating the distances between pixels and keypoints into our objective.

Pose Estimation

Face Hallucination with Finishing Touches

no code implementations 9 Feb 2020 Yang Zhang, Ivor W. Tsang, Jun Li, Ping Liu, Xiaobo Lu, Xin Yu

The coarse-level FHnet generates a frontal coarse HR face and then the fine-level FHnet makes use of the facial component appearance prior, i.e., fine-grained facial components, to attain a frontal HR face image with authentic details.

Face Hallucination Face Recognition

Spatial-Aware Feature Aggregation for Image based Cross-View Geo-Localization

1 code implementation NeurIPS 2019 Yujiao Shi, Liu Liu, Xin Yu, Hongdong Li

The first step is to apply a regular polar transform to warp an aerial image such that its domain is closer to that of a ground-view panorama.

Image-Based Localization

Word-level Deep Sign Language Recognition from Video: A New Large-scale Dataset and Methods Comparison

2 code implementations 24 Oct 2019 Dongxu Li, Cristian Rodriguez Opazo, Xin Yu, Hongdong Li

Based on this new large-scale dataset, we are able to experiment with several deep learning methods for word-level sign recognition and evaluate their performance in large-scale scenarios.

Action Classification Benchmarking +3

Optimal Feature Transport for Cross-View Image Geo-Localization

1 code implementation 11 Jul 2019 Yujiao Shi, Xin Yu, Liu Liu, Tong Zhang, Hongdong Li

This paper proposes a novel Cross-View Feature Transport (CVFT) technique to explicitly establish cross-view domain transfer that facilitates feature alignment between ground and aerial images.

Image-Based Localization Metric Learning

Can generalised relative pose estimation solve sparse 3D registration?

no code implementations 13 Jun 2019 Siddhant Ranade, Xin Yu, Shantnu Kakkar, Pedro Miraldo, Srikumar Ramalingam

In contrast to correspondence based methods, we take a different viewpoint and formulate the sparse 3D registration problem based on the constraints from the intersection of line segments from adjacent scans.

Pose Estimation

SOSNet: Second Order Similarity Regularization for Local Descriptor Learning

2 code implementations CVPR 2019 Yurun Tian, Xin Yu, Bin Fan, Fuchao Wu, Huub Heijnen, Vassileios Balntas

Despite the fact that Second Order Similarity (SOS) has been used with significant success in tasks such as graph matching and clustering, it has not been exploited for learning local descriptors.

Clustering Graph Matching +1

Identity-preserving Face Recovery from Stylized Portraits

no code implementations 7 Apr 2019 Fatemeh Shiri, Xin Yu, Fatih Porikli, Richard Hartley, Piotr Koniusz

We develop an Identity-preserving Face Recovery from Portraits (IFRP) method that utilizes a Style Removal network (SRN) and a Discriminative Network (DN).

Recovering Faces from Portraits with Auxiliary Facial Attributes

no code implementations 7 Apr 2019 Fatemeh Shiri, Xin Yu, Fatih Porikli, Richard Hartley, Piotr Koniusz

Our method can recover high-quality photorealistic faces from unaligned portraits while preserving the identity of the face images, and it can also reconstruct a photorealistic face image with a desired set of attributes.

High Frame Rate Video Reconstruction based on an Event Camera

1 code implementation 12 Mar 2019 Liyuan Pan, Richard Hartley, Cedric Scheerlinck, Miaomiao Liu, Xin Yu, Yuchao Dai

Based on the abundant event data alongside low frame-rate, easily blurred images, we propose a simple yet effective approach to reconstruct high-quality, high frame-rate sharp videos.

Video Generation Video Reconstruction +1

Bringing a Blurry Frame Alive at High Frame-Rate with an Event Camera

1 code implementation CVPR 2019 Liyuan Pan, Cedric Scheerlinck, Xin Yu, Richard Hartley, Miaomiao Liu, Yuchao Dai

In this paper, we propose a simple and effective approach, the Event-based Double Integral (EDI) model, to reconstruct a high frame-rate, sharp video from a single blurry frame and its event data.

Video Generation

Face Super-resolution Guided by Facial Component Heatmaps

no code implementations ECCV 2018 Xin Yu, Basura Fernando, Bernard Ghanem, Fatih Porikli, Richard Hartley

State-of-the-art face super-resolution methods use deep convolutional neural networks to learn a mapping between low-resolution (LR) facial patterns and their corresponding high-resolution (HR) counterparts by exploring local information.

Face Hallucination Super-Resolution

VLASE: Vehicle Localization by Aggregating Semantic Edges

1 code implementation 6 Jul 2018 Xin Yu, Sagar Chaturvedi, Chen Feng, Yuichi Taguchi, Teng-Yok Lee, Clinton Fernandes, Srikumar Ramalingam

In this paper, we propose VLASE, a framework to use semantic edge features from images to achieve on-road localization.

Image Retrieval Retrieval

Super-Resolving Very Low-Resolution Face Images With Supplementary Attributes

no code implementations CVPR 2018 Xin Yu, Basura Fernando, Richard Hartley, Fatih Porikli

An LR input contains low-frequency facial components of its HR version, while its residual face image, defined as the difference between the HR ground-truth and interpolated LR images, contains the missing high-frequency facial details.

Face Hallucination Super-Resolution

Learning Strict Identity Mappings in Deep Residual Networks

1 code implementation CVPR 2018 Xin Yu, Zhiding Yu, Srikumar Ramalingam

A family of super deep networks, referred to as residual networks or ResNet, achieved record-beating performance in various visual tasks such as image recognition, object detection, and semantic segmentation.

Object Detection +1

Face Destylization

no code implementations 5 Feb 2018 Fatemeh Shiri, Xin Yu, Fatih Porikli, Piotr Koniusz

To force the destylized faces to resemble authentic face images, we employ a discriminative network, which consists of convolutional and fully connected layers.

Style Transfer

Identity-preserving Face Recovery from Portraits

no code implementations 8 Jan 2018 Fatemeh Shiri, Xin Yu, Fatih Porikli, Richard Hartley, Piotr Koniusz

In this paper, we present a new Identity-preserving Face Recovery from Portraits (IFRP) to recover latent photorealistic faces from unaligned stylized portraits.
