1 code implementation • 4 Dec 2023 • Yiyun Zhang, Zijian Wang, Yadan Luo, Xin Yu, Zi Huang
Existing Building Damage Detection (BDD) methods always require labour-intensive pixel-level annotations of buildings and their conditions, hence largely limiting their applications.
no code implementations • 1 Dec 2023 • Yunjie Wu, Yapeng Meng, Zhipeng Hu, Lincheng Li, Haoqian Wu, Kun Zhou, Weiwei Xu, Xin Yu
In the editing stage, we first employ a pre-trained diffusion model to update facial geometry or texture based on the texts.
no code implementations • 8 Nov 2023 • Shikai Fang, Xin Yu, Zheng Wang, Shibo Li, Mike Kirby, Shandian Zhe
To generalize Tucker decomposition to such scenarios, we propose Functional Bayesian Tucker Decomposition (FunBaT).
no code implementations • 30 Oct 2023 • Xin Yu, Yuan-Chen Guo, Yangguang Li, Ding Liang, Song-Hai Zhang, Xiaojuan Qi
In this paper, we re-evaluate the role of classifier-free guidance in score distillation and discover a surprising finding: the guidance alone is enough for effective text-to-3D generation tasks.
1 code implementation • 16 Oct 2023 • Zhuoxiao Chen, Yadan Luo, Zixin Wang, Zijian Wang, Xin Yu, Zi Huang
To seek effective solutions, we investigate a more practical yet challenging research task: Open World Active Learning for 3D Object Detection (OWAL-3D), aiming at selecting a small number of 3D boxes to annotate while maximizing detection performance on both known and unknown classes.
no code implementations • 15 Oct 2023 • Hongyu Fu, Xin Yu, Lincheng Li, Li Zhang
Existing volumetric neural rendering techniques, such as Neural Radiance Fields (NeRF), face limitations in synthesizing high-quality novel views when the camera poses of input images are imperfect.
no code implementations • 9 Oct 2023 • Hu Zhang, Xin Shen, Heming Du, Huiqiang Chen, Chen Liu, Hongwei Sheng, Qingzheng Xu, MD Wahiduzzaman Khan, Qingtao Yu, Tianqing Zhu, Scott Chapman, Zi Huang, Xin Yu
In the wheat nutrient deficiencies classification challenge, we present the DividE and EnseMble (DEEM) method for progressive test data predictions.
1 code implementation • 30 Sep 2023 • Ho Hin Lee, Quan Liu, Qi Yang, Xin Yu, Shunxing Bao, Yuankai Huo, Bennett A. Landman
We hypothesize that deformable convolution can be an exploratory alternative to combine all advantages from the previous operators, providing long-range dependency, adaptive spatial aggregation and computational efficiency as a foundation backbone.
no code implementations • 29 Sep 2023 • Shibo Li, Xin Yu, Wei Xing, Mike Kirby, Akil Narayan, Shandian Zhe
Fourier Neural Operator (FNO) is a popular operator learning framework that not only achieves state-of-the-art performance in many tasks but is also highly efficient in training and prediction.
1 code implementation • 19 Sep 2023 • Aiyuan Yang, Bin Xiao, Bingning Wang, Borong Zhang, Ce Bian, Chao Yin, Chenxu Lv, Da Pan, Dian Wang, Dong Yan, Fan Yang, Fei Deng, Feng Wang, Feng Liu, Guangwei Ai, Guosheng Dong, Haizhou Zhao, Hang Xu, Haoze Sun, Hongda Zhang, Hui Liu, Jiaming Ji, Jian Xie, Juntao Dai, Kun Fang, Lei Su, Liang Song, Lifeng Liu, Liyun Ru, Luyao Ma, Mang Wang, Mickel Liu, MingAn Lin, Nuolan Nie, Peidong Guo, Ruiyang Sun, Tao Zhang, Tianpeng Li, Tianyu Li, Wei Cheng, WeiPeng Chen, Xiangrong Zeng, Xiaochuan Wang, Xiaoxi Chen, Xin Men, Xin Yu, Xuehai Pan, Yanjun Shen, Yiding Wang, Yiyu Li, Youxin Jiang, Yuchen Gao, Yupeng Zhang, Zenan Zhou, Zhiying Wu
Large language models (LLMs) have demonstrated remarkable performance on a variety of natural language tasks based on just a few examples of natural language instructions, reducing the need for extensive feature engineering.
1 code implementation • 17 Sep 2023 • Xin Yu, Qi Yang, Yucheng Tang, Riqiang Gao, Shunxing Bao, Leon Y. Cai, Ho Hin Lee, Yuankai Huo, Ann Zenobia Moore, Luigi Ferrucci, Bennett A. Landman
We further evaluate our method's capability to harmonize longitudinal positional variation on 1033 subjects from the Baltimore Longitudinal Study of Aging (BLSA) dataset, which contains longitudinal single abdominal slices, and confirm that our method can harmonize the slice positional variance in terms of visceral fat area.
1 code implementation • 8 Sep 2023 • Xin Yu, Yucheng Tang, Qi Yang, Ho Hin Lee, Shunxing Bao, Yuankai Huo, Bennett A. Landman
Subsequently, the model is finetuned with 45 T1w 3D volumes from the Open Access Series of Imaging Studies (OASIS), where both 133 whole brain classes and TICV/PFV labels are available.
no code implementations • 2 Sep 2023 • Qingtao Yu, Heming Du, Chen Liu, Xin Yu
CIP-WPIS leverages pretrained knowledge embedded in the 2D foundation model SAM and 3D geometric prior to achieve accurate point-wise instance labels from the bounding box annotations.
1 code implementation • 25 Aug 2023 • Minda Zhao, Chaoyi Zhao, Xinyue Liang, Lincheng Li, Zeng Zhao, Zhipeng Hu, Changjie Fan, Xin Yu
Specifically, we introduce a novel 2D diffusion model that generates an image consisting of four orthogonal-view sub-images for the given text prompt.
no code implementations • ICCV 2023 • Xin Yu, Peng Dai, Wenbo Li, Lan Ma, Zhengzhe Liu, Xiaojuan Qi
In this work, we focus on synthesizing high-quality textures on 3D meshes.
no code implementations • 20 Aug 2023 • Chen Liu, Peike Li, Hu Zhang, Lincheng Li, Zi Huang, Dadong Wang, Xin Yu
In a nutshell, our BAVS is designed to eliminate the interference of background noise or off-screen sounds in segmentation by establishing the audio-visual correspondences in an explicit manner.
no code implementations • 31 Jul 2023 • Chen Liu, Peike Li, Xingqun Qi, Hu Zhang, Lincheng Li, Dadong Wang, Xin Yu
However, we observed that prior methods are prone to segmenting a certain salient object in a video regardless of the audio information.
no code implementations • 30 Jul 2023 • Xin Yu, Rongye Shi, Pu Feng, Yongkai Tian, Jie Luo, Wenjun Wu
In addition, the proposed framework is model-agnostic and can be applied to most of the current MARL algorithms.
no code implementations • 24 Jun 2023 • Shuai Zhou, Tianqing Zhu, Dayong Ye, Xin Yu, Wanlei Zhou
Hence, in this paper, we propose a new training paradigm for a learning-based model inversion attack that can achieve higher attack accuracy in a black-box setting.
no code implementations • 2 Jun 2023 • Yinchi Zhou, Ho Hin Lee, Yucheng Tang, Xin Yu, Qi Yang, Shunxing Bao, Jeffrey M. Spraggins, Yuankai Huo, Bennett A. Landman
Briefly, DEEDs affine and non-rigid registration are performed to transfer patient abdominal volumes to a fixed high-resolution atlas template.
no code implementations • 30 May 2023 • Xingqun Qi, Chen Liu, Lincheng Li, Jie Hou, Haoran Xin, Xin Yu
In this work, we propose EmotionGesture, a novel framework for synthesizing vivid and diverse emotional co-speech 3D gestures from audio.
no code implementations • 10 May 2023 • Hongwei Sheng, Xin Yu, Feiyu Wang, MD Wahiduzzaman Khan, Hexuan Weng, Sahar Shariflou, S. Mojtaba Golzan
Both of the evaluations support its effectiveness in facilitating the observation of SVPs.
1 code implementation • CVPR 2023 • Peng Dai, Yinda Zhang, Xin Yu, Xiaoyang Lyu, Xiaojuan Qi
Rendering novel view images is highly desirable for many applications.
no code implementations • 1 Apr 2023 • Yifeng Ma, Suzhen Wang, Yu Ding, Bowen Ma, Tangjie Lv, Changjie Fan, Zhipeng Hu, Zhidong Deng, Xin Yu
In this work, we propose an expression-controllable one-shot talking head method, dubbed TalkCLIP, where the expression in a speech is specified by natural language.
2D Semantic Segmentation task 3 (25 classes)
Talking Head Generation
1 code implementation • CVPR 2023 • Haoqian Wu, Zhipeng Hu, Lincheng Li, Yongqiang Zhang, Changjie Fan, Xin Yu
Inverse rendering methods aim to estimate geometry, materials and illumination from multi-view RGB images.
no code implementations • ICCV 2023 • Ming Wang, Xianda Guo, Beibei Lin, Tian Yang, Zheng Zhu, Lincheng Li, Shunli Zhang, Xin Yu
This is the first framework on gait recognition that is designed to focus on the extraction of dynamic features.
no code implementations • 23 Mar 2023 • Huajie Chen, Tianqing Zhu, Yuan Zhao, Bo Liu, Xin Yu, Wanlei Zhou
By avoiding high-frequency artifacts and manipulating the frequency distribution of the embedded feature map, LIDS achieves improved robustness against attacks that distort the high-frequency components of container images.
2 code implementations • 10 Mar 2023 • Ho Hin Lee, Quan Liu, Shunxing Bao, Qi Yang, Xin Yu, Leon Y. Cai, Thomas Li, Yuankai Huo, Xenofon Koutsoukos, Bennett A. Landman
We hypothesize that convolution with large kernel (LK) sizes is limited in maintaining optimal convergence for locality learning.
1 code implementation • CVPR 2023 • Xingqun Qi, Chen Liu, Muyi Sun, Lincheng Li, Changjie Fan, Xin Yu
Considering the asymmetric gestures and motions of two hands, we introduce a Spatial-Residual Memory (SRM) module to model spatial interaction between the body and each hand by residual learning.
no code implementations • 22 Feb 2023 • Shannan Guan, Xin Yu, Wei Huang, Gengfa Fang, Haiyan Lu
Our DMMG consists of a viewpoint variation min-max game and an edge perturbation min-max game.
1 code implementation • 7 Feb 2023 • Simin Li, Jun Guo, Jingqiao Xiu, Pu Feng, Xin Yu, Aishan Liu, Wenjun Wu, Xianglong Liu
To achieve maximum deviation in victim policies under complex agent-wise interactions, our unilateral attack aims to characterize and maximize the impact of the adversary on the victims.
1 code implementation • 23 Jan 2023 • Yadan Luo, Zhuoxiao Chen, Zijian Wang, Xin Yu, Zi Huang, Mahsa Baktashmotlagh
To alleviate the high annotation cost in LiDAR-based 3D object detection, active learning is a promising solution that learns to select only a small portion of unlabeled data to annotate, without compromising model performance.
no code implementations • 19 Jan 2023 • Junyang Cai, Khai-Nguyen Nguyen, Nishant Shrestha, Aidan Good, Ruisen Tu, Xin Yu, Shandian Zhe, Thiago Serra
One surprising trait of neural networks is the extent to which their connections can be pruned with little to no effect on accuracy.
1 code implementation • 3 Jan 2023 • Yifeng Ma, Suzhen Wang, Zhipeng Hu, Changjie Fan, Tangjie Lv, Yu Ding, Zhidong Deng, Xin Yu
In a nutshell, we aim to attain a speaking style from an arbitrary reference speaking video and then drive the one-shot portrait to speak with the reference speaking style and another piece of audio.
no code implementations • CVPR 2023 • Heming Du, Lincheng Li, Zi Huang, Xin Yu
In HiNL, we propose a History-aware State Estimation (HaSE) module to alleviate the impacts of dominant historical states on the current state estimation.
2 code implementations • 6 Dec 2022 • Wenbo Li, Xin Yu, Kun Zhou, Yibing Song, Zhe Lin, Jiaya Jia
To achieve high-quality results with low computational cost, we present a novel pixel spread model (PSM) that iteratively employs decoupled probabilistic modeling, combining the optimization efficiency of GANs with the prediction tractability of probabilistic models.
no code implementations • 6 Dec 2022 • Hao Zeng, Wei Zhang, Changjie Fan, Tangjie Lv, Suzhen Wang, Zhimeng Zhang, Bowen Ma, Lincheng Li, Yu Ding, Xin Yu
Unlike most previous methods that focus on transferring the source inner facial features but neglect facial contours, our FlowFace can transfer both of them to a target face, thus leading to more realistic face swapping.
1 code implementation • 30 Nov 2022 • Qi Yang, Xin Yu, Ho Hin Lee, Leon Y. Cai, Kaiwen Xu, Shunxing Bao, Yuankai Huo, Ann Zenobia Moore, Sokratis Makrogiannis, Luigi Ferrucci, Bennett A. Landman
The proposed pipeline is effective and robust in extracting muscle groups on 2D single slice CT thigh images. The container is available for public use at https://github.com/MASILab/DA_CT_muscle_seg
no code implementations • 15 Nov 2022 • Beibei Lin, Chen Liu, Ming Wang, Lincheng Li, Shunli Zhang, Robby T. Tan, Xin Yu
Existing gait recognition frameworks retrieve an identity in the gallery based on the distance between a probe sample and the identities in the gallery.
no code implementations • 25 Oct 2022 • Zhipeng Hu, Wei Zhang, Lincheng Li, Yu Ding, Wei Chen, Zhigang Deng, Xin Yu
We find that AUs and facial expressions are highly associated, and existing facial expression datasets often contain a large number of identities.
no code implementations • 23 Oct 2022 • Shibo Li, Jeff M. Phillips, Xin Yu, Robert M. Kirby, Shandian Zhe
However, this method queries only one pair of fidelity and input at a time, and hence risks bringing in strongly correlated examples that reduce the learning efficiency.
1 code implementation • 16 Oct 2022 • Ho Hin Lee, Yucheng Tang, Han Liu, Yubo Fan, Leon Y. Cai, Qi Yang, Xin Yu, Shunxing Bao, Yuankai Huo, Bennett A. Landman
We evaluate our proposed approach on multi-organ segmentation with both non-contrast CT (NCCT) datasets and the MICCAI 2015 BTCV Challenge contrast-enhanced CT (CECT) datasets.
1 code implementation • 14 Oct 2022 • Ruifei He, Shuyang Sun, Xin Yu, Chuhui Xue, Wenqing Zhang, Philip Torr, Song Bai, Xiaojuan Qi
Recent text-to-image generation models have shown promising results in generating high-fidelity photo-realistic images.
no code implementations • 13 Oct 2022 • Yuxin Mao, Zhexiong Wan, Yuchao Dai, Xin Yu
Single image blind deblurring is highly ill-posed as neither the latent sharp image nor the blur kernel is known.
1 code implementation • 29 Sep 2022 • Ping Liu, Xin Yu, Joey Tianyi Zhou
In this work, we first introduce a meta knowledge representation method that extracts meta knowledge from distributed clients.
1 code implementation • 28 Sep 2022 • Xin Yu, Qi Yang, Yucheng Tang, Riqiang Gao, Shunxing Bao, Leon Y. Cai, Ho Hin Lee, Yuankai Huo, Ann Zenobia Moore, Luigi Ferrucci, Bennett A. Landman
External experiments on 20 subjects from the Baltimore Longitudinal Study of Aging (BLSA) dataset that contains longitudinal single abdominal slices validate that our method can harmonize the slice positional variance in terms of muscle and visceral fat area.
1 code implementation • 28 Sep 2022 • Xin Yu, Qi Yang, Yinchi Zhou, Leon Y. Cai, Riqiang Gao, Ho Hin Lee, Thomas Li, Shunxing Bao, Zhoubing Xu, Thomas A. Lasko, Richard G. Abramson, Zizhao Zhang, Yuankai Huo, Bennett A. Landman, Yucheng Tang
Transformer-based models, capable of learning better global dependencies, have recently demonstrated exceptional representation learning capabilities in computer vision and medical image analysis.
no code implementations • 28 Sep 2022 • Xin Yu, Yucheng Tang, Qi Yang, Ho Hin Lee, Riqiang Gao, Shunxing Bao, Ann Zenobia Moore, Luigi Ferrucci, Bennett A. Landman
Metabolic health is increasingly implicated as a risk factor across conditions from cardiology to neurology, and efficient assessment of body composition is critical to quantitatively characterizing these relationships.
no code implementations • 7 Aug 2022 • Yujiao Shi, Xin Yu, Shan Wang, Hongdong Li
The critical challenge of this task is to learn a powerful global feature descriptor for the sequential ground-view images while considering its domain alignment with reference satellite images.
1 code implementation • 5 Aug 2022 • Feng Zhu, Zongxin Yang, Xin Yu, Yi Yang, Yunchao Wei
In this work, we propose a new online VIS paradigm named Instance As Identity (IAI), which models temporal information for both detection and tracking in an efficient way.
2 code implementations • 2 Aug 2022 • Beibei Lin, Shunli Zhang, Ming Wang, Lincheng Li, Xin Yu
The GFR extractor aims to extract contextual information, e.g., the relationship among various body parts, and the mask-based LFR extractor is presented to exploit the detailed posture changes of local regions.
1 code implementation • 20 Jul 2022 • Xin Yu, Peng Dai, Wenbo Li, Lan Ma, Jiajun Shen, Jia Li, Xiaojuan Qi
With the rapid development of mobile devices, modern widely used mobile phones typically allow users to capture 4K resolution (i.e., ultra-high-definition) images.
Ranked #1 on Image Restoration on UHDM
1 code implementation • 19 Jul 2022 • Haitian Zeng, Xin Yu, Jiaxu Miao, Yi Yang
We propose MHR-Net, a novel method for recovering Non-Rigid Shapes from Motion (NRSfM).
no code implementations • 16 Jun 2022 • Zhimin Li, Shusen Liu, Xin Yu, Bhavya Kailkhura, Jie Cao, James Daniel Diffenderfer, Peer-Timo Bremer, Valerio Pascucci
We decomposed and evaluated a set of critical geometric concepts from the commonly adopted classification loss, and used them to design a visualization system to compare and highlight the impact of pruning on model performance and feature representation.
no code implementations • 7 Jun 2022 • Aidan Good, Jiaqi Lin, Hannah Sieg, Mikey Ferguson, Xin Yu, Shandian Zhe, Jerzy Wieczorek, Thiago Serra
In this work, we study such relative distortions in recall by hypothesizing an intensification effect that is inherent to the model.
1 code implementation • ICLR 2021 • Hehe Fan, Xin Yu, Yuhang Ding, Yi Yang, Mohan Kankanhalli
Then, a spatial convolution is employed to capture the local structure of points in the 3D space, and a temporal convolution is used to model the dynamics of the spatial regions along the time dimension.
no code implementations • 12 May 2022 • Ho Hin Lee, Yucheng Tang, Riqiang Gao, Qi Yang, Xin Yu, Shunxing Bao, James G. Terry, J. Jeffrey Carr, Yuankai Huo, Bennett A. Landman
In this paper, we propose a novel unsupervised approach that leverages pairwise contrast-enhanced CT (CECT) context to compute non-contrast segmentation without ground-truth label.
1 code implementation • CVPR 2022 • Peng Dai, Xin Yu, Lan Ma, Baoheng Zhang, Jia Li, Wenbo Li, Jiajun Shen, Xiaojuan Qi
Moire patterns, appearing as color distortions, severely degrade image and video qualities when filming a screen with digital cameras.
1 code implementation • 26 Mar 2022 • Yujiao Shi, Xin Yu, Liu Liu, Dylan Campbell, Piotr Koniusz, Hongdong Li
We address the problem of ground-to-satellite image geo-localization, that is, estimating the camera latitude, longitude and orientation (azimuth angle) by matching a query image captured at the ground level against a large-scale database with geotagged satellite images.
1 code implementation • 9 Mar 2022 • Xin Yu, Thiago Serra, Srikumar Ramalingam, Shandian Zhe
We propose a tractable heuristic for solving the combinatorial extension of OBS, in which we select weights for simultaneous removal, as well as a systematic update of the remaining weights.
no code implementations • 8 Mar 2022 • Chuanfu Shen, Beibei Lin, Shunli Zhang, George Q. Huang, Shiqi Yu, Xin Yu
Also, we design an Inception-like ReverseMask Block, which has three branches: a global branch, a feature dropping branch, and a feature scaling branch.
Ranked #2 on Gait Recognition on OUMVLP
1 code implementation • 8 Mar 2022 • Ming Wang, Beibei Lin, Xianda Guo, Lincheng Li, Zheng Zhu, Jiande Sun, Shunli Zhang, Xin Yu
ECM consists of the Spatial-Temporal feature extractor (ST), the Frame-Level feature extractor (FL) and SPB, and has two obvious advantages: First, each branch focuses on a specific representation, which can be used to improve the robustness of the network.
no code implementations • 4 Mar 2022 • Xin Yu, Yucheng Tang, Yinchi Zhou, Riqiang Gao, Qi Yang, Ho Hin Lee, Thomas Li, Shunxing Bao, Yuankai Huo, Zhoubing Xu, Thomas A. Lasko, Richard G. Abramson, Bennett A. Landman
Efficiently quantifying renal structures can provide distinct spatial context and facilitate biomarker discovery for kidney morphology.
no code implementations • 23 Dec 2021 • Guangming Yao, Hongzhi Wu, Yi Yuan, Lincheng Li, Kun Zhou, Xin Yu
In this paper, we present a novel double diffusion based neural radiance field, dubbed DD-NeRF, to reconstruct human body geometry and render the human body appearance in novel views from a sparse set of images.
no code implementations • 6 Dec 2021 • Suzhen Wang, Lincheng Li, Yu Ding, Xin Yu
Hence, we propose a novel one-shot talking face generation framework by exploring consistent correlations between audio and visual motions from a specific speaker and then transferring audio-driven motion fields to a reference image.
1 code implementation • 16 Oct 2021 • Xin Yu, Jeroen van Baar, Siheng Chen
We use a coarse graph, derived from a dense graph, to estimate the human's 3D pose, and the dense graph to estimate the 3D shape.
Ranked #247 on 3D Human Pose Estimation on Human3.6M
1 code implementation • ICCV 2021 • Jing Zhang, Deng-Ping Fan, Yuchao Dai, Xin Yu, Yiran Zhong, Nick Barnes, Ling Shao
In this paper, we introduce a novel multi-stage cascaded learning framework via mutual information minimization to "explicitly" model the multi-modal information between RGB image and depth data.
no code implementations • 7 Sep 2021 • Minghui Zhang, Xin Yu, Hanxiao Zhang, Hao Zheng, Weihao Yu, Hong Pan, Xiangran Cai, Yun Gu
Compared to other state-of-the-art transfer learning methods, our method accurately segmented more bronchi in the noisy CT scans.
no code implementations • ICCV 2021 • Haitian Zeng, Yuchao Dai, Xin Yu, Xiaohan Wang, Yi Yang
As NRSfM is a highly under-constrained problem, we propose two new pairwise regularizations to further regularize the reconstruction.
no code implementations • 2 Aug 2021 • Yang Zhang, Xin Yu, Xiaobo Lu, Ping Liu
Specifically, we design a novel cross-modal transformer module for facial priors estimation, in which an input face and its landmark features are formulated as queries and keys, respectively.
1 code implementation • 20 Jul 2021 • Suzhen Wang, Lincheng Li, Yu Ding, Changjie Fan, Xin Yu
As this keypoint based representation models the motions of facial regions, head, and backgrounds integrally, our method can better constrain the spatial and temporal consistency of the generated videos.
1 code implementation • CVPR 2021 • Ruijie Quan, Xin Yu, Yuanzhi Liang, Yi Yang
First, we propose a complementary cascaded network architecture, namely CCN, to remove rain streaks and raindrops in a unified framework.
no code implementations • 3 Jun 2021 • Ho Hin Lee, Yucheng Tang, Qi Yang, Xin Yu, Shunxing Bao, Leon Y. Cai, Lucas W. Remedios, Bennett A. Landman, Yuankai Huo
Medical image segmentation, or computing voxel-wise semantic masks, is a fundamental yet challenging task.
1 code implementation • 31 May 2021 • Yuan Gan, Yawei Luo, Xin Yu, Bang Zhang, Yi Yang
In this paper, we investigate the task of hallucinating an authentic high-resolution (HR) human face from multiple low-resolution (LR) video snapshots.
no code implementations • ICLR 2021 • Heming Du, Xin Yu, Liang Zheng
In this paper, we introduce a Visual Transformer Network (VTNet) for learning informative visual representation in navigation.
1 code implementation • 16 Apr 2021 • Lincheng Li, Suzhen Wang, Zhimeng Zhang, Yu Ding, Yixing Zheng, Xin Yu, Changjie Fan
To be specific, our framework consists of a speaker-independent stage and a speaker-specific stage.
no code implementations • CVPR 2021 • Zongxin Yang, Xin Yu, Yi Yang
In the first step, the framework learns to segment objects from real and synthetic data in a weakly-supervised fashion, and the segmentation masks will act as a prior for pose estimation.
1 code implementation • CVPR 2021 • Yujiao Shi, Hongdong Li, Xin Yu
We then warp and aggregate source view pixels to synthesize a novel view based on the estimated source-view visibility and target-view depth.
no code implementations • ICCV 2021 • Peike Li, Xin Yu, Yi Yang
By iteratively updating the latent representations and our decoder, our DAP-FSR will be adapted to the target domain, thus achieving authentic and high-quality upsampled HR faces.
no code implementations • CVPR 2021 • Dongxu Li, Chenchen Xu, Kaihao Zhang, Xin Yu, Yiran Zhong, Wenqi Ren, Hanna Suominen, Hongdong Li
Video deblurring models exploit consecutive frames to remove blurs from camera shakes and object motions.
1 code implementation • 2 Mar 2021 • Yujiao Shi, Dylan Campbell, Xin Yu, Hongdong Li
Specifically, we observe that when a 3D point in the real world is visible in both views, there is a deterministic mapping between the projected points in the two-view images given the height information of this 3D point.
1 code implementation • NeurIPS 2021 • Thiago Serra, Xin Yu, Abhinav Kumar, Srikumar Ramalingam
We can compress a rectifier network while exactly preserving its underlying functionality with respect to a given input domain if some of its neurons are stable.
1 code implementation • 3 Feb 2021 • Yuhang Ding, Xin Yu, Yi Yang
Thus, it is more desirable to employ only a few labeled data in pursuing high segmentation performance.
no code implementations • 22 Jan 2021 • Gerard Kennedy, Zheyu Zhuang, Xin Yu, Robert Mahony
Object pose estimation from a single RGB image is a challenging problem due to variable lighting conditions and viewpoint changes.
1 code implementation • ICCV 2021 • Yuhang Ding, Xin Yu, Yi Yang
In this work, we propose a Region-aware Fusion Network (RFNet) that is able to exploit different combinations of multi-modal data adaptively and effectively for tumor segmentation.
Ranked #65 on Semantic Segmentation on NYU Depth v2
no code implementations • 10 Dec 2020 • Jing Zhang, Yuchao Dai, Xin Yu, Mehrtash Harandi, Nick Barnes, Richard Hartley
Existing deep neural network based salient object detection (SOD) methods mainly focus on pursuing high network accuracy.
no code implementations • ICCV 2021 • Beibei Lin, Shunli Zhang, Xin Yu
Towards this goal, we take advantage of both global visual information and local region details and develop a Global and Local Feature Extractor (GLFE).
2 code implementations • NeurIPS 2020 • Dongxu Li, Chenchen Xu, Xin Yu, Kaihao Zhang, Ben Swift, Hanna Suominen, Hongdong Li
Sign language translation (SLT) aims to interpret sign video sequences into text-based natural language sentences.
no code implementations • 4 Oct 2020 • Siddhant Ranade, Xin Yu, Shantnu Kakkar, Pedro Miraldo, Srikumar Ramalingam
We propose a novel technique to register sparse 3D scans in the absence of texture.
1 code implementation • ECCV 2020 • Heming Du, Xin Yu, Liang Zheng
Aiming to improve these two components, this paper proposes three complementary techniques, object relation graph (ORG), trial-driven imitation learning (IL), and a memory-augmented tentative policy network (TPN).
1 code implementation • 1 Jul 2020 • Yizhak Ben-Shabat, Xin Yu, Fatemeh Sadat Saleh, Dylan Campbell, Cristian Rodriguez-Opazo, Hongdong Li, Stephen Gould
The availability of a large labeled dataset is a key requirement for applying deep learning methods to solve various computer vision tasks.
1 code implementation • CVPR 2020 • Yujiao Shi, Xin Yu, Dylan Campbell, Hongdong Li
Cross-view geo-localization is the problem of estimating the position and orientation (latitude, longitude and azimuth angle) of a camera at ground level given a large-scale database of geo-tagged aerial (e.g., satellite) images.
1 code implementation • CVPR 2020 • Jing Zhang, Xin Yu, Aixuan Li, Peipei Song, Bowen Liu, Yuchao Dai
In this paper, we propose a weakly-supervised salient object detection model to learn saliency from such annotations.
no code implementations • CVPR 2020 • Dongxu Li, Xin Yu, Chenchen Xu, Lars Petersson, Hongdong Li
To this end, we extract news signs using a base WSLR model, and then design a classifier jointly trained on news and isolated signs to coarsely align these two domain features.
no code implementations • CVPR 2020 • Yang Zhang, Ivor Tsang, Yawei Luo, Changhui Hu, Xiaobo Lu, Xin Yu
This paper proposes a Copy and Paste Generative Adversarial Network (CPGAN) to recover authentic high-resolution (HR) face images while compensating for low and non-uniform illumination.
no code implementations • 10 Feb 2020 • Xin Yu, Zheyu Zhuang, Piotr Koniusz, Hongdong Li
In this paper, we aim to reduce such errors by incorporating the distances between pixels and keypoints into our objective.
no code implementations • 9 Feb 2020 • Yang Zhang, Ivor W. Tsang, Jun Li, Ping Liu, Xiaobo Lu, Xin Yu
The coarse-level FHnet generates a frontal coarse HR face and then the fine-level FHnet makes use of the facial component appearance prior, i.e., fine-grained facial components, to attain a frontal HR face image with authentic details.
1 code implementation • NeurIPS 2019 • Yujiao Shi, Liu Liu, Xin Yu, Hongdong Li
The first step is to apply a regular polar transform to warp an aerial image such that its domain is closer to that of a ground-view panorama.
Ranked #4 on Image-Based Localization on VIGOR Cross Area
2 code implementations • 24 Oct 2019 • Dongxu Li, Cristian Rodriguez Opazo, Xin Yu, Hongdong Li
Based on this new large-scale dataset, we are able to experiment with several deep learning methods for word-level sign recognition and evaluate their performance in large-scale scenarios.
Ranked #3 on Sign Language Recognition on WLASL100
1 code implementation • 11 Jul 2019 • Yujiao Shi, Xin Yu, Liu Liu, Tong Zhang, Hongdong Li
This paper proposes a novel Cross-View Feature Transport (CVFT) technique to explicitly establish cross-view domain transfer that facilitates feature alignment between ground and aerial images.
no code implementations • 13 Jun 2019 • Siddhant Ranade, Xin Yu, Shantnu Kakkar, Pedro Miraldo, Srikumar Ramalingam
In contrast to correspondence based methods, we take a different viewpoint and formulate the sparse 3D registration problem based on the constraints from the intersection of line segments from adjacent scans.
2 code implementations • CVPR 2019 • Yurun Tian, Xin Yu, Bin Fan, Fuchao Wu, Huub Heijnen, Vassileios Balntas
Despite the fact that Second Order Similarity (SOS) has been used with significant success in tasks such as graph matching and clustering, it has not been exploited for learning local descriptors.
no code implementations • 7 Apr 2019 • Fatemeh Shiri, Xin Yu, Fatih Porikli, Richard Hartley, Piotr Koniusz
We develop an Identity-preserving Face Recovery from Portraits (IFRP) method that utilizes a Style Removal network (SRN) and a Discriminative Network (DN).
no code implementations • 7 Apr 2019 • Fatemeh Shiri, Xin Yu, Fatih Porikli, Richard Hartley, Piotr Koniusz
Our method can recover high-quality photorealistic faces from unaligned portraits while preserving the identity of the face images, and it can also reconstruct a photorealistic face image with a desired set of attributes.
1 code implementation • 12 Mar 2019 • Liyuan Pan, Richard Hartley, Cedric Scheerlinck, Miaomiao Liu, Xin Yu, Yuchao Dai
Based on the abundant event data alongside low frame-rate, easily blurred images, we propose a simple yet effective approach to reconstruct high-quality and high frame-rate sharp videos.
1 code implementation • CVPR 2019 • Liyuan Pan, Cedric Scheerlinck, Xin Yu, Richard Hartley, Miaomiao Liu, Yuchao Dai
In this paper, we propose a simple and effective approach, the Event-based Double Integral (EDI) model, to reconstruct a high frame-rate, sharp video from a single blurry frame and its event data.
no code implementations • ECCV 2018 • Xin Yu, Basura Fernando, Bernard Ghanem, Fatih Porikli, Richard Hartley
State-of-the-art face super-resolution methods use deep convolutional neural networks to learn a mapping between low-resolution (LR) facial patterns and their corresponding high-resolution (HR) counterparts by exploring local information.
1 code implementation • 6 Jul 2018 • Xin Yu, Sagar Chaturvedi, Chen Feng, Yuichi Taguchi, Teng-Yok Lee, Clinton Fernandes, Srikumar Ramalingam
In this paper, we propose VLASE, a framework to use semantic edge features from images to achieve on-road localization.
no code implementations • CVPR 2018 • Xin Yu, Basura Fernando, Richard Hartley, Fatih Porikli
An LR input contains low-frequency facial components of its HR version, while its residual face image, defined as the difference between the HR ground-truth and interpolated LR images, contains the missing high-frequency facial details.
1 code implementation • CVPR 2018 • Xin Yu, Zhiding Yu, Srikumar Ramalingam
A family of super deep networks, referred to as residual networks or ResNet, achieved record-beating performance in various visual tasks such as image recognition, object detection, and semantic segmentation.
no code implementations • 5 Feb 2018 • Fatemeh Shiri, Xin Yu, Fatih Porikli, Piotr Koniusz
To enforce the destylized faces to be similar to authentic face images, we employ a discriminative network, which consists of convolutional and fully connected layers.
no code implementations • 8 Jan 2018 • Fatemeh Shiri, Xin Yu, Fatih Porikli, Richard Hartley, Piotr Koniusz
In this paper, we present a new Identity-preserving Face Recovery from Portraits (IFRP) to recover latent photorealistic faces from unaligned stylized portraits.
no code implementations • CVPR 2017 • Xin Yu, Fatih Porikli
Then we use a transformative encoder network to project the intermediate HR faces to aligned and noise-free LR faces.
Ranked #7 on Image Super-Resolution on VggFace2 - 8x upscaling