Since those regularization strategies are mostly associated with classifier outputs, we propose a MUlti-Classifier (MUC) incremental learning paradigm that integrates an ensemble of auxiliary classifiers to estimate more effective regularization constraints.
The frequency-guided upsampling module reconstructs fine details from multiple frequency-specific components.
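As a minimal, generic illustration of frequency-specific decomposition (not the paper's module — the function names and the box-blur choice are assumptions), an image can be split into a low-frequency base and a high-frequency residual that together reconstruct the input exactly:

```python
import numpy as np

def box_blur(img, k=5):
    """Naive sliding-window average of a 2-D array (edge-padded)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def frequency_split(img, k=5):
    """Split an image into low- and high-frequency components.

    low + high reconstructs the original exactly, so the detail
    component can be re-injected after processing the base.
    """
    low = box_blur(img, k)
    high = img - low
    return low, high
```

A real module would use a learned or multi-band (e.g. wavelet/pyramid) decomposition; the exact-reconstruction property shown here is what makes detail re-injection possible.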
The breakthrough of contrastive learning (CL) has fueled the recent success of self-supervised learning (SSL) in high-level vision tasks on RGB images.
no code implementations • 25 May 2021 • Xi Jia, Alexander Thorley, Wei Chen, Huaqi Qiu, Linlin Shen, Iain B Styles, Hyung Jin Chang, Ales Leonardis, Antonio de Marvao, Declan P. O'Regan, Daniel Rueckert, Jinming Duan
We then propose two neural layers (i.e., a warping layer and an intensity consistency layer) to model the analytical solution, and a residual U-Net to formulate the denoising problem (i.e., a generalized denoising layer).
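The intensity-consistency idea can be sketched as a data-fidelity term: warp the moving image by a displacement field and measure its intensity difference to the fixed image. This is a hypothetical, simplified stand-in (nearest-neighbour warping, NumPy-only), not the paper's differentiable layer:

```python
import numpy as np

def intensity_consistency(fixed, moving, flow):
    """Mean squared intensity difference after warping `moving` by `flow`.

    fixed, moving: (H, W) images; flow: (H, W, 2) displacement in pixels.
    Nearest-neighbour sampling keeps the sketch short; real registration
    layers use differentiable bilinear sampling instead.
    """
    h, w = fixed.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(np.round(ys + flow[..., 0]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xs + flow[..., 1]).astype(int), 0, w - 1)
    warped = moving[src_y, src_x]
    return float(np.mean((warped - fixed) ** 2))
```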
no code implementations • 7 May 2021 • Jinjin Gu, Haoming Cai, Chao Dong, Jimmy S. Ren, Yu Qiao, Shuhang Gu, Radu Timofte, Manri Cheon, SungJun Yoon, Byungyeon Kang, Junwoo Lee, Qing Zhang, Haiyang Guo, Yi Bin, Yuqing Hou, Hengliang Luo, Jingyu Guo, ZiRui Wang, Hai Wang, Wenming Yang, Qingyan Bai, Shuwei Shi, Weihao Xia, Mingdeng Cao, Jiahao Wang, Yifan Chen, Yujiu Yang, Yang Li, Tao Zhang, Longtao Feng, Yiting Liao, Junlin Li, William Thong, Jose Costa Pereira, Ales Leonardis, Steven McDonagh, Kele Xu, Lehan Yang, Hengxing Cai, Pengfei Sun, Seyed Mehdi Ayyoubzadeh, Ali Royat, Sid Ahmed Fezza, Dounia Hammou, Wassim Hamidouche, Sewoong Ahn, Gwangjin Yoon, Koki Tsubota, Hiroaki Akutsu, Kiyoharu Aizawa
This paper reports on the NTIRE 2021 challenge on perceptual image quality assessment (IQA), held in conjunction with the New Trends in Image Restoration and Enhancement (NTIRE) workshop at CVPR 2021.
We study the problem of labelling effort for semantic segmentation of large-scale 3D point clouds.
In this paper, we focus on category-level 6D pose and size estimation from a monocular RGB-D image.
Recurrent models are becoming a popular choice for video enhancement tasks such as video denoising.
Object detection has witnessed significant progress by relying on large, manually annotated datasets.
When smartphone cameras are used to take photos of digital screens, moire patterns usually result, severely degrading photo quality.
no code implementations • 6 May 2020 • Shanxin Yuan, Radu Timofte, Ales Leonardis, Gregory Slabaugh, Xiaotong Luo, Jiangtao Zhang, Yanyun Qu, Ming Hong, Yuan Xie, Cuihua Li, Dejia Xu, Yihao Chu, Qingyan Sun, Shuai Liu, Ziyao Zong, Nan Nan, Chenghua Li, Sangmin Kim, Hyungjoon Nam, Jisu Kim, Jechang Jeong, Manri Cheon, Sung-Jun Yoon, Byungyeon Kang, Junwoo Lee, Bolun Zheng, Xiaohong Liu, Linhui Dai, Jun Chen, Xi Cheng, Zhen-Yong Fu, Jian Yang, Chul Lee, An Gia Vien, Hyunkook Park, Sabari Nathan, M. Parisa Beham, S Mohamed Mansoor Roomi, Florian Lemarchand, Maxime Pelcat, Erwan Nogues, Densen Puthussery, Hrishikesh P. S, Jiji C. V, Ashish Sinha, Xuan Zhao
Track 1 targeted the single image demoireing problem, which seeks to remove moire patterns from a single image.
Image demoireing is a multi-faceted image restoration task involving both texture and color restoration.
Ranked #1 on Image Enhancement on TIP 2018
This framework flexibly disentangles user-adaptation into model personalization on the server and local data regularization on the user device, with desirable properties regarding scalability and privacy constraints.
Third, via the predicted segmentation and translation, we transfer the fine object point cloud into a local canonical coordinate, in which we train a rotation localization network to estimate initial object rotation.
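The transfer into a local canonical coordinate frame can be illustrated generically: select the segmented object points and subtract the predicted translation, so the rotation network sees a centered point cloud. This is a sketch under assumed inputs (the function name and shapes are illustrative), not the paper's exact pipeline:

```python
import numpy as np

def to_canonical(points, mask, translation):
    """Move segmented object points into a local canonical frame.

    points:      (N, 3) scene point cloud
    mask:        (N,) boolean segmentation of the object
    translation: (3,) predicted object center in scene coordinates
    """
    obj = points[mask]
    # Centering removes translation, so only rotation remains to estimate.
    return obj - translation
```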
Firstly, we select a set of candidate scene illuminants in a data-driven fashion and apply them to a target image to generate a set of corrected images.
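Applying candidate illuminants to generate corrected images can be sketched with a simple von Kries (per-channel division) correction; the function name, gray-world normalization, and input conventions are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def apply_illuminants(img, illuminants):
    """Generate one corrected image per candidate illuminant.

    img:         (H, W, 3) linear RGB image
    illuminants: (K, 3) candidate illuminant colors
    Each candidate is normalized to mean 1 so a neutral (gray)
    illuminant leaves the image unchanged; correction divides each
    channel by the illuminant color (von Kries model).
    """
    ills = np.asarray(illuminants, dtype=float)
    ills = ills / ills.mean(axis=1, keepdims=True)
    # Broadcast to (K, H, W, 3): one corrected image per candidate.
    return img[None, ...] / ills[:, None, None, :]
```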
no code implementations • 8 Nov 2019 • Shanxin Yuan, Radu Timofte, Gregory Slabaugh, Ales Leonardis, Bolun Zheng, Xin Ye, Xiang Tian, Yaowu Chen, Xi Cheng, Zhen-Yong Fu, Jian Yang, Ming Hong, Wenying Lin, Wenjin Yang, Yanyun Qu, Hong-Kyu Shin, Joon-Yeon Kim, Sung-Jea Ko, Hang Dong, Yu Guo, Jie Wang, Xuan Ding, Zongyan Han, Sourya Dipta Das, Kuldeep Purohit, Praveen Kandula, Maitreya Suin, A. N. Rajagopalan
A new dataset, called LCDMoire, was created for this challenge; it consists of 10,200 synthetically generated image pairs (moire and clean ground truth).
In addition to describing the dataset and its creation, this paper also reviews the challenge tracks, competition, and results, the latter summarizing the current state-of-the-art on this dataset.
Artificial neural networks thrive in solving the classification problem for a particular rigid task, acquiring knowledge through generalized learning behaviour from a distinct training phase.
We first show that applying physics supervision to an existing scene understanding model increases performance, produces more stable predictions, and allows training to an equivalent performance level with fewer annotated training examples.
In this work, we propose a new approach that affords fast adaptation to previously unseen cameras, and robustness to changes in capture device by leveraging annotated samples across different cameras and datasets.
no code implementations • 9 Oct 2018 • Tomas Hodan, Rigas Kouskouridas, Tae-Kyun Kim, Federico Tombari, Kostas Bekris, Bertram Drost, Thibault Groueix, Krzysztof Walas, Vincent Lepetit, Ales Leonardis, Carsten Steger, Frank Michel, Caner Sahin, Carsten Rother, Jiri Matas
The workshop featured four invited talks, oral and poster presentations of accepted workshop papers, and an introduction of the BOP benchmark for 6D object pose estimation.
With this, we show that the HRA database poses a challenge at a higher level for well-studied representation learning methods, and provide a benchmark for the task of human rights violation recognition in visual context.
We question the dominant role of real-world training images in the field of material classification by investigating whether synthesized data can generalise more effectively than real-world data.
Selecting the most suitable local invariant feature detector for a particular application has rendered the task of evaluating feature detectors a critical issue in vision research.
We conduct a rigorous evaluation on a common ground by combining this dataset with different state-of-the-art deep convolutional architectures in order to achieve recognition of human rights violations.
Determining the material category of a surface from an image is a demanding task in perception that is drawing increasing attention.
Since each branch in NetT is trained by the videos of a specific category or groups of similar categories, NetT encodes category-based features for tracking.
Since local feature detection has been one of the most active research areas in computer vision during the last decade, a large number of detectors have been proposed.
Its efficiency and good accuracy in determining the optimal feature detector for any operating condition make the proposed tool suitable for use in real visual applications.
We represent the topological relationship between shape components using graphs, which are aggregated to construct a hierarchical graph structure for the shape vocabulary.
This paper presents a method for single target tracking of arbitrary objects in challenging video sequences.
This paper addresses the problem of single-target tracker performance evaluation.
We propose a joint object pose estimation and categorization approach which extracts information about object poses and categories from the object parts and compositions constructed at different layers of a hierarchical object representation algorithm, namely Learned Hierarchy of Parts (LHOP).
A graph theoretic approach is proposed for object shape representation in a hierarchical compositional architecture called Compositional Hierarchy of Parts (CHOP).
At the top-level of the vocabulary, the compositions are sufficiently large and complex to represent the whole shapes of the objects.
We explore and compare their computational behavior (space and time) and detection performance as a function of the number of learned classes on several recognition data sets.