In digital images, optical aberration manifests as a multivariate degradation, where the spectral content of the scene, the lens imperfections, and the field of view jointly determine the result.
Temporal action segmentation is crucial for understanding long-form videos.
Ranked #2 on Action Segmentation on Breakfast
To bridge this gap, we propose a new task called Multi-view Amodal Instance Segmentation (MAIS) and introduce the MUVA dataset, the first MUlti-View AIS dataset that takes the shopping scenario as instantiation.
Most existing deblurring methods focus on removing global blur caused by camera shake, but they cannot handle local blur caused by object movement well.
Medical image segmentation has been widely recognized as a pivot procedure for clinical diagnosis, analysis, and treatment planning.
However, there are several major concerns when directly applying the Transformer to the action segmentation task, such as the lack of inductive biases with small training sets, the difficulty of processing long input sequences, and the limitation of the decoder architecture in utilizing temporal relations among multiple action segments to refine the initial predictions.
Ranked #2 on Action Segmentation on Assembly101
To address this problem, we propose a new non-end-to-end training strategy and explore different multi-stage architecture designs for the surgical phase recognition task.
In this paper, a unified multi-path framework for automatic surgical skill assessment is proposed, which accounts for multiple constituent aspects of surgical skill, including surgical tool usage, intraoperative event patterns, and other skill proxies.
We focus on automatically assessing the quality of in-the-wild videos, which is a challenging problem due to the absence of reference videos, the complexity of distortions, and the diversity of video contents.
Ranked #1 on Video Quality Assessment on MSU NR VQA Database
Computer vision technology is widely used in biological and medical data analysis and understanding.
Then an objective and automated framework based on a neural network is proposed to predict surgical skill through the proxy of COF.
In the experiments on the binary instrument segmentation task of the 2017 MICCAI EndoVis Robotic Instrument Segmentation Challenge dataset, the proposed method achieves 0.71 IoU and 0.81 Dice score without using a single manual annotation, demonstrating the potential of unsupervised learning for surgical tool segmentation.
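The IoU and Dice scores cited above are standard overlap metrics for binary segmentation masks; a minimal NumPy sketch (illustrative function name, not the challenge's official evaluation code):

```python
import numpy as np

def iou_and_dice(pred, target):
    """Compute IoU and Dice score for binary masks (0/1 arrays)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    iou = inter / union if union else 1.0
    total = pred.sum() + target.sum()
    dice = 2 * inter / total if total else 1.0
    return iou, dice

pred = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])
iou, dice = iou_and_dice(pred, target)
# intersection = 2, union = 4 -> IoU = 0.5; Dice = 2*2/(3+3) = 0.667
```

Note that Dice weights the intersection twice, so Dice is always at least as large as IoU on the same masks, which is why the two numbers reported above differ.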
Experiments on two relevant datasets (KonIQ-10k and CLIVE) show that, compared to MAE or MSE loss, the new loss enables the IQA model to converge about 10 times faster and the final model achieves better performance.
Ranked #2 on Image Quality Assessment on MSU NR VQA Database
no code implementations • 23 Mar 2020 • Tobias Ross, Annika Reinke, Peter M. Full, Martin Wagner, Hannes Kenngott, Martin Apitz, Hellena Hempe, Diana Mindroc Filimon, Patrick Scholz, Thuy Nuong Tran, Pierangela Bruno, Pablo Arbeláez, Gui-Bin Bian, Sebastian Bodenstedt, Jon Lindström Bolmgren, Laura Bravo-Sánchez, Hua-Bin Chen, Cristina González, Dong Guo, Pål Halvorsen, Pheng-Ann Heng, Enes Hosgor, Zeng-Guang Hou, Fabian Isensee, Debesh Jha, Tingting Jiang, Yueming Jin, Kadir Kirtac, Sabrina Kletz, Stefan Leger, Zhixuan Li, Klaus H. Maier-Hein, Zhen-Liang Ni, Michael A. Riegler, Klaus Schoeffmann, Ruohua Shi, Stefanie Speidel, Michael Stenzel, Isabell Twick, Gutai Wang, Jiacheng Wang, Liansheng Wang, Lu Wang, Yu-Jie Zhang, Yan-Jie Zhou, Lei Zhu, Manuel Wiesenfarth, Annette Kopp-Schneider, Beat P. Müller-Stich, Lena Maier-Hein
The validation of the competing methods for the three tasks (binary segmentation, multi-instance detection and multi-instance segmentation) was performed in three different stages with an increasing domain gap between the training and the test data.
It is widely known that well-designed perturbations can cause state-of-the-art machine learning classifiers to mislabel an image, even when the perturbations are small enough to be imperceptible to the human eye.
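One classic way to craft such small perturbations is the fast gradient sign method (FGSM); the following is a minimal NumPy sketch against a toy logistic-regression classifier (the weights and inputs are illustrative, and this is a standard baseline attack, not necessarily the method of the paper listed here):

```python
import numpy as np

def fgsm(x, y, w, b, eps):
    """One-step FGSM: move x by eps in the sign of the loss gradient.

    The loss is binary cross-entropy of a logistic-regression classifier,
    so the gradient w.r.t. x has the closed form (p - y) * w.
    """
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))  # predicted P(y = 1)
    grad_x = (p - y) * w                    # d(BCE)/dx
    return x + eps * np.sign(grad_x)        # per-pixel step bounded by eps

rng = np.random.default_rng(0)
w = rng.normal(size=4)
x = rng.normal(size=4)
y = 1.0                                     # true label
x_adv = fgsm(x, y, w, b=0.0, eps=0.1)
```

Because each coordinate moves by at most `eps`, the perturbation is bounded in the L-infinity norm, which is what makes it hard to perceive while still increasing the classifier's loss.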
We propose an objective no-reference video quality assessment method by integrating both effects into a deep neural network.
Ranked #6 on Video Quality Assessment on MSU NR VQA Database
Therefore, we propose a new no-reference method for tone-mapped image quality assessment based on multi-scale and multi-layer features extracted from a pre-trained deep convolutional neural network model.
To guarantee a satisfying Quality of Experience (QoE) for consumers, image quality must be measured efficiently and reliably.
The proposed method, SFA, is compared with nine representative blur-specific NR-IQA methods, two general-purpose NR-IQA methods, and two extra full-reference IQA methods on Gaussian blur images (with and without Gaussian noise/JPEG compression) and realistic blur images from multiple databases, including LIVE, TID2008, TID2013, MLIVE1, MLIVE2, BID, and CLIVE.
The property of edge-free guarantees that the generated adversarial images can still preserve visual quality, even when perturbations are of large magnitudes.
Recognition of surgical gesture is crucial for surgical skill assessment and efficient surgery training.
Ranked #3 on Action Segmentation on JIGSAWS
We introduce DeepSurv, a Cox proportional hazards deep neural network and state-of-the-art survival method for modeling interactions between a patient's covariates and treatment effectiveness in order to provide personalized treatment recommendations.
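A Cox proportional hazards network such as DeepSurv is trained by minimizing the negative log Cox partial likelihood of its predicted risk scores; a minimal NumPy sketch of that loss, assuming no tied event times (the function name is illustrative):

```python
import numpy as np

def cox_neg_log_partial_likelihood(risk, time, event):
    """Negative log Cox partial likelihood (no tie handling).

    risk:  predicted log-risk scores h(x), shape (n,)
    time:  observed survival or censoring times, shape (n,)
    event: 1.0 if the event was observed, 0.0 if censored, shape (n,)
    """
    order = np.argsort(-time)              # sort subjects by descending time
    risk, event = risk[order], event[order]
    # Running log-sum-exp: entry i covers everyone with time >= time_i,
    # i.e. the risk set for an event at time_i.
    log_cumsum = np.logaddexp.accumulate(risk)
    return -np.sum((risk - log_cumsum) * event)

loss = cox_neg_log_partial_likelihood(
    np.array([0.5, 1.0, -0.2]),            # toy risk scores
    np.array([3.0, 1.0, 2.0]),             # toy times
    np.array([1.0, 1.0, 0.0]),             # subject 3 is censored
)
```

Only uncensored subjects contribute a term, but censored subjects still appear in the risk sets of earlier events, which is how the loss uses censored data without ever observing their event times.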
In unsupervised ensemble learning, one obtains predictions from multiple sources or classifiers, yet without knowing the reliability and expertise of each source, and with no labeled data to assess it.
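One well-known approach in this label-free setting estimates each classifier's reliability from the pairwise agreement structure of its predictions: under conditional independence, the off-diagonal covariance of ±1 predictions is rank-one, and its leading eigenvector is proportional to each classifier's balanced accuracy. A minimal sketch of such a spectral weighted vote (a standard technique, not necessarily the exact method of the paper listed here):

```python
import numpy as np

def spectral_weighted_vote(preds):
    """Unsupervised weighted majority vote.

    preds: (m, n) matrix of +/-1 predictions from m classifiers on n samples.
    Estimates per-classifier weights from the leading eigenvector of the
    prediction covariance, then returns the sign of the weighted vote.
    """
    cov = np.cov(preds)
    np.fill_diagonal(cov, 0.0)             # only off-diagonals carry the rank-one signal
    vals, vecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    v = vecs[:, -1]                        # leading eigenvector
    if v.sum() < 0:
        v = -v                             # fix sign so better-than-random weights are positive
    return np.sign(preds.T @ v)            # may return 0 on an exact tie

truth = np.array([1, -1, 1, 1, -1, 1, -1, -1])
preds = np.tile(truth, (4, 1)).astype(float)   # four perfectly accurate classifiers
labels = spectral_weighted_vote(preds)
```

In this degenerate example all classifiers agree, so the vote recovers the common prediction; the value of the spectral weights shows up when classifier accuracies differ and the better ones should count for more.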
As image enhancement algorithms have proliferated in recent years, comparing the performance of different image enhancement algorithms has become a new task.