Our method is composed of two networks: a localizer that yields a segmentation mask, followed by a classifier.
Transductive inference is widely used in few-shot learning, as it leverages the statistics of the unlabeled query set of a few-shot task, typically yielding substantially better performances than its inductive counterpart.
An interesting and practical paradigm is online test-time adaptation, according to which training data is inaccessible, no labelled data from the test distribution is available, and adaptation can only happen at test time and on a handful of samples.
The CNN is exploited to collect both positive and negative evidence at the pixel level to train the decoder.
Following our observations, we propose a simple and flexible generalization based on inequality constraints, which imposes a controllable margin on logit distances.
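One way to read the inequality constraint above is as a hinge penalty on the distance between the largest logit and every other logit: distances below the margin cost nothing, and only the excess is penalized. The sketch below assumes that reading; the function name `margin_penalty` and the exact penalty form are illustrative assumptions, not taken from the paper.

```python
def margin_penalty(logits, margin):
    """Hinge penalty on logit distances that exceed `margin`.

    Assumed reading of the constraint: for each class k, the distance
    max(logits) - logits[k] should stay at or below `margin`.
    """
    top = max(logits)
    # Only the part of each distance above the margin contributes.
    return sum(max(0.0, (top - value) - margin) for value in logits)


# Distances to the top logit are 0, 4 and 2; with margin 3 only the
# second one violates the constraint, by 1.
print(margin_penalty([5.0, 1.0, 3.0], margin=3.0))
```

With `margin=0` every nonzero distance is penalized, which pushes the logits toward equality; a larger margin leaves the network free to separate classes up to that distance.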
In this work, we propose a dual-branch architecture, where the upper branch (teacher) receives strong annotations, while the bottom one (student) is driven by limited supervision and guided by the upper branch.
Interpolation is required to restore full-size CAMs, yet it does not consider the statistical properties of objects, such as color and texture, leading to activations with inconsistent boundaries and inaccurate localizations.
Our method yields results comparable to several state-of-the-art adaptation techniques, despite having access to much less information, as the source images are entirely absent in our adaptation phase.
We motivate our transductive loss by deriving a formal relation between the classification accuracy and mutual-information maximization.
Surprisingly, we found that even standard clustering procedures (e.g., K-means), which correspond to particular, non-regularized cases of our general model, already achieve competitive performance in comparison to the state of the art in few-shot learning.
Adversarial robustness has become a topic of growing interest in machine learning since it was observed that neural networks tend to be brittle.
We also found that shape descriptors can be a valid way to encode anatomical priors about the task, making it possible to leverage expert knowledge without additional annotations.
In the abundant segmentation literature, there is no clear consensus as to which of these losses is a better choice, with varying performances for each across different benchmarks and applications.
In conjunction with a standard cross-entropy over the labeled pixels, our novel formulation integrates two important terms: (i) a Shannon entropy loss defined over the less-supervised images, which encourages confident student predictions at the bottom branch; and (ii) a Kullback-Leibler (KL) divergence, which transfers the knowledge from the predictions generated by the strongly supervised branch to the less-supervised branch, and guides the entropy (student-confidence) term to avoid trivial solutions.
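The two unsupervised terms described above can be sketched for per-pixel class probabilities as follows. This is a minimal illustration under stated assumptions: the KL direction (teacher relative to student), the weights `w_entropy` and `w_kl`, and all function names are hypothetical, and the cross-entropy over labeled pixels is left out.

```python
import math


def shannon_entropy(p, eps=1e-12):
    """Shannon entropy of one discrete distribution (natural log)."""
    return -sum(pi * math.log(pi + eps) for pi in p)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def student_loss(teacher_probs, student_probs, w_entropy=1.0, w_kl=1.0):
    """Average over pixels of: student entropy + KL(teacher || student).

    Both arguments are lists of per-pixel class distributions.
    The KL direction and the weights are assumptions of this sketch.
    """
    n = len(student_probs)
    entropy = sum(shannon_entropy(s) for s in student_probs) / n
    kl = sum(kl_divergence(t, s) for t, s in zip(teacher_probs, student_probs)) / n
    return w_entropy * entropy + w_kl * kl


teacher = [[0.9, 0.1], [0.2, 0.8]]
student = [[0.6, 0.4], [0.5, 0.5]]
print(student_loss(teacher, student))
```

The entropy term alone is minimized by any confident prediction, including a constant one; the KL term ties the student to the teacher's predictions, which is the role the text assigns to it in avoiding trivial solutions.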
We show that the way inference is performed in few-shot segmentation tasks has a substantial effect on performances -- an aspect often overlooked in the literature in favor of the meta-learning paradigm.
1 code implementation • 9 Dec 2020 • Shanshan Wang, Cheng Li, Rongpin Wang, Zaiyi Liu, Meiyun Wang, Hongna Tan, Yaping Wu, Xinfeng Liu, Hui Sun, Rui Yang, Xin Liu, Jie Chen, Huihui Zhou, Ismail Ben Ayed, Hairong Zheng
Automatic medical image segmentation plays a critical role in scientific research and medical care.
Our attack enjoys the generality of penalty methods and the computational efficiency of distance-customized algorithms, and can be readily used for a wide set of distances.
We propose novel regularization terms, which enable the model to seek both non-discriminative and discriminative regions, while discouraging unbalanced segmentations.
CNN visualization and interpretation methods, like class-activation maps (CAMs), are typically used to highlight the image regions linked to class predictions.
Assessing the degree of disease severity in biomedical images is a task similar to standard classification but constrained by an underlying structure in the label space.
Our analysis demonstrates that the retinal vessel segmentation problem is far from solved when considering test images that differ substantially from the training data, and that this task represents an ideal scenario for the exploration of domain adaptation techniques.
We introduce Transductive Information Maximization (TIM) for few-shot learning.
Our Mutual Attention network relies on the joint spatial attention between image and optical flow feature maps to activate a common set of salient features across them.
This compendium gathers all the accepted extended abstracts from the Third International Conference on Medical Imaging with Deep Learning (MIDL 2020), held in Montreal, Canada, 6-9 July 2020.
Our transductive inference does not re-train the base model, and can be viewed as a graph clustering of the query set, subject to supervision constraints from the support set.
Data augmentation is a key practice in machine learning for improving generalization performance.
Our formulation is based on minimizing a label-free entropy loss defined over target-domain data, which we further guide with a domain invariant prior on the segmentation regions.
In particular, we adapt a classical tightness prior to a deep learning setting by imposing a set of constraints on the network outputs.
Second, we show that, more generally, minimizing the cross-entropy is actually equivalent to maximizing the mutual information, to which we connect several well-known pairwise losses.
Ranked #7 on Metric Learning on In-Shop (using extra training data)
To handle this new learning paradigm, we propose to include surrogate tasks that can leverage very powerful supervisory signals --derived from the data itself-- for semantic feature learning.
Despite the initial belief that Convolutional Neural Networks (CNNs) are driven by shapes to perform visual recognition tasks, recent evidence suggests that texture bias in CNNs provides higher performing models when learning on large labeled training datasets.
Ranked #2 on Few-Shot Semantic Segmentation on Pascal5i
We propose a new constrained-optimization formulation for deep ordinal classification, in which uni-modality of the label distribution is enforced implicitly via a set of inequality constraints over all the pairs of adjacent labels.
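The pairwise constraints above can be illustrated by measuring how far a score vector departs from a unimodal shape around the target label: scores should not decrease on the way up to the label and should not increase after it. The sketch below uses a simple hinge on each adjacent pair; this penalty form and the function name are assumptions made for illustration, and the paper's own handling of the constraints may differ.

```python
def unimodality_violation(scores, label):
    """Total hinge violation of the unimodal ordering around `label`.

    For adjacent pairs left of `label`, scores[k] <= scores[k + 1] is
    expected; from `label` onward, scores[k] >= scores[k + 1].
    """
    penalty = 0.0
    for k in range(len(scores) - 1):
        if k < label:
            # Left of the target: a decrease is a violation.
            penalty += max(0.0, scores[k] - scores[k + 1])
        else:
            # At or right of the target: an increase is a violation.
            penalty += max(0.0, scores[k + 1] - scores[k])
    return penalty


unimodal = [0.1, 0.3, 0.9, 0.4, 0.2]
bimodal = [0.5, 0.1, 0.9, 0.1, 0.6]
print(unimodality_violation(unimodal, label=2))
print(unimodality_violation(bimodal, label=2))
```

A vector with a single peak at the label yields zero violation, while a second peak at either end of the label range is penalized in proportion to its height.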
Ranked #1 on Historical Color Image Dating on HCI
Large-scale ground-truth data sets are of crucial importance for deep-learning-based segmentation models, but annotating per-pixel masks is prohibitively time-consuming.
Data augmentation (DA) is fundamental against overfitting in large convolutional neural networks, especially with a limited training dataset.
Four key challenges are identified for the application of deep WSOL methods in histology: under-activation of CAMs, over-activation of CAMs, sensitivity to thresholding, and model selection.
An efficient strategy for weakly-supervised segmentation is to impose constraints or regularization priors on target regions.
We demonstrate the existence of universal adversarial perturbations, which can fool a family of audio classification architectures, for both targeted and untargeted attack scenarios.
We propose to adapt segmentation networks with a constrained formulation, which embeds domain-invariant prior knowledge about the segmentation regions.
Pointwise localization allows more precise localization and more accurate interpretability than bounding boxes in applications where objects are highly unstructured, such as in the medical domain.
Then, these techniques are analysed according to their pruning criteria and strategy, and according to different scenarios for exploiting pruning methods to fine-tune networks to target domains.
We derive a general tight upper bound based on a concave-convex decomposition of our fairness term, its Lipschitz-gradient property, and Pinsker's inequality.
This study investigates a curriculum-style strategy for semi-supervised CNN segmentation, which devises a regression network to learn image-level information such as the size of a target region.
While sub-optimality is not guaranteed for non-convex problems, this result shows that log-barrier extensions are a principled way to approximate Lagrangian optimization for constrained CNNs via implicit dual variables.
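For a constraint written as z <= 0, a log-barrier extension replaces the standard barrier, which is undefined for z >= 0, with a function that stays finite everywhere by switching to a linear piece near and beyond the feasible boundary. The sketch below shows the piecewise form as it is commonly stated; the switching point at -1/t^2 and the constants of the linear piece should be treated as an assumption to check against the paper, and the function name is illustrative.

```python
import math


def log_barrier_extension(z, t):
    """Piecewise penalty for a constraint z <= 0, with parameter t > 0.

    Assumed form: the standard log-barrier -(1/t) * log(-z) for
    z <= -1/t**2, and its tangent line beyond that point, so the
    function is finite, continuous and increasing for every real z.
    """
    threshold = -1.0 / (t * t)
    if z <= threshold:
        return -math.log(-z) / t
    # Linear extension with slope t, matched in value at the threshold.
    return t * z - math.log(1.0 / (t * t)) / t + 1.0 / t


for z in (-2.0, -0.04, 0.0, 1.0):
    print(z, log_barrier_extension(z, t=5.0))
```

Because the penalty is defined for infeasible points as well, it can be minimized with ordinary gradient steps from an arbitrary initialization, and increasing `t` tightens the approximation of the hard constraint.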
We juxtapose our approach to state-of-the-art segmentation adaptation via adversarial training in the network-output space.
We propose a boundary loss, which takes the form of a distance metric on the space of contours, not regions.
Research on adversarial examples in computer vision tasks has shown that small, often imperceptible changes to an image can induce misclassification, which has security implications for a wide range of image processing systems.
Despite the technological advances in medical imaging, IVD localization and segmentation are still manually performed, which is time-consuming and prone to errors.
Furthermore, we show that the density modes can be obtained as byproducts of the assignment variables via simple maximum-value operations whose additional computational cost is linear in the number of data points.
First, instead of combining the available image modalities at the input, each of them is processed in a different path to better exploit their unique information.
Typically, they use multinomial logistic regression posteriors and parameter regularization, as is very common in supervised learning.
Ranked #2 on Image Clustering on YouTube Faces DB
Precise segmentation of bladder walls and tumor regions is an essential step towards non-invasive identification of tumor stage and grade, which is critical for treatment decision and prognosis of patients with bladder cancer (BC).
To the best of our knowledge, the method of [Pathak et al., 2015] is the only prior work that addresses deep CNNs with linear constraints in weakly supervised segmentation.
Therefore, the proposed network has total freedom to learn more complex combinations between the modalities, within and in-between all the levels of abstraction, which significantly enriches the learned representation.
Ranked #1 on Medical Image Segmentation on iSEG 2017 Challenge
This approach simplifies weakly-supervised training by avoiding extra MRF/CRF inference steps or layers explicitly generating full masks, while improving both the quality and efficiency of training.
The method enforces connectivity priors iteratively by a cutting plane method, and provides feasible solutions with a guarantee on sub-optimality even if it is terminated early.
We report evaluations of our method on the public data of the MICCAI iSEG-2017 Challenge on 6-month infant brain MRI segmentation, and show very competitive results among 21 teams, ranking first or second in most metrics.
Ranked #1 on Infant Brain Mri Segmentation on iSEG 2017 Challenge
Neonatal brain segmentation in magnetic resonance (MR) is a challenging problem due to poor image quality and low contrast between white and gray matter regions.
We call it Breiman's bias due to its similarity to the histogram mode isolation previously discovered by Breiman in decision tree learning with Gini impurity.
We propose to constrain segmentation functionals with a dimensionless, unbiased and position-independent shape compactness prior, which we solve efficiently with an alternating direction method of multipliers (ADMM).
These figures translate into a very good agreement with the reference contours and an increase in accuracy compared to other methods.
We formulate an Alternating Direction Method of Multipliers (ADMM) that systematically distributes the computations of any technique for optimizing pairwise functions, including non-submodular potentials.
Our bound formulation for kernel K-means allows combining general pairwise feature clustering methods with image grid regularization using graph cuts, similarly to standard color model fitting techniques for segmentation.
We propose a new segmentation model combining common regularization energies, e.g., Markov Random Field (MRF) potentials, and standard pairwise clustering criteria like Normalized Cut (NC), average association (AA), etc.
Many standard optimization methods for segmentation and reconstruction compute ML model estimates for appearance or geometry of segments, e.g., Zhu-Yuille 1996, Torr 1998, Chan-Vese 2001, GrabCut 2004, Delong et al. 2012.