Such an approach enables the learning of more powerful prototype representations because each learned prototype owns a measure of variability, which naturally reduces sparsity given the spread of the distribution around each prototype. We also integrate a prototype diversity objective function into the GMM optimisation to reduce redundancy.
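A minimal sketch of one way such a diversity term could be attached to GMM-style prototypes is below; the inverse-distance form, the variable names, and the weighting are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def prototype_diversity_loss(means: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Penalise redundant prototypes by discouraging pairs of GMM means from
    collapsing onto each other; `means` has shape (K, D)."""
    dists = torch.cdist(means, means, p=2)                        # (K, K) pairwise distances
    mask = torch.eye(means.size(0), device=means.device) * 1e9    # ignore self-distances
    return (1.0 / (dists + mask + eps)).mean()                    # large when prototypes overlap

# Hypothetical usage inside the GMM objective:
# loss = gmm_negative_log_likelihood + lambda_diversity * prototype_diversity_loss(gmm_means)
```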
Then, cross-modal knowledge distillation is performed between the teacher and student modalities for each task to push the model parameters towards a point that is beneficial for all tasks.
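A minimal sketch of the kind of distillation loss such cross-modal transfer typically relies on is shown below; the temperature, the distillation direction, and the KL form are assumptions rather than the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def cross_modal_kd_loss(student_logits: torch.Tensor,
                        teacher_logits: torch.Tensor,
                        temperature: float = 2.0) -> torch.Tensor:
    """KL divergence between softened teacher and student predictions,
    so the student modality mimics the teacher modality for this task."""
    t = temperature
    student_log_prob = F.log_softmax(student_logits / t, dim=-1)
    teacher_prob = F.softmax(teacher_logits / t, dim=-1)
    return F.kl_div(student_log_prob, teacher_prob, reduction="batchmean") * (t * t)
```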
Second, we introduce a new Partial Label Supervision (PLS) strategy for noisy label learning that accounts for both clean label coverage and uncertainty.
This is achieved with a strategy that relies on auxiliary tasks based on distribution alignment and domain classification, in addition to a residual feature fusion procedure.
In this work, we propose a new strategy to build cost-effective and relatively unbiased audio-visual semantic segmentation benchmarks.
no code implementations • 31 Jan 2023 • Yuanhong Chen, Yuyuan Liu, Chong Wang, Michael Elliott, Chun Fung Kwok, Carlos Pena-Solorzano, Yu Tian, Fengbei Liu, Helen Frazer, Davis J. McCarthy, Gustavo Carneiro
Given the large size of such datasets, researchers usually face a dilemma with the weakly annotated subset: to not use it or to fully annotate it.
Prototypical part network (ProtoPNet) methods have been designed to achieve interpretable classification by associating predictions with a set of training prototypes, which we refer to as trivial prototypes because they are trained to lie far from the classification boundary in the feature space.
Also, the new sample selection is based on multi-view consensus, which uses the label views from the training labels and from the model predictions to divide the training set into clean and noisy subsets for training the multi-class model, and to re-label the training samples with multiple top-ranked labels for training the multi-label model.
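One plausible reading of such a consensus-based split is sketched below; the agreement rule, the function name, and the top-k re-labelling are assumptions made for illustration only.

```python
import numpy as np

def consensus_split(train_labels: np.ndarray,
                    pred_labels: np.ndarray,
                    class_scores: np.ndarray,
                    k: int = 3):
    """Illustrative multi-view consensus: a sample is treated as 'clean' when the
    label view from the training annotation agrees with the label view from the
    model prediction; disagreeing samples are re-labelled with their k top-ranked
    classes for the multi-label branch."""
    agree = train_labels == pred_labels                 # (N,) per-sample agreement
    clean_idx = np.where(agree)[0]
    noisy_idx = np.where(~agree)[0]
    relabels = np.argsort(-class_scores[noisy_idx], axis=1)[:, :k]   # (num_noisy, k)
    return clean_idx, noisy_idx, relabels
```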
On the other hand, prototype-based models improve interpretability by associating predictions with training image prototypes, but they are less accurate than global models and their prototypes tend to have poor diversity.
When analysing screening mammograms, radiologists can naturally process information across two ipsilateral views of each breast, namely the cranio-caudal (CC) and mediolateral-oblique (MLO) views.
Multi-modal learning focuses on training models by equally combining multiple input data modalities during the prediction process.
The most successful SSL approaches are based on consistency learning that minimises the distance between model responses obtained from perturbed views of the unlabelled data.
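The sketch below shows a generic form of this consistency objective, not any specific SSL method: two stochastic perturbations of the same unlabelled batch are fed through the model and the distance between the responses is minimised. The MSE distance and the stop-gradient target view are assumptions.

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, x_unlabelled: torch.Tensor, perturb) -> torch.Tensor:
    """Minimise the distance between model responses for two perturbed views of
    the same unlabelled images; `perturb` is any stochastic augmentation."""
    with torch.no_grad():
        p_target = F.softmax(model(perturb(x_unlabelled)), dim=-1)   # target view
    p_online = F.softmax(model(perturb(x_unlabelled)), dim=-1)       # learnable view
    return F.mse_loss(p_online, p_target)
```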
Current polyp detection methods from colonoscopy videos use exclusively normal (i.e., healthy) training images, which i) ignore the importance of temporal information in consecutive video frames, and ii) lack knowledge about the polyps.
Our UAD approach, the memory-augmented multi-level cross-attentional masked autoencoder (MemMC-MAE), is a transformer-based approach, consisting of a novel memory-augmented self-attention operator for the encoder and a new multi-level cross-attention operator for the decoder.
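A minimal sketch of a memory-augmented attention block of the kind described above is given below; the slot count, dimensions, and residual wiring are illustrative assumptions, not the MemMC-MAE architecture.

```python
import torch
import torch.nn as nn

class MemoryAugmentedAttention(nn.Module):
    """Tokens attend over a bank of learnable memory slots, biasing the
    autoencoder's reconstructions towards stored normal patterns."""
    def __init__(self, dim: int = 256, num_slots: int = 512, num_heads: int = 8):
        super().__init__()
        self.memory = nn.Parameter(torch.randn(num_slots, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:   # tokens: (B, N, dim)
        mem = self.memory.unsqueeze(0).expand(tokens.size(0), -1, -1)
        out, _ = self.attn(query=tokens, key=mem, value=mem)
        return tokens + out
```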
Deep learning methods have shown outstanding classification accuracy in medical imaging problems, which is largely attributed to the availability of large-scale datasets manually annotated with clean labels.
The accurate prediction by this model allows us to use a challenging combination of network, input data and feature perturbations to improve the consistency learning generalisation, where the feature perturbations consist of a new adversarial perturbation.
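For the feature perturbation, a VAT-style sketch is shown below as one plausible form of an adversarial perturbation in feature space; the two-step power iteration, step sizes, and function names are assumptions rather than the paper's exact perturbation.

```python
import torch
import torch.nn.functional as F

def adversarial_feature_perturbation(head, feats: torch.Tensor,
                                     xi: float = 1e-2, eps: float = 1.0) -> torch.Tensor:
    """Start from a small random direction, align it with the direction that most
    changes the prediction of `head`, then rescale it to a fixed norm."""
    with torch.no_grad():
        p_clean = F.softmax(head(feats), dim=-1)
    d = torch.randn_like(feats)
    d = xi * F.normalize(d.flatten(1), dim=1).view_as(feats)
    d.requires_grad_(True)
    p_pert = F.log_softmax(head(feats + d), dim=-1)
    adv_dist = F.kl_div(p_pert, p_clean, reduction="batchmean")
    grad = torch.autograd.grad(adv_dist, d)[0]
    return eps * F.normalize(grad.flatten(1), dim=1).view_as(feats)
```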
Effective semi-supervised learning (SSL) in medical image analysis (MIA) must address two challenges: 1) work effectively on both multi-class (e.g., lesion classification) and multi-label (e.g., multiple-disease diagnosis) problems, and 2) handle imbalanced learning (because of the high variance in disease prevalence).
However, previous uncertainty approaches that directly associate high uncertainty to anomaly may sometimes lead to incorrect anomaly predictions, and external reconstruction models tend to be too inefficient for real-time self-driving embedded systems.
Ranked #2 on Anomaly Detection on Lost and Found (using extra training data)
Pre-training UAD methods with self-supervised learning, based on computer vision techniques, can mitigate this challenge, but they are sub-optimal because they do not explore domain knowledge for designing the pretext tasks. Moreover, their contrastive learning losses do not try to cluster the normal training images, which may result in a sparse distribution of normal images that is ineffective for anomaly detection.
In this paper, we propose a new training module called Non-Volatile Unbiased Memory (NVUM), which non-volatilely stores a running average of the model logits used by a new regularization loss for the noisy multi-label problem.
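A minimal sketch of such a logit memory is shown below; the decay value and the way the stored average enters the regularization loss are illustrative assumptions, not the exact NVUM formulation.

```python
import torch
import torch.nn.functional as F

class LogitMemory:
    """Keeps a per-sample running average of the logits and regularises current
    predictions towards it, damping the effect of noisy multi-label annotations."""
    def __init__(self, num_samples: int, num_classes: int, decay: float = 0.9):
        self.memory = torch.zeros(num_samples, num_classes)   # one row per training sample
        self.decay = decay

    def update(self, indices: torch.Tensor, logits: torch.Tensor) -> None:
        idx = indices.cpu()
        self.memory[idx] = self.decay * self.memory[idx] + (1 - self.decay) * logits.detach().cpu()

    def regularise(self, indices: torch.Tensor, logits: torch.Tensor) -> torch.Tensor:
        # Encourage current multi-label predictions to stay consistent with their running average.
        target = torch.sigmoid(self.memory[indices.cpu()]).to(logits.device)
        return F.binary_cross_entropy_with_logits(logits, target)
```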
Unsupervised anomaly detection (UAD) learns one-class classifiers exclusively with normal (i.e., healthy) images to detect any abnormal (i.e., unhealthy) samples that do not conform to the expected normal patterns.
Ranked #1 on Anomaly Detection on LAG
To address this issue, we introduce a novel and theoretically sound method, named Robust Temporal Feature Magnitude learning (RTFM), which trains a feature magnitude learning function to effectively recognise the positive instances, substantially improving the robustness of the MIL approach to the negative instances from abnormal videos.
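A minimal sketch of a top-k feature-magnitude objective of this kind is below; the margin value, the hinge form, and the choice of k are assumptions, not RTFM's exact loss.

```python
import torch

def topk_magnitude_separation(normal_feats: torch.Tensor,
                              abnormal_feats: torch.Tensor,
                              k: int = 3, margin: float = 100.0) -> torch.Tensor:
    """Take the k snippets with the largest feature norms from each video and push
    abnormal-video magnitudes above normal-video magnitudes by a margin, so the
    positive (abnormal) instances are recognised under MIL supervision."""
    def topk_mean_norm(feats: torch.Tensor) -> torch.Tensor:   # feats: (batch, snippets, dim)
        norms = feats.norm(p=2, dim=-1)                        # (batch, snippets)
        return norms.topk(k, dim=-1).values.mean(dim=-1)
    m_normal = topk_mean_norm(normal_feats)
    m_abnormal = topk_mean_norm(abnormal_feats)
    return torch.relu(margin - m_abnormal + m_normal).mean()
```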
The adversarial interpolation is enforced to consistently learn a smooth Gaussian descriptor, even when the training data is small or contaminated with anomalous samples.
Ranked #2 on Anomaly Detection on MNIST (using extra training data)