no code implementations • 20 Nov 2023 • Eli Verwimp, Rahaf Aljundi, Shai Ben-David, Matthias Bethge, Andrea Cossu, Alexander Gepperth, Tyler L. Hayes, Eyke Hüllermeier, Christopher Kanan, Dhireesha Kudithipudi, Christoph H. Lampert, Martin Mundt, Razvan Pascanu, Adrian Popescu, Andreas S. Tolias, Joost Van de Weijer, Bing Liu, Vincenzo Lomonaco, Tinne Tuytelaars, Gido M. van de Ven
Continual learning is a sub-field of machine learning that aims to enable models to learn continuously from new data, accumulating knowledge without forgetting what was learned in the past.
no code implementations • 14 Sep 2023 • Eugene Vorontsov, Alican Bozkurt, Adam Casson, George Shaikovski, Michal Zelechowski, SiQi Liu, Philippe Mathieu, Alexander van Eck, Donghun Lee, Julian Viret, Eric Robert, Yi Kan Wang, Jeremy D. Kunz, Matthew C. H. Lee, Jan Bernhard, Ran A. Godrich, Gerard Oakley, Ewan Millar, Matthew Hanna, Juan Retamero, William A. Moye, Razik Yousfi, Christopher Kanan, David Klimstra, Brandon Rothrock, Thomas J. Fuchs
However, a major challenge to this objective is that for many specific computational pathology tasks the amount of data is inadequate for development.
We evaluate GRASP and other policies by conducting CL experiments on the large-scale ImageNet-1K and Places-LT image classification datasets.
Addressing this problem would enable learning new data with fewer network updates, resulting in increased computational efficiency.
Compared to REMIND and prior art, SIESTA is far more computationally efficient, enabling continual learning on ImageNet-1K in under 2 hours on a single GPU. Moreover, in the augmentation-free setting it matches the performance of the offline learner, a milestone critical to driving adoption of continual learning in real-world applications.
no code implementations • 8 Dec 2022 • Indranil Sur, Zachary Daniels, Abrar Rahman, Kamil Faber, Gianmarco J. Gallardo, Tyler L. Hayes, Cameron E. Taylor, Mustafa Burak Gurbuz, James Smith, Sahana Joshi, Nathalie Japkowicz, Michael Baron, Zsolt Kira, Christopher Kanan, Roberto Corizzo, Ajay Divakaran, Michael Piacentino, Jesse Hostetler, Aswin Raghavan
In this paper, we introduce the Lifelong Reinforcement Learning Components Framework (L2RLCF), which standardizes L2RL systems and assimilates different continual learning components (each addressing different aspects of the lifelong learning problem) into a unified system.
We achieve more than 95% of the network's performance on the CamVid and Cityscapes datasets, utilizing only 12.1% and 15.1% of the labeled data, respectively.
Previous work has shown that convolutional networks excel at extracting gaze features despite the presence of such artifacts.
We propose a new direction: modifying the network architecture to impose inductive biases that make the network robust to dataset bias.
Ranked #3 on Action Recognition on BAR
Real-time on-device continual learning is needed for new applications such as home robots, user personalization on smartphones, and augmented/virtual reality headsets.
Using this framing, we introduce an active sampling method that asks for examples from the tail of the data distribution and show that it outperforms classical active learning methods on Visual Genome.
GCRN consists of two separate graphs to predict object labels based on the contextual cues in the image: 1) a representation graph to learn object features based on the neighboring objects and 2) a context graph to explicitly capture contextual cues from the neighboring objects.
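The core operation behind such context graphs is neighbor feature aggregation. The sketch below is an illustrative, generic one-step aggregation (not GCRN's actual layers): each object's feature vector is averaged with those of its neighbors, so contextual cues propagate between adjacent objects.

```python
import numpy as np

def aggregate_neighbors(node_feats, adjacency):
    """One step of mean neighbor aggregation (illustrative sketch).

    node_feats: (N, D) array of per-object features.
    adjacency:  (N, N) binary matrix including self-loops.
    Each node's new feature is the mean of its own and its
    neighbors' features.
    """
    deg = adjacency.sum(axis=1, keepdims=True)
    return (adjacency @ node_feats) / np.maximum(deg, 1)

# Two connected objects: their features move toward each other.
A = np.array([[1.0, 1.0], [1.0, 1.0]])  # fully connected with self-loops
X = np.array([[0.0], [2.0]])
print(aggregate_neighbors(X, A))  # both nodes average to 1.0
```

Stacking several such steps lets cues from farther-away objects influence each label prediction.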
In this technical report, we present our approaches for the continual object detection track of the SODA10M challenge.
Humans are incredibly good at transferring knowledge from one domain to another, enabling rapid learning of new tasks.
We introduce a new dataset called Biased MNIST that enables assessment of robustness to multiple bias sources.
4 code implementations • 1 Apr 2021 • Vincenzo Lomonaco, Lorenzo Pellegrini, Andrea Cossu, Antonio Carta, Gabriele Graffieti, Tyler L. Hayes, Matthias De Lange, Marc Masana, Jary Pomponi, Gido van de Ven, Martin Mundt, Qi She, Keiland Cooper, Jeremy Forest, Eden Belouadah, Simone Calderara, German I. Parisi, Fabio Cuzzolin, Andreas Tolias, Simone Scardapane, Luca Antiga, Subutai Ahmad, Adrian Popescu, Christopher Kanan, Joost Van de Weijer, Tinne Tuytelaars, Davide Bacciu, Davide Maltoni
Learning continually from non-stationary data streams is a long-standing goal and a challenging problem in machine learning.
Replay is the reactivation of one or more neural patterns, which are similar to the activation patterns experienced during past waking experiences.
In continual learning, a system must incrementally learn from a non-stationary data stream without catastrophic forgetting.
Analogical reasoning tests such as Raven's Progressive Matrices (RPMs) are commonly used to measure non-verbal abstract reasoning in humans, and recently offline neural networks for the RPM problem have been proposed.
Artificial intelligence (AI) has been successful at solving numerous problems in machine perception.
Supervised classification methods often assume that evaluation data is drawn from the same distribution as training data and that all classes are present for training.
Humans can incrementally learn to do new visual detection tasks, a capability that remains a major challenge for today's computer vision systems.
In this work, we introduce Stream-51, a new dataset for streaming classification consisting of temporally correlated images from 51 distinct object categories and additional evaluation classes outside of the training distribution to test novelty recognition.
Out-of-distribution (OOD) testing is increasingly popular for evaluating a machine learning system's ability to generalize beyond the biases of a training set.
Traditionally, deep convolutional neural networks consist of a series of convolutional and pooling layers followed by one or more fully connected (FC) layers to perform the final classification.
Existing Visual Question Answering (VQA) methods tend to exploit dataset biases and spurious statistical correlations, instead of producing right answers for the right reasons.
We investigate applying convolutional neural network (CNN) architectures to facilitate aerial hyperspectral scene understanding and present a new hyperspectral dataset, AeroRIT, that is large enough for CNN training.
We found that input perturbation and temperature scaling yield the best performance on large-scale datasets regardless of the feature space regularization strategy.
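Temperature scaling, in the form popularized for out-of-distribution detection, simply divides the logits by a large temperature before the softmax, which spreads the scores of OOD inputs more than those of in-distribution inputs. A minimal sketch (the temperature value here is an arbitrary illustration, not one from the paper):

```python
import numpy as np

def temperature_scaled_score(logits, T=1000.0):
    """Max softmax probability after temperature scaling.

    Higher scores suggest in-distribution inputs; thresholding this
    score flags out-of-distribution samples. T is a hyperparameter
    tuned on held-out data.
    """
    z = logits / T
    z = z - z.max()                      # numerical stability
    p = np.exp(z) / np.exp(z).sum()
    return p.max()

# A confident (peaked) logit vector scores higher than a flat one.
print(temperature_scaled_score(np.array([10.0, 0.0, 0.0]), T=1.0))
print(temperature_scaled_score(np.array([0.0, 0.0, 0.0]), T=1.0))
```

Input perturbation is applied on top of this: the input is nudged in the direction that increases the scaled softmax score, which helps in-distribution samples more than OOD ones.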
While there is neuroscientific evidence that the brain replays compressed memories, existing methods for convolutional networks replay raw images.
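The idea of replaying compressed memories rather than raw images can be illustrated with a toy stand-in for the quantized feature replay used by such methods: store low-precision codes of mid-level features, then reconstruct approximate features when rehearsing. This is a simple uniform 8-bit quantizer, not the product quantization an actual system would likely use.

```python
import numpy as np

def quantize(features, num_levels=256):
    """Uniform 8-bit quantization of feature vectors for compact storage."""
    lo, hi = features.min(), features.max()
    scale = (hi - lo) / (num_levels - 1) if hi > lo else 1.0
    codes = np.round((features - lo) / scale).astype(np.uint8)
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Reconstruct approximate features from stored codes for replay."""
    return codes.astype(np.float32) * scale + lo
```

Storing one byte per feature dimension instead of a float (let alone a raw image) is what makes large replay buffers affordable on-device.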
Accurate eye segmentation can improve eye-gaze estimation and support interactive computing based on visual attention; however, existing eye segmentation methods suffer from issues such as person-dependent accuracy, lack of robustness, and an inability to be run in real-time.
Ranked #1 on Semantic Segmentation on OpenEDS
By combining streaming linear discriminant analysis with deep learning, we are able to outperform both incremental batch learning and streaming learning algorithms on both ImageNet ILSVRC-2012 and CORe50, a dataset that involves learning to classify from temporally ordered samples.
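Streaming linear discriminant analysis maintains per-class running means and a shared covariance that are updated one example at a time over deep features. The class below is an illustrative sketch of that idea, not the authors' exact implementation (the covariance update and shrinkage term here are simplified assumptions):

```python
import numpy as np

class StreamingLDA:
    """Minimal streaming LDA sketch: online class means + shared covariance."""

    def __init__(self, dim, num_classes, shrinkage=1e-2):
        self.mu = np.zeros((num_classes, dim))   # running class means
        self.count = np.zeros(num_classes)
        self.cov = np.eye(dim)                   # shared covariance estimate
        self.total = 0
        self.shrinkage = shrinkage

    def fit_one(self, x, y):
        """Update statistics with a single (feature, label) pair."""
        self.total += 1
        self.count[y] += 1
        delta = x - self.mu[y]
        self.mu[y] += delta / self.count[y]
        d = x - self.mu[y]
        # Welford-style running average of outer products
        self.cov += (np.outer(delta, d) - self.cov) / self.total

    def predict(self, x):
        """Linear discriminant score per class; return the argmax."""
        prec = np.linalg.inv(self.cov + self.shrinkage * np.eye(self.cov.shape[0]))
        scores = self.mu @ prec @ x \
            - 0.5 * np.einsum('kd,dj,kj->k', self.mu, prec, self.mu)
        return int(np.argmax(scores))
```

Because each update touches only running statistics, learning a new example is a constant-time operation with no gradient steps, which is what makes this attractive for streaming settings.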
Chart question answering (CQA) is a newly proposed visual question answering (VQA) task where an algorithm must answer questions about data visualizations, e.g., bar charts, pie charts, and line graphs.
Continual learning refers to the ability of a biological or artificial system to seamlessly learn from continuous streams of information while preventing catastrophic forgetting, i.e., a condition in which new incoming information strongly interferes with previously learned representations.
Our approach was to collect a novel, naturalistic, and multimodal dataset of eye+head movements when subjects performed everyday tasks while wearing a mobile eye tracker equipped with an inertial measurement unit and a 3D stereo camera.
Language grounded image understanding tasks have often been proposed as a method for evaluating progress in artificial intelligence.
Visual Question Answering (VQA) research is split into two camps: the first focuses on VQA datasets that require natural image understanding and the second focuses on synthetic datasets that test reasoning.
Most counting questions in visual question answering (VQA) datasets are simple and require no more than object detection.
Ranked #2 on Object Counting on TallyQA-Simple
We find that full rehearsal can eliminate catastrophic forgetting in a variety of streaming learning settings, with ExStream performing well using far less memory and computation.
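A generic way to do memory-bounded rehearsal is a fixed-capacity buffer filled by reservoir sampling, so the stored set stays a uniform sample of the stream. This is a hedged, generic sketch of the rehearsal idea, not ExStream's clustering-based buffer:

```python
import random

class RehearsalBuffer:
    """Fixed-capacity replay buffer filled by reservoir sampling."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.buffer = []
        self.n_seen = 0
        self.rng = random.Random(seed)

    def add(self, example):
        """Keep each of the n_seen stream items with equal probability."""
        self.n_seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(example)
        else:
            j = self.rng.randrange(self.n_seen)
            if j < self.capacity:
                self.buffer[j] = example

    def sample(self, k):
        """Draw a rehearsal mini-batch from memory."""
        return self.rng.sample(self.buffer, min(k, len(self.buffer)))
```

Full rehearsal corresponds to an unbounded buffer; the interesting regime studied in this line of work is how small the buffer can be made before forgetting returns.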
Deep learning continues to push state-of-the-art performance for the semantic segmentation of color (i.e., RGB) imagery; however, the lack of annotated data for many remote sensing sensors (e.g., hyperspectral imagery (HSI)) prevents researchers from taking advantage of this recent success.
These low-shot learning frameworks will reduce the manual image annotation burden and improve semantic segmentation performance for remote sensing imagery.
Humans and animals have the ability to continually acquire, fine-tune, and transfer knowledge and skills throughout their lifespan.
Bar charts are an effective way to convey numeric information, but today's algorithms cannot parse them.
In contrast to the spectra of ground-based images, aerial spectral images have low spatial resolution and suffer from higher noise interference.
Arguably, the best method for incremental class learning is iCaRL, but it requires storing training examples for each class, making it challenging to scale.
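The exemplar storage that makes iCaRL hard to scale supports a nearest-class-mean prediction rule: keep a small set of feature vectors per class and assign a test feature to the class whose exemplar mean is closest. A minimal illustrative version (the feature extraction step is assumed to happen elsewhere):

```python
import numpy as np

def nearest_mean_classify(x, exemplars):
    """iCaRL-style nearest-class-mean prediction (illustrative sketch).

    exemplars: dict mapping class label -> (n_k, D) array of stored
    feature vectors. Returns the label whose exemplar mean is closest
    to feature vector x in Euclidean distance.
    """
    best, best_dist = None, float('inf')
    for label, feats in exemplars.items():
        mu = feats.mean(axis=0)
        dist = np.linalg.norm(x - mu)
        if dist < best_dist:
            best, best_dist = label, dist
    return best
```

The memory cost is the stored exemplars themselves, which grows with the number of classes; that growth is exactly the scaling problem noted above.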
Analyzing spatio-temporal data like video is a challenging task that requires processing visual and temporal information effectively.
As a result, evaluation scores are inflated and predominantly determined by answering easier questions, making it difficult to compare different methods.
In this paper, we adapt state-of-the-art DCNN frameworks from computer vision for the semantic segmentation of MSI imagery.
Unmanned aircraft have decreased the cost required to collect remote sensing imagery, which has enabled researchers to collect high-spatial resolution data from multiple sensor modalities more frequently and easily.
In this paper, we present a novel robotic grasp detection system that predicts the best grasping pose of a parallel-plate robotic gripper for novel objects using the RGB-D image of the scene.
Ranked #4 on Robotic Grasping on Cornell Grasp Dataset
Recently, algorithms for object recognition and related tasks have become sufficiently proficient that new vision tasks can now be pursued.