Search Results for author: Gemma Roig

Found 43 papers, 18 papers with code

Human Gaze Boosts Object-Centered Representation Learning

no code implementations • 6 Jan 2025 • Timothy Schaumlöffel, Arthur Aubret, Gemma Roig, Jochen Triesch

To account for the importance of central vision in humans, we crop the visual area around the gaze location.

Gaze Prediction Object +2

Efficient Unsupervised Shortcut Learning Detection and Mitigation in Transformers

1 code implementation • 1 Jan 2025 • Lukas Kuhn, Sari Sadiya, Jorg Schlotterer, Christin Seifert, Gemma Roig

Shortcut learning, i.e., a model's reliance on undesired features not directly relevant to the task, is a major challenge that severely limits the applications of machine learning algorithms, particularly when deploying them to assist in making sensitive decisions, such as in medical diagnostics.

The Algonauts Project 2025 Challenge: How the Human Brain Makes Sense of Multimodal Movies

no code implementations • 31 Dec 2024 • Alessandro T. Gifford, Domenic Bersch, Marie St-Laurent, Basile Pinsard, Julie Boyle, Lune Bellec, Aude Oliva, Gemma Roig, Radoslaw M. Cichy

To promote further collaboration between biological and artificial intelligence researchers, we introduce the 2025 edition of the Algonauts Project challenge: How the Human Brain Makes Sense of Multimodal Movies (https://algonautsproject.com/).

On Explaining Knowledge Distillation: Measuring and Visualising the Knowledge Transfer Process

no code implementations • 18 Dec 2024 • Gereziher Adhane, Mohammad Mahdi Dehshibi, Dennis Vetter, David Masip, Gemma Roig

We refer to the features learned with the Teacher's guidance as distilled features and the features irrelevant to the task and ignored by the Student as residual features.

Knowledge Distillation Transfer Learning

Limited but consistent gains in adversarial robustness by co-training object recognition models with human EEG

no code implementations • 5 Sep 2024 • Manshan Guo, Bhavin Choksi, Sari Sadiya, Alessandro T. Gifford, Martina G. Vilas, Radoslaw M. Cichy, Gemma Roig

Previous works relied on brain data acquired in rodents or primates using invasive techniques, from specific regions of the brain, under non-natural conditions (anesthetized animals), and with stimulus datasets lacking diversity and naturalness.

Adversarial Robustness EEG +2

Human-inspired Explanations for Vision Transformers and Convolutional Neural Networks

1 code implementation • 4 Aug 2024 • Mahadev Prasad Panda, Matteo Tiezzi, Martina Vilas, Gemma Roig, Bjoern M. Eskofier, Dario Zanca

We introduce Foveation-based Explanations (FovEx), a novel human-inspired visual explainability (XAI) method for Deep Neural Networks.


Classification of freshwater snails of the genus Radomaniola with multimodal triplet networks

no code implementations • 29 Jul 2024 • Dennis Vetter, Muhammad Ahsan, Diana Delicado, Thomas A. Neubauer, Thomas Wilke, Gemma Roig

In this paper, we present our first proposal of a machine learning system for the classification of freshwater snails of the genus Radomaniola.


Position: An Inner Interpretability Framework for AI Inspired by Lessons from Cognitive Neuroscience

no code implementations • 3 Jun 2024 • Martina G. Vilas, Federico Adolfi, David Poeppel, Gemma Roig

Inner Interpretability is a promising emerging field tasked with uncovering the inner mechanisms of AI systems, though how to develop these mechanistic theories is still much debated.


Learning Object Semantic Similarity with Self-Supervision

no code implementations • 19 Apr 2024 • Arthur Aubret, Timothy Schaumlöffel, Gemma Roig, Jochen Triesch

To achieve this, the model exploits two distinct strategies: the visuo-language alignment ensures that different objects of the same category are represented similarly, whereas the temporal alignment leverages that objects from the same context are frequently seen in succession to make their representations more similar.

Object Semantic Similarity +2

Different Algorithms (Might) Uncover Different Patterns: A Brain-Age Prediction Case Study

no code implementations • 8 Feb 2024 • Tobias Ettling, Sari Saba-Sadiya, Gemma Roig

Few of our models achieved state-of-the-art performance on the specific data-set we utilized.


Learning Class and Domain Augmentations for Single-Source Open-Domain Generalization

no code implementations • 5 Nov 2023 • Prathmesh Bele, Valay Bundele, Avigyan Bhattacharya, Ankit Jha, Gemma Roig, Biplab Banerjee

Single-source open-domain generalization (SS-ODG) addresses the challenge of labeled source domains with supervision during training and unlabeled novel target domains during testing.

Domain Generalization

Analyzing Vision Transformers for Image Classification in Class Embedding Space

1 code implementation • NeurIPS 2023 • Martina G. Vilas, Timothy Schaumlöffel, Gemma Roig

Inspired by previous research in NLP, we demonstrate how the inner representations at any level of the hierarchy can be projected onto the learned class embedding space to uncover how these networks build categorical representations for their predictions.

Image Classification

Net2Brain: A Toolbox to compare artificial vision models with human brain responses

1 code implementation • 20 Aug 2022 • Domenic Bersch, Kshitij Dwivedi, Martina Vilas, Radoslaw M. Cichy, Gemma Roig

We introduce Net2Brain, a graphical and command-line user interface toolbox for comparing the representational spaces of artificial deep neural networks (DNNs) and human brain recordings.

Action Recognition Depth Estimation +2

Using Sentence Embeddings and Semantic Similarity for Seeking Consensus when Assessing Trustworthy AI

1 code implementation • 9 Aug 2022 • Dennis Vetter, Jesmin Jahan Tithi, Magnus Westerlund, Roberto V. Zicari, Gemma Roig

Therefore, a core challenge of the assessment process is to identify when experts from different disciplines talk about the same problem but use different terminologies.

Semantic Similarity Semantic Textual Similarity +2

What do navigation agents learn about their environment?

1 code implementation • CVPR 2022 • Kshitij Dwivedi, Gemma Roig, Aniruddha Kembhavi, Roozbeh Mottaghi

We use iSEE to probe the dynamic representations produced by these agents for the presence of information about the agent as well as the environment.

Visual Navigation

Predicting emotion from music videos: exploring the relative contribution of visual and auditory information to affective responses

1 code implementation • 19 Feb 2022 • Phoebe Chua, Dimos Makris, Dorien Herremans, Gemma Roig, Kat Agres

In this paper we present MusicVideos (MuVi), a novel dataset for affective multimedia content analysis to study how the auditory and visual modalities contribute to the perceived emotion of media.

Descriptive Feature Importance +2

FRIDA -- Generative Feature Replay for Incremental Domain Adaptation

no code implementations • 28 Dec 2021 • Sayan Rakshit, Anwesh Mohanty, Ruchika Chavhan, Biplab Banerjee, Gemma Roig, Subhasis Chaudhuri

Inspired by the notion of generative feature replay, we propose a novel framework called Feature Replay based Incremental Domain Adaptation (FRIDA) which leverages a new incremental generative adversarial network (GAN) called domain-generic auxiliary classification GAN (DGAC-GAN) for producing domain-specific feature representations seamlessly.

Generative Adversarial Network Unsupervised Domain Adaptation

AttendAffectNet–Emotion Prediction of Movie Viewers Using Multimodal Fusion with Self-Attention

1 code implementation • Sensors 2021 • Ha Thi Phuong Thao, B T Balamurali, Gemma Roig, Dorien Herremans

The models that use all visual, audio, and text features simultaneously as their inputs performed better than those using features extracted from each modality separately.

Representation Learning

AttendAffectNet: Self-Attention based Networks for Predicting Affective Responses from Movies

1 code implementation • 21 Oct 2020 • Ha Thi Phuong Thao, Balamurali B. T., Dorien Herremans, Gemma Roig

In this work, we propose different variants of the self-attention based network for emotion prediction from movies, which we call AttendAffectNet.


Duality Diagram Similarity: a generic framework for initialization selection in task transfer learning

2 code implementations • ECCV 2020 • Kshitij Dwivedi, Jiahui Huang, Radoslaw Martin Cichy, Gemma Roig

In this paper, we tackle an open research question in transfer learning, which is selecting a model initialization to achieve high performance on a new task, given several pre-trained models.

Model Selection Semantic Segmentation +1

Using Human Psychophysics to Evaluate Generalization in Scene Text Recognition Models

no code implementations • 30 Jun 2020 • Sahar Siddiqui, Elena Sizikova, Gemma Roig, Najib J. Majaj, Denis G. Pelli

Relative to the attention-based (Attn) model, we discover that the connectionist temporal classification (CTC) model is more robust to noise and occlusion, and better at generalizing to different word lengths.

Scene Text Recognition

LCD: Learned Cross-Domain Descriptors for 2D-3D Matching

1 code implementation • 21 Nov 2019 • Quang-Hieu Pham, Mikaela Angelina Uy, Binh-Son Hua, Duc Thanh Nguyen, Gemma Roig, Sai-Kit Yeung

In this work, we present a novel method to learn a local cross-domain descriptor for 2D image and 3D point cloud matching.

3D Point Cloud Matching Depth Estimation +1

Predictive Coding Networks Meet Action Recognition

no code implementations • 22 Oct 2019 • Xia Huang, Hossein Mousavi, Gemma Roig

In this way, the model only relies on the video frames, and does not need pre-processed optical flows as input.

Action Recognition Optical Flow Estimation

Latent space representation for multi-target speaker detection and identification with a sparse dataset using Triplet neural networks

1 code implementation • 1 Oct 2019 • Kin Wai Cheuk, Balamurali B. T., Gemma Roig, Dorien Herremans

When reducing the training data to only using the train set, our method results in 309 confusions for the Multi-target speaker identification task, which is 46% better than the baseline model.

Speaker Identification Speaker Recognition +1

Multimodal Deep Models for Predicting Affective Responses Evoked by Movies

1 code implementation • 16 Sep 2019 • Ha Thi Phuong Thao, Dorien Herremans, Gemma Roig

Interestingly, we also observe that the optical flow is more informative than the RGB in videos, and overall, models using audio features are more accurate than those based on video features when making the final prediction of evoked emotions.

Optical Flow Estimation

Representation Similarity Analysis for Efficient Task taxonomy & Transfer Learning

2 code implementations • CVPR 2019 • Kshitij Dwivedi, Gemma Roig

We next evaluate the relationship of RSA with the transfer learning performance on Taskonomy tasks and a new task: Pascal VOC semantic segmentation.

Segmentation Semantic Segmentation +1

Deep Anchored Convolutional Neural Networks

no code implementations • 22 Apr 2019 • Jiahui Huang, Kshitij Dwivedi, Gemma Roig

Convolutional Neural Networks (CNNs) have been proven to be extremely successful at solving computer vision tasks.

Few-Shot Regression via Learned Basis Functions

no code implementations • ICLR Workshop LLD 2019 • Yi Loo, Swee Kiat Lim, Gemma Roig, Ngai-Man Cheung

We show that our model outperforms the current state of the art meta-learning methods in various regression tasks.

Few-Shot Learning regression

DOPING: Generative Data Augmentation for Unsupervised Anomaly Detection with GAN

no code implementations • 23 Aug 2018 • Swee Kiat Lim, Yi Loo, Ngoc-Trung Tran, Ngai-Man Cheung, Gemma Roig, Yuval Elovici

To the best of our knowledge, our method is the first data augmentation technique focused on improving performance in unsupervised anomaly detection.

Data Augmentation Generative Adversarial Network +1

Do Deep Neural Networks Suffer from Crowding?

2 code implementations • NeurIPS 2017 • Anna Volokitin, Gemma Roig, Tomaso Poggio

Also, for all tested networks, when trained on targets in isolation, we find that recognition accuracy of the networks decreases the closer the flankers are to the target and the more flankers there are.

Object Recognition

Herding Generalizes Diverse M-Best Solutions

no code implementations • 14 Nov 2016 • Ece Ozkan, Gemma Roig, Orcun Goksel, Xavier Boix

We show that the algorithm to extract diverse M-solutions from a Conditional Random Field (called divMbest [1]) takes exactly the form of a Herding procedure [2], i.e. a deterministic dynamical system that produces a sequence of hypotheses that respect a set of observed moment constraints.

Semantic Segmentation

Foveation-based Mechanisms Alleviate Adversarial Examples

no code implementations • 19 Nov 2015 • Yan Luo, Xavier Boix, Gemma Roig, Tomaso Poggio, Qi Zhao

To see this, first, we report results in ImageNet that lead to a revision of the hypothesis that adversarial perturbations are a consequence of CNNs acting as a linear classifier: CNNs act locally linearly to changes in the image regions with objects recognized by the CNN, and in other regions the CNN may act non-linearly.

Foveation Translation

Self-Adaptable Templates for Feature Coding

no code implementations • NeurIPS 2014 • Xavier Boix, Gemma Roig, Salomon Diether, Luc V. Gool

Within this processing pipeline, the common trend is to learn the feature coding templates, often referred to as codebook entries, filters, or over-complete basis.

Image Classification Object Recognition +1

Comment on "Ensemble Projection for Semi-supervised Image Classification"

no code implementations • 29 Aug 2014 • Xavier Boix, Gemma Roig, Luc van Gool

In a series of papers by Dai and colleagues [1, 2], a feature map (or kernel) was introduced for semi- and unsupervised learning.

Classification General Classification +1

SEEDS: Superpixels Extracted via Energy-Driven Sampling

1 code implementation • 16 Sep 2013 • Michael Van den Bergh, Xavier Boix, Gemma Roig, Luc van Gool

We define a robust and fast to evaluate energy function, based on enforcing color similarity between the boundaries and the superpixel color histogram.


Random Binary Mappings for Kernel Learning and Efficient SVM

no code implementations • 19 Jul 2013 • Gemma Roig, Xavier Boix, Luc van Gool

SVMs suffer from various drawbacks in terms of selecting the right kernel, which depends on the image descriptors, as well as computational and memory efficiency.

Attribute Quantization
