Search Results for author: Ioannis Patras

Found 76 papers, 43 papers with code

Temporal Score Analysis for Understanding and Correcting Diffusion Artifacts

no code implementations CVPR 2025 Yu Cao, Zengqun Zhao, Ioannis Patras, Shaogang Gong

Visual artifacts remain a persistent challenge in diffusion models, even with training on massive datasets.

Denoising

AIM-Fair: Advancing Algorithmic Fairness via Selectively Fine-Tuning Biased Models with Contextual Synthetic Data

1 code implementation CVPR 2025 Zengqun Zhao, Ziquan Liu, Yu Cao, Shaogang Gong, Ioannis Patras

Two key challenges are identified in this fine-tuning paradigm, 1) the low quality of synthetic data, which can still happen even with advanced generative models, and 2) the domain and bias gap between real and synthetic data.

Diversity Fairness +1

P-TAME: Explain Any Image Classifier with Trained Perturbations

no code implementations 29 Jan 2025 Mariano V. Ntrougkas, Vasileios Mezaris, Ioannis Patras

The adoption of Deep Neural Networks (DNNs) in critical fields where predictions need to be accompanied by justifications is hindered by their inherent black-box nature.

A Comprehensive Social Bias Audit of Contrastive Vision Language Models

no code implementations 22 Jan 2025 Zahraa Al Sahili, Ioannis Patras, Matthew Purver

In the domain of text-to-image generative models, biases inherent in training datasets often propagate into generated content, posing significant ethical challenges, particularly in socially sensitive contexts.

Diversity Fairness +1

VidCtx: Context-aware Video Question Answering with Image Models

1 code implementation 23 Dec 2024 Andreas Goulas, Vasileios Mezaris, Ioannis Patras

To address those shortcomings, in this paper, we introduce VidCtx, a novel training-free VideoQA framework which integrates both modalities, i.e., both visual information from input frames and textual descriptions of other frames that give the appropriate context.

Large Language Model Zero-Shot Video Question Answer

Multimodal Outer Arithmetic Block Dual Fusion of Whole Slide Images and Omics Data for Precision Oncology

no code implementations 26 Nov 2024 Omnia Alwazzan, Amaya Gallagher-Syed, Thomas O. Millner, Sebastian Brandner, Ioannis Patras, Silvia Marino, Gregory Slabaugh

In this paper, we propose the use of omic embeddings during early and late fusion to capture complementary information from local (patch-level) to global (slide-level) interactions, boosting performance through multimodal integration.

Diagnostic Multiple Instance Learning +2

ReWind: Understanding Long Videos with Instructed Learnable Memory

no code implementations CVPR 2025 Anxhelo Diko, Tinghuai Wang, Wassim Swaileh, Shiyan Sun, Ioannis Patras

We empirically demonstrate ReWind's superior performance in visual question answering (VQA) and temporal grounding tasks, surpassing previous methods on long video benchmarks.

Large Language Model Question Answering +2

Behaviour4All: in-the-wild Facial Behaviour Analysis Toolkit

no code implementations 26 Sep 2024 Dimitrios Kollias, Chunchang Shao, Odysseus Kaloidas, Ioannis Patras

In this paper, we introduce Behavior4All, a comprehensive, open-source toolkit for in-the-wild facial behavior analysis, integrating Face Localization, Valence-Arousal Estimation, Basic Expression Recognition and Action Unit Detection, all within a single framework.

Action Unit Detection Arousal Estimation +1

CLIPCleaner: Cleaning Noisy Labels with CLIP

1 code implementation 19 Aug 2024 Chen Feng, Georgios Tzimiropoulos, Ioannis Patras

This has the advantage that the sample selection is decoupled from the in-training model and that the sample selection is aware of the semantic and visual similarities between the classes due to the way that CLIP is trained.

Learning with noisy labels

Are CLIP features all you need for Universal Synthetic Image Origin Attribution?

1 code implementation 17 Aug 2024 Dario Cioni, Christos Tzelepis, Lorenzo Seidenari, Ioannis Patras

The steady improvement of Diffusion Models for visual synthesis has given rise to many new and interesting use cases of synthetic images but also has raised concerns about their potential abuse, which poses significant societal threats.

All

Get Confused Cautiously: Textual Sequence Memorization Erasure with Selective Entropy Maximization

no code implementations 9 Aug 2024 Zhaohan Zhang, Ziquan Liu, Ioannis Patras

To achieve a better trade-off between the effectiveness of TSM erasure and model utility in LLMs, our paper proposes a new framework based on Entropy Maximization with Selective Optimization (EMSO), where the updated weights are chosen with a novel contrastive gradient metric without any participation of additional model or data.

Memorization Text Generation

Multimodal Machine Learning in Mental Health: A Survey of Data, Algorithms, and Challenges

no code implementations 23 Jul 2024 Zahraa Al Sahili, Ioannis Patras, Matthew Purver

Multimodal machine learning (MML) is rapidly reshaping the way mental-health disorders are detected, characterized, and longitudinally monitored.

cross-modal alignment Fairness +2

Efficient Unsupervised Visual Representation Learning with Explicit Cluster Balancing

1 code implementation 15 Jul 2024 Ioannis Maniadis Metaxas, Georgios Tzimiropoulos, Ioannis Patras

We conduct extensive experiments to evaluate our approach and demonstrate that ExCB: a) achieves state-of-the-art results with significantly reduced resource requirements compared to previous works, b) is fully online, and therefore scalable to large datasets, and c) is stable and effective even with very small batch sizes.

Representation Learning Self-Supervised Learning

FairCoT: Enhancing Fairness in Diffusion Models via Chain of Thought Reasoning of Multimodal Language Models

no code implementations 13 Jun 2024 Zahraa Al Sahili, Ioannis Patras, Matthew Purver

In the domain of text-to-image generative models, biases inherent in training datasets often propagate into generated content, posing significant ethical challenges, particularly in socially sensitive contexts.

Attribute Diversity +1

Enhancing Zero-Shot Facial Expression Recognition by LLM Knowledge Transfer

1 code implementation 29 May 2024 Zengqun Zhao, Yu Cao, Shaogang Gong, Ioannis Patras

Current facial expression recognition (FER) models are often designed in a supervised learning manner and thus are constrained by the lack of large-scale facial expression images with high-quality annotations.

Facial Expression Recognition (FER) Transfer Learning +1

FashionSD-X: Multimodal Fashion Garment Synthesis using Latent Diffusion

no code implementations 26 Apr 2024 Abhishek Kumar Singh, Ioannis Patras

The rapid evolution of the fashion industry increasingly intersects with technological advancements, particularly through the integration of generative AI.

Virtual Try-on

DiffusionAct: Controllable Diffusion Autoencoder for One-shot Face Reenactment

no code implementations 25 Mar 2024 Stella Bounareli, Christos Tzelepis, Vasileios Argyriou, Ioannis Patras, Georgios Tzimiropoulos

To this end, in this paper we present DiffusionAct, a novel method that leverages the photo-realistic image generation of diffusion models to perform neural face reenactment.

Face Reenactment Image Generation

LAFS: Landmark-based Facial Self-supervised Learning for Face Recognition

2 code implementations CVPR 2024 Zhonglin Sun, Chen Feng, Ioannis Patras, Georgios Tzimiropoulos

This enables our method, namely LAndmark-based Facial Self-supervised learning (LAFS), to learn key representations that are more critical for face recognition.

Diversity Face Recognition +1

MOAB: Multi-Modal Outer Arithmetic Block For Fusion Of Histopathological Images And Genetic Data For Brain Tumor Grading

1 code implementation 11 Mar 2024 Omnia Alwazzan, Abbas Khan, Ioannis Patras, Gregory Slabaugh

We propose a novel Multi-modal Outer Arithmetic Block (MOAB) based on arithmetic operations to combine latent representations of the different modalities for predicting the tumor grade (Grade II, III and IV).

Prognosis

FOAA: Flattened Outer Arithmetic Attention For Multimodal Tumor Classification

1 code implementation 10 Mar 2024 Omnia Alwazzan, Ioannis Patras, Gregory Slabaugh

Fusion of multimodal healthcare data holds great promise to provide a holistic view of a patient's health, taking advantage of the complementarity of different modalities while leveraging their correlation.

Self-Supervised Facial Representation Learning with Facial Region Awareness

no code implementations CVPR 2024 Zheng Gao, Ioannis Patras

Recent efforts toward this goal are limited to treating each face image as a whole, i.e., learning consistent facial representations at the image-level, which overlooks the consistency of local facial representations (i.e., facial regions like eyes, nose, etc.).

Deep Clustering Representation Learning +2

Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization

2 code implementations 19 Feb 2024 James Oldfield, Markos Georgopoulos, Grigorios G. Chrysos, Christos Tzelepis, Yannis Panagakis, Mihalis A. Nicolaou, Jiankang Deng, Ioannis Patras

The Mixture of Experts (MoE) paradigm provides a powerful way to decompose dense layers into smaller, modular computations often more amenable to human interpretation, debugging, and editability.

Attribute counterfactual +1

One-shot Neural Face Reenactment via Finding Directions in GAN's Latent Space

no code implementations 5 Feb 2024 Stella Bounareli, Christos Tzelepis, Vasileios Argyriou, Ioannis Patras, Georgios Tzimiropoulos

Moreover, we show that by embedding real images in the GAN latent space, our method can be successfully used for the reenactment of real-world faces.

Disentanglement Face Reenactment

EmoCLIP: A Vision-Language Method for Zero-Shot Video Facial Expression Recognition

1 code implementation 25 Oct 2023 Niki Maria Foteinopoulou, Ioannis Patras

To test this, we evaluate using zero-shot classification of the model trained on sample-level descriptions on four popular dynamic FER datasets.

Facial Expression Recognition (FER) Language Modelling +3

Prompting Visual-Language Models for Dynamic Facial Expression Recognition

1 code implementation 25 Aug 2023 Zengqun Zhao, Ioannis Patras

For the visual part, based on the CLIP image encoder, a temporal model consisting of several Transformer encoders is introduced for extracting temporal facial expression features, and the final feature embedding is obtained as a learnable "class" token.

Dynamic Facial Expression Recognition Facial Expression Recognition +1

Self-Supervised Representation Learning with Cross-Context Learning between Global and Hypercolumn Features

no code implementations 25 Aug 2023 Zheng Gao, Chen Feng, Ioannis Patras

Inspired by cross-modality learning, we extend this existing framework that only learns from global features by encouraging the global features and intermediate layer features to learn from each other.

Contrastive Learning Knowledge Distillation +1

HyperReenact: One-Shot Reenactment via Jointly Learning to Refine and Retarget Faces

1 code implementation ICCV 2023 Stella Bounareli, Christos Tzelepis, Vasileios Argyriou, Ioannis Patras, Georgios Tzimiropoulos

In this paper, we present our method for neural face reenactment, called HyperReenact, that aims to generate realistic talking head images of a source identity, driven by a target facial pose.

Face Reenactment

Parts of Speech-Grounded Subspaces in Vision-Language Models

2 code implementations 23 May 2023 James Oldfield, Christos Tzelepis, Yannis Panagakis, Mihalis A. Nicolaou, Ioannis Patras

Latent image representations arising from vision-language models have proved immensely useful for a variety of downstream tasks.

Image Generation POS +2

DivClust: Controlling Diversity in Deep Clustering

1 code implementation CVPR 2023 Ioannis Maniadis Metaxas, Georgios Tzimiropoulos, Ioannis Patras

Clustering has been a major research topic in the field of machine learning, one to which Deep Learning has recently been applied with significant success.

Clustering Deep Clustering +1

MaskCon: Masked Contrastive Learning for Coarse-Labelled Dataset

1 code implementation CVPR 2023 Chen Feng, Ioannis Patras

More specifically, within the contrastive learning framework, for each sample our method generates soft-labels with the aid of coarse labels against other samples and another augmented view of the sample in question.

Contrastive Learning Learning with coarse labels

Attribute-preserving Face Dataset Anonymization via Latent Code Optimization

1 code implementation CVPR 2023 Simone Barattin, Christos Tzelepis, Ioannis Patras, Nicu Sebe

By optimizing the latent codes directly, we ensure both that the identity is of a desired distance away from the original (with an identity obfuscation loss), whilst preserving the facial attributes (using a novel feature-matching loss in FaRL's deep feature space).

Attribute

Motor Imagery Decoding Using Ensemble Curriculum Learning and Collaborative Training

1 code implementation 21 Nov 2022 Georgios Zoumpourlis, Ioannis Patras

The first loss applies curriculum learning, forcing each feature extractor to specialize to a subset of the training subjects and promoting feature diversity.

Anatomy Domain Generalization +3

StyleMask: Disentangling the Style Space of StyleGAN2 for Neural Face Reenactment

1 code implementation 27 Sep 2022 Stella Bounareli, Christos Tzelepis, Vasileios Argyriou, Ioannis Patras, Georgios Tzimiropoulos

In this paper we address the problem of neural face reenactment, where, given a pair of a source and a target facial image, we need to transfer the target's pose (defined as the head pose and its facial expressions) to the source image, while preserving the source's identity characteristics (e.g., facial shape, hair style, etc.), even in the challenging case where the source and the target faces belong to different identities.

Disentanglement Face Reenactment

Capsule Network based Contrastive Learning of Unsupervised Visual Representations

1 code implementation 22 Sep 2022 Harsh Panwar, Ioannis Patras

Capsule Networks have shown tremendous advancement in the past decade, outperforming traditional CNNs in various tasks due to their equivariant properties.

Contrastive Learning image-classification +1

Adaptive Soft Contrastive Learning

1 code implementation 22 Jul 2022 Chen Feng, Ioannis Patras

Self-supervised learning has recently achieved great success in representation learning without human annotations.

Contrastive Learning Representation Learning +1

Learning from Label Relationships in Human Affect

1 code implementation 12 Jul 2022 Niki Maria Foteinopoulou, Ioannis Patras

In the case of affect recognition, we outperform previous vision-based methods in terms of CCC on both the OMG and the AMIGOS datasets.

Continuous Affect Estimation regression +1

Summarizing Videos using Concentrated Attention and Considering the Uniqueness and Diversity of the Video Frames

1 code implementation ACM ICMR 2022 Evlampios Apostolidis, Georgios Balaouras, Vasileios Mezaris, Ioannis Patras

Instead of simply modeling the frames' dependencies based on global attention, our method integrates a concentrated attention mechanism that is able to focus on non-overlapping blocks in the main diagonal of the attention matrix, and to enrich the existing information by extracting and exploiting knowledge about the uniqueness and diversity of the associated frames of the video.

Benchmarking Diversity +1

ContraCLIP: Interpretable GAN generation driven by pairs of contrasting sentences

1 code implementation 5 Jun 2022 Christos Tzelepis, James Oldfield, Georgios Tzimiropoulos, Ioannis Patras

This work addresses the problem of discovering non-linear interpretable paths in the latent space of pre-trained GANs in a model-agnostic manner.

Position

PandA: Unsupervised Learning of Parts and Appearances in the Feature Maps of GANs

1 code implementation 31 May 2022 James Oldfield, Christos Tzelepis, Yannis Panagakis, Mihalis A. Nicolaou, Ioannis Patras

Recent advances in the understanding of Generative Adversarial Networks (GANs) have led to remarkable progress in visual editing and synthesis tasks, capitalizing on the rich semantics that are embedded in the latent spaces of pre-trained GANs.

Tensor Component Analysis for Interpreting the Latent Space of GANs

no code implementations 23 Nov 2021 James Oldfield, Markos Georgopoulos, Yannis Panagakis, Mihalis A. Nicolaou, Ioannis Patras

This paper addresses the problem of finding interpretable directions in the latent space of pre-trained Generative Adversarial Networks (GANs) to facilitate controllable image synthesis.

Image Generation

SSR: An Efficient and Robust Framework for Learning with Unknown Label Noise

1 code implementation 22 Nov 2021 Chen Feng, Georgios Tzimiropoulos, Ioannis Patras

Under this setting, unlike previous methods that often introduce multiple assumptions and lead to complex solutions, we propose a simple, efficient and robust framework named Sample Selection and Relabelling (SSR), which achieves SOTA results in various conditions with a minimal number of hyperparameters.

Learning with noisy labels Self-Supervised Learning +1

WarpedGANSpace: Finding non-linear RBF paths in GAN latent space

1 code implementation ICCV 2021 Christos Tzelepis, Georgios Tzimiropoulos, Ioannis Patras

This work addresses the problem of discovering, in an unsupervised manner, interpretable paths in the latent space of pretrained GANs, so as to provide an intuitive and easy way of controlling the underlying generative factors.

DnS: Distill-and-Select for Efficient and Accurate Video Indexing and Retrieval

1 code implementation 24 Jun 2021 Giorgos Kordopatis-Zilos, Christos Tzelepis, Symeon Papadopoulos, Ioannis Kompatsiaris, Ioannis Patras

In this work, we propose a Knowledge Distillation framework, called Distill-and-Select (DnS), that starting from a well-performing fine-grained Teacher Network learns: a) Student Networks at different retrieval performance and computational efficiency trade-offs and b) a Selector Network that at test time rapidly directs samples to the appropriate student to maintain both high retrieval performance and high computational efficiency.

Computational Efficiency Knowledge Distillation +2

Few-Shot Action Localization without Knowing Boundaries

1 code implementation 8 Jun 2021 Ting-Ting Xie, Christos Tzelepis, Fan Fu, Ioannis Patras

Learning to localize actions in long, cluttered, and untrimmed videos is a hard task, that in the literature has typically been addressed assuming the availability of large amounts of annotated training samples for each class -- either in a fully-supervised setting, where action boundaries are known, or in a weakly-supervised setting, where only class labels are known for each video.

Action Localization Few-Shot Learning

Relationship-based Neural Baby Talk

no code implementations 8 Mar 2021 Fan Fu, TingTing Xie, Ioannis Patras, Sepehr Jalali

Understanding interactions between objects in an image is an important element for generating captions.

Caption Generation Graph Attention

Uncertainty Propagation in Convolutional Neural Networks: Technical Report

2 code implementations 11 Feb 2021 Christos Tzelepis, Ioannis Patras

In this technical report we study the problem of propagation of uncertainty (in terms of variances of given uni-variate normal random variables) through typical building blocks of a Convolutional Neural Network (CNN).

Video Summarization Using Deep Neural Networks: A Survey

no code implementations 15 Jan 2021 Evlampios Apostolidis, Eleni Adamantidou, Alexandros I. Metsai, Vasileios Mezaris, Ioannis Patras

Video summarization technologies aim to create a concise and complete synopsis by selecting the most informative parts of the video content.

Deep Learning Survey +1

Temporal Action Localization with Variance-Aware Networks

no code implementations 25 Aug 2020 Ting-Ting Xie, Christos Tzelepis, Ioannis Patras

Results in the action localization problem show that the incorporation of second order statistics improves over the baseline network, and that VANp surpasses the accuracy of virtually all other two-stage networks without involving any additional parameters.

regression Temporal Action Localization

Boundary Uncertainty in a Single-Stage Temporal Action Localization Network

no code implementations 25 Aug 2020 Ting-Ting Xie, Christos Tzelepis, Ioannis Patras

We use two uncertainty-aware boundary regression losses: first, the Kullback-Leibler divergence between the ground truth location of the boundary and the Gaussian modeling the prediction of the boundary and second, the expectation of the $\ell_1$ loss under the same Gaussian.

Temporal Action Localization

Unsupervised Video Summarization via Attention-Driven Adversarial Learning

1 code implementation MultiMedia Modeling (MMM) 2019 Evlampios Apostolidis, Eleni Adamantidou, Alexandros I. Metsai, Vasileios Mezaris, Ioannis Patras

Experimental evaluation on two datasets (SumMe and TVSum) documents the contribution of the attention auto-encoder to faster and more stable training of the model, resulting in a significant performance improvement with respect to the original model and demonstrating the competitiveness of the proposed SUM-GAN-AAE against the state of the art.

Unsupervised Video Summarization

ViSiL: Fine-grained Spatio-Temporal Video Similarity Learning

1 code implementation ICCV 2019 Giorgos Kordopatis-Zilos, Symeon Papadopoulos, Ioannis Patras, Ioannis Kompatsiaris

Subsequently, the similarity matrix between all video frames is fed to a four-layer CNN, and then summarized using Chamfer Similarity (CS) into a video-to-video similarity score -- this avoids feature aggregation before the similarity calculation between videos and captures the temporal similarity patterns between matching frame sequences.

ISVR Retrieval +3

TARN: Temporal Attentive Relation Network for Few-Shot and Zero-Shot Action Recognition

no code implementations 21 Jul 2019 Mina Bishay, Georgios Zoumpourlis, Ioannis Patras

At the heart of our network is a meta-learning approach that learns to compare representations of variable temporal length, that is, either two videos of different length (in the case of few-shot action recognition) or a video and a semantic representation such as word vector (in the case of zero-shot action recognition).

Few-Shot action recognition Few Shot Action Recognition +5

Registration-free Face-SSD: Single shot analysis of smiles, facial attributes, and affect in the wild

no code implementations 11 Feb 2019 Youngkyoon Jang, Hatice Gunes, Ioannis Patras

In this paper, we present a novel single shot face-related task analysis method, called Face-SSD, for detecting faces and for performing various face-related (classification/regression) tasks including smile recognition, face attribute prediction and valence-arousal estimation in the wild.

Arousal Estimation Attribute +2

FIVR: Fine-grained Incident Video Retrieval

1 code implementation 11 Sep 2018 Giorgos Kordopatis-Zilos, Symeon Papadopoulos, Ioannis Patras, Ioannis Kompatsiaris

To create the dataset, we devise a process for the collection of YouTube videos based on major news events from recent years crawled from Wikipedia and deploy a retrieval pipeline for the automatic selection of query videos based on their estimated suitability as benchmarks.

Benchmarking Retrieval +1

SchiNet: Automatic Estimation of Symptoms of Schizophrenia from Facial Behaviour Analysis

no code implementations 7 Aug 2018 Mina Bishay, Petar Palasek, Stefan Priebe, Ioannis Patras

Patients with schizophrenia often display impairments in the expression of emotion and speech and those are observed in their facial behaviour.

Semi-supervised Fisher vector network

no code implementations 13 Jan 2018 Petar Palasek, Ioannis Patras

In this work we explore how the architecture proposed in [8], which expresses the processing steps of the classical Fisher vector pipeline approaches, i.e., dimensionality reduction by principal component analysis (PCA) projection, Gaussian mixture model (GMM) and Fisher vector descriptor extraction as network layers, can be modified into a hybrid network that combines the benefits of both unsupervised and supervised training methods, resulting in a model that learns a semi-supervised Fisher vector descriptor of the input data.

Action Recognition Classification +5

Deep Globally Constrained MRFs for Human Pose Estimation

no code implementations ICCV 2017 Ioannis Marras, Petar Palasek, Ioannis Patras

We overcome this by introducing a Markov Random Field (MRF)-based spatial model network between the coarse and the refinement model that introduces geometric constraints on the relative locations of the body joints.

Pose Estimation

Discriminative convolutional Fisher vector network for action recognition

no code implementations 19 Jul 2017 Petar Palasek, Ioannis Patras

In this work we propose a novel neural network architecture for the problem of human action recognition in videos.

Action Recognition In Videos Dimensionality Reduction +1

Unsupervised convolutional neural networks for motion estimation

no code implementations 22 Jan 2016 Aria Ahmadi, Ioannis Patras

In this paper, we propose a direct method and train a Convolutional Neural Network (CNN) that, when given a pair of images as input at test time, produces a dense motion field F at its output layer.

Motion Estimation Optical Flow Estimation

Face Alignment Assisted by Head Pose Estimation

1 code implementation 11 Jul 2015 Heng Yang, Wenxuan Mou, Yichi Zhang, Ioannis Patras, Hatice Gunes, Peter Robinson

In this paper we propose a supervised initialization scheme for cascaded face alignment based on explicit head pose estimation.

Face Alignment Head Pose Estimation

Linear Maximum Margin Classifier for Learning from Uncertain Data

1 code implementation 15 Apr 2015 Christos Tzelepis, Vasileios Mezaris, Ioannis Patras

In this paper, we propose a maximum margin classifier that deals with uncertainty in data input.

Mirror, mirror on the wall, tell me, is the error small?

no code implementations CVPR 2015 Heng Yang, Ioannis Patras

Our experiments lead to several interesting findings: 1) Surprisingly, most state-of-the-art methods struggle to preserve the mirror symmetry, despite the fact that they do have very similar overall performance on the original and mirror images; 2) the low mirrorability is not caused by training or testing sample bias - all algorithms are trained on both the original images and their mirrored versions; 3) the mirror error is strongly correlated to the localization/alignment error (with correlation coefficients around 0.7).

Face Alignment Pose Estimation
