1 code implementation • 20 Oct 2022 • Alexandros Stergiou, Dima Damen
A key function of auditory cognition is the association of characteristic sounds with their corresponding semantics over time.
Ranked #1 on Audio Classification on EPIC-KITCHENS-100
2 code implementations • 28 Apr 2022 • Alexandros Stergiou, Dima Damen
Early action prediction deals with inferring the ongoing action from partially observed videos, typically at the outset of the video.
Ranked #1 on Early Action Prediction on Something-Something V2
1 code implementation • 1 Nov 2021 • Alexandros Stergiou, Ronald Poppe
We evaluate adaUnPool on image and video super-resolution and frame interpolation.
no code implementations • 5 Oct 2021 • Alexandros Stergiou
Hierarchical feature extraction models variations between relatively similar classes in the same way as variations between very dissimilar classes.
1 code implementation • 29 Jan 2021 • Alexandros Stergiou
Visual interpretability of Convolutional Neural Networks (CNNs) has gained significant popularity because of the great challenges that CNN complexity poses for understanding their inner workings.
3 code implementations • ICCV 2021 • Alexandros Stergiou, Ronald Poppe, Grigorios Kalliatakis
Convolutional Neural Networks (CNNs) use pooling to decrease the size of activation maps.
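As a generic illustration of how pooling shrinks an activation map, the sketch below applies plain 2x2 max pooling with stride 2 to a small 2D map. It is not taken from the paper and does not represent the authors' pooling method; the function name `max_pool_2x2` and the sample values are hypothetical and chosen only for this example.

```python
def max_pool_2x2(activation):
    """Downsample a 2D activation map (a list of equal-length rows)
    with non-overlapping 2x2 max pooling (stride 2).

    A trailing row or column that does not fill a complete 2x2 window
    is dropped, so an H x W map becomes (H // 2) x (W // 2).
    """
    rows, cols = len(activation), len(activation[0])
    pooled = []
    for r in range(0, rows - 1, 2):
        pooled_row = []
        for c in range(0, cols - 1, 2):
            # Collect the four activations covered by the current window.
            window = (
                activation[r][c],
                activation[r][c + 1],
                activation[r + 1][c],
                activation[r + 1][c + 1],
            )
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Hypothetical 4x4 activation map used only for illustration.
activation_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 6, 2, 2],
]

pooled_map = max_pool_2x2(activation_map)
print(pooled_map)  # [[4, 5], [6, 7]]
```

The 4x4 map is reduced to 2x2, which quarters the number of activations that later layers have to process.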
1 code implementation • 8 Nov 2020 • Alexandros Stergiou, Ronald Poppe
To address this challenge, we present a novel spatio-temporal convolution block that is capable of extracting spatio-temporal patterns at multiple temporal resolutions.
1 code implementation • 15 Jun 2020 • Alexandros Stergiou, Ronald Poppe
Generalizing over temporal variations is a prerequisite for effective action recognition in videos.
Ranked #2 on Action Recognition on HACS
no code implementations • 7 Feb 2020 • Alexandros Stergiou, Ronald Poppe, Remco C. Veltkamp
We show that using Class Regularization blocks in state-of-the-art CNN architectures for action recognition leads to systematic improvements of 1.8%, 1.2% and 1.4% on the Kinetics, UCF-101 and HMDB-51 datasets, respectively.
no code implementations • 30 Sep 2019 • Alexandros Stergiou, Ronald Poppe
Motivated by the often distinctive temporal characteristics of actions in either the horizontal or the vertical direction, we introduce a novel convolution block for CNN architectures with video input.
1 code implementation • 18 Sep 2019 • Alexandros Stergiou, Georgios Kapidis, Grigorios Kalliatakis, Christos Chrysoulas, Ronald Poppe, Remco Veltkamp
We demonstrate the method on six state-of-the-art 3D convolutional neural networks (CNNs) on three action recognition datasets (Kinetics-400, UCF-101, and HMDB-51) and two egocentric action recognition datasets (EPIC-Kitchens and EGTEA Gaze+).
1 code implementation • 4 Feb 2019 • Alexandros Stergiou, Georgios Kapidis, Grigorios Kalliatakis, Christos Chrysoulas, Remco Veltkamp, Ronald Poppe
Deep learning approaches have been established as the main methodology for video classification and recognition.
1 code implementation • 31 Jul 2018 • Alexandros Stergiou, Ronald Poppe
The main challenges stem from dealing with the considerable variation in recording settings, the appearance of the people depicted, and the coordinated performance of their interaction.