no code implementations • 16 Jun 2022 • Fatemeh Saleh, Fuwen Tan, Adrian Bulat, Georgios Tzimiropoulos, Brais Martinez
Video self-supervised learning (SSL) suffers from added challenges: video datasets are typically not as large as image datasets, compute is an order of magnitude larger, and the number of spurious patterns the optimizer has to sieve through is multiplied severalfold.
1 code implementation • 5 Jun 2022 • Christos Tzelepis, James Oldfield, Georgios Tzimiropoulos, Ioannis Patras
This work addresses the problem of discovering non-linear interpretable paths in the latent space of pre-trained GANs in a model-agnostic manner.
1 code implementation • 31 May 2022 • Dimitrios Mallis, Enrique Sanchez, Matt Bell, Georgios Tzimiropoulos
This paper proposes a novel paradigm for the unsupervised learning of object landmark detectors.
1 code implementation • 13 May 2022 • Jing Yang, Xiatian Zhu, Adrian Bulat, Brais Martinez, Georgios Tzimiropoulos
The key idea is that we leverage the teacher's classifier as a semantic critic for evaluating the representations of both teacher and student and distilling the semantic knowledge with high-order structured information over all feature dimensions.
2 code implementations • 6 May 2022 • Junting Pan, Adrian Bulat, Fuwen Tan, Xiatian Zhu, Lukasz Dudziak, Hongsheng Li, Georgios Tzimiropoulos, Brais Martinez
In this work, pushing further along this under-studied direction, we introduce EdgeViTs, a new family of light-weight ViTs that, for the first time, enable attention-based vision models to compete with the best light-weight CNNs in the tradeoff between accuracy and on-device efficiency.
no code implementations • 31 Jan 2022 • Stella Bounareli, Vasileios Argyriou, Georgios Tzimiropoulos
Because GANs are characterized by weak controllability, the core of our approach is a method to discover which directions in latent GAN space are responsible for controlling facial pose and expression variations.
no code implementations • 22 Nov 2021 • Chen Feng, Georgios Tzimiropoulos, Ioannis Patras
Despite the large progress in supervised learning with Neural Networks, there are significant challenges in obtaining high-quality, large-scale and accurately labeled datasets.
Ranked #1 on Learning with noisy labels on ANIMAL
no code implementations • 3 Nov 2021 • Adrian Bulat, Enrique Sanchez, Georgios Tzimiropoulos
Deep Learning models based on heatmap regression have revolutionized the task of facial landmark localization with existing models working robustly under large poses, non-uniform illumination and shadows, occlusions and self-occlusions, low resolution and blur.
Ranked #1 on Face Alignment on COFW (using extra training data)
no code implementations • 26 Oct 2021 • Adrian Bulat, Jean Kossaifi, Sourav Bhattacharya, Yannis Panagakis, Timothy Hospedales, Georgios Tzimiropoulos, Nicholas D Lane, Maja Pantic
We propose defensive tensorization, an adversarial defence technique that leverages a latent high-order factorization of the network.
no code implementations • 6 Oct 2021 • Swathikiran Sudhakaran, Adrian Bulat, Juan-Manuel Perez-Rua, Alex Falcon, Sergio Escalera, Oswald Lanz, Brais Martinez, Georgios Tzimiropoulos
This report presents the technical details of our submission to the EPIC-Kitchens-100 Action Recognition Challenge 2021.
1 code implementation • ICCV 2021 • Christos Tzelepis, Georgios Tzimiropoulos, Ioannis Patras
This work addresses the problem of discovering, in an unsupervised manner, interpretable paths in the latent space of pretrained GANs, so as to provide an intuitive and easy way of controlling the underlying generative factors.
1 code implementation • NeurIPS 2021 • Adrian Bulat, Juan-Manuel Perez-Rua, Swathikiran Sudhakaran, Brais Martinez, Georgios Tzimiropoulos
In this work, we propose a Video Transformer model the complexity of which scales linearly with the number of frames in the video sequence and hence induces no overhead compared to an image-based Transformer model.
Ranked #19 on Action Classification on Kinetics-600
no code implementations • ICCV 2021 • Adrian Bulat, Georgios Tzimiropoulos
In this work, we propose Bit-Mixer, the very first method to train a meta-quantized network where, at test time, any layer can change its bit-width without affecting the network's overall ability to perform highly accurate inference.
1 code implementation • 30 Mar 2021 • Adrian Bulat, Shiyang Cheng, Jing Yang, Andrew Garbett, Enrique Sanchez, Georgios Tzimiropoulos
Recent work on Deep Learning in the area of face analysis has focused on supervised learning for specific tasks of interest (e.g. face recognition, facial landmark localization, etc.)
Ranked #1 on Arousal Estimation on AffectNet
no code implementations • CVPR 2021 • Enrique Sanchez, Mani Kumar Tellamekala, Michel Valstar, Georgios Tzimiropoulos
Temporal context is key to the recognition of expressions of emotion.
no code implementations • 8 Feb 2021 • Adrian Bulat, Enrique Sánchez-Lozano, Georgios Tzimiropoulos
An important component of unsupervised learning by instance-based discrimination is a memory bank for storing a feature representation for each training sample in the dataset.
no code implementations • ICLR 2021 • Jing Yang, Brais Martinez, Adrian Bulat, Georgios Tzimiropoulos
We advocate for a method that optimizes the output feature of the penultimate layer of the student network and hence is directly related to representation learning.
1 code implementation • NeurIPS 2020 • Dimitrios Mallis, Enrique Sanchez, Matthew Bell, Georgios Tzimiropoulos
This paper addresses the problem of unsupervised discovery of object landmarks.
no code implementations • 3 Nov 2020 • Enrique Sanchez, Adrian Bulat, Anestis Zaganidis, Georgios Tzimiropoulos
The second stage uses another dataset of randomly chosen labeled frames to train a regressor on top of our spatio-temporal model for estimating the AU intensity.
1 code implementation • ICLR 2021 • Adrian Bulat, Brais Martinez, Georgios Tzimiropoulos
Network binarization is a promising hardware-aware direction for creating efficient deep models.
no code implementations • 14 Apr 2020 • Ioanna Ntinou, Enrique Sanchez, Adrian Bulat, Michel Valstar, Georgios Tzimiropoulos
Action Units (AUs) are geometrically-based atomic facial muscle movements known to produce appearance changes at specific facial locations.
1 code implementation • ICLR 2020 • Brais Martinez, Jing Yang, Adrian Bulat, Georgios Tzimiropoulos
This paper shows how to train binary networks to within a few percentage points ($\sim 3-5 \%$) of the full precision counterpart.
no code implementations • 9 Mar 2020 • Jing Yang, Brais Martinez, Adrian Bulat, Georgios Tzimiropoulos
To this end, we propose a new knowledge distillation method based on transferring feature statistics, specifically the channel-wise mean and variance, from the teacher to the student.
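As a rough illustration of the statistic-matching idea in the entry above, the sketch below compares the channel-wise means and variances of teacher and student features. It is a minimal sketch, not the authors' implementation: the function names (`channel_stats`, `stat_matching_loss`), the squared-error penalty, and the list-of-channels data layout are all assumptions made for illustration.

```python
from statistics import fmean, pvariance


def channel_stats(features):
    """Return (means, variances), one value per channel.

    `features` is a list of channels; each channel is a flat list of
    activations (a hypothetical layout chosen for this sketch).
    """
    means = [fmean(channel) for channel in features]
    variances = [pvariance(channel) for channel in features]
    return means, variances


def stat_matching_loss(teacher, student):
    """Mean squared gap between teacher and student channel statistics.

    The squared-error form is an assumption; the listing only states that
    channel-wise mean and variance are transferred from teacher to student.
    """
    if len(teacher) != len(student):
        raise ValueError("teacher and student need the same number of channels")
    t_mean, t_var = channel_stats(teacher)
    s_mean, s_var = channel_stats(student)
    per_channel = [
        (tm - sm) ** 2 + (tv - sv) ** 2
        for tm, sm, tv, sv in zip(t_mean, s_mean, t_var, s_var)
    ]
    return sum(per_channel) / len(per_channel)
```

Under these assumptions the loss is zero when the student reproduces the teacher's per-channel statistics, and it grows as either the mean or the variance of any channel drifts away.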
no code implementations • ECCV 2020 • Adrian Bulat, Brais Martinez, Georgios Tzimiropoulos
We show that directly applying NAS to the binary domain provides very poor results.
3 code implementations • 25 Feb 2020 • Adrian Bulat, Jean Kossaifi, Georgios Tzimiropoulos, Maja Pantic
In addition, with a reduction of 3x in model size and complexity, we show no decrease in performance when compared to the original HourGlass network.
Ranked #2 on Pose Estimation on MPII Human Pose (using extra training data)
no code implementations • 14 Nov 2019 • Shiyang Cheng, Pingchuan Ma, Georgios Tzimiropoulos, Stavros Petridis, Adrian Bulat, Jie Shen, Maja Pantic
The proposed model significantly outperforms previous approaches on non-frontal views while retaining the superior performance on frontal and near-frontal mouth views.
1 code implementation • NeurIPS 2019 • Enrique Sanchez, Georgios Tzimiropoulos
Contrary to previous works, we do however assume that a landmark detector, which has already learned a structured representation for a given object category in a fully supervised manner, is available.
1 code implementation • 30 Sep 2019 • Adrian Bulat, Georgios Tzimiropoulos
This paper proposes an improved training algorithm for binary neural networks in which both weights and activations are binary numbers.
no code implementations • 25 Sep 2019 • Adrian Bulat, Jean Kossaifi, Sourav Bhattacharya, Yannis Panagakis, Georgios Tzimiropoulos, Nicholas D. Lane, Maja Pantic
As deep neural networks become widely adopted for solving most problems in computer vision and audio-understanding, there are rising concerns about their potential vulnerability.
1 code implementation • CVPR 2020 • Muhammad Haris Khan, John McDonagh, Salman Khan, Muhammad Shahabuddin, Aditya Arora, Fahad Shahbaz Khan, Ling Shao, Georgios Tzimiropoulos
Several studies show that animal needs are often expressed through their faces.
no code implementations • 16 Apr 2019 • Adrian Bulat, Jean Kossaifi, Georgios Tzimiropoulos, Maja Pantic
This paper is on improving the training of binary neural networks in which both activations and weights are binary.
no code implementations • 12 Apr 2019 • Adrian Bulat, Jean Kossaifi, Georgios Tzimiropoulos, Maja Pantic
Adapting the learned classification to new domains is a hard problem for at least three reasons: (1) the new domains and the tasks might be drastically different; (2) there might be a very limited amount of annotated data in the new domain; and (3) full training of a new model for each new task is prohibitive in terms of computation and memory, due to the sheer number of parameters of deep CNNs.
1 code implementation • 11 Apr 2019 • Adrian Bulat, Georgios Tzimiropoulos, Jean Kossaifi, Maja Pantic
Big neural networks trained on large datasets have advanced the state-of-the-art for a large variety of challenging problems, improving performance by a large margin.
no code implementations • CVPR 2019 • Jean Kossaifi, Adrian Bulat, Georgios Tzimiropoulos, Maja Pantic
In this paper, we propose to fully parametrize Convolutional Neural Networks (CNNs) with a single high-order, low-rank tensor.
Ranked #32 on Pose Estimation on MPII Human Pose
no code implementations • 12 Dec 2018 • Juan Manuel Fernandez Montenegro, Mahdi Maktab Dar Oghaz, Athanasios Gkelias, Georgios Tzimiropoulos, Vasileios Argyriou
The performance evaluation demonstrates an improvement on facial emotion classification (accuracy and F1 score) that indicates the superiority of the proposed methodology.
no code implementations • 6 Dec 2018 • Vassilis C. Nicodemou, Iason Oikonomidis, Georgios Tzimiropoulos, Antonis Argyros
We propose the first approach to the problem of inferring the depth map of a human hand based on a single RGB image.
no code implementations • 3 Nov 2018 • Themos Stafylakis, Muhammad Haris Khan, Georgios Tzimiropoulos
A further analysis on the utility of target word boundaries is provided, as well as on the capacity of the network in modeling the linguistic context of the target word.
no code implementations • 28 Sep 2018 • Stavros Petridis, Themos Stafylakis, Pingchuan Ma, Georgios Tzimiropoulos, Maja Pantic
Therefore, we could use a CTC loss in combination with an attention-based model in order to force monotonic alignments and at the same time get rid of the conditional independence assumption.
Ranked #4 on Audio-Visual Speech Recognition on LRS2
no code implementations • 11 Sep 2018 • Aaron S. Jackson, Chris Manafas, Georgios Tzimiropoulos
This paper proposes the use of an end-to-end Convolutional Neural Network for direct reconstruction of the 3D geometry of humans via volumetric regression.
1 code implementation • 14 Aug 2018 • Adrian Bulat, Georgios Tzimiropoulos
To this end, we make the following contributions: (a) we are the first to study the effect of neural network binarization on localization tasks, namely human pose estimation and face alignment.
Ranked #1 on 3D Face Alignment on AFLW2000-3D
2 code implementations • ECCV 2018 • Adrian Bulat, Jing Yang, Georgios Tzimiropoulos
This paper is on image and face super-resolution.
1 code implementation • ECCV 2018 • Themos Stafylakis, Georgios Tzimiropoulos
Visual keyword spotting (KWS) is the problem of estimating whether a text query occurs in a given recording using only video information.
1 code implementation • 9 May 2018 • Enrique Sanchez-Lozano, Georgios Tzimiropoulos, Michel Valstar
Contrary to previous works that try to learn an unsupervised representation of the Action Unit regions, we propose to directly and jointly estimate all AU intensities through heatmap regression, along with the location in the face where they cause visible changes.
2 code implementations • 18 Feb 2018 • Stavros Petridis, Themos Stafylakis, Pingchuan Ma, Feipeng Cai, Georgios Tzimiropoulos, Maja Pantic
In the presence of high levels of noise, the end-to-end audiovisual model significantly outperforms both audio-only models.
Ranked #14 on Lipreading on Lip Reading in the Wild
no code implementations • CVPR 2018 • Adrian Bulat, Georgios Tzimiropoulos
This paper addresses two challenging tasks: improving the quality of low-resolution facial images and accurately locating the facial landmarks on such poor-resolution images.
Ranked #4 on Face Hallucination on FFHQ 512 x 512 - 16x upscaling
1 code implementation • 30 Oct 2017 • Themos Stafylakis, Georgios Tzimiropoulos
In this paper we present a deep learning architecture for extracting word embeddings for visual speech recognition.
no code implementations • ICCV 2017 • Muhammad Haris Khan, John McDonagh, Georgios Tzimiropoulos
Tracking-by-detection is drift-free but results in low-accuracy fittings.
1 code implementation • ICCV 2017 • Aaron S. Jackson, Adrian Bulat, Vasileios Argyriou, Georgios Tzimiropoulos
Our CNN works with just a single 2D facial image, requires neither accurate alignment nor dense correspondence between images, works for arbitrary facial poses and expressions, and can be used to reconstruct the whole 3D facial geometry (including the non-visible parts of the face), bypassing the construction (during training) and fitting (during testing) of a 3D Morphable Model.
Ranked #2 on 3D Face Reconstruction on Florence
9 code implementations • ICCV 2017 • Adrian Bulat, Georgios Tzimiropoulos
To this end, we make the following 5 contributions: (a) we construct, for the first time, a very strong baseline by combining a state-of-the-art architecture for landmark localization with a state-of-the-art residual block, train it on a very large yet synthetically expanded 2D facial landmark dataset and finally evaluate it on all other 2D facial landmark datasets.
Ranked #1 on Face Alignment on LS3D-W Balanced
4 code implementations • 12 Mar 2017 • Themos Stafylakis, Georgios Tzimiropoulos
We propose an end-to-end deep learning architecture for word-level visual speech recognition.
Ranked #16 on Lipreading on Lip Reading in the Wild
3 code implementations • ICCV 2017 • Adrian Bulat, Georgios Tzimiropoulos
(d) We present results for experiments on the most challenging datasets for human pose estimation and face alignment, reporting in many cases state-of-the-art performance.
Ranked #1 on Face Alignment on AFLW-Full
no code implementations • 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2016 • Sergio Escalera, Mercedes Torres Torres, Brais Martínez, Xavier Baró, Hugo Jair Escalante, Isabelle Guyon, Georgios Tzimiropoulos, Ciprian Corneanu, Marc Oliu, Mohammad Ali Bagheri, Michel Valstar
A custom-built application was used to collect and label data about the apparent age of people (as opposed to the real age).
Ranked #2 on Gender Prediction on FotW Gender
no code implementations • 7 Dec 2016 • Enrique Sánchez-Lozano, Georgios Tzimiropoulos, Brais Martinez, Fernando de la Torre, Michel Valstar
This paper presents a Functional Regression solution to the least squares problem, which we coin Continuous Regression, resulting in the first real-time incremental face tracker.
no code implementations • 30 Sep 2016 • Aaron Jackson, Michel Valstar, Georgios Tzimiropoulos
This paper proposes a CNN cascade for semantic part segmentation guided by pose-specific information encoded in terms of a set of landmarks (or keypoints).
1 code implementation • 29 Sep 2016 • Adrian Bulat, Georgios Tzimiropoulos
This paper describes our submission to the 1st 3D Face Alignment in the Wild (3DFAW) Challenge.
Ranked #1 on Face Alignment on 3DFAW
no code implementations • British Machine Vision Conference 2016 • Adrian Bulat, Georgios Tzimiropoulos
Besides playing the role of a graphical model, CNN regression is a key feature of our system, guiding the network to rely on context for predicting the location of occluded landmarks, typically encountered in very large poses.
Ranked #1 on Face Alignment on AFLW-PIFA (21 points)
1 code implementation • 6 Sep 2016 • Adrian Bulat, Georgios Tzimiropoulos
Our main contribution is a CNN cascaded architecture specifically designed for learning part relationships and spatial context, and robustly inferring pose even for the case of severe part occlusions.
Ranked #10 on Pose Estimation on Leeds Sports Poses
no code implementations • 3 Aug 2016 • Enrique Sánchez-Lozano, Brais Martinez, Georgios Tzimiropoulos, Michel Valstar
We then derive the incremental learning updates for CCR (iCCR) and show that it is an order of magnitude faster than standard incremental learning for cascaded regression, bringing the time required for the update from seconds down to a fraction of a second, thus enabling real-time tracking.
no code implementations • CVPR 2015 • Georgios Tzimiropoulos
Cascaded regression approaches have been recently shown to achieve state-of-the-art performance for many computer vision tasks.
no code implementations • CVPR 2014 • Georgios Tzimiropoulos, Maja Pantic
To address this limitation, in this paper, we propose to jointly optimize a part-based, trained in-the-wild, flexible appearance model along with a global shape model which results in a joint translational motion model for the model parts via Gauss-Newton (GN) optimization.