no code implementations • 5 Jun 2023 • Thomas Walker, Octave Mariotti, Amir Vaxman, Hakan Bilen
We introduce Explicit Neural Surfaces (ENS), an efficient surface reconstruction method that learns an explicitly defined continuous surface from multiple views.
1 code implementation • CVPR 2023 • Davide Moltisanti, Frank Keller, Hakan Bilen, Laura Sevilla-Lara
The goal of this work is to understand the way actions are performed in videos.
no code implementations • 11 Dec 2022 • Mustafa Taha Koçyiğit, Timothy M. Hospedales, Hakan Bilen
Recently the focus of the computer vision community has shifted from expensive supervised learning towards self-supervised learning of visual representations.
no code implementations • ICCV 2021 • Octave Mariotti, Oisin Mac Aodha, Hakan Bilen
Understanding the 3D world without supervision is currently a major challenge in computer vision as the annotations required to supervise deep networks for tasks in this domain are expensive to obtain on a large scale.
no code implementations • 1 Dec 2022 • Octave Mariotti, Oisin Mac Aodha, Hakan Bilen
We introduce ViewNeRF, a Neural Radiance Field-based viewpoint estimation method that learns to predict category-level viewpoints directly from images during training.
no code implementations • 26 Nov 2022 • Arushi Goel, Basura Fernando, Frank Keller, Hakan Bilen
Coreference resolution aims to identify words and phrases which refer to the same entity in a text, a core task in natural language processing.
1 code implementation • CVPR 2023 • Titas Anciukevicius, Zexiang Xu, Matthew Fisher, Paul Henderson, Hakan Bilen, Niloy J. Mitra, Paul Guerrero
In this paper, we present RenderDiffusion, the first diffusion model for 3D generation and inference, trained using only monocular 2D supervision.
1 code implementation • 6 Nov 2022 • Yu Yang, Xiaotian Cheng, Chang Liu, Hakan Bilen, Xiangyang Ji
In recent years, generative adversarial networks (GANs) have been an actively studied topic and have been shown to successfully produce high-quality realistic images in various domains.
1 code implementation • ICLR 2022 • Yu Yang, Xiaotian Cheng, Hakan Bilen, Xiangyang Ji
The success of state-of-the-art deep neural networks heavily relies on the presence of large-scale labelled datasets, which are extremely expensive and time-consuming to annotate.
1 code implementation • 23 May 2022 • Kai Wang, Bo Zhao, Xiangyu Peng, Zheng Zhu, Jiankang Deng, Xinchao Wang, Hakan Bilen, Yang You
Firstly, randomly masked face images are used to train the reconstruction module in FaceMAE.
3 code implementations • 15 Apr 2022 • Bo Zhao, Hakan Bilen
However, images generated by traditional GANs are not as informative as real training samples when used to train deep neural networks.
2 code implementations • 6 Apr 2022 • Wei-Hong Li, Xialei Liu, Hakan Bilen
We propose a unified look at jointly learning multiple vision tasks and visual domains through universal representations, a single deep neural network.
no code implementations • 31 Mar 2022 • Yunlu Chen, Basura Fernando, Hakan Bilen, Matthias Nießner, Efstratios Gavves
In this work, we address two key limitations of such representations: failing to capture fine local 3D geometric details, and failing to learn from and generalize to shapes with unseen 3D transformations.
2 code implementations • CVPR 2022 • Kai Wang, Bo Zhao, Xiangyu Peng, Zheng Zhu, Shuo Yang, Shuo Wang, Guan Huang, Hakan Bilen, Xinchao Wang, Yang You
Dataset condensation aims at reducing the network training effort through condensing a cumbersome training set into a compact synthetic one.
1 code implementation • CVPR 2022 • Wei-Hong Li, Xialei Liu, Hakan Bilen
Despite the recent advances in multi-task learning of dense prediction problems, most methods rely on expensive labelled datasets.
no code implementations • CVPR 2022 • Arushi Goel, Basura Fernando, Frank Keller, Hakan Bilen
Scene graph generation (SGG) aims to capture a wide variety of interactions between pairs of objects, which is essential for full scene understanding.
2 code implementations • 8 Oct 2021 • Bo Zhao, Hakan Bilen
The computational cost of training state-of-the-art deep models in many learning problems is rapidly increasing due to more sophisticated models and larger datasets.
no code implementations • ICLR 2022 • Lucas Deecke, Timothy Hospedales, Hakan Bilen
A fundamental shortcoming of deep neural networks is their specialization to a single task and domain.
4 code implementations • CVPR 2022 • Wei-Hong Li, Xialei Liu, Hakan Bilen
In this paper, we look at the problem of cross-domain few-shot classification that aims to learn a classifier from previously unseen classes and domains with few labeled samples.
Ranked #2 on Few-Shot Image Classification on Meta-Dataset
cross-domain few-shot learning
Few-Shot Image Classification
no code implementations • 2 Apr 2021 • Octave Mariotti, Hakan Bilen
There is a growing interest in developing computer vision methods that can learn from limited supervision.
no code implementations • 1 Apr 2021 • Yu Yang, Hakan Bilen, Qiran Zou, Wing Yin Cheung, Xiangyang Ji
Deep learning approaches rely heavily on high-quality human supervision, which is nonetheless expensive, time-consuming, and error-prone, especially for image segmentation tasks.
3 code implementations • ICCV 2021 • Wei-Hong Li, Xialei Liu, Hakan Bilen
In this paper, we look at the problem of few-shot classification that aims to learn a classifier for previously unseen classes and domains from few labeled samples.
Ranked #4 on Few-Shot Image Classification on Meta-Dataset
2 code implementations • 16 Feb 2021 • Bo Zhao, Hakan Bilen
In many machine learning problems, large-scale datasets have become the de facto standard for training state-of-the-art deep networks, at the price of a heavy computational load.
no code implementations • 1 Jan 2021 • Lucas Deecke, Timothy Hospedales, Hakan Bilen
A fundamental shortcoming of deep neural networks is their specialization to a single task and domain.
no code implementations • 5 Oct 2020 • Lucas Deecke, Lukas Ruff, Robert A. Vandermeulen, Hakan Bilen
Deep anomaly detection is a difficult task since, in high dimensions, it is hard to completely characterize a notion of "differentness" when given only examples of normality.
3 code implementations • 14 Jul 2020 • Wei-Hong Li, Hakan Bilen
We then learn the multi-task model to minimize the task-specific losses and to produce the same features as the task-specific models.
4 code implementations • ICLR 2021 • Bo Zhao, Konda Reddy Mopuri, Hakan Bilen
As the state-of-the-art machine learning methods in many fields rely on larger datasets, storing datasets and training models on them become significantly more expensive.
1 code implementation • 8 Jun 2020 • Bo Zhao, Shixiang Tang, Dapeng Chen, Hakan Bilen, Rui Zhao
With the explosion of digital data in recent years, continuously learning new tasks from a stream of data without forgetting previously acquired knowledge has become increasingly important.
no code implementations • 1 Jun 2020 • Lucas Deecke, Timothy Hospedales, Hakan Bilen
While recent techniques in domain adaptation and multi-domain learning enable the learning of more domain-agnostic features, their success relies on the presence of domain labels, typically requiring manual annotation and careful curation of datasets.
2 code implementations • 8 Jan 2020 • Bo Zhao, Konda Reddy Mopuri, Hakan Bilen
In particular, our approach can extract the ground-truth labels with certainty, as opposed to DLG; hence we name it Improved DLG (iDLG).
2 code implementations • 22 Dec 2019 • Wei-Hong Li, Chuan-Sheng Foo, Hakan Bilen
Recent semi-supervised learning methods have been shown to achieve results comparable to their supervised counterparts in image classification tasks while using only a small portion of the labels, thanks to their regularization strategies.
no code implementations • 22 Nov 2019 • Arushi Goel, Basura Fernando, Thanh-Son Nguyen, Hakan Bilen
Automatically generating natural language descriptions from an image is a challenging problem in artificial intelligence that requires a good understanding of the visual and textual signals and the correlations between them.
no code implementations • 19 Oct 2019 • Sylvestre-Alvise Rebuffi, Ruth Fong, Xu Ji, Hakan Bilen, Andrea Vedaldi
In this paper, we are instead interested in the locations of an image that contribute to the model's training.
no code implementations • 18 Oct 2019 • Zhunxuan Wang, Zipei Wang, Qiqi Li, Hakan Bilen
Image deconvolution is the process of recovering images degraded by convolution, which is a hard inverse problem because it is mathematically ill-posed.
no code implementations • 25 Sep 2019 • Basura Fernando, Hakan Bilen
The instance representation is shared by both instance classification and weighting streams.
1 code implementation • ICCV 2019 • James Thewlis, Samuel Albanie, Hakan Bilen, Andrea Vedaldi
Equivariance to random image transformations is an effective method to learn landmarks of object categories, such as the eyes and the nose in faces, without manual supervision.
Ranked #1 on Unsupervised Facial Landmark Detection on 300W
1 code implementation • Iberian Conference on Pattern Recognition and Image Analysis 2019 • Carlos Rodríguez-Pardo, Hakan Bilen
The use of computational methods to evaluate aesthetics in photography has gained interest in recent years due to the popularization of convolutional neural networks and the availability of new annotated datasets.
no code implementations • CVPR 2020 • Tomas Jakab, Ankush Gupta, Hakan Bilen, Andrea Vedaldi
We propose KeypointGAN, a new method for recognizing the pose of objects from a single image that learns using only unlabelled videos and a weak empirical prior on the object poses.
no code implementations • 16 Apr 2019 • Basura Fernando, Cheston Tan Yin Chet, Hakan Bilen
Detecting temporal extents of human actions in videos is a challenging computer vision problem that requires detailed manual supervision including frame-level labels.
no code implementations • NeurIPS 2018 • James Thewlis, Hakan Bilen, Andrea Vedaldi
We propose a new approach to model and learn, without manual supervision, the symmetries of natural objects, such as faces or flowers, given only images as input.
2 code implementations • ICLR 2019 • Lucas Deecke, Iain Murray, Hakan Bilen
Normalization methods are a central building block in the deep learning toolbox.
2 code implementations • NeurIPS 2018 • Tomas Jakab, Ankush Gupta, Hakan Bilen, Andrea Vedaldi
We propose a method for learning landmark detectors for visual objects (such as the eyes and the nose in a face) without any manual supervision.
Conditional Image Generation
Unsupervised Facial Landmark Detection
2 code implementations • CVPR 2018 • Sylvestre-Alvise Rebuffi, Hakan Bilen, Andrea Vedaldi
A practical limitation of deep neural networks is their high degree of specialization to a single task and visual domain.
no code implementations • NeurIPS 2017 • James Thewlis, Hakan Bilen, Andrea Vedaldi
One of the key challenges of visual perception is to extract abstract models of 3D objects and object categories from visual measurements, which are affected by complex nuisance factors such as viewpoint, occlusion, motion, and deformations.
Ranked #3 on Unsupervised Facial Landmark Detection on AFLW-MTFL
Optical Flow Estimation
Unsupervised Facial Landmark Detection
2 code implementations • NeurIPS 2017 • Sylvestre-Alvise Rebuffi, Hakan Bilen, Andrea Vedaldi
There is a growing interest in learning data representations that work well for many different types of problems and data.
1 code implementation • ICCV 2017 • James Thewlis, Hakan Bilen, Andrea Vedaldi
Automatically learning the structure of object categories remains an important open problem in computer vision.
Ranked #2 on Unsupervised Facial Landmark Detection on AFLW-MTFL
Unsupervised Facial Landmark Detection
Unsupervised Human Pose Estimation
no code implementations • 25 Jan 2017 • Hakan Bilen, Andrea Vedaldi
With the advent of large labelled datasets and high-capacity models, the performance of machine vision systems has been improving rapidly.
Ranked #14 on Continual Learning on visual domain decathlon (10 tasks)
3 code implementations • 2 Dec 2016 • Hakan Bilen, Basura Fernando, Efstratios Gavves, Andrea Vedaldi
This is a powerful idea because it allows any video to be converted to an image, so that existing CNN models pre-trained for the analysis of still images can be immediately extended to videos.
no code implementations • CVPR 2017 • Basura Fernando, Hakan Bilen, Efstratios Gavves, Stephen Gould
On action classification, our method obtains 60.3% on the UCF101 dataset using only UCF101 data for training, which is approximately 10% better than current state-of-the-art self-supervised learning methods.
Ranked #46 on Self-Supervised Action Recognition on UCF101
no code implementations • NeurIPS 2016 • Hakan Bilen, Andrea Vedaldi
Modern discriminative predictors have been shown to match natural intelligences in specific perceptual tasks in image classification, object and part detection, boundary extraction, etc.
1 code implementation • CVPR 2016 • Hakan Bilen, Basura Fernando, Efstratios Gavves, Andrea Vedaldi, Stephen Gould
We introduce the concept of dynamic image, a novel compact representation of videos useful for video analysis especially when convolutional neural networks (CNNs) are used.
Ranked #60 on Action Recognition on HMDB-51
3 code implementations • CVPR 2016 • Hakan Bilen, Andrea Vedaldi
Weakly supervised learning of object detection is an important problem in image understanding that still does not have a satisfactory solution.
Ranked #3 on Weakly Supervised Object Detection on HICO-DET
no code implementations • CVPR 2015 • Hakan Bilen, Marco Pedersoli, Tinne Tuytelaars
However, as learning appearance and localization are two interconnected tasks, the optimization is not convex and the procedure can easily get stuck in a poor local minimum; that is, the algorithm "misses" the object in some images.
no code implementations • CVPR 2014 • Hakan Bilen, Marco Pedersoli, Vinay P. Namboodiri, Tinne Tuytelaars, Luc van Gool
In object classification, substantial work has gone into improving the low-level representation of an image by considering various aspects such as different features, a number of feature pooling and coding techniques, and different kernels.