Search Results for author: Adrian Hilton

Found 33 papers, 4 papers with code

COSMU: Complete 3D human shape from monocular unconstrained images

no code implementations 15 Jul 2024 Marco Pesavento, Marco Volino, Adrian Hilton

The generated 2D normal maps are then processed by a multi-view attention-based neural implicit model that estimates an implicit representation of the 3D shape, ensuring the reproduction of details in both observed and occluded regions.

Benchmarking Monocular 3D Dog Pose Estimation Using In-The-Wild Motion Capture Data

no code implementations 20 Jun 2024 Moira Shooter, Charles Malleson, Adrian Hilton

To address this, we created 3DDogs-Wild, a naturalised version of the dataset where the optical markers are in-painted and the subjects are placed in diverse environments, enhancing its utility for training RGB image-based pose detectors.

Animal Pose Estimation Benchmarking +1

NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative

no code implementations 10 Jun 2024 Asmar Nadeem, Faegheh Sardari, Robert Dawes, Syed Sameed Husain, Adrian Hilton, Armin Mustafa

Existing video captioning benchmarks and models lack coherent representations of causal-temporal narrative, that is, sequences of events linked through cause and effect, unfolding over time and driven by characters or agents.

Language Modelling Large Language Model +1

An Effective-Efficient Approach for Dense Multi-Label Action Detection

no code implementations 10 Jun 2024 Faegheh Sardari, Armin Mustafa, Philip J. B. Jackson, Adrian Hilton

In this paper, we address this issue by proposing a novel transformer-based network that (a) employs a non-hierarchical structure when modelling different ranges of temporal dependencies and (b) embeds relative positional encoding in its transformer layers.

Action Detection

Gaussian Splatting with Localized Points Management

no code implementations 6 Jun 2024 Haosen Yang, Chenhao Zhang, Wenqing Wang, Marco Volino, Adrian Hilton, Li Zhang, Xiatian Zhu

To address these limitations, we propose a Localized Point Management (LPM) strategy, capable of identifying those error-contributing zones in the highest demand for both point addition and geometry calibration.

Management

CoLeaF: A Contrastive-Collaborative Learning Framework for Weakly Supervised Audio-Visual Video Parsing

1 code implementation 17 May 2024 Faegheh Sardari, Armin Mustafa, Philip J. B. Jackson, Adrian Hilton

In this paper, we propose CoLeaF, a novel learning framework that optimizes the integration of cross-modal context in the embedding space such that the network explicitly learns to combine cross-modal information for audible-visible events while filtering them out for unaligned events.

ANIM: Accurate Neural Implicit Model for Human Reconstruction from a single RGB-D image

no code implementations CVPR 2024 Marco Pesavento, Yuanlu Xu, Nikolaos Sarafianos, Robert Maier, Ziyan Wang, Chun-Han Yao, Marco Volino, Edmond Boyer, Adrian Hilton, Tony Tung

In this paper, we explore the benefits of incorporating depth observations in the reconstruction process by introducing ANIM, a novel method that reconstructs arbitrary 3D human shapes from single-view RGB-D images with an unprecedented level of accuracy.

CAD -- Contextual Multi-modal Alignment for Dynamic AVQA

no code implementations 25 Oct 2023 Asmar Nadeem, Adrian Hilton, Robert Dawes, Graham Thomas, Armin Mustafa

In the context of Audio Visual Question Answering (AVQA) tasks, the audio visual modalities could be learnt on three levels: 1) Spatial, 2) Temporal, and 3) Semantic.

Audio-visual Question Answering Audio-Visual Question Answering (AVQA) +2

PAT: Position-Aware Transformer for Dense Multi-Label Action Detection

no code implementations 9 Aug 2023 Faegheh Sardari, Armin Mustafa, Philip J. B. Jackson, Adrian Hilton

To address this issue, we (i) embed relative positional encoding in the self-attention mechanism and (ii) exploit multi-scale temporal relationships by designing a novel non-hierarchical network, in contrast to recent transformer-based approaches that use a hierarchical structure.

Action Detection Event Detection +1

Super-resolution 3D Human Shape from a Single Low-Resolution Image

1 code implementation 23 Aug 2022 Marco Pesavento, Marco Volino, Adrian Hilton

The approach overcomes limitations of existing approaches that reconstruct 3D human shape from a single image, which require high-resolution images together with auxiliary data such as surface normals or a parametric model to reconstruct high-detail shape.

3D Human Reconstruction 3D Human Shape Estimation +2

Visually Supervised Speaker Detection and Localization via Microphone Array

no code implementations 7 Mar 2022 Davide Berghi, Adrian Hilton, Philip J. B. Jackson

We propose to generate weak labels using a pre-trained active speaker detector on pre-extracted face tracks.

Active Speaker Detection

Super-Resolution Appearance Transfer for 4D Human Performances

no code implementations 31 Aug 2021 Marco Pesavento, Marco Volino, Adrian Hilton

Typically the requirement to frame cameras to capture the volume of a dynamic performance ($>50\,\mathrm{m}^3$) results in the person occupying only a small proportion ($<10\%$) of the field of view.

4D reconstruction 4k +2

Attention-based Multi-Reference Learning for Image Super-Resolution

1 code implementation ICCV 2021 Marco Pesavento, Marco Volino, Adrian Hilton

A novel hierarchical attention-based sampling approach is introduced to learn the similarity between low-resolution image features and multiple reference images based on a perceptual loss.

Image Super-Resolution

SyDog: A Synthetic Dog Dataset for Improved 2D Pose Estimation

no code implementations 31 Jul 2021 Moira Shooter, Charles Malleson, Adrian Hilton

Estimating the pose of animals can facilitate the understanding of animal motion which is fundamental in disciplines such as biomechanics, neuroscience, ethology, robotics and the entertainment industry.

2D Pose Estimation Animal Pose Estimation

Multi-person Implicit Reconstruction from a Single Image

no code implementations CVPR 2021 Armin Mustafa, Akin Caliskan, Lourdes Agapito, Adrian Hilton

We present a new end-to-end learning framework to obtain detailed and spatially coherent reconstructions of multiple people from a single image.

3D geometry 3D Human Reconstruction

Multi-View Consistency Loss for Improved Single-Image 3D Reconstruction of Clothed People

no code implementations 29 Sep 2020 Akin Caliskan, Armin Mustafa, Evren Imre, Adrian Hilton

This paper introduces two advances to overcome this limitation: firstly a new synthetic dataset of realistic clothed people, 3DVH; and secondly, a novel multiple-view loss function for training of monocular volumetric shape estimation, which is demonstrated to significantly improve generalisation and reconstruction accuracy.

3D Human Shape Estimation 3D Reconstruction

Spectral Analysis Network for Deep Representation Learning and Image Clustering

no code implementations 11 Sep 2020 Jinghua Wang, Adrian Hilton, Jianmin Jiang

This paper proposes a new network structure for unsupervised deep representation learning based on spectral analysis, which is a popular technique with solid theoretical foundations.

Clustering Image Clustering +1

Learning Dense Wide Baseline Stereo Matching for People

no code implementations 2 Oct 2019 Akin Caliskan, Armin Mustafa, Evren Imre, Adrian Hilton

We show that it is possible to learn stereo matching from a synthetic people dataset and improve performance on real datasets for stereo reconstruction of people from narrow and wide baseline stereo data.

Data Augmentation Stereo Matching

Semantic Estimation of 3D Body Shape and Pose using Minimal Cameras

no code implementations 8 Aug 2019 Andrew Gilbert, Matthew Trumble, Adrian Hilton, John Collomosse

We aim to simultaneously estimate the 3D articulated pose and high fidelity volumetric occupancy of human performance, from multiple viewpoint video (MVV) with as few as two views.

3D Human Pose Estimation Decoder

EdgeNet: Semantic Scene Completion from a Single RGB-D Image

1 code implementation 8 Aug 2019 Aloisio Dourado, Teofilo Emidio de Campos, Hansung Kim, Adrian Hilton

Semantic scene completion is the task of predicting a complete 3D representation of volumetric occupancy with corresponding semantic labels for a scene from a single point of view.

3D Semantic Scene Completion Edge Detection

U4D: Unsupervised 4D Dynamic Scene Understanding

no code implementations ICCV 2019 Armin Mustafa, Chris Russell, Adrian Hilton

We introduce the first approach to solve the challenging problem of unsupervised 4D visual scene understanding for complex dynamic scenes with multiple interacting people from multi-view video.

3D Pose Estimation Instance Segmentation +3

Temporally Coherent General Dynamic Scene Reconstruction

no code implementations 18 Jul 2019 Armin Mustafa, Marco Volino, Hansung Kim, Jean-yves Guillemaut, Adrian Hilton

Existing techniques for dynamic scene reconstruction from multiple wide-baseline cameras primarily focus on reconstruction in controlled environments, with fixed calibrated cameras and strong prior constraints.

Segmentation Semantic Segmentation

Volumetric performance capture from minimal camera viewpoints

no code implementations ECCV 2018 Andrew Gilbert, Marco Volino, John Collomosse, Adrian Hilton

We present a convolutional autoencoder that enables high fidelity volumetric reconstructions of human performance to be captured from multi-view video comprising only a small set of camera views.

4D Temporally Coherent Light-field Video

no code implementations 30 Apr 2018 Armin Mustafa, Marco Volino, Jean-yves Guillemaut, Adrian Hilton

Evaluation of the proposed light-field scene flow against existing multi-view dense correspondence approaches demonstrates a significant improvement in accuracy of temporal coherence.

Scene Flow Estimation

Semantic Scene Completion Combining Colour and Depth: preliminary experiments

no code implementations 13 Feb 2018 Andre Bernardes Soares Guedes, Teofilo Emidio de Campos, Adrian Hilton

Semantic scene completion is the task of producing a complete 3D voxel representation of volumetric occupancy with semantic labels for a scene from a single-view observation.

3D Semantic Scene Completion

Semantically Coherent Co-Segmentation and Reconstruction of Dynamic Scenes

no code implementations CVPR 2017 Armin Mustafa, Adrian Hilton

Semantic co-segmentation exploits the coherence in semantic class labels both spatially, between views at a single time instant, and temporally, between widely spaced time instants of dynamic objects with similar shape and appearance.

3D Reconstruction Segmentation

Temporally coherent 4D reconstruction of complex dynamic scenes

no code implementations CVPR 2016 Armin Mustafa, Hansung Kim, Jean-yves Guillemaut, Adrian Hilton

Sparse-to-dense temporal correspondence is integrated with joint multi-view segmentation and reconstruction to obtain a complete 4D representation of static and dynamic objects.

4D reconstruction Camera Calibration +2

General Dynamic Scene Reconstruction from Multiple View Video

no code implementations ICCV 2015 Armin Mustafa, Hansung Kim, Jean-yves Guillemaut, Adrian Hilton

The primary contributions of this paper are twofold: an automatic method for initial coarse dynamic scene segmentation and reconstruction without prior knowledge of background appearance or structure; and a general robust approach for joint segmentation refinement and dense reconstruction of dynamic scenes from multiple wide-baseline static or moving cameras.

Scene Segmentation Segmentation
