Search Results for author: Armin Mustafa

Found 20 papers, 6 papers with code

S3R-Net: A Single-Stage Approach to Self-Supervised Shadow Removal

1 code implementation • 18 Apr 2024 • Nikolina Kubiak, Armin Mustafa, Graeme Phillipson, Stephen Jolly, Simon Hadfield

In this paper we present S3R-Net, the Self-Supervised Shadow Removal Network.

ViscoNet: Bridging and Harmonizing Visual and Textual Conditioning for ControlNet

1 code implementation • 5 Dec 2023 • Soon Yau Cheong, Armin Mustafa, Andrew Gilbert

This paper introduces ViscoNet, a novel method that enhances text-to-image human generation models with visual prompting.

Image Generation • Visual Prompting

CAD -- Contextual Multi-modal Alignment for Dynamic AVQA

no code implementations • 25 Oct 2023 • Asmar Nadeem, Adrian Hilton, Robert Dawes, Graham Thomas, Armin Mustafa

In the context of Audio Visual Question Answering (AVQA) tasks, the audio visual modalities could be learnt on three levels: 1) Spatial, 2) Temporal, and 3) Semantic.

Ranked #3 on Audio-visual Question Answering on MUSIC-AVQA

Audio-visual Question Answering • Audio-Visual Question Answering (AVQA) • +2

PAT: Position-Aware Transformer for Dense Multi-Label Action Detection

no code implementations • 9 Aug 2023 • Faegheh Sardari, Armin Mustafa, Philip J. B. Jackson, Adrian Hilton

To address this issue, we (i) embed relative positional encoding in the self-attention mechanism and (ii) exploit multi-scale temporal relationships by designing a novel non-hierarchical network, in contrast to recent transformer-based approaches that use a hierarchical structure.

Ranked #1 on Action Detection on MultiTHUMOS

Action Detection • Event Detection • +1

UPGPT: Universal Diffusion Model for Person Image Generation, Editing and Pose Transfer

1 code implementation • 18 Apr 2023 • Soon Yau Cheong, Armin Mustafa, Andrew Gilbert

Text-to-image models (T2I) such as StableDiffusion have been used to generate high quality images of people.

Ranked #1 on Pose Transfer on Deep-Fashion (FID metric)

Disentanglement • Pose Transfer • +2

SEM-POS: Grammatically and Semantically Correct Video Captioning

no code implementations • 26 Mar 2023 • Asmar Nadeem, Adrian Hilton, Robert Dawes, Graham Thomas, Armin Mustafa

Generating grammatically and semantically correct captions in video captioning is a challenging task.

POS • Video Captioning

KPE: Keypoint Pose Encoding for Transformer-based Image Generation

1 code implementation • 9 Mar 2022 • Soon Yau Cheong, Armin Mustafa, Andrew Gilbert

Therefore, we propose a new method, Keypoint Pose Encoding (KPE). KPE is 10 times more memory efficient and over 73% faster at generating high-quality images from text input conditioned on the pose.

Image Generation

SILT: Self-supervised Lighting Transfer Using Implicit Image Decomposition

1 code implementation • 25 Oct 2021 • Nikolina Kubiak, Armin Mustafa, Graeme Phillipson, Stephen Jolly, Simon Hadfield

We then remap this unified input domain using a discriminator that is presented with the generated outputs and the style reference, i.e., images of the desired illumination conditions.

Temporal Consistency Loss for High Resolution Textured and Clothed 3D Human Reconstruction from Monocular Video

no code implementations • 19 Apr 2021 • Akin Caliskan, Armin Mustafa, Adrian Hilton

We present a novel method to learn temporally consistent 3D reconstruction of clothed people from a monocular video.

3D Human Reconstruction • 3D Human Shape Estimation • +2

Multi-person Implicit Reconstruction from a Single Image

no code implementations • CVPR 2021 • Armin Mustafa, Akin Caliskan, Lourdes Agapito, Adrian Hilton

We present a new end-to-end learning framework to obtain detailed and spatially coherent reconstructions of multiple people from a single image.

3D Human Reconstruction

Multi-View Consistency Loss for Improved Single-Image 3D Reconstruction of Clothed People

no code implementations • 29 Sep 2020 • Akin Caliskan, Armin Mustafa, Evren Imre, Adrian Hilton

This paper introduces two advances to overcome this limitation: firstly a new synthetic dataset of realistic clothed people, 3DVH; and secondly, a novel multiple-view loss function for training of monocular volumetric shape estimation, which is demonstrated to significantly improve generalisation and reconstruction accuracy.

3D Human Shape Estimation • 3D Reconstruction

RealMonoDepth: Self-Supervised Monocular Depth Estimation for General Scenes

no code implementations • 14 Apr 2020 • Mertalp Ocal, Armin Mustafa

In this paper, we introduce RealMonoDepth, a self-supervised monocular depth estimation approach which learns to estimate the real scene depth for a diverse range of indoor and outdoor scenes.

Monocular Depth Estimation • Self-Supervised Learning

Learning Dense Wide Baseline Stereo Matching for People

no code implementations • 2 Oct 2019 • Akin Caliskan, Armin Mustafa, Evren Imre, Adrian Hilton

We show that it is possible to learn stereo matching from a synthetic people dataset and improve performance on real datasets for stereo reconstruction of people from narrow and wide baseline stereo data.

Data Augmentation • Stereo Matching

A*3D Dataset: Towards Autonomous Driving in Challenging Environments

1 code implementation • 17 Sep 2019 • Quang-Hieu Pham, Pierre Sevestre, Ramanpreet Singh Pahwa, Huijing Zhan, Chun Ho Pang, Yuda Chen, Armin Mustafa, Vijay Chandrasekhar, Jie Lin

With the increasing global popularity of self-driving cars, there is an immediate need for challenging real-world datasets for benchmarking and training various computer vision tasks such as 3D object detection.

3D Object Detection • Autonomous Driving • +4

U4D: Unsupervised 4D Dynamic Scene Understanding

no code implementations • ICCV 2019 • Armin Mustafa, Chris Russell, Adrian Hilton

We introduce the first approach to solve the challenging problem of unsupervised 4D visual scene understanding for complex dynamic scenes with multiple interacting people from multi-view video.

3D Pose Estimation • Instance Segmentation • +3

Temporally Coherent General Dynamic Scene Reconstruction

no code implementations • 18 Jul 2019 • Armin Mustafa, Marco Volino, Hansung Kim, Jean-Yves Guillemaut, Adrian Hilton

Existing techniques for dynamic scene reconstruction from multiple wide-baseline cameras primarily focus on reconstruction in controlled environments, with fixed calibrated cameras and strong prior constraints.

Segmentation • Semantic Segmentation

4D Temporally Coherent Light-field Video

no code implementations • 30 Apr 2018 • Armin Mustafa, Marco Volino, Jean-Yves Guillemaut, Adrian Hilton

Evaluation of the proposed light-field scene flow against existing multi-view dense correspondence approaches demonstrates a significant improvement in accuracy of temporal coherence.

Scene Flow Estimation

Semantically Coherent Co-Segmentation and Reconstruction of Dynamic Scenes

no code implementations • CVPR 2017 • Armin Mustafa, Adrian Hilton

Semantic co-segmentation exploits the coherence in semantic class labels both spatially, between views at a single time instant, and temporally, between widely spaced time instants of dynamic objects with similar shape and appearance.

3D Reconstruction • Segmentation

Temporally coherent 4D reconstruction of complex dynamic scenes

no code implementations • CVPR 2016 • Armin Mustafa, Hansung Kim, Jean-Yves Guillemaut, Adrian Hilton

Sparse-to-dense temporal correspondence is integrated with joint multi-view segmentation and reconstruction to obtain a complete 4D representation of static and dynamic objects.

4D reconstruction • Camera Calibration • +2

General Dynamic Scene Reconstruction from Multiple View Video

no code implementations • ICCV 2015 • Armin Mustafa, Hansung Kim, Jean-Yves Guillemaut, Adrian Hilton

The primary contributions of this paper are twofold: an automatic method for initial coarse dynamic scene segmentation and reconstruction without prior knowledge of background appearance or structure; and a general robust approach for joint segmentation refinement and dense reconstruction of dynamic scenes from multiple wide-baseline static or moving cameras.

Scene Segmentation • Segmentation