Search Results for author: Richard Bowden

Found 78 papers, 24 papers with code

Diversity-Aware Sign Language Production through a Pose Encoding Variational Autoencoder

no code implementations 16 May 2024 Mohamed Ilyes Lakhal, Richard Bowden

The generator framework is presented as a UNet architecture to ensure spatial preservation of the input pose, and we include the visual features from the variational inference to maintain control over appearance and style.

Decoder Diversity +3

Sign Stitching: A Novel Approach to Sign Language Production

1 code implementation 13 May 2024 Harry Walsh, Ben Saunders, Richard Bowden

Then, by applying filtering in the frequency domain and resampling each sign, we create cohesive natural sequences that mimic the prosody found in the original data.

Diversity Sign Language Production

Sign2GPT: Leveraging Large Language Models for Gloss-Free Sign Language Translation

no code implementations 7 May 2024 Ryan Wong, Necati Cihan Camgoz, Richard Bowden

Automatic Sign Language Translation requires the integration of both computer vision and natural language processing to effectively bridge the communication gap between sign and spoken languages.

Gloss-free Sign Language Translation Sign Language Translation +1

Select and Reorder: A Novel Approach for Neural Sign Language Production

no code implementations 17 Apr 2024 Harry Walsh, Ben Saunders, Richard Bowden

Sign languages, often categorised as low-resource languages, face significant challenges in achieving accurate translation due to the scarcity of parallel annotated datasets.

Disentanglement Sign Language Production +1

A Data-Driven Representation for Sign Language Production

1 code implementation 17 Apr 2024 Harry Walsh, Abolfazl Ravanshad, Mariam Rahmani, Richard Bowden

By applying Vector Quantisation (VQ) to sign language data, we first learn a codebook of short motions that can be combined to create a natural sequence of sign.

Sign Language Production Translation

Two Hands Are Better Than One: Resolving Hand to Hand Intersections via Occupancy Networks

no code implementations 8 Apr 2024 Maksym Ivashechkin, Oscar Mendez, Richard Bowden

This work addresses the intersection of hands by exploiting an occupancy network that represents the hand's volume as a continuous manifold.

3D Hand Pose Estimation

Using an LLM to Turn Sign Spottings into Spoken Language Sentences

no code implementations 15 Mar 2024 Ozge Mercanoglu Sincan, Necati Cihan Camgoz, Richard Bowden

Sign Language Translation (SLT) is a challenging task that aims to generate spoken language sentences from sign language videos.

Language Modelling Large Language Model +2

Giving a Hand to Diffusion Models: a Two-Stage Approach to Improving Conditional Human Image Generation

1 code implementation 15 Mar 2024 Anton Pelykh, Ozge Mercanoglu Sincan, Richard Bowden

Our approach not only enhances the quality of the generated hands but also offers improved control over hand pose, advancing the capabilities of pose-conditioned human image generation.

Anatomy Image Generation

Learnt Contrastive Concept Embeddings for Sign Recognition

no code implementations 18 Aug 2023 Ryan Wong, Necati Cihan Camgoz, Richard Bowden

In natural language processing (NLP) of spoken languages, word embeddings have been shown to be a useful method to encode the meaning of words.

Word Embeddings

Is context all you need? Scaling Neural Sign Language Translation to Large Domains of Discourse

no code implementations 18 Aug 2023 Ozge Mercanoglu Sincan, Necati Cihan Camgoz, Richard Bowden

Sign Language Translation (SLT) is a challenging task that aims to generate spoken language sentences from sign language videos, both of which have different grammar and word/gloss order.

Machine Translation NMT +3

Improving 3D Pose Estimation for Sign Language

no code implementations 18 Aug 2023 Maksym Ivashechkin, Oscar Mendez, Richard Bowden

Given a 2D detection of keypoints in the image, we lift the skeleton to 3D using neural networks to predict both the joint rotations and bone lengths.

3D Pose Estimation valid

Gloss Alignment Using Word Embeddings

no code implementations 8 Aug 2023 Harry Walsh, Ozge Mercanoglu Sincan, Ben Saunders, Richard Bowden

As a result, research has turned to TV broadcast content as a source of large-scale training data, consisting of both the sign language interpreter and the associated audio subtitle.

Word Alignment Word Embeddings

Kick Back & Relax: Learning to Reconstruct the World by Watching SlowTV

1 code implementation ICCV 2023 Jaime Spencer, Chris Russell, Simon Hadfield, Richard Bowden

Unfortunately, existing approaches limit themselves to the automotive domain, resulting in models incapable of generalizing to complex environments such as natural or indoor settings.

Diversity Monocular Depth Estimation +2

Learning Adaptive Neighborhoods for Graph Neural Networks

no code implementations ICCV 2023 Avishkar Saha, Oscar Mendez, Chris Russell, Richard Bowden

Our module can be readily integrated into existing pipelines involving graph convolution operations, replacing the predetermined or existing adjacency matrix with one that is learned, and optimized, as part of the general objective.

Node Classification Point Cloud Classification +1

Decision Making for Autonomous Driving in Interactive Merge Scenarios via Learning-based Prediction

no code implementations 29 Mar 2023 Salar Arbabi, Davide Tavernini, Saber Fallah, Richard Bowden

This paper presents a decision making approach for autonomous driving, focusing on the complex task of merging into moving traffic where uncertainty emanates from the behavior of other drivers and imperfect sensor measurements.

Autonomous Driving Decision Making

Novel View Synthesis of Humans using Differentiable Rendering

1 code implementation 28 Mar 2023 Guillaume Rochette, Chris Russell, Richard Bowden

We show how our approach can be used for motion transfer between individuals; novel view synthesis of individuals captured from just a single camera; to synthesize individuals from any virtual viewpoint; and to re-render people in novel poses.

Decoder Image Reconstruction +1

Hierarchical I3D for Sign Spotting

no code implementations 3 Oct 2022 Ryan Wong, Necati Cihan Camgöz, Richard Bowden

Most of the vision-based sign language research to date has focused on Isolated Sign Language Recognition (ISLR), where the objective is to predict a single sign class given a short video clip.

Sign Language Recognition

Changing the Representation: Examining Language Representation for Neural Sign Language Production

no code implementations SLTAT (LREC) 2022 Harry Walsh, Ben Saunders, Richard Bowden

We use language models such as BERT and Word2Vec to create better sentence level embeddings, and apply several tokenization techniques, demonstrating how these improve performance on the low resource translation task of Text to Gloss.

Sentence Sign Language Production +2

Deconstructing Self-Supervised Monocular Reconstruction: The Design Decisions that Matter

2 code implementations 2 Aug 2022 Jaime Spencer, Chris Russell, Simon Hadfield, Richard Bowden

It is likely that many papers were not only optimized for particular datasets, but also for errors in the data and evaluation criteria.

Monocular Depth Estimation Monocular Reconstruction

AFT-VO: Asynchronous Fusion Transformers for Multi-View Visual Odometry Estimation

no code implementations 26 Jun 2022 Nimet Kaygusuz, Oscar Mendez, Richard Bowden

To address this limitation, in this work, we propose AFT-VO, a novel transformer-based sensor fusion architecture to estimate VO from multiple sensors.

Motion Estimation Sensor Fusion +1

Medusa: Universal Feature Learning via Attentional Multitasking

no code implementations 12 Apr 2022 Jaime Spencer, Richard Bowden, Simon Hadfield

We argue that MTL is a stepping stone towards universal feature learning (UFL), which is the ability to learn generic features that can be applied to new tasks without retraining.

Decoder Multi-Task Learning

"The Pedestrian next to the Lamppost" Adaptive Object Graphs for Better Instantaneous Mapping

no code implementations CVPR 2022 Avishkar Saha, Oscar Mendez, Chris Russell, Richard Bowden

Estimating a semantically segmented bird's-eye-view (BEV) map from a single image has become a popular technique for autonomous control and navigation.

Graph Neural Network

Signing at Scale: Learning to Co-Articulate Signs for Large-Scale Photo-Realistic Sign Language Production

1 code implementation CVPR 2022 Ben Saunders, Necati Cihan Camgoz, Richard Bowden

To learn sign co-articulation, we propose a novel Frame Selection Network (FS-Net) that improves the temporal alignment of interpolated dictionary signs to continuous signing sequences.

Sign Language Production

A Free Lunch with Influence Functions? Improving Neural Network Estimates with Concepts from Semiparametric Statistics

no code implementations 18 Feb 2022 Matthew J. Vowels, Sina Akbari, Necati Cihan Camgoz, Richard Bowden

Unfortunately, they are unlikely to be sufficiently flexible to be able to adequately model real-world phenomena, and may yield biased estimates.

Causal Inference

MDN-VO: Estimating Visual Odometry with Confidence

no code implementations 23 Dec 2021 Nimet Kaygusuz, Oscar Mendez, Richard Bowden

Visual Odometry (VO) is used in many applications including robotics and autonomous systems.

Visual Odometry

Multi-Camera Sensor Fusion for Visual Odometry using Deep Uncertainty Estimation

no code implementations 23 Dec 2021 Nimet Kaygusuz, Oscar Mendez, Richard Bowden

To address this issue, we propose a deep sensor fusion framework which estimates vehicle motion using both pose and uncertainty estimations from multiple on-board cameras.

Autonomous Driving Sensor Fusion +1

Skeletal Graph Self-Attention: Embedding a Skeleton Inductive Bias into Sign Language Production

no code implementations SLTAT (LREC) 2022 Ben Saunders, Necati Cihan Camgoz, Richard Bowden

Recent approaches to Sign Language Production (SLP) have adopted spoken language Neural Machine Translation (NMT) architectures, applied without sign-specific modifications.

Inductive Bias Machine Translation +3

Human Pose Manipulation and Novel View Synthesis using Differentiable Rendering

1 code implementation 24 Nov 2021 Guillaume Rochette, Chris Russell, Richard Bowden

We show how our approach can be used for motion transfer between individuals; novel view synthesis of individuals captured from just a single camera; to synthesize individuals from any virtual viewpoint; and to re-render people in novel poses.

Decoder Image Reconstruction +1

Translating Images into Maps

1 code implementation 3 Oct 2021 Avishkar Saha, Oscar Mendez Maldonado, Chris Russell, Richard Bowden

We show how a novel form of transformer network can be used to map from images and video directly to an overhead map or bird's-eye-view (BEV) of the world, in a single end-to-end network.

Translation

Mixed SIGNals: Sign Language Production via a Mixture of Motion Primitives

no code implementations ICCV 2021 Ben Saunders, Necati Cihan Camgoz, Richard Bowden

Using a progressive transformer for the translation sub-task, we propose a novel Mixture of Motion Primitives (MoMP) architecture for sign language animation.

Sign Language Production Translation

AnonySIGN: Novel Human Appearance Synthesis for Sign Language Video Anonymisation

no code implementations 22 Jul 2021 Ben Saunders, Necati Cihan Camgoz, Richard Bowden

To tackle SLVA, we propose AnonySign, a novel automatic approach for visual anonymisation of sign language data.

Image-to-Image Translation

Looking for the Signs: Identifying Isolated Sign Instances in Continuous Video Footage

no code implementations 21 Jul 2021 Tao Jiang, Necati Cihan Camgoz, Richard Bowden

In this paper, we focus on the task of one-shot sign spotting, i.e. given an example of an isolated sign (query), we want to identify whether/where this sign appears in a continuous, co-articulated sign language video (target).

Adversarial Mixture Density Networks: Learning to Drive Safely from Collision Data

1 code implementation 9 Jul 2021 Sampo Kuutti, Saber Fallah, Richard Bowden

By penalising the safe action distribution based on its similarity to the unsafe action distribution when training on the collision dataset, a more robust and safe control policy is obtained.

Autonomous Driving Imitation Learning

ARC: Adversarially Robust Control Policies for Autonomous Vehicles

1 code implementation 9 Jul 2021 Sampo Kuutti, Saber Fallah, Richard Bowden

By training the protagonist against an ensemble of adversaries, it learns a significantly more robust control policy, which generalises to a variety of adversarial strategies.

ARC Autonomous Vehicles

Markov Localisation using Heatmap Regression and Deep Convolutional Odometry

no code implementations 1 Jun 2021 Oscar Mendez, Simon Hadfield, Richard Bowden

Recent advances in deep learning hardware allow large likelihood volumes to be stored directly on the GPU, along with the hardware necessary to efficiently perform GPU-bound 3D convolutions, and this obviates many of the disadvantages of grid-based methods.

regression

Skeletor: Skeletal Transformers for Robust Body-Pose Estimation

no code implementations 23 Apr 2021 Tao Jiang, Necati Cihan Camgoz, Richard Bowden

Skeletor can achieve this as it implicitly learns the spatio-temporal context of human motion via a transformer-based neural network.

3D Human Pose Estimation Sign Language Translation +1

Shadow-Mapping for Unsupervised Neural Causal Discovery

no code implementations 16 Apr 2021 Matthew J. Vowels, Necati Cihan Camgoz, Richard Bowden

An important goal across most scientific fields is the discovery of causal structures underlying a set of observations.

Causal Discovery

There and Back Again: Self-supervised Multispectral Correspondence Estimation

no code implementations 19 Mar 2021 Celyn Walters, Oscar Mendez, Mark Johnson, Richard Bowden

In this work, we aim to address the dense correspondence estimation problem in a way that generalizes to more than one spectrum.

Autonomous Vehicles

Weakly Supervised Reinforcement Learning for Autonomous Highway Driving via Virtual Safety Cages

1 code implementation 17 Mar 2021 Sampo Kuutti, Richard Bowden, Saber Fallah

We compare models with and without safety cages, as well as models with optimal and constrained model parameters, and show that the weak supervision consistently improves the safety of exploration, speed of convergence, and model performance.

Autonomous Vehicles reinforcement-learning +2

VDSM: Unsupervised Video Disentanglement with State-Space Modeling and Deep Mixtures of Experts

1 code implementation CVPR 2021 Matthew J. Vowels, Necati Cihan Camgoz, Richard Bowden

Given that supervision is often expensive or infeasible to acquire, we choose to incorporate structural inductive bias and present an unsupervised, deep State-Space-Model for Video Disentanglement (VDSM).

Decoder Disentanglement +1

Continuous 3D Multi-Channel Sign Language Production via Progressive Transformers and Mixture Density Networks

no code implementations 11 Mar 2021 Ben Saunders, Necati Cihan Camgoz, Richard Bowden

Sign languages are multi-channel visual languages, where signers use a continuous 3D space to communicate. Sign Language Production (SLP), the automatic translation from spoken to sign languages, must embody both the continuous articulation and full morphology of sign to be truly understandable by the Deaf community.

Data Augmentation Sign Language Production +1

Everybody Sign Now: Translating Spoken Language to Photo Realistic Sign Language Video

no code implementations 19 Nov 2020 Ben Saunders, Necati Cihan Camgoz, Richard Bowden

To be truly understandable and accepted by Deaf communities, an automatic Sign Language Production (SLP) system must generate a photo-realistic signer.

Sign Language Production Video Generation

Targeted VAE: Variational and Targeted Learning for Causal Inference

1 code implementation 28 Sep 2020 Matthew James Vowels, Necati Cihan Camgoz, Richard Bowden

Undertaking causal inference with observational data is incredibly useful across a wide range of tasks including the development of medical treatments, advertisements and marketing, and policy making.

Causal Inference counterfactual +2

Targeted VAE: Structured Inference and Targeted Learning for Causal Parameter Estimation

no code implementations 28 Sep 2020 Matthew James Vowels, Necati Cihan Camgoz, Richard Bowden

Undertaking causal inference with observational data is extremely useful across a wide range of domains including the development of medical treatments, advertisements and marketing, and policy making.

Causal Inference counterfactual +1

Multi-channel Transformers for Multi-articulatory Sign Language Translation

no code implementations 1 Sep 2020 Necati Cihan Camgoz, Oscar Koller, Simon Hadfield, Richard Bowden

Sign languages use multiple asynchronous information channels (articulators), not just the hands but also the face and body, which computational approaches often ignore.

Sign Language Translation Translation

Adversarial Training for Multi-Channel Sign Language Production

no code implementations 27 Aug 2020 Ben Saunders, Necati Cihan Camgoz, Richard Bowden

Sign Languages are rich multi-channel languages, requiring articulation of both manual (hands) and non-manual (face and body) features in a precise, intricate manner.

Sign Language Production Translation

Progressive Transformers for End-to-End Sign Language Production

1 code implementation ECCV 2020 Ben Saunders, Necati Cihan Camgoz, Richard Bowden

The goal of automatic Sign Language Production (SLP) is to translate spoken language to a continuous stream of sign language video at a level comparable to a human translator.

Sign Language Production Translation

Same Features, Different Day: Weakly Supervised Feature Learning for Seasonal Invariance

1 code implementation CVPR 2020 Jaime Spencer, Richard Bowden, Simon Hadfield

The aim of this paper is to provide a dense feature representation that can be used to perform localization, sparse matching or image retrieval, regardless of the current seasonal or temporal appearance.

Image Retrieval Retrieval

DeFeat-Net: General Monocular Depth via Simultaneous Unsupervised Representation Learning

1 code implementation CVPR 2020 Jaime Spencer, Richard Bowden, Simon Hadfield

In the current monocular depth research, the dominant approach is to employ unsupervised training on large datasets, driven by warped photometric consistency.

Monocular Depth Estimation Representation Learning

Training Adversarial Agents to Exploit Weaknesses in Deep Control Policies

1 code implementation 27 Feb 2020 Sampo Kuutti, Saber Fallah, Richard Bowden

As the networks used to obtain state-of-the-art results become increasingly deep and complex, the rules they have learned and how they operate become more challenging to understand.

Autonomous Driving reinforcement-learning +3

NestedVAE: Isolating Common Factors via Weak Supervision

no code implementations CVPR 2020 Matthew J. Vowels, Necati Cihan Camgoz, Richard Bowden

Two outer VAEs with shared weights attempt to reconstruct the input and infer a latent space, whilst a nested VAE attempts to reconstruct the latent representation of one image, from the latent representation of its paired image.

Attribute Change Detection +1

A Survey of Deep Learning Applications to Autonomous Vehicle Control

no code implementations 23 Dec 2019 Sampo Kuutti, Richard Bowden, Yaochu Jin, Phil Barber, Saber Fallah

However, deep learning methods have shown great promise in not only providing excellent performance for complex and non-linear control problems, but also in generalising previously learned rules to new scenarios.

Autonomous Vehicles Deep Learning +4

Gated Variational AutoEncoders: Incorporating Weak Supervision to Encourage Disentanglement

no code implementations 15 Nov 2019 Matthew J. Vowels, Necati Cihan Camgoz, Richard Bowden

However, there is some debate about how to encourage disentanglement with VAEs and evidence indicates that existing implementations of VAEs do not achieve disentanglement consistently.

Disentanglement Informativeness

Weakly-Supervised 3D Pose Estimation from a Single Image using Multi-View Consistency

no code implementations 13 Sep 2019 Guillaume Rochette, Chris Russell, Richard Bowden

We present a novel data-driven regularizer for weakly-supervised learning of 3D human pose estimation that eliminates the drift problem that affects existing approaches.

3D Human Pose Estimation 3D Pose Estimation +2

Scale-Adaptive Neural Dense Features: Learning via Hierarchical Context Aggregation

1 code implementation CVPR 2019 Jaime Spencer, Richard Bowden, Simon Hadfield

In all cases, we show how incorporating SAND features results in better or comparable results to the baseline, whilst requiring little to no additional training.

Disparity Estimation Semantic Segmentation

Localisation via Deep Imagination: learn the features not the map

no code implementations 19 Nov 2018 Jaime Spencer, Oscar Mendez, Richard Bowden, Simon Hadfield

In order to build the embedded map, we train a deep Siamese Fully Convolutional U-Net to perform dense feature extraction.

Neural Sign Language Translation

1 code implementation CVPR 2018 Necati Cihan Camgoz, Simon Hadfield, Oscar Koller, Hermann Ney, Richard Bowden

SLR seeks to recognize a sequence of continuous signs but neglects the underlying rich grammatical and linguistic structures of sign language that differ from spoken language.

Gesture Recognition Language Modelling +5

Taking the Scenic Route to 3D: Optimising Reconstruction From Moving Cameras

no code implementations ICCV 2017 Oscar Mendez, Simon Hadfield, Nicolas Pugeault, Richard Bowden

This approach is ill-suited for reconstruction applications, where learning about the environment is more valuable than speed of traversal.

SeDAR - Semantic Detection and Ranging: Humans can localise without LiDAR, can robots?

no code implementations 5 Sep 2017 Oscar Mendez, Simon Hadfield, Nicolas Pugeault, Richard Bowden

Similarly, we do not extrude the 2D geometry present in the floorplan into 3D and try to align it to the real-world.

Image and Video Mining through Online Learning

no code implementations 9 Sep 2016 Andrew Gilbert, Richard Bowden

On the UCF11 video dataset, the accuracy is 86.7% despite using only 90 labelled examples from a dataset of over 1200 videos, instead of the standard 1122 training videos.

Action Recognition Active Learning +3

Deep Hand: How to Train a CNN on 1 Million Hand Images When Your Data Is Continuous and Weakly Labelled

no code implementations CVPR 2016 Oscar Koller, Hermann Ney, Richard Bowden

Furthermore, we demonstrate its use in continuous sign language recognition on two publicly available large sign language data sets, where it outperforms the current state-of-the-art by a large margin.

Sign Language Recognition Video Recognition

Exploring Causal Relationships in Visual Object Tracking

no code implementations ICCV 2015 Karel Lebeda, Simon Hadfield, Richard Bowden

We show that the location predictions are robust to camera shake and sudden motion, which is invaluable for any tracking algorithm, and demonstrate this by applying causal prediction to two state-of-the-art trackers.

Object Visual Object Tracking

Hollywood 3D: Recognizing Actions in 3D Natural Scenes

no code implementations CVPR 2013 Simon Hadfield, Richard Bowden

In addition, two state-of-the-art action recognition algorithms are extended to make use of the 3D data, and five new interest point detection strategies that extend to the 3D data are also proposed.

Action Recognition Benchmarking +2
