no code implementations • 19 Aug 2024 • Oliver Cory, Ozge Mercanoglu Sincan, Matthew Vowels, Alessia Battisti, Franz Holzknecht, Katja Tissi, Sandra Sidler-Miserez, Tobias Haug, Sarah Ebling, Richard Bowden
Sign Language Assessment (SLA) tools are useful to aid in language learning and are underdeveloped.
no code implementations • 16 May 2024 • Mohamed Ilyes Lakhal, Richard Bowden
The generator framework is presented as a UNet architecture to ensure spatial preservation of the input pose, and we include the visual features from the variational inference to maintain control over appearance and style.
1 code implementation • 13 May 2024 • Harry Walsh, Ben Saunders, Richard Bowden
Then by applying filtering in the frequency domain and resampling each sign we create cohesive natural sequences, that mimic the prosody found in the original data.
no code implementations • 7 May 2024 • Ryan Wong, Necati Cihan Camgoz, Richard Bowden
Automatic Sign Language Translation requires the integration of both computer vision and natural language processing to effectively bridge the communication gap between sign and spoken languages.
no code implementations • 25 Apr 2024 • Jaime Spencer, Fabio Tosi, Matteo Poggi, Ripudaman Singh Arora, Chris Russell, Simon Hadfield, Richard Bowden, Guangyuan Zhou, Zhengxin Li, Qiang Rao, Yiping Bao, Xiao Liu, Dohyeong Kim, Jinseong Kim, Myunghyun Kim, Mykola Lavreniuk, Rui Li, Qing Mao, Jiang Wu, Yu Zhu, Jinqiu Sun, Yanning Zhang, Suraj Patni, Aradhye Agarwal, Chetan Arora, Pihai Sun, Kui Jiang, Gang Wu, Jian Liu, Xianming Liu, Junjun Jiang, Xidan Zhang, Jianing Wei, Fangjun Wang, Zhiming Tan, Jiabao Wang, Albert Luginov, Muhammad Shahzad, Seyed Hosseini, Aleksander Trajcevski, James H. Elder
This paper discusses the results of the third edition of the Monocular Depth Estimation Challenge (MDEC).
no code implementations • 17 Apr 2024 • Harry Walsh, Ben Saunders, Richard Bowden
Sign languages, often categorised as low-resource languages, face significant challenges in achieving accurate translation due to the scarcity of parallel annotated datasets.
1 code implementation • 17 Apr 2024 • Harry Walsh, Abolfazl Ravanshad, Mariam Rahmani, Richard Bowden
By applying Vector Quantisation (VQ) to sign language data, we first learn a codebook of short motions that can be combined to create a natural sequence of sign.
no code implementations • 8 Apr 2024 • Maksym Ivashechkin, Oscar Mendez, Richard Bowden
This work addresses the intersection of hands by exploiting an occupancy network that represents the hand's volume as a continuous manifold.
no code implementations • 15 Mar 2024 • Ozge Mercanoglu Sincan, Necati Cihan Camgoz, Richard Bowden
Sign Language Translation (SLT) is a challenging task that aims to generate spoken language sentences from sign language videos.
1 code implementation • 15 Mar 2024 • Anton Pelykh, Ozge Mercanoglu Sincan, Richard Bowden
Our approach not only enhances the quality of the generated hands but also offers improved control over hand pose, advancing the capabilities of pose-conditioned human image generation.
1 code implementation • 3 Mar 2024 • Jaime Spencer, Chris Russell, Simon Hadfield, Richard Bowden
Self-supervised learning is the key to unlocking generic computer vision systems.
no code implementations • 18 Aug 2023 • Ryan Wong, Necati Cihan Camgoz, Richard Bowden
In natural language processing (NLP) of spoken languages, word embeddings have been shown to be a useful method to encode the meaning of words.
no code implementations • 18 Aug 2023 • Ozge Mercanoglu Sincan, Necati Cihan Camgoz, Richard Bowden
Sign Language Translation (SLT) is a challenging task that aims to generate spoken language sentences from sign language videos, both of which have different grammar and word/gloss order.
no code implementations • 18 Aug 2023 • Maksym Ivashechkin, Oscar Mendez, Richard Bowden
Hand pose estimation from a single image has many applications.
no code implementations • 18 Aug 2023 • Maksym Ivashechkin, Oscar Mendez, Richard Bowden
Given a 2D detection of keypoints in the image, we lift the skeleton to 3D using neural networks to predict both the joint rotations and bone lengths.
no code implementations • 8 Aug 2023 • Harry Walsh, Ozge Mercanoglu Sincan, Ben Saunders, Richard Bowden
As a result, research has turned to TV broadcast content as a source of large-scale training data, consisting of both the sign language interpreter and the associated audio subtitle.
1 code implementation • ICCV 2023 • Jaime Spencer, Chris Russell, Simon Hadfield, Richard Bowden
Unfortunately, existing approaches limit themselves to the automotive domain, resulting in models incapable of generalizing to complex environments such as natural or indoor settings.
no code implementations • ICCV 2023 • Avishkar Saha, Oscar Mendez, Chris Russell, Richard Bowden
Our module can be readily integrated into existing pipelines involving graph convolution operations, replacing the predetermined or existing adjacency matrix with one that is learned, and optimized, as part of the general objective.
no code implementations • 14 Apr 2023 • Jaime Spencer, C. Stella Qian, Michaela Trescakova, Chris Russell, Simon Hadfield, Erich W. Graf, Wendy J. Adams, Andrew J. Schofield, James Elder, Richard Bowden, Ali Anwar, Hao Chen, Xiaozhi Chen, Kai Cheng, Yuchao Dai, Huynh Thai Hoa, Sadat Hossain, Jianmian Huang, Mohan Jing, Bo Li, Chao Li, Baojun Li, Zhiwen Liu, Stefano Mattoccia, Siegfried Mercelis, Myungwoo Nam, Matteo Poggi, Xiaohua Qi, Jiahui Ren, Yang Tang, Fabio Tosi, Linh Trinh, S. M. Nadim Uddin, Khan Muhammad Umair, Kaixuan Wang, YuFei Wang, Yixing Wang, Mochu Xiang, Guangkai Xu, Wei Yin, Jun Yu, Qi Zhang, Chaoqiang Zhao
This paper discusses the results for the second edition of the Monocular Depth Estimation Challenge (MDEC).
no code implementations • 29 Mar 2023 • Salar Arbabi, Davide Tavernini, Saber Fallah, Richard Bowden
This paper presents a decision making approach for autonomous driving, focusing on the complex task of merging into moving traffic where uncertainty emanates from the behavior of other drivers and imperfect sensor measurements.
1 code implementation • 28 Mar 2023 • Guillaume Rochette, Chris Russell, Richard Bowden
We show how our approach can be used for motion transfer between individuals; novel view synthesis of individuals captured from just a single camera; to synthesize individuals from any virtual viewpoint; and to re-render people in novel poses.
1 code implementation • 22 Nov 2022 • Jaime Spencer, C. Stella Qian, Chris Russell, Simon Hadfield, Erich Graf, Wendy Adams, Andrew J. Schofield, James Elder, Richard Bowden, Heng Cong, Stefano Mattoccia, Matteo Poggi, Zeeshan Khan Suri, Yang Tang, Fabio Tosi, Hao Wang, Youmin Zhang, Yusheng Zhang, Chaoqiang Zhao
This challenge evaluated the progress of self-supervised monocular depth estimation on the challenging SYNS-Patches dataset.
no code implementations • 3 Oct 2022 • Ryan Wong, Necati Cihan Camgöz, Richard Bowden
Most of the vision-based sign language research to date has focused on Isolated Sign Language Recognition (ISLR), where the objective is to predict a single sign class given a short video clip.
no code implementations • SLTAT (LREC) 2022 • Harry Walsh, Ben Saunders, Richard Bowden
We use language models such as BERT and Word2Vec to create better sentence level embeddings, and apply several tokenization techniques, demonstrating how these improve performance on the low resource translation task of Text to Gloss.
2 code implementations • 2 Aug 2022 • Jaime Spencer, Chris Russell, Simon Hadfield, Richard Bowden
It is likely that many papers were not only optimized for particular datasets, but also for errors in the data and evaluation criteria.
no code implementations • 26 Jun 2022 • Nimet Kaygusuz, Oscar Mendez, Richard Bowden
To address this limitation, in this work, we propose AFT-VO, a novel transformer-based sensor fusion architecture to estimate VO from multiple sensors.
no code implementations • 12 Apr 2022 • Jaime Spencer, Richard Bowden, Simon Hadfield
We argue that MTL is a stepping stone towards universal feature learning (UFL), which is the ability to learn generic features that can be applied to new tasks without retraining.
no code implementations • CVPR 2022 • Avishkar Saha, Oscar Mendez, Chris Russell, Richard Bowden
Estimating a semantically segmented bird's-eye-view (BEV) map from a single image has become a popular technique for autonomous control and navigation.
1 code implementation • CVPR 2022 • Ben Saunders, Necati Cihan Camgoz, Richard Bowden
To learn sign co-articulation, we propose a novel Frame Selection Network (FS-Net) that improves the temporal alignment of interpolated dictionary signs to continuous signing sequences.
no code implementations • 18 Feb 2022 • Matthew J. Vowels, Sina Akbari, Necati Cihan Camgoz, Richard Bowden
Unfortunately, they are unlikely to be sufficiently flexible to be able to adequately model real-world phenomena, and may yield biased estimates.
no code implementations • 23 Dec 2021 • Nimet Kaygusuz, Oscar Mendez, Richard Bowden
Visual Odometry (VO) is used in many applications including robotics and autonomous systems.
no code implementations • 23 Dec 2021 • Nimet Kaygusuz, Oscar Mendez, Richard Bowden
To address this issue, we propose a deep sensor fusion framework which estimates vehicle motion using both pose and uncertainty estimations from multiple on-board cameras.
no code implementations • SLTAT (LREC) 2022 • Ben Saunders, Necati Cihan Camgoz, Richard Bowden
Recent approaches to Sign Language Production (SLP) have adopted spoken language Neural Machine Translation (NMT) architectures, applied without sign-specific modifications.
1 code implementation • 24 Nov 2021 • Guillaume Rochette, Chris Russell, Richard Bowden
We show how our approach can be used for motion transfer between individuals; novel view synthesis of individuals captured from just a single camera; to synthesize individuals from any virtual viewpoint; and to re-render people in novel poses.
1 code implementation • 3 Oct 2021 • Avishkar Saha, Oscar Mendez Maldonado, Chris Russell, Richard Bowden
We show how a novel form of transformer network can be used to map from images and video directly to an overhead map or bird's-eye-view (BEV) of the world, in a single end-to-end network.
no code implementations • 25 Jul 2021 • Oscar Mendez, Matthew Vowels, Richard Bowden
Attention is an important component of modern deep learning.
no code implementations • ICCV 2021 • Ben Saunders, Necati Cihan Camgoz, Richard Bowden
Using a progressive transformer for the translation sub-task, we propose a novel Mixture of Motion Primitives (MoMP) architecture for sign language animation.
no code implementations • 22 Jul 2021 • Ben Saunders, Necati Cihan Camgoz, Richard Bowden
To tackle SLVA, we propose AnonySign, a novel automatic approach for visual anonymisation of sign language data.
no code implementations • 21 Jul 2021 • Tao Jiang, Necati Cihan Camgoz, Richard Bowden
In this paper, we focus on the task of one-shot sign spotting, i.e. given an example of an isolated sign (query), we want to identify whether/where this sign appears in a continuous, co-articulated sign language video (target).
1 code implementation • 9 Jul 2021 • Sampo Kuutti, Saber Fallah, Richard Bowden
By penalising the safe action distribution based on its similarity to the unsafe action distribution when training on the collision dataset, a more robust and safe control policy is obtained.
1 code implementation • 9 Jul 2021 • Sampo Kuutti, Saber Fallah, Richard Bowden
By training the protagonist against an ensemble of adversaries, it learns a significantly more robust control policy, which generalises to a variety of adversarial strategies.
no code implementations • 1 Jun 2021 • Oscar Mendez, Simon Hadfield, Richard Bowden
Recent advances in deep learning hardware allow large likelihood volumes to be stored directly on the GPU, along with the hardware necessary to efficiently perform GPU-bound 3D convolutions, and this obviates many of the disadvantages of grid based methods.
no code implementations • 5 May 2021 • Necati Cihan Camgoz, Ben Saunders, Guillaume Rochette, Marco Giovanelli, Giacomo Inches, Robin Nachtrab-Ribback, Richard Bowden
Computational sign language research lacks the large-scale datasets that enable the creation of useful real-life applications.
no code implementations • 23 Apr 2021 • Tao Jiang, Necati Cihan Camgoz, Richard Bowden
Skeletor can achieve this as it implicitly learns the spatio-temporal context of human motion via a transformer based neural network.
no code implementations • 20 Apr 2021 • Amit Moryossef, Ioannis Tsochantaridis, Joe Dinn, Necati Cihan Camgöz, Richard Bowden, Tao Jiang, Annette Rios, Mathias Müller, Sarah Ebling
Basically, skeletal representations generalize over an individual's appearance and background, allowing us to focus on the recognition of motion.
no code implementations • 16 Apr 2021 • Matthew J. Vowels, Necati Cihan Camgoz, Richard Bowden
An important goal across most scientific fields is the discovery of causal structures underlying a set of observations.
no code implementations • 19 Mar 2021 • Celyn Walters, Oscar Mendez, Mark Johnson, Richard Bowden
In this work, we aim to address the dense correspondence estimation problem in a way that generalizes to more than one spectrum.
1 code implementation • 17 Mar 2021 • Sampo Kuutti, Richard Bowden, Saber Fallah
We compare models with and without safety cages, as well as models with optimal and constrained model parameters, and show that the weak supervision consistently improves the safety of exploration, speed of convergence, and model performance.
1 code implementation • CVPR 2021 • Matthew J. Vowels, Necati Cihan Camgoz, Richard Bowden
Given that supervision is often expensive or infeasible to acquire, we choose to incorporate structural inductive bias and present an unsupervised, deep State-Space-Model for Video Disentanglement (VDSM).
no code implementations • 11 Mar 2021 • Ben Saunders, Necati Cihan Camgoz, Richard Bowden
Sign languages are multi-channel visual languages, where signers use a continuous 3D space to communicate. Sign Language Production (SLP), the automatic translation from spoken to sign languages, must embody both the continuous articulation and full morphology of sign to be truly understandable by the Deaf community.
no code implementations • 3 Mar 2021 • Matthew J. Vowels, Necati Cihan Camgoz, Richard Bowden
Causal reasoning is a crucial part of science and human intelligence.
no code implementations • 19 Nov 2020 • Ben Saunders, Necati Cihan Camgoz, Richard Bowden
To be truly understandable and accepted by Deaf communities, an automatic Sign Language Production (SLP) system must generate a photo-realistic signer.
1 code implementation • 28 Sep 2020 • Matthew James Vowels, Necati Cihan Camgoz, Richard Bowden
Undertaking causal inference with observational data is incredibly useful across a wide range of tasks including the development of medical treatments, advertisements and marketing, and policy making.
no code implementations • 28 Sep 2020 • Matthew James Vowels, Necati Cihan Camgoz, Richard Bowden
Undertaking causal inference with observational data is extremely useful across a wide range of domains including the development of medical treatments, advertisements and marketing, and policy making.
no code implementations • 1 Sep 2020 • Necati Cihan Camgoz, Oscar Koller, Simon Hadfield, Richard Bowden
Sign languages use multiple asynchronous information channels (articulators), not just the hands but also the face and body, which computational approaches often ignore.
no code implementations • 27 Aug 2020 • Ben Saunders, Necati Cihan Camgoz, Richard Bowden
Sign Languages are rich multi-channel languages, requiring articulation of both manual (hands) and non-manual (face and body) features in a precise, intricate manner.
1 code implementation • ECCV 2020 • Ben Saunders, Necati Cihan Camgoz, Richard Bowden
The goal of automatic Sign Language Production (SLP) is to translate spoken language to a continuous stream of sign language video at a level comparable to a human translator.
1 code implementation • CVPR 2020 • Jaime Spencer, Richard Bowden, Simon Hadfield
The aim of this paper is to provide a dense feature representation that can be used to perform localization, sparse matching or image retrieval, regardless of the current seasonal or temporal appearance.
1 code implementation • CVPR 2020 • Jaime Spencer, Richard Bowden, Simon Hadfield
In the current monocular depth research, the dominant approach is to employ unsupervised training on large datasets, driven by warped photometric consistency.
2 code implementations • CVPR 2020 • Necati Cihan Camgoz, Oscar Koller, Simon Hadfield, Richard Bowden
We report state-of-the-art sign language recognition and translation results achieved by our Sign Language Transformers.
1 code implementation • 27 Feb 2020 • Sampo Kuutti, Saber Fallah, Richard Bowden
As the networks used to obtain state-of-the-art results become increasingly deep and complex, the rules they have learned and how they operate become more challenging to understand.
no code implementations • CVPR 2020 • Matthew J. Vowels, Necati Cihan Camgoz, Richard Bowden
Two outer VAEs with shared weights attempt to reconstruct the input and infer a latent space, whilst a nested VAE attempts to reconstruct the latent representation of one image, from the latent representation of its paired image.
no code implementations • 23 Dec 2019 • Sampo Kuutti, Richard Bowden, Yaochu Jin, Phil Barber, Saber Fallah
However, deep learning methods have shown great promise in not only providing excellent performance for complex and non-linear control problems, but also in generalising previously learned rules to new scenarios.
no code implementations • 15 Nov 2019 • Matthew J. Vowels, Necati Cihan Camgoz, Richard Bowden
However, there is some debate about how to encourage disentanglement with VAEs and evidence indicates that existing implementations of VAEs do not achieve disentanglement consistently.
no code implementations • 13 Sep 2019 • Guillaume Rochette, Chris Russell, Richard Bowden
We present a novel data-driven regularizer for weakly-supervised learning of 3D human pose estimation that eliminates the drift problem that affects existing approaches.
1 code implementation • CVPR 2019 • Jaime Spencer, Richard Bowden, Simon Hadfield
In all cases, we show how incorporating SAND features results in better or comparable results to the baseline, whilst requiring little to no additional training.
no code implementations • 19 Nov 2018 • Jaime Spencer, Oscar Mendez, Richard Bowden, Simon Hadfield
In order to build the embedded map, we train a deep Siamese Fully Convolutional U-Net to perform dense feature extraction.
1 code implementation • CVPR 2018 • Necati Cihan Camgoz, Simon Hadfield, Oscar Koller, Hermann Ney, Richard Bowden
SLR seeks to recognize a sequence of continuous signs but neglects the underlying rich grammatical and linguistic structures of sign language that differ from spoken language.
Ranked #11 on Sign Language Translation on RWTH-PHOENIX-Weather 2014 T
no code implementations • ICCV 2017 • Oscar Mendez, Simon Hadfield, Nicolas Pugeault, Richard Bowden
This approach is ill-suited for reconstruction applications, where learning about the environment is more valuable than speed of traversal.
2 code implementations • ICCV 2017 • Necati Cihan Camgoz, Simon Hadfield, Oscar Koller, Richard Bowden
We propose a novel deep learning approach to solve simultaneous alignment and recognition problems (referred to as "Sequence-to-sequence" learning).
Ranked #20 on Sign Language Recognition on RWTH-PHOENIX-Weather 2014
no code implementations • 5 Sep 2017 • Oscar Mendez, Simon Hadfield, Nicolas Pugeault, Richard Bowden
Similarly, we do not extrude the 2D geometry present in the floorplan into 3D and try to align it to the real-world.
no code implementations • 9 Sep 2016 • Andrew Gilbert, Richard Bowden
On the UCF11 video dataset, the accuracy is 86.7% despite using only 90 labelled examples from a dataset of over 1200 videos, instead of the standard 1122 training videos.
no code implementations • CVPR 2016 • Oscar Koller, Hermann Ney, Richard Bowden
Furthermore, we demonstrate its use in continuous sign language recognition on two publicly available large sign language data sets, where it outperforms the current state-of-the-art by a large margin.
no code implementations • ICCV 2015 • Karel Lebeda, Simon Hadfield, Richard Bowden
We show that the location predictions are robust to camera shake and sudden motion, which is invaluable for any tracking algorithm, and demonstrate this by applying causal prediction to two state-of-the-art trackers.
no code implementations • ICCV 2015 • Simon Hadfield, Richard Bowden
We present a novel approach to 3D reconstruction which is inspired by the human visual system.
no code implementations • CVPR 2014 • Eng-Jon Ong, Oscar Koller, Nicolas Pugeault, Richard Bowden
This paper tackles the problem of spotting a set of signs occurring in videos with sequences of signs.
no code implementations • CVPR 2013 • Simon Hadfield, Richard Bowden
In addition, two state of the art action recognition algorithms are extended to make use of the 3D data, and five new interest point detection strategies are also proposed, that extend to the 3D data.