1 code implementation • ICCV 2023 • Liliane Momeni, Mathilde Caron, Arsha Nagrani, Andrew Zisserman, Cordelia Schmid
Understanding verbs is crucial to modelling how people and objects interact with each other and the environment through space and time.
Ranked #11 on Zero-Shot Video Question Answer on NExT-QA
no code implementations • 2 Apr 2023 • Samuel Albanie, Liliane Momeni, João F. Henriques
Driven by recent advances in AI, we passengers are entering a golden age of scientific discovery.
1 code implementation • 16 Nov 2022 • K R Prajwal, Hannah Bull, Liliane Momeni, Samuel Albanie, Gül Varol, Andrew Zisserman
Through extensive evaluations, we verify our method for automatic annotation and our model architecture.
no code implementations • 4 Aug 2022 • Liliane Momeni, Hannah Bull, K R Prajwal, Samuel Albanie, Gül Varol, Andrew Zisserman
Recently, sign language researchers have turned to sign language interpreted TV broadcasts, comprising (i) a video of continuous signing and (ii) subtitles corresponding to the audio content, as a readily available and large-scale source of training data.
no code implementations • 9 May 2022 • Gül Varol, Liliane Momeni, Samuel Albanie, Triantafyllos Afouras, Andrew Zisserman
The focus of this work is sign spotting - given a video of an isolated sign, our task is to identify whether and where it has been signed in a continuous, co-articulated sign language video.
no code implementations • 5 Nov 2021 • Samuel Albanie, Gül Varol, Liliane Momeni, Hannah Bull, Triantafyllos Afouras, Himel Chowdhury, Neil Fox, Bencie Woll, Rob Cooper, Andrew McParland, Andrew Zisserman
In this work, we introduce the BBC-Oxford British Sign Language (BOBSL) dataset, a large-scale video collection of British Sign Language (BSL).
1 code implementation • 29 Oct 2021 • K R Prajwal, Liliane Momeni, Triantafyllos Afouras, Andrew Zisserman
In this paper, we consider the task of spotting spoken keywords in silent video sequences -- also known as visual keyword spotting.
Ranked #1 on Visual Keyword Spotting on LRS2
no code implementations • ICCV 2021 • Hannah Bull, Triantafyllos Afouras, Gül Varol, Samuel Albanie, Liliane Momeni, Andrew Zisserman
The goal of this work is to temporally align asynchronous subtitles in sign language videos.
no code implementations • CVPR 2021 • Gül Varol, Liliane Momeni, Samuel Albanie, Triantafyllos Afouras, Andrew Zisserman
Our contributions are as follows: (1) we demonstrate the ability to leverage large quantities of continuous signing videos with weakly-aligned subtitles to localise signs in continuous sign language; (2) we employ the learned attention to automatically generate hundreds of thousands of annotations for a large sign vocabulary; (3) we collect a set of 37K manually verified sign instances across a vocabulary of 950 sign classes to support our study of sign language recognition; (4) by training on the newly annotated data from our method, we outperform the prior state of the art on the BSL-1K sign language recognition benchmark.
1 code implementation • 8 Oct 2020 • Liliane Momeni, Gül Varol, Samuel Albanie, Triantafyllos Afouras, Andrew Zisserman
The focus of this work is sign spotting - given a video of an isolated sign, our task is to identify whether and where it has been signed in a continuous, co-articulated sign language video.
1 code implementation • 2 Sep 2020 • Liliane Momeni, Triantafyllos Afouras, Themos Stafylakis, Samuel Albanie, Andrew Zisserman
The goal of this work is to automatically determine whether and when a word of interest is spoken by a talking face, with or without the audio.
1 code implementation • ECCV 2020 • Samuel Albanie, Gül Varol, Liliane Momeni, Triantafyllos Afouras, Joon Son Chung, Neil Fox, Andrew Zisserman
Recent progress in fine-grained gesture and action classification, and in machine translation, points to the possibility of automated sign language recognition becoming a reality.
Ranked #4 on Sign Language Recognition on WLASL-2000