1 code implementation • 26 Aug 2024 • Karsten Roth, Vishaal Udandarao, Sebastian Dziadzio, Ameya Prabhu, Mehdi Cherti, Oriol Vinyals, Olivier Hénaff, Samuel Albanie, Matthias Bethge, Zeynep Akata
In this work, we complement current perspectives on continual pretraining through a research test bed as well as provide comprehensive guidance for effective continual model updates in such scenarios.
no code implementations • 21 Aug 2024 • Jonathan Roberts, Kai Han, Samuel Albanie
As such, there is a pressing need for a new generation of benchmarks challenging enough for the next generation of LMMs.
no code implementations • 5 Jul 2024 • Zachary Kenton, Noah Y. Siegel, János Kramár, Jonah Brown-Cohen, Samuel Albanie, Jannis Bulian, Rishabh Agarwal, David Lindner, Yunhao Tang, Noah D. Goodman, Rohin Shah
We find that debate outperforms consultancy across all tasks when the consultant is randomly assigned to argue for the correct/incorrect answer.
no code implementations • 5 Jun 2024 • Tim Franzmeyer, Aleksandar Shtedritski, Samuel Albanie, Philip Torr, João F. Henriques, Jakob N. Foerster
Verifying whether an X note is helpful or whether a Wikipedia edit should be accepted are hard tasks that require grounding by querying the web.
1 code implementation • 2 Jun 2024 • Arduin Findeis, Timo Kaufmann, Eyke Hüllermeier, Samuel Albanie, Robert Mullins
In constitutional AI, a set of principles (or constitution) is used to provide feedback and fine-tune AI models.
no code implementations • 16 May 2024 • Charles Raude, K R Prajwal, Liliane Momeni, Hannah Bull, Samuel Albanie, Andrew Zisserman, Gül Varol
To this end, we introduce a multi-task Transformer model, CSLR2, that is able to ingest a signing sequence and output in a joint embedding space between signed language and spoken language text.
1 code implementation • 14 May 2024 • Jonathan Roberts, Kai Han, Neil Houlsby, Samuel Albanie
In this work, we present SciFIBench, a scientific figure interpretation benchmark.
1 code implementation • 15 Apr 2024 • Usman Anwar, Abulhair Saparov, Javier Rando, Daniel Paleka, Miles Turpin, Peter Hase, Ekdeep Singh Lubana, Erik Jenner, Stephen Casper, Oliver Sourbut, Benjamin L. Edelman, Zhaowei Zhang, Mario Günther, Anton Korinek, Jose Hernandez-Orallo, Lewis Hammond, Eric Bigelow, Alexander Pan, Lauro Langosco, Tomasz Korbak, Heidi Zhang, Ruiqi Zhong, Seán Ó hÉigeartaigh, Gabriel Recchia, Giulio Corsi, Alan Chan, Markus Anderljung, Lilian Edwards, Aleksandar Petrov, Christian Schroeder de Witt, Sumeet Ramesh Motwan, Yoshua Bengio, Danqi Chen, Philip H. S. Torr, Samuel Albanie, Tegan Maharaj, Jakob Foerster, Florian Tramer, He He, Atoosa Kasirzadeh, Yejin Choi, David Krueger
This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs).
1 code implementation • 4 Apr 2024 • Vishaal Udandarao, Ameya Prabhu, Adhiraj Ghosh, Yash Sharma, Philip H. S. Torr, Adel Bibi, Samuel Albanie, Matthias Bethge
Web-crawled pretraining datasets underlie the impressive "zero-shot" evaluation performance of multimodal models, such as CLIP for classification/retrieval and Stable-Diffusion for image generation.
1 code implementation • 29 Feb 2024 • Ameya Prabhu, Vishaal Udandarao, Philip Torr, Matthias Bethge, Adel Bibi, Samuel Albanie
However, with repeated testing, the risk of overfitting grows as algorithms over-exploit benchmark idiosyncrasies.
no code implementations • 29 Feb 2024 • Andreea-Maria Oncescu, João F. Henriques, Andrew Zisserman, Samuel Albanie, A. Sophia Koepke
Furthermore, we show that using the same prompts, we can successfully employ LLMs to improve the retrieval on EpicSounds, compared to using the original audio class labels of the dataset.
1 code implementation • CVPR 2024 • Hangjie Yuan, Shiwei Zhang, Xiang Wang, Yujie Wei, Tao Feng, Yining Pan, Yingya Zhang, Ziwei Liu, Samuel Albanie, Dong Ni
To tackle this problem, we propose InstructVideo to instruct text-to-video diffusion models with human feedback by reward fine-tuning.
1 code implementation • 24 Nov 2023 • Jonathan Roberts, Timo Lüddecke, Rehan Sheikh, Kai Han, Samuel Albanie
Multimodal large language models (MLLMs) have shown remarkable capabilities across a broad range of tasks but their knowledge and abilities in the geographic and geospatial domains are yet to be explored, despite potential wide-ranging benefits to navigation, environmental research, urban development, and disaster response.
1 code implementation • 12 Oct 2023 • Vishaal Udandarao, Max F. Burg, Samuel Albanie, Matthias Bethge
This finding points to a blind spot in current frontier VLMs: they excel in recognizing semantic content but fail to acquire an understanding of visual data-types through scaling.
1 code implementation • ICCV 2023 • Kaiqu Liang, Samuel Albanie
To date, the majority of video retrieval systems have been optimized for a "single-shot" scenario in which the user submits a query in isolation, ignoring previous interactions with the system.
3 code implementations • ICCV 2023 • Hangjie Yuan, Shiwei Zhang, Xiang Wang, Samuel Albanie, Yining Pan, Tao Feng, Jianwen Jiang, Dong Ni, Yingya Zhang, Deli Zhao
In this paper, we propose RLIPv2, a fast converging model that enables the scaling of relational pre-training to large-scale pseudo-labelled scene graph data.
Ranked #1 on Zero-Shot Human-Object Interaction Detection on HICO-DET (using extra training data)
1 code implementation • 13 Jun 2023 • Gyungin Shin, Weidi Xie, Samuel Albanie
In this paper, we propose to meet this challenge through the novel task of automatic table verification (AutoTV), in which the objective is to verify the accuracy of numerical data in tables by cross-referencing cited sources.
no code implementations • 30 May 2023 • Jonathan Roberts, Timo Lüddecke, Sowmen Das, Kai Han, Samuel Albanie
Large language models (LLMs) have shown remarkable capabilities across a broad range of tasks involving question answering and the generation of coherent text and code.
1 code implementation • 27 Apr 2023 • Gyungin Shin, Samuel Albanie, Weidi Xie
Segmentation is a core computer vision competency, with applications spanning a broad range of scientifically and economically valuable domains.
no code implementations • 23 Apr 2023 • Jonathan Roberts, Kai Han, Samuel Albanie
In this work, we introduce SATellite ImageNet (SATIN), a metadataset curated from 27 existing remotely sensed datasets, and comprehensively evaluate the zero-shot transfer classification capabilities of a broad range of vision-language (VL) models on SATIN.
1 code implementation • 21 Apr 2023 • Mingkai Zheng, Xiu Su, Shan You, Fei Wang, Chen Qian, Chang Xu, Samuel Albanie
We investigate the potential of GPT-4 to perform Neural Architecture Search (NAS), the task of designing effective neural architectures.
no code implementations • 2 Apr 2023 • Samuel Albanie, Liliane Momeni, João F. Henriques
Driven by recent advances in AI, we passengers are entering a golden age of scientific discovery.
1 code implementation • 15 Mar 2023 • Sucheng Ren, Fangyun Wei, Samuel Albanie, Zheng Zhang, Han Hu
Deep supervision, which applies extra supervision to the intermediate features of a neural network, was widely used in image classification in the early deep learning era because it significantly reduces training difficulty and eases optimization compared with vanilla training, for example by avoiding vanishing gradients.
1 code implementation • ICCV 2023 • Ioana Croitoru, Simion-Vlad Bogolin, Samuel Albanie, Yang Liu, Zhaowen Wang, Seunghyun Yoon, Franck Dernoncourt, Hailin Jin, Trung Bui
To study this problem, we propose the first dataset of untrimmed, long-form tutorial videos for the task of Moment Detection called the Behance Moment Detection (BMD) dataset.
2 code implementations • ICCV 2023 • Vishaal Udandarao, Ankush Gupta, Samuel Albanie
Contrastive Language-Image Pre-training (CLIP) has emerged as a simple yet effective way to train large-scale vision-language models.
1 code implementation • 16 Nov 2022 • K R Prajwal, Hannah Bull, Liliane Momeni, Samuel Albanie, Gül Varol, Andrew Zisserman
Through extensive evaluations, we verify our method for automatic annotation and our model architecture.
7 code implementations • 9 Nov 2022 • BigScience Workshop, :, Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova del Moral, Olatunji Ruwase, Rachel Bawden, Stas Bekman, Angelina McMillan-Major, Iz Beltagy, Huu Nguyen, Lucile Saulnier, Samson Tan, Pedro Ortiz Suarez, Victor Sanh, Hugo Laurençon, Yacine Jernite, Julien Launay, Margaret Mitchell, Colin Raffel, Aaron Gokaslan, Adi Simhi, Aitor Soroa, Alham Fikri Aji, Amit Alfassy, Anna Rogers, Ariel Kreisberg Nitzav, Canwen Xu, Chenghao Mou, Chris Emezue, Christopher Klamm, Colin Leong, Daniel van Strien, David Ifeoluwa Adelani, Dragomir Radev, Eduardo González Ponferrada, Efrat Levkovizh, Ethan Kim, Eyal Bar Natan, Francesco De Toni, Gérard Dupont, Germán Kruszewski, Giada Pistilli, Hady Elsahar, Hamza Benyamina, Hieu Tran, Ian Yu, Idris Abdulmumin, Isaac Johnson, Itziar Gonzalez-Dios, Javier de la Rosa, Jenny Chim, Jesse Dodge, Jian Zhu, Jonathan Chang, Jörg Frohberg, Joseph Tobing, Joydeep Bhattacharjee, Khalid Almubarak, Kimbo Chen, Kyle Lo, Leandro von Werra, Leon Weber, Long Phan, Loubna Ben allal, Ludovic Tanguy, Manan Dey, Manuel Romero Muñoz, Maraim Masoud, María Grandury, Mario Šaško, Max Huang, Maximin Coavoux, Mayank Singh, Mike Tian-Jian Jiang, Minh Chien Vu, Mohammad A. 
Jauhar, Mustafa Ghaleb, Nishant Subramani, Nora Kassner, Nurulaqilla Khamis, Olivier Nguyen, Omar Espejel, Ona de Gibert, Paulo Villegas, Peter Henderson, Pierre Colombo, Priscilla Amuok, Quentin Lhoest, Rheza Harliman, Rishi Bommasani, Roberto Luis López, Rui Ribeiro, Salomey Osei, Sampo Pyysalo, Sebastian Nagel, Shamik Bose, Shamsuddeen Hassan Muhammad, Shanya Sharma, Shayne Longpre, Somaieh Nikpoor, Stanislav Silberberg, Suhas Pai, Sydney Zink, Tiago Timponi Torrent, Timo Schick, Tristan Thrush, Valentin Danchev, Vassilina Nikoulina, Veronika Laippala, Violette Lepercq, Vrinda Prabhu, Zaid Alyafeai, Zeerak Talat, Arun Raja, Benjamin Heinzerling, Chenglei Si, Davut Emre Taşar, Elizabeth Salesky, Sabrina J. Mielke, Wilson Y. Lee, Abheesht Sharma, Andrea Santilli, Antoine Chaffin, Arnaud Stiegler, Debajyoti Datta, Eliza Szczechla, Gunjan Chhablani, Han Wang, Harshit Pandey, Hendrik Strobelt, Jason Alan Fries, Jos Rozen, Leo Gao, Lintang Sutawika, M Saiful Bari, Maged S. Al-shaibani, Matteo Manica, Nihal Nayak, Ryan Teehan, Samuel Albanie, Sheng Shen, Srulik Ben-David, Stephen H. 
Bach, Taewoon Kim, Tali Bers, Thibault Fevry, Trishala Neeraj, Urmish Thakker, Vikas Raunak, Xiangru Tang, Zheng-Xin Yong, Zhiqing Sun, Shaked Brody, Yallow Uri, Hadar Tojarieh, Adam Roberts, Hyung Won Chung, Jaesung Tae, Jason Phang, Ofir Press, Conglong Li, Deepak Narayanan, Hatim Bourfoune, Jared Casper, Jeff Rasley, Max Ryabinin, Mayank Mishra, Minjia Zhang, Mohammad Shoeybi, Myriam Peyrounette, Nicolas Patry, Nouamane Tazi, Omar Sanseviero, Patrick von Platen, Pierre Cornette, Pierre François Lavallée, Rémi Lacroix, Samyam Rajbhandari, Sanchit Gandhi, Shaden Smith, Stéphane Requena, Suraj Patil, Tim Dettmers, Ahmed Baruwa, Amanpreet Singh, Anastasia Cheveleva, Anne-Laure Ligozat, Arjun Subramonian, Aurélie Névéol, Charles Lovering, Dan Garrette, Deepak Tunuguntla, Ehud Reiter, Ekaterina Taktasheva, Ekaterina Voloshina, Eli Bogdanov, Genta Indra Winata, Hailey Schoelkopf, Jan-Christoph Kalo, Jekaterina Novikova, Jessica Zosa Forde, Jordan Clive, Jungo Kasai, Ken Kawamura, Liam Hazan, Marine Carpuat, Miruna Clinciu, Najoung Kim, Newton Cheng, Oleg Serikov, Omer Antverg, Oskar van der Wal, Rui Zhang, Ruochen Zhang, Sebastian Gehrmann, Shachar Mirkin, Shani Pais, Tatiana Shavrina, Thomas Scialom, Tian Yun, Tomasz Limisiewicz, Verena Rieser, Vitaly Protasov, Vladislav Mikhailov, Yada Pruksachatkun, Yonatan Belinkov, Zachary Bamberger, Zdeněk Kasner, Alice Rueda, Amanda Pestana, Amir Feizpour, Ammar Khan, Amy Faranak, Ana Santos, Anthony Hevia, Antigona Unldreaj, Arash Aghagol, Arezoo Abdollahi, Aycha Tammour, Azadeh HajiHosseini, Bahareh Behroozi, Benjamin Ajibade, Bharat Saxena, Carlos Muñoz Ferrandis, Daniel McDuff, Danish Contractor, David Lansky, Davis David, Douwe Kiela, Duong A. 
Nguyen, Edward Tan, Emi Baylor, Ezinwanne Ozoani, Fatima Mirza, Frankline Ononiwu, Habib Rezanejad, Hessie Jones, Indrani Bhattacharya, Irene Solaiman, Irina Sedenko, Isar Nejadgholi, Jesse Passmore, Josh Seltzer, Julio Bonis Sanz, Livia Dutra, Mairon Samagaio, Maraim Elbadri, Margot Mieskes, Marissa Gerchick, Martha Akinlolu, Michael McKenna, Mike Qiu, Muhammed Ghauri, Mykola Burynok, Nafis Abrar, Nazneen Rajani, Nour Elkott, Nour Fahmy, Olanrewaju Samuel, Ran An, Rasmus Kromann, Ryan Hao, Samira Alizadeh, Sarmad Shubber, Silas Wang, Sourav Roy, Sylvain Viguier, Thanh Le, Tobi Oyebade, Trieu Le, Yoyo Yang, Zach Nguyen, Abhinav Ramesh Kashyap, Alfredo Palasciano, Alison Callahan, Anima Shukla, Antonio Miranda-Escalada, Ayush Singh, Benjamin Beilharz, Bo wang, Caio Brito, Chenxi Zhou, Chirag Jain, Chuxin Xu, Clémentine Fourrier, Daniel León Periñán, Daniel Molano, Dian Yu, Enrique Manjavacas, Fabio Barth, Florian Fuhrimann, Gabriel Altay, Giyaseddin Bayrak, Gully Burns, Helena U. Vrabec, Imane Bello, Ishani Dash, Jihyun Kang, John Giorgi, Jonas Golde, Jose David Posada, Karthik Rangasai Sivaraman, Lokesh Bulchandani, Lu Liu, Luisa Shinzato, Madeleine Hahn de Bykhovetz, Maiko Takeuchi, Marc Pàmies, Maria A Castillo, Marianna Nezhurina, Mario Sänger, Matthias Samwald, Michael Cullan, Michael Weinberg, Michiel De Wolf, Mina Mihaljcic, Minna Liu, Moritz Freidank, Myungsun Kang, Natasha Seelam, Nathan Dahlberg, Nicholas Michio Broad, Nikolaus Muellner, Pascale Fung, Patrick Haller, Ramya Chandrasekhar, Renata Eisenberg, Robert Martin, Rodrigo Canalli, Rosaline Su, Ruisi Su, Samuel Cahyawijaya, Samuele Garda, Shlok S Deshmukh, Shubhanshu Mishra, Sid Kiblawi, Simon Ott, Sinee Sang-aroonsiri, Srishti Kumar, Stefan Schweter, Sushil Bharati, Tanmay Laud, Théo Gigant, Tomoya Kainuma, Wojciech Kusa, Yanis Labrak, Yash Shailesh Bajaj, Yash Venkatraman, Yifan Xu, Yingxin Xu, Yu Xu, Zhe Tan, Zhongli Xie, Zifan Ye, Mathilde Bras, Younes Belkada, Thomas Wolf
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions.
1 code implementation • 3 Nov 2022 • Niklas Muennighoff, Thomas Wang, Lintang Sutawika, Adam Roberts, Stella Biderman, Teven Le Scao, M Saiful Bari, Sheng Shen, Zheng-Xin Yong, Hailey Schoelkopf, Xiangru Tang, Dragomir Radev, Alham Fikri Aji, Khalid Almubarak, Samuel Albanie, Zaid Alyafeai, Albert Webson, Edward Raff, Colin Raffel
We find finetuning large multilingual language models on English tasks with English prompts allows for task generalization to non-English languages that appear only in the pretraining corpus.
Ranked #1 on Question Answering on StoryCloze
1 code implementation • 22 Sep 2022 • Gyungin Shin, Weidi Xie, Samuel Albanie
Our method, termed NamedMask, begins by using CLIP to construct category-specific archives of images.
3 code implementations • 5 Sep 2022 • Hangjie Yuan, Jianwen Jiang, Samuel Albanie, Tao Feng, Ziyuan Huang, Dong Ni, Mingqian Tang
The task of Human-Object Interaction (HOI) detection targets fine-grained visual parsing of humans interacting with their environment, enabling a broad range of applications.
Ranked #17 on Human-Object Interaction Detection on HICO-DET
no code implementations • 4 Aug 2022 • Liliane Momeni, Hannah Bull, K R Prajwal, Samuel Albanie, Gül Varol, Andrew Zisserman
Recently, sign language researchers have turned to sign language interpreted TV broadcasts, comprising (i) a video of continuous signing and (ii) subtitles corresponding to the audio content, as a readily available and large-scale source of training data.
2 code implementations • 14 Jun 2022 • Gyungin Shin, Weidi Xie, Samuel Albanie
Semantic segmentation has a broad range of applications, but its real-world impact has been significantly limited by the prohibitive annotation costs necessary to enable deployment.
no code implementations • 9 May 2022 • Gül Varol, Liliane Momeni, Samuel Albanie, Triantafyllos Afouras, Andrew Zisserman
The focus of this work is sign spotting: given a video of an isolated sign, our task is to identify whether and where it has been signed in a continuous, co-articulated sign language video.
no code implementations • 31 Mar 2022 • Samuel Albanie, Dylan Campbell, João F. Henriques
The field of machine learning has achieved striking progress in recent years, witnessing breakthrough results on language modelling, protein folding and nitpickingly fine-grained dog breed classification.
1 code implementation • 23 Mar 2022 • Gyungin Shin, Samuel Albanie, Weidi Xie
In this paper, we tackle the challenging task of unsupervised salient object detection (SOD) by leveraging spectral clustering on self-supervised features.
Ranked #1 on Unsupervised Saliency Detection on ECSSD
no code implementations • CVPR 2022 • Amanda Duarte, Samuel Albanie, Xavier Giró-i-Nieto, Gül Varol
Systems that can efficiently search collections of sign language videos have been highlighted as a useful application of sign language technology.
1 code implementation • CVPR 2022 • Simion-Vlad Bogolin, Ioana Croitoru, Hailin Jin, Yang Liu, Samuel Albanie
In this work we first show that, despite their effectiveness, state-of-the-art joint embeddings suffer significantly from the longstanding "hubness problem" in which a small number of gallery embeddings form the nearest neighbours of many queries.
Ranked #5 on Video Retrieval on QuerYD
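The "hubness problem" named in the entry above can be made concrete by counting how often each gallery item is returned as a nearest neighbour. The following is a minimal sketch under stated assumptions, not the authors' method: the function name `k_occurrence`, the use of cosine similarity, and the toy 2-D embeddings are all hypothetical choices made for illustration.

```python
import numpy as np

def k_occurrence(queries, gallery, k=1):
    """Count how many queries have each gallery item among their k nearest neighbours.

    Hypothetical helper for illustration; cosine similarity is an assumption.
    """
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = q @ g.T                              # (num_queries, num_gallery)
    topk = np.argsort(-sims, axis=1)[:, :k]     # indices of the k most similar items
    return np.bincount(topk.ravel(), minlength=gallery.shape[0])

# Toy data: gallery item 2 lies "between" the other two and attracts most queries.
gallery = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
queries = np.array([[1.0, 0.2], [0.9, 0.5], [0.5, 0.9], [0.6, 0.6]])

counts = k_occurrence(queries, gallery, k=1)
print(counts)  # per-gallery-item nearest-neighbour counts
```

A heavily skewed count vector, where a few gallery items collect most of the nearest-neighbour slots, is the symptom the entry describes.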
1 code implementation • 17 Dec 2021 • A. Sophia Koepke, Andreea-Maria Oncescu, João F. Henriques, Zeynep Akata, Samuel Albanie
Additionally, we introduce the SoundDescs benchmark, which consists of paired audio and natural language descriptions for a diverse collection of sounds that are complementary to those found in AudioCaps and Clotho.
Ranked #1 on Audio to Text Retrieval on SoundDescs
no code implementations • 5 Nov 2021 • Samuel Albanie, Gül Varol, Liliane Momeni, Hannah Bull, Triantafyllos Afouras, Himel Chowdhury, Neil Fox, Bencie Woll, Rob Cooper, Andrew McParland, Andrew Zisserman
In this work, we introduce the BBC-Oxford British Sign Language (BOBSL) dataset, a large-scale video collection of British Sign Language (BSL).
no code implementations • CVPR 2021 • Yang Liu, Qingchao Chen, Samuel Albanie
In this paper, we study the task of visual-text retrieval in the highly practical setting in which labelled visual data with paired text descriptions are available in one domain (the "source"), but only unlabelled visual data (without text descriptions) are available in the domain of interest (the "target").
no code implementations • ICCV 2021 • Hannah Bull, Triantafyllos Afouras, Gül Varol, Samuel Albanie, Liliane Momeni, Andrew Zisserman
The goal of this work is to temporally align asynchronous subtitles in sign language videos.
1 code implementation • 5 May 2021 • Andreea-Maria Oncescu, A. Sophia Koepke, João F. Henriques, Zeynep Akata, Samuel Albanie
We consider the task of retrieving audio using free-form natural language queries.
Ranked #1 on Audio/Video to Text Retrieval on AudioCaps
1 code implementation • 28 Apr 2021 • Katrin Renz, Nicolaj C. Stache, Neil Fox, Gül Varol, Samuel Albanie
The objective of this work is to find temporal boundaries between signs in continuous sign language.
1 code implementation • ICCV 2021 • Ioana Croitoru, Simion-Vlad Bogolin, Marius Leordeanu, Hailin Jin, Andrew Zisserman, Samuel Albanie, Yang Liu
In recent years, considerable progress on the task of text-video retrieval has been achieved by leveraging large-scale pretraining on visual and audio datasets to construct powerful video encoders.
2 code implementations • 13 Apr 2021 • Gyungin Shin, Weidi Xie, Samuel Albanie
A central challenge for the task of semantic segmentation is the prohibitive cost of obtaining dense pixel-level annotations to supervise model training.
no code implementations • 31 Mar 2021 • Samuel Albanie, Erika Lu, Joao F. Henriques
In the quiet backwaters of cs.CV, cs.LG and stat.ML, a cornucopia of new learning systems is emerging from a primordial soup of mathematics: learning systems with no need for external supervision.
no code implementations • CVPR 2021 • Gül Varol, Liliane Momeni, Samuel Albanie, Triantafyllos Afouras, Andrew Zisserman
Our contributions are as follows: (1) we demonstrate the ability to leverage large quantities of continuous signing videos with weakly-aligned subtitles to localise signs in continuous sign language; (2) we employ the learned attention to automatically generate hundreds of thousands of annotations for a large sign vocabulary; (3) we collect a set of 37K manually verified sign instances across a vocabulary of 950 sign classes to support our study of sign language recognition; (4) by training on the newly annotated data from our method, we outperform the prior state of the art on the BSL-1K sign language recognition benchmark.
2 code implementations • 26 Mar 2021 • Ben Jaderberg, Lewis W. Anderson, Weidi Xie, Samuel Albanie, Martin Kiffner, Dieter Jaksch
The resurgence of self-supervised learning, whereby a deep learning model generates its own supervisory signal from the data, promises a scalable way to tackle the dramatically increasing size of real-world data sets without human annotation.
1 code implementation • 25 Nov 2020 • Katrin Renz, Nicolaj C. Stache, Samuel Albanie, Gül Varol
The objective of this work is to determine the location of temporal boundaries between signs in continuous sign language videos.
2 code implementations • 22 Nov 2020 • Andreea-Maria Oncescu, João F. Henriques, Yang Liu, Andrew Zisserman, Samuel Albanie
We introduce QuerYD, a new large-scale dataset for retrieval and event localisation in video.
1 code implementation • 8 Oct 2020 • Liliane Momeni, Gül Varol, Samuel Albanie, Triantafyllos Afouras, Andrew Zisserman
The focus of this work is sign spotting - given a video of an isolated sign, our task is to identify whether and where it has been signed in a continuous, co-articulated sign language video.
1 code implementation • 2 Sep 2020 • Liliane Momeni, Triantafyllos Afouras, Themos Stafylakis, Samuel Albanie, Andrew Zisserman
The goal of this work is to automatically determine whether and when a word of interest is spoken by a talking face, with or without the audio.
1 code implementation • 3 Aug 2020 • Samuel Albanie, Yang Liu, Arsha Nagrani, Antoine Miech, Ernesto Coto, Ivan Laptev, Rahul Sukthankar, Bernard Ghanem, Andrew Zisserman, Valentin Gabeur, Chen Sun, Karteek Alahari, Cordelia Schmid, Shi-Zhe Chen, Yida Zhao, Qin Jin, Kaixu Cui, Hui Liu, Chen Wang, Yudong Jiang, Xiaoshuai Hao
This report summarizes the results of the first edition of the challenge together with the findings of the participants.
1 code implementation • ECCV 2020 • Samuel Albanie, Gül Varol, Liliane Momeni, Triantafyllos Afouras, Joon Son Chung, Neil Fox, Andrew Zisserman
Recent progress in fine-grained gesture and action classification, and in machine translation, points to the possibility of automated sign language recognition becoming a reality.
Ranked #6 on Sign Language Recognition on WLASL-2000
no code implementations • 31 Mar 2020 • Samuel Albanie, Jaime Thewmore, Robert McCraith, Joao F. Henriques
Peer review forms the backbone of modern scientific manuscript evaluation.
no code implementations • 2 Mar 2020 • Diego Granziol, Xingchen Wan, Samuel Albanie, Stephen Roberts
We analyse and explain the increased generalisation performance of iterate averaging using a Gaussian process perturbation model between the true and batch risk surface on the high dimensional quadratic.
no code implementations • 20 Feb 2020 • Arsha Nagrani, Joon Son Chung, Samuel Albanie, Andrew Zisserman
The objective of this paper is to learn representations of speaker identity without access to manually annotated data.
1 code implementation • ICCV 2019 • James Thewlis, Samuel Albanie, Hakan Bilen, Andrea Vedaldi
Equivariance to random image transformations is an effective method to learn landmarks of object categories, such as the eyes and the nose in faces, without manual supervision.
Ranked #1 on Unsupervised Facial Landmark Detection on 300W
3 code implementations • 31 Jul 2019 • Yang Liu, Samuel Albanie, Arsha Nagrani, Andrew Zisserman
The rapid growth of video on the internet has made searching for video content using natural language queries a significant challenge.
Ranked #24 on Video Retrieval on MSVD
no code implementations • 1 Apr 2019 • Samuel Albanie, James Thewlis, Sebastien Ehrhardt, Joao Henriques
The theory of deep learning is now considered largely solved, and is well understood by researchers and influencers alike.
9 code implementations • NeurIPS 2018 • Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Andrea Vedaldi
We also propose a parametric gather-excite operator pair which yields further performance gains, relate it to the recently-introduced Squeeze-and-Excitation Networks, and analyse the effects of these changes to the CNN feature activation statistics.
no code implementations • 16 Aug 2018 • Samuel Albanie, Arsha Nagrani, Andrea Vedaldi, Andrew Zisserman
We make the following contributions: (i) we develop a strong teacher network for facial emotion recognition that achieves the state of the art on a standard benchmark; (ii) we use the teacher to train a student, tabula rasa, to learn representations (embeddings) for speech emotion recognition without access to labelled audio data; and (iii) we show that the speech emotion embedding can be used for speech emotion recognition on external benchmark datasets.
Ranked #3 on Facial Expression Recognition (FER) on FERPlus
no code implementations • ECCV 2018 • David Novotny, Samuel Albanie, Diane Larlus, Andrea Vedaldi
Object detection and instance segmentation are dominated by region-based methods such as Mask RCNN.
6 code implementations • ICLR 2019 • João F. Henriques, Sebastien Ehrhardt, Samuel Albanie, Andrea Vedaldi
Instead, we propose to keep a single estimate of the gradient projected by the inverse Hessian matrix, and update it once per iteration.
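The idea in the sentence above, keeping one running estimate of the inverse-Hessian-projected gradient and refining it once per iteration, can be illustrated on a toy problem. The sketch below is an assumption-laden illustration rather than the paper's implementation: the quadratic objective, the explicit Hessian `A`, the step sizes `rho` and `beta`, and the function names are all hypothetical, and the stored vector `z` is the negated quantity (an estimate of the Newton step).

```python
import numpy as np

# Toy quadratic f(w) = 0.5 * w^T A w - b^T w, so grad f(w) = A w - b and the Hessian is A.
A = np.diag([1.0, 10.0])
b = np.array([1.0, 2.0])

def grad(w):
    return A @ w - b

def run(steps=300, rho=0.9, beta=0.1):
    w = np.zeros(2)
    z = np.zeros(2)  # running estimate of -H^{-1} grad(w), never solved for exactly
    for _ in range(steps):
        # (A z + grad) is the gradient, with respect to z, of the local model
        # 0.5 * z^T A z + grad^T z, whose minimiser is z = -A^{-1} grad.
        # Take a single cheap refinement step on z per iteration.
        z = rho * z - beta * (A @ z + grad(w))
        w = w + z  # move the parameters along the current estimate
    return w

w_final = run()
w_exact = np.linalg.solve(A, b)  # closed-form minimiser, used only as a check
print(w_final, w_exact)
```

In a real network the product `A @ z` would be replaced by a Hessian-vector product computed with automatic differentiation, so the Hessian is never formed explicitly.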
1 code implementation • 21 May 2018 • João F. Henriques, Sebastien Ehrhardt, Samuel Albanie, Andrea Vedaldi
We propose a fast second-order method that can be used as a drop-in replacement for current deep learning solvers.
1 code implementation • ECCV 2018 • Arsha Nagrani, Samuel Albanie, Andrew Zisserman
We propose and investigate an identity sensitive joint embedding of face and voice.
no code implementations • CVPR 2018 • David Novotny, Samuel Albanie, Diane Larlus, Andrea Vedaldi
Self-supervision can dramatically cut back the amount of manually-labelled data required to train deep neural networks.
no code implementations • CVPR 2018 • Arsha Nagrani, Samuel Albanie, Andrew Zisserman
We make the following contributions: (i) we introduce CNN architectures for both binary and multi-way cross-modal face and audio matching, (ii) we compare dynamic testing (where video information is available, but the audio is not from the same video) with static testing (where only a single still image is available), and (iii) we use human testing as a baseline to calibrate the difficulty of the task.
1 code implementation • 1 Apr 2018 • Samuel Albanie, James Thewlis, Joao F. Henriques
Learning through experience is time-consuming, inefficient and often bad for your cortisol levels.
82 code implementations • CVPR 2018 • Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Enhua Wu
Squeeze-and-Excitation Networks formed the foundation of our ILSVRC 2017 classification submission which won first place and reduced the top-5 error to 2.251%, surpassing the winning entry of 2016 by a relative improvement of ~25%.
Ranked #6 on Object Detection on DSEC
1 code implementation • 7 Mar 2017 • Samuel Albanie, Sébastien Ehrhardt, João F. Henriques
While the costs of human violence have attracted a great deal of attention from the research community, the effects of the network-on-network (NoN) violence popularised by Generative Adversarial Networks have yet to be addressed.
no code implementations • 17 Jan 2017 • Samuel Albanie, Hillary Shakespeare, Tom Gunter
For a social networking service to acquire and retain users, it must find ways to keep them engaged.
no code implementations • 7 Oct 2016 • Samuel Albanie, Andrea Vedaldi
As a starting point, we consider the problem of relating facial expressions to objectively measurable events occurring in videos.