Search Results for author: Samuel Albanie

Found 66 papers, 42 papers with code

Foundational Challenges in Assuring Alignment and Safety of Large Language Models

no code implementations • 15 Apr 2024 • Usman Anwar, Abulhair Saparov, Javier Rando, Daniel Paleka, Miles Turpin, Peter Hase, Ekdeep Singh Lubana, Erik Jenner, Stephen Casper, Oliver Sourbut, Benjamin L. Edelman, Zhaowei Zhang, Mario Günther, Anton Korinek, Jose Hernandez-Orallo, Lewis Hammond, Eric Bigelow, Alexander Pan, Lauro Langosco, Tomasz Korbak, Heidi Zhang, Ruiqi Zhong, Seán Ó hÉigeartaigh, Gabriel Recchia, Giulio Corsi, Alan Chan, Markus Anderljung, Lilian Edwards, Yoshua Bengio, Danqi Chen, Samuel Albanie, Tegan Maharaj, Jakob Foerster, Florian Tramer, He He, Atoosa Kasirzadeh, Yejin Choi, David Krueger

This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs).

No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance

1 code implementation • 4 Apr 2024 • Vishaal Udandarao, Ameya Prabhu, Adhiraj Ghosh, Yash Sharma, Philip H. S. Torr, Adel Bibi, Samuel Albanie, Matthias Bethge

Web-crawled pretraining datasets underlie the impressive "zero-shot" evaluation performance of multimodal models, such as CLIP for classification/retrieval and Stable-Diffusion for image generation.

Benchmarking Image Generation +1

A SOUND APPROACH: Using Large Language Models to generate audio descriptions for egocentric text-audio retrieval

no code implementations • 29 Feb 2024 • Andreea-Maria Oncescu, João F. Henriques, Andrew Zisserman, Samuel Albanie, A. Sophia Koepke

Furthermore, we show that using the same prompts, we can successfully employ LLMs to improve the retrieval on EpicSounds, compared to using the original audio class labels of the dataset.

Retrieval

Lifelong Benchmarks: Efficient Model Evaluation in an Era of Rapid Progress

1 code implementation • 29 Feb 2024 • Ameya Prabhu, Vishaal Udandarao, Philip Torr, Matthias Bethge, Adel Bibi, Samuel Albanie

However, with repeated testing, the risk of overfitting grows as algorithms over-exploit benchmark idiosyncrasies.

Benchmarking

InstructVideo: Instructing Video Diffusion Models with Human Feedback

1 code implementation • 19 Dec 2023 • Hangjie Yuan, Shiwei Zhang, Xiang Wang, Yujie Wei, Tao Feng, Yining Pan, Yingya Zhang, Ziwei Liu, Samuel Albanie, Dong Ni

To tackle this problem, we propose InstructVideo to instruct text-to-video diffusion models with human feedback by reward fine-tuning.

Video Generation

Charting New Territories: Exploring the Geographic and Geospatial Capabilities of Multimodal LLMs

1 code implementation • 24 Nov 2023 • Jonathan Roberts, Timo Lüddecke, Rehan Sheikh, Kai Han, Samuel Albanie

Multimodal large language models (MLLMs) have shown remarkable capabilities across a broad range of tasks but their knowledge and abilities in the geographic and geospatial domains are yet to be explored, despite potential wide-ranging benefits to navigation, environmental research, urban development, and disaster response.

Disaster Response

Visual Data-Type Understanding does not emerge from Scaling Vision-Language Models

1 code implementation • 12 Oct 2023 • Vishaal Udandarao, Max F. Burg, Samuel Albanie, Matthias Bethge

This finding points to a blind spot in current frontier VLMs: they excel in recognizing semantic content but fail to acquire an understanding of visual data-types through scaling.

Simple Baselines for Interactive Video Retrieval with Questions and Answers

1 code implementation • ICCV 2023 • Kaiqu Liang, Samuel Albanie

To date, the majority of video retrieval systems have been optimized for a "single-shot" scenario in which the user submits a query in isolation, ignoring previous interactions with the system.

Question Answering Retrieval +1

RLIPv2: Fast Scaling of Relational Language-Image Pre-training

3 code implementations • ICCV 2023 • Hangjie Yuan, Shiwei Zhang, Xiang Wang, Samuel Albanie, Yining Pan, Tao Feng, Jianwen Jiang, Dong Ni, Yingya Zhang, Deli Zhao

In this paper, we propose RLIPv2, a fast converging model that enables the scaling of relational pre-training to large-scale pseudo-labelled scene graph data.

Ranked #1 on Zero-Shot Human-Object Interaction Detection on HICO-DET (using extra training data)

Graph Generation Human-Object Interaction Detection +6

arXiVeri: Automatic table verification with GPT

1 code implementation • 13 Jun 2023 • Gyungin Shin, Weidi Xie, Samuel Albanie

In this paper, we propose to meet this challenge through the novel task of automatic table verification (AutoTV), in which the objective is to verify the accuracy of numerical data in tables by cross-referencing cited sources.

GPT4GEO: How a Language Model Sees the World's Geography

no code implementations • 30 May 2023 • Jonathan Roberts, Timo Lüddecke, Sowmen Das, Kai Han, Samuel Albanie

Large language models (LLMs) have shown remarkable capabilities across a broad range of tasks involving question answering and the generation of coherent text and code.

Disaster Response Language Modelling +2

Zero-shot Unsupervised Transfer Instance Segmentation

1 code implementation • 27 Apr 2023 • Gyungin Shin, Samuel Albanie, Weidi Xie

Segmentation is a core computer vision competency, with applications spanning a broad range of scientifically and economically valuable domains.

Instance Segmentation Segmentation +1

SATIN: A Multi-Task Metadataset for Classifying Satellite Imagery using Vision-Language Models

no code implementations • 23 Apr 2023 • Jonathan Roberts, Kai Han, Samuel Albanie

In this work, we introduce SATellite ImageNet (SATIN), a metadataset curated from 27 existing remotely sensed datasets, and comprehensively evaluate the zero-shot transfer classification capabilities of a broad range of vision-language (VL) models on SATIN.

Classification Image Classification

Can GPT-4 Perform Neural Architecture Search?

1 code implementation • 21 Apr 2023 • Mingkai Zheng, Xiu Su, Shan You, Fei Wang, Chen Qian, Chang Xu, Samuel Albanie

We investigate the potential of GPT-4 to perform Neural Architecture Search (NAS) -- the task of designing effective neural architectures.

Navigate Neural Architecture Search

Large Language Models are Few-shot Publication Scoopers

no code implementations • 2 Apr 2023 • Samuel Albanie, Liliane Momeni, João F. Henriques

Driven by recent advances in AI, we passengers are entering a golden age of scientific discovery.

DeepMIM: Deep Supervision for Masked Image Modeling

1 code implementation • 15 Mar 2023 • Sucheng Ren, Fangyun Wei, Samuel Albanie, Zheng Zhang, Han Hu

Deep supervision, which involves extra supervision of the intermediate features of a neural network, was widely used in image classification in the early deep learning era since it significantly reduces the training difficulty and eases optimization, for example by avoiding vanishing gradients, compared with vanilla training.

Image Classification object-detection +2

Moment Detection in Long Tutorial Videos

1 code implementation • ICCV 2023 • Ioana Croitoru, Simion-Vlad Bogolin, Samuel Albanie, Yang Liu, Zhaowen Wang, Seunghyun Yoon, Franck Dernoncourt, Hailin Jin, Trung Bui

To study this problem, we propose the first dataset of untrimmed, long-form tutorial videos for the task of Moment Detection called the Behance Moment Detection (BMD) dataset.

SuS-X: Training-Free Name-Only Transfer of Vision-Language Models

2 code implementations • ICCV 2023 • Vishaal Udandarao, Ankush Gupta, Samuel Albanie

Contrastive Language-Image Pre-training (CLIP) has emerged as a simple yet effective way to train large-scale vision-language models.

Retrieval Zero-Shot Learning

Weakly-supervised Fingerspelling Recognition in British Sign Language Videos

1 code implementation • 16 Nov 2022 • K R Prajwal, Hannah Bull, Liliane Momeni, Samuel Albanie, Gül Varol, Andrew Zisserman

Through extensive evaluations, we verify our method for automatic annotation and our model architecture.

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

6 code implementations • 9 Nov 2022 • BigScience Workshop: Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova del Moral, Olatunji Ruwase, Rachel Bawden, Stas Bekman, Angelina McMillan-Major, Iz Beltagy, Huu Nguyen, Lucile Saulnier, Samson Tan, Pedro Ortiz Suarez, Victor Sanh, Hugo Laurençon, Yacine Jernite, Julien Launay, Margaret Mitchell, Colin Raffel, Aaron Gokaslan, Adi Simhi, Aitor Soroa, Alham Fikri Aji, Amit Alfassy, Anna Rogers, Ariel Kreisberg Nitzav, Canwen Xu, Chenghao Mou, Chris Emezue, Christopher Klamm, Colin Leong, Daniel van Strien, David Ifeoluwa Adelani, Dragomir Radev, Eduardo González Ponferrada, Efrat Levkovizh, Ethan Kim, Eyal Bar Natan, Francesco De Toni, Gérard Dupont, Germán Kruszewski, Giada Pistilli, Hady Elsahar, Hamza Benyamina, Hieu Tran, Ian Yu, Idris Abdulmumin, Isaac Johnson, Itziar Gonzalez-Dios, Javier de la Rosa, Jenny Chim, Jesse Dodge, Jian Zhu, Jonathan Chang, Jörg Frohberg, Joseph Tobing, Joydeep Bhattacharjee, Khalid Almubarak, Kimbo Chen, Kyle Lo, Leandro von Werra, Leon Weber, Long Phan, Loubna Ben allal, Ludovic Tanguy, Manan Dey, Manuel Romero Muñoz, Maraim Masoud, María Grandury, Mario Šaško, Max Huang, Maximin Coavoux, Mayank Singh, Mike Tian-Jian Jiang, Minh Chien Vu, Mohammad A. Jauhar, Mustafa Ghaleb, Nishant Subramani, Nora Kassner, Nurulaqilla Khamis, Olivier Nguyen, Omar Espejel, Ona de Gibert, Paulo Villegas, Peter Henderson, Pierre Colombo, Priscilla Amuok, Quentin Lhoest, Rheza Harliman, Rishi Bommasani, Roberto Luis López, Rui Ribeiro, Salomey Osei, Sampo Pyysalo, Sebastian Nagel, Shamik Bose, Shamsuddeen Hassan Muhammad, Shanya Sharma, Shayne Longpre, Somaieh Nikpoor, Stanislav Silberberg, Suhas Pai, Sydney Zink, Tiago Timponi Torrent, Timo Schick, Tristan Thrush, Valentin Danchev, Vassilina Nikoulina, Veronika Laippala, Violette Lepercq, Vrinda Prabhu, Zaid Alyafeai, Zeerak Talat, Arun Raja, Benjamin Heinzerling, Chenglei Si, Davut Emre Taşar, Elizabeth Salesky, Sabrina J. Mielke, Wilson Y. Lee, Abheesht Sharma, Andrea Santilli, Antoine Chaffin, Arnaud Stiegler, Debajyoti Datta, Eliza Szczechla, Gunjan Chhablani, Han Wang, Harshit Pandey, Hendrik Strobelt, Jason Alan Fries, Jos Rozen, Leo Gao, Lintang Sutawika, M Saiful Bari, Maged S. Al-shaibani, Matteo Manica, Nihal Nayak, Ryan Teehan, Samuel Albanie, Sheng Shen, Srulik Ben-David, Stephen H. Bach, Taewoon Kim, Tali Bers, Thibault Fevry, Trishala Neeraj, Urmish Thakker, Vikas Raunak, Xiangru Tang, Zheng-Xin Yong, Zhiqing Sun, Shaked Brody, Yallow Uri, Hadar Tojarieh, Adam Roberts, Hyung Won Chung, Jaesung Tae, Jason Phang, Ofir Press, Conglong Li, Deepak Narayanan, Hatim Bourfoune, Jared Casper, Jeff Rasley, Max Ryabinin, Mayank Mishra, Minjia Zhang, Mohammad Shoeybi, Myriam Peyrounette, Nicolas Patry, Nouamane Tazi, Omar Sanseviero, Patrick von Platen, Pierre Cornette, Pierre François Lavallée, Rémi Lacroix, Samyam Rajbhandari, Sanchit Gandhi, Shaden Smith, Stéphane Requena, Suraj Patil, Tim Dettmers, Ahmed Baruwa, Amanpreet Singh, Anastasia Cheveleva, Anne-Laure Ligozat, Arjun Subramonian, Aurélie Névéol, Charles Lovering, Dan Garrette, Deepak Tunuguntla, Ehud Reiter, Ekaterina Taktasheva, Ekaterina Voloshina, Eli Bogdanov, Genta Indra Winata, Hailey Schoelkopf, Jan-Christoph Kalo, Jekaterina Novikova, Jessica Zosa Forde, Jordan Clive, Jungo Kasai, Ken Kawamura, Liam Hazan, Marine Carpuat, Miruna Clinciu, Najoung Kim, Newton Cheng, Oleg Serikov, Omer Antverg, Oskar van der Wal, Rui Zhang, Ruochen Zhang, Sebastian Gehrmann, Shachar Mirkin, Shani Pais, Tatiana Shavrina, Thomas Scialom, Tian Yun, Tomasz Limisiewicz, Verena Rieser, Vitaly Protasov, Vladislav Mikhailov, Yada Pruksachatkun, Yonatan Belinkov, Zachary Bamberger, Zdeněk Kasner, Alice Rueda, Amanda Pestana, Amir Feizpour, Ammar Khan, Amy Faranak, Ana Santos, Anthony Hevia, Antigona Unldreaj, Arash Aghagol, Arezoo Abdollahi, Aycha Tammour, Azadeh HajiHosseini, Bahareh Behroozi, Benjamin Ajibade, Bharat Saxena, Carlos Muñoz Ferrandis, Daniel McDuff, Danish Contractor, David Lansky, Davis David, Douwe Kiela, Duong A. Nguyen, Edward Tan, Emi Baylor, Ezinwanne Ozoani, Fatima Mirza, Frankline Ononiwu, Habib Rezanejad, Hessie Jones, Indrani Bhattacharya, Irene Solaiman, Irina Sedenko, Isar Nejadgholi, Jesse Passmore, Josh Seltzer, Julio Bonis Sanz, Livia Dutra, Mairon Samagaio, Maraim Elbadri, Margot Mieskes, Marissa Gerchick, Martha Akinlolu, Michael McKenna, Mike Qiu, Muhammed Ghauri, Mykola Burynok, Nafis Abrar, Nazneen Rajani, Nour Elkott, Nour Fahmy, Olanrewaju Samuel, Ran An, Rasmus Kromann, Ryan Hao, Samira Alizadeh, Sarmad Shubber, Silas Wang, Sourav Roy, Sylvain Viguier, Thanh Le, Tobi Oyebade, Trieu Le, Yoyo Yang, Zach Nguyen, Abhinav Ramesh Kashyap, Alfredo Palasciano, Alison Callahan, Anima Shukla, Antonio Miranda-Escalada, Ayush Singh, Benjamin Beilharz, Bo wang, Caio Brito, Chenxi Zhou, Chirag Jain, Chuxin Xu, Clémentine Fourrier, Daniel León Periñán, Daniel Molano, Dian Yu, Enrique Manjavacas, Fabio Barth, Florian Fuhrimann, Gabriel Altay, Giyaseddin Bayrak, Gully Burns, Helena U. Vrabec, Imane Bello, Ishani Dash, Jihyun Kang, John Giorgi, Jonas Golde, Jose David Posada, Karthik Rangasai Sivaraman, Lokesh Bulchandani, Lu Liu, Luisa Shinzato, Madeleine Hahn de Bykhovetz, Maiko Takeuchi, Marc Pàmies, Maria A Castillo, Marianna Nezhurina, Mario Sänger, Matthias Samwald, Michael Cullan, Michael Weinberg, Michiel De Wolf, Mina Mihaljcic, Minna Liu, Moritz Freidank, Myungsun Kang, Natasha Seelam, Nathan Dahlberg, Nicholas Michio Broad, Nikolaus Muellner, Pascale Fung, Patrick Haller, Ramya Chandrasekhar, Renata Eisenberg, Robert Martin, Rodrigo Canalli, Rosaline Su, Ruisi Su, Samuel Cahyawijaya, Samuele Garda, Shlok S Deshmukh, Shubhanshu Mishra, Sid Kiblawi, Simon Ott, Sinee Sang-aroonsiri, Srishti Kumar, Stefan Schweter, Sushil Bharati, Tanmay Laud, Théo Gigant, Tomoya Kainuma, Wojciech Kusa, Yanis Labrak, Yash Shailesh Bajaj, Yash Venkatraman, Yifan Xu, Yingxin Xu, Yu Xu, Zhe Tan, Zhongli Xie, Zifan Ye, Mathilde Bras, Younes Belkada, Thomas Wolf

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions.

Language Modelling Multilingual NLP

2,181

Crosslingual Generalization through Multitask Finetuning

1 code implementation • 3 Nov 2022 • Niklas Muennighoff, Thomas Wang, Lintang Sutawika, Adam Roberts, Stella Biderman, Teven Le Scao, M Saiful Bari, Sheng Shen, Zheng-Xin Yong, Hailey Schoelkopf, Xiangru Tang, Dragomir Radev, Alham Fikri Aji, Khalid Almubarak, Samuel Albanie, Zaid Alyafeai, Albert Webson, Edward Raff, Colin Raffel

We find finetuning large multilingual language models on English tasks with English prompts allows for task generalization to non-English languages that appear only in the pretraining corpus.

Ranked #1 on Question Answering on StoryCloze

Coreference Resolution Cross-Lingual Transfer +4

493

NamedMask: Distilling Segmenters from Complementary Foundation Models

1 code implementation • 22 Sep 2022 • Gyungin Shin, Weidi Xie, Samuel Albanie

Our method, termed NamedMask, begins by using CLIP to construct category-specific archives of images.

Data Augmentation Object +1

RLIP: Relational Language-Image Pre-training for Human-Object Interaction Detection

3 code implementations • 5 Sep 2022 • Hangjie Yuan, Jianwen Jiang, Samuel Albanie, Tao Feng, Ziyuan Huang, Dong Ni, Mingqian Tang

The task of Human-Object Interaction (HOI) detection targets fine-grained visual parsing of humans interacting with their environment, enabling a broad range of applications.

Ranked #16 on Human-Object Interaction Detection on HICO-DET

Human-Object Interaction Detection Relation +1

Automatic dense annotation of large-vocabulary sign language videos

no code implementations • 4 Aug 2022 • Liliane Momeni, Hannah Bull, K R Prajwal, Samuel Albanie, Gül Varol, Andrew Zisserman

Recently, sign language researchers have turned to sign language interpreted TV broadcasts, comprising (i) a video of continuous signing and (ii) subtitles corresponding to the audio content, as a readily available and large-scale source of training data.

ReCo: Retrieve and Co-segment for Zero-shot Transfer

2 code implementations • 14 Jun 2022 • Gyungin Shin, Weidi Xie, Samuel Albanie

Semantic segmentation has a broad range of applications, but its real-world impact has been significantly limited by the prohibitive annotation costs necessary to enable deployment.

Ranked #1 on Unsupervised Semantic Segmentation with Language-image Pre-training on COCO-Stuff-27

Retrieval Segmentation +1

Scaling up sign spotting through sign language dictionaries

no code implementations • 9 May 2022 • Gül Varol, Liliane Momeni, Samuel Albanie, Triantafyllos Afouras, Andrew Zisserman

The focus of this work is sign spotting - given a video of an isolated sign, our task is to identify whether and where it has been signed in a continuous, co-articulated sign language video.

Multiple Instance Learning

A 23 MW data centre is all you need

no code implementations • 31 Mar 2022 • Samuel Albanie, Dylan Campbell, João F. Henriques

The field of machine learning has achieved striking progress in recent years, witnessing breakthrough results on language modelling, protein folding and nitpickingly fine-grained dog breed classification.

Board Games Language Modelling +1

Unsupervised Salient Object Detection with Spectral Cluster Voting

1 code implementation • 23 Mar 2022 • Gyungin Shin, Samuel Albanie, Weidi Xie

In this paper, we tackle the challenging task of unsupervised salient object detection (SOD) by leveraging spectral clustering on self-supervised features.

Ranked #1 on Unsupervised Saliency Detection on ECSSD

Clustering Object +5

Sign Language Video Retrieval with Free-Form Textual Queries

no code implementations • CVPR 2022 • Amanda Duarte, Samuel Albanie, Xavier Giró-i-Nieto, Gül Varol

Systems that can efficiently search collections of sign language videos have been highlighted as a useful application of sign language technology.

Retrieval Sentence +2

Cross Modal Retrieval with Querybank Normalisation

1 code implementation • CVPR 2022 • Simion-Vlad Bogolin, Ioana Croitoru, Hailin Jin, Yang Liu, Samuel Albanie

In this work we first show that, despite their effectiveness, state-of-the-art joint embeddings suffer significantly from the longstanding "hubness problem" in which a small number of gallery embeddings form the nearest neighbours of many queries.

Ranked #5 on Video Retrieval on QuerYD

Cross-Modal Retrieval Metric Learning +3

Audio Retrieval with Natural Language Queries: A Benchmark Study

1 code implementation • 17 Dec 2021 • A. Sophia Koepke, Andreea-Maria Oncescu, João F. Henriques, Zeynep Akata, Samuel Albanie

Additionally, we introduce the SoundDescs benchmark, which consists of paired audio and natural language descriptions for a diverse collection of sounds that are complementary to those found in AudioCaps and Clotho.

Ranked #1 on Audio to Text Retrieval on SoundDescs

AudioCaps Audio captioning +5

BBC-Oxford British Sign Language Dataset

no code implementations • 5 Nov 2021 • Samuel Albanie, Gül Varol, Liliane Momeni, Hannah Bull, Triantafyllos Afouras, Himel Chowdhury, Neil Fox, Bencie Woll, Rob Cooper, Andrew McParland, Andrew Zisserman

In this work, we introduce the BBC-Oxford British Sign Language (BOBSL) dataset, a large-scale video collection of British Sign Language (BSL).

Sign Language Translation Translation

Adaptive Cross-Modal Prototypes for Cross-Domain Visual-Language Retrieval

no code implementations • CVPR 2021 • Yang Liu, Qingchao Chen, Samuel Albanie

In this paper, we study the task of visual-text retrieval in the highly practical setting in which labelled visual data with paired text descriptions are available in one domain (the "source"), but only unlabelled visual data (without text descriptions) are available in the domain of interest (the "target").

Inductive Bias Retrieval +1

Aligning Subtitles in Sign Language Videos

no code implementations • ICCV 2021 • Hannah Bull, Triantafyllos Afouras, Gül Varol, Samuel Albanie, Liliane Momeni, Andrew Zisserman

The goal of this work is to temporally align asynchronous subtitles in sign language videos.

Machine Translation Translation

Audio Retrieval with Natural Language Queries

1 code implementation • 5 May 2021 • Andreea-Maria Oncescu, A. Sophia Koepke, João F. Henriques, Zeynep Akata, Samuel Albanie

We consider the task of retrieving audio using free-form natural language queries.

Ranked #1 on Audio/Video to Text Retrieval on AudioCaps

AudioCaps Audio to Text Retrieval +5

Sign Segmentation with Changepoint-Modulated Pseudo-Labelling

1 code implementation • 28 Apr 2021 • Katrin Renz, Nicolaj C. Stache, Neil Fox, Gül Varol, Samuel Albanie

The objective of this work is to find temporal boundaries between signs in continuous sign language.

Segmentation Source-Free Domain Adaptation

TEACHTEXT: CrossModal Generalized Distillation for Text-Video Retrieval

1 code implementation • ICCV 2021 • Ioana Croitoru, Simion-Vlad Bogolin, Marius Leordeanu, Hailin Jin, Andrew Zisserman, Samuel Albanie, Yang Liu

In recent years, considerable progress on the task of text-video retrieval has been achieved by leveraging large-scale pretraining on visual and audio datasets to construct powerful video encoders.

Retrieval Video Retrieval

327

All you need are a few pixels: semantic segmentation with PixelPick

2 code implementations • 13 Apr 2021 • Gyungin Shin, Weidi Xie, Samuel Albanie

A central challenge for the task of semantic segmentation is the prohibitive cost of obtaining dense pixel-level annotations to supervise model training.

Active Learning Segmentation +1

On the Origin of Species of Self-Supervised Learning

no code implementations • 31 Mar 2021 • Samuel Albanie, Erika Lu, Joao F. Henriques

In the quiet backwaters of cs.CV, cs.LG and stat.ML, a cornucopia of new learning systems is emerging from a primordial soup of mathematics - learning systems with no need for external supervision.

Self-Supervised Learning

Read and Attend: Temporal Localisation in Sign Language Videos

no code implementations • CVPR 2021 • Gül Varol, Liliane Momeni, Samuel Albanie, Triantafyllos Afouras, Andrew Zisserman

Our contributions are as follows: (1) we demonstrate the ability to leverage large quantities of continuous signing videos with weakly-aligned subtitles to localise signs in continuous sign language; (2) we employ the learned attention to automatically generate hundreds of thousands of annotations for a large sign vocabulary; (3) we collect a set of 37K manually verified sign instances across a vocabulary of 950 sign classes to support our study of sign language recognition; (4) by training on the newly annotated data from our method, we outperform the prior state of the art on the BSL-1K sign language recognition benchmark.

Sign Language Recognition

Quantum Self-Supervised Learning

2 code implementations • 26 Mar 2021 • Ben Jaderberg, Lewis W. Anderson, Weidi Xie, Samuel Albanie, Martin Kiffner, Dieter Jaksch

The resurgence of self-supervised learning, whereby a deep learning model generates its own supervisory signal from the data, promises a scalable way to tackle the dramatically increasing size of real-world data sets without human annotation.

Self-Supervised Learning

Sign language segmentation with temporal convolutional networks

1 code implementation • 25 Nov 2020 • Katrin Renz, Nicolaj C. Stache, Samuel Albanie, Gül Varol

The objective of this work is to determine the location of temporal boundaries between signs in continuous sign language videos.

QuerYD: A video dataset with high-quality text and audio narrations

2 code implementations • 22 Nov 2020 • Andreea-Maria Oncescu, João F. Henriques, Yang Liu, Andrew Zisserman, Samuel Albanie

We introduce QuerYD, a new large-scale dataset for retrieval and event localisation in video.

Retrieval Video Understanding +1

Watch, read and lookup: learning to spot signs from multiple supervisors

1 code implementation • 8 Oct 2020 • Liliane Momeni, Gül Varol, Samuel Albanie, Triantafyllos Afouras, Andrew Zisserman

The focus of this work is sign spotting - given a video of an isolated sign, our task is to identify whether and where it has been signed in a continuous, co-articulated sign language video.

Multiple Instance Learning

Seeing wake words: Audio-visual Keyword Spotting

1 code implementation • 2 Sep 2020 • Liliane Momeni, Triantafyllos Afouras, Themos Stafylakis, Samuel Albanie, Andrew Zisserman

The goal of this work is to automatically determine whether and when a word of interest is spoken by a talking face, with or without the audio.

Lip Reading Visual Keyword Spotting

The End-of-End-to-End: A Video Understanding Pentathlon Challenge (2020)

1 code implementation • 3 Aug 2020 • Samuel Albanie, Yang Liu, Arsha Nagrani, Antoine Miech, Ernesto Coto, Ivan Laptev, Rahul Sukthankar, Bernard Ghanem, Andrew Zisserman, Valentin Gabeur, Chen Sun, Karteek Alahari, Cordelia Schmid, Shi-Zhe Chen, Yida Zhao, Qin Jin, Kaixu Cui, Hui Liu, Chen Wang, Yudong Jiang, Xiaoshuai Hao

This report summarizes the results of the first edition of the challenge together with the findings of the participants.

Natural Language Queries Retrieval +3

327

BSL-1K: Scaling up co-articulated sign language recognition using mouthing cues

1 code implementation • ECCV 2020 • Samuel Albanie, Gül Varol, Liliane Momeni, Triantafyllos Afouras, Joon Son Chung, Neil Fox, Andrew Zisserman

Recent progress in fine-grained gesture and action classification, and machine translation, points to the possibility of automated sign language recognition becoming a reality.

Ranked #4 on Sign Language Recognition on WLASL-2000

Action Classification Keyword Spotting +2

State-of-Art-Reviewing: A Radical Proposal to Improve Scientific Publication

no code implementations • 31 Mar 2020 • Samuel Albanie, Jaime Thewmore, Robert McCraith, Joao F. Henriques

Peer review forms the backbone of modern scientific manuscript evaluation.

Iterative Averaging in the Quest for Best Test Error

no code implementations • 2 Mar 2020 • Diego Granziol, Xingchen Wan, Samuel Albanie, Stephen Roberts

We analyse and explain the increased generalisation performance of iterate averaging using a Gaussian process perturbation model between the true and batch risk surface on the high dimensional quadratic.

Image Classification

Disentangled Speech Embeddings using Cross-modal Self-supervision

no code implementations • 20 Feb 2020 • Arsha Nagrani, Joon Son Chung, Samuel Albanie, Andrew Zisserman

The objective of this paper is to learn representations of speaker identity without access to manually annotated data.

Self-Supervised Learning Speaker Recognition

Unsupervised Learning of Landmarks by Descriptor Vector Exchange

1 code implementation • ICCV 2019 • James Thewlis, Samuel Albanie, Hakan Bilen, Andrea Vedaldi

Equivariance to random image transformations is an effective method to learn landmarks of object categories, such as the eyes and the nose in faces, without manual supervision.

Ranked #1 on Unsupervised Facial Landmark Detection on 300W

Object Unsupervised Facial Landmark Detection

Use What You Have: Video Retrieval Using Representations From Collaborative Experts

3 code implementations • 31 Jul 2019 • Yang Liu, Samuel Albanie, Arsha Nagrani, Andrew Zisserman

The rapid growth of video on the internet has made searching for video content using natural language queries a significant challenge.

Ranked #24 on Video Retrieval on MSVD

Natural Language Queries Retrieval +2

327

Deep Industrial Espionage

no code implementations • 1 Apr 2019 • Samuel Albanie, James Thewlis, Sebastien Ehrhardt, Joao Henriques

The theory of deep learning is now considered largely solved, and is well understood by researchers and influencers alike.

Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks

9 code implementations • NeurIPS 2018 • Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Andrea Vedaldi

We also propose a parametric gather-excite operator pair which yields further performance gains, relate it to the recently-introduced Squeeze-and-Excitation Networks, and analyse the effects of these changes to the CNN feature activation statistics.

29,648

Emotion Recognition in Speech using Cross-Modal Transfer in the Wild

no code implementations • 16 Aug 2018 • Samuel Albanie, Arsha Nagrani, Andrea Vedaldi, Andrew Zisserman

We make the following contributions: (i) we develop a strong teacher network for facial emotion recognition that achieves the state of the art on a standard benchmark; (ii) we use the teacher to train a student, tabula rasa, to learn representations (embeddings) for speech emotion recognition without access to labelled audio data; and (iii) we show that the speech emotion embedding can be used for speech emotion recognition on external benchmark datasets.

Ranked #3 on Facial Expression Recognition (FER) on FERPlus

Facial Emotion Recognition Facial Expression Recognition (FER) +1

Semi-convolutional Operators for Instance Segmentation

no code implementations • ECCV 2018 • David Novotny, Samuel Albanie, Diane Larlus, Andrea Vedaldi

Object detection and instance segmentation are dominated by region-based methods such as Mask RCNN.

Instance Segmentation object-detection +3

PyTorch CurveBall - A second-order optimizer for deep networks

1 code implementation • 21 May 2018 • João F. Henriques, Sebastien Ehrhardt, Samuel Albanie, Andrea Vedaldi

We propose a fast second-order method that can be used as a drop-in replacement for current deep learning solvers.

Small steps and giant leaps: Minimal Newton solvers for Deep Learning

6 code implementations • ICLR 2019 • João F. Henriques, Sebastien Ehrhardt, Samuel Albanie, Andrea Vedaldi

Instead, we propose to keep a single estimate of the gradient projected by the inverse Hessian matrix, and update it once per iteration.

Learnable PINs: Cross-Modal Embeddings for Person Identity

1 code implementation • ECCV 2018 • Arsha Nagrani, Samuel Albanie, Andrew Zisserman

We propose and investigate an identity sensitive joint embedding of face and voice.

Cross-Modal Retrieval Retrieval

Self-supervised Learning of Geometrically Stable Features Through Probabilistic Introspection

no code implementations • CVPR 2018 • David Novotny, Samuel Albanie, Diane Larlus, Andrea Vedaldi

Self-supervision can dramatically cut back the amount of manually-labelled data required to train deep neural networks.

Image Classification Self-Supervised Learning +1

Substitute Teacher Networks: Learning with Almost No Supervision

1 code implementation • 1 Apr 2018 • Samuel Albanie, James Thewlis, Joao F. Henriques

Learning through experience is time-consuming, inefficient and often bad for your cortisol levels.

Seeing Voices and Hearing Faces: Cross-modal biometric matching

no code implementations • CVPR 2018 • Arsha Nagrani, Samuel Albanie, Andrew Zisserman

We make the following contributions: (i) we introduce CNN architectures for both binary and multi-way cross-modal face and audio matching, (ii) we compare dynamic testing (where video information is available, but the audio is not from the same video) with static testing (where only a single still image is available), and (iii) we use human testing as a baseline to calibrate the difficulty of the task.

Face Recognition Speaker Identification

Squeeze-and-Excitation Networks

82 code implementations • CVPR 2018 • Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Enhua Wu

Squeeze-and-Excitation Networks formed the foundation of our ILSVRC 2017 classification submission which won first place and reduced the top-5 error to 2.251%, surpassing the winning entry of 2016 by a relative improvement of ~25%.

Ranked #59 on Image Classification on CIFAR-10

Image Classification

38,252

Stopping GAN Violence: Generative Unadversarial Networks

1 code implementation • 7 Mar 2017 • Samuel Albanie, Sébastien Ehrhardt, João F. Henriques

While the costs of human violence have attracted a great deal of attention from the research community, the effects of the network-on-network (NoN) violence popularised by Generative Adversarial Networks have yet to be addressed.

194
