no code implementations • 10 Mar 2025 • Pramit Saha, Divyanshu Mishra, Netzahualcoyotl Hernandez-Cruz, Olga Patey, Aris Papageorghiou, Yuki M. Asano, J. Alison Noble
To address these challenges, we introduce, for the first time, a novel privacy-preserving, zero-shot CHD detection framework that formulates CHD detection as a normality modeling problem integrated with model merging.
no code implementations • 16 Jan 2025 • Kohei Torimi, Ryosuke Yamada, Daichi Otsuka, Kensho Hara, Yuki M. Asano, Hirokatsu Kataoka, Yoshimitsu Aoki
Zero-shot recognition models require extensive training data for generalization.
1 code implementation • 15 Dec 2024 • Mohammadreza Salehi, Nikolaos Apostolikas, Efstratios Gavves, Cees G. M. Snoek, Yuki M. Asano
Adapting to our object-level definition of 'normal', we modify knowledge distillation frameworks, where a student network learns from a pre-trained teacher network.
no code implementations • 14 Oct 2024 • Aritra Bhowmik, Mohammad Mahdi Derakhshani, Dennis Koelma, Martin R. Oswald, Yuki M. Asano, Cees G. M. Snoek
Yet, without vast amounts of spatial supervision, current Visual Language Models (VLMs) struggle at this task.
1 code implementation • 13 Oct 2024 • Ivona Najdenkoska, Mohammad Mahdi Derakhshani, Yuki M. Asano, Nanne van Noord, Marcel Worring, Cees G. M. Snoek
By effectively encoding captions longer than the default 77 tokens, our model outperforms baselines on cross-modal tasks such as retrieval and text-to-image generation.
no code implementations • 10 Oct 2024 • Daniel Cores, Michael Dorkenwald, Manuel Mucientes, Cees G. M. Snoek, Yuki M. Asano
Large language models have demonstrated impressive performance when integrated with vision models, even enabling video understanding.
no code implementations • 9 Oct 2024 • Jona Ruthardt, Gertjan J. Burghouts, Serge Belongie, Yuki M. Asano
To this end, we propose the Visual Text Representation Benchmark (ViTeRB) to isolate key properties that make language models well-aligned with the visual world.
1 code implementation • 11 Sep 2024 • Alfonso Taboada Warmerdam, Mathilde Caron, Yuki M. Asano
We validate the usefulness of learning binary masks as a fine-tuning method on 8 datasets and 3 model architectures, and we demonstrate the effectiveness of SMNs in 3 label-efficient settings.
1 code implementation • 5 Sep 2024 • Marga Don, Stijn Pinson, Blanca Guillen Cebrian, Yuki M. Asano
In this work, we compare the performance of FMs to finetuned pre-trained supervised models in the task of semantic segmentation on an entirely new dataset.
1 code implementation • 1 Sep 2024 • Go Ohtani, Ryu Tadokoro, Ryosuke Yamada, Yuki M. Asano, Iro Laina, Christian Rupprecht, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka, Yoshimitsu Aoki
In this work, we investigate the understudied effect of the training data used for image super-resolution (SR).
2 code implementations • 26 Aug 2024 • Sarah Rastegar, Mohammadreza Salehi, Yuki M. Asano, Hazel Doughty, Cees G. M. Snoek
In this paper, we address Generalized Category Discovery, aiming to simultaneously uncover novel categories and accurately classify known ones.
no code implementations • 20 Aug 2024 • Valentinos Pariza, Mohammadreza Salehi, Gertjan Burghouts, Francesco Locatello, Yuki M. Asano
We introduce NeCo: Patch Neighbor Consistency, a novel self-supervised training loss that enforces patch-level nearest neighbor consistency across a student and teacher model.
1 code implementation • 1 Aug 2024 • Ryo Nakamura, Ryu Tadokoro, Ryosuke Yamada, Yuki M. Asano, Iro Laina, Christian Rupprecht, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka
To this end, we search for a minimal, purely synthetic pre-training dataset that allows us to achieve performance similar to the 1 million images of ImageNet-1k.
no code implementations • 22 Jul 2024 • Mohammadreza Salehi, Michael Dorkenwald, Fida Mohammad Thoker, Efstratios Gavves, Cees G. M. Snoek, Yuki M. Asano
To tackle this, we present Sinkhorn-guided Masked Video Modelling (SIGMA), a novel video pretraining method that jointly learns the video model in addition to a target feature space using a projection network.
1 code implementation • 17 Jul 2024 • Luc P. J. Sträter, Mohammadreza Salehi, Efstratios Gavves, Cees G. M. Snoek, Yuki M. Asano
These features are fed to an attention-based discriminator, which is trained to score every patch in the image.
Ranked #1 on Anomaly Detection on One-class CIFAR-100
1 code implementation • 15 Jul 2024 • Walter Simoncini, Spyros Gidaris, Andrei Bursuc, Yuki M. Asano
This paper introduces FUNGI, Features from UNsupervised GradIents, a method to enhance the features of transformer encoders by leveraging self-supervised gradients.
1 code implementation • 18 Jun 2024 • Sunny Soni, Aaqib Saeed, Yuki M. Asano
To this end, in this paper, we introduce a new method that improves this knowledge distillation method to only rely on a single shared image between clients and server.
no code implementations • 27 May 2024 • Laurens Samson, Nimrod Barazani, Sennay Ghebreab, Yuki M. Asano
This paper aims to advance our understanding of how Visual Language Models (VLMs) handle privacy-sensitive information, a crucial concern as these technologies become integral to everyday life.
no code implementations • 23 May 2024 • Dawid J. Kopiczko, Tijmen Blankevoort, Yuki M. Asano
We introduce Bitune, a method that improves instruction-tuning of pretrained decoder-only large language models, leading to consistent gains on downstream tasks.
no code implementations • 26 Apr 2024 • Sotirios Konstantakos, Jorgen Cani, Ioannis Mademlis, Despina Ioanna Chalkiadaki, Yuki M. Asano, Efstratios Gavves, Georgios Th. Papadopoulos
Self-Supervised Learning (SSL) is a valuable and robust training methodology for contemporary Deep Neural Networks (DNNs), enabling unsupervised pretraining on a 'pretext task' that does not require ground-truth labels/annotation.
no code implementations • 20 Apr 2024 • Rob Romijnders, Christos Louizos, Yuki M. Asano, Max Welling
The COVID-19 pandemic had enormous economic and societal consequences.
1 code implementation • 22 Feb 2024 • Abhishek Jha, Matthew B. Blaschko, Yuki M. Asano, Tinne Tuytelaars
The last couple of years have witnessed tremendous progress in self-supervised learning (SSL), the success of which can be attributed to the introduction of useful inductive biases in the learning process to learn meaningful visual representations while avoiding collapse.
no code implementations • CVPR 2024 • Michael Dorkenwald, Nimrod Barazani, Cees G. M. Snoek, Yuki M. Asano
Vision-Language Models (VLMs), such as Flamingo and GPT-4V, have shown immense potential by integrating large language models with vision systems.
no code implementations • 11 Jan 2024 • Kumara Kahatapitiya, Adil Karjauv, Davide Abati, Fatih Porikli, Yuki M. Asano, Amirhossein Habibian
Both techniques are readily applicable to a given video editing model without retraining, and can drastically reduce its memory and computational cost.
1 code implementation • 28 Dec 2023 • Tycho F. A. van der Ouderaa, Markus Nagel, Mart van Baalen, Yuki M. Asano, Tijmen Blankevoort
Experimentally, our method can prune rows and columns from a range of OPT models and Llama-v2-7B by 20%-30%, with a negligible loss in performance, and achieve state-of-the-art results in unstructured and semi-structured pruning of large language models.
no code implementations • 18 Dec 2023 • Rob Romijnders, Christos Louizos, Yuki M. Asano, Max Welling
The pandemic in 2020 and 2021 had enormous economic and societal consequences, and studies show that contact tracing algorithms can be key in the early containment of the virus.
no code implementations • 14 Dec 2023 • Shijie Li, Farhad G. Zanjani, Haitam Ben Yahia, Yuki M. Asano, Juergen Gall, Amirhossein Habibian
This is because the source-view images and corresponding poses are processed separately and injected into the model at different stages.
no code implementations • 14 Dec 2023 • Vincent Tao Hu, Yunlu Chen, Mathilde Caron, Yuki M. Asano, Cees G. M. Snoek, Bjorn Ommer
However, recent studies have revealed that the feature representation derived from the diffusion model itself is discriminative for numerous downstream tasks as well, which prompts us to propose a framework to extract guidance from, and specifically for, diffusion models.
no code implementations • 17 Oct 2023 • Dawid J. Kopiczko, Tijmen Blankevoort, Yuki M. Asano
Low-rank adaptation (LoRA) is a popular method that reduces the number of trainable parameters when finetuning large language models, but still faces acute storage challenges when scaling to even larger models or deploying numerous per-user or per-task adapted models.
no code implementations • 12 Oct 2023 • Shashanka Venkataramanan, Mamshad Nayeem Rizve, João Carreira, Yuki M. Asano, Yannis Avrithis
But are we making the best use of data?
no code implementations • 30 Sep 2023 • Mohammad Mahdi Derakhshani, Ivona Najdenkoska, Cees G. M. Snoek, Marcel Worring, Yuki M. Asano
We present Self-Context Adaptation (SeCAt), a self-supervised approach that unlocks few-shot abilities for open-ended classification with small visual language models.
1 code implementation • ICCV 2023 • Mohammadreza Salehi, Efstratios Gavves, Cees G. M. Snoek, Yuki M. Asano
Our paper aims to address this gap by proposing a novel approach that incorporates temporal consistency in dense self-supervised learning.
no code implementations • 14 Aug 2023 • Winfried van den Dool, Tijmen Blankevoort, Max Welling, Yuki M. Asano
In the past years, the application of neural networks as an alternative to classical numerical methods to solve Partial Differential Equations has emerged as a potential paradigm shift in this century-old mathematical field.
1 code implementation • CVPR 2024 • Lukas Knobel, Tengda Han, Yuki M. Asano
While recent supervised methods for reference-based object counting continue to improve the performance on benchmark datasets, they have to rely on small datasets due to the cost associated with manually annotating dozens of objects in images.
1 code implementation • 16 Jun 2023 • Phillip Lippe, Sara Magliacane, Sindy Löwe, Yuki M. Asano, Taco Cohen, Efstratios Gavves
Identifying the causal variables of an environment and how to intervene on them is of core value in applications such as robotics and embodied AI.
no code implementations • ICCV 2023 • Pengwan Yang, Cees G. M. Snoek, Yuki M. Asano
In this paper we address the task of finding representative subsets of points in a 3D point cloud by means of a point-wise ordering.
1 code implementation • 1 Feb 2023 • Mert Kilickaya, Joost Van de Weijer, Yuki M. Asano
The current dominant paradigm when building a machine learning model is to iterate over a dataset over and over until convergence.
no code implementations • 5 Jan 2023 • Shashanka Venkataramanan, Amir Ghodrati, Yuki M. Asano, Fatih Porikli, Amirhossein Habibian
This work aims to improve the efficiency of vision transformers (ViT).
1 code implementation • 19 Oct 2022 • Laura Hanu, James Thewlis, Yuki M. Asano, Christian Rupprecht
In this paper, we a) introduce a new dataset of videos, titles and comments; b) present an attention-based mechanism that allows the model to learn from sometimes irrelevant data such as comments; c) show that by using comments, our method is able to learn better, more contextualised representations for image, video and audio.
1 code implementation • 12 Oct 2022 • Jochem Loedeman, Maarten C. Stol, Tengda Han, Yuki M. Asano
With the introduction of the transformer architecture in computer vision, increasing model scale has been demonstrated as a clear path to achieving performance and robustness gains.
1 code implementation • CVPR 2023 • Vincent Tao Hu, David W Zhang, Yuki M. Asano, Gertjan J. Burghouts, Cees G. M. Snoek
Diffusion models have demonstrated remarkable progress in image generation quality, especially when guidance is used to control the generative process.
no code implementations • 7 Sep 2022 • Iro Laina, Yuki M. Asano, Andrea Vedaldi
Self-supervised visual representation learning has recently attracted significant research interest.
1 code implementation • 13 Jun 2022 • Phillip Lippe, Sara Magliacane, Sindy Löwe, Yuki M. Asano, Taco Cohen, Efstratios Gavves
To address this issue, we propose iCITRIS, a causal representation learning method that allows for instantaneous effects in intervened temporal sequences when intervention targets can be observed, e.g., as actions of an agent.
1 code implementation • NAACL (GeBNLP) 2022 • Conrad Borchers, Dalia Sara Gala, Benjamin Gilburt, Eduard Oravkin, Wilfried Bounsi, Yuki M. Asano, Hannah Rose Kirk
The growing capability and availability of generative language models have enabled a wide range of new downstream tasks.
1 code implementation • CVPR 2022 • Adrian Ziegler, Yuki M. Asano
However, learning dense representations is challenging, as in the unsupervised context it is not clear how to guide the model to learn representations that correspond to various potential object categories.
Ranked #5 on Unsupervised Semantic Segmentation on PASCAL VOC 2012 val (using extra training data)
no code implementations • 19 Apr 2022 • Pengwan Yang, Yuki M. Asano, Pascal Mettes, Cees G. M. Snoek
The goal of this paper is to bypass the need for labelled examples in few-shot video understanding at run time.
2 code implementations • 7 Feb 2022 • Phillip Lippe, Sara Magliacane, Sindy Löwe, Yuki M. Asano, Taco Cohen, Efstratios Gavves
Understanding the latent causal factors of a dynamical system from visual observations is considered a crucial step towards agents reasoning in complex environments.
1 code implementation • 1 Dec 2021 • Yuki M. Asano, Aaqib Saeed
What can neural networks learn about the visual world when provided with only a single image as input?
1 code implementation • NeurIPS Workshop ImageNet_PPF 2021 • Yuki M. Asano, Christian Rupprecht, Andrew Zisserman, Andrea Vedaldi
On the other hand, state-of-the-art pretraining is nowadays obtained with unsupervised methods, meaning that labelled datasets such as ImageNet may not be necessary, or perhaps not even optimal, for model pretraining.
no code implementations • ACL (WOAH) 2021 • Hannah Rose Kirk, Yennie Jun, Paulius Rauba, Gal Wachtel, Ruining Li, Xingjian Bai, Noah Broestl, Martin Doff-Sotta, Aleksandar Shtedritski, Yuki M. Asano
In this paper, we collect hateful and non-hateful memes from Pinterest to evaluate out-of-sample performance on models pre-trained on the Facebook dataset.
2 code implementations • NeurIPS 2021 • Mandela Patrick, Dylan Campbell, Yuki M. Asano, Ishan Misra, Florian Metze, Christoph Feichtenhofer, Andrea Vedaldi, João F. Henriques
In video transformers, the time dimension is often treated in the same way as the two spatial dimensions.
Ranked #16 on Action Recognition on EPIC-KITCHENS-100 (using extra training data)
no code implementations • CVPR 2022 • Triantafyllos Afouras, Yuki M. Asano, Francois Fagan, Andrea Vedaldi, Florian Metze
We tackle the problem of learning object detectors without supervision.
1 code implementation • ICCV 2021 • Mandela Patrick, Yuki M. Asano, Bernie Huang, Ishan Misra, Florian Metze, Joao Henriques, Andrea Vedaldi
First, for space, we show that spatial augmentations such as cropping do work well for videos too, but that previous implementations, due to the high processing and memory cost, could not do this at a scale sufficient for it to work well.
no code implementations • 11 Mar 2021 • Peiyang He, Charlie Griffin, Krzysztof Kacprzyk, Artjom Joosen, Michael Collyer, Aleksandar Shtedritski, Yuki M. Asano
Privacy considerations and bias in datasets are quickly becoming high-priority issues that the computer vision community needs to face.
1 code implementation • NeurIPS 2021 • Hannah Kirk, Yennie Jun, Haider Iqbal, Elias Benussi, Filippo Volpin, Frederic A. Dreyer, Aleksandar Shtedritski, Yuki M. Asano
Using a template-based data collection pipeline, we collect 396K sentence completions made by GPT-2 and find: (i) The machine-predicted jobs are less diverse and more stereotypical for women than for men, especially for intersections; (ii) Intersectional interactions are highly relevant for occupational associations, which we quantify by fitting 262 logistic models; (iii) For most occupations, GPT-2 reflects the skewed gender and ethnicity distribution found in US Labor Bureau data, and even pulls the societally-skewed distribution towards gender parity in cases where its predictions deviate from real labor market observations.
1 code implementation • NeurIPS 2020 • Yuki M. Asano, Mandela Patrick, Christian Rupprecht, Andrea Vedaldi
A large part of the current success of deep learning lies in the effectiveness of data -- more precisely: labelled data.
1 code implementation • ICCV 2021 • Mandela Patrick, Yuki M. Asano, Polina Kuznetsova, Ruth Fong, João F. Henriques, Geoffrey Zweig, Andrea Vedaldi
In the image domain, excellent representations can be learned by inducing invariance to content-preserving transformations via noise contrastive learning.
2 code implementations • ICLR 2020 • Yuki M. Asano, Christian Rupprecht, Andrea Vedaldi
We look critically at popular self-supervision techniques for learning deep convolutional neural networks without manual labels.