Search Results for author: Basil Mustafa

Found 25 papers, 14 papers with code

From Sparse to Soft Mixtures of Experts

3 code implementations 2 Aug 2023 Joan Puigcerver, Carlos Riquelme, Basil Mustafa, Neil Houlsby

Sparse mixture-of-experts (MoE) architectures scale model capacity without large increases in training or inference costs.
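The soft-routing idea behind this paper can be sketched in a few lines of NumPy. This is an illustrative simplification, not the paper's exact formulation: the routing parameters `phi`, the function name `soft_moe_layer`, and the one-expert-per-slot setup are assumptions made for the sketch.

```python
import numpy as np

def softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def soft_moe_layer(X, phi, experts):
    """One Soft MoE layer: every slot receives a soft (weighted) mix of all
    tokens, and every token receives a soft mix of all slot outputs, so the
    layer is fully differentiable -- no hard token-to-expert assignment."""
    logits = X @ phi                       # (n_tokens, n_slots)
    dispatch = softmax(logits, axis=0)     # per slot: convex combination of tokens
    combine = softmax(logits, axis=1)      # per token: convex combination of slots
    slot_inputs = dispatch.T @ X           # (n_slots, d)
    slot_outputs = np.stack([f(s) for f, s in zip(experts, slot_inputs)])
    return combine @ slot_outputs          # (n_tokens, d)
```

Because dispatch and combine are both softmaxes, every expert sees a gradient from every token, which is what distinguishes the soft formulation from sparse top-k routing.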

Massively Scaling Heteroscedastic Classifiers

no code implementations 30 Jan 2023 Mark Collier, Rodolphe Jenatton, Basil Mustafa, Neil Houlsby, Jesse Berent, Effrosyni Kokiopoulou

Heteroscedastic classifiers, which learn a multivariate Gaussian distribution over prediction logits, have been shown to perform well on image classification problems with hundreds to thousands of classes.

Tasks: Classification, Contrastive Learning, +1
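The core prediction step of a heteroscedastic classifier can be sketched as a Monte Carlo average of softmaxes over sampled logits. The diagonal noise scale and the function name are assumptions for the sketch; the paper's point is scaling this idea, e.g. via low-rank covariance structure.

```python
import numpy as np

def heteroscedastic_probs(mean_logits, scale_logits, n_samples=1000,
                          temperature=1.0, seed=0):
    """Monte Carlo class probabilities when logits are modelled as a
    Gaussian: a mean plus input-dependent (here diagonal) noise scale."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(size=(n_samples, mean_logits.shape[-1]))
    z = (mean_logits + eps * scale_logits) / temperature  # sampled logits
    z = z - z.max(axis=-1, keepdims=True)                 # stable softmax
    p = np.exp(z)
    p = p / p.sum(axis=-1, keepdims=True)
    return p.mean(axis=0)                                 # average over samples
```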

CLIPPO: Image-and-Language Understanding from Pixels Only

1 code implementation CVPR 2023 Michael Tschannen, Basil Mustafa, Neil Houlsby

Multimodal models are becoming increasingly effective, in part due to unified components, such as the Transformer architecture.

Tasks: Contrastive Learning, Image Classification, +6

Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints

1 code implementation 9 Dec 2022 Aran Komatsuzaki, Joan Puigcerver, James Lee-Thorp, Carlos Riquelme Ruiz, Basil Mustafa, Joshua Ainslie, Yi Tay, Mostafa Dehghani, Neil Houlsby

In this work, we propose sparse upcycling -- a simple way to reuse sunk training costs by initializing a sparsely activated Mixture-of-Experts model from a dense checkpoint.
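The initialization trick described above is simple to sketch: every expert in the new MoE block starts as a copy of the dense checkpoint's MLP, while the router is new. The near-zero router init and the parameter layout here are illustrative assumptions, not the paper's exact recipe.

```python
import copy
import numpy as np

def upcycle_mlp_block(dense_mlp_params, n_experts, d_model, seed=0):
    """Turn one dense MLP block from a checkpoint into an MoE block:
    each expert is an exact copy of the dense MLP, so the upcycled model
    starts at the dense model's quality; the fresh router is initialized
    near zero so early routing is close to uniform."""
    rng = np.random.default_rng(seed)
    return {
        "experts": [copy.deepcopy(dense_mlp_params) for _ in range(n_experts)],
        "router": rng.normal(scale=1e-2, size=(d_model, n_experts)),
    }
```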

Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts

1 code implementation 6 Jun 2022 Basil Mustafa, Carlos Riquelme, Joan Puigcerver, Rodolphe Jenatton, Neil Houlsby

MoEs are a natural fit for a multimodal backbone, since expert layers can learn an appropriate partitioning of modalities.

Tasks: Contrastive Learning
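The "expert layers can learn a partitioning of modalities" claim rests on sparse routing: a learned router sends each token to one expert, so image and text tokens can end up specializing different experts. A minimal top-1 router sketch (LIMoE's real router also uses capacity limits and auxiliary entropy losses, which are omitted here):

```python
import numpy as np

def top1_route(tokens, router_w, experts):
    """Sparse top-1 routing: each token is processed by the single expert
    with the highest router score, scaled by that expert's gate probability."""
    logits = tokens @ router_w                           # (n_tokens, n_experts)
    gates = np.exp(logits - logits.max(axis=1, keepdims=True))
    gates = gates / gates.sum(axis=1, keepdims=True)     # softmax gate probs
    choice = gates.argmax(axis=1)                        # hard expert pick
    out = np.empty_like(tokens)
    for i, (tok, e) in enumerate(zip(tokens, choice)):
        out[i] = gates[i, e] * experts[e](tok)
    return out, choice
```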

LiT: Zero-Shot Transfer with Locked-image text Tuning

4 code implementations CVPR 2022 Xiaohua Zhai, Xiao Wang, Basil Mustafa, Andreas Steiner, Daniel Keysers, Alexander Kolesnikov, Lucas Beyer

This paper presents contrastive-tuning, a simple method employing contrastive training to align image and text models while still taking advantage of their pre-training.

Tasks: Image Classification, Retrieval, +2
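The contrastive-tuning objective is the standard CLIP-style symmetric loss; LiT's twist is which tower it trains. A NumPy sketch of the loss (the function name and temperature value are illustrative assumptions):

```python
import numpy as np

def symmetric_contrastive_loss(img_emb, txt_emb, temperature=0.1):
    """CLIP-style loss: L2-normalize both towers' embeddings, then apply
    cross-entropy in both directions over the similarity matrix, with
    matching image-text pairs on the diagonal. In LiT the image tower is
    locked (frozen), so gradients would flow only into the text tower."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature
    n = logits.shape[0]

    def xent(l):  # cross-entropy with the diagonal as the correct class
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[np.arange(n), np.arange(n)].mean()

    return 0.5 * (xent(logits) + xent(logits.T))
```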

Sparse MoEs meet Efficient Ensembles

1 code implementation 7 Oct 2021 James Urquhart Allingham, Florian Wenzel, Zelda E Mariet, Basil Mustafa, Joan Puigcerver, Neil Houlsby, Ghassen Jerfel, Vincent Fortuin, Balaji Lakshminarayanan, Jasper Snoek, Dustin Tran, Carlos Riquelme Ruiz, Rodolphe Jenatton

Machine learning models based on the aggregated outputs of submodels, either at the activation or prediction levels, often exhibit strong performance compared to individual models.

Tasks: Few-Shot Learning

Big Self-Supervised Models Advance Medical Image Classification

1 code implementation ICCV 2021 Shekoofeh Azizi, Basil Mustafa, Fiona Ryan, Zachary Beaver, Jan Freyberg, Jonathan Deaton, Aaron Loh, Alan Karthikesalingam, Simon Kornblith, Ting Chen, Vivek Natarajan, Mohammad Norouzi

Self-supervised pretraining followed by supervised fine-tuning has seen success in image recognition, especially when labeled examples are scarce, but has received limited attention in medical image analysis.

Tasks: Contrastive Learning, General Classification, +3

A Simple Probabilistic Method for Deep Classification under Input-Dependent Label Noise

no code implementations 15 Mar 2020 Mark Collier, Basil Mustafa, Efi Kokiopoulou, Rodolphe Jenatton, Jesse Berent

By tuning the softmax temperature, we improve accuracy, log-likelihood, and calibration both on image classification benchmarks with controlled label noise and on ImageNet-21k, which has naturally occurring label noise.

Tasks: General Classification, Image Classification, +2
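The temperature knob mentioned in the abstract is a one-line change to the softmax, sketched below; a temperature above 1 flattens the predicted distribution, which is what helps calibration under label noise, while a temperature below 1 sharpens it.

```python
import numpy as np

def tempered_softmax(logits, temperature):
    """Softmax with a temperature parameter: T > 1 flattens the
    distribution, T < 1 sharpens it toward the argmax."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # stable softmax
    p = np.exp(z)
    return p / p.sum(axis=-1, keepdims=True)

def entropy(p):
    """Shannon entropy, a simple flatness measure for the sketch."""
    return -(p * np.log(p)).sum()
```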

Learning to Segment Medical Images with Scribble-Supervision Alone

no code implementations 12 Jul 2018 Yigit B. Can, Krishna Chaitanya, Basil Mustafa, Lisa M. Koch, Ender Konukoglu, Christian F. Baumgartner

We find that the networks trained on scribbles suffer from a remarkably small degradation in Dice of only 2.9% (cardiac) and 4.5% (prostate) with respect to a network trained on full annotations.

Tasks: Anatomy, Image Segmentation, +2
