OCD: Learning to Overfit with Conditional Diffusion Models

2 Oct 2022 · Shahar Lutati, Lior Wolf

We present a dynamic model in which the weights are conditioned on an input sample x and are learned to match those that would be obtained by finetuning a base model on x and its label y. This mapping between an input sample and network weights is approximated by a denoising diffusion model. The diffusion model we employ focuses on modifying a single layer of the base model and is conditioned on the input, activations, and output of this layer. Since the diffusion model is stochastic in nature, multiple initializations generate different networks, forming an ensemble, which leads to further improvements. Our experiments demonstrate the wide applicability of the method for image classification, 3D reconstruction, tabular data, speech separation, and natural language processing. Our code is available at https://github.com/ShaharLutatiPersonal/OCD
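The core idea above can be sketched in a few lines: a conditional denoising diffusion model samples the weights of one layer, starting from pure noise and conditioned on per-sample information, and repeated sampling yields an ensemble. The sketch below is a toy illustration, not the paper's implementation: the denoiser `eps_theta` is an untrained random map standing in for the trained network, the dimensions are made up, and a standard DDPM reverse-sampling loop is assumed.

```python
import numpy as np

# Toy dimensions (hypothetical; OCD adapts one layer of a real base model).
D_W = 8   # flattened size of the adapted layer's weights
D_C = 4   # conditioning vector (layer input, activations, output in the paper)
T = 50    # number of diffusion steps

# Standard DDPM linear noise schedule.
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

# Stand-in for the trained conditional denoiser. In OCD this network is
# trained so that sampling yields weights close to those obtained by
# finetuning the layer on (x, y); here it is a fixed random linear map
# followed by tanh, purely to make the loop runnable.
_A = np.random.default_rng(1).normal(size=(D_W, D_W + D_C + 1)) * 0.1

def eps_theta(w_t, t, cond):
    z = np.concatenate([w_t, cond, [t / T]])
    return np.tanh(_A @ z)

def sample_weights(cond, rng):
    """DDPM reverse process: sample layer weights conditioned on `cond`."""
    w = rng.normal(size=D_W)  # start from pure noise
    for t in range(T - 1, -1, -1):
        eps = eps_theta(w, t, cond)
        mean = (w - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        w = mean + (np.sqrt(betas[t]) * rng.normal(size=D_W) if t > 0 else 0.0)
    return w

def ensemble_predict(x, cond, k=5, seed=0):
    """Sample k weight vectors (stochasticity gives different networks)
    and average the resulting layer outputs -- the OCD(k) ensemble idea."""
    rng = np.random.default_rng(seed)
    return np.mean([sample_weights(cond, rng) @ x for _ in range(k)])
```

The ensemble in the table's "OCD(5)" rows corresponds to `k=5` here: each diffusion sample produces a different set of layer weights, and their predictions are averaged.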


Results from the Paper


| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Few-Shot Text Classification | Amazon Counterfactual | SetFit + OCD | Accuracy | 0.41 | #1 |
| Few-Shot Text Classification | Average on NLP datasets | SetFit + OCD(5) | Accuracy | 0.648 | #1 |
| Few-Shot Text Classification | Average on NLP datasets | T-few 3B | Accuracy | 0.633 | #3 |
| Few-Shot Text Classification | Average on NLP datasets | SetFit | Accuracy | 0.622 | #4 |
| Few-Shot Text Classification | Average on NLP datasets | SetFit + OCD | Accuracy | 0.643 | #2 |
| Speech Separation | Libri5Mix | OCD | SI-SDRi | 13.4 dB | #3 |
| Few-Shot Text Classification | SST-5 | SetFit + OCD | Accuracy | 0.478 | #1 |
| Image Classification | Tiny ImageNet | DeiT-B/16-D + OCD | Validation Acc | 90.8% | #6 |
| Image Classification | Tiny ImageNet | DeiT-B/16-D + OCD(5) | Validation Acc | 92.0% | #2 |

Methods


BASE · Diffusion · OCD