1 code implementation • 22 Nov 2023 • Francesco Corti, Balz Maag, Joachim Schauer, Ulrich Pferschy, Olga Saukh
REDS supports conventional deep networks frequently deployed on the edge and provides computational benefits even for small and simple networks.
no code implementations • 18 Nov 2023 • Nam Cao, Olga Saukh
We leverage the domain knowledge that geometric features are highly important for accurate pollen identification and introduce two novel geometric image augmentation techniques that significantly narrow the accuracy gap between model performance on the training and test datasets.
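The two novel techniques are not spelled out in this snippet, so the following is only a minimal sketch of the general idea, assuming standard geometric transforms from torchvision (rotations, flips, and mild scale jitter suit pollen grains, which have no canonical orientation):

```python
# Hedged sketch: standard geometric augmentations as an illustration;
# the paper's two novel techniques are not reproduced here.
import torch
from torchvision import transforms

geometric_augment = transforms.Compose([
    transforms.RandomRotation(degrees=180),       # any orientation is equally valid
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomVerticalFlip(p=0.5),
    transforms.RandomAffine(degrees=0, scale=(0.9, 1.1)),  # mild scale jitter
])

img = torch.rand(3, 224, 224)      # dummy image tensor
augmented = geometric_augment(img)
```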
1 code implementation • 22 May 2023 • Olga Saukh, Dong Wang, Xiaoxi He, Lothar Thiele
The obtained subspace is low-dimensional and has a surprisingly simple structure even for complex, non-invertible transformations of the input, which makes subspace-configurable networks (SCNs) exceptionally efficient when storage and computing resources are limited.
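As a rough illustration of the construction this describes, the sketch below blends D base weight sets with coefficients produced from the transformation parameter by a small configuration network; the layer sizes, the value of D, and the softmax mixing are illustrative assumptions, not the paper's exact design:

```python
import torch
import torch.nn as nn

class SCNLinear(nn.Module):
    """Linear layer whose weights live in a D-dimensional subspace."""
    def __init__(self, in_f, out_f, D):
        super().__init__()
        self.weight = nn.Parameter(0.01 * torch.randn(D, out_f, in_f))
        self.bias = nn.Parameter(torch.zeros(D, out_f))

    def forward(self, x, alpha):                      # alpha: (D,) mixing coefficients
        W = torch.einsum('d,doi->oi', alpha, self.weight)
        b = torch.einsum('d,do->o', alpha, self.bias)
        return x @ W.T + b

class SCN(nn.Module):
    def __init__(self, in_f=784, hidden=128, classes=10, D=4):
        super().__init__()
        # Small configuration network: transformation parameter t -> alpha.
        self.config = nn.Sequential(nn.Linear(1, 32), nn.ReLU(),
                                    nn.Linear(32, D), nn.Softmax(dim=-1))
        self.l1 = SCNLinear(in_f, hidden, D)
        self.l2 = SCNLinear(hidden, classes, D)

    def forward(self, x, t):                          # t: transformation parameter, shape (1,)
        alpha = self.config(t)
        return self.l2(torch.relu(self.l1(x, alpha)), alpha)

model = SCN()
logits = model(torch.rand(8, 784), torch.tensor([0.5]))  # e.g., t = rotation angle / pi
```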
1 code implementation • NeurIPS 2023 • Samir Yitzhak Gadre, Gabriel Ilharco, Alex Fang, Jonathan Hayase, Georgios Smyrnis, Thao Nguyen, Ryan Marten, Mitchell Wortsman, Dhruba Ghosh, Jieyu Zhang, Eyal Orgad, Rahim Entezari, Giannis Daras, Sarah Pratt, Vivek Ramanujan, Yonatan Bitton, Kalyani Marathe, Stephen Mussmann, Richard Vencu, Mehdi Cherti, Ranjay Krishna, Pang Wei Koh, Olga Saukh, Alexander Ratner, Shuran Song, Hannaneh Hajishirzi, Ali Farhadi, Romain Beaumont, Sewoong Oh, Alex Dimakis, Jenia Jitsev, Yair Carmon, Vaishaal Shankar, Ludwig Schmidt
Multimodal datasets are a critical component in recent breakthroughs such as Stable Diffusion and GPT-4, yet their design does not receive the same research attention as model architectures or training algorithms.
2 code implementations • 27 Feb 2023 • Rahim Entezari, Mitchell Wortsman, Olga Saukh, M. Moein Shariatnia, Hanie Sedghi, Ludwig Schmidt
We investigate the impact of pre-training data distribution on the few-shot and full fine-tuning performance using 3 pre-training methods (supervised, contrastive language-image and image-image), 7 pre-training datasets, and 9 downstream datasets.
1 code implementation • 15 Nov 2022 • Keller Jordan, Hanie Sedghi, Olga Saukh, Rahim Entezari, Behnam Neyshabur
In this paper, we examine the conjecture of Entezari et al. (2021), which states that if the permutation invariance of neural networks is taken into account, there is likely no loss barrier along the linear interpolation between SGD solutions.
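A minimal sketch of the measurement behind this claim, assuming `model_a` and `model_b` are two trained networks of identical architecture (with `model_b` already permutation-aligned to `model_a`) and `loader`/`loss_fn` are placeholders:

```python
import copy
import torch

def interpolation_losses(model_a, model_b, loader, loss_fn, steps=11):
    """Loss along the linear path (1-lam)*theta_a + lam*theta_b."""
    sd_a, sd_b = model_a.state_dict(), model_b.state_dict()
    losses = []
    for lam in torch.linspace(0.0, 1.0, steps):
        merged = copy.deepcopy(model_a)
        # Interpolate float tensors only; integer buffers (e.g., BatchNorm
        # counters) are copied from model_a unchanged.
        merged.load_state_dict({k: (1 - lam) * sd_a[k] + lam * sd_b[k]
                                   if sd_a[k].is_floating_point() else sd_a[k]
                                for k in sd_a})
        merged.eval()
        total, n = 0.0, 0
        with torch.no_grad():
            for x, y in loader:
                total += loss_fn(merged(x), y).item() * len(y)
                n += len(y)
        losses.append(total / n)
    # Barrier: worst loss on the path minus the mean of the endpoint losses.
    barrier = max(losses) - 0.5 * (losses[0] + losses[-1])
    return losses, barrier
```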
1 code implementation • 1 Jul 2022 • Francesco Corti, Rahim Entezari, Sara Hooker, Davide Bacciu, Olga Saukh
We study the impact of different pruning techniques on the representation learned by deep neural networks trained with contrastive loss functions.
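For context, here is a minimal sketch of one standard pruning baseline such a study would include, global magnitude (L1) pruning via torch.nn.utils.prune; the 90% sparsity level is an arbitrary example:

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

def global_magnitude_prune(model, sparsity=0.9):
    """Zero out the `sparsity` fraction of smallest-magnitude weights globally."""
    params = [(m, 'weight') for m in model.modules()
              if isinstance(m, (nn.Linear, nn.Conv2d))]
    prune.global_unstructured(params, pruning_method=prune.L1Unstructured,
                              amount=sparsity)
    return model
```

The representations of pruned and dense models can then be compared, for instance with linear probes on the frozen features.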
no code implementations • 22 Jun 2022 • Lukas Timpl, Rahim Entezari, Hanie Sedghi, Behnam Neyshabur, Olga Saukh
This paper examines the impact of static sparsity on the robustness of a trained network to weight perturbations, data corruption, and adversarial examples.
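One such robustness probe can be sketched in a few lines; `evaluate` and `model` are placeholders, and perturbing only the nonzero weights is an assumption about how a static sparsity mask would be respected:

```python
import copy
import torch

@torch.no_grad()
def accuracy_under_weight_noise(model, evaluate, sigma=0.01):
    """Evaluate a copy of `model` with Gaussian noise added to its weights."""
    noisy = copy.deepcopy(model)
    for p in noisy.parameters():
        # Perturb only surviving (nonzero) weights so a static sparsity
        # mask stays intact.
        p.add_(sigma * torch.randn_like(p) * (p != 0))
    return evaluate(noisy)
```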
no code implementations • 15 Jun 2022 • Amirreza Mahbod, Rahim Entezari, Isabella Ellinger, Olga Saukh
We investigate the impact of weight pruning on the performance of both branches separately and on the final nuclei instance segmentation result.
1 code implementation • ICLR 2022 • Rahim Entezari, Hanie Sedghi, Olga Saukh, Behnam Neyshabur
In this paper, we conjecture that if the permutation invariance of neural networks is taken into account, SGD solutions will likely exhibit no loss barrier along the linear interpolation between them.
1 code implementation • 19 Nov 2020 • Johanna Einsiedler, Yun Cheng, Franz Papst, Olga Saukh
In this work, we estimate the pollution reduction over the lockdown period by using measurements from ground air pollution monitoring stations, training a long-term prediction model, and comparing its predictions to the values measured over the lockdown month. We show that our models achieve state-of-the-art performance on data from air pollution measurement stations in Switzerland and in China, and we estimate a -15.8% / +34.4% change in NO2 / PM10 in Zurich, as well as -35.3% / -3.5% and -42.4% / -34.7% changes in NO2 / PM2.5 in Beijing and Wuhan, respectively.
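The reported percentages are relative changes, presumably between the measured lockdown concentrations and the model's counterfactual prediction; the arithmetic, with placeholder numbers, is simply:

```python
import numpy as np

measured = np.array([18.0, 22.5, 15.3])    # placeholder: daily NO2 during lockdown
predicted = np.array([24.1, 25.0, 21.7])   # placeholder: counterfactual prediction

change_pct = 100.0 * (measured.mean() - predicted.mean()) / predicted.mean()
print(f"Estimated change: {change_pct:+.1f}%")   # negative => pollution reduction
```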
1 code implementation • 23 Sep 2019 • Rahim Entezari, Olga Saukh
Motivated by the success of the lottery ticket hypothesis, in this paper we propose an iterative deep model compression technique that keeps the number of false negatives of the compressed model close to that of the original model, at the price of increasing the number of false positives if necessary.
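A minimal sketch of this loop, assuming magnitude pruning as the compression step; `prune_step`, `finetune`, and `count_fn` are placeholder callables, and the 5% false-negative slack is an illustrative choice:

```python
import copy

def compress_with_fn_budget(model, prune_step, finetune, count_fn, loader,
                            fn_slack=1.05, max_rounds=20):
    """Iteratively compress while false negatives stay near the original's."""
    fn_budget = fn_slack * count_fn(model, loader)   # tolerated false negatives
    best = copy.deepcopy(model)
    for _ in range(max_rounds):
        candidate = prune_step(copy.deepcopy(best))  # e.g., drop smallest weights
        finetune(candidate)
        if count_fn(candidate, loader) > fn_budget:  # constraint violated: stop
            break
        best = candidate                             # false positives may grow instead
    return best
```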