no code implementations • 23 Feb 2024 • Maximilien Dreveton, Alperen Gözeten, Matthias Grossglauser, Patrick Thiran
In such mixtures, we establish that Bregman hard clustering, a variant of Lloyd's algorithm employing a Bregman divergence, is rate optimal.
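As a rough illustration of the Bregman hard-clustering scheme this entry refers to (a generic sketch, not the paper's implementation), the snippet below runs Lloyd-style alternation with the KL divergence playing the role of the Bregman divergence; the centroid update stays the plain mean, since the mean minimizes the expected Bregman divergence to the cluster's points. Function names and the deterministic initialization are illustrative assumptions.

```python
import numpy as np

def kl_divergence(x, y):
    # Bregman divergence generated by negative entropy (generalized KL);
    # assumes strictly positive inputs.
    return np.sum(x * np.log(x / y) - x + y, axis=-1)

def bregman_hard_clustering(X, k, divergence=kl_divergence, n_iters=50):
    # Simple deterministic init: pick k points spread across the dataset.
    centroids = X[np.linspace(0, len(X) - 1, k).astype(int)]
    for _ in range(n_iters):
        # Assign each point to the nearest centroid under the divergence.
        d = np.stack([divergence(X, c) for c in centroids], axis=1)
        labels = d.argmin(axis=1)
        # Recompute centroids as cluster means (optimal for any Bregman
        # divergence); keep the old centroid if a cluster is empty.
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids
```

The key property used here is that, for every Bregman divergence, the arithmetic mean is the unique minimizer of the expected divergence to the points, so only the assignment step changes relative to standard Lloyd's algorithm.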
1 code implementation • 20 Jul 2023 • Mahsa Forouzesh, Patrick Thiran
We study various data-partitioning methods in the presence of label noise and observe that using the proposed metric to filter noisy samples out of the hard samples yields the best datasets, as evidenced by the high test accuracy of models trained on the filtered datasets.
no code implementations • 1 Jun 2023 • Maximilien Dreveton, Daichi Kuroda, Matthias Grossglauser, Patrick Thiran
We also establish that this bottom-up algorithm attains the information-theoretic threshold for exact recovery at intermediate levels of the hierarchy.
1 code implementation • 31 May 2023 • Anthony Bardou, Patrick Thiran, Thomas Begin
Bayesian Optimization (BO) is typically used to optimize an unknown function $f$ that is noisy and costly to evaluate, by exploiting an acquisition function that must be maximized at each optimization step.
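The BO loop described here can be sketched minimally (a generic illustration, not the authors' method): fit a Gaussian-process surrogate to the evaluations gathered so far, then pick the next query by maximizing an acquisition function, here expected improvement over a candidate grid. The kernel, length-scale, and all names are illustrative assumptions.

```python
import numpy as np
from math import erf, sqrt, pi

def rbf(a, b, ls=0.1):
    # Squared-exponential kernel on 1-D inputs.
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def gp_posterior(X, y, Xs, noise=1e-6):
    # Exact GP posterior mean and standard deviation at test points Xs.
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    alpha = np.linalg.solve(K, y)
    mu = Ks.T @ alpha
    var = np.clip(np.diag(rbf(Xs, Xs) - Ks.T @ np.linalg.solve(K, Ks)),
                  1e-12, None)
    return mu, np.sqrt(var)

def expected_improvement(mu, sigma, f_best):
    # EI acquisition for minimization.
    z = (f_best - mu) / sigma
    Phi = 0.5 * (1 + np.vectorize(erf)(z / sqrt(2)))
    phi = np.exp(-0.5 * z ** 2) / sqrt(2 * pi)
    return (f_best - mu) * Phi + sigma * phi

def bayes_opt(f, n_init=3, n_iters=15, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(0, 1, n_init)
    y = np.array([f(x) for x in X])
    grid = np.linspace(0, 1, 200)  # acquisition maximized over a fixed grid
    for _ in range(n_iters):
        mu, sigma = gp_posterior(X, y, grid)
        x_next = grid[np.argmax(expected_improvement(mu, sigma, y.min()))]
        X = np.append(X, x_next)
        y = np.append(y, f(x_next))
    return X[np.argmin(y)], y.min()
```

Maximizing the acquisition over a dense grid is the crudest option; in practice this inner maximization is itself a nontrivial optimization problem, which is the aspect the entry's setting highlights.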
1 code implementation • 8 Dec 2022 • Mahsa Forouzesh, Hanie Sedghi, Patrick Thiran
We empirically show the effectiveness of our metric in tracking memorization on various architectures and datasets and provide theoretical insights into the design of the susceptibility metric.
no code implementations • 25 May 2022 • Saeed Masiha, Saber Salehkaleybar, Niao He, Negar Kiyavash, Patrick Thiran
We prove that the total sample complexity of SCRN for achieving an $\epsilon$-global optimum is $\mathcal{O}(\epsilon^{-7/(2\alpha)+1})$ for $1\le\alpha< 3/2$ and $\tilde{\mathcal{O}}(\epsilon^{-2/\alpha})$ for $3/2\le\alpha\le 2$.
no code implementations • 17 May 2022 • Saber Salehkaleybar, Sadegh Khorasani, Negar Kiyavash, Niao He, Patrick Thiran
The SHARP algorithm is parameter-free: it reaches an $\epsilon$-approximate first-order stationary point with $O(\epsilon^{-3})$ trajectories, while using a batch size of $O(1)$ at each iteration.
1 code implementation • 14 Jul 2021 • Mahsa Forouzesh, Patrick Thiran
We propose a metric for evaluating the generalization ability of deep neural networks trained with mini-batch gradient descent.
no code implementations • 1 Jan 2021 • Mahsa Forouzesh, Patrick Thiran
Validation-based early-stopping methods are one of the most popular techniques used to avoid over-training deep neural networks.
1 code implementation • 30 Jul 2020 • Mahsa Forouzesh, Farnood Salehi, Patrick Thiran
We find a rather strong empirical relation between the output sensitivity and the variance in the bias-variance decomposition of the loss function, which hints at using sensitivity as a metric for comparing the generalization performance of networks without requiring labeled data.
no code implementations • 26 Nov 2019 • Victor Kristof, Valentin Quelquejay-Leclère, Robin Zbinden, Lucas Maystre, Matthias Grossglauser, Patrick Thiran
We propose a statistical model to understand people's perception of their carbon footprint.
1 code implementation • NeurIPS 2019 • Farnood Salehi, William Trouleau, Matthias Grossglauser, Patrick Thiran
It is also able to take into account the uncertainty in the model parameters by learning a posterior distribution over them.
no code implementations • 25 Sep 2019 • Mahsa Forouzesh, Farnood Salehi, Patrick Thiran
We find a rather strong empirical relation between the output sensitivity and the variance in the bias-variance decomposition of the loss function, which hints at using sensitivity as a metric for comparing the generalization performance of networks without requiring labeled data.
no code implementations • NeurIPS 2018 • Farnood Salehi, Patrick Thiran, L. Elisa Celis
Ideally, we would update the decision variable that yields the largest decrease in the cost function.
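That greedy rule is known in coordinate descent as the Gauss-Southwell rule. A minimal sketch on a quadratic objective (an illustrative stand-in, not the paper's algorithm): at every step, the coordinate whose gradient entry is largest in magnitude is updated by exact line minimization.

```python
import numpy as np

def greedy_coordinate_descent(A, b, n_iters=500, tol=1e-10):
    # Minimize f(x) = 0.5 x^T A x - b^T x for symmetric positive definite A
    # by repeatedly updating the single coordinate promising the largest
    # decrease (the Gauss-Southwell greedy rule).
    x = np.zeros(len(b))
    for _ in range(n_iters):
        g = A @ x - b                   # gradient of the quadratic
        i = int(np.argmax(np.abs(g)))   # greedy coordinate choice
        if abs(g[i]) < tol:
            break
        x[i] -= g[i] / A[i, i]          # exact minimization along e_i
    return x
```

For this quadratic the minimizer solves $Ax = b$, so the sketch can be checked directly against a linear solve. The practical catch motivating randomized alternatives is that evaluating all coordinate-wise decreases to pick the best one can cost as much as a full gradient step.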
no code implementations • 8 Aug 2017 • Farnood Salehi, L. Elisa Celis, Patrick Thiran
This approach to sampling datapoints is general and can be used in conjunction with any algorithm that relies on an unbiased gradient estimate; we expect it to have broad applicability beyond the specific examples explored in this work.
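The reweighting that keeps non-uniform datapoint sampling unbiased can be sketched as follows (a generic illustration under assumed names, not the paper's sampling scheme): drawing datapoint $i$ with probability $p_i$ and scaling its gradient by $1/(n p_i)$ leaves the estimator's expectation equal to the full average gradient, whatever the sampling distribution.

```python
import numpy as np

def importance_sampled_gradient(per_sample_grad, probs, rng):
    # Draw one datapoint i with probability probs[i] and reweight its
    # gradient by 1 / (n * probs[i]). Over the draw of i, the expectation
    # of this estimate equals the average of all per-sample gradients,
    # for any sampling distribution with full support.
    n = len(probs)
    i = rng.choice(n, p=probs)
    return per_sample_grad(i) / (n * probs[i])
```

Unbiasedness follows from one line of algebra: $\sum_i p_i \cdot g_i/(n p_i) = \frac{1}{n}\sum_i g_i$, so any SGD-type method consuming unbiased gradient estimates can consume these instead.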
no code implementations • ICML 2017 • Pedram Pad, Farnood Salehi, Elisa Celis, Patrick Thiran, Michael Unser
We propose a new statistical dictionary learning algorithm for sparse signals that is based on an $\alpha$-stable innovation model.
no code implementations • 9 Dec 2015 • Farid M. Naini, Jayakrishnan Unnikrishnan, Patrick Thiran, Martin Vetterli
The accuracy obtained under this optimal method can thus be used to quantify the maximum level of user identification that is possible in such settings.
1 code implementation • 12 Dec 2012 • Mohamed Kafsi, Matthias Grossglauser, Patrick Thiran
To quantify the randomness of Markov trajectories with fixed initial and final states, Ekroot and Cover proposed a closed-form expression for the entropy of trajectories of an irreducible finite state Markov chain.
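The same trajectory entropies can be computed numerically without the closed form (this sketch is a first-step recursion, not Ekroot and Cover's expression itself): the entropy $H_{ij}$ of a trajectory started at $i$ and stopped on first arrival at $j$ satisfies $H_{ij} = h_i + \sum_{k \ne j} P_{ik} H_{kj}$, where $h_i$ is the entropy of row $i$ of the transition matrix, giving one linear system per target state.

```python
import numpy as np

def trajectory_entropies_to(P, j):
    # Entropies H_ij (in bits) of random trajectories of the chain P,
    # started at each state i and stopped on first arrival at state j.
    # Solves the first-step recursion H_.j = h + Q H_.j, where Q is P
    # with column j zeroed out (trajectories stop on reaching j).
    with np.errstate(divide="ignore", invalid="ignore"):
        h = -np.sum(np.where(P > 0, P * np.log2(P), 0.0), axis=1)
    Q = P.copy()
    Q[:, j] = 0.0
    return np.linalg.solve(np.eye(len(P)) - Q, h)
```

For a two-state chain leaving state 0 with probability $p$, the trajectory from 0 to 1 is determined by its geometric length, so $H_{01} = h_b(p)/p$ with $h_b$ the binary entropy, which the solver reproduces.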
Information Theory • Applications