Search Results for author: Patrick Thiran

Found 18 papers, 7 papers with code

Universal Lower Bounds and Optimal Rates: Achieving Minimax Clustering Error in Sub-Exponential Mixture Models

no code implementations23 Feb 2024 Maximilien Dreveton, Alperen Gözeten, Matthias Grossglauser, Patrick Thiran

In such mixtures, we establish that Bregman hard clustering, a variant of Lloyd's algorithm employing a Bregman divergence, is rate optimal.

Clustering
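
The snippet above refers to Bregman hard clustering, a Lloyd-style alternation between assigning each point to the centroid with the smallest Bregman divergence and recomputing centroids as cluster means. The sketch below is a minimal illustration under assumed names, using the squared-Euclidean divergence (which recovers ordinary k-means); in the paper the divergence is matched to the sub-exponential mixture family, which is not modeled here.

```python
import numpy as np

def squared_euclidean(x, c):
    # Bregman divergence generated by phi(x) = ||x||^2; recovers k-means.
    return np.sum((x - c) ** 2, axis=-1)

def bregman_hard_clustering(X, k, divergence=squared_euclidean, n_iter=100, seed=0):
    """Lloyd-style alternation: assign each point to the closest centroid in
    Bregman divergence, then recompute centroids as cluster means (the Bregman
    centroid is the arithmetic mean for any Bregman divergence)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: n x k matrix of divergences to each centroid.
        d = np.stack([divergence(X, c) for c in centroids], axis=1)
        labels = d.argmin(axis=1)
        # Update step: arithmetic mean of each non-empty cluster.
        new_centroids = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                  else centroids[j] for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids
```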

Differences Between Hard and Noisy-labeled Samples: An Empirical Study

1 code implementation20 Jul 2023 Mahsa Forouzesh, Patrick Thiran

We study various data partitioning methods in the presence of label noise and observe that filtering noisy samples out of the hard samples with the proposed metric yields the best datasets, as evidenced by the high test accuracy achieved by models trained on the filtered datasets.

When Does Bottom-up Beat Top-down in Hierarchical Community Detection?

no code implementations1 Jun 2023 Maximilien Dreveton, Daichi Kuroda, Matthias Grossglauser, Patrick Thiran

We also establish that this bottom-up algorithm attains the information-theoretic threshold for exact recovery at intermediate levels of the hierarchy.

Clustering Community Detection +1

Relaxing the Additivity Constraints in Decentralized No-Regret High-Dimensional Bayesian Optimization

1 code implementation31 May 2023 Anthony Bardou, Patrick Thiran, Thomas Begin

Bayesian Optimization (BO) is typically used to optimize an unknown function $f$ that is noisy and costly to evaluate, by exploiting an acquisition function that must be maximized at each optimization step.

Bayesian Optimization
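
As the snippet notes, each BO step maximizes an acquisition function built from a surrogate of $f$. Below is a minimal, generic sketch of that loop with a Gaussian-process surrogate and a UCB acquisition maximized over random candidates; it illustrates the standard setup only, not the decentralized, additivity-relaxing method of the paper, and the toy objective, kernel, and candidate-search strategy are all assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def noisy_objective(x):           # assumed toy objective, treated as unknown by the optimizer
    return -np.sum((x - 0.3) ** 2) + 0.01 * np.random.randn()

def ucb(mu, sigma, beta=2.0):     # upper-confidence-bound acquisition
    return mu + beta * sigma

dim, n_init, n_steps = 2, 5, 30
X = np.random.rand(n_init, dim)                     # initial design
y = np.array([noisy_objective(x) for x in X])

for _ in range(n_steps):
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-4).fit(X, y)
    # Maximize the acquisition over random candidates (a stand-in for a proper inner optimizer).
    cand = np.random.rand(2048, dim)
    mu, sigma = gp.predict(cand, return_std=True)
    x_next = cand[np.argmax(ucb(mu, sigma))]
    X = np.vstack([X, x_next])
    y = np.append(y, noisy_objective(x_next))

print("best observed value:", y.max())
```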

Leveraging Unlabeled Data to Track Memorization

1 code implementation8 Dec 2022 Mahsa Forouzesh, Hanie Sedghi, Patrick Thiran

We empirically show the effectiveness of our metric in tracking memorization on various architectures and datasets and provide theoretical insights into the design of the susceptibility metric.

Memorization

Stochastic Second-Order Methods Improve Best-Known Sample Complexity of SGD for Gradient-Dominated Functions

no code implementations25 May 2022 Saeed Masiha, Saber Salehkaleybar, Niao He, Negar Kiyavash, Patrick Thiran

We prove that the total sample complexity of SCRN in achieving an $\epsilon$-global optimum is $\mathcal{O}(\epsilon^{-7/(2\alpha)+1})$ for $1\le\alpha< 3/2$ and $\tilde{\mathcal{O}}(\epsilon^{-2/\alpha})$ for $3/2\le\alpha\le 2$.

Policy Gradient Methods Reinforcement Learning (RL) +1
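
For context, the rates quoted above are stated for gradient-dominated functions of degree $\alpha$; the condition written below is the standard one ($\alpha = 2$ is the Polyak-Lojasiewicz case), with the constant $\tau_f$ and the exact assumptions left to the paper.

```latex
% Degree-\alpha gradient dominance (assumed standard form; \alpha = 2 recovers Polyak--Lojasiewicz):
\[
  f(x) - f^\ast \;\le\; \tau_f \,\|\nabla f(x)\|^{\alpha}
  \qquad \text{for all } x, \quad \alpha \in [1, 2].
\]
% Sample complexity of SCRN for reaching an \epsilon-global optimum, as quoted above:
\[
  \mathcal{O}\!\left(\epsilon^{-7/(2\alpha)+1}\right) \;\text{ for } 1 \le \alpha < \tfrac{3}{2},
  \qquad
  \tilde{\mathcal{O}}\!\left(\epsilon^{-2/\alpha}\right) \;\text{ for } \tfrac{3}{2} \le \alpha \le 2.
\]
```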

Momentum-Based Policy Gradient with Second-Order Information

no code implementations17 May 2022 Saber Salehkaleybar, Sadegh Khorasani, Negar Kiyavash, Niao He, Patrick Thiran

The SHARP algorithm is parameter-free, achieving an $\epsilon$-approximate first-order stationary point with $O(\epsilon^{-3})$ trajectories, while using a batch size of $O(1)$ at each iteration.

Policy Gradient Methods

Disparity Between Batches as a Signal for Early Stopping

1 code implementation14 Jul 2021 Mahsa Forouzesh, Patrick Thiran

We propose a metric for evaluating the generalization ability of deep neural networks trained with mini-batch gradient descent.
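
The metric in question compares how a network's gradients differ across mini-batches, with a large and growing disparity serving as an early-stopping signal. The PyTorch sketch below only illustrates measuring the distance between two mini-batch gradients under assumed model, loss, and data names; the paper's exact gradient-disparity formulation (which, among other things, averages over batch pairs) may differ.

```python
import torch

def flat_grad(model, loss):
    """Return the model's gradient for `loss` as a single flat vector."""
    grads = torch.autograd.grad(loss, [p for p in model.parameters() if p.requires_grad])
    return torch.cat([g.reshape(-1) for g in grads])

def batch_disparity(model, criterion, batch1, batch2):
    """Distance between the gradients computed on two mini-batches
    (an illustrative proxy for a gradient-disparity signal)."""
    x1, y1 = batch1
    x2, y2 = batch2
    g1 = flat_grad(model, criterion(model(x1), y1))
    g2 = flat_grad(model, criterion(model(x2), y2))
    return torch.norm(g1 - g2).item()

# Hypothetical usage inside a training loop: track the disparity over epochs
# and stop when it starts to increase persistently.
```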

Early Stopping by Gradient Disparity

no code implementations1 Jan 2021 Mahsa Forouzesh, Patrick Thiran

Validation-based early-stopping methods are one of the most popular techniques used to avoid over-training deep neural networks.

Generalization Comparison of Deep Neural Networks via Output Sensitivity

1 code implementation30 Jul 2020 Mahsa Forouzesh, Farnood Salehi, Patrick Thiran

We find a rather strong empirical relation between the output sensitivity and the variance in the bias-variance decomposition of the loss function, which hints at using sensitivity as a metric for comparing the generalization performance of networks, without requiring labeled data.
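
The output sensitivity in the snippet measures how much a network's output moves under small input perturbations, which requires no labels. Below is a minimal sketch of one such estimate (Gaussian input perturbations, averaged output displacement); the perturbation scale and the model and data names are assumptions, and the paper's precise definition may differ.

```python
import torch

@torch.no_grad()
def output_sensitivity(model, inputs, noise_std=0.01, n_draws=8):
    """Average L2 change of the model output under small Gaussian input
    perturbations; a label-free proxy for how sensitive the network is."""
    base = model(inputs)
    total = 0.0
    for _ in range(n_draws):
        noisy = inputs + noise_std * torch.randn_like(inputs)
        total += torch.norm(model(noisy) - base, dim=-1).mean().item()
    return total / n_draws
```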

Learning Hawkes Processes from a Handful of Events

1 code implementation NeurIPS 2019 Farnood Salehi, William Trouleau, Matthias Grossglauser, Patrick Thiran

It is also able to take into account the uncertainty in the model parameters by learning a posterior distribution over them.
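
For readers unfamiliar with Hawkes processes: each past event temporarily raises the intensity, i.e. the instantaneous event rate. The sketch below evaluates the standard exponential-kernel intensity; it only illustrates the model being learned, not the paper's Bayesian estimator that learns a posterior over the parameters, and the parameter values are placeholders.

```python
import numpy as np

def hawkes_intensity(t, events, mu=0.2, alpha=0.8, beta=1.5):
    """Conditional intensity of a univariate Hawkes process with an
    exponential excitation kernel:
        lambda(t) = mu + sum_{t_i < t} alpha * beta * exp(-beta * (t - t_i))
    mu is the baseline rate; each past event adds a decaying bump."""
    past = np.asarray([ti for ti in events if ti < t])
    return mu + np.sum(alpha * beta * np.exp(-beta * (t - past)))

# Example: the intensity shortly after a burst of events is elevated.
print(hawkes_intensity(2.1, events=[1.0, 1.8, 2.0]))
```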

On the Reflection of Sensitivity in the Generalization Error

no code implementations25 Sep 2019 Mahsa Forouzesh, Farnood Salehi, Patrick Thiran

We find a rather strong empirical relation between the output sensitivity and the variance in the bias-variance decomposition of the loss function, which hints on using sensitivity as a metric for comparing generalization performance of networks, without requiring labeled data.

Coordinate Descent with Bandit Sampling

no code implementations NeurIPS 2018 Farnood Salehi, Patrick Thiran, L. Elisa Celis

Ideally, we would update the decision variable that yields the largest decrease in the cost function.
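
The snippet describes the ideal but expensive rule of always updating the coordinate that decreases the cost the most, which the paper approximates with bandit sampling. The sketch below shows only that ideal greedy rule on a least-squares objective, as a point of reference; the objective and names are assumptions, and the bandit mechanism itself is not shown.

```python
import numpy as np

def greedy_coordinate_descent(A, b, n_iter=200):
    """Minimize f(w) = 0.5 * ||A w - b||^2 by, at each step, exactly updating
    the single coordinate whose update decreases f the most (the 'ideal' rule
    in the snippet; bandit sampling approximates this choice cheaply)."""
    n, d = A.shape
    w = np.zeros(d)
    col_sq = np.sum(A ** 2, axis=0)          # per-coordinate curvature ||a_j||^2
    residual = A @ w - b
    for _ in range(n_iter):
        grad = A.T @ residual                 # full gradient (expensive; for illustration only)
        decrease = grad ** 2 / (2 * col_sq)   # exact decrease from minimizing over coordinate j
        j = int(np.argmax(decrease))
        step = grad[j] / col_sq[j]
        w[j] -= step
        residual -= step * A[:, j]            # keep the residual consistent with w
    return w
```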

Stochastic Optimization with Bandit Sampling

no code implementations8 Aug 2017 Farnood Salehi, L. Elisa Celis, Patrick Thiran

This approach for sampling datapoints is general, and can be used in conjunction with any algorithm that uses an unbiased gradient estimate -- we expect it to have broad applicability beyond the specific examples explored in this work.

Stochastic Optimization
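
The key point in the snippet is that nonuniform datapoint sampling keeps the gradient estimate unbiased as long as each sampled gradient is reweighted by its inverse sampling probability. The sketch below shows that reweighting for SGD on a toy least-squares problem; the sampling distribution here is fixed, whereas the paper adapts it online with a bandit, and all names are assumptions.

```python
import numpy as np

def importance_sampled_sgd(A, b, probs, lr=0.01, n_iter=1000, seed=0):
    """SGD for f(w) = (1/n) * sum_i 0.5 * (a_i^T w - b_i)^2, where datapoint i
    is drawn with probability probs[i]. Reweighting by 1 / (n * probs[i])
    keeps the stochastic gradient unbiased for any sampling distribution."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        i = rng.choice(n, p=probs)
        grad_i = (A[i] @ w - b[i]) * A[i]      # gradient of the i-th term
        w -= lr * grad_i / (n * probs[i])      # inverse-probability weighting
    return w
```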

Dictionary Learning Based on Sparse Distribution Tomography

no code implementations ICML 2017 Pedram Pad, Farnood Salehi, Elisa Celis, Patrick Thiran, Michael Unser

We propose a new statistical dictionary learning algorithm for sparse signals that is based on an $\alpha$-stable innovation model.

Dictionary Learning Image Denoising

Where You Are Is Who You Are: User Identification by Matching Statistics

no code implementations9 Dec 2015 Farid M. Naini, Jayakrishnan Unnikrishnan, Patrick Thiran, Martin Vetterli

The accuracy obtained under this optimal method can thus be used to quantify the maximum level of user identification that is possible in such settings.
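
The snippet refers to matching users across two datasets by comparing their behavioral statistics, such as histograms of visited locations. As a simplified illustration, the sketch below scores each candidate pairing by the log-likelihood of one trace's location counts under a known user's empirical distribution and picks the best match per trace; the names are assumptions, and the paper's optimal matching and its accuracy analysis are not reproduced by this greedy rule.

```python
import numpy as np

def match_users(hist_a, counts_b, eps=1e-9):
    """hist_a[u]  : empirical distribution of known user u over locations (rows sum to 1).
    counts_b[v]   : raw location counts of unidentified trace v.
    Scores pairing (v, u) by the log-likelihood of trace v under user u's
    distribution and returns, for each trace, the index of the best-matching user."""
    log_p = np.log(np.asarray(hist_a) + eps)       # users x locations
    scores = np.asarray(counts_b) @ log_p.T        # traces x users log-likelihoods
    return scores.argmax(axis=1)
```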

The Entropy of Conditional Markov Trajectories

1 code implementation12 Dec 2012 Mohamed Kafsi, Matthias Grossglauser, Patrick Thiran

To quantify the randomness of Markov trajectories with fixed initial and final states, Ekroot and Cover proposed a closed-form expression for the entropy of trajectories of an irreducible finite state Markov chain.

Information Theory Applications
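
A Markov trajectory from state $i$ to state $j$ is a path that starts at $i$ and stops the first time it reaches $j$; its entropy is $-\mathbb{E}[\log P(\text{trajectory})]$, which the paper (following Ekroot and Cover) expresses in closed form. As a self-contained reference point, the sketch below estimates that entropy by plain Monte Carlo; it is not the closed-form computation, and the example chain is an arbitrary placeholder.

```python
import numpy as np

def trajectory_entropy_mc(P, i, j, n_samples=20000, seed=0):
    """Monte Carlo estimate of the entropy (in nats) of Markov trajectories
    that start in state i and stop upon first reaching state j:
    the average of -log P(trajectory) over sampled trajectories."""
    rng = np.random.default_rng(seed)
    n = P.shape[0]
    total = 0.0
    for _ in range(n_samples):
        state, log_prob = i, 0.0
        while True:
            nxt = rng.choice(n, p=P[state])
            log_prob += np.log(P[state, nxt])
            state = nxt
            if state == j:
                break
        total += -log_prob
    return total / n_samples

# Placeholder 3-state chain; any irreducible chain will do.
P = np.array([[0.1, 0.6, 0.3],
              [0.4, 0.2, 0.4],
              [0.5, 0.3, 0.2]])
print(trajectory_entropy_mc(P, i=0, j=2))
```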
