no code implementations • 13 Mar 2025 • Hai-Vy Nguyen, Fabrice Gamboa, Sixin Zhang, Reda Chhaibi, Serge Gratton, Thierry Giaccone
In this paper, we introduce a novel spatial attention module that can be integrated into any convolutional network.
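As a rough illustration of the general idea (not necessarily the module proposed in this paper), a CBAM-style spatial attention block reweights each spatial location of a feature map by a mask computed from channel-pooling statistics; the layer shapes and kernel size below are assumptions.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """CBAM-style spatial attention sketch: weight each spatial location
    of a feature map by a mask derived from channel-wise pooling."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):                    # x: (B, C, H, W)
        avg = x.mean(dim=1, keepdim=True)    # average over channels
        mx, _ = x.max(dim=1, keepdim=True)   # max over channels
        mask = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * mask                      # attention-reweighted features
```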
no code implementations • 28 May 2024 • Hai-Vy Nguyen, Fabrice Gamboa, Sixin Zhang, Reda Chhaibi, Serge Gratton, Thierry Giaccone
This loss boosts the discriminative power of neural networks, as measured by intra-class compactness and inter-class separability.
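The paper's loss is not specified in this snippet; a classic example of a penalty encouraging intra-class compactness is the center loss, sketched below as a stand-in (the class centers would normally be learned jointly with the network).

```python
import torch

def center_loss(features, labels, centers):
    """Center-loss-style penalty: pull each feature vector towards the
    center of its class, encouraging intra-class compactness.
    features: (B, d), labels: (B,) long, centers: (num_classes, d)."""
    return ((features - centers[labels]) ** 2).sum(dim=1).mean()
```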
no code implementations • 22 May 2024 • Sixin Zhang
Based on the notions of differential Stackelberg equilibrium and differential Nash equilibrium on Riemannian manifolds, we analyze the local convergence of two representative deterministic simultaneous algorithms $\tau$-GDA and $\tau$-SGA to such equilibria.
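Here $\tau$-GDA denotes simultaneous gradient descent-ascent in which the two players use step sizes in ratio $\tau$. A minimal Euclidean sketch (the paper works on Riemannian manifolds, and which player gets the faster timescale is an assumption here) is:

```python
def tau_gda(grad_x, grad_y, x, y, eta=0.01, tau=10.0, steps=1000):
    """Simultaneous gradient descent-ascent with timescale ratio tau:
    the min-player x uses step eta, the max-player y uses tau * eta."""
    for _ in range(steps):
        gx, gy = grad_x(x, y), grad_y(x, y)
        x, y = x - eta * gx, y + tau * eta * gy
    return x, y

# Example on the min-max objective f(x, y) = x*y - 0.5*y**2
x_star, y_star = tau_gda(lambda x, y: y, lambda x, y: x - y, x=1.0, y=1.0)
```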
no code implementations • 12 Apr 2024 • Hai-Vy Nguyen, Fabrice Gamboa, Reda Chhaibi, Sixin Zhang, Serge Gratton, Thierry Giaccone
The method is applicable to any classification model, as it operates directly in feature space at test time and does not intervene in the training process.
no code implementations • 14 Mar 2022 • Sixin Zhang
Generative Adversarial Networks (GANs) learn an implicit generative model from data samples through a two-player game.
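The two-player game referred to is the standard GAN objective of Goodfellow et al.:
$$\min_{G}\,\max_{D}\;\mathbb{E}_{x\sim p_{\mathrm{data}}}\big[\log D(x)\big]+\mathbb{E}_{z\sim p_{z}}\big[\log\big(1-D(G(z))\big)\big].$$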
1 code implementation • ICLR 2022 • Antoine Brochard, Sixin Zhang, Stéphane Mallat
State-of-the-art maximum entropy models for texture synthesis are built from statistics relying on image representations defined by convolutional neural networks (CNNs).
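As background (this is the Gatys-style baseline, not necessarily the representation studied in the paper), such models synthesize a texture by gradient descent on a noise image so that chosen CNN statistics match those of the example; the statistics `stats` below are a placeholder.

```python
import torch

def synthesize(stats, x_example, steps=500, lr=0.05):
    """Texture synthesis sketch: start from noise and match a set of
    differentiable statistics computed on the example image."""
    targets = [s(x_example).detach() for s in stats]
    x = torch.randn_like(x_example, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = sum(((s(x) - t) ** 2).sum() for s, t in zip(stats, targets))
        loss.backward()
        opt.step()
    return x.detach()
```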
1 code implementation • 10 Dec 2021 • Sixin Zhang, Emmanuel Soubies, Cédric Févotte
Non-negative matrix factorization with transform learning (TL-NMF) is a recent idea that aims at learning data representations suited to NMF.
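For context, plain NMF (the building block that TL-NMF extends with a learned transform) factorizes a non-negative matrix $V \approx WH$; a standard multiplicative-update sketch for the Frobenius loss is below (TL-NMF itself may use a different divergence).

```python
import numpy as np

def nmf(V, rank, iters=200, eps=1e-9):
    """Multiplicative-update NMF (Frobenius loss): V ≈ W @ H with W, H >= 0."""
    m, n = V.shape
    rng = np.random.default_rng(0)
    W, H = rng.random((m, rank)), rng.random((rank, n))
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update activations
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update dictionary
    return W, H
```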
2 code implementations • 27 Oct 2020 • Antoine Brochard, Bartłomiej Błaszczyszyn, Stéphane Mallat, Sixin Zhang
This paper presents a statistical model for stationary ergodic point processes, estimated from a single realization observed in a square window.
1 code implementation • 19 Oct 2020 • Pierre Boudier, Anthony Fillion, Serge Gratton, Selime Gürol, Sixin Zhang
Data assimilation (DA) aims at forecasting the state of a dynamical system by combining a mathematical representation of the system with noisy observations, taking their uncertainties into account.
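For reference, the classical (Kalman-type) analysis step that such DA schemes build on combines the forecast $x^{f}$ and the observation $y$ according to their error covariances $P^{f}$ and $R$, with observation operator $H$:
$$x^{a}=x^{f}+K\,(y-Hx^{f}),\qquad K=P^{f}H^{\top}\big(HP^{f}H^{\top}+R\big)^{-1}.$$
This is standard background, not necessarily the paper's formulation.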
1 code implementation • 22 Nov 2019 • Sixin Zhang, Stéphane Mallat
The covariance of a stationary process $X$ is diagonalized by a Fourier transform.
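In other words (Wiener-Khinchin): stationarity makes the covariance operator a convolution with $C(\tau)=\mathrm{Cov}\big(X(t),X(t+\tau)\big)$, and convolutions are diagonal in the Fourier basis,
$$\widehat{C\star f}(\omega)=\hat{C}(\omega)\,\hat{f}(\omega),$$
so the Fourier vectors $e^{i\omega t}$ are eigenvectors of the covariance, with eigenvalues given by the power spectrum $\hat{C}(\omega)$.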
2 code implementations • 28 Dec 2018 • Mathieu Andreux, Tomás Angles, Georgios Exarchakis, Roberto Leonarduzzi, Gaspar Rochette, Louis Thiry, John Zarka, Stéphane Mallat, Joakim Andén, Eugene Belilovsky, Joan Bruna, Vincent Lostanlen, Muawiz Chaudhary, Matthew J. Hirn, Edouard Oyallon, Sixin Zhang, Carmine Cella, Michael Eickenberg
The wavelet scattering transform is an invariant signal representation suitable for many signal processing and machine learning applications.
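This entry corresponds to the Kymatio software paper. If the package's interface matches my recollection (worth checking against the kymatio documentation), a 2-D scattering transform can be computed roughly as:

```python
import numpy as np
from kymatio.numpy import Scattering2D  # interface as recalled; check the docs

# Scattering coefficients of a 32x32 image with 2 dyadic scales (J=2).
scattering = Scattering2D(J=2, shape=(32, 32))
x = np.random.rand(32, 32).astype(np.float32)
Sx = scattering(x)
```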
no code implementations • 19 Dec 2018 • Antoine Brochard, Bartłomiej Błaszczyszyn, Stéphane Mallat, Sixin Zhang
To approximate (interpolate) the marking function, our baseline approach builds a statistical regression model of the marks with respect to a local point-distance representation.
1 code implementation • 29 Oct 2018 • Stéphane Mallat, Sixin Zhang, Gaspar Rochette
For wavelet filters, we show numerically that signals having sparse wavelet coefficients can be recovered from a few phase harmonic correlations, which provide a compressive representation.
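As a reminder of the terminology (my reading of this line of work), the phase harmonic of order $k$ of a complex coefficient $z=|z|e^{i\varphi(z)}$ keeps the modulus and multiplies the phase:
$$[z]^{k}=|z|\,e^{ik\varphi(z)},$$
and the statistics referred to above are correlations between phase harmonics of wavelet coefficients across scales.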
no code implementations • 7 May 2016 • Sixin Zhang
We also find a surprising connection between the momentum SGD and the EASGD method with a negative moving average rate.
10 code implementations • NeurIPS 2015 • Sixin Zhang, Anna Choromanska, Yann Lecun
We empirically demonstrate that in the deep learning setting, due to the existence of many local optima, allowing more exploration can lead to improved performance.
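A minimal sketch of the elastic-averaging idea: each worker takes a gradient step plus an elastic pull towards a shared center variable, and the center moves towards the workers (the exact parameterization in the paper may differ).

```python
import numpy as np

def easgd_step(workers, center, grads, eta=0.01, rho=0.1):
    """One synchronous elastic-averaging update for p workers."""
    alpha = eta * rho
    new_workers = [w - eta * g - alpha * (w - center)
                   for w, g in zip(workers, grads)]
    new_center = center + alpha * sum(w - center for w in workers)
    return new_workers, new_center
```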
1 code implementation • ICML'13: Proceedings of the 30th International Conference on Machine Learning - Volume 28, 2013 • Li Wan, Matthew Zeiler, Sixin Zhang, Yann Lecun, Rob Fergus
When training with Dropout, a randomly selected subset of activations is set to zero within each layer.
Ranked #7 on Image Classification on MNIST
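As the sentence above describes, standard (inverted) dropout can be sketched as follows; the paper's DropConnect variant instead zeros individual weights rather than activations.

```python
import numpy as np

def dropout(a, p=0.5, rng=None):
    """Inverted dropout: zero each activation with probability p and
    rescale the survivors so the expected value is unchanged."""
    rng = rng or np.random.default_rng()
    mask = rng.random(a.shape) >= p
    return a * mask / (1.0 - p)
```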
no code implementations • 6 Jun 2012 • Tom Schaul, Sixin Zhang, Yann Lecun
The performance of stochastic gradient descent (SGD) depends critically on how learning rates are tuned and decreased over time.