1 code implementation • 8 Aug 2024 • Frank Nielsen, Alexander Soen
A Bregman manifold is a synonym for a dually flat space in information geometry, which admits a Bregman divergence as its canonical divergence.
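As background (this is the standard definition, not an excerpt from the paper): for a strictly convex, differentiable generator $F$, the associated Bregman divergence is $B_F(\theta : \theta') = F(\theta) - F(\theta') - \langle \theta - \theta', \nabla F(\theta') \rangle$; taking $F(\theta) = \frac{1}{2}\|\theta\|^2$ recovers the squared Euclidean distance.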
no code implementations • 29 May 2024 • Alexander Soen, Hisham Husain, Philip Schulz, Vu Nguyen
Instead, we propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
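Schematically, in notation of our own rather than the paper's, this perspective swaps the usual optimization over model parameters for one over data distributions: $p^\star = \arg\max_{p \in \mathcal{P}} \mathbb{E}_{(x, y) \sim p}\big[\mathrm{perf}(f, x, y)\big]$, where $f$ is the frozen pretrained model and $\mathcal{P}$ a family of candidate data distributions.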
no code implementations • 8 Feb 2024 • Alexander Soen, Ke Sun
The Fisher information matrix can be used to characterize the local geometry of the parameter space of neural networks.
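For reference, the standard definition: for a parametric model $p_\theta$, the Fisher information matrix is $\mathcal{I}(\theta) = \mathbb{E}_{x \sim p_\theta}\!\left[ \nabla_\theta \log p_\theta(x)\, \nabla_\theta \log p_\theta(x)^\top \right]$, which acts as a Riemannian metric on the parameter space.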
no code implementations • 6 Feb 2024 • Richard Nock, Ehsan Amid, Frank Nielsen, Alexander Soen, Manfred K. Warmuth
Most mathematical distortions used in ML are fundamentally integral in nature: $f$-divergences, Bregman divergences, (regularized) optimal transport distances, integral probability metrics, geodesic distances, etc.
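The $f$-divergences are the canonical example of this integral form: $D_f(P \,\|\, Q) = \int f\!\left( \frac{\mathrm{d}P}{\mathrm{d}Q} \right) \mathrm{d}Q$ for convex $f$ with $f(1) = 0$; the choice $f(t) = t \log t$ recovers the Kullback-Leibler divergence.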
no code implementations • 28 Feb 2023 • Shidi Li, Christian Walder, Alexander Soen, Lexing Xie, Miaomiao Liu
The sparse transformer can reduce the computational complexity of the self-attention layers to $O(n)$, whilst still being a universal approximator of continuous sequence-to-sequence functions.
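As a generic illustration of why sparsity buys linearity (this windowed pattern is an assumption for exposition, not necessarily the paper's construction), each query below attends to at most $2w + 1$ neighbouring keys, so the cost is $O(nw)$, linear in $n$ for fixed $w$:

    import numpy as np

    def local_window_attention(q, k, v, w=4):
        # Sparse attention sketch: query i attends only to keys in
        # [i - w, i + w], giving O(n * w) work instead of O(n^2).
        n, d = q.shape
        out = np.zeros_like(v)
        for i in range(n):
            lo, hi = max(0, i - w), min(n, i + w + 1)
            scores = q[i] @ k[lo:hi].T / np.sqrt(d)
            weights = np.exp(scores - scores.max())   # stable softmax
            weights /= weights.sum()
            out[i] = weights @ v[lo:hi]
        return out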
1 code implementation • 31 Jan 2022 • Alexander Soen, Ibrahim Alabdulmohsin, Sanmi Koyejo, Yishay Mansour, Nyalleng Moorosi, Richard Nock, Ke Sun, Lexing Xie
We introduce a new family of techniques to post-process ("wrap") a black-box classifier in order to reduce its bias.
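A minimal sketch of the black-box access pattern, with hypothetical per-group thresholds standing in for the paper's actual wrapper construction:

    def wrap_classifier(score_fn, thresholds):
        # Post-process ("wrap") a frozen black-box scorer: the wrapper
        # only sees scores, never the model internals. The per-group
        # thresholds here are a placeholder for a bias-reducing rule.
        def wrapped(x, group):
            return int(score_fn(x) >= thresholds[group])
        return wrapped

For example, wrapped = wrap_classifier(model_score, {'a': 0.5, 'b': 0.4}) leaves the underlying classifier untouched and adjusts only its decision rule.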
1 code implementation • 3 Nov 2021 • Pio Calderon, Alexander Soen, Marian-Andrei Rizoiu
The multivariate Hawkes process (MHP) is widely used for analyzing data streams that interact with each other, where events generate new events within their own dimension (via self-excitation) or across different dimensions (via cross-excitation).
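In standard notation (not necessarily the paper's), the conditional intensity of dimension $i$ in a $D$-dimensional MHP is $\lambda_i(t) = \mu_i + \sum_{j=1}^{D} \sum_{t_{jk} < t} \phi_{ij}(t - t_{jk})$, where $\mu_i$ is the background rate and the kernel $\phi_{ij}$ captures self-excitation when $i = j$ and cross-excitation otherwise.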
no code implementations • NeurIPS 2021 • Alexander Soen, Ke Sun
In the realm of deep learning, the Fisher information matrix (FIM) gives novel insights and useful tools to characterize the loss landscape, perform second-order optimization, and build geometric learning theories.
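A minimal sketch of how the FIM is often approximated in practice, assuming per-example gradients of the log-likelihood have been collected elsewhere (e.g. by backprop); note that this "empirical Fisher" uses observed labels, whereas the true FIM samples labels from the model:

    import numpy as np

    def empirical_fisher(per_example_grads):
        # per_example_grads: array of shape (N, P), one flattened
        # gradient of the per-example log-likelihood per row.
        g = np.asarray(per_example_grads)
        return g.T @ g / g.shape[0]   # (P, P) average outer product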
no code implementations • 16 Apr 2021 • Marian-Andrei Rizoiu, Alexander Soen, Shidi Li, Pio Calderon, Leanne Dong, Aditya Krishna Menon, Lexing Xie
We propose the multi-impulse exogenous function, for when the exogenous events are observed as event times, and the latent homogeneous Poisson process exogenous function, for when the exogenous events are presented as interval-censored volumes.
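In our own schematic notation: the first form places an impulse of weight $\alpha_k$ at each observed exogenous event time $\tau_k$, $\mu(t) = \sum_k \alpha_k\, \delta(t - \tau_k)$, while the second models the unobserved exogenous input as a constant-rate Poisson process, $\mu(t) = \mu_0$, fitted from interval-censored counts rather than individual timestamps.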
1 code implementation • 1 Dec 2020 • Alexander Soen, Hisham Husain, Richard Nock
Furthermore, when the weak learners are specified to be decision trees, the sufficient statistics of the learned distribution can be examined to provide clues on sources of (un)fairness.
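As a sketch in notation of our own: if round $t$ of boosting contributes a tree $h_t$ with weight $\theta_t$, the learned density takes an exponential-family form, $p_\theta(x) \propto \exp\big( \sum_t \theta_t\, h_t(x) \big)$, so the trees play the role of sufficient statistics and can be read off to see which feature splits drive the fitted (un)fairness.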
no code implementations • 28 Jul 2020 • Alexander Soen, Alexander Mathews, Daniel Grixti-Cheng, Lexing Xie
The proof connects the well-known Stone-Weierstrass Theorem for function approximation, the uniform density of non-negative continuous functions using a transfer function, the formulation of the parameters of a piecewise-continuous function as a dynamic system, and a recurrent neural network implementation for capturing the dynamics.
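Concretely, the non-negativity step composes an unconstrained continuous approximator $g$ (realizable by the RNN) with a transfer function such as the softplus $\psi(z) = \log(1 + e^z)$, so that $\lambda(t) = \psi(g(t))$ is a valid, non-negative intensity while retaining uniform approximation power; the specific choice of softplus here is illustrative rather than taken from the paper.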