Chain and Causal Attention for Efficient Entity Tracking

no code implementations · 7 Oct 2024 · Erwan Fagnou, Paul Caillon, Blaise Delattre, Alexandre Allauzen

This paper investigates the limitations of transformers for entity-tracking tasks in large language models.

Exploring Precision and Recall to assess the quality and diversity of LLMs

1 code implementation · 16 Feb 2024 · Florian Le Bronnec, Alexandre Verine, Benjamin Negrevergne, Yann Chevaleyre, Alexandre Allauzen

We introduce a novel evaluation framework for Large Language Models (LLMs) such as Llama-2 and Mistral, focusing on importing Precision and Recall metrics from image generation to text generation.

Spectral Norm of Convolutional Layers with Circular and Zero Paddings

1 code implementation · 31 Jan 2024 · Blaise Delattre, Quentin Barthélemy, Alexandre Allauzen

This paper leverages Gram iteration, an efficient, deterministic, and differentiable method for computing the spectral norm with an upper-bound guarantee.

Differentially Private Gradient Flow based on the Sliced Wasserstein Distance

1 code implementation · 13 Dec 2023 · Ilana Sebag, Muni Sreenivas Pydi, Jean-Yves Franceschi, Alain Rakotomamonjy, Mike Gartrell, Jamal Atif, Alexandre Allauzen

In this paper, we introduce a novel differentially private generative modeling approach based on a gradient flow in the space of probability measures.

Efficient Bound of Lipschitz Constant for Convolutional Layers by Gram Iteration

1 code implementation · 25 May 2023 · Blaise Delattre, Quentin Barthélemy, Alexandre Araujo, Alexandre Allauzen

Since controlling the Lipschitz constant has a great impact on the training stability, generalization, and robustness of neural networks, estimating this value is now a real scientific challenge.

Experimental study of Neural ODE training with adaptive solver for dynamical systems modeling

1 code implementation · 13 Nov 2022 · Alexandre Allauzen, Thiago Petrilli Maffei Dardis, Hannah Plath

Neural Ordinary Differential Equations (ODEs) were recently introduced as a new family of neural network models, which rely on black-box ODE solvers for inference and training.

Curriculum learning for data-driven modeling of dynamical systems

no code implementations · 15 Dec 2021 · Alessandro Bucci, Onofrio Semeraro, Alexandre Allauzen, Sergio Chibbaro, Lionel Mathelin

We consider entropy as a metric of the complexity of the dataset; we show how an informed design of the training set, based on an analysis of the entropy, significantly improves the generalizability of the resulting models, and we provide insights on the amount and choice of data required for effective data-driven modeling.

A Dynamical System Perspective for Lipschitz Neural Networks

no code implementations · 25 Oct 2021 · Laurent Meunier, Blaise Delattre, Alexandre Araujo, Alexandre Allauzen

The Lipschitz constant of neural networks has been established as a key quantity for enforcing robustness to adversarial examples.

Measure and Evaluation of Semantic Divergence across Two Languages

no code implementations · ACL 2021 · Syrielle Montariol, Alexandre Allauzen

We propose a set of scenarios to characterize semantic divergence across two languages, along with a setup to differentiate them in a bilingual corpus.

Exploring sentence informativeness

no code implementations · JEP/TALN/RECITAL 2019 · Syrielle Montariol, Aina Garí Soler, Alexandre Allauzen

This study is a preliminary exploration of the concept of informativeness (how much information a sentence gives about a word it contains) and its potential benefits for building quality word representations from scarce data.

Document Neural Autoregressive Distribution Estimation

no code implementations · 18 Mar 2016 · Stanislas Lauly, Yin Zheng, Alexandre Allauzen, Hugo Larochelle

We present an approach based on feed-forward neural networks for learning the distribution of textual documents.

