Search Results for author: Nathan Godey

Found 6 papers, 0 papers with code

Why do small language models underperform? Studying Language Model Saturation via the Softmax Bottleneck

no code implementations • 11 Apr 2024 • Nathan Godey, Éric de la Clergerie, Benoît Sagot

In this paper, we find that such saturation can be explained by a mismatch between the hidden dimension of smaller models and the high rank of the target contextual probability distribution.

Language Modelling
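
As a rough illustration of the rank argument (a minimal NumPy sketch with made-up sizes, not the paper's code): a language model's logits are W @ h for an unembedding matrix W of shape (V, d), so the matrix of logits over any set of contexts has rank at most the hidden dimension d, however large the vocabulary is. A target distribution of higher rank therefore cannot be matched exactly.

```python
import numpy as np

# Hypothetical sizes (assumptions, not the paper's): vocabulary V, hidden dim d.
V, d, n_contexts = 1000, 64, 200

rng = np.random.default_rng(0)
W = rng.normal(size=(V, d))           # output (unembedding) matrix
H = rng.normal(size=(d, n_contexts))  # hidden states, one column per context

logits = W @ H  # (V, n_contexts): each column is the logit vector for one context

# The rank is capped by the hidden dimension d, no matter how large V is,
# which is the bottleneck the abstract describes for small models.
print(np.linalg.matrix_rank(logits))  # 64
```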

On the Scaling Laws of Geographical Representation in Language Models

no code implementations • 29 Feb 2024 • Nathan Godey, Éric de la Clergerie, Benoît Sagot

Language models have long been shown to embed geographical information in their hidden representations.
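
Claims like this are typically tested with a linear probe. Below is a hypothetical scikit-learn sketch, not taken from the paper: random arrays stand in for real hidden states and place coordinates, and the probe regresses latitude and longitude from frozen representations.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Hypothetical data: one hidden vector per place name, plus its coordinates.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(500, 768))          # stand-in for frozen LM representations
coords = rng.uniform(-90, 90, size=(500, 2))  # (latitude, longitude) targets

X_tr, X_te, y_tr, y_te = train_test_split(hidden, coords, random_state=0)
probe = Ridge(alpha=1.0).fit(X_tr, y_tr)

# An above-chance R^2 on held-out places would indicate that geography
# is linearly decodable from the representations.
print("probe R^2:", probe.score(X_te, y_te))
```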

Anisotropy Is Inherent to Self-Attention in Transformers

no code implementations • 22 Jan 2024 • Nathan Godey, Éric de la Clergerie, Benoît Sagot

The representation degeneration problem is widely observed among Transformer-based self-supervised learning methods.

Self-Supervised Learning
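
Anisotropy here refers to hidden representations clustering in a narrow cone of the embedding space. It is commonly estimated as the average cosine similarity between representations of unrelated inputs; a minimal PyTorch sketch, with random tensors standing in for real hidden states:

```python
import torch

def average_cosine_similarity(h: torch.Tensor, n_pairs: int = 10_000) -> float:
    """Estimate anisotropy as the mean cosine similarity of random representation pairs."""
    i = torch.randint(0, h.size(0), (n_pairs,))
    j = torch.randint(0, h.size(0), (n_pairs,))
    return torch.nn.functional.cosine_similarity(h[i], h[j], dim=-1).mean().item()

# Stand-in hidden states; isotropic Gaussian vectors give a value near 0,
# whereas real Transformer hidden states are typically reported far above 0.
hidden_states = torch.randn(4096, 768)
print(average_cosine_similarity(hidden_states))
```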

Headless Language Models: Learning without Predicting with Contrastive Weight Tying

no code implementations • 15 Sep 2023 • Nathan Godey, Éric de la Clergerie, Benoît Sagot

Self-supervised pre-training of language models usually consists of predicting probability distributions over extensive token vocabularies.

LAMBADA
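
My reading of the Contrastive Weight Tying objective, heavily simplified: instead of a softmax over the whole vocabulary, each output state is trained to match the input embedding of its gold token, with the other gold-token embeddings in the batch as negatives. A hypothetical PyTorch sketch, not the authors' code:

```python
import torch
import torch.nn.functional as F

def cwt_loss(outputs: torch.Tensor, emb: torch.nn.Embedding,
             targets: torch.Tensor) -> torch.Tensor:
    """In-batch contrastive sketch: the positive for each output state is the
    input embedding of its gold token; other gold embeddings act as negatives."""
    positives = emb(targets)                # (N, d) input embeddings of gold tokens
    logits = outputs @ positives.T          # (N, N) similarities to in-batch embeddings
    labels = torch.arange(outputs.size(0))  # diagonal entries are the positives
    return F.cross_entropy(logits, labels)

emb = torch.nn.Embedding(32000, 768)
outputs = torch.randn(16, 768)              # stand-in for final hidden states
targets = torch.randint(0, 32000, (16,))    # stand-in gold next tokens
print(cwt_loss(outputs, emb, targets))
```

One standard caveat of such in-batch objectives: repeated gold tokens within a batch act as false negatives, which a real implementation would need to handle.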

Is Anisotropy Inherent to Transformers?

no code implementations • 13 Jun 2023 • Nathan Godey, Éric de la Clergerie, Benoît Sagot

The representation degeneration problem is widely observed among Transformer-based self-supervised learning methods.

Self-Supervised Learning

MANTa: Efficient Gradient-Based Tokenization for Robust End-to-End Language Modeling

no code implementations • 14 Dec 2022 • Nathan Godey, Roman Castagné, Éric de la Clergerie, Benoît Sagot

The resulting system offers a trade-off between the expressiveness of byte-level models and the speed of models trained using subword tokenization.

Language Modelling
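
The core mechanism, as I understand it, is a differentiable module that pools byte embeddings into a shorter sequence of block embeddings before the language model proper, so the segmentation can be learned end to end. The PyTorch sketch below is a simplification with assumed shapes, not the paper's architecture:

```python
import torch

class SoftBytePooler(torch.nn.Module):
    """Differentiable pooling sketch: predict a soft assignment of each byte to a
    block, then average byte embeddings per block (simplified vs. the paper)."""

    def __init__(self, dim: int, n_blocks: int):
        super().__init__()
        self.scorer = torch.nn.Linear(dim, n_blocks)

    def forward(self, byte_emb: torch.Tensor) -> torch.Tensor:
        assign = torch.softmax(self.scorer(byte_emb), dim=-1)  # (bytes, n_blocks)
        assign = assign / assign.sum(dim=0, keepdim=True)      # columns sum to 1
        return assign.T @ byte_emb                             # (n_blocks, dim)

pooler = SoftBytePooler(dim=128, n_blocks=64)
byte_embeddings = torch.randn(512, 128)  # 512 bytes, 128-dim embeddings (assumed sizes)
print(pooler(byte_embeddings).shape)     # torch.Size([64, 128]): shorter input for the LM
```

Because the pooling is soft and fully differentiable, gradients flow from the language-modeling loss back into the segmentation scorer, which is what makes the tokenization trainable rather than fixed.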
