Search Results for author: Pietro Lesci

Found 4 papers, 3 papers with code

Tending Towards Stability: Convergence Challenges in Small Language Models

no code implementations • 15 Oct 2024 • Richard Diehl Martinez, Pietro Lesci, Paula Buttery

We find that nearly all layers in larger models stabilise early in training - within the first 20% - whereas layers in smaller models exhibit slower and less stable convergence, especially when their parameters have lower effective rank.

Causal Estimation of Memorisation Profiles

1 code implementation • 6 Jun 2024 • Pietro Lesci, Clara Meister, Thomas Hofmann, Andreas Vlachos, Tiago Pimentel

Understanding memorisation in language models has practical and societal implications, e.g., studying models' training dynamics or preventing copyright infringements.

counterfactual, Econometrics

AnchorAL: Computationally Efficient Active Learning for Large and Imbalanced Datasets

2 code implementations • 8 Apr 2024 • Pietro Lesci, Andreas Vlachos

By dynamically selecting different anchors at each iteration it promotes class balance and prevents overfitting the initial decision boundary, thus promoting the discovery of new clusters of minority instances.

Active Learning, imbalanced classification

Diable: Efficient Dialogue State Tracking as Operations on Tables

1 code implementation • 26 May 2023 • Pietro Lesci, Yoshinari Fujinuma, Momchil Hardalov, Chao Shang, Yassine Benajiba, Lluis Marquez

Sequence-to-sequence state-of-the-art systems for dialogue state tracking (DST) use the full dialogue history as input, represent the current state as a list with all the slots, and generate the entire state from scratch at each dialogue turn.

Dialogue State Tracking
