Search Results for author: Alessio Devoto

Found 12 papers, 7 papers with code

Q-Filters: Leveraging QK Geometry for Efficient KV Cache Compression

1 code implementation • 4 Mar 2025 • Nathan Godey, Alessio Devoto, Yu Zhao, Simone Scardapane, Pasquale Minervini, Éric de la Clergerie, Benoît Sagot

Autoregressive language models rely on a Key-Value (KV) Cache, which avoids re-computing past hidden states during generation and thereby makes generation faster.

Text Generation

Mixture-of-Experts Graph Transformers for Interpretable Particle Collision Detection

no code implementations • 6 Jan 2025 • Donatella Genovese, Alessandro Sgroi, Alessio Devoto, Samuel Valentine, Lennox Wood, Cristiano Sebastiani, Stefano Giagu, Monica D'Onofrio, Simone Scardapane

In this paper, we propose a novel approach that combines a Graph Transformer model with Mixture-of-Expert layers to achieve high predictive performance while embedding interpretability into the architecture.

Decision Making

Goal-oriented Communications based on Recursive Early Exit Neural Networks

no code implementations • 27 Dec 2024 • Jary Pomponi, Mattia Merluzzi, Alessio Devoto, Mateus Pontes Mota, Paolo Di Lorenzo, Simone Scardapane

This paper presents a novel framework for goal-oriented semantic communications leveraging recursive early exit models.

Analysing the Residual Stream of Language Models Under Knowledge Conflicts

1 code implementation • 21 Oct 2024 • Yu Zhao, Xiaotang Du, Giwon Hong, Aryo Pradipta Gema, Alessio Devoto, Hongru Wang, Xuanli He, Kam-Fai Wong, Pasquale Minervini

Through probing tasks, we find that LLMs can internally register the signal of knowledge conflict in the residual stream, which can be accurately detected by probing the intermediate model activations.

Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering

1 code implementation • 21 Oct 2024 • Yu Zhao, Alessio Devoto, Giwon Hong, Xiaotang Du, Aryo Pradipta Gema, Hongru Wang, Xuanli He, Kam-Fai Wong, Pasquale Minervini

In this work, we propose SpARE, a training-free representation engineering method that uses pre-trained sparse auto-encoders (SAEs) to control the knowledge selection behaviour of LLMs.

Open-Domain Question Answering

Adaptive Layer Selection for Efficient Vision Transformer Fine-Tuning

no code implementations • 16 Aug 2024 • Alessio Devoto, Federico Alvetreti, Jary Pomponi, Paolo Di Lorenzo, Pasquale Minervini, Simone Scardapane

To this end, in this paper we introduce an efficient fine-tuning method for ViTs called ALaST (Adaptive Layer Selection Fine-Tuning for Vision Transformers) to speed up the fine-tuning process while reducing computational cost, memory load, and training time.

parameter-efficient fine-tuning

A Simple and Effective $L_2$ Norm-Based Strategy for KV Cache Compression

2 code implementations • 17 Jun 2024 • Alessio Devoto, Yu Zhao, Simone Scardapane, Pasquale Minervini

Existing approaches to reduce the KV cache size involve either fine-tuning the model to learn a compression strategy or leveraging attention scores to reduce the sequence length.

Decoder Language Modelling

Adaptive Semantic Token Selection for AI-native Goal-oriented Communications

no code implementations • 25 Apr 2024 • Alessio Devoto, Simone Petruzzi, Jary Pomponi, Paolo Di Lorenzo, Simone Scardapane

In this paper, we propose a novel design for AI-native goal-oriented communications, exploiting transformer neural networks under dynamic inference constraints on bandwidth and computation.

Conditional computation in neural networks: principles and research trends

no code implementations • 12 Mar 2024 • Simone Scardapane, Alessandro Baiocchi, Alessio Devoto, Valerio Marsocci, Pasquale Minervini, Jary Pomponi

This article summarizes principles and ideas from the emerging area of applying conditional computation methods to the design of neural networks.

scientific discovery, Semantic Communication

Class incremental learning with probability dampening and cascaded gated classifier

2 code implementations • 2 Feb 2024 • Jary Pomponi, Alessio Devoto, Simone Scardapane

The latter is a gated incremental classifier, helping the model modify past predictions without directly interfering with them.

Class Incremental Learning
