Search Results for author: Christoforos Nalmpantis

Found 6 papers, 3 papers with code

Teaching Large Language Models to Reason with Reinforcement Learning

no code implementations 7 Mar 2024 Alex Havrilla, Yuqing Du, Sharath Chandra Raparthy, Christoforos Nalmpantis, Jane Dwivedi-Yu, Maksym Zhuravinskyi, Eric Hambro, Sainbayar Sukhbaatar, Roberta Raileanu

Surprisingly, we find the sample complexity of Expert Iteration is similar to that of PPO, requiring at most on the order of $10^6$ samples to converge from a pretrained checkpoint.

reinforcement-learning

Understanding the Effects of RLHF on LLM Generalisation and Diversity

1 code implementation 10 Oct 2023 Robert Kirk, Ishita Mediratta, Christoforos Nalmpantis, Jelena Luketina, Eric Hambro, Edward Grefenstette, Roberta Raileanu

OOD generalisation is crucial given the wide range of real-world scenarios in which these models are being used, while output diversity refers to the model's ability to generate varied outputs and is important for a variety of use cases.

Instruction Following

Neurons in Large Language Models: Dead, N-gram, Positional

no code implementations 9 Sep 2023 Elena Voita, Javier Ferrando, Christoforos Nalmpantis

Specifically, we focus on the OPT family of models ranging from 125m to 66b parameters and rely only on whether an FFN neuron is activated or not.

Position

PEER: A Collaborative Language Model

no code implementations 24 Aug 2022 Timo Schick, Jane Dwivedi-Yu, Zhengbao Jiang, Fabio Petroni, Patrick Lewis, Gautier Izacard, Qingfei You, Christoforos Nalmpantis, Edouard Grave, Sebastian Riedel

Textual content is often the output of a collaborative writing process: We start with an initial draft, ask for suggestions, and repeatedly make changes.

Language Modelling

On time series representations for multi-label NILM

1 code implementation Neural Computing and Applications 2020 Christoforos Nalmpantis, Dimitris Vrakas

Given only the main power consumption of a household, a non-intrusive load monitoring (NILM) system identifies which appliances are operating.

Dimensionality Reduction, Non-Intrusive Load Monitoring, +2 more
