2 code implementations • NeurIPS 2023 • Amirhossein Kazemnejad, Inkit Padhi, Karthikeyan Natesan Ramamurthy, Payel Das, Siva Reddy
In this paper, we conduct a systematic empirical study comparing the length generalization performance of decoder-only Transformers with five position encoding approaches: Absolute Position Embedding (APE), T5's Relative PE, ALiBi, Rotary, and no positional encoding at all (NoPE).
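For intuition only, here is a minimal sketch (not code from the paper; the function name and shapes are illustrative assumptions) of how one of the compared schemes, ALiBi, injects position information: a per-head linear bias on the attention logits that grows with query-key distance, whereas a NoPE model would simply omit any such term.

```python
import torch

def alibi_bias(num_heads: int, seq_len: int) -> torch.Tensor:
    """Sketch of ALiBi: per-head linear biases added to causal attention logits."""
    # One slope per head, a geometric sequence: 2^(-8/H), 2^(-16/H), ...
    slopes = torch.tensor([2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)])
    pos = torch.arange(seq_len)
    distance = (pos[:, None] - pos[None, :]).clamp(min=0)  # query index minus key index
    # Shape (num_heads, seq_len, seq_len); added to Q.K^T before the causal softmax.
    # A NoPE model adds no such term and relies on the causal mask alone.
    return -slopes[:, None, None] * distance.float()
```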
no code implementations • 24 May 2023 • Amirhossein Kazemnejad, Mehdi Rezagholizadeh, Prasanna Parthasarathi, Sarath Chandar
We propose a systematic framework to measure parametric knowledge utilization in PLMs.
1 code implementation • 23 Oct 2022 • Koustuv Sinha, Amirhossein Kazemnejad, Siva Reddy, Joelle Pineau, Dieuwke Hupkes, Adina Williams
Transformer language models encode the notion of word order using positional information.
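As a rough illustration (an assumption for exposition, not the paper's code), the standard absolute-position recipe adds a learned per-position vector to each token embedding; without that term, permuting the input tokens would leave the set of input vectors unchanged, so the model could not distinguish word orders.

```python
import torch
import torch.nn as nn

class TokenAndPositionEmbedding(nn.Module):
    """Learned absolute position embeddings added to token embeddings (illustrative)."""
    def __init__(self, vocab_size: int, max_len: int, d_model: int):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len)
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        # The position term is what lets the model tell "dog bites man"
        # from "man bites dog"; shuffling tokens changes only self.tok(...).
        return self.tok(token_ids) + self.pos(positions)
```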
no code implementations • ACL 2020 • Amirhossein Kazemnejad, Mohammadreza Salehi, Mahdieh Soleymani Baghshah
With its novel editor module, the model then paraphrases the input sequence by editing it according to the relations extracted from the retrieved pair of sentences.