Search Results for author: Hofit Bata

Found 3 papers, 1 paper with code

The Depth-to-Width Interplay in Self-Attention

1 code implementation • NeurIPS 2020 • Yoav Levine, Noam Wies, Or Sharir, Hofit Bata, Amnon Shashua

Our guidelines elucidate the depth-to-width trade-off in self-attention networks of sizes up to the scale of GPT3 (which we project to be too deep for its size), and beyond, marking an unprecedented width of 30K as optimal for a 1-Trillion parameter network.
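As a rough sanity check on the abstract's "width of 30K for a 1-Trillion parameter network" figure, the sketch below uses the standard per-layer parameter-count approximation of ~12·d_model² (attention Q/K/V/O projections plus a 4x-wide FFN, embeddings ignored). This formula and the `implied_depth` helper are assumptions for illustration only, not taken from the paper.

```python
# Illustrative sketch (assumption, not from the paper): relate transformer
# depth and width via the rule of thumb
#   params_per_layer ~ 12 * d_model**2
# (attention Q/K/V/O projections + 4x-wide FFN), ignoring embeddings.

def implied_depth(total_params: float, width: int) -> float:
    """Depth implied by a parameter budget at a given width (hypothetical helper)."""
    return total_params / (12 * width ** 2)

if __name__ == "__main__":
    # GPT-3: ~175B parameters at width 12,288 -> roughly its actual 96 layers.
    print(f"GPT-3 implied depth:  {implied_depth(175e9, 12_288):.0f}")
    # A 1-trillion-parameter model at the abstract's suggested width of 30K
    # comes out to roughly 90-95 layers under this approximation.
    print(f"1T @ width 30K depth: {implied_depth(1e12, 30_000):.0f}")
```

Under this rule of thumb, a 1T-parameter network at width 30K would be around 93 layers deep, while GPT-3's 96 layers at width 12,288 is what the abstract projects to be "too deep for its size".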
