no code implementations • 29 Jan 2024 • Yotam Wolf, Noam Wies, Dorin Shteyman, Binyamin Rothberg, Yoav Levine, Amnon Shashua
Representation engineering yields gains in alignment-oriented tasks, such as resistance to adversarial attacks and reduction of social biases, but has also been shown to degrade the model's ability to perform basic tasks.
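A minimal sketch of the kind of intervention meant by representation engineering, via activation steering (a generic illustration under our own assumptions, not the paper's method): a fixed steering direction is added to a chosen hidden layer's output at inference time.

```python
# Generic activation-steering sketch (illustrative, not the paper's method):
# shift a chosen hidden layer's output along a fixed "alignment" direction.
import torch

def add_steering_hook(layer, direction, alpha=1.0):
    # direction: (hidden_dim,) tensor, e.g. the mean activation difference
    # between aligned and misaligned prompts; alpha scales the intervention.
    def hook(module, inputs, output):
        return output + alpha * direction  # broadcasts over batch and sequence dims
    return layer.register_forward_hook(hook)  # returns a handle; call .remove() to undo
```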
no code implementations • 4 Jul 2023 • Eliya Segev, Maya Alroy, Ronen Katsir, Noam Wies, Ayana Shenhav, Yael Ben-Oren, David Zar, Oren Tadmor, Jacob Bitterman, Amnon Shashua, Tal Rosenwein
Here we propose Align With Purpose, a general plug-and-play framework for enhancing a desired property in models trained with the CTC criterion.
Automatic Speech Recognition (ASR) +1
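A minimal sketch of how a property term can be bolted onto CTC training in PyTorch (our own illustration of the plug-and-play idea, not the paper's algorithm; `property_loss`, `lam`, and the model's output shape are assumptions):

```python
# Illustrative sketch, not the authors' implementation: standard CTC loss
# plus a weighted, differentiable "property" term.
import torch
import torch.nn.functional as F

def property_loss(log_probs):
    # Hypothetical property: prefer peaky (low-entropy) per-frame emissions.
    return -(log_probs.exp() * log_probs).sum(-1).mean()

def training_step(model, speech, targets, input_lengths, target_lengths, lam=0.1):
    log_probs = model(speech).log_softmax(-1)  # assumed shape (T, N, C), as ctc_loss expects
    ctc = F.ctc_loss(log_probs, targets, input_lengths, target_lengths, blank=0)
    return ctc + lam * property_loss(log_probs)  # CTC objective + weighted property term
```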
no code implementations • 19 Apr 2023 • Yotam Wolf, Noam Wies, Oshri Avnery, Yoav Levine, Amnon Shashua
An important aspect in developing language models that interact with humans is aligning their behavior to be useful and unharmful for their human users.
1 code implementation • 6 Apr 2022 • Noam Wies, Yoav Levine, Amnon Shashua
Recently, several works have demonstrated large gains from a straightforward approach to incorporating intermediate supervision in compounded natural language problems: the sequence-to-sequence LM is fed an augmented input in which the labels of the decomposed sub-tasks are simply concatenated to the original input.
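A toy sketch of this input augmentation (the separator and field names are illustrative choices, not from the paper):

```python
# Toy sketch of intermediate supervision by concatenation: the decomposed
# sub-tasks' labels are appended to the original input; the target stays
# the final answer.
def augment_with_intermediate_labels(question, subtask_labels, answer):
    augmented_input = question + " ; " + " ; ".join(subtask_labels)
    return {"input": augmented_input, "target": answer}

example = augment_with_intermediate_labels(
    "23+45+17=", ["23+45=68", "68+17=85"], "85")
print(example["input"])  # 23+45+17= ; 23+45=68 ; 68+17=85
```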
no code implementations • ICLR 2022 • Yoav Levine, Noam Wies, Daniel Jannai, Dan Navon, Yedid Hoshen, Amnon Shashua
We highlight a bias introduced by this common practice: we prove that the pretrained NLM can model much stronger dependencies between text segments that appeared in the same training example than between text segments that appeared in different training examples.
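For concreteness, the common practice in question: documents are concatenated and chunked into fixed-length training examples, so two segments interact at training time only if they fall into the same chunk (a schematic sketch with an illustrative example length):

```python
# Schematic of fixed-length example packing in LM pretraining: segments
# co-occur during training only when they land in the same chunk.
def pack_into_examples(token_ids, example_len=2048):
    return [token_ids[i:i + example_len]
            for i in range(0, len(token_ids), example_len)]

corpus = list(range(10))  # stand-in for a tokenized corpus
print(pack_into_examples(corpus, example_len=4))
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```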
no code implementations • 9 May 2021 • Noam Wies, Yoav Levine, Daniel Jannai, Amnon Shashua
After their successful debut in natural language processing, Transformer architectures are now becoming the de facto standard in many domains.
1 code implementation • NeurIPS 2020 • Yoav Levine, Noam Wies, Or Sharir, Hofit Bata, Amnon Shashua
Our guidelines elucidate the depth-to-width trade-off in self-attention networks up to the scale of GPT-3 (which we project to be too deep for its size) and beyond, marking an unprecedented width of 30K as optimal for a 1-trillion-parameter network.
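As a back-of-the-envelope check of those figures (our own arithmetic, using the standard ~12 · depth · width² parameter count for a Transformer and ignoring embeddings):

```python
# Rough sanity check (our arithmetic, not the paper's derivation): with the
# standard ~12 * depth * width^2 Transformer parameter count, a network of
# width 30K reaches one trillion parameters at roughly 93 layers deep.
width = 30_000
params = 1e12
depth = params / (12 * width ** 2)
print(round(depth))  # 93
```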
2 code implementations • 11 Feb 2019 • Or Sharir, Yoav Levine, Noam Wies, Giuseppe Carleo, Amnon Shashua
Artificial Neural Networks were recently shown to be an efficient representation of highly entangled many-body quantum states.
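For background, a minimal NumPy sketch of the classic restricted-Boltzmann-machine wavefunction ansatz behind neural-network quantum states (context for the claim above, not the architecture studied in the paper):

```python
# Classic RBM neural-network quantum state (background illustration only):
# unnormalized amplitude psi(s) = exp(a.s) * prod_j 2*cosh(b_j + (s W)_j).
import numpy as np

rng = np.random.default_rng(0)
n_spins, n_hidden = 8, 16
a = rng.normal(size=n_spins)               # visible biases
b = rng.normal(size=n_hidden)              # hidden biases
W = rng.normal(size=(n_spins, n_hidden))   # spin-hidden couplings

def amplitude(s):
    return np.exp(a @ s) * np.prod(2 * np.cosh(b + s @ W))

s = rng.choice([-1.0, 1.0], size=n_spins)  # one spin configuration
print(amplitude(s))
```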