no code implementations • 24 Mar 2025 • Akhiad Bercovich, Mohammad Dabbah, Omri Puny, Ido Galil, Amnon Geifman, Yonatan Geifman, Izhak Golan, Ehud Karpas, Itay Levy, Zach Moshe, Najeeb Nabwani, Tomer Ronen, Itamar Schen, Elad Segal, Ido Shahaf, Oren Tropp, Ran Zilberstein, Ran El-Yaniv
We introduce FFN Fusion, an architectural optimization technique that reduces sequential computation in large language models by identifying and exploiting natural opportunities for parallelization.
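A minimal PyTorch sketch of one reading of the idea: consecutive residual FFN blocks, which normally execute one after another, are applied to the same input and their outputs summed, removing the sequential dependency so their matrix multiplies can run in a single parallel pass. The `FFN` module, the `fused_ffn_forward` helper, and all sizes are illustrative names, not the paper's code.

```python
import torch
import torch.nn as nn

class FFN(nn.Module):
    """A standard transformer feed-forward block (gating omitted for brevity)."""
    def __init__(self, d_model, d_hidden):
        super().__init__()
        self.up = nn.Linear(d_model, d_hidden)
        self.down = nn.Linear(d_hidden, d_model)
        self.act = nn.GELU()

    def forward(self, x):
        return self.down(self.act(self.up(x)))

def fused_ffn_forward(x, ffns):
    # Sequential residual FFNs compute x <- x + FFN_i(x) one after another;
    # the fused form feeds the *same* input to every FFN and sums the outputs,
    # removing the sequential dependency between the blocks.
    return x + sum(ffn(x) for ffn in ffns)

torch.manual_seed(0)
ffns = [FFN(64, 256) for _ in range(3)]
x = torch.randn(2, 10, 64)

seq = x
for ffn in ffns:
    seq = seq + ffn(seq)                 # sequential baseline
fused = fused_ffn_forward(x, ffns)       # parallelizable variant
print((seq - fused).abs().max().item())  # nonzero: fusion is an approximation
```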
no code implementations • 28 Nov 2024 • Akhiad Bercovich, Tomer Ronen, Talor Abramovich, Nir Ailon, Nave Assaf, Mohammad Dabbah, Ido Galil, Amnon Geifman, Yonatan Geifman, Izhak Golan, Netanel Haber, Ehud Karpas, Roi Koren, Itay Levy, Pavlo Molchanov, Shahar Mor, Zach Moshe, Najeeb Nabwani, Omri Puny, Ran Rubin, Itamar Schen, Ido Shahaf, Oren Tropp, Omer Ullman Argov, Ran Zilberstein, Ran El-Yaniv
We demonstrate the real-world impact of our framework through Llama-3.1-Nemotron-51B-Instruct (Nemotron-51B), a publicly available model derived from Llama-3.1-70B-Instruct.
no code implementations • 26 Jul 2023 • Amnon Geifman, Daniel Barzilai, Ronen Basri, Meirav Galun
We leverage the duality between wide neural networks and Neural Tangent Kernels and propose a preconditioned gradient descent method, which alters the trajectory of GD.
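A toy sketch of the mechanism under stated assumptions: an RBF kernel stands in for the NTK of a wide network, and a preconditioner built from the kernel's eigendecomposition reshapes the per-eigendirection step sizes of kernel gradient descent. The spectrum-flattening choice of `g` below is an illustrative example, not the paper's specific preconditioner.

```python
import numpy as np

# Toy 1D regression; an RBF kernel stands in for the NTK of a wide network.
rng = np.random.default_rng(0)
X = np.linspace(-1, 1, 50)[:, None]
y = np.sin(4 * np.pi * X[:, 0]) + 0.1 * rng.standard_normal(50)

def rbf(A, B, gamma=10.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

K = rbf(X, X)
lam, U = np.linalg.eigh(K)

# Preconditioner P = U g(lam) U^T; the spectrum-flattening choice g ~ 1/lam
# makes every eigen-direction converge at a similar rate (tiny eigenvalues
# are clipped for numerical stability).
g = 1.0 / np.maximum(lam, 1e-6)
P = U @ np.diag(g) @ U.T

f = np.zeros(50)
eta = 0.5
for _ in range(200):
    # Plain kernel GD would update f -= eta * K @ (f - y); inserting P
    # reshapes the per-direction step sizes, i.e. it alters GD's trajectory.
    f -= eta * K @ P @ (f - y)

print(np.abs(f - y).max())  # residual after a fixed step budget
```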
no code implementations • 27 Nov 2022 • Daniel Barzilai, Amnon Geifman, Meirav Galun, Ronen Basri
Over-parameterized residual networks (ResNets) are amongst the most successful convolutional neural architectures for image processing.
no code implementations • 17 Mar 2022 • Amnon Geifman, Meirav Galun, David Jacobs, Ronen Basri
We study the properties of various over-parametrized convolutional neural architectures through their respective Gaussian process and neural tangent kernels.
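These kernels are derived analytically in the infinite-width limit; as a concrete finite-width proxy, the empirical NTK entry for two inputs is the inner product of the network's parameter gradients. A minimal PyTorch sketch with an arbitrary small conv net chosen only for illustration:

```python
import torch
import torch.nn as nn

# A small (arbitrary) conv net; the empirical, finite-width NTK entry for two
# inputs is the inner product of the parameter gradients of the scalar output.
net = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1),
)

def grad_vector(x):
    net.zero_grad()                       # clear gradients from earlier calls
    net(x).squeeze().backward()
    return torch.cat([p.grad.flatten() for p in net.parameters()])

x1 = torch.randn(1, 1, 8, 8)
x2 = torch.randn(1, 1, 8, 8)
print(grad_vector(x1) @ grad_vector(x2))  # empirical NTK value K(x1, x2)
```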
no code implementations • 7 Apr 2021 • Yuval Belfer, Amnon Geifman, Meirav Galun, Ronen Basri
Deep residual network architectures have been shown to achieve superior accuracy over classical feed-forward networks, yet their success is still not fully understood.
1 code implementation • NeurIPS 2020 • Amnon Geifman, Abhay Yadav, Yoni Kasten, Meirav Galun, David Jacobs, Ronen Basri
Experiments show that these kernel methods perform similarly to real neural networks.
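That paper relates the NTK of ReLU networks to the Laplace kernel on the sphere; below is a small sketch of the kind of kernel-method experiment implied, using kernel ridge regression with a Laplace kernel. The bandwidth `c`, the ridge term, and the synthetic data are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 5))
X /= np.linalg.norm(X, axis=1, keepdims=True)       # data on the unit sphere
y = np.sign(X[:, 0])

def laplace(A, B, c=1.0):
    # Laplace kernel exp(-c * ||a - b||); bandwidth c is illustrative.
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return np.exp(-c * d)

K = laplace(X, X)
alpha = np.linalg.solve(K + 1e-6 * np.eye(100), y)  # small ridge for stability
pred = np.sign(laplace(X, X) @ alpha)
print((pred == y).mean())  # train accuracy; use held-out points in practice
```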
no code implementations • ICML 2020 • Ronen Basri, Meirav Galun, Amnon Geifman, David Jacobs, Yoni Kasten, Shira Kritchman
Recent works have partly attributed the generalization ability of over-parameterized neural networks to frequency bias -- networks trained with gradient descent on data drawn from a uniform distribution find a low-frequency fit before high-frequency ones.
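A toy reproduction of this bias, assuming a small ReLU MLP on 1D data with illustrative hyperparameters: track the Fourier coefficients of the training residual and watch the low-frequency component vanish well before the high-frequency one.

```python
import numpy as np
import torch
import torch.nn as nn

# Fit a low + high frequency target and track the Fourier coefficients of
# the residual: the low-frequency error shrinks first.
torch.manual_seed(0)
x = (torch.arange(256) / 256.0)[:, None]
target = torch.sin(2 * np.pi * x) + torch.sin(2 * np.pi * 10 * x)

net = nn.Sequential(nn.Linear(1, 256), nn.ReLU(), nn.Linear(256, 1))
opt = torch.optim.SGD(net.parameters(), lr=0.1)

for step in range(2001):
    loss = ((net(x) - target) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 500 == 0:
        with torch.no_grad():
            r = (net(x) - target).squeeze().numpy()
        f = np.fft.rfft(r)
        # FFT bin 1 holds the 1-cycle (low) component of the remaining error,
        # bin 10 the 10-cycle (high) component.
        print(step, abs(f[1]), abs(f[10]))
```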
no code implementations • CVPR 2020 • Amnon Geifman, Yoni Kasten, Meirav Galun, Ronen Basri
Global methods for Structure from Motion have gained popularity in recent years.
no code implementations • ICCV 2019 • Yoni Kasten, Amnon Geifman, Meirav Galun, Ronen Basri
A common approach to essential matrix averaging is to separately solve for camera orientations and subsequently for camera positions.
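As a sketch of that first stage, here is a minimal spectral rotation-averaging example under simplifying assumptions (noise-free relative rotations, fully connected view graph); the eigenvector construction is one classical variant of rotation averaging, not this paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_rotation():
    q = rng.standard_normal(4)
    q /= np.linalg.norm(q)                      # unit quaternion -> rotation
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

n = 5
R_true = [random_rotation() for _ in range(n)]

# Block matrix of pairwise relative rotations R_ij = R_i R_j^T.
M = np.zeros((3 * n, 3 * n))
for i in range(n):
    for j in range(n):
        M[3*i:3*i+3, 3*j:3*j+3] = R_true[i] @ R_true[j].T

# M = B B^T for the 3n x 3 stack B of absolute rotations, so its top three
# eigenvectors recover B up to a global orthogonal ambiguity.
w, V = np.linalg.eigh(M)
B = V[:, -3:] * np.sqrt(n)
if np.linalg.det(B[:3, :3]) < 0:
    B[:, -1] *= -1                              # fix the reflection ambiguity
R_est = []
for i in range(n):
    U, _, Vt = np.linalg.svd(B[3*i:3*i+3])
    R_est.append(U @ Vt)                        # project each block onto SO(3)

G = R_est[0].T @ R_true[0]                      # global rotation alignment
print(max(np.abs(R @ G - Rt).max() for R, Rt in zip(R_est, R_true)))
```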
1 code implementation • CVPR 2019 • Yoni Kasten, Amnon Geifman, Meirav Galun, Ronen Basri
First, given ${n \choose 2}$ fundamental matrices computed for $n$ images, we provide a complete algebraic characterization in the form of conditions that are both necessary and sufficient to enable the recovery of camera matrices.
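The paper's full characterization concerns the symmetric $3n \times 3n$ block matrix assembled from all pairwise fundamental matrices; as a small grounding sketch, the snippet below builds one pairwise $F$ from two known cameras via the standard two-view formula $F = [e']_\times P' P^+$ and checks the basic rank-2 necessary condition. The random cameras and test point are illustrative data, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
P1 = rng.standard_normal((3, 4))                 # two random camera matrices
P2 = rng.standard_normal((3, 4))

_, _, Vt = np.linalg.svd(P1)
C = Vt[-1]                                       # camera center: null vector of P1

e2 = P2 @ C                                      # epipole in the second image
e2x = np.array([[0, -e2[2], e2[1]],
                [e2[2], 0, -e2[0]],
                [-e2[1], e2[0], 0]])             # cross-product matrix [e2]_x
F = e2x @ P2 @ np.linalg.pinv(P1)                # F = [e']_x P' P^+

X = np.append(rng.standard_normal(3), 1.0)       # a 3D point (homogeneous)
x1, x2 = P1 @ X, P2 @ X
print(np.linalg.matrix_rank(F), x2 @ F @ x1)     # rank 2, epipolar residual ~0
```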