no code implementations • 8 Feb 2024 • Roy Ganz, Yair Kittenplon, Aviad Aberdam, Elad Ben Avraham, Oren Nuriel, Shai Mazor, Ron Litman
This integration results in dynamic visual features that focus on the image aspects relevant to the posed question.
no code implementations • 7 Jan 2024 • Tsachi Blau, Sharon Fogel, Roi Ronen, Alona Golts, Roy Ganz, Elad Ben Avraham, Aviad Aberdam, Shahar Tsiper, Ron Litman
The increasing use of transformer-based large language models brings forward the challenge of processing long sequences.
no code implementations • 29 Jun 2023 • Roy Ganz, Michael Elad
Perceptually Aligned Gradients (PAG) refer to an intriguing property observed in robust image classification models, wherein their input gradients align with human perception and carry semantic meaning.
1 code implementation • 28 May 2023 • Noam Rotstein, David Bensaid, Shaked Brody, Roy Ganz, Ron Kimmel
Our proposed method, FuseCap, fuses the outputs of such vision experts with the original captions using a large language model (LLM), yielding comprehensive image descriptions.
Ranked #1 on Image Captioning on COCO Captions (CLIPScore metric)
1 code implementation • 27 Mar 2023 • Tsachi Blau, Roy Ganz, Chaim Baskin, Michael Elad, Alex Bronstein
We show that the proposed method achieves state-of-the-art results and validate our claim through extensive experiments on a variety of defense methods, classifier architectures, and datasets.
no code implementations • ICCV 2023 • Roy Ganz, Oren Nuriel, Aviad Aberdam, Yair Kittenplon, Shai Mazor, Ron Litman
Visual Question Answering (VQA) and Image Captioning (CAP), which are among the most popular vision-language tasks, have analogous scene-text versions that require reasoning from the text in the image.
no code implementations • ICCV 2023 • Aviad Aberdam, David Bensaïd, Alona Golts, Roy Ganz, Oren Nuriel, Royee Tichauer, Shai Mazor, Ron Litman
Reading text in real-world scenarios often requires understanding the context surrounding it, especially when dealing with poor-quality text.
1 code implementation • 18 Aug 2022 • Bahjat Kawar, Roy Ganz, Michael Elad
In order to obtain class-conditional generation, it was suggested to guide the diffusion process by gradients from a time-dependent classifier.
1 code implementation • 22 Jul 2022 • Roy Ganz, Bahjat Kawar, Michael Elad
In this work, we focus on this trait and test whether "Perceptually Aligned Gradients imply Robustness".
1 code implementation • 17 Jul 2022 • Tsachi Blau, Roy Ganz, Bahjat Kawar, Alex Bronstein, Michael Elad
Deep Neural Networks (DNNs) are highly sensitive to imperceptible malicious perturbations, known as adversarial attacks.
2 code implementations • 8 May 2022 • Aviad Aberdam, Roy Ganz, Shai Mazor, Ron Litman
In a novel setup, consistency is enforced on each modality separately.
no code implementations • 29 Sep 2021 • Roy Ganz, Michael Elad
The interest of the deep learning community in image synthesis has grown massively in recent years.
1 code implementation • 8 Aug 2021 • Roy Ganz, Michael Elad
The interest of the machine learning community in image synthesis has grown significantly in recent years, with the introduction of a wide range of deep generative models and means for training them.
Ranked #4 on Image Generation on ImageNet 128x128
no code implementations • 1 Apr 2021 • Roy Ganz, Michael Elad
The interest of the deep learning community in image synthesis has grown massively in recent years.