no code implementations • CVPR 2024 • Roy Ganz, Yair Kittenplon, Aviad Aberdam, Elad Ben Avraham, Oren Nuriel, Shai Mazor, Ron Litman
This integration results in dynamic visual features that focus on the image aspects relevant to the posed question.
no code implementations • CVPR 2024 • Tsachi Blau, Sharon Fogel, Roi Ronen, Alona Golts, Roy Ganz, Elad Ben Avraham, Aviad Aberdam, Shahar Tsiper, Ron Litman
The increasing use of transformer-based large language models brings forward the challenge of processing long sequences.
no code implementations • ICCV 2023 • Aviad Aberdam, David Bensaïd, Alona Golts, Roy Ganz, Oren Nuriel, Royee Tichauer, Shai Mazor, Ron Litman
Reading text in real-world scenarios often requires understanding the context surrounding it, especially when dealing with poor-quality text.
no code implementations • ICCV 2023 • Roy Ganz, Oren Nuriel, Aviad Aberdam, Yair Kittenplon, Shai Mazor, Ron Litman
Visual Question Answering (VQA) and Image Captioning (CAP), which are among the most popular vision-language tasks, have analogous scene-text versions that require reasoning from the text in the image.
no code implementations • 14 Sep 2022 • Sergi Garcia-Bordils, Andrés Mafla, Ali Furkan Biten, Oren Nuriel, Aviad Aberdam, Shai Mazor, Ron Litman, Dimosthenis Karatzas
This paper presents the final results of the Out-Of-Vocabulary 2022 (OOV) challenge.
2 code implementations • 8 May 2022 • Aviad Aberdam, Roy Ganz, Shai Mazor, Ron Litman
In a novel setup, consistency is enforced on each modality separately.
no code implementations • 1 Jan 2021 • Aviad Aberdam, Dror Simon, Michael Elad
Deep generative models (e.g., GANs and VAEs) have been developed quite extensively in recent years.
no code implementations • 23 Dec 2020 • Ron Slossberg, Oron Anschel, Amir Markovitz, Ron Litman, Aviad Aberdam, Shahar Tsiper, Shai Mazor, Jon Wu, R. Manmatha
Although the topic of confidence calibration has been an active research area for the last several decades, the case of structured and sequence prediction calibration has been scarcely explored.
2 code implementations • CVPR 2021 • Aviad Aberdam, Ron Litman, Shahar Tsiper, Oron Anschel, Ron Slossberg, Shai Mazor, R. Manmatha, Pietro Perona
We propose a framework for sequence-to-sequence contrastive learning (SeqCLR) of visual representations, which we apply to text recognition.
no code implementations • 28 Jun 2020 • Aviad Aberdam, Dror Simon, Michael Elad
Deep generative models (e.g., GANs and VAEs) have been developed quite extensively in recent years.
1 code implementation • 23 Jan 2020 • Aviad Aberdam, Alona Golts, Michael Elad
Neural networks that are based on unfolding of an iterative solver, such as LISTA (learned iterative soft threshold algorithm), are widely used due to their accelerated performance.
1 code implementation • 24 Dec 2019 • Dror Simon, Aviad Aberdam
Image interpolation, or image morphing, refers to a visual transition between two (or more) input images.
2 code implementations • 2 Jun 2018 • Jeremias Sulam, Aviad Aberdam, Amir Beck, Michael Elad
Parsimonious representations are ubiquitous in modeling and processing information.
no code implementations • 29 May 2018 • Yaniv Romano, Aviad Aberdam, Jeremias Sulam, Michael Elad
Despite their impressive performance, deep convolutional neural networks (CNNs) have been shown to be sensitive to small adversarial perturbations.
no code implementations • 25 Apr 2018 • Aviad Aberdam, Jeremias Sulam, Michael Elad
The recently proposed multi-layer sparse model has raised insightful connections between sparse representations and convolutional neural networks (CNNs).