Search Results for author: Yair Kittenplon

Found 4 papers, 1 paper with code

Question Aware Vision Transformer for Multimodal Reasoning

no code implementations • 8 Feb 2024 • Roy Ganz, Yair Kittenplon, Aviad Aberdam, Elad Ben Avraham, Oren Nuriel, Shai Mazor, Ron Litman

This integration results in dynamic visual features that focus on the image aspects relevant to the posed question.

Language Modelling • Large Language Model • +1

Towards Models that Can See and Read

no code implementations • ICCV 2023 • Roy Ganz, Oren Nuriel, Aviad Aberdam, Yair Kittenplon, Shai Mazor, Ron Litman

Visual Question Answering (VQA) and Image Captioning (CAP), which are among the most popular vision-language tasks, have analogous scene-text versions that require reasoning from the text in the image.

Image Captioning • Question Answering • +1

Towards Weakly-Supervised Text Spotting using a Multi-Task Transformer

no code implementations • CVPR 2022 • Yair Kittenplon, Inbal Lavi, Sharon Fogel, Yarin Bar, R. Manmatha, Pietro Perona

End-to-end text spotting methods have recently gained attention in the literature due to the benefits of jointly optimizing the text detection and recognition components.

Text Detection • Text Spotting

FlowStep3D: Model Unrolling for Self-Supervised Scene Flow Estimation

1 code implementation • CVPR 2021 • Yair Kittenplon, Yonina C. Eldar, Dan Raviv

Estimating the 3D motion of points in a scene, known as scene flow, is a core problem in computer vision.

Rolling Shutter Correction • Self-supervised Scene Flow Estimation
