Search Results for author: Minesh Mathew

Found 15 papers, 2 papers with code

Understanding Video Scenes through Text: Insights from Text-based Video Question Answering

no code implementations • 4 Sep 2023 • Soumya Jahagirdar, Minesh Mathew, Dimosthenis Karatzas, C. V. Jawahar

Researchers have extensively studied the field of vision and language, discovering that both visual and textual content are crucial for understanding scenes effectively.

Domain Adaptation, Question Answering, +1

Reading Between the Lanes: Text VideoQA on the Road

no code implementations • 8 Jul 2023 • George Tom, Minesh Mathew, Sergi Garcia, Dimosthenis Karatzas, C. V. Jawahar

Text and signs around roads provide crucial information for drivers, vital for safe navigation and situational awareness.

Question Answering, Scene Text Recognition, +1

Watching the News: Towards VideoQA Models that can Read

no code implementations • 10 Nov 2022 • Soumya Jahagirdar, Minesh Mathew, Dimosthenis Karatzas, C. V. Jawahar

We demonstrate the limitations of current Scene Text VQA and VideoQA methods and propose ways to incorporate scene text information into VideoQA methods.

Question Answering, Video Question Answering, +1

An empirical study of CTC based models for OCR of Indian languages

no code implementations • 13 May 2022 • Minesh Mathew, CV Jawahar

Recognition of text on word or line images, without the need for sub-word segmentation, has become the mainstream approach in research and development of text recognition for Indian languages.

Optical Character Recognition (OCR), Segmentation, +1

InfographicVQA

no code implementations • 26 Apr 2021 • Minesh Mathew, Viraj Bagal, Rubèn Pérez Tito, Dimosthenis Karatzas, Ernest Valveny, C. V Jawahar

Infographics are documents designed to effectively communicate information using a combination of textual, graphical and visual elements.

Question Answering, Visual Question Answering

Benchmarking Scene Text Recognition in Devanagari, Telugu and Malayalam

no code implementations • 9 Apr 2021 • Minesh Mathew, Mohit Jain, CV Jawahar

The performance is benchmarked on a new IIIT-ILST dataset comprising hundreds of real scene images containing text in the above-mentioned scripts.

Benchmarking, Scene Text Recognition

Document Visual Question Answering Challenge 2020

no code implementations • 20 Aug 2020 • Minesh Mathew, Ruben Tito, Dimosthenis Karatzas, R. Manmatha, C. V. Jawahar

For Task 1, a new dataset is introduced comprising 50,000 question-answer(s) pairs defined over 12,767 document images.

Question Answering, Retrieval, +2

RoadText-1K: Text Detection & Recognition Dataset for Driving Videos

no code implementations • 19 May 2020 • Sangeeth Reddy, Minesh Mathew, Lluis Gomez, Marcal Rusinol, Dimosthenis Karatzas, C. V. Jawahar

State-of-the-art methods for text detection, recognition and tracking are evaluated on the new dataset, and the results highlight the challenges of unconstrained driving videos compared to existing datasets.

Text Detection

ICDAR 2019 Competition on Scene Text Visual Question Answering

no code implementations • 30 Jun 2019 • Ali Furkan Biten, Rubèn Tito, Andres Mafla, Lluis Gomez, Marçal Rusiñol, Minesh Mathew, C. V. Jawahar, Ernest Valveny, Dimosthenis Karatzas

ST-VQA introduces an important aspect that is not addressed by any Visual Question Answering system to date, namely the incorporation of scene text to answer questions asked about an image.

Question Answering, Visual Question Answering

Unconstrained Scene Text and Video Text Recognition for Arabic Script

no code implementations • 7 Nov 2017 • Mohit Jain, Minesh Mathew, C. V. Jawahar

For scripts like Arabic, a major challenge in developing robust recognizers is the lack of a large quantity of annotated data.

Scene Text Recognition
