Search Results for author: Ernest Valveny

Found 14 papers, 2 papers with code

EKTVQA: Generalized use of External Knowledge to empower Scene Text in Text-VQA

no code implementations • 22 Aug 2021 • Arka Ujjal Dey, Ernest Valveny, Gaurav Harit

The open-ended question answering task of Text-VQA often requires reading and reasoning about rarely seen or completely unseen scene-text content of an image.

Optical Character Recognition, Question Answering +2

Document Collection Visual Question Answering

no code implementations • 27 Apr 2021 • Rubèn Tito, Dimosthenis Karatzas, Ernest Valveny

Current tasks and methods in Document Understanding aim to process documents as single elements.

Question Answering, Visual Question Answering

InfographicVQA

no code implementations • 26 Apr 2021 • Minesh Mathew, Viraj Bagal, Rubèn Pérez Tito, Dimosthenis Karatzas, Ernest Valveny, C. V. Jawahar

Infographics are documents designed to effectively communicate information using a combination of textual, graphical and visual elements.

Question Answering, Visual Question Answering +1

Multimodal grid features and cell pointers for Scene Text Visual Question Answering

no code implementations • 1 Jun 2020 • Lluís Gómez, Ali Furkan Biten, Rubèn Tito, Andrés Mafla, Marçal Rusiñol, Ernest Valveny, Dimosthenis Karatzas

This paper presents a new model for the task of scene text visual question answering, in which questions about a given image can only be answered by reading and understanding scene text that is present in it.

Question Answering, Visual Question Answering +1

ICDAR 2019 Competition on Scene Text Visual Question Answering

no code implementations • 30 Jun 2019 • Ali Furkan Biten, Rubèn Tito, Andres Mafla, Lluis Gomez, Marçal Rusiñol, Minesh Mathew, C. V. Jawahar, Ernest Valveny, Dimosthenis Karatzas

ST-VQA introduces an important aspect that is not addressed by any Visual Question Answering system to date, namely the incorporation of scene text to answer questions asked about an image.

Question Answering, Visual Question Answering +1

Don't only Feel Read: Using Scene text to understand advertisements

no code implementations • 21 Jun 2018 • Arka Ujjal Dey, Suman K. Ghosh, Ernest Valveny

We propose a framework for automated classification of Advertisement Images, using not just Visual features but also Textual cues extracted from embedded text.

General Classification

Learning Cross-Modal Deep Embeddings for Multi-Object Image Retrieval using Text and Sketch

no code implementations • 28 Apr 2018 • Sounak Dey, Anjan Dutta, Suman K. Ghosh, Ernest Valveny, Josep Lladós, Umapada Pal

In this work we introduce a cross-modal image retrieval system that allows both text and sketch as input modalities for the query.

Image Retrieval

R-PHOC: Segmentation-Free Word Spotting using CNN

no code implementations • 5 Jul 2017 • Suman Ghosh, Ernest Valveny

This paper proposes a region-based convolutional neural network for segmentation-free word spotting.

Visual attention models for scene text recognition

no code implementations • 5 Jun 2017 • Suman K. Ghosh, Ernest Valveny, Andrew D. Bagdanov

A set of feature vectors is derived from an intermediate convolutional layer corresponding to different areas of the image.

Language Modelling, Scene Text Recognition

Query by String word spotting based on character bi-gram indexing

no code implementations • 28 May 2015 • Suman K. Ghosh, Ernest Valveny

Both the documents and query strings are encoded using a recently proposed word representation that projects images and strings into a common attribute space based on a pyramidal histogram of characters (PHOC).

Re-Ranking
