Search Results for author: Josep Lladós

Found 28 papers, 13 papers with code

SVGCraft: Beyond Single Object Text-to-SVG Synthesis with Comprehensive Canvas Layout

no code implementations • 30 Mar 2024 • Ayan Banerjee, Nityanand Mathur, Josep Lladós, Umapada Pal, Anjan Dutta

This work introduces SVGCraft, a novel end-to-end framework for the creation of vector graphics depicting entire scenes from textual descriptions.

Vector Graphics

GraphKD: Exploring Knowledge Distillation Towards Document Object Detection with Structured Graph Creation

1 code implementation • 17 Feb 2024 • Ayan Banerjee, Sanket Biswas, Josep Lladós, Umapada Pal

Object detection in documents is a key step in automating the identification of structural elements in a digital or scanned document by understanding the hierarchical structure and relationships between different elements.

Knowledge Distillation object-detection +1

Harnessing the Power of Multi-Lingual Datasets for Pre-training: Towards Enhancing Text Spotting Performance

1 code implementation • 2 Oct 2023 • Alloy Das, Sanket Biswas, Ayan Banerjee, Josep Lladós, Umapada Pal, Saumik Bhattacharya

The adaptation capability to a wide range of domains is crucial for scene text spotting models when deployed to real-world conditions.

Scene Text Detection Text Detection +1

Diving into the Depths of Spotting Text in Multi-Domain Noisy Scenes

no code implementations • 1 Oct 2023 • Alloy Das, Sanket Biswas, Umapada Pal, Josep Lladós

When used in a real-world noisy environment, the capacity to generalize to multiple domains is essential for any autonomous scene text spotting system.

Super-Resolution Text Spotting

TransferDoc: A Self-Supervised Transferable Document Representation Learning Model Unifying Vision and Language

no code implementations • 11 Sep 2023 • Souhail Bakkali, Sanket Biswas, Zuheng Ming, Mickael Coustaty, Marçal Rusiñol, Oriol Ramos Terrades, Josep Lladós

The field of visual document understanding has witnessed a rapid growth in emerging challenges and powerful multi-modal strategies.

Ranked #19 on Document Image Classification on RVL-CDIP

Document Image Classification document understanding +1

SwinDocSegmenter: An End-to-End Unified Domain Adaptive Transformer for Document Instance Segmentation

1 code implementation • 8 May 2023 • Ayan Banerjee, Sanket Biswas, Josep Lladós, Umapada Pal

Instance-level segmentation of documents consists in assigning a class-aware and instance-aware label to each pixel of the image.

Instance Segmentation Segmentation +1

SelfDocSeg: A Self-Supervised vision-based Approach towards Document Segmentation

1 code implementation • 1 May 2023 • Subhajit Maity, Sanket Biswas, Siladittya Manna, Ayan Banerjee, Josep Lladós, Saumik Bhattacharya, Umapada Pal

Document layout analysis is a well-known problem in the document research community and has been extensively explored, yielding a multitude of solutions ranging from text mining and recognition to graph-based representation and visual feature extraction.

Document Layout Analysis object-detection +1

Easing Automatic Neurorehabilitation via Classification and Smoothness Analysis

no code implementations • 9 Dec 2022 • Asma Bensalah, Alicia Fornés, Cristina Carmona-Duarte, Josep Lladós

Assessing the quality of movements for post-stroke patients during the rehabilitation phase is vital given that there is no standard stroke rehabilitation plan for all the patients.

Classification

The RPM3D project: 3D Kinematics for Remote Patient Monitoring

no code implementations • 9 Dec 2022 • Alicia Fornés, Asma Bensalah, Cristina Carmona-Duarte, Jialuo Chen, Miguel A. Ferrer, Andreas Fischer, Josep Lladós, Cristina Martín, Eloy Opisso, Réjean Plamondon, Anna Scius-Bertrand, Josep Maria Tormos

This project explores the feasibility of remote patient monitoring based on the analysis of 3D movements captured with smartwatches.

Towards Stroke Patients' Upper-limb Automatic Motor Assessment Using Smartwatches

no code implementations • 9 Dec 2022 • Asma Bensalah, Jialuo Chen, Alicia Fornés, Cristina Carmona-Duarte, Josep Lladós, Miguel A. Ferrer

Assessing the physical condition in rehabilitation scenarios is a challenging problem, since it involves Human Activity Recognition (HAR) and kinematic analysis methods.

Human Activity Recognition

A Few Shot Multi-Representation Approach for N-gram Spotting in Historical Manuscripts

no code implementations • 21 Sep 2022 • Giuseppe De Gregorio, Sanket Biswas, Mohamed Ali Souibgui, Asma Bensalah, Josep Lladós, Alicia Fornés, Angelo Marcelli

Despite recent advances in automatic text recognition, the performance remains moderate when it comes to historical manuscripts.

Few-Shot Learning Handwritten Text Recognition +3

Doc2Graph: a Task Agnostic Document Understanding Framework based on Graph Neural Networks

1 code implementation • 23 Aug 2022 • Andrea Gemelli, Sanket Biswas, Enrico Civitelli, Josep Lladós, Simone Marinai

Geometric Deep Learning has recently attracted significant interest in a wide range of machine learning fields, including document analysis.

Ranked #5 on Entity Linking on FUNSD

Document Layout Analysis document understanding +4

A Generic Image Retrieval Method for Date Estimation of Historical Document Collections

no code implementations • 8 Apr 2022 • Adrià Molina, Lluis Gomez, Oriol Ramos Terrades, Josep Lladós

Date estimation of historical document images is a challenging problem, with several contributions in the literature that lack the ability to generalize from one dataset to others.

Image Retrieval Retrieval

Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement

1 code implementation • 9 Mar 2022 • Mohamed Ali Souibgui, Sanket Biswas, Andres Mafla, Ali Furkan Biten, Alicia Fornés, Yousri Kessentini, Josep Lladós, Lluis Gomez, Dimosthenis Karatzas

In this paper, we propose a Text-Degradation Invariant Auto Encoder (Text-DIAE), a self-supervised model designed to tackle two tasks, text recognition (handwritten or scene-text) and document image enhancement.

Document Enhancement Scene Text Recognition

DocSegTr: An Instance-Level End-to-End Document Image Segmentation Transformer

1 code implementation • 27 Jan 2022 • Sanket Biswas, Ayan Banerjee, Josep Lladós, Umapada Pal

Instance-level segmentation of different document objects has emerged as an interesting problem for the document analysis and understanding community.

Decision Making Document Layout Analysis +4

DocEnTr: An End-to-End Document Image Enhancement Transformer

1 code implementation • 25 Jan 2022 • Mohamed Ali Souibgui, Sanket Biswas, Sana Khamekhem Jemni, Yousri Kessentini, Alicia Fornés, Josep Lladós, Umapada Pal

Document images can be affected by many degradation scenarios, which cause recognition and processing difficulties.

Ranked #1 on Binarization on H-DIBCO 2011

Binarization Image Enhancement

Graph-based Deep Generative Modelling for Document Layout Generation

no code implementations • 9 Jul 2021 • Sanket Biswas, Pau Riba, Josep Lladós, Umapada Pal

One of the major prerequisites for any deep learning approach is the availability of large-scale training data.

DocSynth: A Layout Guided Approach for Controllable Document Image Synthesis

1 code implementation • 6 Jul 2021 • Sanket Biswas, Pau Riba, Josep Lladós, Umapada Pal

The results highlight that our model can successfully generate realistic and diverse document images with multiple objects.

Document Layout Analysis Image Generation

Date Estimation in the Wild of Scanned Historical Photos: An Image Retrieval Approach

1 code implementation • 10 Jun 2021 • Adrià Molina, Pau Riba, Lluis Gomez, Oriol Ramos-Terrades, Josep Lladós

This paper presents a novel method for date estimation of historical photographs from archival sources.

Image Retrieval Retrieval

Learning to Rank Words: Optimizing Ranking Metrics for Word Spotting

1 code implementation • 9 Jun 2021 • Pau Riba, Adrià Molina, Lluis Gomez, Oriol Ramos-Terrades, Josep Lladós

In this paper, we explore and evaluate the use of ranking-based objective functions for simultaneously learning a word string encoder and a word image encoder.

Learning-To-Rank Retrieval

One-shot Compositional Data Generation for Low Resource Handwritten Text Recognition

no code implementations • 11 May 2021 • Mohamed Ali Souibgui, Ali Furkan Biten, Sounak Dey, Alicia Fornés, Yousri Kessentini, Lluis Gomez, Dimosthenis Karatzas, Josep Lladós

Low resource Handwritten Text Recognition (HTR) is a hard problem due to the scarce annotated data and the very limited linguistic information (dictionaries and language models).

Handwritten Text Recognition HTR

Learning Graph Edit Distance by Graph Neural Networks

no code implementations • 17 Aug 2020 • Pau Riba, Andreas Fischer, Josep Lladós, Alicia Fornés

The emergence of geometric deep learning as a novel framework to deal with graph-based representations has displaced traditional approaches in favor of completely new methodologies.

Graph Similarity Keyword Spotting +2

A Neural Model for Text Localization, Transcription and Named Entity Recognition in Full Pages

2 code implementations • 20 Dec 2019 • Manuel Carbonell, Alicia Fornés, Mauricio Villegas, Josep Lladós

In this work, we propose an end-to-end model that combines a one-stage object detection network with branches for the recognition of text and named entities, respectively, so that shared features can be learned simultaneously from the training error of each task.

named-entity-recognition Named Entity Recognition +4

Hierarchical stochastic graphlet embedding for graph-based pattern recognition

1 code implementation • 8 Jul 2018 • Anjan Dutta, Pau Riba, Josep Lladós, Alicia Fornés

Graph embedding, which maps graphs to a vectorial space, has been proposed as a way to tackle these difficulties enabling the use of standard machine learning techniques.

BIG-bench Machine Learning Clustering +1

Learning Cross-Modal Deep Embeddings for Multi-Object Image Retrieval using Text and Sketch

no code implementations • 28 Apr 2018 • Sounak Dey, Anjan Dutta, Suman K. Ghosh, Ernest Valveny, Josep Lladós, Umapada Pal

In this work we introduce a cross modal image retrieval system that allows both text and sketch as input modalities for the query.

Image Retrieval Retrieval

Joint Recognition of Handwritten Text and Named Entities with a Neural End-to-end Model

no code implementations • 16 Mar 2018 • Manuel Carbonell, Mauricio Villegas, Alicia Fornés, Josep Lladós

When extracting information from handwritten documents, text transcription and named entity recognition are usually treated as separate, sequential tasks.

Language Modelling named-entity-recognition +3

e-Counterfeit: a mobile-server platform for document counterfeit detection

no code implementations • 21 Aug 2017 • Albert Berenguel, Oriol Ramos Terrades, Josep Lladós, Cristina Cañero

This paper presents a novel application to detect counterfeit identity documents forged by a scan-printing operation.

Texture Classification

Product Graph-based Higher Order Contextual Similarities for Inexact Subgraph Matching

no code implementations • 1 Feb 2017 • Anjan Dutta, Josep Lladós, Horst Bunke, Umapada Pal

Many algorithms formulate graph matching as an optimization of an objective function of pairwise quantification of nodes and edges of two graphs to be matched.

Graph Matching
