no code implementations • 5 Dec 2024 • Juan Rodriguez, Xiangru Jian, Siba Smarak Panigrahi, Tianyu Zhang, Aarash Feizi, Abhay Puri, Akshay Kalkunte, François Savard, Ahmed Masry, Shravan Nayak, Rabiul Awal, Mahsa Massoud, Amirhossein Abaskohi, Zichao Li, Suyuchen Wang, Pierre-André Noël, Mats Leon Richter, Saverio Vadacchino, Shubham Agarwal, Sanket Biswas, Sara Shanian, Ying Zhang, Noah Bolger, Kurt MacDonald, Simon Fauvel, Sathwik Tejaswi, Srinivas Sunkara, Joao Monteiro, Krishnamurthy Dj Dvijotham, Torsten Scholak, Nicolas Chapados, Sepideh Kharagani, Sean Hughes, M. Özsu, Siva Reddy, Marco Pedersoli, Yoshua Bengio, Christopher Pal, Issam Laradji, Spandana Gella, Perouz Taslakian, David Vazquez, Sai Rajeswar
We use an efficient data curation process to ensure our data is high-quality and license-permissive.
no code implementations • 3 Oct 2024 • Maxime Talarmain, Carlos Boned, Sanket Biswas, Oriol Ramos
This task is particularly challenging when dealing with unseen classes of ID or travel documents.
1 code implementation • 3 Sep 2024 • Soumitri Chattopadhyay, Sanket Biswas, Emanuele Vivoli, Josep Lladós
Specifically, we propose two novel methods: Generative Class Prompt Learning (GCPL) and Contrastive Multi-class Prompt Learning (CoMPLe).
no code implementations • 27 Aug 2024 • Alloy Das, Sanket Biswas, Umapada Pal, Josep Lladós, Saumik Bhattacharya
The proliferation of scene text in both structured and unstructured environments presents significant challenges in optical character recognition (OCR), necessitating more efficient and robust text spotting solutions.
no code implementations • 12 Jun 2024 • Jordy Van Landeghem, Subhajit Maity, Ayan Banerjee, Matthew Blaschko, Marie-Francine Moens, Josep Lladós, Sanket Biswas
This work explores knowledge distillation (KD) for visually-rich document (VRD) applications such as document layout analysis (DLA) and document image classification (DIC).
no code implementations • 12 Jun 2024 • Sanket Biswas, Rajiv Jain, Vlad I. Morariu, Jiuxiang Gu, Puneet Mathur, Curtis Wigington, Tong Sun, Josep Lladós
While the generation of document layouts has been extensively explored, comprehensive document generation encompassing both layout and content presents a more complex challenge.
1 code implementation • 12 Jun 2024 • Maria Pilligua, Nil Biescas, Javier Vazquez-Corral, Josep Lladós, Ernest Valveny, Sanket Biswas
The rapid evolution of intelligent document processing systems demands robust solutions that adapt to diverse domains without extensive retraining.
no code implementations • 6 May 2024 • Adarsh Tiwari, Sanket Biswas, Josep Lladós
We present SketchGPT, a flexible framework that employs a sequence-to-sequence autoregressive model for sketch generation and completion, along with an interpretation case study for sketch recognition.
1 code implementation • 6 May 2024 • Nil Biescas, Carlos Boned, Josep Lladós, Sanket Biswas
This paper presents GeoContrastNet, a language-agnostic framework for structured document understanding (DU) that integrates a contrastive learning objective with graph attention networks (GATs), emphasizing the significant role of geometric features.
1 code implementation • 17 Feb 2024 • Ayan Banerjee, Sanket Biswas, Josep Lladós, Umapada Pal
Object detection in documents is a key step in automating the identification of structural elements in a digital or scanned document by understanding the hierarchical structure and relationships between different elements.
1 code implementation • 3 Jan 2024 • Carlos Boned, Maxime Talarmain, Nabil Ghanmi, Guillaume Chiron, Sanket Biswas, Ahmad Montaser Awal, Oriol Ramos Terrades
This paper presents a new synthetic dataset of ID and travel documents, called SIDTD.
1 code implementation • 20 Dec 2023 • Pau Torras, Sanket Biswas, Alicia Fornés
Modern-day Optical Music Recognition (OMR) is a fairly fragmented field.
1 code implementation • 2 Oct 2023 • Alloy Das, Sanket Biswas, Ayan Banerjee, Josep Lladós, Umapada Pal, Saumik Bhattacharya
The ability to adapt to a wide range of domains is crucial for scene text spotting models deployed in real-world conditions.
no code implementations • 1 Oct 2023 • Alloy Das, Sanket Biswas, Umapada Pal, Josep Lladós
The capacity to generalize to multiple domains is essential for any autonomous scene text spotting system used in a noisy real-world environment.
no code implementations • 11 Sep 2023 • Souhail Bakkali, Sanket Biswas, Zuheng Ming, Mickaël Coustaty, Marçal Rusiñol, Oriol Ramos Terrades, Josep Lladós
Visual document understanding (VDU) has rapidly advanced with the development of powerful multi-modal language models.
Ranked #20 on Document Image Classification on RVL-CDIP
1 code implementation • 24 Aug 2023 • Jordy Van Landeghem, Sanket Biswas, Matthew B. Blaschko, Marie-Francine Moens
This paper highlights the need to bring document classification benchmarking closer to real-world applications, both in the nature of data tested ($X$: multi-channel, multi-paged, multi-industry; $Y$: class distributions and label set variety) and in classification tasks considered ($f$: multi-page document, page stream, and document bundle classification, ...).
no code implementations • 5 Aug 2023 • Alloy Das, Sanket Biswas, Prasun Roy, Subhankar Ghosh, Umapada Pal, Michael Blumenstein, Josep Lladós, Saumik Bhattacharya
Scene Text Editing (STE) is a challenging research problem that primarily aims to modify existing text in an image while preserving the background and the font style of the original text.
1 code implementation • 8 May 2023 • Ayan Banerjee, Sanket Biswas, Josep Lladós, Umapada Pal
Instance-level segmentation of documents consists in assigning a class-aware and instance-aware label to each pixel of the image.
1 code implementation • 1 May 2023 • Subhajit Maity, Sanket Biswas, Siladittya Manna, Ayan Banerjee, Josep Lladós, Saumik Bhattacharya, Umapada Pal
Document layout analysis is a well-known problem in the document research community and has been extensively explored, yielding a multitude of solutions ranging from text mining and recognition to graph-based representation, visual feature extraction, etc.
no code implementations • 21 Sep 2022 • Giuseppe De Gregorio, Sanket Biswas, Mohamed Ali Souibgui, Asma Bensalah, Josep Lladós, Alicia Fornés, Angelo Marcelli
Despite recent advances in automatic text recognition, the performance remains moderate when it comes to historical manuscripts.
1 code implementation • 23 Aug 2022 • Andrea Gemelli, Sanket Biswas, Enrico Civitelli, Josep Lladós, Simone Marinai
Geometric Deep Learning has recently attracted significant interest in a wide range of machine learning fields, including document analysis.
Ranked #7 on Entity Linking on FUNSD
1 code implementation • 9 Mar 2022 • Mohamed Ali Souibgui, Sanket Biswas, Andres Mafla, Ali Furkan Biten, Alicia Fornés, Yousri Kessentini, Josep Lladós, Lluis Gomez, Dimosthenis Karatzas
In this paper, we propose a Text-Degradation Invariant Auto Encoder (Text-DIAE), a self-supervised model designed to tackle two tasks, text recognition (handwritten or scene-text) and document image enhancement.
1 code implementation • 27 Jan 2022 • Sanket Biswas, Ayan Banerjee, Josep Lladós, Umapada Pal
has emerged as an interesting problem for the document analysis and understanding community.
1 code implementation • 25 Jan 2022 • Mohamed Ali Souibgui, Sanket Biswas, Sana Khamekhem Jemni, Yousri Kessentini, Alicia Fornés, Josep Lladós, Umapada Pal
Document images can be affected by many degradation scenarios, which cause recognition and processing difficulties.
Ranked #1 on Binarization on H-DIBCO 2011
no code implementations • 9 Jul 2021 • Sanket Biswas, Pau Riba, Josep Lladós, Umapada Pal
One of the major prerequisites for any deep learning approach is the availability of large-scale training data.
1 code implementation • 6 Jul 2021 • Sanket Biswas, Pau Riba, Josep Lladós, Umapada Pal
The results highlight that our model can successfully generate realistic and diverse document images with multiple objects.
no code implementations • 21 Jan 2021 • Fiona Abney-McPeek, Sanket Biswas, Senjuti Dutta, Yongyuan Huang, Deyuan Li, Nancy Xu
In this paper, we establish a relationship between Ehrhart-equivalence and other forms of equivalence: the $\operatorname{GL}_n(\mathbb{Z})$-equidecomposability and unimodular equivalence of two integral $n$-polytopes in $\mathbb{R}^n$.
Combinatorics
no code implementations • 24 Oct 2018 • Subhajit Maity, Sujan Sarkar, Avinaba Tapadar, Ayan Dutta, Sanket Biswas, Sayon Nayek, Pritam Saha
With a growing population, the food crisis is worsening day by day. In this time of crisis, leaf disease in crops is the biggest problem in the food industry. In this paper, we address that problem and propose an efficient method to detect leaf disease. Leaf diseases can be detected from sample images of the leaf with the help of image processing and segmentation. Using k-means clustering and Otsu's method, the faulty region in a leaf is detected, which helps determine the proper course of action. Furthermore, calculating the ratio of normal to faulty region would make it possible to predict whether the leaf can be cured at all.
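As a rough illustration of the Otsu step named in the abstract above, the sketch below computes an Otsu threshold on 8-bit grayscale leaf pixels and reports the fraction of the leaf that falls in the darker class. It is not the authors' implementation: the function names `otsu_threshold` and `faulty_fraction`, the `leaf_mask` input, and the assumption that diseased tissue is darker than healthy tissue are all hypothetical choices made for this example, and the k-means step that the abstract also mentions is not shown.

```python
import numpy as np


def otsu_threshold(gray: np.ndarray) -> int:
    """Return the 8-bit level that maximizes between-class variance (Otsu's method)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                  # probability of the "dark" class up to each level
    mu = np.cumsum(prob * np.arange(256))    # cumulative mean up to each level
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    sigma_b = np.zeros(256)
    valid = denom > 0                        # skip levels where one class is empty
    sigma_b[valid] = (mu_total * omega[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(sigma_b))


def faulty_fraction(gray: np.ndarray, leaf_mask: np.ndarray) -> float:
    """Fraction of leaf pixels at or below the Otsu threshold.

    Assumption (not from the source): diseased regions are darker than
    healthy tissue in the grayscale image, and `leaf_mask` is a boolean
    array marking which pixels belong to the leaf.
    """
    leaf_pixels = gray[leaf_mask]
    if leaf_pixels.size == 0:
        raise ValueError("leaf_mask selects no pixels")
    threshold = otsu_threshold(leaf_pixels)
    return float(np.mean(leaf_pixels <= threshold))
```

The threshold is computed on leaf pixels only, so the background does not skew the histogram. Whether a given fraction means the leaf can still be cured is a question the abstract leaves to the paper, so no cutoff is suggested here.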
1 code implementation • 23 Oct 2018 • Navoneel Chakrabarty, Sanket Biswas
The prominent inequality of wealth and income is a huge concern, especially in the United States.