no code implementations • 30 Mar 2024 • Ayan Banerjee, Nityanand Mathur, Josep Lladós, Umapada Pal, Anjan Dutta
In response, this work introduces SVGCraft, a novel end-to-end framework for the creation of vector graphics depicting entire scenes from textual descriptions.
1 code implementation • 17 Feb 2024 • Ayan Banerjee, Sanket Biswas, Josep Lladós, Umapada Pal
Object detection in documents is a key step to automate the structural elements identification process in a digital or scanned document through understanding the hierarchical structure and relationships between different elements.
no code implementations • 30 Jan 2024 • Miguel A. Ferrer, Sukalpa Chanda, Moises Diaz, Chayan Kr. Banerjee, Anirban Majumdar, Cristina Carmona-Duarte, Parikshit Acharya, Umapada Pal
This paper aims to adapt this scheme for the generation of synthetic signatures in two Indic scripts, Bengali (Bangla) and Devanagari (Hindi).
1 code implementation • 6 Dec 2023 • Risab Biswas, Swalpa Kumar Roy, Umapada Pal
Instead of using a simple ViT and hard splitting of images for the document image enhancement task, we employed a progressive tokenization technique to capture this local information from an image to achieve more effective results.
no code implementations • 6 Dec 2023 • Risab Biswas, Swalpa Kumar Roy, Ning Wang, Umapada Pal, Guang-Bin Huang
Instead of using a simple vision transformer block to extract information from the image patches, the proposed architecture uses two transformer blocks for greater coverage of the extracted feature space on a global and local scale.
1 code implementation • 2 Oct 2023 • Alloy Das, Sanket Biswas, Ayan Banerjee, Josep Lladós, Umapada Pal, Saumik Bhattacharya
The adaptation capability to a wide range of domains is crucial for scene text spotting models when deployed to real-world conditions.
no code implementations • 1 Oct 2023 • Alloy Das, Sanket Biswas, Umapada Pal, Josep Lladós
When used in a real-world noisy environment, the capacity to generalize to multiple domains is essential for any autonomous scene text spotting system.
no code implementations • 5 Aug 2023 • Alloy Das, Prasun Roy, Saumik Bhattacharya, Subhankar Ghosh, Umapada Pal, Michael Blumenstein
However, most of the existing STE methods show inferior editing performance because of (1) complex image backgrounds, (2) various font styles, and (3) varying word lengths within the text.
no code implementations • 2 Aug 2023 • Siladittya Manna, Soumitri Chattopadhyay, Rakesh Dey, Saumik Bhattacharya, Umapada Pal
We propose a cosine similarity-dependent temperature scaling function to effectively optimize the distribution of the samples in the feature space.
no code implementations • 8 May 2023 • Jiajun Wei, Hongjian Zhan, Xiao Tu, Yue Lu, Umapada Pal
Inspired by ITC, the SITM network combines the visual features and the text features of all candidates to identify the candidate with the minimum distance in the feature space.
1 code implementation • 8 May 2023 • Ayan Banerjee, Sanket Biswas, Josep Lladós, Umapada Pal
Instance-level segmentation of documents consists in assigning a class-aware and instance-aware label to each pixel of the image.
1 code implementation • 1 May 2023 • Subhajit Maity, Sanket Biswas, Siladittya Manna, Ayan Banerjee, Josep Lladós, Saumik Bhattacharya, Umapada Pal
Document layout analysis is a well-known problem in the document research community and has been extensively explored, yielding a multitude of solutions ranging from text mining and recognition to graph-based representation, visual feature extraction, etc.
no code implementations • 24 Apr 2023 • Subhankar Ghosh, Saumik Bhattacharya, Prasun Roy, Umapada Pal, Michael Blumenstein
Handling various objects with different colors is a significant challenge for image colorization techniques.
no code implementations • 10 Apr 2023 • Weijia Wu, Yuzhong Zhao, Zhuang Li, Jiahong Li, Mike Zheng Shou, Umapada Pal, Dimosthenis Karatzas, Xiang Bai
In this competition report, we establish a video text reading benchmark, DSText, which focuses on dense and small text reading challenges in the video with various scenarios.
1 code implementation • 14 Mar 2023 • Prasun Roy, Subhankar Ghosh, Umapada Pal
Air-writing refers to virtually writing linguistic characters through hand gestures in three-dimensional space with six degrees of freedom.
no code implementations • 28 Feb 2023 • Prasun Roy, Saumik Bhattacharya, Subhankar Ghosh, Umapada Pal, Michael Blumenstein
The proposed strategy enables us to synthesize semantically coherent realistic persons that can blend into an existing scene without altering the global context.
1 code implementation • Preprint 2023 • Risab Biswas, Swalpa Kumar Roy, Umapada Pal
Instead of using a simple ViT and hard splitting of images for the document image enhancement task, we employed a progressive tokenization technique to capture this local information from an image for achieving more effective results.
Ranked #1 on Binarization on H-DIBCO 2012
no code implementations • 4 Aug 2022 • Subhankar Ghosh, Prasun Roy, Saumik Bhattacharya, Umapada Pal, Michael Blumenstein
Image colorization is a well-known problem in computer vision.
no code implementations • 24 Jul 2022 • Prasun Roy, Subhankar Ghosh, Saumik Bhattacharya, Umapada Pal, Michael Blumenstein
In computer vision, human pose synthesis and transfer deal with probabilistic image generation of a person in a previously unseen pose from an already available observation of that person.
no code implementations • 21 Jul 2022 • Dajian Zhong, Shujing Lyu, Palaiahnakote Shivakumara, Bing Yin, Jiajia Wu, Umapada Pal, Yue Lu
For target images (scene text images), the Semantic Generator Module generates simple semantic features that share the same feature distribution with support images (clear text images).
no code implementations • 6 Jun 2022 • Prasun Roy, Subhankar Ghosh, Saumik Bhattacharya, Umapada Pal, Michael Blumenstein
Finally, the target image is generated from the refined skeleton using another generative network conditioned on a given image of the target person.
no code implementations • 26 Feb 2022 • Siladittya Manna, Soumitri Chattopadhyay, Saumik Bhattacharya, Umapada Pal
Writer independent offline signature verification is one of the most challenging tasks in pattern recognition as there is often a scarcity of training data.
1 code implementation • 14 Feb 2022 • Prasun Roy, Saumik Bhattacharya, Subhankar Ghosh, Umapada Pal
Pose transfer refers to the probabilistic image generation of a person with a previously unseen novel pose from another image of that person having a different pose.
1 code implementation • 27 Jan 2022 • Sanket Biswas, Ayan Banerjee, Josep Lladós, Umapada Pal
has emerged as an interesting problem for the document analysis and understanding community.
1 code implementation • 25 Jan 2022 • Soumitri Chattopadhyay, Siladittya Manna, Saumik Bhattacharya, Umapada Pal
This results in robust discriminative learning of the embedding space.
1 code implementation • 25 Jan 2022 • Mohamed Ali Souibgui, Sanket Biswas, Sana Khamekhem Jemni, Yousri Kessentini, Alicia Fornés, Josep Lladós, Umapada Pal
Document images can be affected by many degradation scenarios, which cause recognition and processing difficulties.
Ranked #1 on Binarization on H-DIBCO 2011
no code implementations • 24 Nov 2021 • Siladittya Manna, Umapada Pal, Saumik Bhattacharya
After 200 epochs of pre-training with ResNet-18 as the backbone, the proposed model achieves an accuracy of 86.2%, 58.18%, 77.49%, and 30.87% on the CIFAR-10, CIFAR-100, STL-10, and Tiny-ImageNet datasets, respectively, and surpasses the SOTA contrastive baseline by 1.23%, 3.57%, 2.00%, and 0.33%, respectively.
1 code implementation • 20 Nov 2021 • Abhishek Srivastava, Sukalpa Chanda, Debesh Jha, Umapada Pal, Sharib Ali
The repeated fusion operations gated by CMSA and MSFS demonstrate improved generalizability of the network.
Ranked #13 on Medical Image Segmentation on Kvasir-SEG
no code implementations • 20 Nov 2021 • Abhishek Srivastava, Sukalpa Chanda, Umapada Pal
The performance of facial super-resolution methods relies on their ability to recover facial structures and salient features effectively.
1 code implementation • 20 Nov 2021 • Abhishek Srivastava, Sukalpa Chanda, Umapada Pal
Our methods are based on the hypothesis that handwritten text images have specific spatial regions which are more unique to a writer's style, that multi-scale features propagate characteristic features with respect to individual writers, and that patch-based features give more general and robust representations that help to discriminate handwriting from different writers.
no code implementations • 20 Nov 2021 • Abhishek Srivastava, Sukalpa Chanda, Debesh Jha, Michael A. Riegler, Pål Halvorsen, Dag Johansen, Umapada Pal
We develop progressive alternating attention dense (PAAD) blocks, which construct a guiding attention map (GAM) after every convolutional layer in the dense blocks using features from all scales.
no code implementations • NeurIPS Workshop ICBINB 2021 • Bhavya Vasudeva, Puneesh Deora, Saumik Bhattacharya, Umapada Pal, Sukalpa Chanda
Deep metric learning (ML) uses a carefully designed loss function to learn distance metrics for improving the discriminatory ability for tasks like clustering and retrieval.
1 code implementation • ICCV 2021 • Bhavya Vasudeva, Puneesh Deora, Saumik Bhattacharya, Umapada Pal, Sukalpa Chanda
Deep metric learning has been effectively used to learn distance metrics for different visual tasks like image retrieval, clustering, etc.
no code implementations • 9 Jul 2021 • Sanket Biswas, Pau Riba, Josep Lladós, Umapada Pal
One of the major prerequisites for any deep learning approach is the availability of large-scale training data.
1 code implementation • 6 Jul 2021 • Sanket Biswas, Pau Riba, Josep Lladós, Umapada Pal
The results highlight that our model can successfully generate realistic and diverse document images with multiple objects.
1 code implementation • 16 May 2021 • Abhishek Srivastava, Debesh Jha, Sukalpa Chanda, Umapada Pal, Håvard D. Johansen, Dag Johansen, Michael A. Riegler, Sharib Ali, Pål Halvorsen
The proposed MSRF-Net captures object variabilities and provides improved results on different biomedical datasets.
Ranked #3 on Medical Image Segmentation on 2018 Data Science Bowl
1 code implementation • 6 May 2021 • Dipayan Das, Saumik Bhattacharya, Umapada Pal, Sukalpa Chanda
Reservoir Computing (RC) offers a viable option to deploy AI algorithms on low-end embedded system platforms.
2 code implementations • 21 Apr 2021 • Siladittya Manna, Saumik Bhattacharya, Umapada Pal
The downstream task in our paper is a class imbalanced multi-label classification.
Ranked #2 on Multi-Label Classification on MRNet
1 code implementation • 10 Nov 2020 • Andrey Ignatov, Radu Timofte, Zhilu Zhang, Ming Liu, Haolin Wang, WangMeng Zuo, Jiawei Zhang, Ruimao Zhang, Zhanglin Peng, Sijie Ren, Linhui Dai, Xiaohong Liu, Chengqi Li, Jun Chen, Yuichi Ito, Bhavya Vasudeva, Puneesh Deora, Umapada Pal, Zhenyu Guo, Yu Zhu, Tian Liang, Chenghua Li, Cong Leng, Zhihong Pan, Baopu Li, Byung-Hoon Kim, Joonyoung Song, Jong Chul Ye, JaeHyun Baek, Magauiya Zhussip, Yeskendir Koishekenov, Hwechul Cho Ye, Xin Liu, Xueying Hu, Jun Jiang, Jinwei Gu, Kai Li, Pengliang Tan, Bingxin Hou
This paper reviews the second AIM learned ISP challenge and provides the description of the proposed solutions and results.
1 code implementation • 23 Oct 2020 • Prasun Roy, Saumik Bhattacharya, Partha Pratim Roy, Umapada Pal
Sign language is a gesture-based symbolic communication medium among speech and hearing impaired people.
1 code implementation • 15 Jul 2020 • Siladittya Manna, Saumik Bhattacharya, Umapada Pal
In this paper, we propose a self-supervised learning approach to learn transferable features from MR video clips by enforcing the model to learn anatomical features.
1 code implementation • 14 Jul 2020 • Amandeep Kumar, Shuvozit Ghose, Pinaki Nath Chowdhury, Partha Pratim Roy, Umapada Pal
In this paper, we present a novel approach towards document image binarization by introducing a three-player min-max adversarial game.
Ranked #2 on Binarization on DIBCO 2011
no code implementations • 26 May 2020 • Sauradip Nag, Palaiahnakote Shivakumara, Umapada Pal, Tong Lu, Michael Blumenstein
The proposed method fuses gradient magnitude and direction coherence of text pixels in a new way for detecting candidate regions.
1 code implementation • 17 Apr 2020 • Shuvozit Ghose, Pinaki Nath Chowdhury, Partha Pratim Roy, Umapada Pal
Ground Terrain Recognition is a difficult task as the context information varies significantly over the regions of a ground terrain image.
no code implementations • 4 Oct 2019 • Zied Selmi, Mohamed Ben Halima, Umapada Pal, M. Adel Alimi
For this, we present in this paper an automatic framework for License Plate (LP) detection and recognition from complex scenes.
no code implementations • 1 Jul 2019 • Nibal Nayef, Yash Patel, Michal Busta, Pinaki Nath Chowdhury, Dimosthenis Karatzas, Wafa Khlif, Jiri Matas, Umapada Pal, Jean-Christophe Burie, Cheng-Lin Liu, Jean-Marc Ogier
With the growing cosmopolitan culture of modern cities, the need for robust Multi-Lingual scene Text (MLT) detection and recognition systems has never been more immense.
no code implementations • 3 May 2019 • Tapabrata Chakraborti, Brendan McCane, Steven Mills, Umapada Pal
We present a simple, effective way of achieving this by learning a generic Mahalanobis distance in a collaborative loss function in an end-to-end fashion with any standard convolutional network as the feature learner.
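As a rough illustration of the distance named in this snippet, the sketch below computes a Mahalanobis distance parameterized by a matrix `L`, so that the metric matrix is `M = L^T L`. It does not reproduce the paper's collaborative loss or its training procedure; the function name `mahalanobis` and the choice to pass `L` as a plain list of rows are assumptions made only for this example.

```python
import math

def mahalanobis(x, y, L):
    """Return ||L (x - y)||_2, which equals sqrt((x-y)^T M (x-y)) with M = L^T L.

    `L` is a hypothetical transform given as a list of rows; in a learned
    setting its entries would be trainable parameters.
    """
    diff = [a - b for a, b in zip(x, y)]
    # Apply the linear map L to the difference vector.
    projected = [sum(w * d for w, d in zip(row, diff)) for row in L]
    return math.sqrt(sum(p * p for p in projected))
```

With the identity matrix as `L` this reduces to the ordinary Euclidean distance, which is a convenient sanity check.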
no code implementations • 21 Mar 2019 • Tapabrata Chakraborti, Brendan McCane, Steven Mills, Umapada Pal
We present a conditional probabilistic framework for collaborative representation of image patches.
1 code implementation • CVPR 2020 • Prasun Roy, Saumik Bhattacharya, Subhankar Ghosh, Umapada Pal
In this paper, we propose a method to modify text in an image at the character level.
no code implementations • 28 Jan 2019 • Tapabrata Chakraborti, Brendan McCane, Steven Mills, Umapada Pal
We present an end-to-end deep network for fine-grained visual categorization called Collaborative Convolutional Network (CoCoNet).
2 code implementations • 4 Nov 2018 • Ayan Kumar Bhunia, Ankan Kumar Bhunia, Shuvozit Ghose, Abhirup Das, Partha Pratim Roy, Umapada Pal
Logo detection in real-world scene images is an important problem with applications in advertisement and marketing.
2 code implementations • 26 Jul 2018 • Prasun Roy, Subhankar Ghosh, Saumik Bhattacharya, Umapada Pal
Deep convolutional neural networks (CNN) have massively influenced recent advances in large-scale image classification.
no code implementations • 18 Jul 2018 • Ranju Mandal, Partha Pratim Roy, Umapada Pal, Michael Blumenstein
Finally, three distance measures were used to match a query signature with the signature present in target documents for retrieval.
no code implementations • 19 Jun 2018 • Sauradip Nag, Palaiahnakote Shivakumara, Wu Yirui, Umapada Pal, Tong Lu
For each line segment, the proposed method estimates the angle and length, which give a point in the polar domain.
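The mapping from a line segment to a polar-domain point can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: the function name `segment_to_polar`, the endpoint-based input, and the choice to fold the angle into the range [0, pi) so that a segment's direction of traversal does not matter are all assumptions made for this example.

```python
import math

def segment_to_polar(x1, y1, x2, y2):
    """Map a line segment to a (length, angle) point in the polar domain."""
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    # Fold the orientation into [0, pi): a segment is treated as undirected.
    angle = math.atan2(dy, dx) % math.pi
    return length, angle
```

For example, a segment from (0, 0) to (3, 4) maps to a length of 5 and an angle of `atan2(4, 3)` radians.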
no code implementations • 28 Apr 2018 • Sounak Dey, Anjan Dutta, Suman K. Ghosh, Ernest Valveny, Josep Lladós, Umapada Pal
In this work we introduce a cross modal image retrieval system that allows both text and sketch as input modalities for the query.
no code implementations • 23 Feb 2018 • Ayan Kumar Bhunia, Subham Mukherjee, Aneeshan Sain, Ankan Kumar Bhunia, Partha Pratim Roy, Umapada Pal
In this paper, we propose a novel approach to word-level Indic script identification using only character-level data in the training stage.
no code implementations • 22 Jan 2018 • Aishik Konwer, Ayan Kumar Bhunia, Abir Bhowmick, Ankan Kumar Bhunia, Prithaj Banerjee, Partha Pratim Roy, Umapada Pal
Staff line removal is a crucial pre-processing step in Optical Music Recognition.
no code implementations • 22 Jan 2018 • Ayan Kumar Bhunia, Abir Bhowmick, Ankan Kumar Bhunia, Aishik Konwer, Prithaj Banerjee, Partha Pratim Roy, Umapada Pal
Our encoder module consists of a Convolutional LSTM network, which takes an offline character image as the input and encodes the feature sequence to a hidden representation.
no code implementations • 22 Jan 2018 • Ankan Kumar Bhunia, Ayan Kumar Bhunia, Prithaj Banerjee, Aishik Konwer, Abir Bhowmick, Partha Pratim Roy, Umapada Pal
We employ a novel convolutional recurrent model architecture in the Generator that efficiently deals with the word images of arbitrary width.
1 code implementation • 1 Jan 2018 • Ankan Kumar Bhunia, Aishik Konwer, Ayan Kumar Bhunia, Abir Bhowmick, Partha P. Roy, Umapada Pal
In this paper, we propose a novel method that involves extraction of local and global features using a CNN-LSTM framework and weighting them dynamically for script identification.
no code implementations • 19 Dec 2017 • Ayan Kumar Bhunia, Partha Pratim Roy, Akash Mohta, Umapada Pal
This paper presents a novel cross language platform for handwritten word recognition and spotting for such low-resource scripts where training is performed with a sufficiently large dataset of an available script (considered as source script) and testing is done on other scripts (considered as target script).
no code implementations • 5 Dec 2017 • Ayan Kumar Bhunia, Partha Pratim Roy, Umapada Pal
Also, we propose a novel feature combining foreground and background information of text line images for keyword-spotting by character filler models.
no code implementations • 25 Oct 2017 • Tapabrata Chakraborti, Brendan McCane, Steven Mills, Umapada Pal
This letter introduces the LOOP binary descriptor (local optimal oriented pattern) that encodes rotation invariance into the main formulation itself.
no code implementations • 18 Aug 2017 • Partha Pratim Roy, Ayan Kumar Bhunia, Avirup Bhattacharyya, Umapada Pal
To evaluate the proposed system for searching keywords in natural scene images and video frames, we have considered two popular Indic scripts, Bangla (Bengali) and Devanagari, along with English.
no code implementations • 1 Aug 2017 • Partha Pratim Roy, Ayan Kumar Bhunia, Ayan Das, Prasenjit Dey, Umapada Pal
To avoid character segmentation in such scripts, HMM-based sequence modeling has been used earlier in a holistic way.
no code implementations • 22 Jul 2017 • Aneeshan Sain, Ayan Kumar Bhunia, Partha Pratim Roy, Umapada Pal
Until now only a few methods have been proposed that look into curved text detection in video frames, wherein lies our novelty.
no code implementations • 21 Jul 2017 • Partha Pratim Roy, Ayan Kumar Bhunia, Umapada Pal
A novel Factor Analysis based feature selection technique is applied to sliding window features to reduce the noise arising from staff lines, which proves beneficial for writer identification performance. In our framework, we have also proposed a novel score line detection approach for musical sheets using HMM.
no code implementations • 21 Jul 2017 • Partha Pratim Roy, Ayan Kumar Bhunia, Umapada Pal
We propose a line-based date spotting approach using a Hidden Markov Model (HMM), which is used to detect the date information in a given text.
no code implementations • 21 Jul 2017 • Ayan Kumar Bhunia, Gautam Kumar, Partha Pratim Roy, R. Balasubramanian, Umapada Pal
In this paper, we present a novel approach based on color channel selection for text recognition from scene images and video frames.
5 code implementations • 7 Jul 2017 • Sounak Dey, Anjan Dutta, J. Ignacio Toledo, Suman K. Ghosh, Josep Llados, Umapada Pal
Offline signature verification is one of the most challenging tasks in biometrics and document forensics.
Ranked #1 on Handwriting Verification on CEDAR Signature
no code implementations • 1 Feb 2017 • Anjan Dutta, Josep Lladós, Horst Bunke, Umapada Pal
Many algorithms formulate graph matching as an optimization of an objective function of pairwise quantification of nodes and edges of two graphs to be matched.
no code implementations • 21 Apr 2016 • Sounak Dey, Anguelos Nicolaou, Josep Llados, Umapada Pal
Word spotting is an important recognition task in historical document analysis.
no code implementations • 20 Apr 2016 • Sounak Dey, Anguelos Nicolaou, Josep Llados, Umapada Pal
Digital libraries store images which can be highly degraded, and to index such images we resort to word spotting as our information retrieval system.