no code implementations • 25 Apr 2022 • Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Tao Xiang, Yi-Zhe Song
We spell out a few insights on the complementarity of each modality for scene understanding, and study for the first time a series of scene-specific applications such as joint sketch- and text-based image retrieval and sketch captioning.
no code implementations • CVPR 2022 • Aneeshan Sain, Ayan Kumar Bhunia, Vaishnav Potlapalli, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song
In this paper, we argue that this setup is by definition incompatible with the inherent abstract and subjective nature of sketches, i.e., the model might transfer well to new categories, but as a result will not understand sketches drawn from a different test-time distribution.
no code implementations • CVPR 2022 • Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Viswanatha Reddy Gajjala, Aneeshan Sain, Tao Xiang, Yi-Zhe Song
We scrutinise an important observation plaguing scene-level sketch research -- that a significant portion of scene sketches are "partial".
1 code implementation • CVPR 2022 • Ayan Kumar Bhunia, Subhadeep Koley, Abdullah Faiz Ur Rahman Khilji, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song
We first conducted a pilot study which revealed that the secret lies in the existence of noisy strokes, and not so much in the "I can't sketch" problem.
no code implementations • 4 Mar 2022 • Pinaki Nath Chowdhury, Aneeshan Sain, Yulia Gryaditskaya, Ayan Kumar Bhunia, Tao Xiang, Yi-Zhe Song
In addition, we propose new solutions enabled by our dataset: (i) we adopt meta-learning to show how the retrieval model can be fine-tuned to a new user style given just a small set of sketches, and (ii) we extend a popular LSTM-based vector sketch encoder to handle sketches of greater complexity than previous work supported.
no code implementations • ICCV 2021 • Yonggang Qi, Guoyao Su, Pinaki Nath Chowdhury, Mingkang Li, Yi-Zhe Song
The key challenge in designing a sketch representation lies in handling the abstract and iconic nature of sketches.
no code implementations • ICCV 2021 • Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Yi-Zhe Song
Our framework is iterative in nature, in that it utilises predicted knowledge of character sequences from a previous iteration, to augment the main network in improving the next prediction.
no code implementations • ICCV 2021 • Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Yi-Zhe Song
In this paper, for the first time, we argue for their unification -- we aim for a single model that can compete favourably with two separate state-of-the-art STR and HTR models.
no code implementations • ICCV 2021 • Ayan Kumar Bhunia, Aneeshan Sain, Amandeep Kumar, Shuvozit Ghose, Pinaki Nath Chowdhury, Yi-Zhe Song
In this paper, we argue that semantic information plays a role complementary to purely visual information.
1 code implementation • CVPR 2021 • Ayan Kumar Bhunia, Shuvozit Ghose, Amandeep Kumar, Pinaki Nath Chowdhury, Aneeshan Sain, Yi-Zhe Song
In this paper, we take a completely different perspective -- we work on the assumption that there is always a new style that is drastically different, and that we will only have very limited data during testing to perform adaptation.
1 code implementation • CVPR 2021 • Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Yongxin Yang, Timothy M. Hospedales, Tao Xiang, Yi-Zhe Song
This data is uniquely characterised by its existence in dual modalities of rasterized images and vector coordinate sequences.
1 code implementation • CVPR 2021 • Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Yongxin Yang, Tao Xiang, Yi-Zhe Song
A fundamental challenge faced by existing Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) models is data scarcity -- model performance is largely bottlenecked by the lack of sketch-photo pairs.
Cross-Modal Retrieval
Semi-Supervised Sketch Based Image Retrieval
1 code implementation • 14 Jul 2020 • Amandeep Kumar, Shuvozit Ghose, Pinaki Nath Chowdhury, Partha Pratim Roy, Umapada Pal
In this paper, we present a novel approach to document image binarization by introducing a three-player min-max adversarial game.
1 code implementation • 17 Apr 2020 • Shuvozit Ghose, Pinaki Nath Chowdhury, Partha Pratim Roy, Umapada Pal
Ground Terrain Recognition is a difficult task as the context information varies significantly over the regions of a ground terrain image.
no code implementations • 1 Jul 2019 • Nibal Nayef, Yash Patel, Michal Busta, Pinaki Nath Chowdhury, Dimosthenis Karatzas, Wafa Khlif, Jiri Matas, Umapada Pal, Jean-Christophe Burie, Cheng-Lin Liu, Jean-Marc Ogier
With the growing cosmopolitan culture of modern cities, the need for robust Multi-Lingual scene Text (MLT) detection and recognition systems has never been greater.