no code implementations • 18 Sep 2024 • Edresson Casanova, Ryan Langman, Paarth Neekhara, Shehzeen Hussain, Jason Li, Subhankar Ghosh, Ante Jukić, Sang-gil Lee
Large language models (LLMs) have significantly advanced audio processing through audio codecs that convert audio into discrete tokens, enabling the application of language modeling techniques to audio data.
no code implementations • 1 Jul 2024 • Subhankar Ghosh, Jayant Gupta, Arun Sharma, Shuai An, Shashi Shekhar
Previously, we proposed a miner \cite{10.1145/3557989.3566158} that finds statistically significant regional colocation patterns.
no code implementations • 29 Jun 2024 • Subhankar Ghosh, Arun Sharma, Jayant Gupta, Shashi Shekhar
Given a collection of Boolean spatial feature types, their instances, a neighborhood relation (e.g., proximity), and a hierarchical taxonomy of the feature types, the goal is to find the subsets of feature types or their parents whose spatial interaction is statistically significant.
no code implementations • 25 Jun 2024 • Paarth Neekhara, Shehzeen Hussain, Subhankar Ghosh, Jason Li, Rafael Valle, Rohan Badlani, Boris Ginsburg
Large Language Model (LLM) based text-to-speech (TTS) systems have demonstrated remarkable capabilities in handling large speech datasets and generating natural speech for new speakers.
no code implementations • 10 Jun 2024 • Yuanjie Shi, Subhankar Ghosh, Taha Belkhouja, Janardhan Rao Doppa, Yan Yan
In contrast to the standard class-conditional CP (CCP) method that uniformly thresholds the class-wise conformity score for each class, the augmented label rank calibration step allows RC3P to selectively iterate this class-wise thresholding subroutine only for a subset of classes whose class-wise top-k error is small.
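The class-wise thresholding that the snippet above attributes to standard CCP can be sketched as follows. This is a minimal illustration of the baseline only, not RC3P and not the authors' code: the function names `class_thresholds` and `prediction_set` are hypothetical, and the nonconformity score `1 - p(class)` is an assumed choice (the snippet does not specify the score).

```python
import math


def class_thresholds(cal_probs, cal_labels, alpha):
    """Return one threshold per class from held-out calibration data.

    Assumption: the nonconformity score of an example is 1 - p(true class).
    For a class with n calibration examples, the threshold is the
    ceil((n + 1) * (1 - alpha))-th smallest score, or +inf when that rank
    exceeds n (too few examples to certify the requested coverage).
    """
    scores_by_class = {}
    for probs, label in zip(cal_probs, cal_labels):
        scores_by_class.setdefault(label, []).append(1.0 - probs[label])

    thresholds = {}
    for label, scores in scores_by_class.items():
        scores.sort()
        n = len(scores)
        rank = math.ceil((n + 1) * (1 - alpha))  # 1-indexed rank
        thresholds[label] = scores[rank - 1] if rank <= n else float("inf")
    return thresholds


def prediction_set(probs, thresholds):
    """Keep every class whose score falls at or below its own threshold."""
    inf = float("inf")
    return [c for c, p in enumerate(probs) if 1.0 - p <= thresholds.get(c, inf)]


# Toy calibration set: four examples per class, two classes.
cal_probs = [
    [0.9, 0.1], [0.8, 0.2], [0.6, 0.4], [0.7, 0.3],  # true label 0
    [0.2, 0.8], [0.5, 0.5], [0.4, 0.6], [0.1, 0.9],  # true label 1
]
cal_labels = [0, 0, 0, 0, 1, 1, 1, 1]
thresholds = class_thresholds(cal_probs, cal_labels, alpha=0.5)
```

Every class receives its own threshold here, which is the uniform treatment the snippet contrasts with RC3P's selective thresholding over a subset of classes.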
no code implementations • 19 Oct 2023 • Subhankar Ghosh, Shuai An, Arun Sharma, Jayant Gupta, Shashi Shekhar, Aneesh Subramanian
Given multi-model ensemble climate projections, the goal is to accurately and reliably predict future sea-level rise while lowering the uncertainty.
1 code implementation • 13 Oct 2023 • Zhehuai Chen, He Huang, Andrei Andrusenko, Oleksii Hrinchuk, Krishna C. Puvvada, Jason Li, Subhankar Ghosh, Jagadeesh Balam, Boris Ginsburg
We present a novel Speech Augmented Language Model (SALM) with multitask and in-context learning capabilities.
no code implementations • 5 Aug 2023 • Alloy Das, Prasun Roy, Saumik Bhattacharya, Subhankar Ghosh, Umapada Pal, Michael Blumenstein
However, most of the existing STE methods show inferior editing performance because of (1) complex image backgrounds, (2) various font styles, and (3) varying word lengths within the text.
no code implementations • 31 Jul 2023 • Subhankar Ghosh, Yuanjie Shi, Taha Belkhouja, Yan Yan, Jana Doppa, Brian Jones
We propose a novel adaptive PRCP (aPRCP) algorithm to achieve probabilistically robust coverage.
no code implementations • 24 Apr 2023 • Subhankar Ghosh, Saumik Bhattacharya, Prasun Roy, Umapada Pal, Michael Blumenstein
Handling various objects with different colors is a significant challenge for image colorization techniques.
1 code implementation • 19 Mar 2023 • Subhankar Ghosh, Taha Belkhouja, Yan Yan, Janardhan Rao Doppa
Safe deployment of deep neural networks in high-stakes real-world applications requires theoretically sound uncertainty quantification.
no code implementations • 14 Mar 2023 • Rohan Badlani, Akshit Arora, Subhankar Ghosh, Rafael Valle, Kevin J. Shih, João Felipe Santos, Boris Ginsburg, Bryan Catanzaro
We introduce VANI, a very lightweight multi-lingual accent controllable speech synthesis system.
1 code implementation • 14 Mar 2023 • Prasun Roy, Subhankar Ghosh, Umapada Pal
Air-writing refers to virtually writing linguistic characters through hand gestures in three-dimensional space with six degrees of freedom.
no code implementations • 28 Feb 2023 • Prasun Roy, Saumik Bhattacharya, Subhankar Ghosh, Umapada Pal, Michael Blumenstein
The proposed strategy enables us to synthesize semantically coherent realistic persons that can blend into an existing scene without altering the global context.
1 code implementation • 1 Nov 2022 • Cheng-Ping Hsieh, Subhankar Ghosh, Boris Ginsburg
In the proposed approach, a few small adapter modules are added to the original network.
no code implementations • 4 Aug 2022 • Subhankar Ghosh, Prasun Roy, Saumik Bhattacharya, Umapada Pal, Michael Blumenstein
Image colorization is a well-known problem in computer vision.
no code implementations • 24 Jul 2022 • Prasun Roy, Subhankar Ghosh, Saumik Bhattacharya, Umapada Pal, Michael Blumenstein
In computer vision, human pose synthesis and transfer deal with probabilistic image generation of a person in a previously unseen pose from an already available observation of that person.
no code implementations • 6 Jun 2022 • Prasun Roy, Subhankar Ghosh, Saumik Bhattacharya, Umapada Pal, Michael Blumenstein
Finally, the target image is generated from the refined skeleton using another generative network conditioned on a given image of the target person.
1 code implementation • 14 Feb 2022 • Prasun Roy, Saumik Bhattacharya, Subhankar Ghosh, Umapada Pal
Pose transfer refers to the probabilistic image generation of a person with a previously unseen novel pose from another image of that person having a different pose.
1 code implementation • 17 May 2021 • Subhankar Ghosh
We propose a hybrid continual learning model, better suited to real-world scenarios, that combines a task-invariant shared variational autoencoder with T task-specific variational autoencoders.
1 code implementation • 26 Apr 2021 • Subhankar Ghosh
Continual zero-shot learning (CZSL) is a new domain in which a model learns sequentially and classifies objects it has not seen during training.
1 code implementation • 7 Feb 2021 • Subhankar Ghosh
We propose a continual zero-shot learning model (A-CZSL), better suited to real-world scenarios, that learns sequentially and distinguishes classes the model has not seen during training.
1 code implementation • CVPR 2020 • Prasun Roy, Saumik Bhattacharya, Subhankar Ghosh, Umapada Pal
In this paper, we propose a method to modify text in an image at the character level.
2 code implementations • 26 Jul 2018 • Prasun Roy, Subhankar Ghosh, Saumik Bhattacharya, Umapada Pal
Deep convolutional neural networks (CNNs) have massively influenced recent advances in large-scale image classification.
no code implementations • 9 Jul 2014 • SRM Prasanna, Rituparna Devi, Deepjoy Das, Subhankar Ghosh, Krishna Naik
The work describes the development of an Online Assamese Stroke & Akshara Recognizer based on a set of language rules.