Search Results for author: Subhankar Ghosh

Found 25 papers, 10 papers with code

Low Frame-rate Speech Codec: a Codec Designed for Fast High-quality Speech LLM Training and Inference

no code implementations 18 Sep 2024 Edresson Casanova, Ryan Langman, Paarth Neekhara, Shehzeen Hussain, Jason Li, Subhankar Ghosh, Ante Jukić, Sang-gil Lee

Large language models (LLMs) have significantly advanced audio processing through audio codecs that convert audio into discrete tokens, enabling the application of language modeling techniques to audio data.

Audio Compression, Language Modelling, +2

Reducing False Discoveries in Statistically-Significant Regional-Colocation Mining: A Summary of Results

no code implementations 1 Jul 2024 Subhankar Ghosh, Jayant Gupta, Arun Sharma, Shuai An, Shashi Shekhar

Previously, we proposed a miner \cite{10.1145/3557989.3566158} that finds statistically significant regional colocation patterns.


Towards Statistically Significant Taxonomy Aware Co-location Pattern Detection

no code implementations 29 Jun 2024 Subhankar Ghosh, Arun Sharma, Jayant Gupta, Shashi Shekhar

Given a collection of Boolean spatial feature types, their instances, a neighborhood relation (e.g., proximity), and a hierarchical taxonomy of the feature types, the goal is to find the subsets of feature types or their parents whose spatial interaction is statistically significant.

Improving Robustness of LLM-based Speech Synthesis by Learning Monotonic Alignment

no code implementations 25 Jun 2024 Paarth Neekhara, Shehzeen Hussain, Subhankar Ghosh, Jason Li, Rafael Valle, Rohan Badlani, Boris Ginsburg

Large Language Model (LLM) based text-to-speech (TTS) systems have demonstrated remarkable capabilities in handling large speech datasets and generating natural speech for new speakers.

Decoder, Language Modelling, +3

Conformal Prediction for Class-wise Coverage via Augmented Label Rank Calibration

no code implementations 10 Jun 2024 Yuanjie Shi, Subhankar Ghosh, Taha Belkhouja, Janardhan Rao Doppa, Yan Yan

In contrast to the standard class-conditional CP (CCP) method that uniformly thresholds the class-wise conformity score for each class, the augmented label rank calibration step allows RC3P to selectively iterate this class-wise thresholding subroutine only for a subset of classes whose class-wise top-k error is small.

Conformal Prediction, imbalanced classification, +2

Reducing Uncertainty in Sea-level Rise Prediction: A Spatial-variability-aware Approach

no code implementations 19 Oct 2023 Subhankar Ghosh, Shuai An, Arun Sharma, Jayant Gupta, Shashi Shekhar, Aneesh Subramanian

Given multi-model ensemble climate projections, the goal is to accurately and reliably predict future sea-level rise while lowering the uncertainty.


FAST: Font-Agnostic Scene Text Editing

no code implementations 5 Aug 2023 Alloy Das, Prasun Roy, Saumik Bhattacharya, Subhankar Ghosh, Umapada Pal, Michael Blumenstein

However, most of the existing STE methods show inferior editing performance because of (1) complex image backgrounds, (2) various font styles, and (3) varying word lengths within the text.

Scene Text Editing, Style Transfer, +1

Probabilistically robust conformal prediction

no code implementations 31 Jul 2023 Subhankar Ghosh, Yuanjie Shi, Taha Belkhouja, Yan Yan, Jana Doppa, Brian Jones

We propose a novel adaptive PRCP (aPRCP) algorithm to achieve probabilistically robust coverage.

Conformal Prediction

A CNN Based Framework for Unistroke Numeral Recognition in Air-Writing

1 code implementation 14 Mar 2023 Prasun Roy, Subhankar Ghosh, Umapada Pal

Air-writing refers to virtually writing linguistic characters through hand gestures in three-dimensional space with six degrees of freedom.

Segmentation, Transfer Learning

Global Context-Aware Person Image Generation

no code implementations 28 Feb 2023 Prasun Roy, Saumik Bhattacharya, Subhankar Ghosh, Umapada Pal, Michael Blumenstein

The proposed strategy enables us to synthesize semantically coherent realistic persons that can blend into an existing scene without altering the global context.

Image Generation

TIPS: Text-Induced Pose Synthesis

no code implementations 24 Jul 2022 Prasun Roy, Subhankar Ghosh, Saumik Bhattacharya, Umapada Pal, Michael Blumenstein

In computer vision, human pose synthesis and transfer deal with probabilistic image generation of a person in a previously unseen pose from an already available observation of that person.

Descriptive, Pose Transfer

Scene Aware Person Image Generation through Global Contextual Conditioning

no code implementations 6 Jun 2022 Prasun Roy, Subhankar Ghosh, Saumik Bhattacharya, Umapada Pal, Michael Blumenstein

Finally, the target image is generated from the refined skeleton using another generative network conditioned on a given image of the target person.

Generative Adversarial Network, Image Generation

Multi-scale Attention Guided Pose Transfer

1 code implementation 14 Feb 2022 Prasun Roy, Saumik Bhattacharya, Subhankar Ghosh, Umapada Pal

Pose transfer refers to the probabilistic image generation of a person with a previously unseen novel pose from another image of that person having a different pose.

Decoder, Pose Transfer

Shared and Private VAEs with Generative Replay for Continual Learning

1 code implementation 17 May 2021 Subhankar Ghosh

We propose a hybrid continual learning model, more suitable for real-world scenarios, that has a task-invariant shared variational autoencoder and T task-specific variational autoencoders.

Continual Learning, Permuted-MNIST

Dynamic VAEs with Generative Replay for Continual Zero-shot Learning

1 code implementation 26 Apr 2021 Subhankar Ghosh

Continual zero-shot learning (CZSL) is a new domain in which a model sequentially learns to classify objects it has not seen during training.

Continual Learning, Zero-Shot Learning

Adversarial Training of Variational Auto-encoders for Continual Zero-shot Learning (A-CZSL)

1 code implementation 7 Feb 2021 Subhankar Ghosh

We propose a continual zero-shot learning model (A-CZSL), more suitable for real-world scenarios, that can learn sequentially and distinguish classes it has not seen during training.

Continual Learning, Generalized Zero-Shot Learning

Effects of Degradations on Deep Neural Network Architectures

2 code implementations 26 Jul 2018 Prasun Roy, Subhankar Ghosh, Saumik Bhattacharya, Umapada Pal

Deep convolutional neural networks (CNN) have massively influenced recent advances in large-scale image classification.

General Classification, Image Classification

Online Stroke and Akshara Recognition GUI in Assamese Language Using Hidden Markov Model

no code implementations 9 Jul 2014 SRM Prasanna, Rituparna Devi, Deepjoy Das, Subhankar Ghosh, Krishna Naik

The work describes the development of an Online Assamese Stroke & Akshara Recognizer based on a set of language rules.

