Search Results for author: Swapnil Bhosale

Found 8 papers, 0 papers with code

Unsupervised Audio-Visual Segmentation with Modality Alignment

no code implementations • 21 Mar 2024 • Swapnil Bhosale, Haosen Yang, Diptesh Kanojia, Jiankang Deng, Xiatian Zhu

Audio-Visual Segmentation (AVS) aims to identify, at the pixel level, the object in a visual scene that produces a given sound.

Contrastive Learning

Sarcasm in Sight and Sound: Benchmarking and Expansion to Improve Multimodal Sarcasm Detection

no code implementations • 29 Sep 2023 • Swapnil Bhosale, Abhra Chaudhuri, Alex Lee Robert Williams, Divyank Tiwari, Anjan Dutta, Xiatian Zhu, Pushpak Bhattacharyya, Diptesh Kanojia

The introduction of the MUStARD dataset, and its emotion recognition extension MUStARD++, has identified sarcasm as a multi-modal phenomenon -- expressed not only in natural language text, but also through manners of speech (like tonality and intonation) and visual cues (facial expression).

Benchmarking, Emotion Recognition +1

Leveraging Foundation models for Unsupervised Audio-Visual Segmentation

no code implementations • 13 Sep 2023 • Swapnil Bhosale, Haosen Yang, Diptesh Kanojia, Xiatian Zhu

Particularly, in situations where existing supervised AVS methods struggle with overlapping foreground objects, our models still excel in accurately segmenting overlapped auditory objects.

Segmentation

DiffSED: Sound Event Detection with Denoising Diffusion

no code implementations • 14 Aug 2023 • Swapnil Bhosale, Sauradip Nag, Diptesh Kanojia, Jiankang Deng, Xiatian Zhu

In this work, we reformulate the SED problem by taking a generative learning perspective.

Decoder, Denoising +2

Text-to-Audio Grounding Based Novel Metric for Evaluating Audio Caption Similarity

no code implementations • 3 Oct 2022 • Swapnil Bhosale, Rupayan Chakraborty, Sunil Kumar Kopparapu

Automatic Audio Captioning (AAC) refers to the task of translating an audio sample into a natural language (NL) text that describes the audio events, source of the events and their relationships.

Audio captioning, Image Captioning +2

Automatic Audio Captioning using Attention weighted Event based Embeddings

no code implementations • 28 Jan 2022 • Swapnil Bhosale, Rupayan Chakraborty, Sunil Kumar Kopparapu

Automatic Audio Captioning (AAC) refers to the task of translating audio into a natural language that describes the audio events, source of the events and their relationships.

Audio captioning, Decoder +2

Automatic Speaker Independent Dysarthric Speech Intelligibility Assessment System

no code implementations • 10 Mar 2021 • Ayush Tripathi, Swapnil Bhosale, Sunil Kumar Kopparapu

Dysarthria is a condition which hampers the ability of an individual to control the muscles that play a major role in speech delivery.

Semi Supervised Learning For Few-shot Audio Classification By Episodic Triplet Mining

no code implementations • 16 Feb 2021 • Swapnil Bhosale, Rupayan Chakraborty, Sunil Kumar Kopparapu

In this paper, we propose to replace the typical prototypical loss function with an Episodic Triplet Mining (ETM) technique.

Audio Classification, Event Detection +4
