no code implementations • 19 Dec 2023 • Shraman Pramanick, Guangxing Han, Rui Hou, Sayan Nag, Ser-Nam Lim, Nicolas Ballas, Qifan Wang, Rama Chellappa, Amjad Almahairi
In this work, we introduce VistaLLM, a powerful visual system that addresses coarse- and fine-grained VL tasks over single and multiple input images using a unified framework.
no code implementations • 4 Dec 2023 • Sanjoy Chowdhury, Sayan Nag, Dinesh Manocha
Our method is designed to substantially improve the generalization capabilities of VLP models when they are fine-tuned in a few-shot setting.
1 code implementation • ICCV 2023 • Shraman Pramanick, Yale Song, Sayan Nag, Kevin Qinghong Lin, Hardik Shah, Mike Zheng Shou, Rama Chellappa, Pengchuan Zhang
Video-language pre-training (VLP) has become increasingly important due to its ability to generalize to various vision and language tasks.
no code implementations • 5 Jun 2023 • Ahana Deb, Sayan Nag, Ayan Mahapatra, Soumitri Chattopadhyay, Aritra Marik, Pijush Kanti Gayen, Shankha Sanyal, Archi Banerjee, Samir Karmakar
Spoken languages often utilise intonation, rhythm, intensity, and structure, to communicate intention, which can be interpreted differently depending on the rhythm of speech of their utterance.
no code implementations • 9 Mar 2023 • Soumitri Chattopadhyay, Soham Ganguly, Sreejit Chaudhury, Sayan Nag, Samiran Chattopadhyay
In this paper, we seek to tackle these concerns head-on and systematically explore the applicability of non-contrastive self-supervised learning (SSL) algorithms under federated learning (FL) simulations for medical image analysis.
1 code implementation • 3 Mar 2023 • Soumitri Chattopadhyay, Soham Ganguly, Sreejit Chaudhury, Sayan Nag, Samiran Chattopadhyay
The success of self-supervised learning (SSL) has mostly been attributed to the availability of unlabeled yet large-scale datasets.
1 code implementation • 26 Oct 2022 • Hritam Basak, Soumitri Chattopadhyay, Rohit Kundu, Sayan Nag, Rammohan Mallipeddi
To this end, we extend the concept of metric learning to the segmentation task, using a dense (dis)similarity learning for pre-training a deep encoder network, and employing a semi-supervised paradigm to fine-tune for the downstream task.
1 code implementation • 9 Oct 2022 • Shraman Pramanick, Li Jing, Sayan Nag, Jiachen Zhu, Hardik Shah, Yann Lecun, Rama Chellappa
Extensive experiments on a wide range of vision- and vision-language downstream tasks demonstrate the effectiveness of VoLTA on fine-grained applications without compromising the coarse-grained downstream performance, often outperforming methods using significantly more caption and box annotations.
1 code implementation • 9 Sep 2021 • Mayukh Bhattacharyya, Sayan Nag, Udita Ghosh
Air pollution poses a serious threat to sustainable environmental conditions in the 21st century.
no code implementations • 21 Aug 2021 • Sayan Nag, Mayukh Bhattacharyya
Activation functions play a pivotal role in determining the training dynamics and neural network performance.
1 code implementation • 25 May 2021 • Sayan Nag
Self-supervised learning and pre-training strategies have developed over the last few years, especially for Convolutional Neural Networks (CNNs).
no code implementations • 11 Feb 2021 • Sayan Nag, Uddalok Sarkar, Shankha Sanyal, Archi Banerjee, Souparno Roy, Samir Karmakar, Ranjan Sengupta, Dipak Ghosh
It is already known that both auditory and visual stimuli are able to convey emotions in the human mind to different extents.
no code implementations • 11 Feb 2021 • Uddalok Sarkar, Sayan Nag, Chirayata Bhattacharya, Shankha Sanyal, Archi Banerjee, Ranjan Sengupta, Dipak Ghosh
In this work we have modelled our articulation system using nonlinear multifractal analysis.
no code implementations • 1 Feb 2021 • Uddalok Sarkar, Sayan Nag, Medha Basu, Archi Banerjee, Shankha Sanyal, Ranjan Sengupta, Dipak Ghosh
Music is often considered as the language of emotions.
no code implementations • 14 Jan 2021 • Sayan Nag
Experimental manipulations perturb the neuronal activity.
no code implementations • 3 Dec 2020 • Sayan Nag
The recently introduced Lookahead optimizer significantly enhances the performance of Adam as well as SGD.
no code implementations • 15 Apr 2020 • Uddalok Sarkar, Soumyadeep Pal, Sayan Nag, Chirayata Bhattacharya, Shankha Sanyal, Archi Banerjee, Ranjan Sengupta, Dipak Ghosh
The study of Bengali speech recognition and speaker identification is scarce in the literature.
1 code implementation • 23 Nov 2019 • Mayukh Bhattacharyya, Sayan Nag
These low level style features are captured to a large extent in techniques used in neural style transfer.
Ranked #1 on Complimentary Image Retrieval on iMaterialist
no code implementations • 28 Nov 2017 • Sayan Nag
Image Registration is the process of aligning two or more images of the same scene with reference to a particular image.
1 code implementation • 1 Nov 2017 • Uddalok Sarkar, Sayan Nag
In this paper, a metaheuristic approach for solving the N-Queens Problem is introduced to find the best possible solution in a reasonable amount of time.
no code implementations • 15 Oct 2017 • Sayan Nag
Linde-Buzo-Gray (LBG) is a traditional method of generating a VQ codebook, which results in a lower PSNR value.
no code implementations • 23 Aug 2017 • Sayan Nag
Thresholding-based image segmentation using fuzzy entropy, combined with intelligent optimization approaches, is a commonly used direct method for identifying the thresholds needed to segment an image accurately.
no code implementations • 4 Aug 2017 • Sayan Nag
Optimization problems in design engineering are complex by nature, often because of the involvement of critical objective functions accompanied by a number of rigid constraints associated with the products involved.