no code implementations • CoNLL (EMNLP) 2021 • Arijit Nag, Bidisha Samanta, Animesh Mukherjee, Niloy Ganguly, Soumen Chakrabarti
Data collection is challenging for Indian languages because they are syntactically and morphologically diverse, and differ from resource-rich languages like English.
1 code implementation • 4 Jan 2024 • Rachit Bansal, Bidisha Samanta, Siddharth Dalmia, Nitish Gupta, Shikhar Vashishth, Sriram Ganapathy, Abhishek Bapna, Prateek Jain, Partha Talukdar
Foundational models with billions of parameters, which have been trained on large corpora of data, have demonstrated non-trivial skills in a variety of domains.
1 code implementation • 19 May 2023 • Sebastian Ruder, Jonathan H. Clark, Alexander Gutkin, Mihir Kale, Min Ma, Massimo Nicosia, Shruti Rijhwani, Parker Riley, Jean-Michel A. Sarr, Xinyi Wang, John Wieting, Nitish Gupta, Anna Katanova, Christo Kirov, Dana L. Dickinson, Brian Roark, Bidisha Samanta, Connie Tao, David I. Adelani, Vera Axelrod, Isaac Caswell, Colin Cherry, Dan Garrette, Reeve Ingle, Melvin Johnson, Dmitry Panteleev, Partha Talukdar
We evaluate commonly used models on the benchmark.
1 code implementation • 14 Jan 2023 • Kishalay Das, Bidisha Samanta, Pawan Goyal, Seung-Cheol Lee, Satadeep Bhattacharjee, Niloy Ganguly
To leverage these untapped data, this paper presents CrysGNN, a new pre-trained GNN framework for crystalline materials, which captures both node and graph level structural information of crystal graphs using a huge amount of unlabelled material data.
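As a rough illustration of the node- and graph-level pre-training idea, the sketch below pairs a small message-passing encoder over a crystal graph with two self-supervised heads. It is a minimal plain-PyTorch sketch with an invented toy graph, layer sizes, and head design; it is not the CrysGNN architecture.

```python
import torch
import torch.nn as nn

class CrystalGNNEncoder(nn.Module):
    """Minimal message-passing encoder over a crystal graph, given node
    features `x` (num_atoms x feat_dim) and a dense adjacency `adj`."""
    def __init__(self, feat_dim, hidden_dim, num_layers=3):
        super().__init__()
        self.input_proj = nn.Linear(feat_dim, hidden_dim)
        self.layers = nn.ModuleList(
            nn.Linear(hidden_dim, hidden_dim) for _ in range(num_layers)
        )

    def forward(self, x, adj):
        h = torch.relu(self.input_proj(x))
        for layer in self.layers:
            # Mean aggregation over neighbours, then a residual linear update.
            deg = adj.sum(dim=-1, keepdim=True).clamp(min=1.0)
            h = torch.relu(layer(adj @ h / deg) + h)
        return h  # node embeddings

class PretrainHeads(nn.Module):
    """Illustrative node-level (masked-atom prediction) and graph-level
    (scalar structure proxy) heads; the actual pre-training objectives are
    those described in the paper, not these."""
    def __init__(self, hidden_dim, num_atom_types):
        super().__init__()
        self.node_head = nn.Linear(hidden_dim, num_atom_types)
        self.graph_head = nn.Linear(hidden_dim, 1)

    def forward(self, node_emb):
        graph_emb = node_emb.mean(dim=0)  # simple mean-pool readout
        return self.node_head(node_emb), self.graph_head(graph_emb)

# Toy usage on a random 4-atom "crystal" graph.
x = torch.randn(4, 16)
adj = (torch.rand(4, 4) > 0.5).float()
encoder = CrystalGNNEncoder(feat_dim=16, hidden_dim=32)
heads = PretrainHeads(hidden_dim=32, num_atom_types=10)
node_logits, graph_score = heads(encoder(x, adj))
print(node_logits.shape, graph_score.shape)
```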
no code implementations • 13 Oct 2022 • Abhijeet Awasthi, Nitish Gupta, Bidisha Samanta, Shachi Dave, Sunita Sarawagi, Partha Talukdar
Despite the cross-lingual generalization demonstrated by pre-trained multilingual models, the translate-train paradigm of transferring English datasets across multiple languages remains a key mechanism for training task-specific multilingual models.
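For readers unfamiliar with the recipe, the sketch below shows translate-train in its simplest form: machine-translate an English task dataset into each target language and fine-tune one multilingual model on the union. The `translate` helper is a hypothetical stand-in for a real MT system, not part of any particular library.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, int]  # (sentence, label)

def translate_train(
    english_data: List[Example],
    target_langs: List[str],
    translate: Callable[[str, str], str],  # hypothetical MT helper
) -> List[Example]:
    """Build a multilingual training set by translating English examples."""
    multilingual: List[Example] = list(english_data)  # keep the English originals
    for lang in target_langs:
        for text, label in english_data:
            # Labels are copied as-is; only the input text is translated.
            multilingual.append((translate(text, lang), label))
    return multilingual

# Toy usage with an identity "translator" standing in for a real MT system.
data = [("the movie was great", 1), ("the plot made no sense", 0)]
augmented = translate_train(data, ["hi", "bn"], translate=lambda t, lang: f"[{lang}] {t}")
print(len(augmented))  # 6 examples: 2 English + 2 per target language
```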
no code implementations • 18 Oct 2021 • Arijit Nag, Bidisha Samanta, Animesh Mukherjee, Niloy Ganguly, Soumen Chakrabarti
Relation classification (sometimes called 'extraction') requires trustworthy datasets for fine-tuning large language models, as well as for evaluation.
no code implementations • ACL 2022 • Kalpesh Krishna, Deepak Nathani, Xavier Garcia, Bidisha Samanta, Partha Talukdar
When compared to prior work, our model achieves 2-3x better performance in formality transfer and code-mixing addition across seven languages.
no code implementations • ACL 2021 • Bidisha Samanta, Mohit Agrawal, Niloy Ganguly
In this digital age, online users expect personalized content.
no code implementations • 17 Jun 2020 • Bidisha Samanta, Mohit Agarwal, Niloy Ganguly
DE-VAE achieves better control of sentiment as an attribute while preserving the content by learning a suitable lossless transformation network from the disentangled sentiment space to the desired entangled representation.
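A loose sketch of the transformation-network idea follows: map a disentangled (content, sentiment) pair to a single entangled latent code that a decoder could consume, so that swapping the sentiment code flips the attribute while the content code is preserved. The MLP design, layer sizes, and usage pattern are assumptions for illustration, not the DE-VAE architecture.

```python
import torch
import torch.nn as nn

class SentimentTransform(nn.Module):
    """Illustrative transformation network from a disentangled
    (content, sentiment) pair to an entangled latent code."""
    def __init__(self, content_dim, sentiment_dim, latent_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(content_dim + sentiment_dim, latent_dim),
            nn.Tanh(),
            nn.Linear(latent_dim, latent_dim),
        )

    def forward(self, content_z, sentiment_z):
        return self.net(torch.cat([content_z, sentiment_z], dim=-1))

# Sentiment flipping: keep the content code, swap in a target sentiment code.
transform = SentimentTransform(content_dim=64, sentiment_dim=8, latent_dim=64)
content = torch.randn(1, 64)
z_neg = transform(content, torch.randn(1, 8))  # source sentiment code
z_pos = transform(content, torch.randn(1, 8))  # target sentiment code
print(z_neg.shape, z_pos.shape)
```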
1 code implementation • 21 Jun 2019 • Bidisha Samanta, Sharmila Reddy, Hussain Jagirdar, Niloy Ganguly, Soumen Chakrabarti
Code-switching, the interleaving of two or more languages within a sentence or discourse, is pervasive in multilingual societies.
1 code implementation • ACL 2019 • Bidisha Samanta, Niloy Ganguly, Soumen Chakrabarti
Consequently, the best monolingual methods perform relatively poorly on code-switched text.
2 code implementations • 14 Feb 2018 • Bidisha Samanta, Abir De, Gourhari Jana, Pratim Kumar Chattaraj, Niloy Ganguly, Manuel Gomez-Rodriguez
Moreover, in contrast with the state of the art, our decoder is able to provide the spatial coordinates of the atoms of the molecules it generates.
no code implementations • EMNLP 2017 • Jasabanta Patro, Bidisha Samanta, Saurabh Singh, Abhipsa Basu, Prithwish Mukherjee, Monojit Choudhury, Animesh Mukherjee
Based on this likeliness estimate, we asked annotators to re-annotate the language tags of foreign words in predominantly native contexts.
no code implementations • 15 Mar 2017 • Jasabanta Patro, Bidisha Samanta, Saurabh Singh, Prithwish Mukherjee, Monojit Choudhury, Animesh Mukherjee
We first propose a context-based clustering method to sample a set of candidate words from social media data. Next, we propose three novel and similar metrics based on the usage of these words by users in different tweets; these metrics are used to score and rank the candidate words according to their borrowed likeliness.
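As an illustration of the usage-based scoring step, the sketch below computes one toy metric: the fraction of users who employ a candidate word in their tweets. This is an assumed, simplified metric for illustration only, not one of the paper's three metrics.

```python
from collections import defaultdict
from typing import Dict, List, Set

def user_spread_score(tweets_by_user: Dict[str, List[str]],
                      candidates: Set[str]) -> Dict[str, float]:
    """Rank each candidate word by the fraction of users who use it,
    as a crude proxy for its borrowed likeliness."""
    users_using = defaultdict(set)
    for user, tweets in tweets_by_user.items():
        for tweet in tweets:
            tokens = set(tweet.lower().split())
            for word in candidates & tokens:
                users_using[word].add(user)
    total_users = max(len(tweets_by_user), 1)
    return {w: len(users_using[w]) / total_users for w in candidates}

# Toy usage on two users' code-mixed tweets.
tweets = {
    "u1": ["mera phone kharab ho gaya", "kal school jaana hai"],
    "u2": ["phone mil gaya finally", "movie dekhne chalein"],
}
scores = user_spread_score(tweets, {"phone", "school", "movie"})
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```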