1 code implementation • 23 Mar 2025 • Suman Adhya, Avishek Lahiri, Debarshi Kumar Sanyal, Partha Pratim Das
Negative sampling has emerged as an effective technique that enables deep learning models to learn better representations by introducing the paradigm of learn-to-compare.
no code implementations • 26 Feb 2025 • Tohida Rehman, Soumabha Ghosh, Kuntal Das, Souvik Bhattacharjee, Debarshi Kumar Sanyal, Samiran Chattopadhyay
Text summarization plays a crucial role in natural language processing by condensing large volumes of text into concise and coherent summaries.
no code implementations • 26 Jan 2025 • Tohida Rehman, Debarshi Kumar Sanyal, Samiran Chattopadhyay
Artificial intelligence systems significantly impact the environment, particularly in natural language processing (NLP) tasks.
1 code implementation • 22 Sep 2024 • Tohida Rehman, Debarshi Kumar Sanyal, Samiran Chattopadhyay
In this paper, we fine-tune pre-trained language models to generate titles of papers from their abstracts.
no code implementations • 28 Apr 2024 • Tohida Rehman, Raghubir Bose, Samiran Chattopadhyay, Debarshi Kumar Sanyal
Financial sentiment analysis allows financial institutions like banks and insurance companies to better manage the credit scoring of their customers.
1 code implementation • 2 Apr 2024 • Suman Adhya, Debarshi Kumar Sanyal
Topic modeling is a widely used approach for analyzing and exploring large document collections.
2 code implementations • 28 Nov 2023 • Soumya Banerjee, Debarshi Kumar Sanyal, Samiran Chattopadhyay, Plaban Kumar Bhowmick, Partha Pratim Das
Digital libraries often face the challenge of processing a large volume of diverse document types.
Document Image Classification
1 code implementation • 28 Sep 2023 • Tohida Rehman, Ronit Mandal, Abhishek Agarwal, Debarshi Kumar Sanyal
We have used the following metrics to measure factual consistency at the entity level: precision-source and F1-target.
1 code implementation • 25 Apr 2023 • Avishek Lahiri, Debarshi Kumar Sanyal, Imon Mukherjee
For the ACL-ARC dataset, we report a 53.86% F1 score for the zero-shot setting, which improves to 63.61% and 66.99% for the 5-shot and 10-shot settings, respectively.
Ranked #2 on Citation Intent Classification on ACL-ARC
1 code implementation • PoliticalNLP (LREC) 2022 • Suman Adhya, Debarshi Kumar Sanyal
The TCPD-IPD dataset is a collection of questions and answers discussed in the Lower House of the Parliament of India during the Question Hour between 1999 and 2019.
1 code implementation • 28 Mar 2023 • Suman Adhya, Avishek Lahiri, Debarshi Kumar Sanyal
Dropout is a widely used regularization trick to resolve the overfitting issue in large feedforward neural networks trained on a small dataset, which otherwise perform poorly on the held-out test subset.
2 code implementations • 27 Mar 2023 • Suman Adhya, Debarshi Kumar Sanyal
Topic modeling is a dominant method for exploring document collections on the web and in digital libraries.
2 code implementations • 27 Mar 2023 • Suman Adhya, Avishek Lahiri, Debarshi Kumar Sanyal, Partha Pratim Das
Topic modeling has emerged as a dominant method for exploring large document collections.
no code implementations • 25 Feb 2023 • Tohida Rehman, Suchandan Das, Debarshi Kumar Sanyal, Samiran Chattopadhyay
People nowadays use search engines like Google, Yahoo, and Bing to find information on the Internet.
no code implementations • sdp (COLING) 2022 • Tohida Rehman, Debarshi Kumar Sanyal, Prasenjit Majumder, Samiran Chattopadhyay
We investigate whether the use of named entity recognition on the input improves the quality of the generated highlights.
no code implementations • 25 Feb 2023 • Tohida Rehman, Suchandan Das, Debarshi Kumar Sanyal, Samiran Chattopadhyay
Indeed, automatic text summarization has emerged as an important application of machine learning in text processing.
1 code implementation • 14 Feb 2023 • Tohida Rehman, Debarshi Kumar Sanyal, Samiran Chattopadhyay, Plaban Kumar Bhowmick, Partha Pratim Das
On the new MixSub dataset, where only the abstract is the input, our proposed model (when trained on the whole training corpus without distinguishing between the subject categories) achieves ROUGE-1, ROUGE-2 and ROUGE-L F1-scores of 31.78, 9.76 and 29.3, respectively, METEOR score of 24.00, and BERTScore F1 of 85.25.
no code implementations • 25 Apr 2022 • Prantika Chakraborty, Sudakshina Dutta, Debarshi Kumar Sanyal
Maintaining research-related information in an organized manner can be challenging for a researcher.
1 code implementation • Extraction and Evaluation of Knowledge Entities from Scientific Documents 2021 • T Y S S Santosh, Prantika Chakraborty, Sudakshina Dutta, Debarshi Kumar Sanyal, Partha Pratim Das
Scientific articles contain various types of domain-specific entities and relations between them.
Ranked #3 on Joint Entity and Relation Extraction on SciERC
no code implementations • COLING 2020 • T.y.s.s Santosh, Debarshi Kumar Sanyal, Plaban Kumar Bhowmick, Partha Pratim Das
Keyphrases in a research paper succinctly capture the primary content of the paper and also assist in indexing the paper at a concept level.
1 code implementation • 11 May 2020 • Soumya Banerjee, Debarshi Kumar Sanyal, Samiran Chattopadhyay, Plaban Kumar Bhowmick, Parthapratim Das
In the biomedical literature, it is customary to structure an abstract into discourse categories like BACKGROUND, OBJECTIVE, METHOD, RESULT, and CONCLUSION, but this segmentation is uncommon in other fields like computer science.