no code implementations • CoNLL (EMNLP) 2021 • Arijit Nag, Bidisha Samanta, Animesh Mukherjee, Niloy Ganguly, Soumen Chakrabarti
Data collection is challenging for Indian languages, because they are syntactically and morphologically diverse, as well as different from resource-rich languages like English.
no code implementations • 13 Dec 2024 • Arijit Nag, Soumen Chakrabarti, Animesh Mukherjee, Niloy Ganguly
On the other hand, continual pre-training (CPT) with large amounts of language-specific data is a costly proposition in terms of data acquisition and computational resources.
1 code implementation • 4 Oct 2024 • Gunjan Balde, Soumyadeep Roy, Mainack Mondal, Niloy Ganguly
Current approaches trivially append the target domain-specific vocabulary at the end of the PLM vocabulary.
1 code implementation • 20 Sep 2024 • Abhilash Nandy, Yash Agarwal, Ashish Patwa, Millon Madhur Das, Aman Bansal, Ankit Raj, Pawan Goyal, Niloy Ganguly
In this paper, we propose the challenging tasks of Satirical Image Detection (detecting whether an image is satirical), Understanding (generating the reason behind the image being satirical), and Completion (given one half of the image, selecting the other half from 2 given options, such that the complete image is satirical) and release a high-quality dataset YesBut, consisting of 2547 images, 1084 satirical and 1463 non-satirical, containing different artistic styles, to evaluate those tasks.
1 code implementation • 13 Aug 2024 • Soumyadeep Roy, Shamik Sural, Niloy Ganguly
Our findings reveal that CM-GEMS outperforms state-of-the-art models (DNABert-2, Nucleotide transformer, DNABert) trained at 120K steps, achieving similar results in just 10K and 1K steps.
1 code implementation • 6 Jun 2024 • Ankan Mullick, Sombit Bose, Rounak Saha, Ayan Kumar Bhowmick, Pawan Goyal, Niloy Ganguly, Prasenjit Dey, Ravi Kokku
However, each persona in a domain has different information requirements, and hence needs a different summary.
1 code implementation • 7 May 2024 • Gunjan Balde, Soumyadeep Roy, Mainack Mondal, Niloy Ganguly
In contrast to existing domain adaptation approaches in summarization, MEDVOC treats vocabulary as an optimizable parameter and optimizes the PLM vocabulary based on fragment score conditioned only on the downstream task's reference summaries.
1 code implementation • 3 May 2024 • Subhendu Khatuya, Koushiki Sinha, Niloy Ganguly, Saptarshi Ghosh, Pawan Goyal
While automatic summarization techniques have made significant advancements, their primary focus has been on summarizing short news articles or documents that have clear structural patterns like scientific articles or government reports.
1 code implementation • 3 May 2024 • Subhendu Khatuya, Rajdeep Mukherjee, Akash Ghosh, Manjunath Hegde, Koustuv Dasgupta, Niloy Ganguly, Saptarshi Ghosh, Pawan Goyal
We study the problem of automatically annotating relevant numerals (GAAP metrics) occurring in the financial documents with their corresponding XBRL tags.
no code implementations • 26 Apr 2024 • Hailay Teklehaymanot, Dren Fazlija, Niloy Ganguly, Gourab K. Patro, Wolfgang Nejdl
The absence of explicitly tailored, accessible annotated datasets for educational purposes presents a notable obstacle for NLP tasks in languages with limited resources. This study initially explores the feasibility of using machine translation (MT) to convert an existing dataset into a Tigrinya dataset in SQuAD format.
1 code implementation • 20 Apr 2024 • Soumyadeep Roy, Aparup Khatua, Fatemeh Ghoochani, Uwe Hadler, Wolfgang Nejdl, Niloy Ganguly
In our annotated dataset, annotators categorize a substantial portion of GPT-4's incorrect responses as a "Reasonable response by GPT-4".
1 code implementation • 6 Apr 2024 • Abhilash Nandy, Yash Kulkarni, Pawan Goyal, Niloy Ganguly
In this paper, we propose sequence-based pretraining methods to enhance procedural understanding in natural language processing.
1 code implementation • 2 Apr 2024 • Soham Poddar, Rajdeep Mukherjee, Subhendu Khatuya, Niloy Ganguly, Saptarshi Ghosh
The debate around vaccines has been going on for decades, but the COVID-19 pandemic showed how crucial it is to understand and mitigate anti-vaccine sentiments.
1 code implementation • 30 Mar 2024 • Akash Ghosh, B Venkata Sahith, Niloy Ganguly, Pawan Goyal, Mayank Singh
Question-answering (QA) on hybrid scientific tabular and textual data deals with scientific information, and relies on complex numerical reasoning.
no code implementations • 8 Mar 2024 • Arijit Nag, Animesh Mukherjee, Niloy Ganguly, Soumen Chakrabarti
As a means to reduce the number of tokens processed by the LLM, we consider code-mixing, translation, and transliteration of LRLs to HRLs.
no code implementations • 26 Feb 2024 • Ankan Mullick, Ayan Kumar Bhowmick, Raghav R, Ravi Kokku, Prasenjit Dey, Pawan Goyal, Niloy Ganguly
Dialog summarization has become increasingly important in managing and comprehending large-scale conversations across various domains.
1 code implementation • 26 Feb 2024 • Debopriyo Banerjee, Krothapalli Sreenivasa Rao, Shamik Sural, Niloy Ganguly
In this paper, we propose a box recommendation framework, BOXREC, which first collects user preferences across different item types (namely, top-wear, bottom-wear and foot-wear), including the price range of each type and a maximum shopping budget for a particular shopping session.
1 code implementation • 22 Oct 2023 • Abhilash Nandy, Manav Nitin Kapadnis, Pawan Goyal, Niloy Ganguly
In this paper, we propose CLMSM, a domain-specific, continual pre-training framework, that learns from a large set of procedural recipes.
1 code implementation • 29 Jul 2023 • Soumyadeep Roy, Jonas Wallat, Sowmya S Sundaram, Wolfgang Nejdl, Niloy Ganguly
Large-scale language models such as DNABert and LOGO aim to learn optimal gene representations and are trained on the entire Human Reference Genome.
1 code implementation • 9 Jun 2023 • Abhilash Nandy, Manav Nitin Kapadnis, Sohan Patnaik, Yash Parag Butala, Pawan Goyal, Niloy Ganguly
In this paper, we propose FastDoc (Fast Continual Pre-training Technique using Document Level Metadata and Taxonomy), a novel, compute-efficient framework that utilizes document metadata and domain-specific taxonomy as supervision signals to continually pre-train a transformer encoder on a domain-specific corpus.
1 code implementation • 9 Jun 2023 • Kishalay Das, Pawan Goyal, Seung-Cheol Lee, Satadeep Bhattacharjee, Niloy Ganguly
In this work, we leverage textual descriptions of materials to model global structural information into graph structure and learn a more robust and enriched representation of crystalline materials.
no code implementations • 6 Jun 2023 • Soumya Sharma, Subhendu Khatuya, Manjunath Hegde, Afreen Shaikh, Koustuv Dasgupta, Pawan Goyal, Niloy Ganguly
The U.S. Securities and Exchange Commission (SEC) mandates all public companies to file periodic financial statements that should contain numerals annotated with a particular label from a taxonomy.
1 code implementation • 6 Jun 2023 • Soumya Sharma, Tapas Nayak, Arusarka Bose, Ajay Kumar Meena, Koustuv Dasgupta, Niloy Ganguly, Pawan Goyal
Relation extraction models trained on a source domain cannot be applied on a different target domain due to the mismatch between relation sets.
2 code implementations • 11 May 2023 • Jingge Xiao, Leonie Basso, Wolfgang Nejdl, Niloy Ganguly, Sandipan Sikdar
Continuous-time models such as Neural ODEs and Neural Flows have shown promising results in analyzing irregularly sampled time series frequently encountered in electronic health records.
1 code implementation • 14 Feb 2023 • Niloy Ganguly, Dren Fazlija, Maryam Badar, Marco Fisichella, Sandipan Sikdar, Johanna Schrader, Jonas Wallat, Koustav Rudra, Manolis Koubarakis, Gourab K. Patro, Wadhah Zai El Amri, Wolfgang Nejdl
This review aims to provide the reader with an overview of causal methods that have been developed to improve the trustworthiness of AI models.
1 code implementation • 14 Jan 2023 • Kishalay Das, Bidisha Samanta, Pawan Goyal, Seung-Cheol Lee, Satadeep Bhattacharjee, Niloy Ganguly
To leverage these untapped data, this paper presents CrysGNN, a new pre-trained GNN framework for crystalline materials, which captures both node and graph level structural information of crystal graphs using a huge amount of unlabelled material data.
no code implementations • 23 Dec 2022 • Parantapa Bhattacharya, Saptarshi Ghosh, Muhammad Bilal Zafar, Soumya K. Ghosh, Niloy Ganguly
With over 500 million tweets posted per day on Twitter, it is difficult for users to discover interesting content from the deluge of uninteresting posts.
1 code implementation • 22 Oct 2022 • Rajdeep Mukherjee, Abhinav Bohra, Akash Banerjee, Soumya Sharma, Manjunath Hegde, Afreen Shaikh, Shivani Shrivastava, Koustuv Dasgupta, Niloy Ganguly, Saptarshi Ghosh, Pawan Goyal
Despite tremendous progress in automatic summarization, state-of-the-art methods are predominantly trained to excel in summarizing short newswire articles, or documents with strong layout biases such as scientific articles or government reports.
1 code implementation • Findings (NAACL) 2022 • Ankan Mullick, Sukannya Purkayastha, Pawan Goyal, Niloy Ganguly
However, the newer intents may not be explicitly announced and need to be inferred dynamically.
1 code implementation • 2 May 2022 • Debopriyo Banerjee, Harsh Maheshwari, Lucky Dhakad, Arnab Bhattacharya, Niloy Ganguly, Muthusamy Chelliah, Suyash Agarwal
Fashion recommendation has witnessed a phenomenal growth of research, particularly in the domains of shop-the-look, context-aware outfit creation, personalizing outfit creation, etc.
1 code implementation • 28 Apr 2022 • Soham Poddar, Azlaan Mustafa Samad, Rajdeep Mukherjee, Niloy Ganguly, Saptarshi Ghosh
This is also the first multi-label classification dataset that provides explanations for each of the labels.
1 code implementation • 26 Apr 2022 • Gourab K. Patro, Prithwish Jana, Abhijnan Chakraborty, Krishna P. Gummadi, Niloy Ganguly
As the efficiency and fairness objectives can be in conflict with each other, we propose a joint optimization framework that allows conference organizers to design schedules that balance (i.e., allow trade-offs) among efficiency, participant fairness and speaker fairness objectives.
1 code implementation • 12 Apr 2022 • Tapas Nayak, Soumya Sharma, Yash Butala, Koustuv Dasgupta, Pawan Goyal, Niloy Ganguly
Causality represents the foremost relation between events in financial documents such as financial news articles and financial reports.
no code implementations • 30 Mar 2022 • Debopriyo Banerjee, Lucky Dhakad, Harsh Maheshwari, Muthusamy Chelliah, Niloy Ganguly, Arnab Bhattacharya
Recommendation in the fashion domain has seen a recent surge in research in various areas, for example, shop-the-look, context-aware outfit creation, personalizing outfit creation, etc.
no code implementations • 5 Jan 2022 • Paramita Koley, Aurghya Maiti, Sourangshu Bhattacharya, Niloy Ganguly
On inspecting, we realize that an overall incentive scheme for the weak team does not incentivize the weaker agents within that team to learn and improve.
1 code implementation • 26 Dec 2021 • Arpita Biswas, Gourab K Patro, Niloy Ganguly, Krishna P. Gummadi, Abhijnan Chakraborty
Many online platforms today (such as Amazon, Netflix, Spotify, LinkedIn, and AirBnB) can be thought of as two-sided markets with producers and customers of goods and services.
1 code implementation • 10 Dec 2021 • Rajdeep Mukherjee, Uppada Vishnu, Hari Chandana Peruri, Sourangshu Bhattacharya, Koustav Rudra, Pawan Goyal, Niloy Ganguly
Occurrences of catastrophes such as natural or man-made disasters trigger the spread of rumours over social media at a rapid pace.
no code implementations • 18 Oct 2021 • Arijit Nag, Bidisha Samanta, Animesh Mukherjee, Niloy Ganguly, Soumen Chakrabarti
Relation classification (sometimes called 'extraction') requires trustworthy datasets for fine-tuning large language models, as well as for evaluation.
1 code implementation • 27 Sep 2021 • Soumyadeep Roy, Sudip Chakraborty, Aishik Mandal, Gunjan Balde, Prakhar Sharma, Anandhavelu Natarajan, Megha Khosla, Shamik Sural, Niloy Ganguly
Online medical forums have become a predominant platform for answering health-related information needs of consumers.
1 code implementation • Findings (EMNLP) 2021 • Abhilash Nandy, Soumya Sharma, Shubham Maddhashiya, Kapil Sachdeva, Pawan Goyal, Niloy Ganguly
Answering questions asked from instructional corpora such as E-manuals, recipe books, etc., has been far less studied than open-domain factoid context-based question answering.
no code implementations • ACL 2021 • Bidisha Samanta, Mohit Agrawal, Niloy Ganguly
In this digital age, online users expect personalized content.
2 code implementations • NAACL 2021 • Ayush Kaushal, Avirup Saha, Niloy Ganguly
The stance detection task aims at detecting the stance of a tweet or a text for a target.
1 code implementation • 9 May 2021 • Rajdeep Mukherjee, Atharva Naik, Sriyash Poddar, Soham Dasgupta, Niloy Ganguly
For the regression task, VADEC, when trained with SenWave, achieves 7.6% and 16.5% gains in Pearson Correlation scores over the current state-of-the-art on the EMOBANK dataset for the Valence (V) and Dominance (D) affect dimensions respectively.
no code implementations • 24 Mar 2021 • Soumi Das, Harikrishna Patibandla, Suparna Bhattacharya, Kshounis Bera, Niloy Ganguly, Sourangshu Bhattacharya
We design a novel convex optimization-based multi-criteria online subset selection algorithm that uses a thresholded concave function of selection variables.
no code implementations • 11 Feb 2021 • Paramita Koley, Avirup Saha, Sourangshu Bhattacharya, Niloy Ganguly, Abir De
The networked opinion diffusion in online social networks (OSN) is often governed by the two genres of opinions - endogenous opinions that are driven by the influence of social contacts among users, and exogenous opinions which are formed by external effects like news, feeds etc.
no code implementations • WS 2014 • Rajkumar Pujari, Swara Desai, Niloy Ganguly, Pawan Goyal
This paper presents a novel two-stage framework to extract opinionated sentences from a given news article.
no code implementations • ICCV 2021 • Soumi Das, Harikrishna Patibandla, Suparna Bhattacharya, Kshounis Bera, Niloy Ganguly, Sourangshu Bhattacharya
Training vision-based Autonomous driving models is a challenging problem with enormous practical implications.
1 code implementation • 19 Nov 2020 • Soumyadeep Roy, Shamik Sural, Niyati Chhaya, Anandhavelu Natarajan, Niloy Ganguly
A consumer-dependent (business-to-consumer) organization tends to present itself as possessing a set of human qualities, which is termed as the brand personality of the company.
no code implementations • 24 Oct 2020 • Gourab K Patro, Abhijnan Chakraborty, Niloy Ganguly, Krishna P. Gummadi
We show that the welfare and fairness objectives can be in conflict with each other, and there is a need to maintain a balance between these objectives while caring for them simultaneously.
no code implementations • 17 Jun 2020 • Bidisha Samanta, Mohit Agarwal, Niloy Ganguly
DE-VAE achieves better control of sentiment as an attribute while preserving the content by learning a suitable lossless transformation network from the disentangled sentiment space to the desired entangled representation.
1 code implementation • 8 Jun 2020 • Rajdeep Mukherjee, Hari Chandana Peruri, Uppada Vishnu, Pawan Goyal, Sourangshu Bhattacharya, Niloy Ganguly
Manually extracting relevant aspects and opinions from large volumes of user-generated text is a time-consuming process.
2 code implementations • 25 Feb 2020 • Gourab K Patro, Arpita Biswas, Niloy Ganguly, Krishna P. Gummadi, Abhijnan Chakraborty
We investigate the problem of fair recommendation in the context of two-sided online platforms, comprising customers on one side and producers on the other.
no code implementations • 6 Nov 2019 • Soumi Das, Rajath Nandan Kalava, Kolli Kiran Kumar, Akhil Kandregula, Kalpam Suhaas, Sourangshu Bhattacharya, Niloy Ganguly
Travel time estimation is a fundamental problem in transportation science with extensive literature.
1 code implementation • 6 Sep 2019 • Abir De, Nastaran Okati, Paramita Koley, Niloy Ganguly, Manuel Gomez-Rodriguez
In this paper, we take a first step towards the development of machine learning models that are optimized to operate under different automation levels.
1 code implementation • IJCNLP 2019 • Soumya Sharma, Bishal Santra, Abhik Jana, T. Y. S. S. Santosh, Niloy Ganguly, Pawan Goyal
Specifically, we experiment with fusing embeddings obtained from a knowledge graph with the state-of-the-art approaches for the NLI task (ESIM model).
1 code implementation • 21 Jun 2019 • Bidisha Samanta, Sharmila Reddy, Hussain Jagirdar, Niloy Ganguly, Soumen Chakrabarti
Code-switching, the interleaving of two or more languages within a sentence or discourse, is pervasive in multilingual societies.
1 code implementation • ACL 2019 • Bidisha Samanta, Niloy Ganguly, Soumen Chakrabarti
Consequently, the best monolingual methods perform relatively poorly on code-switched text.
no code implementations • NAACL 2019 • Santosh Tokala, Vishal G, Avirup Saha, Niloy Ganguly
The recently released FEVER dataset provided benchmark results on a fact-checking task in which, given a factual claim, the system must extract textual evidence (sets of sentences from Wikipedia pages) that supports or refutes the claim.
2 code implementations • 14 Feb 2018 • Bidisha Samanta, Abir De, Gourhari Jana, Pratim Kumar Chattaraj, Niloy Ganguly, Manuel Gomez-Rodriguez
Moreover, in contrast with the state of the art, our decoder is able to provide the spatial coordinates of the atoms of the molecules it generates.
no code implementations • WS 2016 • Ankan Mullick, Pawan Goyal, Niloy Ganguly
This paper proposes a graphical framework to extract opinionated sentences which highlight different contexts within a given news article by introducing the concept of diversity in a graphical model for opinion detection. We conduct extensive evaluations and find that the proposed modification leads to impressive improvement in performance and makes the final results of the model much more usable.
no code implementations • 5 Oct 2016 • Koustav Rudra, Siddhartha Banerjee, Niloy Ganguly, Pawan Goyal, Muhammad Imran, Prasenjit Mitra
The use of microblogging platforms such as Twitter during crises has become widespread.
no code implementations • LREC 2016 • Rafiya Begum, Kalika Bali, Monojit Choudhury, Koustav Rudra, Niloy Ganguly
Code-Switching (CS) between two languages is extremely common in communities with societal multilingualism where speakers switch between two or more languages when interacting with each other.
no code implementations • 17 Oct 2013 • Abir De, Niloy Ganguly, Soumen Chakrabarti
Apart from the new predictor, another contribution is a rigorous protocol for benchmarking and reporting LP algorithms, which reveals the regions of strengths and weaknesses of all the predictors studied here, and establishes the new proposal as the most robust.