1 code implementation • COLING 2022 • Oishik Chatterjee, Isha Pandey, Aashish Waikar, Vishwajeet Kumar, Ganesh Ramakrishnan
In order to address this challenge of equation annotation, we propose a weakly supervised model for solving MWPs by requiring only the final answer as supervision.
1 code implementation • 13 Mar 2024 • H S V N S Kowndinya Renduchintala, Sumit Bhatia, Ganesh Ramakrishnan
Instruction Tuning involves finetuning a language model on a collection of instruction-formatted datasets in order to enhance the generalizability of the model to unseen tasks.
no code implementations • 7 Mar 2024 • Ojas Gramopadhye, Saeel Sandeep Nachane, Prateek Chanda, Ganesh Ramakrishnan, Kshitij Sharad Jadhav, Yatin Nandwani, Dinesh Raghu, Sachindra Joshi
In this paper, we propose a modified version of the MedQA-USMLE dataset, which is subjective, to mimic real-life clinical scenarios.
no code implementations • 23 Feb 2024 • Divya Jyoti Bajpai, Ayush Maheshwari, Manjesh Kumar Hanawal, Ganesh Ramakrishnan
The availability of large annotated data can be a critical bottleneck in training machine learning algorithms successfully, especially when applied to diverse domains.
1 code implementation • IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2024 • Dhruv Kudale, Badri Vishal Kasuba, Venkatapathy Subramanian, Parag Chaudhuri, Ganesh Ramakrishnan
In order to solve this problem, we propose TEXTRON, a Data Programming-based approach, where users can plug various text detection methods into a weak supervision-based learning framework.
no code implementations • 11 Feb 2024 • Akshat Gautam, Anurag Shandilya, Akshit Srivastava, Venkatapathy Subramanian, Ganesh Ramakrishnan, Kshitij Jadhav
We demonstrate that informed subset selection, followed by semi-supervised data programming methods using these images as exemplars, performs better than other state-of-the-art semi-supervised methods.
1 code implementation • 13 Jan 2024 • Durga Sivasubramanian, Lokesh Nagalapatti, Rishabh Iyer, Ganesh Ramakrishnan
We conduct experiments using four real-world datasets and show that GCFL is (1) more compute and energy efficient than FL, (2) robust to various kinds of noise in both the feature space and labels, (3) preserves the privacy of the validation dataset, and (4) introduces a small communication overhead but achieves significant gains in performance, particularly in cases when the clients' data is noisy.
1 code implementation • 23 Nov 2023 • Abhishek Singh, Venkatapathy Subramanian, Ayush Maheshwari, Pradeep Narayan, Devi Prasad Shetty, Ganesh Ramakrishnan
We empirically show that our EIGEN framework can significantly improve the performance of state-of-the-art deep models with the availability of very few labeled data instances.
no code implementations • 28 Oct 2023 • Rishabh Tiwari, Durga Sivasubramanian, Anmol Mekala, Ganesh Ramakrishnan, Pradeep Shenoy
Deep networks tend to learn spurious feature-label correlations in real-world supervised learning tasks.
1 code implementation • 10 Oct 2023 • Piyush Singh Pasi, Karthikeya Battepati, Preethi Jyothi, Ganesh Ramakrishnan, Tanmay Mahapatra, Manoj Singh
The problem of audio-to-text alignment has seen a significant amount of research using complete supervision during training.
1 code implementation • 23 May 2023 • Ayush Maheshwari, Ashim Gupta, Amrith Krishna, Atul Kumar Singh, Ganesh Ramakrishnan, G. Anil Kumar, Jitin Singla
Translation models trained on our dataset demonstrate statistically significant improvements when translating out-of-domain contemporary corpora, outperforming models trained on older classical-era poetry datasets.
no code implementations • 11 May 2023 • H S V N S Kowndinya Renduchintala, KrishnaTeja Killamsetty, Sumit Bhatia, Milan Aggarwal, Ganesh Ramakrishnan, Rishabh Iyer, Balaji Krishnamurthy
A salient characteristic of pre-trained language models (PTLMs) is a remarkable improvement in their generalization capability and emergence of new capabilities with increasing model capacity and pre-training dataset size.
1 code implementation • NeurIPS 2023 • Duncan McElfresh, Sujay Khandagale, Jonathan Valverde, Vishak Prasad C, Benjamin Feuer, Chinmay Hegde, Ganesh Ramakrishnan, Micah Goldblum, Colin White
To this end, we conduct the largest tabular data analysis to date, comparing 19 algorithms across 176 datasets, and we find that the 'NN vs. GBDT' debate is overemphasized: for a surprisingly high number of datasets, either the performance difference between GBDTs and NNs is negligible, or light hyperparameter tuning on a GBDT is more important than choosing between NNs and GBDTs.
1 code implementation • 15 Nov 2022 • Ayush Maheshwari, Nikhil Singh, Amrith Krishna, Ganesh Ramakrishnan
Keeping this in mind, we release a multi-domain dataset, from areas as diverse as astronomy, medicine and mathematics, with some of them as old as 18 centuries.
no code implementations • 2 Nov 2022 • Vishak Prasad C, Colin White, Paarth Jain, Sibasis Nayak, Ganesh Ramakrishnan
A majority of recent developments in neural architecture search (NAS) have been aimed at decreasing the computational cost of various techniques without affecting their final performance.
no code implementations • 30 Oct 2022 • Ashish Mittal, Durga Sivasubramanian, Rishabh Iyer, Preethi Jyothi, Ganesh Ramakrishnan
Training state-of-the-art ASR systems such as RNN-T often has a high associated financial and environmental cost.
no code implementations • 13 Oct 2022 • Ayush Maheshwari, Piyush Sharma, Preethi Jyothi, Ganesh Ramakrishnan
In this work, we present DictDis, a lexically constrained NMT system that disambiguates between multiple candidate translations derived from dictionaries.
1 code implementation • 7 Oct 2022 • Renbo Tu, Nicholas Roberts, Vishak Prasad, Sibasis Nayak, Paarth Jain, Frederic Sala, Ganesh Ramakrishnan, Ameet Talwalkar, Willie Neiswanger, Colin White
The challenge that climate change poses to humanity has spurred a rapidly developing field of artificial intelligence research focused on climate change applications.
no code implementations • 4 Oct 2022 • Suraj Kothawade, Akshit Srivastava, Venkat Iyer, Ganesh Ramakrishnan, Rishabh Iyer
Avoiding out-of-distribution (OOD) data is critical for training supervised machine learning models in the medical imaging domain.
no code implementations • 4 Oct 2022 • Suraj Kothawade, Atharv Savarkar, Venkat Iyer, Lakshman Tamil, Ganesh Ramakrishnan, Rishabh Iyer
It is often the case that a suboptimal performance is obtained on some classes due to the natural class imbalance issue that comes with medical data.
no code implementations • 10 Apr 2022 • Sravya Vardhani Shivapuja, Ashwin Gopinath, Ayush Gupta, Ganesh Ramakrishnan, Ravi Kiran Sarvadevabhatla
This skew affects all stages within the pipelines of deep crowd counting approaches.
no code implementations • 31 Mar 2022 • Piyush Singh Pasi, Shubham Nemani, Preethi Jyothi, Ganesh Ramakrishnan
We focus on the audio-visual video parsing (AVVP) problem that involves detecting audio and visual event labels with temporal boundaries.
1 code implementation • 15 Mar 2022 • KrishnaTeja Killamsetty, Guttu Sai Abhishek, Aakriti, Alexandre V. Evfimievski, Lucian Popa, Ganesh Ramakrishnan, Rishabh Iyer
Our central insight is that using an informative subset of the dataset for the model training runs involved in hyper-parameter optimization allows us to find the optimal hyper-parameter configuration significantly faster.
no code implementations • 10 Mar 2022 • Suraj Kothawade, Pavan Kumar Reddy, Ganesh Ramakrishnan, Rishabh Iyer
This issue is further pronounced in SSL methods, as they would use this biased model to obtain pseudo-labels (on the unlabeled data) during training.
1 code implementation • 3 Mar 2022 • Ayush Maheshwari, Ajay Ravindran, Venkatapathy Subramanian, Ganesh Ramakrishnan
UDAAN has an end-to-end Machine Translation (MT) plus post-editing pipeline wherein users can upload a document to obtain raw MT output.
1 code implementation • 22 Feb 2022 • Vishal Kaushal, Ganesh Ramakrishnan, Rishabh Iyer
A recent work has also leveraged submodular functions to propose submodular information measures which have been found to be very useful in solving the problems of guided subset selection and guided summarization.
1 code implementation • 7 Feb 2022 • Durga Sivasubramanian, Ayush Maheshwari, Pradeep Shenoy, Prathosh AP, Ganesh Ramakrishnan
In several supervised learning scenarios, auxiliary losses are used in order to introduce additional information or constraints into the supervised learning objective.
no code implementations • 2 Feb 2022 • Samrat Dutta, Shreyansh Jain, Ayush Maheshwari, Souvik Pal, Ganesh Ramakrishnan, Preethi Jyothi
Post-editing in Automatic Speech Recognition (ASR) entails automatically correcting common and systematic errors produced by the ASR system.
no code implementations • 10 Oct 2021 • Suraj Kothawade, Anmol Mekala, Chandra Sekhara D, Mayank Kothyari, Rishabh Iyer, Ganesh Ramakrishnan, Preethi Jyothi
To address this problem, we propose DITTO (Data-efficient and faIr Targeted subseT selectiOn) that uses Submodular Mutual Information (SMI) functions as acquisition functions to find the most informative set of utterances matching a target accent within a fixed budget.
1 code implementation • Findings (ACL) 2022 • Ayush Maheshwari, KrishnaTeja Killamsetty, Ganesh Ramakrishnan, Rishabh Iyer, Marina Danilevsky, Lucian Popa
These LFs, in turn, have been used to generate a large amount of additional noisy labeled data, in a paradigm that is now commonly referred to as data programming.
1 code implementation • 19 Aug 2021 • Sravya Vardhani Shivapuja, Mansi Pradeep Khamkar, Divij Bajaj, Ganesh Ramakrishnan, Ravi Kiran Sarvadevabhatla
We analyze the performance of representative crowd counting approaches across standard datasets at per strata level and in aggregate.
1 code implementation • 1 Aug 2021 • Guttu Sai Abhishek, Harshad Ingole, Parth Laturia, Vineeth Dorna, Ayush Maheshwari, Rishabh Iyer, Ganesh Ramakrishnan
SPEAR facilitates weak supervision in the form of heuristics (or rules) and association of noisy labels to the training dataset.
1 code implementation • 23 Jun 2021 • Durga Sivasubramanian, Rishabh Iyer, Ganesh Ramakrishnan, Abir De
First, we represent this problem with simplified constraints using the dual of the original training problem and show that the objective of this new representation is a monotone and alpha-submodular function, for a wide variety of modeling choices.
no code implementations • 16 Jun 2021 • Nathan Beck, Durga Sivasubramanian, Apurva Dani, Ganesh Ramakrishnan, Rishabh Iyer
Issues in the current literature include sometimes contradictory observations on the performance of different AL algorithms, unintended exclusion of important generalization approaches such as data augmentation and SGD for optimization, a lack of study of evaluation facets like the labeling efficiency of AL, and little or no clarity on the scenarios in which AL outperforms random sampling (RS).
1 code implementation • Findings (ACL) 2021 • Devaraja Adiga, Rishabh Kumar, Amrith Krishna, Preethi Jyothi, Ganesh Ramakrishnan, Pawan Goyal
In this work, we propose the first large-scale study of automatic speech recognition (ASR) in Sanskrit, with an emphasis on the impact of unit selection in Sanskrit ASR.
1 code implementation • Findings (ACL) 2021 • Atul Sahay, Anshul Nasery, Ayush Maheshwari, Ganesh Ramakrishnan, Rishabh Iyer
We introduce a novel formulation that takes advantage of the syntactic grammar rules and is independent of the base system.
no code implementations • 30 Apr 2021 • Suraj Kothawade, Vishal Kaushal, Ganesh Ramakrishnan, Jeff Bilmes, Rishabh Iyer
With the rapid growth of data, it is becoming increasingly difficult to train or improve deep learning models with the right subset of data.
no code implementations • 14 Apr 2021 • Oishik Chatterjee, Isha Pandey, Aashish Waikar, Vishwajeet Kumar, Ganesh Ramakrishnan
We approach this problem by first learning to generate the equation using the problem description and the final answer, which we subsequently use to train a supervised MWP solver.
1 code implementation • 11 Apr 2021 • Atul Sahay, Ayush Maheshwari, Ritesh Kumar, Ganesh Ramakrishnan, Manjesh Kumar Hanawal, Kavi Arya
In this work, we propose an attention mechanism over Tree-LSTMs to learn more meaningful and explainable parse tree structures.
1 code implementation • 3 Apr 2021 • Jatin Lamba, abhishek, Jayaprakash Akula, Rishabh Dabral, Preethi Jyothi, Ganesh Ramakrishnan
In this paper, we present a novel approach to the audio-visual video parsing (AVVP) task that demarcates events from a video separately for audio and visual modalities.
Ranked #1 on Event Detection on Audio Set
1 code implementation • 9 Mar 2021 • Aman Jain, Mayank Kothyari, Vishwajeet Kumar, Preethi Jyothi, Ganesh Ramakrishnan, Soumen Chakrabarti
In response, we identify a key structural idiom in OKVQA, viz., S3 (select, substitute and search), and build a new data set and challenge around it.
1 code implementation • 9 Mar 2021 • Jayaprakash A, abhishek, Rishabh Dabral, Ganesh Ramakrishnan, Preethi Jyothi
Video retrieval using natural language queries requires learning semantically meaningful joint embeddings between the text and the audio-visual input.
Ranked #1 on Video Retrieval on Charades-STA
1 code implementation • 27 Feb 2021 • Suraj Kothawade, Vishal Kaushal, Ganesh Ramakrishnan, Jeff Bilmes, Rishabh Iyer
Examples of such problems include: i) targeted learning, where the goal is to find subsets with rare classes or rare attributes on which the model is underperforming, and ii) guided summarization, where data (e.g., image collection, text, document or video) is summarized for quicker human consumption with specific additional user intent.
3 code implementations • 27 Feb 2021 • KrishnaTeja Killamsetty, Durga Sivasubramanian, Ganesh Ramakrishnan, Abir De, Rishabh Iyer
We show rigorous theoretical and convergence guarantees of the proposed algorithm and, through our extensive experiments on real-world datasets, show the effectiveness of our proposed framework.
no code implementations • 26 Jan 2021 • Vishal Kaushal, Suraj Kothawade, Anshul Tomar, Rishabh Iyer, Ganesh Ramakrishnan
For long videos, human reference summaries necessary for supervised video summarization techniques are difficult to obtain.
1 code implementation • EACL 2021 • Ishan Tarunesh, Sushil Khyalia, Vishwajeet Kumar, Ganesh Ramakrishnan, Preethi Jyothi
We present experiments on five different tasks and six different languages from the XTREME multilingual benchmark dataset.
1 code implementation • EACL 2021 • Soumya Chatterjee, Ayush Maheshwari, Ganesh Ramakrishnan, Saketha Nath Jagaralpudi
Such joint learning is expected to provide a twofold advantage: i) the classifier generalizes better as it leverages the prior knowledge of the existence of a hierarchy over the labels, and ii) in addition to the label co-occurrence information, the label embedding may benefit from the manifold structure of the input datapoints, leading to embeddings that are more faithful to the label hierarchy.
Ranked #1 on Multi-Label Text Classification on RCV1
1 code implementation • 19 Dec 2020 • KrishnaTeja Killamsetty, Durga Sivasubramanian, Ganesh Ramakrishnan, Rishabh Iyer
Finally, we propose Glister-Active, an extension to batch active learning, and we empirically demonstrate the performance of Glister on a wide range of tasks, including (a) data selection to reduce training time, (b) robust learning under label noise and imbalance settings, and (c) batch active learning with several deep and shallow models.
1 code implementation • 17 Dec 2020 • Sai Praneeth Reddy Sunkesula, Rishabh Dabral, Ganesh Ramakrishnan
Analyzing the interactions between humans and objects from a video includes identification of the relationships between humans and the objects present in the video.
no code implementations • Asian Chapter of the Association for Computational Linguistics 2020 • Vishwajeet Kumar, Manish Joshi, Ganesh Ramakrishnan, Yuan-Fang Li
Question generation (QG) has recently attracted considerable attention.
no code implementations • 12 Oct 2020 • Vishal Kaushal, Suraj Kothawade, Ganesh Ramakrishnan, Jeff Bilmes, Himanshu Asnani, Rishabh Iyer
We study submodular information measures as a rich framework for generic, query-focused, privacy sensitive, and update summarization tasks.
1 code implementation • Findings (ACL) 2021 • Ayush Maheshwari, Oishik Chatterjee, KrishnaTeja Killamsetty, Ganesh Ramakrishnan, Rishabh Iyer
The first contribution of this work is a framework built on a semi-supervised data programming paradigm that learns a joint model, which effectively uses the rules/labelling functions along with semi-supervised loss functions on the feature space.
no code implementations • 29 Jul 2020 • Vishal Kaushal, Suraj Kothawade, Rishabh Iyer, Ganesh Ramakrishnan
Thirdly, we demonstrate that in the presence of multiple ground truth summaries (due to the highly subjective nature of the task), learning from a single combined ground truth summary using a single loss function is not a good idea.
2 code implementations • 22 Nov 2019 • Oishik Chatterjee, Ganesh Ramakrishnan, Sunita Sarawagi
Scarcity of labeled data is a bottleneck for supervised learning models.
no code implementations • 8 Nov 2019 • Vishwajeet Kumar, Raktim Chaki, Sai Teja Talluri, Ganesh Ramakrishnan, Yuan-Fang Li, Gholamreza Haffari
Specifically, we propose (a) a novel hierarchical BiLSTM model with selective attention and (b) a novel hierarchical Transformer architecture, both of which learn hierarchical representations of paragraphs.
no code implementations • CONLL 2019 • Vishwajeet Kumar, Ganesh Ramakrishnan, Yuan-Fang Li
The generator is a sequence-to-sequence model that incorporates the structure and semantics of the question being generated.
no code implementations • 24 Sep 2019 • Rishabh Dabral, Nitesh B. Gundavarapu, Rahul Mitra, Abhishek Sharma, Ganesh Ramakrishnan, Arjun Jain
Multi-person 3D human pose estimation from a single image is a challenging problem, especially for in-the-wild settings due to the lack of 3D annotated data.
Ranked #8 on 3D Multi-Person Pose Estimation on MuPoTS-3D
no code implementations • IJCNLP 2019 • Vishwajeet Kumar, Sivaanandh Muneeswaran, Ganesh Ramakrishnan, Yuan-Fang Li
Generating syntactically and semantically valid and relevant questions from paragraphs is useful in many applications.
no code implementations • 19 Aug 2019 • Ayush Maheshwari, Hrishikesh Patel, Nandan Rathod, Ritesh Kumar, Ganesh Ramakrishnan, Pushpak Bhattacharyya
The problem of event extraction is a relatively difficult task for low resource languages due to the non-availability of sufficient annotated data.
1 code implementation • ACL 2019 • Vishwajeet Kumar, Nitish Joshi, Arijit Mukherjee, Ganesh Ramakrishnan, Preethi Jyothi
For a new language, such training instances are hard to obtain, making the QG problem even more challenging.
no code implementations • 3 Jan 2019 • Vishal Kaushal, Rishabh Iyer, Khoshrav Doctor, Anurag Sahoo, Pratik Dubal, Suraj Kothawade, Rohan Mahadev, Kunal Dargan, Ganesh Ramakrishnan
This paper addresses automatic summarization of videos in a unified manner.
1 code implementation • 3 Jan 2019 • Vishal Kaushal, Rishabh Iyer, Suraj Kothawade, Rohan Mahadev, Khoshrav Doctor, Ganesh Ramakrishnan
Supervised machine learning based state-of-the-art computer vision techniques are in general data hungry.
no code implementations • 24 Sep 2018 • Vishal Kaushal, Sandeep Subramanian, Suraj Kothawade, Rishabh Iyer, Ganesh Ramakrishnan
We propose a novel framework for domain specific video summarization.
no code implementations • 15 Aug 2018 • Vishwajeet Kumar, Ganesh Ramakrishnan, Yuan-Fang Li
The generator is a sequence-to-sequence model that incorporates the structure and semantics of the question being generated.
no code implementations • NAACL 2018 • Ayush Maheshwari, Vishwajeet Kumar, Ganesh Ramakrishnan, J. Saketha Nath
We present a system for resolving entities and disambiguating locations based on publicly available web data in the domain of ancient Hindu Temples.
no code implementations • 28 May 2018 • Vishal Kaushal, Anurag Sahoo, Khoshrav Doctor, Narasimha Raju, Suyash Shetty, Pankaj Singh, Rishabh Iyer, Ganesh Ramakrishnan
Supervised machine learning based state-of-the-art computer vision techniques are in general data hungry and pose the challenges of not having adequate computing resources and of high costs involved in human labeling efforts.
no code implementations • 7 Mar 2018 • Vishwajeet Kumar, Kireeti Boorla, Yogesh Meena, Ganesh Ramakrishnan, Yuan-Fang Li
Neural network-based methods represent the state-of-the-art in question generation from text.
no code implementations • 7 May 2017 • Naveen Nair, Ajay Nagesh, Ganesh Ramakrishnan
For learning features derived from inputs at a particular sequence position, we propose a Hierarchical Kernels-based approach (referred to as Hierarchical Kernel Learning for Structured Output Spaces - StructHKL).
no code implementations • 4 Apr 2017 • Anurag Sahoo, Vishal Kaushal, Khoshrav Doctor, Suyash Shetty, Rishabh Iyer, Ganesh Ramakrishnan
Most importantly, we also show that we can summarize hours of video data in a few seconds, and our system allows the user to generate summaries of various lengths and types interactively on the fly.
no code implementations • LREC 2014 • Chetana Gavankar, Ashish Kulkarni, Ganesh Ramakrishnan
A domain ontology for an organization often consists of classes whose instances are either specific to, or independent of, the organization.