Search Results for author: Vivek Gupta

Found 36 papers, 18 papers with code

TabPert : An Effective Platform for Tabular Perturbation

1 code implementation EMNLP (ACL) 2021 Nupur Jain, Vivek Gupta, Anshul Rai, Gaurav Kumar

To grasp the true reasoning ability, the Natural Language Inference model should be evaluated on counterfactual data.

Natural Language Inference

Bilingual Tabular Inference: A Case Study on Indic Languages

no code implementations NAACL 2022 Chaitanya Agarwal, Vivek Gupta, Anoop Kunchukuttan, Manish Shrivastava

Existing research on Tabular Natural Language Inference (TNLI) exclusively examines the task in a monolingual setting where the tabular premise and hypothesis are in the same language.

Natural Language Inference

Right for the Right Reason: Evidence Extraction for Trustworthy Tabular Reasoning

no code implementations ACL 2022 Vivek Gupta, Shuo Zhang, Alakananda Vempala, Yujie He, Temma Choji, Vivek Srikumar

On the downstream tabular inference task, using only the automatically extracted evidence as the premise, our approach outperforms prior benchmarks.

XInfoTabS: Evaluating Multilingual Tabular Natural Language Inference

no code implementations FEVER (ACL) 2022 Bhavnick Minhas, Anant Shankhdhar, Vivek Gupta, Divyanshu Aggarwal, Shuo Zhang

In this paper, we use machine translation methods to construct a multilingual tabular NLI dataset, namely XINFOTABS, which expands the English tabular NLI dataset of INFOTABS to ten diverse languages.

Machine Translation, Natural Language Inference, +1

SLATE: A Sequence Labeling Approach for Task Extraction from Free-form Inked Content

1 code implementation 8 Nov 2022 Apurva Gandhi, Ryan Serrao, Biyi Fang, Gilbert Antonius, Jenna Hong, Tra My Nguyen, Sheng Yi, Ehi Nosakhare, Irene Shaffer, Soundararajan Srinivasan, Vivek Gupta

We present SLATE, a sequence labeling approach for extracting tasks from free-form content such as digitally handwritten (or "inked") notes on a virtual whiteboard.

Sentence segmentation

Realistic Data Augmentation Framework for Enhancing Tabular Reasoning

no code implementations 23 Oct 2022 Dibyakanti Kumar, Vivek Gupta, Soumya Sharma, Shuo Zhang

We observed that our framework could generate human-like tabular inference examples, which could benefit training data augmentation, especially in the scenario with limited supervision.

Data Augmentation, Natural Language Inference

Enhancing Tabular Reasoning with Pattern Exploiting Training

no code implementations 21 Oct 2022 Abhilash Reddy Shankarampeta, Vivek Gupta, Shuo Zhang

Recent methods based on pre-trained language models have exhibited superior performance over tabular tasks (e.g., tabular NLI), despite showing inherent problems such as not using the right evidence and inconsistent predictions across inputs while reasoning over the tabular data.

IndicXNLI: Evaluating Multilingual Inference for Indian Languages

1 code implementation 19 Apr 2022 Divyanshu Aggarwal, Vivek Gupta, Anoop Kunchukuttan

While Indic NLP has made rapid advances recently in terms of the availability of corpora and pre-trained models, benchmark datasets on standard NLU tasks are limited.

Cross-Lingual Transfer, Machine Translation, +1

Unsupervised Contextualized Document Representation

1 code implementation EMNLP (sustainlp) 2021 Ankur Gupta, Vivek Gupta

In this paper, we address this issue by proposing SCDV+BERT(ctxd), a simple and effective unsupervised representation that combines contextualized BERT (Devlin et al., 2019) based word embedding for word sense disambiguation with SCDV soft clustering approach.

Sentence Similarity, Word Sense Disambiguation

RETRONLU: Retrieval Augmented Task-Oriented Semantic Parsing

no code implementations NLP4ConvAI (ACL) 2022 Vivek Gupta, Akshat Shrivastava, Adithya Sagar, Armen Aghajanyan, Denis Savenkov

While large pre-trained language models accumulate a lot of knowledge in their parameters, it has been demonstrated that augmenting them with non-parametric retrieval-based memory has a number of benefits, from accuracy improvements to data efficiency, for knowledge-focused tasks such as question answering.

Question Answering, Retrieval, +1

Is My Model Using The Right Evidence? Systematic Probes for Examining Evidence-Based Tabular Reasoning

no code implementations 2 Aug 2021 Vivek Gupta, Riyaz A. Bhat, Atreya Ghosal, Manish Shrivastava, Maneesh Singh, Vivek Srikumar

Our experiments demonstrate that a RoBERTa-based model, representative of the current state-of-the-art, fails at reasoning on the following counts: it (a) ignores relevant parts of the evidence, (b) is over-sensitive to annotation artifacts, and (c) relies on the knowledge encoded in the pre-trained language model rather than the evidence presented in its tabular inputs.

Language Modelling

TabPert: An Effective Platform for Tabular Perturbation

1 code implementation 2 Aug 2021 Nupur Jain, Vivek Gupta, Anshul Rai, Gaurav Kumar

To truly grasp reasoning ability, a Natural Language Inference model should be evaluated on counterfactual data.

Natural Language Inference

Incorporating External Knowledge to Enhance Tabular Reasoning

1 code implementation NAACL 2021 J. Neeraja, Vivek Gupta, Vivek Srikumar

Reasoning about tabular information presents unique challenges to modern NLP approaches which largely rely on pre-trained contextualized embeddings of text.

Natural Language Inference

P-SIF: Document Embeddings Using Partition Averaging

1 code implementation 18 May 2020 Vivek Gupta, Ankit Saw, Pegah Nokhiz, Praneeth Netrapalli, Piyush Rai, Partha Talukdar

One of the key reasons is that a longer document is likely to contain words from many different topics; hence, creating a single vector while ignoring all the topical structure is unlikely to yield an effective document representation.

INFOTABS: Inference on Tables as Semi-structured Data

no code implementations ACL 2020 Vivek Gupta, Maitrey Mehta, Pegah Nokhiz, Vivek Srikumar

In this paper, we observe that semi-structured tabulated text is ubiquitous; understanding it requires not only comprehending the meaning of text fragments, but also the implicit relationships between them.

DeepSumm -- Deep Code Summaries using Neural Transformer Architecture

no code implementations 31 Mar 2020 Vivek Gupta

In this paper, we employ neural techniques to solve the task of source code summarization and specifically compare NMT-based techniques to the simpler and more appealing Transformer architecture on a dataset of Java methods and comments.

NMT

Improving Document Classification with Multi-Sense Embeddings

1 code implementation 18 Nov 2019 Vivek Gupta, Ankit Saw, Pegah Nokhiz, Harshit Gupta, Partha Talukdar

Through extensive experiments on multiple real-world datasets, we show that SCDV-MS embeddings outperform previous state-of-the-art embeddings on multi-class and multi-label text categorization tasks.

Classification, Document Classification, +3

On Dimensional Linguistic Properties of the Word Embedding Space

2 code implementations WS 2020 Vikas Raunak, Vaibhav Kumar, Vivek Gupta, Florian Metze

Word embeddings have become a staple of several natural language processing tasks, yet much remains to be understood about their properties.

Machine Translation, Sentence Classification, +2

Equalizing Recourse across Groups

no code implementations 7 Sep 2019 Vivek Gupta, Pegah Nokhiz, Chitradeep Dutta Roy, Suresh Venkatasubramanian

We measure recourse as the distance of an individual from the decision boundary of a classifier.

Decision Making, Fairness

A Logic-Driven Framework for Consistency of Neural Models

1 code implementation IJCNLP 2019 Tao Li, Vivek Gupta, Maitrey Mehta, Vivek Srikumar

While neural models show remarkable accuracy on individual predictions, their internal beliefs can be inconsistent across examples.

Natural Language Inference

Effective Dimensionality Reduction for Word Embeddings

1 code implementation WS 2019 Vikas Raunak, Vivek Gupta, Florian Metze

Pre-trained word embeddings are used in several downstream applications as well as for constructing representations for sentences, paragraphs and documents.

Dimensionality Reduction, Word Embeddings

Unsupervised Document Representation using Partition Word-Vectors Averaging

no code implementations 27 Sep 2018 Vivek Gupta, Ankit Kumar Saw, Partha Pratim Talukdar, Praneeth Netrapalli

One reason for this degradation is that a longer document is likely to contain words from many different themes (or topics); hence, creating a single vector while ignoring all the thematic structure is unlikely to yield an effective representation of the document.

Document Classification

Efficient Estimation of Generalization Error and Bias-Variance Components of Ensembles

no code implementations 15 Nov 2017 Dhruv Mahajan, Vivek Gupta, S. Sathiya Keerthi, Sellamanickam Sundararajan, Shravan Narayanamurthy, Rahul Kidambi

We also demonstrate their usefulness in making design choices such as the number of classifiers in the ensemble and the size of a subset of data used for training that is needed to achieve a certain value of generalization error.

Leveraging Distributional Semantics for Multi-Label Learning

no code implementations 18 Sep 2017 Rahul Wadbude, Vivek Gupta, Piyush Rai, Nagarajan Natarajan, Harish Karnick, Prateek Jain

Our approach is novel in that it highlights interesting connections between label embedding methods used for multi-label learning and paragraph/document embedding methods commonly used for learning representations of text data.

Document Embedding, Multi-Label Learning, +1

Text Summarization using Abstract Meaning Representation

2 code implementations 6 Jun 2017 Shibhansh Dohare, Harish Karnick, Vivek Gupta

With the ever-increasing volume of text present on the Internet, automatic summary generation remains an important problem for natural language understanding.

Natural Language Understanding, Text Summarization

SCDV : Sparse Composite Document Vectors using soft clustering over distributional representations

4 code implementations EMNLP 2017 Dheeraj Mekala, Vivek Gupta, Bhargavi Paranjape, Harish Karnick

We present a feature vector formation technique for documents - Sparse Composite Document Vector (SCDV) - which overcomes several shortcomings of the current distributional paragraph vector representations that are widely used for text representation.

Information Retrieval, Multi-Label Classification, +2

User Bias Removal in Review Score Prediction

no code implementations 20 Dec 2016 Rahul Wadbude, Vivek Gupta, Dheeraj Mekala, Harish Karnick

Review score prediction of text reviews has recently gained a lot of attention in recommendation systems.

Recommendation Systems

Product Classification in E-Commerce using Distributional Semantics

no code implementations COLING 2016 Vivek Gupta, Harish Karnick, Ashendra Bansal, Pradhuman Jhala

Product classification is the task of automatically predicting a taxonomy path for a product in a predefined taxonomy hierarchy given a textual product description or title.

Classification, General Classification
