Search Results for author: Vivek Gupta

Found 44 papers, 20 papers with code

Effective Dimensionality Reduction for Word Embeddings

1 code implementation • WS 2019 • Vikas Raunak, Vivek Gupta, Florian Metze

Pre-trained word embeddings are used in several downstream applications as well as for constructing representations for sentences, paragraphs and documents.

Dimensionality Reduction Word Embeddings

On Dimensional Linguistic Properties of the Word Embedding Space

2 code implementations • WS 2020 • Vikas Raunak, Vaibhav Kumar, Vivek Gupta, Florian Metze

Word embeddings have become a staple of several natural language processing tasks, yet much remains to be understood about their properties.

Machine Translation Sentence +3

SCDV : Sparse Composite Document Vectors using soft clustering over distributional representations

4 code implementations • EMNLP 2017 • Dheeraj Mekala, Vivek Gupta, Bhargavi Paranjape, Harish Karnick

We present a feature vector formation technique for documents - Sparse Composite Document Vector (SCDV) - which overcomes several shortcomings of the current distributional paragraph vector representations that are widely used for text representation.

Clustering Information Retrieval +3

P-SIF: Document Embeddings Using Partition Averaging

1 code implementation • 18 May 2020 • Vivek Gupta, Ankit Saw, Pegah Nokhiz, Praneeth Netrapalli, Piyush Rai, Partha Talukdar

One of the key reasons is that a longer document is likely to contain words from many different topics; hence, creating a single vector while ignoring all the topical structure is unlikely to yield an effective document representation.

A Logic-Driven Framework for Consistency of Neural Models

1 code implementation • IJCNLP 2019 • Tao Li, Vivek Gupta, Maitrey Mehta, Vivek Srikumar

While neural models show remarkable accuracy on individual predictions, their internal beliefs can be inconsistent across examples.

Natural Language Inference

SumPubMed: Summarization Dataset of PubMed Scientific Articles

1 code implementation • ACL 2021 • Vivek Gupta, Prerna Bharti, Pegah Nokhiz, Harish Karnick

Most earlier work on text summarization is carried out on news article datasets.

Informativeness Text Summarization

Incorporating External Knowledge to Enhance Tabular Reasoning

1 code implementation • NAACL 2021 • J. Neeraja, Vivek Gupta, Vivek Srikumar

Reasoning about tabular information presents unique challenges to modern NLP approaches which largely rely on pre-trained contextualized embeddings of text.

Natural Language Inference

Two-Step Classification using Recasted Data for Low Resource Settings

2 code implementations • Asian Chapter of the Association for Computational Linguistics 2020 • Shagun Uppal, Vivek Gupta, Avinash Swaminathan, Haimin Zhang, Debanjan Mahata, Rakesh Gosangi, Rajiv Ratn Shah, Amanda Stent

We further improve the performance by using a joint-objective for classification and textual entailment.

Natural Language Inference text-classification +2

Improving Document Classification with Multi-Sense Embeddings

1 code implementation • 18 Nov 2019 • Vivek Gupta, Ankit Saw, Pegah Nokhiz, Harshit Gupta, Partha Talukdar

Through extensive experiments on multiple real-world datasets, we show that SCDV-MS embeddings outperform previous state-of-the-art embeddings on multi-class and multi-label text categorization tasks.

Ranked #5 on Document Classification on Reuters-21578 (F1 metric)

Classification Clustering +5

IndicXNLI: Evaluating Multilingual Inference for Indian Languages

1 code implementation • 19 Apr 2022 • Divyanshu Aggarwal, Vivek Gupta, Anoop Kunchukuttan

While Indic NLP has made rapid advances recently in terms of the availability of corpora and pre-trained models, benchmark datasets on standard NLU tasks are limited.

Cross-Lingual Transfer Machine Translation +1

On Long-Tailed Phenomena in Neural Machine Translation

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Vikas Raunak, Siddharth Dalmia, Vivek Gupta, Florian Metze

State-of-the-art Neural Machine Translation (NMT) models struggle with generating low-frequency tokens, tackling which remains a major challenge.

Conditional Text Generation Machine Translation +5

ChartCheck: Explainable Fact-Checking over Real-World Chart Images

1 code implementation • 13 Nov 2023 • Mubashara Akhtar, Nikesh Subedi, Vivek Gupta, Sahar Tahmasebi, Oana Cocarascu, Elena Simperl

Whilst fact verification has attracted substantial interest in the natural language processing community, verifying misinforming statements against data visualizations such as charts has so far been overlooked.

Fact Checking Fact Verification +1

Text Summarization using Abstract Meaning Representation

2 code implementations • 6 Jun 2017 • Shibhansh Dohare, Harish Karnick, Vivek Gupta

With an ever increasing size of text present on the Internet, automatic summary generation remains an important problem for natural language understanding.

Natural Language Understanding Text Summarization

Unsupervised Semantic Abstractive Summarization

1 code implementation • ACL 2018 • Shibhansh Dohare, Vivek Gupta, Harish Karnick

Automatic abstractive summary generation remains a significant open problem for natural language processing.

Abstractive Text Summarization Sentence Compression

Unsupervised Contextualized Document Representation

1 code implementation • EMNLP (sustainlp) 2021 • Ankur Gupta, Vivek Gupta

In this paper, we address this issue by proposing SCDV+BERT(ctxd), a simple and effective unsupervised representation that combines contextualized BERT (Devlin et al., 2019) based word embeddings for word sense disambiguation with the SCDV soft clustering approach.

Clustering Sentence +2

SLATE: A Sequence Labeling Approach for Task Extraction from Free-form Inked Content

1 code implementation • 8 Nov 2022 • Apurva Gandhi, Ryan Serrao, Biyi Fang, Gilbert Antonius, Jenna Hong, Tra My Nguyen, Sheng Yi, Ehi Nosakhare, Irene Shaffer, Soundararajan Srinivasan, Vivek Gupta

We present SLATE, a sequence labeling approach for extracting tasks from free-form content such as digitally handwritten (or "inked") notes on a virtual whiteboard.

Segmentation Sentence +1

Unbiasing Review Ratings with Tendency Based Collaborative Filtering

1 code implementation • Asian Chapter of the Association for Computational Linguistics 2020 • Pranshi Yadav, Priya Yadav, Pegah Nokhiz, Vivek Gupta

User-generated contents' score-based prediction and item recommendation has become an inseparable part of the online recommendation systems.

Collaborative Filtering Recommendation Systems

TabPert: An Effective Platform for Tabular Perturbation

1 code implementation • 2 Aug 2021 • Nupur Jain, Vivek Gupta, Anshul Rai, Gaurav Kumar

To truly grasp reasoning ability, a Natural Language Inference model should be evaluated on counterfactual data.

counterfactual Natural Language Inference

TabPert : An Effective Platform for Tabular Perturbation

1 code implementation • EMNLP (ACL) 2021 • Nupur Jain, Vivek Gupta, Anshul Rai, Gaurav Kumar

To grasp the true reasoning ability, the Natural Language Inference model should be evaluated on counterfactual data.

counterfactual Natural Language Inference

Bayes-optimal Hierarchical Classification over Asymmetric Tree-Distance Loss

no code implementations • 17 Feb 2018 • Dheeraj Mekala, Vivek Gupta, Purushottam Kar, Harish Karnick

We extend the consistency of the hierarchical classification algorithm over asymmetric tree-distance loss.

Classification General Classification +1

Efficient Estimation of Generalization Error and Bias-Variance Components of Ensembles

no code implementations • 15 Nov 2017 • Dhruv Mahajan, Vivek Gupta, S. Sathiya Keerthi, Sellamanickam Sundararajan, Shravan Narayanamurthy, Rahul Kidambi

We also demonstrate their usefulness in making design choices such as the number of classifiers in the ensemble and the size of a subset of data used for training that is needed to achieve a certain value of generalization error.

Leveraging Distributional Semantics for Multi-Label Learning

no code implementations • 18 Sep 2017 • Rahul Wadbude, Vivek Gupta, Piyush Rai, Nagarajan Natarajan, Harish Karnick, Prateek Jain

Our approach is novel in that it highlights interesting connections between label embedding methods used for multi-label learning and paragraph/document embedding methods commonly used for learning representations of text data.

Document Embedding Missing Labels +1

User Bias Removal in Review Score Prediction

no code implementations • 20 Dec 2016 • Rahul Wadbude, Vivek Gupta, Dheeraj Mekala, Harish Karnick

Review score prediction of text reviews has recently gained a lot of attention in recommendation systems.

Recommendation Systems

Product Classification in E-Commerce using Distributional Semantics

no code implementations • COLING 2016 • Vivek Gupta, Harish Karnick, Ashendra Bansal, Pradhuman Jhala

Product classification is the task of automatically predicting a taxonomy path for a product in a predefined taxonomy hierarchy given a textual product description or title.

Classification General Classification

Equalizing Recourse across Groups

no code implementations • 7 Sep 2019 • Vivek Gupta, Pegah Nokhiz, Chitradeep Dutta Roy, Suresh Venkatasubramanian

We measure recourse as the distance of an individual from the decision boundary of a classifier.

Decision Making Fairness

DeepSumm -- Deep Code Summaries using Neural Transformer Architecture

no code implementations • 31 Mar 2020 • Vivek Gupta

In this paper, we employ neural techniques to solve the task of source code summarization and specifically compare NMT-based techniques to the simpler and more appealing Transformer architecture on a dataset of Java methods and comments.

NMT

INFOTABS: Inference on Tables as Semi-structured Data

no code implementations • ACL 2020 • Vivek Gupta, Maitrey Mehta, Pegah Nokhiz, Vivek Srikumar

In this paper, we observe that semi-structured tabulated text is ubiquitous; understanding it requires not only comprehending the meaning of text fragments, but also implicit relationships between them.

Is My Model Using The Right Evidence? Systematic Probes for Examining Evidence-Based Tabular Reasoning

no code implementations • 2 Aug 2021 • Vivek Gupta, Riyaz A. Bhat, Atreya Ghosal, Manish Shrivastava, Maneesh Singh, Vivek Srikumar

Our experiments demonstrate that a RoBERTa-based model, representative of the current state-of-the-art, fails at reasoning on the following counts: it (a) ignores relevant parts of the evidence, (b) is over-sensitive to annotation artifacts, and (c) relies on the knowledge encoded in the pre-trained language model rather than the evidence presented in its tabular inputs.

Language Modelling

RETRONLU: Retrieval Augmented Task-Oriented Semantic Parsing

no code implementations • NLP4ConvAI (ACL) 2022 • Vivek Gupta, Akshat Shrivastava, Adithya Sagar, Armen Aghajanyan, Denis Savenkov

While large pre-trained language models accumulate a lot of knowledge in their parameters, it has been demonstrated that augmenting them with non-parametric retrieval-based memory has a number of benefits, from accuracy improvements to data efficiency, for knowledge-focused tasks such as question answering.

Question Answering Retrieval +1

Unsupervised Document Representation using Partition Word-Vectors Averaging

no code implementations • 27 Sep 2018 • Vivek Gupta, Ankit Kumar Saw, Partha Pratim Talukdar, Praneeth Netrapalli

One reason for this degradation is that a longer document is likely to contain words from many different themes (or topics), and hence creating a single vector while ignoring all the thematic structure is unlikely to yield an effective representation of the document.

Document Classification Sentence

XInfoTabS: Evaluating Multilingual Tabular Natural Language Inference

no code implementations • FEVER (ACL) 2022 • Bhavnick Minhas, Anant Shankhdhar, Vivek Gupta, Divyanshu Aggarwal, Shuo Zhang

In this paper, we use machine translation methods to construct a multilingual tabular NLI dataset, namely XINFOTABS, which expands the English tabular NLI dataset of INFOTABS to ten diverse languages.

Machine Translation Natural Language Inference +1

Trans-KBLSTM: An External Knowledge Enhanced Transformer BiLSTM Model for Tabular Reasoning

no code implementations • DeeLIO (ACL) 2022 • Yerram Varun, Aayush Sharma, Vivek Gupta

Natural language inference on tabular data is a challenging task.

Common Sense Reasoning Natural Language Inference

Right for the Right Reason: Evidence Extraction for Trustworthy Tabular Reasoning

no code implementations • ACL 2022 • Vivek Gupta, Shuo Zhang, Alakananda Vempala, Yujie He, Temma Choji, Vivek Srikumar

On the downstream tabular inference task, using only the automatically extracted evidence as the premise, our approach outperforms prior benchmarks.

Bilingual Tabular Inference: A Case Study on Indic Languages

no code implementations • NAACL 2022 • Chaitanya Agarwal, Vivek Gupta, Anoop Kunchukuttan, Manish Shrivastava

Existing research on Tabular Natural Language Inference (TNLI) exclusively examines the task in a monolingual setting where the tabular premise and hypothesis are in the same language.

Natural Language Inference

Realistic Data Augmentation Framework for Enhancing Tabular Reasoning

no code implementations • 23 Oct 2022 • Dibyakanti Kumar, Vivek Gupta, Soumya Sharma, Shuo Zhang

We observed that our framework could generate human-like tabular inference examples, which could benefit training data augmentation, especially in the scenario with limited supervision.

counterfactual Data Augmentation +1

Enhancing Tabular Reasoning with Pattern Exploiting Training

no code implementations • 21 Oct 2022 • Abhilash Reddy Shankarampeta, Vivek Gupta, Shuo Zhang

Recent methods based on pre-trained language models have exhibited superior performance over tabular tasks (e.g., tabular NLI), despite showing inherent problems such as not using the right evidence and inconsistent predictions across inputs while reasoning over the tabular data.

Leveraging Data Recasting to Enhance Tabular Reasoning

no code implementations • 23 Nov 2022 • Aashna Jena, Vivek Gupta, Manish Shrivastava, Julian Martin Eisenschlos

Creating challenging tabular inference data is essential for learning complex reasoning.

Semantic Parsing

Evaluating Inter-Bilingual Semantic Parsing for Indian Languages

1 code implementation • 25 Apr 2023 • Divyanshu Aggarwal, Vivek Gupta, Anoop Kunchukuttan

Despite significant progress in Natural Language Generation for Indian languages (IndicNLP), there is a lack of datasets around complex structured tasks such as semantic parsing.

Semantic Parsing Text Generation +1

MANER: Multi-Agent Neural Rearrangement Planning of Objects in Cluttered Environments

no code implementations • 10 Jun 2023 • Vivek Gupta, Praphpreet Dhir, Jeegn Dani, Ahmed H. Qureshi

Object rearrangement is a fundamental problem in robotics with various practical applications ranging from managing warehouses to cleaning and organizing home kitchens.

InfoSync: Information Synchronization across Multilingual Semi-structured Tables

no code implementations • 6 Jul 2023 • Siddharth Khincha, Chelsi Jain, Vivek Gupta, Tushar Kataria, Shuo Zhang

The proposed method includes 1) Information Alignment to map rows and 2) Information Update for updating missing/outdated information for aligned tables across multilingual tables.

Exploring the Numerical Reasoning Capabilities of Language Models: A Comprehensive Analysis on Tabular Data

no code implementations • 3 Nov 2023 • Mubashara Akhtar, Abhilash Shankarampeta, Vivek Gupta, Arpit Patil, Oana Cocarascu, Elena Simperl

Thus, understanding and reasoning with numbers are essential skills for language models to solve different tasks.

Natural Language Inference

TempTabQA: Temporal Question Answering for Semi-Structured Tables

no code implementations • 14 Nov 2023 • Vivek Gupta, Pranshu Kandoi, Mahek Bhavesh Vora, Shuo Zhang, Yujie He, Ridho Reinanda, Vivek Srikumar

Given these results, our dataset has the potential to serve as a challenging benchmark to improve the temporal reasoning capabilities of NLP models.

Question Answering

Multi-Set Inoculation: Assessing Model Robustness Across Multiple Challenge Sets

no code implementations • 15 Nov 2023 • Vatsal Gupta, Pranshu Pandya, Tushar Kataria, Vivek Gupta, Dan Roth

Language models, given their black-box nature, often exhibit sensitivity to input perturbations, leading to trust issues due to hallucinations.

Evaluating LLMs' Mathematical Reasoning in Financial Document Question Answering

no code implementations • 17 Feb 2024 • Pragya Srivastava, Manuj Malik, Vivek Gupta, Tanuja Ganu, Dan Roth

Large Language Models (LLMs) excel in natural language understanding, but their capability for complex mathematical reasoning with an amalgamation of structured tables and unstructured text is uncertain.

Arithmetic Reasoning Mathematical Reasoning +2
