Search Results for author: Manish Shrivastava

Found 108 papers, 25 papers with code

HashSet - A Dataset For Hashtag Segmentation

1 code implementation LREC 2022 Prashant Kodali, Akshala Bhatnagar, Naman Ahuja, Manish Shrivastava, Ponnurangam Kumaraguru

HashSet dataset is sampled from a different set of tweets when compared to existing datasets and provides an alternate distribution of hashtags to build and validate hashtag segmentation models.

Specificity

TeSum: Human-Generated Abstractive Summarization Corpus for Telugu

no code implementations LREC 2022 Ashok Urlana, Nirmal Surange, Pavan Baswani, Priyanka Ravva, Manish Shrivastava

But with this work, we show that even with a crowd sourced summary generation approach, quality can be controlled by aggressive expert informed filtering and sampling-based human evaluation.

Abstractive Text Summarization

Bilingual Tabular Inference: A Case Study on Indic Languages

no code implementations NAACL 2022 Chaitanya Agarwal, Vivek Gupta, Anoop Kunchukuttan, Manish Shrivastava

Existing research on Tabular Natural Language Inference (TNLI) exclusively examines the task in a monolingual setting where the tabular premise and hypothesis are in the same language.

Natural Language Inference

A3-108 Machine Translation System for LoResMT Shared Task @MT Summit 2021 Conference

no code implementations MTSummit 2021 Saumitra Yadav, Manish Shrivastava

Also, we reorder English to match Marathi syntax to further train another set of baseline and data augmented models using various tokenization schemes.

Machine Translation Translation

Topic Shift Detection for Mixed Initiative Response

no code implementations SIGDIAL (ACL) 2021 Rachna Konigari, Saurabh Ramola, Vijay Vardhan Alluri, Manish Shrivastava

In this paper, we present a model which uses a fine-tuned XLNet-base to classify the utterances pertaining to the major topic of conversation and those which are not, with a precision of 84%.

Towards Fine-grained Classification of Climate Change related Social Media Text

no code implementations ACL 2022 Roopal Vaid, Kartikey Pant, Manish Shrivastava

We benchmark both the datasets for climate change stance detection and fine-grained classification using state-of-the-art methods in text classification.

Classification Decision Making +6

CoMeT: Towards Code-Mixed Translation Using Parallel Monolingual Sentences

1 code implementation NAACL (CALCS) 2021 Devansh Gautam, Prashant Kodali, Kshitij Gupta, Anmol Goel, Manish Shrivastava, Ponnurangam Kumaraguru

Code-mixed languages are very popular in multilingual societies around the world, yet the resources lag behind to enable robust systems on such languages.

Machine Translation Translation

Translate and Classify: Improving Sequence Level Classification for English-Hindi Code-Mixed Data

1 code implementation NAACL (CALCS) 2021 Devansh Gautam, Kshitij Gupta, Manish Shrivastava

To translate English-Hindi code-mixed data to English, we use mBART, a pre-trained multilingual sequence-to-sequence model that has shown competitive performance on various low-resource machine translation pairs and has also shown performance gains in languages that were not in its pre-training corpus.

Machine Translation Natural Language Inference +2

SimpleNER Sentence Simplification System for GEM 2021

no code implementations ACL (GEM) 2021 K V Aditya Srivatsa, Monil Gokani, Manish Shrivastava

This paper describes SimpleNER, a model developed for the sentence simplification task at GEM-2021.

NER

The WEAVE Corpus: Annotating Synthetic Chemical Procedures in Patents with Chemical Named Entities

1 code implementation ICON 2020 Ravindra Nittala, Manish Shrivastava

Such a design process calls for analyzing many existing synthetic chemical reactions and planning the synthesis of novel chemicals.

named-entity-recognition NER

Kunji : A Resource Management System for Higher Productivity in Computer Aided Translation Tools

no code implementations ICON 2019 Priyank Gupta, Manish Shrivastava, Dipti Misra Sharma, Rashid Ahmad

Similarly, translators working on Computer Aided Translation workbenches, also require help from various kinds of resources - glossaries, terminologies, concordances and translation memory in the workbenches in order to increase their productivity.

Machine Translation Management +2

Event Centric Entity Linking for Hindi News Articles: A Knowledge Graph Based Approach

no code implementations ICON 2019 Pranav Goel, Suhan Prabhu, Alok Debnath, Manish Shrivastava

We describe the development of a knowledge graph from an event annotated corpus by presenting a pipeline that identifies and extracts the relations between entities and events from Hindi news articles.

Entity Linking

HashSet -- A Dataset For Hashtag Segmentation

2 code implementations 18 Jan 2022 Prashant Kodali, Akshala Bhatnagar, Naman Ahuja, Manish Shrivastava, Ponnurangam Kumaraguru

HashSet dataset is sampled from a different set of tweets when compared to existing datasets and provides an alternate distribution of hashtags to build and validate hashtag segmentation models.

Specificity

Is My Model Using The Right Evidence? Systematic Probes for Examining Evidence-Based Tabular Reasoning

no code implementations 2 Aug 2021 Vivek Gupta, Riyaz A. Bhat, Atreya Ghosal, Manish Shrivastava, Maneesh Singh, Vivek Srikumar

Our experiments demonstrate that a RoBERTa-based model, representative of the current state-of-the-art, fails at reasoning on the following counts: it (a) ignores relevant parts of the evidence, (b) is over-sensitive to annotation artifacts, and (c) relies on the knowledge encoded in the pre-trained language model rather than the evidence presented in its tabular inputs.

Language Modelling

Volta at SemEval-2021 Task 9: Statement Verification and Evidence Finding with Tables using TAPAS and Transfer Learning

1 code implementation SEMEVAL 2021 Devansh Gautam, Kshitij Gupta, Manish Shrivastava

We fine-tune TAPAS (a model which extends BERT's architecture to capture tabular structure) for both the subtasks as it has shown state-of-the-art performance in various table understanding tasks.

Logical Reasoning Transfer Learning

Finding The Right One and Resolving it

no code implementations CONLL 2020 Payal Khullar, Arghya Bhattacharya, Manish Shrivastava

One-anaphora has figured prominently in theoretical linguistic literature, but computational linguistics research on the phenomenon is sparse.

AbuseAnalyzer: Abuse Detection, Severity and Target Prediction for Gab Posts

1 code implementation COLING 2020 Mohit Chandra, Ashwin Pathak, Eesha Dutta, Paryul Jain, Manish Gupta, Manish Shrivastava, Ponnurangam Kumaraguru

While extensive popularity of online social media platforms has made information dissemination faster, it has also resulted in widespread online abuse of different types like hate speech, offensive language, sexist and racist opinions, etc.

Abuse Detection severity prediction

Tag2Risk: Harnessing Social Music Tags for Characterizing Depression Risk

no code implementations 26 Jul 2020 Aayush Surana, Yash Goyal, Manish Shrivastava, Suvi Saarikallio, Vinoo Alluri

Studies have shown musical engagement to be an indirect representation of internal states including internalized symptomatology and depression.

Word Embeddings as Tuples of Feature Probabilities

no code implementations WS 2020 Siddharth Bhat, Alok Debnath, Souvik Banerjee, Manish Shrivastava

In this paper, we provide an alternate perspective on word representations, by reinterpreting the dimensions of the vector space of a word embedding as a collection of features.

Word Embeddings

A Simple and Effective Dependency Parser for Telugu

no code implementations ACL 2020 Sneha Nallani, Manish Shrivastava, Dipti Sharma

We present a simple and effective dependency parser for Telugu, a morphologically rich, free word order language.

Feature Engineering

ConfNet2Seq: Full Length Answer Generation from Spoken Questions

1 code implementation 9 Jun 2020 Vaishali Pal, Manish Shrivastava, Laurent Besacier

To the best of our knowledge, this is the first attempt towards generating full-length natural answers from a graph input (confusion network).

Answer Generation Task-Oriented Dialogue Systems

NoEl: An Annotated Corpus for Noun Ellipsis in English

no code implementations LREC 2020 Payal Khullar, Kushal Majmundar, Manish Shrivastava

Ellipsis resolution has been identified as an important step to improve the accuracy of mainstream Natural Language Processing (NLP) tasks such as information retrieval, event extraction, dialog systems, etc.

Event Extraction Information Retrieval +2

Detection and Annotation of Events in Kannada

no code implementations LREC 2020 Suhan Prabhu, Ujwal Narayan, Alok Debnath, Sumukh S, Manish Shrivastava

In this paper, we provide the basic guidelines towards the detection and linguistic analysis of events in Kannada.

Information Retrieval Retrieval

A Multi-Dimensional View of Aggression when voicing Opinion

no code implementations LREC 2020 Arjit Srivastava, Avijit Vajpayee, Syed Sarfaraz Akhtar, Naman Jain, Vinay Singh, Manish Shrivastava

The advent of social media has immensely proliferated the amount of opinions and arguments voiced on the internet.

A Fully Expanded Dependency Treebank for Telugu

no code implementations LREC 2020 Sneha Nallani, Manish Shrivastava, Dipti Sharma

The available Paninian dependency treebank(s) for Telugu is annotated only with inter-chunk dependency relations and not all words of a sentence are part of the parse tree.

Hindi TimeBank: An ISO-TimeML Annotated Reference Corpus

no code implementations LREC 2020 Pranav Goel, Suhan Prabhu, Alok Debnath, Priyank Modi, Manish Shrivastava

In this paper, we present the Hindi TimeBank, an ISO-TimeML annotated reference corpus for the detection and classification of events, states and time expressions, and the links between them.

Modeling ASR Ambiguity for Dialogue State Tracking Using Word Confusion Networks

1 code implementation 3 Feb 2020 Vaishali Pal, Fabien Guillot, Manish Shrivastava, Jean-Michel Renders, Laurent Besacier

Spoken dialogue systems typically use a list of top-N ASR hypotheses for inferring the semantic meaning and tracking the state of the dialogue.

Dialogue State Tracking Spoken Dialogue Systems

Transition-Based Deep Input Linearization

1 code implementation EACL 2017 Ratish Puduppully, Yue Zhang, Manish Shrivastava

Traditional methods for deep NLG adopt pipeline approaches comprising stages such as constructing syntactic input, predicting function words, linearizing the syntactic input and generating the surface forms.

Data-to-Text Generation Machine Translation

Answering Naturally: Factoid to Full length Answer Generation

1 code implementation WS 2019 Vaishali Pal, Manish Shrivastava, Irshad Bhat

A reading comprehension system extracts a span of text, comprising named entities, dates, small phrases, etc., which serves as the answer to a given question.

Answer Generation Question Answering +2

Using Syntax to Resolve NPE in English

no code implementations RANLP 2019 Payal Khullar, Allen Antony, Manish Shrivastava

We get an F1-score of 76.47% for detection and 70.27% for NPE resolution on the test set.

De-Mixing Sentiment from Code-Mixed Text

no code implementations ACL 2019 Yash Kumar Lal, Vaibhav Kumar, Mrinal Dhar, Manish Shrivastava, Philipp Koehn

The Collective Encoder captures the overall sentiment of the sentence, while the Specific Encoder utilizes an attention mechanism in order to focus on individual sentiment-bearing sub-words.

Sentiment Analysis Word Embeddings

Curriculum Learning Strategies for Hindi-English Codemixed Sentiment Analysis

no code implementations 18 Jun 2019 Anirudh Dahiya, Neeraj Battan, Manish Shrivastava, Dipti Mishra Sharma

Sentiment Analysis and other semantic tasks are commonly used for social media textual analysis to gauge public opinion and make sense from the noise on social media.

Sentiment Analysis

Fermi at SemEval-2019 Task 8: An elementary but effective approach to Question Discernment in Community QA Forums

no code implementations SEMEVAL 2019 Bakhtiyar Syed, Vijayasaradhi Indurthi, Manish Shrivastava, Manish Gupta, Vasudeva Varma

This information is highly useful in segregating factual questions from non-factual ones, which helps in organizing the questions into useful categories and trims down the problem space for the next task in the pipeline: fact evaluation among the available answers.

Community Question Answering

A Pregroup Representation of Word Order Alternation Using Hindi Syntax

no code implementations NAACL 2019 Alok Debnath, Manish Shrivastava

Pregroup calculus has been used for the representation of free word order languages (Sanskrit and Hungarian), using a construction called precyclicity.

Predicting Algorithm Classes for Programming Word Problems

no code implementations WS 2019 Vinayak Athavale, Aayush Naik, Rajas Vanjape, Manish Shrivastava

We present four new datasets for this task, two multiclass datasets with 550 and 1159 problems each and two multilabel datasets having 3737 and 3960 problems each.

Classification General Classification +2

Aggression Detection on Social Media Text Using Deep Neural Networks

1 code implementation WS 2018 Vinay Singh, Aman Varshney, Syed Sarfaraz Akhtar, Deepanshu Vijay, Manish Shrivastava

In this paper, we introduce a deep learning based classification system for Facebook posts and comments of Hindi-English Code-Mixed text to detect the aggressive behaviour of/towards users.

Classification General Classification

NUTS: Network for Unsupervised Telegraphic Summarization

no code implementations 27 Sep 2018 Chanakya Malireddy, Tirth Maniar, Sajal Maheshwari, Manish Shrivastava

Extractive summarization methods operate by ranking and selecting the sentences which best encapsulate the theme of a given document.

Extractive Summarization

SWDE : A Sub-Word And Document Embedding Based Engine for Clickbait Detection

no code implementations 2 Aug 2018 Vaibhav Kumar, Mrinal Dhar, Dhruv Khattar, Yash Kumar Lal, Abhimanshu Mishra, Manish Shrivastava, Vasudeva Varma

We generate sub-word level embeddings of the title using Convolutional Neural Networks and use them to train a bidirectional LSTM architecture.

Clickbait Detection Document Embedding +1

Gold Corpus for Telegraphic Summarization

1 code implementation COLING 2018 Chanakya Malireddy, Srivenkata N M Somisetty, Manish Shrivastava

Here, we don't select whole sentences, but rather pick short segments of text spread across sentences as the summary.

Extractive Summarization

Enabling Code-Mixed Translation: Parallel Corpus Creation and MT Augmentation Approach

no code implementations COLING 2018 Mrinal Dhar, Vaibhav Kumar, Manish Shrivastava

With the help of the created parallel corpus, we analyzed the structure of English-Hindi code-mixed data and present a technique to augment run-of-the-mill machine translation (MT) approaches that can help achieve superior translations without the need for specially designed translation systems.

Machine Translation Translation

Automatic Question Generation using Relative Pronouns and Adverbs

no code implementations ACL 2018 Payal Khullar, Konigari Rachna, Mukul Hase, Manish Shrivastava

This paper presents a system that automatically generates multiple, natural language questions using relative pronouns and relative adverbs from complex English sentences.

Dialogue Generation Information Retrieval +4

Exploring Chunk Based Templates for Generating a subset of English Text

no code implementations ACL 2018 Nikhilesh Bhatnagar, Manish Shrivastava, Radhika Mamidi

Natural Language Generation (NLG) is a research task which addresses the automatic generation of natural language text representative of an input non-linguistic collection of knowledge.

Text Generation

Named Entity Recognition for Hindi-English Code-Mixed Social Media Text

1 code implementation WS 2018 Vinay Singh, Deepanshu Vijay, Syed Sarfaraz Akhtar, Manish Shrivastava

Named Entity Recognition (NER) is a major task in the field of Natural Language Processing (NLP), and also is a sub-task of Information Extraction.

BIG-bench Machine Learning Entity Extraction using GAN +2

Transliteration Better than Translation? Answering Code-mixed Questions over a Knowledge Base

no code implementations WS 2018 Vishal Gupta, Manoj Chinnakotla, Manish Shrivastava

Our network is trained only on English questions provided in this dataset and noisy Hindi translations of these questions and can answer English-Hindi CM questions effectively without the need of translation into English.

Automatic Speech Recognition Information Retrieval +7

Gender Prediction in English-Hindi Code-Mixed Social Media Content : Corpus and Baseline System

no code implementations 14 Jun 2018 Ankush Khandelwal, Sahil Swami, Syed Sarfaraz Akhtar, Manish Shrivastava

In this paper, we analyze the task of author's gender prediction in code-mixed content and present a corpus of English-Hindi texts collected from Twitter which is annotated with author's gender.

Gender Prediction General Classification +4

Humor Detection in English-Hindi Code-Mixed Social Media Content : Corpus and Baseline System

no code implementations LREC 2018 Ankush Khandelwal, Sahil Swami, Syed S. Akhtar, Manish Shrivastava

In this paper, we analyze the task of humor detection in texts and describe a freely available corpus containing English-Hindi code-mixed tweets annotated with humorous (H) or non-humorous (N) tags.

General Classification Humor Detection +2

Cross-Lingual Task-Specific Representation Learning for Text Classification in Resource Poor Languages

no code implementations 10 Jun 2018 Nurendra Choudhary, Rajat Singh, Manish Shrivastava

The model learns the representation of resource-poor and resource-rich sentences in a common space by using the similarity between their assigned annotation tags.

Classification General Classification +4

Corpus Creation and Emotion Prediction for Hindi-English Code-Mixed Social Media Text

no code implementations NAACL 2018 Deepanshu Vijay, Aditya Bohra, Vinay Singh, Syed Sarfaraz Akhtar, Manish Shrivastava

Emotion Prediction is a Natural Language Processing (NLP) task dealing with detection and classification of emotions in various monolingual and bilingual texts.

General Classification

A Dataset of Hindi-English Code-Mixed Social Media Text for Hate Speech Detection

no code implementations WS 2018 Aditya Bohra, Deepanshu Vijay, Vinay Singh, Syed Sarfaraz Akhtar, Manish Shrivastava

Hate speech detection in social media texts is an important Natural Language Processing task, which has several crucial applications like sentiment analysis, investigating cyberbullying and examining socio-political controversies.

General Classification Hate Speech Detection +1

A Corpus of English-Hindi Code-Mixed Tweets for Sarcasm Detection

2 code implementations 30 May 2018 Sahil Swami, Ankush Khandelwal, Vinay Singh, Syed Sarfaraz Akhtar, Manish Shrivastava

Social media platforms like Twitter and Facebook have become two of the largest mediums used by people to express their views towards different topics.

Opinion Mining Sarcasm Detection +2

Universal Dependency Parsing for Hindi-English Code-switching

2 code implementations NAACL 2018 Irshad Ahmad Bhat, Riyaz Ahmad Bhat, Manish Shrivastava, Dipti Misra Sharma

We present a treebank of Hindi-English code-switching tweets under Universal Dependencies scheme and propose a neural stacking model for parsing that efficiently leverages part-of-speech tag and syntactic tree annotations in the code-switching treebank and the preexisting Hindi and English treebanks.

Dependency Parsing Language Identification +2

Sentiment Analysis of Code-Mixed Languages leveraging Resource Rich Languages

1 code implementation 3 Apr 2018 Nurendra Choudhary, Rajat Singh, Ishita Bindlish, Manish Shrivastava

Code-mixed data is an important challenge of natural language processing because its characteristics completely vary from the traditional structures of standard languages.

Contrastive Learning Sentiment Analysis

Contrastive Learning of Emoji-based Representations for Resource-Poor Languages

no code implementations 3 Apr 2018 Nurendra Choudhary, Rajat Singh, Ishita Bindlish, Manish Shrivastava

The model learns the representations of resource-poor and resource-rich language in a common emoji space by using a similarity metric based on the emojis present in sentences from both languages.

Contrastive Learning

An Unsupervised Approach for Mapping between Vector Spaces

no code implementations 15 Nov 2017 Syed Sarfaraz Akhtar, Arihant Gupta, Avijit Vajpayee, Arjit Srivastava, Madan Gopal Jhawar, Manish Shrivastava

Our model handles the problem of data scarcity which is faced by many languages in the world and yields improved word embeddings for words in the target language by relying on transformed embeddings of words of the source language.

Word Embeddings Word Similarity

Deep Neural Network based system for solving Arithmetic Word problems

no code implementations IJCNLP 2017 Purvanshi Mehta, Pruthwik Mishra, Vinayak Athavale, Manish Shrivastava, Dipti Sharma

The worldstate and the query are processed separately in two different networks and finally, the networks are merged to predict the final operation.

Injecting Word Embeddings with Another Language's Resource : An Application of Bilingual Embeddings

no code implementations IJCNLP 2017 Prakhar Pandey, Vikram Pudi, Manish Shrivastava

Word embeddings learned from text corpus can be improved by injecting knowledge from external resources, while at the same time also specializing them for similarity or relatedness.

Learning Word Embeddings Word Similarity

Word Similarity Datasets for Indian Languages: Annotation and Baseline Systems

no code implementations WS 2017 Syed Sarfaraz Akhtar, Arihant Gupta, Avijit Vajpayee, Arjit Srivastava, Manish Shrivastava

With the advent of word representations, word similarity tasks are becoming increasingly popular as an evaluation metric for the quality of the representations.

Dependency Parsing Machine Translation +6

Hand in Glove: Deep Feature Fusion Network Architectures for Answer Quality Prediction in Community Question Answering

no code implementations COLING 2016 Sai Praneeth Suggu, Kushwanth Naga Goutham, Manoj K. Chinnakotla, Manish Shrivastava

Given a question-answer pair along with its metadata, the DFFN architecture independently - a) learns features from the Deep Neural Network (DNN) and b) computes hand-crafted features using various external resources and then combines them using a fully connected neural network trained to predict the final answer quality.

Community Question Answering

Towards Sub-Word Level Compositions for Sentiment Analysis of Hindi-English Code Mixed Text

3 code implementations COLING 2016 Ameya Prabhu, Aditya Joshi, Manish Shrivastava, Vasudeva Varma

We introduce a Hindi-English (Hi-En) code-mixed dataset for sentiment analysis and perform empirical analysis comparing the suitability and performance of various state-of-the-art SA methods in social media.

Opinion Mining Sentiment Analysis

Deep Feature Fusion Network for Answer Quality Prediction in Community Question Answering

no code implementations 22 Jun 2016 Sai Praneeth Suggu, Kushwanth N. Goutham, Manoj K. Chinnakotla, Manish Shrivastava

Current AQP systems either learn models using - a) various hand-crafted features (HCF) or b) use deep learning (DL) techniques which automatically learn the required feature representations.

Community Question Answering
