Search Results for author: Dheeraj Mekala

Found 20 papers, 12 papers with code

META: Metadata-Empowered Weak Supervision for Text Classification

1 code implementation EMNLP 2020 Dheeraj Mekala, Xinyang Zhang, Jingbo Shang

Based on seed words, we rank and filter motif instances to distill highly label-indicative ones as "seed motifs", which provide additional weak supervision.

General Classification text-classification +2

DOCMASTER: A Unified Platform for Annotation, Training, & Inference in Document Question-Answering

no code implementations 30 Mar 2024 Alex Nguyen, Zilong Wang, Jingbo Shang, Dheeraj Mekala

The application of natural language processing models to PDF documents is pivotal for various business applications, yet the challenge of training models for this purpose persists in businesses due to specific hurdles.

Privacy Preserving Question Answering

TOOLVERIFIER: Generalization to New Tools via Self-Verification

1 code implementation 21 Feb 2024 Dheeraj Mekala, Jason Weston, Jack Lanchantin, Roberta Raileanu, Maria Lomeli, Jingbo Shang, Jane Dwivedi-Yu

Teaching language models to use tools is an important milestone towards building general assistants, but remains an open problem.

MORL-Prompt: An Empirical Analysis of Multi-Objective Reinforcement Learning for Discrete Prompt Optimization

no code implementations 18 Feb 2024 Yasaman Jafari, Dheeraj Mekala, Rose Yu, Taylor Berg-Kirkpatrick

RL-based techniques can be used to search for prompts that, when fed into a target language model, maximize a set of user-specified reward functions.

Language Modelling Machine Translation +2

Smaller Language Models are capable of selecting Instruction-Tuning Training Data for Larger Language Models

1 code implementation 16 Feb 2024 Dheeraj Mekala, Alex Nguyen, Jingbo Shang

In this paper, we introduce a novel training data selection based on the learning percentage of the samples.

SELFOOD: Self-Supervised Out-Of-Distribution Detection via Learning to Rank

1 code implementation 24 May 2023 Dheeraj Mekala, Adithya Samavedhi, Chengyu Dong, Jingbo Shang

To address the annotation bottleneck, we introduce SELFOOD, a self-supervised OOD detection method that requires only in-distribution samples as supervision.

Learning-To-Rank Out-of-Distribution Detection +1

A Benchmark on Extremely Weakly Supervised Text Classification: Reconcile Seed Matching and Prompting Approaches

1 code implementation 22 May 2023 Zihan Wang, Tianle Wang, Dheeraj Mekala, Jingbo Shang

Extremely Weakly Supervised Text Classification (XWS-TC) refers to text classification based on minimal high-level human guidance, such as a few label-indicative seed words or classification instructions.

Benchmarking text-classification +1

ZEROTOP: Zero-Shot Task-Oriented Semantic Parsing using Large Language Models

no code implementations 21 Dec 2022 Dheeraj Mekala, Jason Wolfe, Subhro Roy

For each utterance, we prompt the LLM with questions corresponding to its top-level intent and a set of slots and use the LLM generations to construct the target meaning representation.

Extractive Question-Answering Language Modelling +3

Progressive Sentiment Analysis for Code-Switched Text Data

1 code implementation 25 Oct 2022 Sudhanshu Ranjan, Dheeraj Mekala, Jingbo Shang

Instead of training on the entire code-switched corpus at once, we create buckets based on the fraction of words in the resource-rich language and progressively train from resource-rich language dominated samples to low-resource language dominated samples.

Cross-Lingual Transfer named-entity-recognition +6

Leveraging QA Datasets to Improve Generative Data Augmentation

2 code implementations 25 May 2022 Dheeraj Mekala, Tu Vu, Timo Schick, Jingbo Shang

The ability of generative language models (GLMs) to generate text has improved considerably in the last few years, enabling their use for generative data augmentation.

Common Sense Reasoning Data Augmentation +3

LOPS: Learning Order Inspired Pseudo-Label Selection for Weakly Supervised Text Classification

1 code implementation 25 May 2022 Dheeraj Mekala, Chengyu Dong, Jingbo Shang

Weakly supervised text classification methods typically train a deep neural classifier based on pseudo-labels.

Memorization Pseudo Label +2

BFClass: A Backdoor-free Text Classification Framework

no code implementations Findings (EMNLP) 2021 Zichao Li, Dheeraj Mekala, Chengyu Dong, Jingbo Shang

To recognize the poisoned subset, we examine the training samples with these identified triggers as the most suspicious token, and check if removing the trigger will change the poisoned model's prediction.

Backdoor Attack Language Modelling +2

Coarse2Fine: Fine-grained Text Classification on Coarsely-grained Annotated Data

no code implementations EMNLP 2021 Dheeraj Mekala, Varun Gangal, Jingbo Shang

Existing text classification methods mainly focus on a fixed label set, whereas many real-world applications require extending to new fine-grained classes as the number of samples per label increases.

text-classification Text Classification +1

News Meets Microblog: Hashtag Annotation via Retriever-Generator

1 code implementation 18 Apr 2021 Xiuwen Zheng, Dheeraj Mekala, Amarnath Gupta, Jingbo Shang

Hashtag annotation for microblog posts has been recently formulated as a sequence generation problem to handle emerging hashtags that are unseen in the training set.

Contextualized Weak Supervision for Text Classification

1 code implementation ACL 2020 Dheeraj Mekala, Jingbo Shang

Weakly supervised text classification based on a few user-provided seed words has recently attracted much attention from researchers.

General Classification text-classification +1

SCDV : Sparse Composite Document Vectors using soft clustering over distributional representations

4 code implementations EMNLP 2017 Dheeraj Mekala, Vivek Gupta, Bhargavi Paranjape, Harish Karnick

We present a feature vector formation technique for documents - Sparse Composite Document Vector (SCDV) - which overcomes several shortcomings of the current distributional paragraph vector representations that are widely used for text representation.

Clustering Information Retrieval +3

User Bias Removal in Review Score Prediction

no code implementations 20 Dec 2016 Rahul Wadbude, Vivek Gupta, Dheeraj Mekala, Harish Karnick

Review score prediction of text reviews has recently gained a lot of attention in recommendation systems.

Recommendation Systems
