Search Results for author: Subhabrata Mukherjee

Found 41 papers, 9 papers with code

Self-training with Few-shot Rationalization

no code implementations EMNLP 2021 Meghana Moorthy Bhat, Alessandro Sordoni, Subhabrata Mukherjee

While pre-trained language models have obtained state-of-the-art performance for several natural language understanding tasks, they are quite opaque in terms of their decision-making process.

Decision Making Natural Language Understanding

Sparsely Activated Mixture-of-Experts are Robust Multi-Task Learners

no code implementations 16 Apr 2022 Shashank Gupta, Subhabrata Mukherjee, Krishan Subudhi, Eduardo Gonzalez, Damien Jose, Ahmed H. Awadallah, Jianfeng Gao

Traditional multi-task learning (MTL) methods use dense networks that share the same set of weights across several different tasks.

Multi-Task Learning

LiteTransformerSearch: Training-free On-device Search for Efficient Autoregressive Language Models

1 code implementation 4 Mar 2022 Mojan Javaheripi, Shital Shah, Subhabrata Mukherjee, Tomasz L. Religa, Caio C. T. Mendes, Gustavo H. de Rosa, Sebastien Bubeck, Farinaz Koushanfar, Debadeepta Dey

In this work, we leverage the somewhat surprising empirical observation that the number of non-embedding parameters in autoregressive transformers has a high rank correlation with task performance, irrespective of the architectural hyperparameters.
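
As a rough illustration of that proxy, here is a hypothetical Python sketch (not the paper's implementation): candidate decoder configurations are ranked by an analytically computed non-embedding parameter count, and Spearman's rank correlation checks how well that ranking tracks task scores, which are placeholder numbers here.

    # Hypothetical sketch of the training-free proxy: rank candidate decoder
    # configs by non-embedding parameter count and measure how well that
    # ranking correlates with task performance.
    from dataclasses import dataclass
    from scipy.stats import spearmanr

    @dataclass
    class DecoderConfig:  # assumed config container, not the paper's class
        n_layer: int
        d_model: int
        d_inner: int  # feed-forward hidden size

    def non_embedding_params(cfg: DecoderConfig) -> int:
        # Rough per-layer count: attention projections (4 * d_model^2)
        # plus the feed-forward block (2 * d_model * d_inner).
        per_layer = 4 * cfg.d_model ** 2 + 2 * cfg.d_model * cfg.d_inner
        return cfg.n_layer * per_layer

    candidates = [DecoderConfig(6, 256, 1024),
                  DecoderConfig(12, 512, 2048),
                  DecoderConfig(24, 768, 3072)]
    proxy = [non_embedding_params(c) for c in candidates]

    # Task scores would come from actually training each candidate; the
    # paper's observation is that the two rankings correlate strongly.
    task_scores = [0.61, 0.72, 0.80]  # placeholder values for illustration
    rho, _ = spearmanr(proxy, task_scores)
    print(f"Spearman rank correlation: {rho:.2f}")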

Language Modelling

AutoDistil: Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models

no code implementations 29 Jan 2022 Dongkuan Xu, Subhabrata Mukherjee, Xiaodong Liu, Debadeepta Dey, Wenhui Wang, Xiang Zhang, Ahmed Hassan Awadallah, Jianfeng Gao

Our framework AutoDistil addresses the above challenges with the following steps: (a) incorporates inductive bias and heuristics to partition the Transformer search space into K compact sub-spaces (K=3 for typical student sizes of base, small and tiny); (b) trains one SuperLM for each sub-space using a task-agnostic objective (e.g., self-attention distillation) with weight-sharing of students; (c) performs a lightweight search for the optimal student without re-training.
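
For a sense of the workflow, here is a hypothetical Python skeleton of those three steps; every function below is a stand-in stub under assumed names, not AutoDistil's actual code, and the scoring proxy is a toy.

    K = 3  # one sub-space per typical student size: base, small, tiny

    def partition_search_space(full_space, k=K):
        # (a) Partition the Transformer search space into k compact
        # sub-spaces, here by a simple parameter-count heuristic.
        ordered = sorted(full_space, key=lambda cfg: cfg["params"])
        chunk = len(ordered) // k
        return [ordered[i * chunk:(i + 1) * chunk] for i in range(k)]

    def train_super_lm(sub_space):
        # (b) Train one weight-shared SuperLM per sub-space with a
        # task-agnostic objective such as self-attention distillation
        # (stubbed out here).
        return {"sub_space": sub_space}

    def student_score(super_lm, cfg):
        # (c) Score a candidate student with shared SuperLM weights and no
        # re-training. Toy proxy; a real search would evaluate a
        # distillation metric instead.
        return -cfg["params"]

    full_space = [{"params": int(p)} for p in (5e6, 11e6, 22e6, 44e6, 66e6, 110e6)]
    students = []
    for sub_space in partition_search_space(full_space):
        super_lm = train_super_lm(sub_space)
        students.append(max(sub_space, key=lambda cfg: student_score(super_lm, cfg)))
    print(students)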

Knowledge Distillation Neural Architecture Search

CLUES: Few-Shot Learning Evaluation in Natural Language Understanding

1 code implementation 4 Nov 2021 Subhabrata Mukherjee, Xiaodong Liu, Guoqing Zheng, Saghar Hosseini, Hao Cheng, Greg Yang, Christopher Meek, Ahmed Hassan Awadallah, Jianfeng Gao

We demonstrate that while recent models reach human performance when they have access to large amounts of labeled data, there is a huge gap in performance in the few-shot setting for most tasks.

Few-Shot Learning Natural Language Understanding

What do Compressed Large Language Models Forget? Robustness Challenges in Model Compression

no code implementations 16 Oct 2021 Mengnan Du, Subhabrata Mukherjee, Yu Cheng, Milad Shokouhi, Xia Hu, Ahmed Hassan Awadallah

Recent works on compressing pre-trained language models (PLMs) like BERT have focused mainly on improving the compressed model's performance on downstream tasks.

Knowledge Distillation Model Compression +1

LiST: Lite Prompted Self-training Makes Parameter-Efficient Few-shot Learners

1 code implementation 12 Oct 2021 Yaqing Wang, Subhabrata Mukherjee, Xiaodong Liu, Jing Gao, Ahmed Hassan Awadallah, Jianfeng Gao

The first is the use of self-training to leverage large amounts of unlabeled data for prompt-based fine-tuning (FT) in few-shot settings.

Few-Shot Learning

Self-training with Few-shot Rationalization: Teacher Explanations Aid Student in Few-shot NLU

no code implementations 17 Sep 2021 Meghana Moorthy Bhat, Alessandro Sordoni, Subhabrata Mukherjee

While pre-trained language models have obtained state-of-the-art performance for several natural language understanding tasks, they are quite opaque in terms of their decision-making process.

Decision Making Natural Language Understanding

Fairness via Representation Neutralization

no code implementations NeurIPS 2021 Mengnan Du, Subhabrata Mukherjee, Guanchu Wang, Ruixiang Tang, Ahmed Hassan Awadallah, Xia Hu

This process not only requires many instance-level annotations for sensitive attributes, but also does not guarantee that all fairness-sensitive information has been removed from the encoder.

Classification Fairness

XtremeDistilTransformers: Task Transfer for Task-agnostic Distillation

1 code implementation 8 Jun 2021 Subhabrata Mukherjee, Ahmed Hassan Awadallah, Jianfeng Gao

While deep and large pre-trained models are the state-of-the-art for various natural language processing tasks, their huge size poses significant challenges for practical uses in resource-constrained settings.

Knowledge Distillation NER +1

MetaXL: Meta Representation Transformation for Low-resource Cross-lingual Learning

1 code implementation NAACL 2021 Mengzhou Xia, Guoqing Zheng, Subhabrata Mukherjee, Milad Shokouhi, Graham Neubig, Ahmed Hassan Awadallah

Extensive experiments on real-world low-resource languages, without access to large-scale monolingual corpora or large amounts of labeled data, for tasks like cross-lingual sentiment analysis and named entity recognition show the effectiveness of our approach.

Cross-Lingual Transfer Meta-Learning +3

Self-Training with Weak Supervision

1 code implementation NAACL 2021 Giannis Karamanolakis, Subhabrata Mukherjee, Guoqing Zheng, Ahmed Hassan Awadallah

In this work, we develop a weak supervision framework (ASTRA) that leverages all the available data for a given task.

Text Classification

Adaptive Self-training for Neural Sequence Labeling with Few Labels

no code implementations 1 Jan 2021 Yaqing Wang, Subhabrata Mukherjee, Haoda Chu, Yuancheng Tu, Ming Wu, Jing Gao, Ahmed Hassan Awadallah

Neural sequence labeling is an important technique employed for many Natural Language Processing (NLP) tasks, such as Named Entity Recognition (NER), slot tagging for dialog systems and semantic parsing.

Meta-Learning Named Entity Recognition +2

Uncertainty-aware Self-training for Few-shot Text Classification

no code implementations NeurIPS 2020 Subhabrata Mukherjee, Ahmed Awadallah

Recent success of pre-trained language models crucially hinges on fine-tuning them on large amounts of labeled data for the downstream task, which are typically expensive to acquire or difficult to access for many applications.
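
The uncertainty signal in this line of work typically comes from stochastic forward passes; below is a hedged PyTorch sketch of Monte Carlo dropout, the kind of estimate an uncertainty-aware self-training loop can use to select pseudo-labeled examples. The tiny model and random inputs are placeholders.

    # Sketch: estimate predictive uncertainty with Monte Carlo dropout and
    # keep only the least uncertain unlabeled examples for self-training.
    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(),
                          nn.Dropout(0.3), nn.Linear(32, 2))

    def mc_dropout_predict(model, x, n_samples=10):
        model.train()  # keep dropout active at inference time
        with torch.no_grad():
            probs = torch.stack([model(x).softmax(-1) for _ in range(n_samples)])
        return probs.mean(0), probs.var(0)  # predictive mean and variance

    x_unlabeled = torch.randn(8, 16)
    mean_p, var_p = mc_dropout_predict(model, x_unlabeled)
    uncertainty = var_p.gather(1, mean_p.argmax(1, keepdim=True)).squeeze(1)
    selected = uncertainty.argsort()[:4]  # self-train on the most certain ones
    print(selected)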

Classification Few-Shot Text Classification +2

Adaptive Self-training for Few-shot Neural Sequence Labeling

no code implementations 7 Oct 2020 Yaqing Wang, Subhabrata Mukherjee, Haoda Chu, Yuancheng Tu, Ming Wu, Jing Gao, Ahmed Hassan Awadallah

While self-training serves as an effective mechanism to learn from large amounts of unlabeled data, meta-learning helps with adaptive sample re-weighting to mitigate error propagation from noisy pseudo-labels.
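
A hedged PyTorch sketch of the re-weighting idea follows: the student trains on teacher pseudo-labels with per-example weights. The paper learns these weights via meta-learning against clean data; using teacher confidence below is a deliberate simplification, and the linear models are placeholders.

    # Sketch: one re-weighted self-training step on pseudo-labeled data.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    teacher = nn.Linear(16, 5)
    student = nn.Linear(16, 5)
    opt = torch.optim.SGD(student.parameters(), lr=0.1)

    x_unlabeled = torch.randn(32, 16)
    with torch.no_grad():
        teacher_probs = teacher(x_unlabeled).softmax(-1)
    pseudo_labels = teacher_probs.argmax(-1)
    weights = teacher_probs.max(-1).values  # confidence as per-example weight

    # Down-weight uncertain pseudo-labels to limit error propagation.
    per_example = F.cross_entropy(student(x_unlabeled), pseudo_labels,
                                  reduction="none")
    loss = (weights * per_example).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()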

Meta-Learning Named Entity Recognition +2

Smart To-Do: Automatic Generation of To-Do Items from Emails

no code implementations ACL 2020 Sudipto Mukherjee, Subhabrata Mukherjee, Marcello Hasegawa, Ahmed Hassan Awadallah, Ryen White

Intelligent features in email service applications aim to increase productivity by helping people organize their folders, compose their emails and respond to pending tasks.

Text Generation

Learning with Weak Supervision for Email Intent Detection

no code implementations 26 May 2020 Kai Shu, Subhabrata Mukherjee, Guoqing Zheng, Ahmed Hassan Awadallah, Milad Shokouhi, Susan Dumais

In this paper, we propose to leverage user actions as a source of weak supervision, in addition to a limited set of annotated examples, to detect intents in emails.

Intent Classification Intent Detection

Product Insights: Analyzing Product Intents in Web Search

no code implementations 18 May 2020 Nikitha Rao, Chetan Bansal, Subhabrata Mukherjee, Chandra Maddila

Web search engines are frequently used to access information about products.

Smart To-Do: Automatic Generation of To-Do Items from Emails

no code implementations 5 May 2020 Sudipto Mukherjee, Subhabrata Mukherjee, Marcello Hasegawa, Ahmed Hassan Awadallah, Ryen White

Intelligent features in email service applications aim to increase productivity by helping people organize their folders, compose their emails and respond to pending tasks.

Text Generation

Distilling BERT into Simple Neural Networks with Unlabeled Transfer Data

no code implementations 4 Oct 2019 Subhabrata Mukherjee, Ahmed Hassan Awadallah

We show that our student models can compress the huge teacher by up to 26x while still matching or even marginally exceeding the teacher's performance in low-resource settings with small amounts of labeled data.
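
As a general illustration of the recipe (not the paper's exact objective), here is a minimal PyTorch sketch that distills a teacher's soft predictions into a much smaller student on unlabeled transfer examples; the toy networks, data, and temperature are assumptions.

    # Sketch: logit distillation from a large teacher into a small student
    # using only unlabeled transfer data.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    teacher = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 3))
    student = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 3))
    opt = torch.optim.Adam(student.parameters(), lr=1e-3)
    T = 2.0  # softmax temperature for softer targets

    x_transfer = torch.randn(128, 64)  # stand-in for unlabeled transfer text
    with torch.no_grad():
        soft_targets = (teacher(x_transfer) / T).softmax(-1)
    loss = F.kl_div((student(x_transfer) / T).log_softmax(-1),
                    soft_targets, reduction="batchmean") * T * T
    opt.zero_grad()
    loss.backward()
    opt.step()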

Knowledge Distillation NER

GhostLink: Latent Network Inference for Influence-aware Recommendation

no code implementations 15 May 2019 Subhabrata Mukherjee, Stephan Guennemann

As additional use-cases, we show that GhostLink can be used to differentiate between users' latent preferences and influenced ones, as well as to detect influential users based on the learned influence graph.

OpenKI: Integrating Open Information Extraction and Knowledge Bases with Relation Inference

1 code implementation NAACL 2019 Dongxu Zhang, Subhabrata Mukherjee, Colin Lockard, Xin Luna Dong, Andrew McCallum

In this paper, we consider advancing web-scale knowledge extraction and alignment by integrating OpenIE extractions in the form of (subject, predicate, object) triples with Knowledge Bases (KB).

Open Information Extraction

Probabilistic Graphical Models for Credibility Analysis in Evolving Online Communities

no code implementations 26 Jul 2017 Subhabrata Mukherjee

To address the above limitations, we propose probabilistic graphical models that can leverage the joint interplay between multiple factors in online communities, like user interactions, community dynamics, and textual content, to automatically assess the credibility of user-contributed online content, as well as the expertise of users and their evolution, with user-interpretable explanations.

Language Modelling Recommendation Systems +1

Credible Review Detection with Limited Information using Consistency Analysis

no code implementations 7 May 2017 Subhabrata Mukherjee, Sourav Dutta, Gerhard Weikum

Online reviews provide viewpoints on the strengths and shortcomings of products/services, influencing potential customers' purchasing decisions.

Topic Models

People on Media: Jointly Identifying Credible News and Trustworthy Citizen Journalists in Online Communities

no code implementations 7 May 2017 Subhabrata Mukherjee, Gerhard Weikum

This paper presents a model to systematically analyze the different interactions in a news community between users, news, and sources.

Fairness

Item Recommendation with Evolving User Preferences and Experience

no code implementations 6 May 2017 Subhabrata Mukherjee, Hemank Lamba, Gerhard Weikum

As only item ratings and review texts are observable, we capture the user's experience and interests in a latent model learned from her reviews, vocabulary and writing style.

Collaborative Filtering Recommendation Systems

Exploring Latent Semantic Factors to Find Useful Product Reviews

no code implementations 6 May 2017 Subhabrata Mukherjee, Kashyap Popat, Gerhard Weikum

In this work, we attempt to automatically identify review quality in terms of its helpfulness to the end consumers.

Author-Specific Sentiment Aggregation for Polarity Prediction of Reviews

no code implementations LREC 2014 Subhabrata Mukherjee, Sachindra Joshi

Furthermore, we show the effectiveness of our approach in capturing thwarting in reviews, achieving an accuracy improvement of 11.53% over the SVM baseline.

Dependency Parsing General Classification +2

Sentiment Analysis: A Literature Survey

no code implementations 16 Apr 2013 Subhabrata Mukherjee, Pushpak Bhattacharyya

We discuss in detail various approaches to the computational treatment of sentiments and opinions.

Opinion Mining Sentiment Analysis
