Search Results for author: Shubhanshu Mishra

Found 8 papers, 5 papers with code

Improved Multilingual Language Model Pretraining for Social Media Text via Translation Pair Prediction

no code implementations 20 Oct 2021 Shubhanshu Mishra, Aria Haghighi

We evaluate a simple approach to improving zero-shot multilingual transfer of mBERT on a social media corpus by adding a pretraining task called translation pair prediction (TPP), which predicts whether a pair of cross-lingual texts is a valid translation.

Language Modelling, NER, +4

LMSOC: An Approach for Socially Sensitive Pretraining

1 code implementation 20 Oct 2021 Vivek Kulkarni, Shubhanshu Mishra, Aria Haghighi

Although language depends heavily on the geographical, temporal, and other social contexts of the speaker, these elements have not been incorporated into modern transformer-based language models.

Cloze Test, Graph Representation Learning, +1

Image Cropping on Twitter: Fairness Metrics, their Limitations, and the Importance of Representation, Design, and Agency

2 code implementations 18 May 2021 Kyra Yee, Uthaipon Tantipongpipat, Shubhanshu Mishra

However, we demonstrate that formalized fairness metrics and quantitative analysis on their own are insufficient for capturing the risk of representational harm in automatic cropping.

Fairness, Image Cropping

A Framework for Generating Annotated Social Media Corpora with Demographics, Stance, Civility, and Topicality

no code implementations 10 Dec 2020 Shubhanshu Mishra, Daniel Collier

In this paper we introduce a framework for annotating social media text corpora for various categories.

Assessing Demographic Bias in Named Entity Recognition

no code implementations 8 Aug 2020 Shubhanshu Mishra, Sijun He, Luca Belli

Named Entity Recognition (NER) is often the first step towards automated Knowledge Base (KB) generation from raw text.

Named Entity Recognition, NER

Multilingual Joint Fine-tuning of Transformer models for identifying Trolling, Aggression and Cyberbullying at TRAC 2020

1 code implementation LREC 2020 Sudhanshu Mishra, Shivangi Prasad, Shubhanshu Mishra

We also investigated the utility of task label marginalization, joint label classification, and joint training on multilingual datasets as possible improvements to our models.
