Search Results for author: Shomir Wilson

Found 25 papers, 3 papers with code

A Study of Implicit Bias in Pretrained Language Models against People with Disabilities

no code implementations COLING 2022 Pranav Narayanan Venkit, Mukund Srinath, Shomir Wilson

Pretrained language models (PLMs) have been shown to exhibit sociodemographic biases, such as against gender and race, raising concerns of downstream biases in language technologies.

This Table is Different: A WordNet-Based Approach to Identifying References to Document Entities

no code implementations GWC 2016 Shomir Wilson, Alan Black, Jon Oberlander

Writing intended to inform frequently contains references to document entities (DEs), a mixed class that includes orthographically structured items (e.g., illustrations, sections, lists) and discourse entities (arguments, suggestions, points).

STAPI: An Automatic Scraper for Extracting Iterative Title-Text Structure from Web Documents

1 code implementation LREC 2022 Nan Zhang, Shomir Wilson, Prasenjit Mitra

Therefore, we propose the first title-text dataset on web documents that incorporates a wide variety of domains to facilitate downstream training.

Headline Generation

"Confidently Nonsensical?": A Critical Survey on the Perspectives and Challenges of 'Hallucinations' in NLP

no code implementations 11 Apr 2024 Pranav Narayanan Venkit, Tatiana Chakravorti, Vipul Gupta, Heidi Biggs, Mukund Srinath, Koustava Goswami, Sarah Rajtmajer, Shomir Wilson

We investigate how hallucination in large language models (LLMs) is characterized in peer-reviewed literature using a critical examination of 103 publications across NLP research.

Hallucination

Automated Detection and Analysis of Data Practices Using A Real-World Corpus

no code implementations 16 Feb 2024 Mukund Srinath, Pranav Venkit, Maria Badillo, Florian Schaub, C. Lee Giles, Shomir Wilson

Privacy policies are crucial for informing users about data practices, yet their length and complexity often deter users from reading them.

The Sentiment Problem: A Critical Survey towards Deconstructing Sentiment Analysis

no code implementations 18 Oct 2023 Pranav Narayanan Venkit, Mukund Srinath, Sanjana Gautam, Saranya Venkatraman, Vipul Gupta, Rebecca J. Passonneau, Shomir Wilson

We conduct an inquiry into the sociotechnical aspects of sentiment analysis (SA) by critically examining 189 peer-reviewed papers on their applications, models, and datasets.

Ethics, Sentiment Analysis

CALM: A Multi-task Benchmark for Comprehensive Assessment of Language Model Bias

1 code implementation 24 Aug 2023 Vipul Gupta, Pranav Narayanan Venkit, Hugo Laurençon, Shomir Wilson, Rebecca J. Passonneau

We apply CALM to 20 large language models, and find that for 2 language model series, larger parameter models tend to be more biased than smaller ones.

Language Modelling, Natural Language Inference +4

Automated Ableism: An Exploration of Explicit Disability Biases in Sentiment and Toxicity Analysis Models

no code implementations 18 Jul 2023 Pranav Narayanan Venkit, Mukund Srinath, Shomir Wilson

We analyze sentiment analysis and toxicity detection models to detect the presence of explicit bias against people with disability (PWD).

Sentiment Analysis

Sociodemographic Bias in Language Models: A Survey and Forward Path

no code implementations 13 Jun 2023 Vipul Gupta, Pranav Narayanan Venkit, Shomir Wilson, Rebecca J. Passonneau

This paper presents a comprehensive survey of work on sociodemographic bias in language models (LMs).

Nationality Bias in Text Generation

no code implementations 5 Feb 2023 Pranav Narayanan Venkit, Sanjana Gautam, Ruchi Panchanadikar, Ting-Hao 'Kenneth' Huang, Shomir Wilson

Little attention is placed on analyzing nationality bias in language models, especially when nationality is highly used as a factor in increasing the performance of social NLP models.

Text Generation

Creation and Analysis of an International Corpus of Privacy Laws

no code implementations 28 Jun 2022 Sonu Gupta, Ellen Poplavska, Nora O'Toole, Siddhant Arora, Thomas Norton, Norman Sadeh, Shomir Wilson

To examine the status and evolution of this patchwork, we introduce the Government Privacy Instructions Corpus, or GPI Corpus, of 1,043 privacy laws, regulations, and guidelines, covering 182 jurisdictions.

Automated Detection of Doxing on Twitter

no code implementations 2 Feb 2022 Younes Karimi, Anna Squicciarini, Shomir Wilson

Doxing refers to the practice of disclosing sensitive personal information about a person without their consent.

Supervised and Unsupervised Methods for Robust Separation of Section Titles and Prose Text in Web Documents

no code implementations EMNLP 2018 Abhijith Athreya Mysore Gopinath, Shomir Wilson, Norman Sadeh

To remedy this, we present a flexible system for automatically extracting the hierarchical section titles and prose organization of web documents irrespective of differences in HTML representation.

Information Retrieval, Question Answering

Identifying the Provision of Choices in Privacy Policy Text

no code implementations EMNLP 2017 Kanthashree Mysore Sathyendra, Shomir Wilson, Florian Schaub, Sebastian Zimmeck, Norman Sadeh

Our techniques enable the creation of systems to help Internet users to learn about their choices, thereby effectuating notice and choice and improving Internet privacy.

Question Answering
