Search Results for author: Scott A. Hale

Found 28 papers, 7 papers with code

Introducing v0.5 of the AI Safety Benchmark from MLCommons

1 code implementation • 18 Apr 2024 • Bertie Vidgen, Adarsh Agrawal, Ahmed M. Ahmed, Victor Akinwande, Namir Al-Nuaimi, Najla Alfaraj, Elie Alhajjar, Lora Aroyo, Trupti Bavalatti, Borhane Blili-Hamelin, Kurt Bollacker, Rishi Bommasani, Marisa Ferrara Boston, Siméon Campos, Kal Chakra, Canyu Chen, Cody Coleman, Zacharie Delpierre Coudert, Leon Derczynski, Debojyoti Dutta, Ian Eisenberg, James Ezick, Heather Frase, Brian Fuller, Ram Gandikota, Agasthya Gangavarapu, Ananya Gangavarapu, James Gealy, Rajat Ghosh, James Goel, Usman Gohar, Sujata Goswami, Scott A. Hale, Wiebke Hutiri, Joseph Marvin Imperial, Surgan Jandial, Nick Judd, Felix Juefei-Xu, Foutse Khomh, Bhavya Kailkhura, Hannah Rose Kirk, Kevin Klyman, Chris Knotz, Michael Kuchnik, Shachi H. Kumar, Chris Lengerich, Bo Li, Zeyi Liao, Eileen Peters Long, Victor Lu, Yifan Mai, Priyanka Mary Mammen, Kelvin Manyeki, Sean McGregor, Virendra Mehta, Shafee Mohammed, Emanuel Moss, Lama Nachman, Dinesh Jinenhally Naganna, Amin Nikanjam, Besmira Nushi, Luis Oala, Iftach Orr, Alicia Parrish, Cigdem Patlak, William Pietri, Forough Poursabzi-Sangdeh, Eleonora Presani, Fabrizio Puletti, Paul Röttger, Saurav Sahay, Tim Santos, Nino Scherrer, Alice Schoenauer Sebag, Patrick Schramowski, Abolfazl Shahbazi, Vin Sharma, Xudong Shen, Vamsi Sistla, Leonard Tang, Davide Testuggine, Vithursan Thangarasa, Elizabeth Anne Watkins, Rebecca Weiss, Chris Welty, Tyler Wilbers, Adina Williams, Carole-Jean Wu, Poonam Yadav, Xianjun Yang, Yi Zeng, Wenhui Zhang, Fedor Zhdanov, Jiacheng Zhu, Percy Liang, Peter Mattson, Joaquin Vanschoren

We created a new taxonomy of 13 hazard categories, of which 7 have tests in the v0.5 benchmark.

SimpleSafetyTests: a Test Suite for Identifying Critical Safety Risks in Large Language Models

no code implementations • 14 Nov 2023 • Bertie Vidgen, Nino Scherrer, Hannah Rose Kirk, Rebecca Qian, Anand Kannappan, Scott A. Hale, Paul Röttger

While some of the models do not give a single unsafe response, most give unsafe responses to more than 20% of the prompts, with over 50% unsafe responses in the extreme.

Lost in Translation -- Multilingual Misinformation and its Evolution

no code implementations • 27 Oct 2023 • Dorian Quelle, Calvin Cheng, Alexandre Bovet, Scott A. Hale

Using fact-checks as a proxy for the spread of misinformation, we find 33% of repeated claims cross linguistic boundaries, suggesting that some misinformation permeates language barriers.

Fact Checking • Misinformation • +3

The Empty Signifier Problem: Towards Clearer Paradigms for Operationalising "Alignment" in Large Language Models

no code implementations • 3 Oct 2023 • Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale

In this paper, we address the concept of "alignment" in large language models (LLMs) through the lens of post-structuralist socio-political theory, specifically examining its parallels to empty signifiers.

Casteist but Not Racist? Quantifying Disparities in Large Language Model Bias between India and the West

no code implementations • 15 Sep 2023 • Khyati Khandelwal, Manuel Tonneau, Andrew M. Bean, Hannah Rose Kirk, Scott A. Hale

In this paper, we quantify stereotypical bias in popular LLMs according to an Indian-centric frame and compare bias levels between the Indian and Western contexts.

Language Modelling • Large Language Model

DoDo Learning: DOmain-DemOgraphic Transfer in Language Models for Detecting Abuse Targeted at Public Figures

1 code implementation • 31 Jul 2023 • Angus R. Williams, Hannah Rose Kirk, Liam Burke, Yi-Ling Chung, Ivan Debono, Pica Johansson, Francesca Stevens, Jonathan Bright, Scott A. Hale

We find that (i) small amounts of diverse data are hugely beneficial to generalisation and model adaptation; (ii) models transfer more easily across demographics but models trained on cross-domain data are more generalisable; (iii) some groups contribute more to generalisability than others; and (iv) dataset similarity is a signal of transferability.

Text Classification

Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback

no code implementations • 9 Mar 2023 • Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale

Large language models (LLMs) are used to generate content for a wide range of tasks and are set to reach a growing audience in the coming years due to integration into product interfaces like ChatGPT and search engines like Bing.

Query Rewriting for Effective Misinformation Discovery

no code implementations • 14 Oct 2022 • Ashkan Kazemi, Artem Abzaliev, Naihao Deng, Rui Hou, Scott A. Hale, Verónica Pérez-Rosas, Rada Mihalcea

We propose a novel system to help fact-checkers formulate search queries for known misinformation claims and effectively search across multiple social media platforms.

Misinformation • Reinforcement Learning • +2
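As a rough illustration of the underlying idea, the sketch below generates candidate keyword rewrites of a claim and keeps the one that best retrieves a set of known-relevant posts. The candidate generator and overlap-based scorer are simplifying assumptions for illustration, not the authors' reinforcement-learning system.

```python
# Hypothetical sketch: pick the keyword rewrite of a claim that best
# retrieves known-relevant posts. The generator and scorer below are
# illustrative stand-ins, not the paper's reinforcement-learning approach.
from itertools import combinations

STOPWORDS = {"the", "a", "an", "of", "to", "in", "is", "are", "that"}

def candidate_rewrites(claim, max_terms=4):
    """Generate short keyword queries from the claim's content words."""
    words = [w.lower().strip(".,!?") for w in claim.split()]
    content = [w for w in words if w not in STOPWORDS]
    k = min(max_terms, len(content))
    return [" ".join(c) for c in combinations(content, k)]

def recall_score(query, relevant_posts):
    """Fraction of known-relevant posts containing every query term."""
    terms = query.split()
    hits = sum(all(t in post.lower() for t in terms) for post in relevant_posts)
    return hits / len(relevant_posts)

def best_rewrite(claim, relevant_posts):
    return max(candidate_rewrites(claim),
               key=lambda q: recall_score(q, relevant_posts))

claim = "Drinking hot water cures the virus"
posts = ["viral post claims drinking hot water cures covid",
         "experts: hot water cures nothing, the virus persists"]
print(best_rewrite(claim, posts))  # -> 'drinking hot water cures'
```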

Top Gear or Black Mirror: Inferring Political Leaning From Non-Political Content

no code implementations • 11 Aug 2022 • Ahmet Kurnaz, Scott A. Hale

Polarization and echo chambers are often studied in the context of explicitly political events such as elections, yet little scholarship has examined the mixing of political groups in non-political contexts.

Matching Tweets With Applicable Fact-Checks Across Languages

no code implementations • 14 Feb 2022 • Ashkan Kazemi, Zehua Li, Verónica Pérez-Rosas, Scott A. Hale, Rada Mihalcea

We conduct both classification and retrieval experiments, in monolingual (English only), multilingual (Spanish, Portuguese), and cross-lingual (Hindi-English) settings using multilingual transformer models such as XLM-RoBERTa and multilingual embeddings such as LaBSE and SBERT.

Fact Checking • Retrieval
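As a loose illustration of this kind of setup, the snippet below embeds tweets and fact-checks with LaBSE through the sentence-transformers library and retrieves the closest fact-check for each tweet; the example texts, model choice, and top-k setting are assumptions for demonstration, not the paper's exact configuration.

```python
# Illustrative cross-lingual claim matching with LaBSE sentence embeddings.
# Texts and settings are made up; only the general technique is shown.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/LaBSE")

tweets = [
    "La vacuna cambia tu ADN",        # Spanish
    "A vacina altera o seu DNA",      # Portuguese
]
fact_checks = [
    "Claim that the vaccine alters human DNA is false.",
    "No, 5G towers do not spread the virus.",
]

# LaBSE maps 100+ languages into a shared space, so tweets and fact-checks
# in different languages can be compared directly by cosine similarity.
tweet_emb = model.encode(tweets, convert_to_tensor=True, normalize_embeddings=True)
fc_emb = model.encode(fact_checks, convert_to_tensor=True, normalize_embeddings=True)

for tweet, hits in zip(tweets, util.semantic_search(tweet_emb, fc_emb, top_k=1)):
    match = hits[0]
    print(f"{tweet!r} -> {fact_checks[match['corpus_id']]!r} (score {match['score']:.2f})")
```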

Fairness via AI: Bias Reduction in Medical Information

no code implementations • 6 Sep 2021 • Shiri Dori-Hacohen, Roberto Montenegro, Fabricio Murai, Scott A. Hale, Keen Sung, Michela Blain, Jennifer Edwards-Johnson

While part (3) of this work specifically focuses on the health domain, the fundamental computer science advances and contributions stemming from research efforts in bias reduction and Fairness via AI have broad implications in all areas of society.

Bias Detection • Fairness • +3

Deciphering Implicit Hate: Evaluating Automated Detection Algorithms for Multimodal Hate

no code implementations • Findings (ACL) 2021 • Austin Botelho, Bertie Vidgen, Scott A. Hale

We show that both text and visual enrichment improve model performance, with the multimodal model (0.771 F1) outperforming the other models' F1 scores (0.544, 0.737, and 0.754).
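For intuition, a common way to combine the two modalities is late fusion: separately extracted text and image feature vectors are concatenated and passed to a small classification head. The sketch below assumes precomputed features with made-up dimensions; it is not the paper's architecture.

```python
# Hypothetical late-fusion classifier over precomputed text and image
# features; dimensions and head size are illustrative assumptions.
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    def __init__(self, text_dim=768, image_dim=512, n_classes=2):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(text_dim + image_dim, 256),
            nn.ReLU(),
            nn.Linear(256, n_classes),
        )

    def forward(self, text_feats, image_feats):
        # Concatenate the modalities, then classify the fused vector.
        return self.head(torch.cat([text_feats, image_feats], dim=-1))

clf = LateFusionClassifier()
logits = clf(torch.randn(4, 768), torch.randn(4, 512))  # batch of 4 posts
print(logits.shape)  # torch.Size([4, 2])
```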

Claim Matching Beyond English to Scale Global Fact-Checking

no code implementations • ACL 2021 • Ashkan Kazemi, Kiran Garimella, Devin Gaffney, Scott A. Hale

We train our own embedding model using knowledge distillation and a high-quality "teacher" model in order to address the imbalance in embedding quality between the low- and high-resource languages in our dataset.

Fact Checking • Knowledge Distillation
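A minimal sketch of the distillation idea, assuming a frozen teacher whose embeddings the student learns to reproduce with an MSE loss; the toy linear encoders below stand in for real transformer models and are not the paper's setup.

```python
# Toy embedding distillation: train a student so its embeddings match a
# frozen teacher's. Linear "encoders" are placeholders for real models.
import torch
import torch.nn as nn

def distillation_step(student, teacher, batch, optimizer):
    """One MSE-on-embeddings distillation step; returns the loss value."""
    with torch.no_grad():
        target = teacher(batch)      # frozen, high-quality teacher embeddings
    loss = nn.functional.mse_loss(student(batch), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

teacher = nn.Linear(32, 16).requires_grad_(False)  # stand-in teacher encoder
student = nn.Linear(32, 16)                        # stand-in student encoder
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

for _ in range(100):
    loss = distillation_step(student, teacher, torch.randn(64, 32), opt)
print(f"final loss: {loss:.4f}")
```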

Demographic Inference and Representative Population Estimates from Multilingual Social Media Data

no code implementations • 15 May 2019 • Zijian Wang, Scott A. Hale, David Adelani, Przemyslaw A. Grabowicz, Timo Hartmann, Fabian Flöck, David Jurgens

In a large experiment over multilingual, heterogeneous European regions, we show that our demographic inference and bias correction together allow for more accurate population estimates and take a significant step towards representative social sensing in downstream applications with multilingual social media.

Attribute
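As a rough illustration of the bias-correction step, one standard approach is to reweight users so inferred demographic groups match census proportions; the groups and numbers below are invented for demonstration and do not come from the paper.

```python
# Hypothetical post-stratification-style reweighting: upweight groups that
# are under-represented on the platform relative to the census.
census_share = {"18-29": 0.20, "30-49": 0.35, "50+": 0.45}   # made-up census
sample_share = {"18-29": 0.45, "30-49": 0.40, "50+": 0.15}   # made-up inferred shares

weights = {g: census_share[g] / sample_share[g] for g in census_share}
print(weights)  # 50+ users get weight 3.0; 18-29 users get weight ~0.44
```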

Measuring the Volatility of the Political Agenda in Public Opinion and News Media

1 code implementation • 27 Aug 2018 • Chico Q. Camargo, Scott A. Hale, Peter John, Helen Z. Margetts

Recent election surprises, regime changes, and political shocks indicate that political agendas have become more fast-moving and volatile.

Foreign-language Reviews: Help or Hindrance?

no code implementations • 1 Feb 2017 • Scott A. Hale, Irene Eleta

The number and quality of user reviews greatly affect consumer purchasing decisions.

User Reviews and Language: How Language Influences Ratings

no code implementations • 6 May 2016 • Scott A. Hale

The number of user reviews of tourist attractions, restaurants, mobile apps, etc.

Understanding Editing Behaviors in Multilingual Wikipedia

no code implementations • 28 Aug 2015 • Suin Kim, Sungjoon Park, Scott A. Hale, Sooyoung Kim, Jeongmin Byun, Alice Oh

We study multilingualism by collecting and analyzing a large dataset of the content written by multilingual editors of the English, German, and Spanish editions of Wikipedia.

How much is said in a microblog? A multilingual inquiry based on Weibo and Twitter

no code implementations • 1 Jun 2015 • Han-Teng Liao, King-wa Fu, Scott A. Hale

This paper presents a multilingual study of single microblog posts: (a) how much can be said per post, (b) how much is written in terms of characters and bytes, and (c) how much is said in terms of information content in posts by different organizations in different languages.
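The character-versus-byte distinction is easy to see directly; the toy snippet below (example strings are our own, not from the paper's data) shows how the same short message occupies very different character and UTF-8 byte counts across scripts.

```python
# Characters vs. UTF-8 bytes for short messages in different scripts;
# example strings are illustrative, not from the paper's data.
for text in ["Hello, world!", "你好，世界！", "こんにちは、世界"]:
    print(f"{text!r}: {len(text)} chars, {len(text.encode('utf-8'))} bytes")
```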

Cross-language Wikipedia Editing of Okinawa, Japan

no code implementations • 4 Jan 2015 • Scott A. Hale

This article analyzes users who edit Wikipedia articles about Okinawa, Japan, in English and Japanese.

Multilinguals and Wikipedia Editing

no code implementations • 3 Dec 2013 • Scott A. Hale

This article analyzes one month of edits to Wikipedia in order to examine the role of users editing multiple language editions (referred to as multilingual users).
