Search Results for author: David Bamman

Found 34 papers, 22 papers with code

Gender and Representation Bias in GPT-3 Generated Stories

1 code implementation NAACL (NUSE) 2021 Li Lucy, David Bamman

Using topic modeling and lexicon-based word similarity, we find that stories generated by GPT-3 exhibit many known gender stereotypes.

Word Similarity

Narrative Theory for Computational Narrative Understanding

no code implementations EMNLP 2021 Andrew Piper, Richard Jean So, David Bamman

Over the past decade, the field of natural language processing has developed a wide array of computational methods for reasoning about narrative, including summarization, commonsense inference, and event detection.

Event Detection Position

Social Meme-ing: Measuring Linguistic Variation in Memes

1 code implementation 15 Nov 2023 Naitian Zhou, David Jurgens, David Bamman

Much work in the space of NLP has used computational methods to explore sociolinguistic variation in text.

Grounding Characters and Places in Narrative Texts

1 code implementation 27 May 2023 Sandeep Soni, Amanpreet Sihra, Elizabeth F. Evans, Matthew Wilkens, David Bamman

Tracking characters and locations throughout a story can help improve the understanding of its plot structure.

Dramatic Conversation Disentanglement

1 code implementation 26 May 2023 Kent K. Chang, Danica Chen, David Bamman

We present a new dataset for studying conversation disentanglement in movies and TV series.

Conversation Disentanglement Disentanglement +1

Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4

1 code implementation 28 Apr 2023 Kent K. Chang, Mackenzie Cramer, Sandeep Soni, David Bamman

In this work, we carry out a data archaeology to infer books that are known to ChatGPT and GPT-4 using a name cloze membership inference query.

Memorization

Words as Gatekeepers: Measuring Discipline-specific Terms and Meanings in Scholarly Publications

1 code implementation 19 Dec 2022 Li Lucy, Jesse Dodge, David Bamman, Katherine A. Keith

Scholarly text is often laden with jargon, or specialized language that can facilitate efficient in-group communication within fields but hinder understanding for out-groups.

Word Sense Induction

Predicting Long-Term Citations from Short-Term Linguistic Influence

1 code implementation 24 Oct 2022 Sandeep Soni, David Bamman, Jacob Eisenstein

A standard measure of the influence of a research paper is the number of times it is cited.

Discovering Differences in the Representation of People using Contextualized Semantic Axes

1 code implementation 21 Oct 2022 Li Lucy, Divya Tadimeti, David Bamman

A common paradigm for identifying semantic differences across social and temporal contexts is the use of static word embeddings and their distances.

Word Embeddings

Characterizing English Variation across Social Media Communities with BERT

1 code implementation 12 Feb 2021 Li Lucy, David Bamman

Much previous work characterizing language variation across Internet social groups has focused on the types of words used by these groups.

Specificity

Attending to Long-Distance Document Context for Sequence Labeling

1 code implementation Findings of the Association for Computational Linguistics 2020 Matthew Jörke, Jon Gillick, Matthew Sims, David Bamman

We present in this work a method for incorporating global context in long documents when making local decisions in sequence labeling problems like NER.

NER

Latin BERT: A Contextual Language Model for Classical Philology

1 code implementation 21 Sep 2020 David Bamman, Patrick J. Burns

We present Latin BERT, a contextual language model for the Latin language, trained on 642.7 million words from a variety of sources spanning the Classical era to the 21st century.

Language Modelling Part-Of-Speech Tagging +2

Measuring Information Propagation in Literary Social Networks

3 code implementations EMNLP 2020 Matthew Sims, David Bamman

We present the task of modeling information propagation in literature, in which we seek to identify pieces of information passing from character A to character B to character C, only given a description of their activity in text.

Breaking Speech Recognizers to Imagine Lyrics

1 code implementation 15 Dec 2019 Jon Gillick, David Bamman

We introduce a new method for generating text, and in particular song lyrics, based on the speech-like acoustic qualities of a given audio file.

An Annotated Dataset of Coreference in English Literature

3 code implementations LREC 2020 David Bamman, Olivia Lewke, Anya Mansoor

We present in this work a new dataset of coreference annotations for works of literature in English, covering 29,103 mentions in 210,532 tokens from 100 works of fiction.

Literary Event Detection

2 code implementations ACL 2019 Matthew Sims, Jong Ho Park, David Bamman

In this work we present a new dataset of literary events: events that are depicted as taking place within the imagined space of a novel.

Event Detection

An annotated dataset of literary entities

2 code implementations NAACL 2019 David Bamman, Sejal Popat, Sheng Shen

We present a new dataset comprised of 210,532 tokens evenly drawn from 100 different English-language literary texts annotated for ACE entity categories (person, location, geo-political entity, facility, organization, and vehicle).

Learning to Groove with Inverse Sequence Transformations

1 code implementation 14 May 2019 Jon Gillick, Adam Roberts, Jesse Engel, Douglas Eck, David Bamman

We explore models for translating abstract musical ideas (scores, rhythms) into expressive performances using Seq2Seq and recurrent Variational Information Bottleneck (VIB) models.

Generative Adversarial Network Quantization

Please Clap: Modeling Applause in Campaign Speeches

no code implementations NAACL 2018 Jon Gillick, David Bamman

This work examines the rhetorical techniques that speakers employ during political campaigns.

Adversarial Training for Relation Extraction

no code implementations EMNLP 2017 Yi Wu, David Bamman, Stuart Russell

Adversarial training is a means of regularizing classification algorithms by adding adversarial noise to the training data.

General Classification Image Classification +4

The Labeled Segmentation of Printed Books

no code implementations EMNLP 2017 Lara McConnaughey, Jennifer Dai, David Bamman

We introduce the task of book structure labeling: segmenting and assigning a fixed category (such as Table of Contents, Preface, Index) to the document structure of printed books.

Optical Character Recognition (OCR) Segmentation

Annotating Character Relationships in Literary Texts

no code implementations 2 Dec 2015 Philip Massey, Patrick Xia, David Bamman, Noah A. Smith

We present a dataset of manually annotated relationships between characters in literary texts, in order to support the training and evaluation of automatic methods for relation type prediction in this domain (Makazhanov et al., 2014; Kokkinakis, 2013) and the broader computational analysis of literary character (Elson et al., 2010; Bamman et al., 2014; Vala et al., 2015; Flekova and Gurevych, 2015).

Type prediction

Unsupervised Discovery of Biographical Structure from Text

no code implementations TACL 2014 David Bamman, Noah A. Smith

We present a method for discovering abstract event classes in biographies, based on a probabilistic latent-variable model.

New Alignment Methods for Discriminative Book Summarization

no code implementations 6 May 2013 David Bamman, Noah A. Smith

We consider the unsupervised alignment of the full text of a book with a human-written summary.

Book summarization

Gender identity and lexical variation in social media

1 code implementation 16 Oct 2012 David Bamman, Jacob Eisenstein, Tyler Schnoebelen

Examining individuals whose language does not match the classifier's model for their gender, we find that they have social networks that include significantly fewer same-gender social connections and that, in general, social network homophily is correlated with the use of same-gender language markers.

Clustering
