Search Results for author: David Bamman

Found 34 papers, 22 papers with code

Gender and Representation Bias in GPT-3 Generated Stories

1 code implementation • NAACL (NUSE) 2021 • Li Lucy, David Bamman

Using topic modeling and lexicon-based word similarity, we find that stories generated by GPT-3 exhibit many known gender stereotypes.

Word Similarity

Narrative Theory for Computational Narrative Understanding

no code implementations • EMNLP 2021 • Andrew Piper, Richard Jean So, David Bamman

Over the past decade, the field of natural language processing has developed a wide array of computational methods for reasoning about narrative, including summarization, commonsense inference, and event detection.

Event Detection • Position

AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters

1 code implementation • 12 Jan 2024 • Li Lucy, Suchin Gururangan, Luca Soldaini, Emma Strubell, David Bamman, Lauren Klein, Jesse Dodge

Large language models' (LLMs) abilities are drawn from their pretraining data, and model development begins with data curation.

Language Identification

Social Meme-ing: Measuring Linguistic Variation in Memes

1 code implementation • 15 Nov 2023 • Naitian Zhou, David Jurgens, David Bamman

Much work in the space of NLP has used computational methods to explore sociolinguistic variation in text.

Grounding Characters and Places in Narrative Texts

1 code implementation • 27 May 2023 • Sandeep Soni, Amanpreet Sihra, Elizabeth F. Evans, Matthew Wilkens, David Bamman

Tracking characters and locations throughout a story can help improve the understanding of its plot structure.

Dramatic Conversation Disentanglement

1 code implementation • 26 May 2023 • Kent K. Chang, Danica Chen, David Bamman

We present a new dataset for studying conversation disentanglement in movies and TV series.

Conversation Disentanglement • Disentanglement • +1

Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4

1 code implementation • 28 Apr 2023 • Kent K. Chang, Mackenzie Cramer, Sandeep Soni, David Bamman

In this work, we carry out a data archaeology to infer books that are known to ChatGPT and GPT-4 using a name cloze membership inference query.

Memorization

Words as Gatekeepers: Measuring Discipline-specific Terms and Meanings in Scholarly Publications

1 code implementation • 19 Dec 2022 • Li Lucy, Jesse Dodge, David Bamman, Katherine A. Keith

Scholarly text is often laden with jargon, or specialized language that can facilitate efficient in-group communication within fields but hinder understanding for out-groups.

Word Sense Induction

Predicting Long-Term Citations from Short-Term Linguistic Influence

1 code implementation • 24 Oct 2022 • Sandeep Soni, David Bamman, Jacob Eisenstein

A standard measure of the influence of a research paper is the number of times it is cited.

Discovering Differences in the Representation of People using Contextualized Semantic Axes

1 code implementation • 21 Oct 2022 • Li Lucy, Divya Tadimeti, David Bamman

A common paradigm for identifying semantic differences across social and temporal contexts is the use of static word embeddings and their distances.

Word Embeddings

Characterizing English Variation across Social Media Communities with BERT

1 code implementation • 12 Feb 2021 • Li Lucy, David Bamman

Much previous work characterizing language variation across Internet social groups has focused on the types of words used by these groups.

Specificity

Attending to Long-Distance Document Context for Sequence Labeling

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Matthew Jörke, Jon Gillick, Matthew Sims, David Bamman

We present in this work a method for incorporating global context in long documents when making local decisions in sequence labeling problems like NER.

NER

Latin BERT: A Contextual Language Model for Classical Philology

1 code implementation • 21 Sep 2020 • David Bamman, Patrick J. Burns

We present Latin BERT, a contextual language model for the Latin language, trained on 642.7 million words from a variety of sources spanning the Classical era to the 21st century.

Language Modelling • Part-Of-Speech Tagging • +2

Measuring Information Propagation in Literary Social Networks

3 code implementations • EMNLP 2020 • Matthew Sims, David Bamman

We present the task of modeling information propagation in literature, in which we seek to identify pieces of information passing from character A to character B to character C, only given a description of their activity in text.

Breaking Speech Recognizers to Imagine Lyrics

1 code implementation • 15 Dec 2019 • Jon Gillick, David Bamman

We introduce a new method for generating text, and in particular song lyrics, based on the speech-like acoustic qualities of a given audio file.

An Annotated Dataset of Coreference in English Literature

3 code implementations • LREC 2020 • David Bamman, Olivia Lewke, Anya Mansoor

We present in this work a new dataset of coreference annotations for works of literature in English, covering 29,103 mentions in 210,532 tokens from 100 works of fiction.

Literary Event Detection

2 code implementations • ACL 2019 • Matthew Sims, Jong Ho Park, David Bamman

In this work we present a new dataset of literary events: events that are depicted as taking place within the imagined space of a novel.

Event Detection

An annotated dataset of literary entities

2 code implementations • NAACL 2019 • David Bamman, Sejal Popat, Sheng Shen

We present a new dataset comprised of 210,532 tokens evenly drawn from 100 different English-language literary texts annotated for ACE entity categories (person, location, geo-political entity, facility, organization, and vehicle).

Learning to Groove with Inverse Sequence Transformations

1 code implementation • 14 May 2019 • Jon Gillick, Adam Roberts, Jesse Engel, Douglas Eck, David Bamman

We explore models for translating abstract musical ideas (scores, rhythms) into expressive performances using Seq2Seq and recurrent Variational Information Bottleneck (VIB) models.

Generative Adversarial Network • Quantization

Telling Stories with Soundtracks: An Empirical Analysis of Music in Film

no code implementations • WS 2018 • Jon Gillick, David Bamman

Soundtracks play an important role in carrying the story of a film.

Image Captioning • Question Answering

Please Clap: Modeling Applause in Campaign Speeches

no code implementations • NAACL 2018 • Jon Gillick, David Bamman

This work examines the rhetorical techniques that speakers employ during political campaigns.

Adversarial Training for Relation Extraction

no code implementations • EMNLP 2017 • Yi Wu, David Bamman, Stuart Russell

Adversarial training is a means of regularizing classification algorithms by adding adversarial noise to the training data.

General Classification • Image Classification • +4

The Labeled Segmentation of Printed Books

no code implementations • EMNLP 2017 • Lara McConnaughey, Jennifer Dai, David Bamman

We introduce the task of book structure labeling: segmenting and assigning a fixed category (such as Table of Contents, Preface, Index) to the document structure of printed books.

Optical Character Recognition (OCR) • Segmentation

Beyond Canonical Texts: A Computational Analysis of Fanfiction

1 code implementation • EMNLP 2016 • Smitha Milli, David Bamman

Annotating Character Relationships in Literary Texts

no code implementations • 2 Dec 2015 • Philip Massey, Patrick Xia, David Bamman, Noah A. Smith

We present a dataset of manually annotated relationships between characters in literary texts, in order to support the training and evaluation of automatic methods for relation type prediction in this domain (Makazhanov et al., 2014; Kokkinakis, 2013) and the broader computational analysis of literary character (Elson et al., 2010; Bamman et al., 2014; Vala et al., 2015; Flekova and Gurevych, 2015).

Type prediction

Open Extraction of Fine-Grained Political Statements

no code implementations • EMNLP 2015 • David Bamman, Noah A. Smith

Open Information Extraction • Slot Filling • +1

CMU: Arc-Factored, Discriminative Semantic Dependency Parsing

no code implementations • SEMEVAL 2014 • Sam Thomson, Brendan O'Connor, Jeffrey Flanigan, David Bamman, Jesse Dodge, Swabha Swayamdipta, Nathan Schneider, Chris Dyer, Noah A. Smith

Dependency Parsing • Knowledge Base Population • +2

Distributed Representations of Geographically Situated Language

no code implementations • ACL 2014 • David Bamman, Chris Dyer, Noah A. Smith

Representation Learning • Semantic Textual Similarity

A Bayesian Mixed Effects Model of Literary Character

no code implementations • ACL 2014 • David Bamman, Ted Underwood, Noah A. Smith

Coreference Resolution

Unsupervised Discovery of Biographical Structure from Text

no code implementations • TACL 2014 • David Bamman, Noah A. Smith

We present a method for discovering abstract event classes in biographies, based on a probabilistic latent-variable model.

Learning Latent Personas of Film Characters

1 code implementation • ACL 2013 • David Bamman, Brendan O'Connor, Noah A. Smith

A framework for (under)specifying dependency syntax without overloading annotators

1 code implementation • WS 2013 • Nathan Schneider, Brendan O'Connor, Naomi Saphra, David Bamman, Manaal Faruqui, Noah A. Smith, Chris Dyer, Jason Baldridge

We introduce a framework for lightweight dependency syntax annotation.

New Alignment Methods for Discriminative Book Summarization

no code implementations • 6 May 2013 • David Bamman, Noah A. Smith

We consider the unsupervised alignment of the full text of a book with a human-written summary.

Book summarization

Gender identity and lexical variation in social media

1 code implementation • 16 Oct 2012 • David Bamman, Jacob Eisenstein, Tyler Schnoebelen

Examining individuals whose language does not match the classifier's model for their gender, we find that they have social networks that include significantly fewer same-gender social connections and that, in general, social network homophily is correlated with the use of same-gender language markers.

Clustering
