Measuring and Modeling Language Change

NAACL 2019 · Jacob Eisenstein

This tutorial is designed to help researchers answer the following sorts of questions:

- Are people happier on the weekend?
- What was 1861's word of the year?
- Are Democrats and Republicans more different than ever?
- When did "gay" stop meaning "happy"?
- Are gender stereotypes getting weaker, stronger, or just different?
- Who is a linguistic leader?
- How can we get internet users to be more polite and objective?

Such questions are fundamental to the social sciences and humanities, and scholars in these disciplines are increasingly turning to computational techniques for answers. Meanwhile, the ACL community is increasingly engaged with data that varies across time, and with the social insights that can be offered by analyzing temporal patterns and trends. The purpose of this tutorial is to facilitate this convergence in two main ways:

1. By synthesizing recent computational techniques for handling and modeling temporal data, such as dynamic word embeddings, the tutorial will provide a starting point for future computational research. It will also identify useful tools for social scientists and digital humanities scholars.
2. The tutorial will provide an overview of techniques and datasets from the quantitative social sciences and the digital humanities, which are not well-known in the computational linguistics community. These techniques include vector autoregressive models, multiple comparisons corrections for hypothesis testing, and causal inference. Datasets include historical newspaper archives and corpora of contemporary political speech.
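As a rough illustration of one technique named above, multiple comparisons correction, the sketch below applies the Benjamini-Hochberg procedure to a list of p-values. The choice of Benjamini-Hochberg is an assumption made for illustration (the abstract does not say which correction the tutorial covers), and the function name and the p-values are hypothetical, not results from any corpus.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True where the null hypothesis
    is rejected while controlling the false discovery rate at alpha."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, e.g. one test per word for a change in
# frequency between two time periods (illustrative numbers only).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
rejected = benjamini_hochberg(p_values, alpha=0.05)
naive = [p <= 0.05 for p in p_values]
print(sum(naive), "significant without correction")
print(sum(rejected), "significant after correction")
```

With these made-up numbers, five tests fall below 0.05 on their own, but only two survive the correction, which is the kind of gap that matters when testing many words or time periods at once.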
