no code implementations • 18 Nov 2023 • Haoran Zhao, Jake Ryland Williams
As Large Language Models (LLMs) become ever more dominant, classic pre-trained word embeddings sustain their relevance through computational efficiency and nuanced linguistic interpretation.
no code implementations • 13 Nov 2023 • Jake Ryland Williams, Haoran Zhao
We will discuss a general result about feed-forward neural networks and then extend this solution to compositional (multi-layer) networks, which are applied to a simplified transformer block containing feed-forward and self-attention layers.
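A minimal sketch of the kind of block described, assuming single-head self-attention, a ReLU feed-forward sublayer, and no normalization or residual connections; the dimensions and parameterization are illustrative only, not the authors' exact construction:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def transformer_block(X, Wq, Wk, Wv, W1, W2):
    """One simplified block: single-head self-attention, then feed-forward.
    X: (seq_len, d) token representations."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))   # attention weights
    H = A @ V                                      # self-attention output
    return np.maximum(H @ W1, 0.0) @ W2            # ReLU feed-forward

rng = np.random.default_rng(0)
d, seq = 8, 5
X = rng.normal(size=(seq, d))
params = [rng.normal(size=(d, d)) * 0.1 for _ in range(5)]
Y = transformer_block(X, *params)
print(Y.shape)  # (5, 8)
```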
no code implementations • 13 Nov 2023 • Jake Ryland Williams, Haoran Zhao
Iterative differential approximation methods that rely upon backpropagation have enabled the optimization of neural networks; however, at present, they remain computationally expensive, especially when training models at scale.
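The contrast the abstract alludes to can be illustrated on the simplest possible case: fitting a single linear layer by many gradient steps versus a one-shot least-squares solve. The closed-form route below only illustrates the general idea of non-iterative optimization, not this paper's specific method:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
W_true = rng.normal(size=(10, 3))
Y = X @ W_true

# Iterative route: many backprop-style gradient steps.
W = np.zeros((10, 3))
for _ in range(500):
    grad = X.T @ (X @ W - Y) / len(X)   # gradient of mean squared error
    W -= 0.1 * grad

# Non-iterative route: a single least-squares solve.
W_direct = np.linalg.lstsq(X, Y, rcond=None)[0]

print(np.allclose(W, W_direct, atol=1e-3))  # True: both recover W_true
```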
no code implementations • 9 May 2022 • Hunter Scott Heidenreich, Jake Ryland Williams
In this work, we present a naive initialization scheme for word vectors based on a dense, independent co-occurrence model and provide preliminary results that suggest it is competitive and warrants further investigation.
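A minimal sketch of initializing word vectors from co-occurrence statistics, assuming a symmetric context window and a truncated-SVD factorization of log-scaled counts; the paper's dense, independent co-occurrence model may differ in both counting and factorization:

```python
import numpy as np

def cooccurrence_init(tokens, dim=16, window=2):
    """Initialize word vectors by factoring a co-occurrence matrix."""
    vocab = {w: i for i, w in enumerate(dict.fromkeys(tokens))}
    C = np.zeros((len(vocab), len(vocab)))
    for i, w in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                C[vocab[w], vocab[tokens[j]]] += 1.0
    # Truncated SVD of the (log-scaled) counts yields dense initial vectors.
    U, S, _ = np.linalg.svd(np.log1p(C))
    k = min(dim, len(S))
    return vocab, U[:, :k] * S[:k]

text = "the cat sat on the mat and the dog sat on the rug".split()
vocab, vecs = cooccurrence_init(text, dim=4)
print(vecs[vocab["cat"]])
```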
no code implementations • 30 Apr 2022 • Jake Ryland Williams, Hunter Scott Heidenreich
However, we use the solution to demonstrate the seemingly universal existence of a property that word vectors exhibit and which allows for the prophylactic discernment of biases in data, prior to their absorption by DL models.
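The paper's property is not reproduced here, but the kind of pre-training bias audit it enables can be illustrated with a standard WEAT-style association test over word vectors; all word lists and embeddings below are toy assumptions, not the authors' technique:

```python
import numpy as np

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def association_gap(target, group_a, group_b, vectors):
    """Mean cosine similarity of a target word to group A minus group B."""
    t = vectors[target]
    a = np.mean([cosine(t, vectors[w]) for w in group_a])
    b = np.mean([cosine(t, vectors[w]) for w in group_b])
    return a - b  # far from zero suggests a learned association

rng = np.random.default_rng(2)
words = ["doctor", "he", "him", "she", "her"]
vectors = {w: rng.normal(size=8) for w in words}  # toy embeddings
print(association_gap("doctor", ["he", "him"], ["she", "her"], vectors))
```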
no code implementations • 6 Aug 2020 • Jake Ryland Williams, Diana Solano-Oropeza, Jacob R. Hunsberger
We provide a general analytic solution to Herbert Simon's 1955 model for time-evolving novelty functions.
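Simon's 1955 process can be stated compactly: at each step a new type is introduced with probability alpha, and otherwise an existing type recurs with probability proportional to its current frequency. A minimal simulation, with alpha held constant here rather than given by one of the paper's time-evolving novelty functions:

```python
import random
from collections import Counter

def simon_process(steps, alpha, seed=0):
    """Simulate Simon's (1955) rich-get-richer text model."""
    rng = random.Random(seed)
    sequence = [0]                      # start with a single type
    for _ in range(steps):
        if rng.random() < alpha:        # innovation: introduce a new type
            sequence.append(max(sequence) + 1)
        else:                           # repetition: a uniform draw from the
            sequence.append(rng.choice(sequence))  # past is frequency-proportional
    return Counter(sequence)

counts = simon_process(10_000, alpha=0.1)
print(len(counts), counts.most_common(3))
```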
no code implementations • 20 Oct 2017 • Jason Anastasopoulos, Jake Ryland Williams
We demonstrate how these methods can be used diagnostically, by researchers, government officials, and the public, to understand peaceful and violent collective action at very fine-grained levels of time and geography.
1 code implementation • 20 Oct 2017 • Jake Ryland Williams, Giovanni C. Santia
We offer a resolution to these issues by exhibiting how the dark matter of word segmentation, i.e., space, punctuation, etc., connects the Zipf-Mandelbrot law to Simon's mechanistic process.
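For reference, the Zipf-Mandelbrot law generalizes Zipf's pure power law with a rank shift q, which is one reading of how the "dark matter" of segmentation enters; a compact statement:

```latex
% Zipf's law (pure power law in rank r):   f(r) \propto r^{-\theta}
% Zipf-Mandelbrot law (rank-shifted):
\[
  f(r) \propto \frac{1}{(r + q)^{\theta}},
\]
% where f(r) is the frequency of the rank-r type, q > 0 flattens the
% low-rank end, and \theta is the scaling exponent.
```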
1 code implementation • 5 Aug 2016 • Jake Ryland Williams
This work presents a fine-grained, text-chunking algorithm designed for the task of multiword expression (MWE) segmentation.
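A minimal sketch of greedy, frequency-driven chunking for MWE segmentation; the paper's actual algorithm and scoring are more refined than this bigram-merging illustration:

```python
from collections import Counter

def chunk_mwes(tokens, min_count=2, passes=3):
    """Greedily merge the most frequent adjacent pair into one chunk."""
    for _ in range(passes):
        pairs = Counter(zip(tokens, tokens[1:]))
        (a, b), n = pairs.most_common(1)[0]
        if n < min_count:
            break
        merged, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == (a, b):
                merged.append(a + " " + b)   # treat the pair as one MWE chunk
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens

text = "new york is big and new york is busy".split()
print(chunk_mwes(text, passes=1))
# ['new york', 'is', 'big', 'and', 'new york', 'is', 'busy']
```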
no code implementations • 29 Jan 2016 • Jake Ryland Williams, James P. Bagrow, Andrew J. Reagan, Sharon E. Alajajian, Christopher M. Danforth, Peter Sheridan Dodds
The task of text segmentation may be undertaken at many levels in text analysis: paragraphs, sentences, words, or even letters.
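A trivial illustration of the levels in question, using naive delimiters (real sentence segmentation is of course harder than splitting on periods):

```python
def segment(text):
    paragraphs = text.split("\n\n")
    sentences = [s.strip() for p in paragraphs
                 for s in p.split(".") if s.strip()]
    words = [w for s in sentences for w in s.split()]
    letters = [c for w in words for c in w]
    return paragraphs, sentences, words, letters

p, s, w, c = segment("Text has levels. Many of them.\n\nA second paragraph.")
print(len(p), len(s), len(w), len(c))  # 2 3 9 39
```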
2 code implementations • 2 Dec 2015 • Andrew J. Reagan, Brian Tivnan, Jake Ryland Williams, Christopher M. Danforth, Peter Sheridan Dodds
The emergence and global adoption of social media has rendered possible the real-time estimation of population-scale sentiment, bearing profound implications for our understanding of human behavior.
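Population-scale sentiment instruments of this kind typically average per-word happiness ratings over a text stream; a minimal sketch assuming a labMT-style scored lexicon (the scores below are made up):

```python
def happiness(text, lexicon):
    """Average word-happiness score, ignoring unscored words."""
    scores = [lexicon[w] for w in text.lower().split() if w in lexicon]
    return sum(scores) / len(scores) if scores else None

lexicon = {"love": 8.4, "happy": 8.3, "the": 4.98, "hate": 2.3}  # toy ratings
print(happiness("I love the happy crowd", lexicon))  # ~7.23
```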
no code implementations • 17 May 2015 • Eric M. Clark, Jake Ryland Williams, Chris A. Jones, Richard A. Galbraith, Christopher M. Danforth, Peter Sheridan Dodds
Twitter, a popular social media outlet, has evolved into a vast source of linguistic data, rich with opinion, sentiment, and discussion.
no code implementations • 7 Mar 2015 • Jake Ryland Williams, Eric M. Clark, James P. Bagrow, Christopher M. Danforth, Peter Sheridan Dodds
With our predictions we then engage the editorial community of Wiktionary and propose short lists of potential missing entries for definition, developing a breakthrough lexical-extraction technique and expanding our knowledge of the defined English lexicon of phrases.
no code implementations • 12 Sep 2014 • Jake Ryland Williams, James P. Bagrow, Christopher M. Danforth, Peter Sheridan Dodds
Natural languages are full of rules and exceptions.
no code implementations • 19 Jun 2014 • Jake Ryland Williams, Paul R. Lessard, Suma Desu, Eric Clark, James P. Bagrow, Christopher M. Danforth, Peter Sheridan Dodds
Though Zipf's law was originally and most famously observed for word frequency, it is surprisingly limited in its applicability to human language, holding over no more than three to four orders of magnitude before hitting a clear break in scaling.
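The break in scaling can be checked directly by fitting the log-log rank-frequency relation separately over successive decades of rank; a rough sketch, assuming a token list as input:

```python
import numpy as np
from collections import Counter

def decade_exponents(tokens):
    """Fit Zipf exponents over each decade of rank."""
    freqs = np.array(sorted(Counter(tokens).values(), reverse=True), float)
    ranks = np.arange(1, len(freqs) + 1)
    out, lo = [], 1
    while lo < len(freqs):
        hi = min(lo * 10, len(freqs))
        r, f = np.log(ranks[lo - 1:hi]), np.log(freqs[lo - 1:hi])
        if len(r) > 1:
            out.append(-np.polyfit(r, f, 1)[0])  # slope magnitude = exponent
        lo = hi
    return out  # a break in scaling shows up as drifting exponents

# Synthetic Zipf-like corpus: word i appears ~200/i times.
tokens = [f"w{i}" for i in range(1, 200) for _ in range(200 // i)]
print(decade_exponents(tokens))
```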
no code implementations • 15 Jun 2014 • Peter Sheridan Dodds, Eric M. Clark, Suma Desu, Morgan R. Frank, Andrew J. Reagan, Jake Ryland Williams, Lewis Mitchell, Kameron Decker Harris, Isabel M. Kloumann, James P. Bagrow, Karine Megerdoomian, Matthew T. McMahon, Brian F. Tivnan, Christopher M. Danforth
Using human evaluation of 100,000 words spread across 24 corpora in 10 languages diverse in origin and culture, we present evidence of a deep imprint of human sociality in language, observing that (1) the words of natural human language possess a universal positivity bias; (2) the estimated emotional content of words is consistent between languages under translation; and (3) this positivity bias is strongly independent of frequency of word usage.
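Finding (1) amounts to the distribution of human happiness ratings sitting above the scale's neutral midpoint; assuming a 1-9 rating scale with midpoint 5, as in lexicons of this kind, the check is one comparison (toy scores again):

```python
import numpy as np

ratings = {"love": 8.4, "happy": 8.3, "the": 4.98, "war": 1.8, "food": 7.4}
scores = np.array(list(ratings.values()))
midpoint = 5.0                          # neutral point on a 1-9 scale
print(scores.mean() > midpoint, scores.mean())  # positivity bias if True
```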