While recent studies have focused on quantifying word usage to find the overall shapes of narrative emotional arcs, certain features of narratives within narratives remain to be explored.
We explore the relationship between context and happiness scores in political tweets using word co-occurrence networks, where nodes in the network are the words, and the weight of an edge is the number of tweets in the corpus for which the two connected words co-occur.
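As a minimal sketch of the network construction described above (not the authors' implementation), the edge weights can be computed by counting, for each unordered pair of words, the number of tweets in which both appear. The function name `cooccurrence_edges`, the whitespace tokenization, and the sample tweets are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(tweets):
    """Map each unordered word pair to the number of tweets containing both words."""
    edges = Counter()
    for tweet in tweets:
        # Assumption: naive lowercase whitespace tokenization.
        # A set ensures a word repeated within one tweet is counted once.
        words = sorted(set(tweet.lower().split()))
        # Sorting gives each pair a canonical (alphabetical) order.
        for pair in combinations(words, 2):
            edges[pair] += 1
    return edges


# Hypothetical sample corpus, for illustration only.
tweets = [
    "vote early vote often",
    "early vote today",
    "rally today",
]
edges = cooccurrence_edges(tweets)
print(edges[("early", "vote")])  # number of tweets containing both words
```

Each key is a node pair and each value is the corresponding edge weight, so the result can be loaded directly into a graph library if one is needed.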
Sentiment-aware intelligent systems are essential to a wide array of applications.
Data scientists across disciplines are increasingly in need of exploratory analysis tools for data sets with a high volume of features of mixed data type (quantitative continuous and discrete categorical).
Mental health challenges are thought to afflict around 10% of the global population each year, with many going untreated due to stigma and limited access to services.
Evolving out of a gender-neutral framing of an involuntary celibate identity, the concept of 'incels' has come to refer to an online community of men who bear antipathy towards themselves, women, and society at large for their perceived inability to find and maintain sexual relationships.
Medical systems in general, and patient treatment decisions and outcomes in particular, are affected by bias based on gender and other demographic elements.
A common task in computational text analyses is to quantify how two corpora differ according to a measurement like word frequency, sentiment, or information content.
In real time, social media data strongly imprints world events, popular culture, and day-to-day conversations by millions of ordinary people at a scale that is scarcely conventionalized and recorded.
However, the extent to which mortality in a geographical region is a function of socioeconomic factors in both that region and its neighbors is unclear.
Physics and Society • Social and Information Networks • Applications
We find that for the most common languages on Twitter there is a growing tendency, though not universal, to retweet rather than share new content.
Stretched words like 'heellllp' or 'heyyyyy' are a regular feature of spoken language, often used to emphasize or exaggerate the underlying meaning of the root word.
We introduce the Discrete Shocklet Transform (DST), a qualitative, shape-based, timescale-independent time-domain transform used to extract local dynamics from sociotechnical time series, and an associated similarity search routine, the Shocklet Transform And Ranking (STAR) algorithm, that indicates time windows during which panels of time series display qualitatively similar anomalous behavior.
Physics and Society • Data Structures and Algorithms • Signal Processing • Data Analysis, Statistics and Probability
Using the most comprehensive commercially available dataset of trading activity in U.S. equity markets, we catalog and analyze quote dislocations between the SIP National Best Bid and Offer (NBBO) and a synthetic BBO constructed from direct feeds.
13 Feb 2019 • Brian F. Tivnan, David Rushing Dewhurst, Colin M. Van Oort, John H. Ring IV, Tyler J. Gray, Brendan F. Tivnan, Matthew T. K. Koehler, Matthew T. McMahon, David Slater, Jason Veneman, Christopher M. Danforth
Using the most comprehensive source of commercially available data on the US National Market System, we analyze all quotes and trades associated with Dow 30 stocks in 2016 from the vantage point of a single and fixed frame of reference.
With more people living in cities, we are witnessing a decline in exposure to nature.
Conclusions: Social media can provide a positive outlet for patients to discuss their needs and concerns regarding their healthcare coverage and treatment needs.
We find that the extent of verb regularization is greater on Twitter, taken as a whole, than in English Fiction books.
Twitter data and details of depression history were collected from 204 individuals (105 depressed, 99 healthy).
Physics and Society • Social and Information Networks
Statistical features were computationally extracted from 43,950 participant Instagram photos, using color analysis, metadata components, and algorithmic face detection.
Social and Information Networks • Physics and Society
Advances in computing power, natural language processing, and digitization of text now make it possible to study a culture's evolution through its texts using a "big data" lens.
Since the shooting of Black teenager Michael Brown by White police officer Darren Wilson in Ferguson, Missouri, the protest hashtag #BlackLivesMatter has amplified critiques of extrajudicial killings of Black Americans.
Identifying and communicating relationships between causes and effects is important for understanding our world, but is affected by language structure, cognitive and emotional biases, and the properties of the communication medium.
The task of text segmentation may be undertaken at many levels in text analysis: paragraphs, sentences, words, or even letters.
The emergence and global adoption of social media has rendered possible the real-time estimation of population-scale sentiment, bearing profound implications for our understanding of human behavior.
8 Sep 2015 • Nicholas Allgaier, Tobias Banaschewski, Gareth Barker, Arun L. W. Bokde, Josh C. Bongard, Uli Bromberg, Christian Büchel, Anna Cattrell, Patricia J. Conrod, Christopher M. Danforth, Sylvane Desrivières, Peter S. Dodds, Herta Flor, Vincent Frouin, Jürgen Gallinat, Penny Gowland, Andreas Heinz, Bernd Ittermann, Scott Mackey, Jean-Luc Martinot, Kevin Murphy, Frauke Nees, Dimitri Papadopoulos-Orfanos, Luise Poustka, Michael N. Smolka, Henrik Walter, Robert Whelan, Gunter Schumann, Hugh Garavan, IMAGEN Consortium
In the present study, we introduce just such a method, called nonlinear functional mapping (NFM), and demonstrate its application in the analysis of resting state fMRI from a 242-subject subset of the IMAGEN project, a European study of adolescents that includes longitudinal phenotypic, behavioral, genetic, and neuroimaging data.
Twitter, a popular social media outlet, has evolved into a vast source of linguistic data, rich with opinion, sentiment, and discussion.
Of basic interest is the quantification of the long term growth of a language's lexicon as it develops to more completely cover both a culture's communication requirements and knowledge space.
With our predictions, we then engage the editorial community of the Wiktionary and propose short lists of potential missing entries for definition, developing a breakthrough lexical extraction technique and expanding our knowledge of the defined English lexicon of phrases.
However, the Google Books corpus suffers from a number of limitations which make it an obscure mask of cultural popularity.
Natural languages are full of rules and exceptions.
With Zipf's law being originally and most famously observed for word frequency, it is surprisingly limited in its applicability to human language, holding over no more than three to four orders of magnitude before hitting a clear break in scaling.
15 Jun 2014 • Peter Sheridan Dodds, Eric M. Clark, Suma Desu, Morgan R. Frank, Andrew J. Reagan, Jake Ryland Williams, Lewis Mitchell, Kameron Decker Harris, Isabel M. Kloumann, James P. Bagrow, Karine Megerdoomian, Matthew T. McMahon, Brian F. Tivnan, Christopher M. Danforth
Using human evaluation of 100,000 words spread across 24 corpora in 10 languages diverse in origin and culture, we present evidence of a deep imprint of human sociality in language, observing that (1) the words of natural human language possess a universal positivity bias; (2) the estimated emotional content of words is consistent between languages under translation; and (3) this positivity bias is strongly independent of frequency of word usage.