no code implementations • 22 Apr 2024 • Víctor Franco-Sánchez, Arnau Martí-Llobet, Ramon Ferrer-i-Cancho
We investigate whether the frequency of the $n!$ possible orders is constrained by two principles.
no code implementations • 15 Feb 2024 • Ramon Ferrer-i-Cancho
The principle of syntactic dependency distance minimization is in conflict with the principle of surprisal minimization (or predictability maximization) in single head syntactic dependency structures: while the former predicts that the head should be placed at the center of the linear arrangement, the latter predicts that the head should be placed at one of the ends (either first or last).
no code implementations • 7 Dec 2023 • Ramon Ferrer-i-Cancho, Savithry Namboodiripad
We test the prediction in three flexible order SOV languages: Korean (Koreanic), Malayalam (Dravidian), and Sinhalese (Indo-European).
no code implementations • 11 Oct 2023 • Stuart Semple, Ramon Ferrer-i-Cancho, Morgan L. Gustison
Linguistic laws, the common statistical patterns of human language, have been investigated by quantitative linguists for nearly a century.
no code implementations • 17 Mar 2023 • Sonia Petrini, Antoni Casas-i-Muñoz, Jordi Cluet-i-Martinell, Mengxue Wang, Chris Bentz, Ramon Ferrer-i-Cancho
Zipf's law of abbreviation, the tendency of more frequent words to be shorter, is one of the most solid candidates for a linguistic universal, in the sense that it has the potential for being exceptionless or with a number of exceptions that is vanishingly small compared to the number of languages on Earth.
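One common way to check for such a tendency is a rank correlation between word frequency and word length; a negative value is consistent with the law. The sketch below is a minimal illustration, not the procedure used in the paper: the function name `kendall_tau_a`, the choice of Kendall's tau-a (tied pairs count as neither concordant nor discordant), and the toy frequency and length values are all assumptions made for the example.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """Kendall's tau-a between two equally long sequences (ties add nothing)."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs

# Toy, made-up data: frequency counts and lengths (in characters) of four words.
frequencies = [100, 50, 10, 5]
lengths = [2, 3, 5, 7]
print(kendall_tau_a(frequencies, lengths))  # negative: frequent words are shorter here
```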
1 code implementation • 26 Nov 2022 • Sonia Petrini, Ramon Ferrer-i-Cancho
The syntactic structure of a sentence can be represented as a graph where vertices are words and edges indicate syntactic dependencies between them.
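As a rough illustration of this graph view, the sketch below stores a sentence as a list of head positions and sums the distances between linked words. The encoding (`heads[i]` is the 1-based position of the head of word `i + 1`, with `0` marking the root), the function name, and the example sentence are assumptions chosen for the illustration, not taken from the paper.

```python
def sum_dependency_distances(heads):
    """Sum of |position - head position| over all non-root words."""
    total = 0
    for position, head in enumerate(heads, start=1):
        if head != 0:  # the root has no incoming dependency
            total += abs(position - head)
    return total

# "She ate the red apple": "ate" is the root; "She" and "apple" depend on "ate";
# "the" and "red" depend on "apple".
heads = [2, 0, 5, 5, 2]
print(sum_dependency_distances(heads))  # 1 + 2 + 1 + 3 = 7
```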
2 code implementations • 22 Aug 2022 • Sonia Petrini, Antoni Casas-i-Muñoz, Jordi Cluet-i-Martinell, Mengxue Wang, Christian Bentz, Ramon Ferrer-i-Cancho
Zipf's law of abbreviation, namely the tendency of more frequent words to be shorter, has been viewed as a manifestation of compression, i.e. the minimization of the length of forms -- a universal principle of natural communication.
no code implementations • 12 Jul 2022 • Lluís Alemany-Puig, Ramon Ferrer-i-Cancho
In the domain of applications, we derive an $O(n)$-time algorithm to calculate the expected value of the sum of edge lengths.
no code implementations • 14 Jun 2022 • Lluís Alemany-Puig, Juan Luis Esteban, Ramon Ferrer-i-Cancho
In the projective variant for rooted trees, arrangements have to be planar and the root of the tree cannot be covered by any edge.
1 code implementation • 5 Dec 2021 • Lluís Alemany-Puig, Juan Luis Esteban, Ramon Ferrer-i-Cancho
One of the main concerns in this field is the statistical patterns of syntactic dependency structures.
no code implementations • Quasy (SyntaxFest) 2021 • Ramon Ferrer-i-Cancho, Carlos Gómez-Rodríguez
Dependency distance minimization (DDm) is a well-established principle of word order.
no code implementations • CL (ACL) 2022 • Lluís Alemany-Puig, Ramon Ferrer-i-Cancho
Thus far, the expectation of the sum of dependency distances in random projective shufflings of a sentence has been estimated approximately with a Monte Carlo procedure whose cost is of the order of $Rn$, where $n$ is the number of words of the sentence and $R$ is the number of samples; it is well known that the larger $R$, the lower the error of the estimation but the larger the time cost.
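The Monte Carlo baseline described here can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the tree is given as a hypothetical `children` dictionary, and a random projective arrangement is drawn by shuffling each head together with the subtrees of its dependents, treated as contiguous blocks. Every name in the block is introduced for the example.

```python
import random

def random_projective_arrangement(children, root, rng):
    """Left-to-right vertex order of one random projective arrangement."""
    def layout(v):
        # Shuffle the head and its dependents; each dependent stands for its subtree.
        blocks = [v] + list(children.get(v, []))
        rng.shuffle(blocks)
        order = []
        for b in blocks:
            if b == v:
                order.append(v)
            else:
                order.extend(layout(b))  # a subtree occupies a contiguous interval
        return order
    return layout(root)

def sum_edge_lengths(order, children):
    """Sum of position differences over all head-dependent edges."""
    pos = {v: i for i, v in enumerate(order)}
    return sum(abs(pos[h] - pos[d]) for h, deps in children.items() for d in deps)

def monte_carlo_expected_sum(children, root, R, seed=0):
    """Average the sum of edge lengths over R random projective arrangements."""
    rng = random.Random(seed)
    total = 0
    for _ in range(R):
        order = random_projective_arrangement(children, root, rng)
        total += sum_edge_lengths(order, children)
    return total / R

# Star tree with 4 vertices: every ordering is projective, so the estimate
# should approach the unconstrained value (n^2 - 1) / 3 = 5 as R grows.
star = {0: [1, 2, 3]}
print(monte_carlo_expected_sum(star, root=0, R=20000))
```

Raising `R` narrows the estimate around the true expectation at a proportional increase in running time, which is the trade-off the abstract points to.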
no code implementations • 24 May 2021 • David Carrera-Casado, Ramon Ferrer-i-Cancho
However, the information theoretic model employed in that research neither explains the weakening of that vocabulary learning bias in older children or polylinguals nor reproduces Zipf's meaning-frequency law, namely the non-linear relationship between the number of meanings of a word and its frequency.
no code implementations • 5 Feb 2021 • Lluís Alemany-Puig, Juan Luis Esteban, Ramon Ferrer-i-Cancho
Gildea and Temperley (GT) sketched an algorithm for projective arrangements which they claimed runs in $O(n)$ but did not provide any justification of its cost.
2 code implementations • 30 Jul 2020 • Ramon Ferrer-i-Cancho, Carlos Gómez-Rodríguez, Juan Luis Esteban, Lluís Alemany-Puig
Here we recast the problem of the optimality of the word order of a sentence as an optimization problem on a spatial network where the vertices are words, arcs indicate syntactic dependencies and the space is defined by the linear order of the words in the sentence.
no code implementations • 24 Jun 2020 • Ramon Ferrer-i-Cancho, Carlos Gómez-Rodríguez, Juan Luis Esteban
A fundamental problem in network science is the normalization of the topological or physical distance between vertices, which requires understanding the range of variation of the unnormalized distances.
no code implementations • 6 Mar 2020 • Lluís Alemany-Puig, Ramon Ferrer-i-Cancho
The crossing number of a graph $G$, $\mathrm{cr}(G)$, is the minimum number of edge crossings arising when drawing a graph on a certain surface.
Computation • Discrete Mathematics • Data Structures and Algorithms • Combinatorics
no code implementations • 19 Aug 2019 • Carlos Gómez-Rodríguez, Morten H. Christiansen, Ramon Ferrer-i-Cancho
The ability to produce and understand an unlimited number of different sentences is a hallmark of human language.
no code implementations • 13 Jun 2019 • Ramon Ferrer-i-Cancho, Carlos Gómez-Rodríguez
Dependency distance minimization (DDm) is a word order principle favouring the placement of syntactically related words close to each other in sentences.
no code implementations • 4 Jun 2019 • Ramon Ferrer-i-Cancho, Christian Bentz, Caio Seguin
Here we consider the problem of optimal coding -- under an arbitrary coding scheme -- and show that it predicts Zipf's law of abbreviation, namely a tendency in natural languages for more frequent words to be shorter.
no code implementations • 27 Mar 2019 • Bernardino Casas, Antoni Hernández-Fernández, Neus Català, Ramon Ferrer-i-Cancho, Jaume Baixeries
The pioneering research of G. K. Zipf on the relationship between word frequency and other word features led to the formulation of various linguistic laws.
no code implementations • 30 Dec 2017 • Ramon Ferrer-i-Cancho, Michael S. Vitevitch
In his pioneering research, G. K. Zipf observed that more frequent words tend to have more meanings, and showed that the number of meanings of a word grows as the square root of its frequency.
no code implementations • 24 Aug 2017 • Xinying Chen, Carlos Gómez-Rodríguez, Ramon Ferrer-i-Cancho
A comment on "Neurophysiological dynamics of phrase-structure building during sentence processing" by Nelson et al. (2017), Proceedings of the National Academy of Sciences USA 114(18), E3669-E3678.
no code implementations • 26 Jul 2017 • Antoni Lozano, Bernardino Casas, Chris Bentz, Ramon Ferrer-i-Cancho
Entropy is a fundamental property of a repertoire.
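For concreteness, the entropy of a repertoire can be estimated from observed counts with the plug-in (maximum likelihood) estimator, sketched below. The choice of estimator, the function name `plugin_entropy`, and the toy sequence are assumptions for illustration; the plug-in estimator tends to underestimate entropy on small samples, so it should be read as a baseline rather than as the method of the paper.

```python
import math
from collections import Counter

def plugin_entropy(tokens):
    """Shannon entropy in bits from relative frequencies of the observed types."""
    counts = Counter(tokens)
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Toy repertoire with two equally frequent types: 1 bit.
print(plugin_entropy(["a", "b", "a", "b"]))
```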
no code implementations • 15 Jun 2017 • Ramon Ferrer-i-Cancho
Comment on "Dependency distance: a new perspective on syntactic patterns in natural language" by Haitao Liu et al.
no code implementations • 28 May 2017 • Ramon Ferrer-i-Cancho
The minimization of the length of syntactic dependencies is a well-established principle of word order and the basis of a mathematical theory of word order.
no code implementations • 24 Mar 2017 • Ramon Ferrer-i-Cancho, Carlos Gomez-Rodriguez, J. L. Esteban
It has been claimed recurrently that the number of edge crossings in real sentences is small.
no code implementations • 27 Nov 2016 • Bernardino Casas, Neus Català, Ramon Ferrer-i-Cancho, Antoni Hernández-Fernández, Jaume Baixeries
However, such preference could be a side-effect of another bias: the preference of children for nouns in combination with the lower polysemy of nouns with respect to other part-of-speech categories.
no code implementations • 18 Oct 2016 • Antoni Hernández-Fernández, Ramon Ferrer-i-Cancho
Surprisingly, a double Zipf (a Zipf distribution with two regimes with a different exponent each) is the model yielding the best fit although it is the function with the largest number of parameters.
no code implementations • 4 May 2016 • Ramon Ferrer-i-Cancho
Here we sketch a new derivation of Zipf's law for word frequencies based on optimal coding.
no code implementations • 13 Jan 2016 • Carlos Gómez-Rodríguez, Ramon Ferrer-i-Cancho
The structure of a sentence can be represented as a network where vertices are words and edges indicate syntactic dependencies.
no code implementations • 17 Dec 2015 • Ramon Ferrer-i-Cancho
Word order evolution has been hypothesized to be constrained by a word order permutation ring: transitions involving orders that are closer in the permutation ring are more likely.
no code implementations • 9 Sep 2015 • Ramon Ferrer-i-Cancho, Carlos Gómez-Rodríguez
A commentary on the article "Large-scale evidence of dependency length minimization in 37 languages" by Futrell, Mahowald & Gibson (PNAS 2015 112 (33) 10336-10341).
no code implementations • 5 Sep 2015 • Ramon Ferrer-i-Cancho
In a recent article, Christiansen and Chater (2016) present a fundamental constraint on language, i.e. a now-or-never bottleneck that arises from our fleeting memory, and explore its implications, e.g., chunk-and-pass processing, outlining a framework that promises to unify different areas of research.
no code implementations • 26 Aug 2015 • Ramon Ferrer-i-Cancho, Carlos Gómez-Rodríguez
The syntactic structure of sentences exhibits a striking regularity: dependencies tend to not cross when drawn above the sentence.
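To make the notion of crossing concrete: when dependencies are drawn as arcs above the sentence, two arcs cross when exactly one endpoint of one lies strictly between the endpoints of the other. The sketch below counts such pairs for a given word order. The function name and the edge encoding (pairs of word positions) are assumptions for illustration.

```python
from itertools import combinations

def count_crossings(edges):
    """Number of pairs of arcs that cross when drawn above the sentence."""
    spans = [tuple(sorted(e)) for e in edges]
    crossings = 0
    for (a, b), (c, d) in combinations(spans, 2):
        # The arcs interleave: one starts inside the other and ends outside it.
        if a < c < b < d or c < a < d < b:
            crossings += 1
    return crossings

print(count_crossings([(1, 3), (2, 4)]))  # interleaved arcs: 1 crossing
print(count_crossings([(1, 4), (2, 3)]))  # nested arcs: 0 crossings
```

Arcs that share a word never cross under this definition, since the comparison uses strict inequalities.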
no code implementations • 22 Dec 2014 • Ramon Ferrer-i-Cancho
Here we respond to some comments by Alday concerning headedness in linguistic theory and the validity of the assumptions of a mathematical model for word order.
no code implementations • 8 Dec 2014 • Ramon Ferrer-i-Cancho
A family of information theoretic models of communication was introduced more than a decade ago to explain the origins of Zipf's law for word frequencies.
no code implementations • 10 Nov 2014 • Ramon Ferrer-i-Cancho
The use of null hypotheses (in a statistical sense) is common in hard sciences but not in theoretical linguistics.
no code implementations • 20 Oct 2014 • Ramon Ferrer-i-Cancho
That null hypothesis takes into account the length of the pair of edges that may cross and predicts the relative number of crossings in random trees with a small error, suggesting that a ban of crossings or a principle of minimization of crossings are not needed in general to explain the origins of non-crossing dependencies.
no code implementations • 25 Sep 2014 • Ramon Ferrer-i-Cancho
According to Zipf's meaning-frequency law, words that are more frequent tend to have more meanings.
no code implementations • 8 Aug 2014 • Ramon Ferrer-i-Cancho
Comment on "Approaching human language with complex networks" by Cong & Liu
no code implementations • 31 Jul 2014 • Alvaro Corral, Gemma Boleda, Ramon Ferrer-i-Cancho
In all cases Zipf's law is fulfilled, in the sense that a power-law distribution of word or lemma frequencies is valid for several orders of magnitude.
no code implementations • 22 Oct 2013 • Ramon Ferrer-i-Cancho
First, it is shown that this bias is a particular case of the maximization of mutual information between words and meanings.
no code implementations • 16 Sep 2013 • Ramon Ferrer-i-Cancho
Little is known about why SOV order is initially preferred and then discarded or recovered.
no code implementations • 8 Sep 2013 • Ramon Ferrer-i-Cancho
However, how that length is translated into cognitive cost is not known.
no code implementations • 20 May 2013 • Ramon Ferrer-i-Cancho
We show that this number depends only on the number of vertices of the dependency tree (the sentence length) and the second moment about zero of vertex degrees.
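The snippet does not say which quantity "this number" is. Assuming it refers to the expected number of crossings $E[C]$ when the words of the sentence are arranged uniformly at random, a short derivation consistent with the statement runs as follows. The symbols $k_i$ (degree of vertex $i$), $\langle k^2 \rangle$ (second moment of degrees) and $Q$ (set of edge pairs sharing no vertex) are notation introduced here for the sketch.

```latex
% A tree with n vertices has n - 1 edges, and only edge pairs that share
% no vertex can cross. Using \sum_i k_i = 2(n-1) and
% \langle k^2 \rangle = \frac{1}{n} \sum_i k_i^2 :
|Q| = \binom{n-1}{2} - \sum_{i=1}^{n} \binom{k_i}{2}
    = \frac{n}{2} \left( n - 1 - \langle k^2 \rangle \right)

% The four endpoints of such a pair can be joined into two arcs in three
% equally likely ways, exactly one of which crosses, so each pair crosses
% with probability 1/3:
E[C] = \frac{|Q|}{3} = \frac{n}{6} \left( n - 1 - \langle k^2 \rangle \right)
```

Under this reading, a star tree, where $\langle k^2 \rangle = n - 1$, gives $E[C] = 0$, as expected, since all of its edges share the hub.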
no code implementations • 27 Apr 2013 • Ramon Ferrer-i-Cancho, Łukasz Dębowski, Fermín Moscoso del Prado Martín
We show that constant entropy rate (CER) and two interpretations for uniform information density (UID), full UID and strong UID, are inconsistent with these laws.
no code implementations • 15 Apr 2013 • Ramon Ferrer-i-Cancho
Hubiness (the variance of degrees) plays a central role: the mean dependency length is bounded below by hubiness while the number of crossings is bounded above by hubiness.
no code implementations • 13 Apr 2013 • Ramon Ferrer-i-Cancho, Haitao Liu
However, the empirical distribution of dependency lengths of sentences of the same length differs from that of sentences of varying length and the distribution of dependency lengths depends on sentence length for real sentences and also under the null hypothesis that dependencies connect vertices located in random positions of the sequence.