Universal Complex Structures in Written Language

Alvaro Corral (1), Ramon Ferrer-i-Cancho (2), Gemma Boleda (2), Albert Diaz-Guilera (3). ((1) Centre de Recerca Matematica, (2) U Politecnica Catalunya, (3) U Barcelona)
arXiv ID: 0901.2924
Last updated: 1/21/2009
Quantitative linguistics has provided us with a number of empirical laws that characterise the evolution of languages and competition amongst them. In terms of language usage, one of the most influential results is Zipf's law of word frequencies. Zipf's law appears to be universal, and may not even be unique to human language. However, there is ongoing controversy over whether Zipf's law is a good indicator of complexity. Here we present an alternative approach that puts Zipf's law in the context of critical phenomena (the cornerstone of complexity in physics) and establishes the presence of a large-scale "attraction" between successive repetitions of words. Moreover, this phenomenon is scale-invariant and universal -- the pattern is independent of word frequency and is observed in texts by different authors and written in different languages. There is evidence, however, that the shape of the scaling relation changes for words that play a key role in the text, implying the existence of different "universality classes" in the repetition of words. These behaviours exhibit striking parallels with complex catastrophic phenomena.
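
As a rough illustration of the two quantities the abstract refers to, the following Python sketch computes a rank-frequency table (the quantity Zipf's law describes) and, for each word, the distances between its successive repetitions divided by that word's mean distance. It is a minimal sketch and not the authors' procedure: the tokenizer, the choice to measure distance in number of words, the rescaling by the mean, and the names `tokenize`, `rank_frequency`, `rescaled_gaps`, and `min_count` are all assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and extract runs of letters as words (a simplifying assumption)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def rank_frequency(tokens):
    """Return (rank, word, count) triples, most frequent word first."""
    counts = Counter(tokens)
    return [
        (rank, word, count)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]


def rescaled_gaps(tokens, min_count=3):
    """Map each word to its inter-occurrence distances divided by their mean.

    Distances are counted in words (an assumption). Words occurring fewer
    than `min_count` times are skipped, since they yield too few gaps.
    """
    positions = defaultdict(list)
    for index, word in enumerate(tokens):
        positions[word].append(index)

    result = {}
    for word, pos in positions.items():
        if len(pos) < min_count:
            continue
        gaps = [b - a for a, b in zip(pos, pos[1:])]
        mean_gap = sum(gaps) / len(gaps)
        result[word] = [gap / mean_gap for gap in gaps]
    return result


if __name__ == "__main__":
    sample = "the cat and the dog and the bird"
    tokens = tokenize(sample)
    print(rank_frequency(tokens)[:2])
    print(rescaled_gaps(tokens))
```

Under these assumptions, comparing the distributions of rescaled distances for words of very different frequencies is one way to examine whether the repetition pattern is independent of word frequency, as the abstract states.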

