On the nature of long-range letter correlations in texts
Dmitrii Y. Manin
arXiv ID: 0809.0103 • Last updated: 11/27/2016
The origin of long-range letter correlations in natural texts is studied using random walk analysis and Jensen-Shannon divergence. It is concluded that they result from slow variations in letter frequency distribution, which are a consequence of slow variations in lexical composition within the text. These correlations are preserved by random letter shuffling within a moving window. As such, they do reflect structural properties of the text, but in a very indirect manner.
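As a rough illustration of two tools named in the abstract, the sketch below computes the Jensen-Shannon divergence between the letter frequency distributions of two text segments and shuffles letters locally. It is a minimal sketch under stated assumptions, not the paper's procedure: the function names are hypothetical, the divergence is measured in bits, only alphabetic characters are counted, and the shuffle uses fixed non-overlapping blocks as a simplified stand-in for the moving window.

```python
import math
import random
from collections import Counter


def letter_distribution(text):
    """Relative frequency of each alphabetic character in `text` (case-folded)."""
    letters = [c for c in text.lower() if c.isalpha()]
    total = len(letters)
    counts = Counter(letters)
    return {c: n / total for c, n in counts.items()} if total else {}


def js_divergence(p, q):
    """Jensen-Shannon divergence, in bits, between two distributions given as dicts."""
    keys = set(p) | set(q)
    # Mixture distribution m = (p + q) / 2 over the union of symbols.
    m = {c: 0.5 * (p.get(c, 0.0) + q.get(c, 0.0)) for c in keys}

    def kl(a):
        # Kullback-Leibler divergence KL(a || m); zero-probability terms contribute 0.
        return sum(pa * math.log2(pa / m[c]) for c, pa in a.items() if pa > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def shuffle_within_windows(text, window, seed=0):
    """Shuffle characters inside consecutive non-overlapping blocks of length `window`.

    Assumption: fixed blocks approximate the moving-window shuffle; each block
    keeps its own letter counts, so slow frequency variations are preserved.
    """
    rng = random.Random(seed)
    blocks = []
    for start in range(0, len(text), window):
        block = list(text[start:start + window])
        rng.shuffle(block)
        blocks.append("".join(block))
    return "".join(blocks)
```

Under these assumptions, comparing `js_divergence` between distant segments of a text before and after `shuffle_within_windows` gives a quick check of the abstract's claim: local shuffling destroys word order but leaves each block's letter frequencies, and hence the divergence between blocks, unchanged.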