On the nature of long-range letter correlations in texts

Dmitrii Y. Manin
Arxiv ID: 809.0103
Last updated: 11/27/2016
The origin of long-range letter correlations in natural texts is studied using random walk analysis and Jensen-Shannon divergence. It is concluded that they result from slow variations in the letter frequency distribution, which are a consequence of slow variations in lexical composition within the text. These correlations are preserved by random letter shuffling within a moving window. As such, they do reflect structural properties of the text, but in a very indirect manner.
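
To make the two ingredients named in the abstract concrete, here is a minimal Python sketch of window-local letter shuffling and of the Jensen-Shannon divergence between the letter distributions of two text segments. It is an illustration under stated assumptions, not the paper's procedure: the function names are hypothetical, the shuffle uses consecutive non-overlapping blocks as a simple stand-in for a moving window, and the divergence assumes equal weights and base-2 logarithms.

```python
import math
import random
from collections import Counter


def shuffle_within_windows(text, window, seed=0):
    """Shuffle letters inside consecutive blocks of `window` characters.

    Assumption: non-overlapping blocks approximate the moving window.
    """
    rng = random.Random(seed)
    out = []
    for start in range(0, len(text), window):
        block = list(text[start:start + window])
        rng.shuffle(block)  # destroys letter order inside the block only
        out.extend(block)
    return "".join(out)


def letter_distribution(segment):
    """Relative frequency of each character in a segment."""
    counts = Counter(segment)
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}


def entropy(dist):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def js_divergence(seg_a, seg_b):
    """Equal-weight Jensen-Shannon divergence between two segments, in bits."""
    p = letter_distribution(seg_a)
    q = letter_distribution(seg_b)
    mix = {ch: 0.5 * p.get(ch, 0.0) + 0.5 * q.get(ch, 0.0)
           for ch in set(p) | set(q)}
    return entropy(mix) - 0.5 * entropy(p) - 0.5 * entropy(q)
```

Because shuffling inside a block leaves that block's letter counts unchanged, any slow drift in letter frequencies from block to block survives the shuffle. This is consistent with the abstract's statement that the correlations are preserved under such shuffling.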
