Using Data Compression to Build a Method for Statistically Verified Attribution of Literary Texts

Entropy (Basel). 2021 Oct 3;23(10):1302. doi: 10.3390/e23101302.

Abstract

We consider the problem of determining the authorship of literary texts within the framework of the quantitative study of literature. This article proposes a methodology for authorship attribution of literary texts based on data compressors. Unlike other methods, the proposed one yields statistically verified results. We apply the method to two attribution problems in Russian literature.

Keywords: authorship attribution of literary texts; data compression; hypothesis testing; quantitative study of literature.
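
The abstract does not spell out how a data compressor is turned into an attribution signal, so the following is only a minimal sketch of one common compression-based idea, not the authors' procedure. It assumes that a disputed sample is closer to the author whose reference text lets a compressor encode the sample in the fewest additional bytes. The use of `zlib` and the function names `compressed_size`, `extra_bytes`, and `closest_author` are illustrative assumptions. The hypothesis-testing step that gives the method its statistical verification is not shown here.

```python
import zlib
from typing import Mapping


def compressed_size(data: bytes) -> int:
    """Return the length in bytes of the zlib-compressed form of data."""
    return len(zlib.compress(data, 9))


def extra_bytes(reference: str, sample: str) -> int:
    """Extra compressed bytes needed when sample is appended to reference.

    Assumption: a smaller value means the compressor found more of the
    sample's patterns already present in the reference text.
    """
    ref = reference.encode("utf-8")
    both = (reference + sample).encode("utf-8")
    return compressed_size(both) - compressed_size(ref)


def closest_author(references: Mapping[str, str], sample: str) -> str:
    """Return the candidate whose reference text gives the smallest cost.

    references maps a (hypothetical) author label to a reference text.
    """
    return min(references, key=lambda name: extra_bytes(references[name], sample))
```

Note that `zlib` only looks back over a 32 KiB window, so for long reference texts a compressor with a larger context would likely be a better fit for this kind of comparison.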