CRIE: An automated analyzer for Chinese texts

Behav Res Methods. 2016 Dec;48(4):1238-1251. doi: 10.3758/s13428-015-0649-1.

Abstract

Textual analysis has been applied in various fields, such as discourse analysis, corpus studies, text leveling, and automated essay evaluation. Several tools have been developed for analyzing texts written in alphabetic languages such as English and Spanish. However, no such tool is currently available for analyzing Chinese-language texts. This article introduces the Chinese Readability Index Explorer (CRIE), a tool for the automated analysis of simplified and traditional Chinese texts. Composed of four subsystems and incorporating 82 multilevel linguistic features, CRIE performs the major tasks of segmentation, syntactic parsing, and feature extraction. Furthermore, the integration of linguistic features with machine learning models enables CRIE to provide leveling and diagnostic information for texts in language arts, texts for learning Chinese as a foreign language, and texts with domain knowledge. The usage and validation of the functions provided by CRIE are also described.

Keywords: CRIE; Chinese text analysis; Linguistic feature; Readability.

MeSH terms

  • Asian People*
  • Comprehension*
  • Humans
  • Linguistics*
  • Software*