StimulStat: A lexical database for Russian

Behav Res Methods. 2018 Dec;50(6):2305-2315. doi: 10.3758/s13428-017-0994-3.

Abstract

In this article, we present StimulStat, a lexical database for the Russian language in the form of a web application. The database contains more than 52,000 of the most frequent Russian lemmas and more than 1.7 million word forms derived from them. These lemmas and forms are characterized according to more than 70 properties that have been shown to be relevant for psycholinguistic research, including frequency, length, phonological and grammatical properties, orthographic and phonological neighborhood frequency and size, grammatical ambiguity, homonymy, and polysemy. Some properties were retrieved from various dictionaries and are presented collectively in a searchable form for the first time; the others were computed specifically for the database. The database can be accessed freely at http://stimul.cognitivestudies.ru.
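As an illustration of one of the computed properties, orthographic neighborhood size is commonly defined as the number of words of the same length that differ from a target word by exactly one letter substitution. The abstract does not state which definition StimulStat uses, so the sketch below assumes this substitution-based definition; the function name `neighborhood_sizes` and the six-word toy lexicon are hypothetical and are not taken from the database.

```python
from collections import defaultdict


def neighborhood_sizes(lexicon):
    """Count substitution neighbors for every word in the lexicon.

    Assumption: a neighbor is a word of the same length that differs
    in exactly one letter position. StimulStat's own definition may differ.
    """
    words = set(lexicon)

    # Group words by (prefix, suffix) around each masked position.
    # Two words share a bucket only if they match everywhere except
    # at that one position, so they also have the same length.
    buckets = defaultdict(set)
    for word in words:
        for i in range(len(word)):
            buckets[(word[:i], word[i + 1:])].add(word)

    sizes = {}
    for word in words:
        neighbors = set()
        for i in range(len(word)):
            neighbors |= buckets[(word[:i], word[i + 1:])]
        neighbors.discard(word)  # a word is not its own neighbor
        sizes[word] = len(neighbors)
    return sizes


# Hypothetical toy lexicon, for illustration only.
toy_lexicon = ["дом", "том", "дым", "дол", "сон", "кот"]
sizes = neighborhood_sizes(toy_lexicon)
```

In this toy lexicon, "дом" has three neighbors ("том", "дым", "дол"), while "сон" and "кот" have none. Neighborhood frequency would additionally require per-word frequency counts, which are omitted here.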

Keywords: Frequency; Grammatical properties; Lexical database; Neighborhood; Russian.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Factual*
  • Humans
  • Language*
  • Psycholinguistics / standards*
  • Russia
  • Vocabulary*