Semi-supervised morphosyntactic classification of Old Icelandic

PLoS One. 2014 Jul 16;9(7):e102366. doi: 10.1371/journal.pone.0102366. eCollection 2014.

Abstract

We present IceMorph, a semi-supervised morphosyntactic analyzer of Old Icelandic. In addition to machine-read corpora and dictionaries, it applies a small set of declension prototypes to map corpus words to dictionary entries. A web-based GUI allows expert users to modify and augment data through an online process. A machine learning module incorporates prototype data, edit-distance metrics, and expert feedback to continuously update part-of-speech and morphosyntactic classification. An advantage of the analyzer is its ability to achieve competitive classification accuracy with minimal training data.
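
The abstract gives no implementation details, so the following is only a minimal sketch of how a declension prototype and an edit-distance metric could be combined to map a corpus word to a dictionary entry. It is not IceMorph's actual method. The names `PROTOTYPE`, `inflect`, and `rank_candidates` are hypothetical, the prototype is a toy table of strong masculine a-stem endings, and sound changes such as u-umlaut are ignored.

```python
from typing import Dict, List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Standard edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# Assumption: a toy prototype mapping morphosyntactic tags to endings
# for a strong masculine a-stem noun such as "hestr" (horse).
PROTOTYPE: Dict[str, str] = {
    "nom.sg": "r", "acc.sg": "", "dat.sg": "i", "gen.sg": "s",
    "nom.pl": "ar", "acc.pl": "a", "dat.pl": "um", "gen.pl": "a",
}


def inflect(lemma: str, prototype: Dict[str, str]) -> Dict[str, str]:
    """Generate candidate forms by attaching prototype endings to the stem."""
    # Assumption: the stem is the lemma minus a final nominative "-r".
    stem = lemma[:-1] if lemma.endswith("r") else lemma
    return {tag: stem + ending for tag, ending in prototype.items()}


def rank_candidates(word: str, lemmas: List[str],
                    prototype: Dict[str, str]) -> List[Tuple[int, str, str]]:
    """Rank (distance, lemma, tag) triples by edit distance to the corpus word."""
    scored = [(levenshtein(word, form), lemma, tag)
              for lemma in lemmas
              for tag, form in inflect(lemma, prototype).items()]
    return sorted(scored)
```

With these toy inputs, `rank_candidates("hestum", ["hestr", "armr"], PROTOTYPE)` places `hestr` tagged `dat.pl` first, at distance 0. A system like the one described would presumably treat such distances as one feature among several, alongside expert feedback, rather than as a final decision.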

Publication types

  • Historical Article
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Artificial Intelligence
  • History, Medieval
  • Humans
  • Iceland
  • Internet*
  • Language / history*
  • Linguistics / classification*
  • Linguistics / methods*
  • Literature / history*
  • Software*

Grants and funding

Funding for this project was provided by the National Science Foundation (NSF) through grants #BCS-0921123 and #IIS-0122491/EU IST2001-32745, with additional support from UCLA's Center for Medieval and Renaissance Studies, the UCLA Council on Research, and the UCLA Office of the Vice Chancellor for Research. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.