The use of residual analysis to improve the error rate accuracy of machine translation

Ľubomír Benko; Dasa Munkova; Michal Munk; Lucia Benkova; Petr Hajek

doi:10.1038/s41598-024-59524-3

Sci Rep. 2024 Apr 23;14(1):9293. doi: 10.1038/s41598-024-59524-3.

Authors

Ľubomír Benko¹, Dasa Munkova², Michal Munk²,³, Lucia Benkova², Petr Hajek³

Affiliations

¹ Constantine the Philosopher University in Nitra, Tr. A. Hlinku 1, 949 01, Nitra, Slovakia. lbenko@ukf.sk.
² Constantine the Philosopher University in Nitra, Tr. A. Hlinku 1, 949 01, Nitra, Slovakia.
³ Science and Research Centre, University of Pardubice, Studentská 84, 532 10, Pardubice, Czech Republic.

Abstract

The aim of the study is to compare two different approaches to machine translation (MT), statistical and neural, using automatic MT metrics of error rate and residuals. We examined four available online MT systems (statistical Google Translate, neural Google Translate, and two European Commission MT tools: statistical mt@ec and neural eTranslation) through their products (MT outputs). We propose using residual analysis to improve the accuracy of the machine translation error rate. Residuals represent a new approach to comparing the quality of statistical and neural MT outputs. The study provides new insights into evaluating machine translation quality from English and German into Slovak through automatic error rate metrics. In the category of prediction and syntactic-semantic correlativeness, statistical MT showed a significantly higher error rate than neural MT. Conversely, in the category of lexical semantics, neural MT showed a significantly higher error rate than statistical MT. The results indicate that relying solely on the reference when determining MT quality is insufficient. However, when combined with residuals, the reference offers a more objective view of MT quality and facilitates the comparison of statistical MT and neural MT.
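To make the idea of pairing an error rate metric with residuals concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' procedure: the abstract does not say which error rate metrics were used or how residuals were defined. The sketch assumes a plain word error rate (word-level edit distance divided by reference length) and assumes a per-segment residual equal to the neural score minus the statistical score. The function names `word_error_rate` and `segment_residuals` are hypothetical.

```python
def word_error_rate(hypothesis: str, reference: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    hyp, ref = hypothesis.split(), reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # reference word missing in hypothesis
                            curr[j - 1] + 1,      # extra word in hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def segment_residuals(smt_outputs, nmt_outputs, references):
    """Per-segment residual, ASSUMED here to be WER(NMT) - WER(SMT)."""
    return [
        word_error_rate(nmt, ref) - word_error_rate(smt, ref)
        for smt, nmt, ref in zip(smt_outputs, nmt_outputs, references)
    ]


# Toy data, invented for illustration only.
references = ["a b c d"]
smt_outputs = ["a b c d"]
nmt_outputs = ["a b x d"]
residuals = segment_residuals(smt_outputs, nmt_outputs, references)
```

Under this assumed sign convention, a positive residual means the neural output has the higher error rate for that segment, and a negative residual means the statistical output does.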
