Predicting Antigenic Distance from Genetic Data for PRRSV-Type 1: Applications of Machine Learning

Microbiol Spectr. 2023 Feb 14;11(1):e0408522. doi: 10.1128/spectrum.04085-22. Epub 2022 Dec 13.

Abstract

The control of porcine reproductive and respiratory syndrome (PRRS) remains a significant challenge due to the genetic and antigenic variability of the causative virus (PRRSV). PRRSV management relies predominantly on vaccines and live virus inoculation to confer immunity against PRRSV on farms. Understanding cross-protection among strains is crucial for the continued success of these interventions, yet how genetic diversity translates to antigenic diversity remains poorly understood. We developed machine learning algorithms to estimate antigenic distance in silico from genetic sequence data and to identify differences at specific amino acid sites that are associated with antigenic differences between viruses. First, we obtained antigenic distance estimates derived from serum neutralization assays in which PRRSV monospecific antisera were cross-reacted with isolates of 27 PRRSV1 viruses circulating in Europe. Antigenic distances were weakly to moderately associated with ectodomain amino acid distance for open reading frames (ORFs) 2 to 4 (ρ < 0.2) and ORF5 (ρ = 0.3), respectively. Dividing the antigenic distance values at the median, we then categorized the serum-virus pairs into two levels: low and high antigenic distance (dissimilarity). In the machine learning models, we used amino acid distances in the ectodomains of ORFs 2 to 5 and site-wise amino acid differences between the viruses as potential predictors of antigenic dissimilarity. Using mixed-effect gradient boosting models, we estimated the antigenic distance (high versus low) between serum-virus pairs with an accuracy of 81% (95% confidence interval, 76 to 85%); sensitivity and specificity were 86% and 75%, respectively. We demonstrate that sequence data can be used to estimate antigenic distance and potential cross-protection between PRRSV1 strains.

IMPORTANCE Understanding cross-protection between cocirculating PRRSV1 strains is crucial to reducing losses associated with PRRS outbreaks on farms. While experimental studies to determine cross-protection are instrumental, these in vivo studies are not always practical or timely for the many cocirculating and emerging PRRSV strains. In this study, we demonstrate the ability to rapidly estimate potential immunologic cross-reaction between different PRRSV1 strains in silico using sequence data routinely collected by production systems. These models can provide fast turn-around information crucial for improving PRRS management decisions, such as selecting vaccines or live virus inoculations to be used on farms, assessing the risk of outbreaks by emerging strains on farms previously exposed to certain PRRSV strains, and developing vaccines, among others.
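
The data-preparation and evaluation steps summarized in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code. It assumes that amino acid distance is the simple proportion of differing sites between two aligned sequences, and that values equal to the median fall in the "low" class; the abstract specifies neither. All function names and the toy sequences are hypothetical, and the mixed-effect gradient boosting model itself is not reproduced here.

```python
from statistics import median


def site_differences(seq_a, seq_b):
    """Return a 0/1 list marking aligned positions where the residues differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [int(a != b) for a, b in zip(seq_a, seq_b)]


def aa_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Assumption: a plain p-distance; the study may use a different metric.
    """
    diffs = site_differences(seq_a, seq_b)
    return sum(diffs) / len(diffs)


def dichotomize_at_median(distances):
    """Label each value 1 (high) if strictly above the median, else 0 (low).

    Assumption: ties at the median are assigned to the low class.
    """
    cut = median(distances)
    return [int(d > cut) for d in distances]


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for 0/1 labels (1 = high).

    Requires at least one true positive-class and one true negative-class
    item, otherwise a ZeroDivisionError is raised.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


if __name__ == "__main__":
    # Toy aligned ectodomain fragments (made up for illustration only).
    serum_strain = "NGSA"
    test_virus = "NGTA"
    print(site_differences(serum_strain, test_virus))
    print(aa_distance(serum_strain, test_virus))

    # Toy antigenic distances for four serum-virus pairs.
    labels = dichotomize_at_median([1.0, 2.0, 3.0, 4.0])
    print(binary_metrics(labels, [0, 0, 1, 0]))
```

In this sketch, the per-ORF distance and the site-wise 0/1 vector would serve as the predictors for each serum-virus pair, and the median-split label would be the outcome that a classifier is trained to predict.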

Keywords: bioinformatics; cross-protection; immune response; immunodominant sites; immunogenicity; machine learning; seroneutralization.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Animals
  • Antigenic Variation
  • Cross Protection
  • Cross Reactions
  • Genetic Variation
  • Machine Learning*
  • Phylogeny
  • Porcine Reproductive and Respiratory Syndrome*
  • Porcine respiratory and reproductive syndrome virus* / genetics
  • Swine