HIV-1 proviral landscape characterization varies by pipeline analysis

J Int AIDS Soc. 2021 Jul;24(7):e25725. doi: 10.1002/jia2.25725.

Abstract

Introduction: HIV rebounds after cessation of antiretroviral therapy, representing a barrier to cure. To better understand the virus reservoir, analysis pipelines have been developed that categorize proviral sequences as intact or defective, and further determine the precise nature of the sequence defects that may be present. We investigated the effects that different analysis pipelines had on the characterization of HIV-1 proviral sequences.

Methods: We used single genome amplification to generate near full-length (NFL) HIV-1 proviral DNA sequences, defined as amplicons greater than 8000 base pairs in length, isolated from peripheral blood mononuclear cells (PBMC) of treated suppressed participants with HIV-1. Amplicons underwent direct next-generation single genome sequencing and were analysed using four HIV-1 proviral characterization pipelines. Sequences were characterized as intact or defective; defective sequences were assessed for the number and types of defects present. To confirm and extend our findings, 691 proviruses from the Proviral Sequence Database (PSD) were analysed and the ProSeq-IT tool of the PSD was used to characterize both the participant and PSD proviruses.

Results and discussion: Virus sequences derived from thirteen ART-treated virologically suppressed participants with HIV were studied. A total of 693 HIV-1 proviral sequences were generated, 282 of which were NFL. An average of 53 sequences per participant was analysed. We found that proviruses often harbour multiple sequence defect types (mean 2.7, 95% confidence interval [CI] 2.5, 3.0); the elimination order used by each pipeline affected the percentage of proviruses allotted into each defect category. These differences varied between participants, depending on the number of defect categories present in a given provirus sequence. Pipeline-specific differences in characterizing the HIV-1 5' untranslated region (5' UTR) led to an overestimation of the number of intact NFL proviral sequences, a finding corroborated in the independent PSD analysis. A comparison of the four published pipelines to ProSeq-IT found that ProSeq-IT was more likely to characterize proviruses as intact.
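
The effect of elimination order can be illustrated with a minimal sketch. It assumes that a pipeline assigns each provirus to the first defect category, in its own fixed order, that the sequence carries. The defect names, the two orders and the four toy proviruses below are hypothetical and are not taken from the study or from any published pipeline.

```python
from collections import Counter


def classify(defects, elimination_order):
    """Return the first category in elimination_order found in defects.

    A provirus with no defects is called 'intact'; a provirus whose
    defects are all outside the order is called 'other'.
    """
    for category in elimination_order:
        if category in defects:
            return category
    return "intact" if not defects else "other"


def landscape(proviruses, elimination_order):
    """Percentage of proviruses allotted to each category."""
    counts = Counter(classify(d, elimination_order) for d in proviruses)
    total = len(proviruses)
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical toy data: each provirus is the set of defect types it carries.
proviruses = [
    {"large_deletion", "hypermutation"},
    {"hypermutation", "psi_defect"},
    {"large_deletion", "psi_defect"},
    set(),  # no defects detected
]

# Two hypothetical elimination orders standing in for two pipelines.
order_a = ["large_deletion", "hypermutation", "psi_defect"]
order_b = ["hypermutation", "psi_defect", "large_deletion"]

print(landscape(proviruses, order_a))
print(landscape(proviruses, order_b))
```

In this toy set, order A allots half of the proviruses to the large deletion category, whereas order B allots none to it, although the underlying sequences are identical. Only the provirus without defects is classified the same way by both orders, which is consistent with the observation that differences depend on how many defect categories a provirus carries.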

Conclusions: The choice of pipeline used for HIV-1 provirus landscape analysis may bias the classification of defective sequences. To improve the comparison of provirus characterizations across research groups, the development of a consensus elimination pipeline should be prioritized.

Keywords: HIV-1; analysis pipeline; near full-length genome; proviral landscape; provirus characterization; reservoir.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • DNA, Viral
  • HIV Infections* / drug therapy
  • HIV-1* / genetics
  • Humans
  • Leukocytes, Mononuclear
  • Proviruses / genetics

Substances

  • DNA, Viral