Identifying Transmission Clusters with Cluster Picker and HIV-TRACE

AIDS Res Hum Retroviruses. 2017 Mar;33(3):211-218. doi: 10.1089/AID.2016.0205. Epub 2016 Dec 13.

Abstract

We compared the behavior of two approaches (Cluster Picker and HIV-TRACE) for identifying transmission clusters at varying genetic distance thresholds. We used three HIV gp41 sequence datasets originating from the Rakai Community Cohort Study: (1) next-generation sequence (NGS) data from nine linked couples; (2) NGS data from longitudinal sampling of 14 individuals; and (3) Sanger consensus sequences from a cross-sectional dataset (n = 1,022) containing 91 epidemiologically linked heterosexual couples. We calculated the optimal genetic distance threshold to separate linked from unlinked sequences in the NGS datasets using a receiver operating characteristic (ROC) curve analysis. We evaluated the number, size, and composition of clusters detected by Cluster Picker and HIV-TRACE at six genetic distance thresholds (1%-5.3%) on all three datasets. We further tested the effect of using all NGS variants, versus only a single variant for each patient/time point, for datasets (1) and (2). The optimal gp41 genetic distance threshold to distinguish linked from unlinked sequences was 5.3% for couples and 4% for individuals. HIV-TRACE tended to detect fewer, larger clusters, whereas Cluster Picker detected more clusters containing only two sequences. For NGS datasets (1) and (2), HIV-TRACE and Cluster Picker detected all linked pairs at 3% and 4% genetic distances, respectively. However, at 5.3% genetic distance, 20% of couples in dataset (3) did not cluster using either program, and for more than one-third of couples cluster assignments were discordant. We suggest caution in choosing thresholds for clustering analyses in a generalized epidemic.
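The threshold selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' analysis: the abstract does not say which ROC criterion was used, so maximizing Youden's J (sensitivity + specificity - 1) is an assumption here, and the function name `youden_threshold` and the toy distances are hypothetical.

```python
def youden_threshold(linked, unlinked):
    """Return (threshold, J) maximizing sensitivity + specificity - 1.

    A pair is called "linked" when its genetic distance is <= threshold.
    Youden's J is an assumed criterion; the abstract does not name one.
    """
    # Every observed distance is a candidate cut-off.
    candidates = sorted(set(linked) | set(unlinked))
    best_t, best_j = None, -1.0
    for t in candidates:
        # Sensitivity: fraction of truly linked pairs at or below the cut-off.
        sens = sum(d <= t for d in linked) / len(linked)
        # Specificity: fraction of truly unlinked pairs above the cut-off.
        spec = sum(d > t for d in unlinked) / len(unlinked)
        j = sens + spec - 1
        # Strict ">" keeps the smallest threshold when several tie.
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Hypothetical toy distances (substitutions per site), NOT data from the study.
linked = [0.010, 0.020, 0.030, 0.045]
unlinked = [0.040, 0.060, 0.080, 0.100]
threshold, j = youden_threshold(linked, unlinked)
print(f"threshold = {threshold:.3f}, J = {j:.2f}")
```

With overlapping linked and unlinked distances, as in this toy input, no cut-off separates the two groups perfectly, which is one way to see why threshold choice affects which pairs end up clustered.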

Keywords: HIV; Uganda; viral clustering.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, N.I.H., Intramural

MeSH terms

  • Adolescent
  • Adult
  • Cluster Analysis*
  • Disease Transmission, Infectious
  • Female
  • HIV / classification
  • HIV / genetics
  • HIV / isolation & purification
  • HIV Infections / transmission*
  • Humans
  • Male
  • Middle Aged
  • Molecular Epidemiology / methods*
  • Sequence Analysis, DNA
  • Young Adult