Skip to main content
medRxiv
  • Home
  • About
  • Submit
  • ALERTS / RSS
Advanced Search

Modelling drug resistance emergence and transmission in HIV-1 in the UK

View ORCID ProfileAnna Zhukova, View ORCID ProfileDavid Dunn, View ORCID ProfileOlivier Gascuel, Anti-HIV Drug Resistance Network the UK HIV Drug Resistance Database & the Collaborative HIV
doi: https://doi.org/10.1101/2023.04.24.23288580
Anna Zhukova
1Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, Paris, France
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Anna Zhukova
  • For correspondence: anna.zhukova{at}pasteur.fr olivier.gascuel{at}mnhn.fr
David Dunn
2Medical Research Council Clinical Trials Unit & Institute for Global Health, University College London, London, UK
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for David Dunn
Olivier Gascuel
3Institut de Systématique, Evolution, Biodiversité (ISYEB) - URM 7205 CNRS, Muséum National d’Histoire Naturelle, SU, EPHE & UA, Paris, France
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • ORCID record for Olivier Gascuel
  • For correspondence: anna.zhukova{at}pasteur.fr olivier.gascuel{at}mnhn.fr
  • Abstract
  • Full Text
  • Info/History
  • Metrics
  • Supplementary material
  • Data/Code
  • Preview PDF
Loading

Abstract

A deeper understanding of HIV-1 transmission and drug resistance mechanisms can lead to improvement in current treatment policies. However, the rates at which HIV-1 drug resistance mutations (DRMs) are acquired and at which transmitted DRMs persist are multi-factorial and vary considerably between different mutations. We develop a method for estimation of drug resistance acquisition and transmission patterns, which refines the method we described in Mourad et al. AIDS 2015. The method uses maximum likelihood ancestral character reconstruction informed by treatment roll-out dates and allows for analysis of very large data sets. We apply our method to transmission trees reconstructed on the data obtained from the UK HIV drug resistance database to make predictions for known DRMs. Our results show important differences between DRMs, in particular between polymorphic and non-polymorphic DRMs, and between the B and C subtypes. Our estimates of reversion times, based on a very large number of sequences, are compatible but more accurate than those already available in the litterature, with narrower confidence intervals. We consistently find that large resistance clusters are associated with polymorphic DRMs and DRMs with long loss time, which require special surveillance. As in other high-income countries (e.g. Switzerland), the prevalence of sequences with DRMs is decreasing, but among these, the fraction of transmitted resistance is clearly increasing compared to the fraction of acquired resistance mutations. All this indicates that efforts to monitor these mutations and the emergence of resistance clusters in the population must be maintained in the long term.

1. Introduction

Drug resistance is an increasing health problem. Drug resistance mutations (DRMs) emerge in HIV viruses through selective pressure during antiretroviral therapy (ART) and make the current ART drug combination ineffective both for sustaining the patient’s well-being and for prevention of virus transmission [1,2]. Drug resistant viruses can therefore be transmitted to treatment-naive patients, who in turn can transmit them further [3,4], endangering the efficacy of treatment for the whole population. The rates at which DRMs are acquired and transmitted drug resistance (TDR) mutations persist are likely to be multi-factorial and have been shown to vary considerably depending on duration and type of treatment, and mutations [5]. Hence, having a deeper understanding of HIV transmission and drug resistance mechanisms is important as it can lead to improvement in current treatment policies [6].

Phylodynamics uses phylogenetic trees (i.e. genealogies of the pathogen population) inferred from the pathogen sequence data to estimate the epidemiological parameters. Several phylodynamic models of pathogen transmission were developed [7–10].

An important trade-off in phylodynamic modeling is between the complexity of the biological questions that a model can address and its computational speed. On one side of the spectrum there are computationally-light statistical approaches, such as the study by Mourad et al. [4] of persistence times of drug-resistance in the HIV-1-infected untreated population in the UK. The analysis used a a parsimony-based approach [11] to extract “phylotypes” of sequences, the most recent common ancestor of which was bearing a resistant mutation that is still shared by the majority of the sequences in the phylotype. Once dated and combined with the treatment-naive/experienced status, these phylotypes were used to zoom on the most readable parts of the phylogeny and compute simple statistics which are immediately accessible from the annotated tree. The simplicity of the method makes it computationally very efficient. It was applied to a large set of ≈ 25, 000 HIV-1 subtype B sequences from the UK, where it showed that around 70% of transmitted drug-resistance had a treatment-naive source.

However to address more refined questions such as estimation of rates of different events (transmission, drug resistance acquisition, etc.), more complex methods are needed, such as modeling of the viral dynamics with ordinary differential equations (ODEs). Kühnert et al. [9] proposed a piecewise-constant two-type (resistant and sensitive) birth–death model to estimate the fitness cost of DRMs. The fitness was measured as a ratio between transmission rates of hosts infected by drug resistant strains and transmission rates of hosts infected by sensitive strains. They applied this model to the data from the Swiss HIV cohort study. They reconstructed a maximum likelihood tree for 5 638 pol-gene sequences from the Swiss HIV cohort study and 4 284 closely related sequences from the Los Alamos HIV database. On this tree, for each of 15 major DRMs present in the Swiss cohort sequences, they identified its transmission clusters of up to 250 sequences each, containing > 80% of Swiss sequences and at least one sequence with the mutation. Kühnert et al. [9] estimated the model parameters on all the clusters for each mutation separately, in a Bayesian setting. To account for the fact that DRMs appear under aniretroviral (ARV) selective pressure, they put the rates of state change (from sensitive to resistant and vice versa) to zero before significant usage of the related drug(s) in Switzerland. The study showed that some of the mutations (RT:D67N, RT:K70R, RT:M184V, RT:K219Q) decreased the fitness, one (PR:L90M) seemed to increase the fitness, while the others did not have a significant effect.

The models above, and more generally the family of multi-type birth-death models [7] with a Bayesian birth-death skyline plot (allowing the parameters to change in a piece-wise constant manner) [12] it belongs to, define ODEs for fine-tuned parameter estimation. However, their complexity prevents resolving them analytically. Numerical solution of the ODEs, on the other hand, takes long computational time and prevents the application of these models to larger datasets (dozens of thousands of sequences), while larger data sets are desirable for more accurate parameter estimation.

A compromise between the model complexity and computational speed when applied to large datasets needs to be found. In this study we propose such a compromise that improves the approach by Mourad et al. [4] by using maximum likelihood and combining it with the skyline ideas of Stadler et al. [8], for analysis of DRM transmission patterns.

Our approach uses ancestral character reconstruction (ACR) on a partially sampled transmission tree. Using ancestral scenario reconstruction tool PastML [13], we study ancestral states for presence/absence of common surveillance DRMs. In a tree annotated with PastML, we can discriminate between two types of resistant nodes: (1) those whose parent node does not have the DRM, which correspond to acquired drug resistance (ADR), and (2) those whose parent node is also resistant, such nodes form TDR clusters. We also identify the scenarios of DRM loss (when the parent node has the mutation, while the child does not). Moreover, we account for the changes in treatment policies by allowing for separate ACR for different time intervals (e.g. before and after the first DRM-provoking ARV introduction). Once the reconstruction is performed we visualize the results with PastML and calculate various statistics for transmission patterns.

We apply our approach to analyze the patterns of DRM emergence, transmission and loss in HIV-1-infected individuals in the UK, using sequences and metadata from the UK HIV Drug Resistance Database [14].

2. Materials and Methods

The UK HIV Drug Resistance Database provides HIV protease (PR) and reverse transcriptase (RT) sequences extracted during the resistance tests and the corresponding metadata (e.g., treatment status of the patient before the test: treatment-experienced, -naive, or unknown; and date of the test).

In response to our request for data from the database we obtained 88 009 sequences for 60 846 different patients, sampled between 1996 and 2016.

2.0.1. Sequence subtyping and alignment

We subtyped (pure subtypes and recombination positions) and aligned the sequences against the Los Alamos 2010 subtype reference pol-gene alignment [15] using jpHMM [16] (for detailed options see Appendix A).

All together, we obtained a large alignment of 88 009 sequences, from which we extracted the alignments for the B and C subtypes. We filtered them to contain only the first sequence (in terms of sampling date) when several sequences were present for the same patient. We hence obtained a 40 055-sequence alignment for the B subtype, and a 19 139-sequence alignment for the C subtype. To each of them we added five randomly selected HIV-1 group M sequences of other pure subtypes to be used as an outgroup for tree rooting.

2.0.2. Transmission tree reconstruction

We reconstructed phylogenetic trees for B and C sequences separately, using RAxML-NG (v0.9.0, evolutionary model GTR+G4+FO+IO, for detailed options see Appendix A) [17] and rooted them with the outgroup sequences, which we then removed. For tree reconstruction, the positions of surveillance DRMs were removed from the alignment, as they are influenced by treatment-selection forces unlike the other positions, and could bias the reconstruction by grouping together the sequences that share the same DRMs.

We then dated each tree with LSD2 [18] (v2.3: github.com/tothuhien/lsd2/tree/v1.4.2.2, under strict molecular clock with outlier removal, for detailed options see Appendix A) using tip sampling dates.

2.0.3. Ancestral character reconstruction

For each DRM (surveillance or accessory) listed in the Stanford HIV Drug Resistance Database [19] we extracted its presence/absence in the sequences of our data sets and the ARVs that can provoke it with Sierra, the Stanford Algorithm [20] web service. We then analyzed the DRMs that were found in at least 0.5% of sequences (after filtering by patient and temporal outlier removal) of our dataset (either B or C, analyzed separately).

Each DRM name (e.g. RT:T215D) contain 2 pieces of information: the DRM position (e.g. RT:T215, which in turn contains the protein name: RT (reverse transcriptase) or PR (protease), the reference position of the amino acid, e.g. 215, and its wildtype amino acid, e.g. T), and its mutated amino acid, associated with resistance (e.g. D).

We analyzed each DRM position independently by reconstructing its states in the ancestral nodes, based on the tip states. The DRMs with prevalence > 0.5% found in this position in the dataset of interest (e.g. B) were analyzed together. For the majority of the DRM positions, only one DRM with prevalence > 0.5% was found (e.g. PR:L90M for the position PR:L90), however for the positions RT:T215 and RT:K219 in the B data set and for the position RT:V179 in the C data set, several DRMs were found (RT:T215D/F/S/Y, RT:K219E/N/Q, and RT:V179D/E).

Possible states for ancestral character reconstruction (ACR) corresponded to DRM presence (i.e. the resistant state) or absence (sensitive state) for DRM positions with only one DRM. For instance, for PR:L90M the resistant state corresponds to the amino acid M, and the sensitive state to any other amino acid; in practice, the sensitive state is almost uniquely L. In the B data set, 97.79% of sequences have L at the position PR:90; 1.93% have M; less than 0.01% have W or F, and 0.23% have an ambiguity at this position (so their initial state for ACR is unresolved between sensitive and resistant). For positions with several DRMs, the resistant state was split into all the possibilities (e.g., D, F, S or Y for RT:T215).

For polymorphic mutations (e.g. RT:S68G), ACR was performed on the corresponding (B or C) time-scaled tree with PastML (v1.9.40, MAP (maximum a posteriori) decision rule) without taking into account the year of ARV acceptance, as the these mutations could be present independently of ARVs.

To reconstruct the ancestral character states for non-polymorphic DRMs, we used the procedure visualised in Figure 1a (which we first proposed and applied to study HIV resistance patterns in Cuba in [21]). For each ARV we extracted the dates of their acceptance with Wikipedia python package (https://github.com/goldsmith/Wikipedia). We cut the time-scaled tree at the earliest of the dates of acceptance of ARVs that can provoke the DRM (e.g. for PR:L90M, saquinavir (SQV) was accepted in 1995). We hence obtained the pre-treatment-introduction tree and a forest of post-treatment-introduction subtrees. For the trees in the forest we added additional one-child root nodes (as parents of the corresponding tree roots, at distances that corresponded to the differences between the root dates and the ARV acceptance date), which we marked as sensitive in the PastML input annotation file. We performed ACR with PastML on the forest, and then combined it with the all-sensitive annotation for the pre-treatment-introduction tree nodes.

Figure 1.
  • Download figure
  • Open in new tab
Figure 1. ACR for DRMs.

a) To reconstruct the ancestral character states, resistant (violet, e.g. M) or sensitive (gray), for a non-polymorphic DRM (e.g. PR:L90M), we cut the time-scaled tree at the date of acceptance of the first ARV that can provoke this DRM (for PR:L90M, SQV accepted in 1995), as shown in the left panel. We hence obtain the pre-treatment-introduction tree (upper part of the tree) and a forest of post-treatment-introduction subtrees (bottom part). For the trees in the forest we then mark their roots as sensitive (middle left panel). We perform the ACR with PastML on the forest (middle right panel) and combine the results with the the all-sensitive annotation for the pre-treatment-introduction tree nodes (right panel).

b) To reconstruct the ancestral character states for the DRM position RT:V179, corresponding to a polymorphic DRM RT:V179D (violet), but also to a non-polymorphic DRM RT:V179E (orange), we cut the time-scaled tree at the date of acceptance of the first ARV that can provoke RT:V179E (NVP, accepted in 1996), as shown in the left panel. For the trees in the after-1996 forest we then mark their roots as either sensitive (gray) or D (violet, middle left panel) and perform the ACR with PastML (middle right panel). We then extended this reconstruction to the before-1996 tree only for RT:V179D (right panel).

For two of the multiple-DRM positions (RT:T215 and RT:K219) all the corresponding DRMs were non-polymorphic and provoked by the same ARVs (the earliest accepted being zidovudine (AZT, accepted in 1987) for all of them). We therefore cut the tree as explained above, and reconstructed the ACR for D, F, S, Y or sensitive (for RT:T215), and for E, N, Q or sensitive (for RT:K219) on the after-1987 forest.

Finally, for RT:V179, the mutation RT:V179D was polymorphic, while RT:V179E was non-polymorphic (provoked by nevirapine, NVP, accepted in 1996). To reconstruct ancestral characters for RT:V179, we followed the procedure visualised in Figure 1b: First, we cut the tree at 1996, and reconstructed the ancestral characters (E, D or sensitive) on the after-1996 forest (the input states for the forest roots were sensitive or D). We then extended this reconstruction on the before-1996 tree only for RT:V179D (i.e., possible states: D or sensitive).

Once ACR was performed for all the DRM positions, we combined the predictions into a common table mapping node names to their states. A node state was sensitive if no DRM was reconstructed for this node at any position, otherwise the state was a combination of DRMs reconstructed for this node in separate DRM analyses (e.g. RT:K103N+RT:V106I if those DRMs were reconstructed as present for the node of interest while the others were reconstructed as absent). We visualized this combined result using the COPY method of PastML.

Transmitted versus acquired drug resistance

On a tree whose nodes are annotated with their DRM status, present (resistant) or absent (sensitive), we defined three configurations: transmitted drug resistance (TDR), acquired drug resistance (ADR), and DRM loss (see Figure 2).

Figure 2.
  • Download figure
  • Open in new tab
Figure 2. ADR and TDR scenarios.

Each panel represents (left) a configuration observed in a tree whose nodes are annotated with their non-polymorphic DRM status (resistant nodes are violet, sensitive are gray), and (right) the most parsimonious transmission scenario (i.e. with the least number of events) leading to this configuration. The TDR clusters corresponding to the inferred scenarios are shown with violet background. a) The observed tree (left) contains a tip, corresponding to a sample of a resistant virus from a treatment-experienced individual, while its parent node (corresponding to a transmission) is sensitive. In the simplest scenario (right) the treatment-experienced individual’s virus acquired the DRM after the last observed transmission. b) The observed tree (left) contains a tip, corresponding to a sample of a resistant virus from a treatment-naive individual, while its parent node (corresponding to a transmission) is sensitive. The simplest scenario (right) includes a hidden transmission of a resistant virus from an unsampled treatment-experienced individual (dashed node and branch), whose virus previously acquired the DRM. c) The observed tree (left) contains one or several (here three) connected internal resistant nodes (corresponding to transmissions), leading to some treatment-naive tips (here three) and at least one treatment-experienced tip. Some of the tips might be sensitive (here one), while the others (here three) are resistant. In the simplest scenario (right) the treatment-experienced individual’s virus first acquired the DRM, then transmitted it to one (or several, here two) treatment-naive individuals, who might have further transmitted the resistant virus between them (here the transmission on the right). Some of the viruses might have eventually lost the DRM in the absence of drug-selective pressure (here the treatment-naive sensitive tip in the bottom). d) The observed tree (left) contains one or several (here three) connected internal resistant nodes (corresponding to transmissions), leading to only treatment-naive tips (here four). In the simplest scenario (right) an unsampled (and hence unobserved) treatment-experienced individual’s virus first acquired the DRM (before the oldest resistant internal node), then its host (dashed line) transmitted it (dashed node) to one (or several, here one) treatment-naive individuals of the observed cluster, who might have further transmitted the resistant virus between them (here all the three transmissions). Some of the viruses might have eventually lost the DRM in the absence of drug-selective pressure (here the treatment-naive sensitive tip in the bottom).

We defined ADR cases as parent-child node pairs, where the parent DRM status is sensitive, while the child DRM status is resistant.

We defined TDR cases inferred from the tree as either:

  1. an internal node whose state was estimated as resistant (i.e. containing the DRM of interest, see Figure 2c,d). As the internal nodes of the tree roughly correspond to transmissions, such a node indicates a transmission of a resistant virus.

  2. (for non-polymorphic mutations only) a hidden internal node between a node whose DRM status is resistant and its parent node whose DRM status is sensitive, if all the tips in the node’s subtree are treatment-naive. According to the treatment status and the fact that the mutation is non-polymorphic, the initial resistance could not be acquired through treatment pressure, and hence must have been transmitted from a patient who was not sampled (and does not appear in the tree, see Figure 2b,d).

Connected parts of the tree corresponding to TDR cases form TDR clusters (see Figure 2). We calculated their sizes as the numbers of resistant tips connected to each cluster. Note that if a TDR cluster subtree contains only treatment-naive patients, it implies that its root ADR event corresponds to an unsampled treated patient (see Figure 2b,d).

We define DRM loss cases as parent-child node pairs, where the parent DRM status is resistant, while the child DRM status is sensitive.

Using these configurations, we calculate the source of the DRM status of each tip in the tree as follows.

For non-polymorphic DRMs

  1. For treatment-naive tips, the source of their DRM status is:

    • TDR if the tip is resistant (see Figure 2a,d);

    • TDR+DRM loss if the tip is sensitive and is involved in a DRM loss configuration (see Figure 2c,d);

    • transmission of a virus without the DRM if the above two cases do not apply.

  2. For treatment-experienced tips, the source of their DRM status is:

    • ADR (+DRM loss if the tip is sensitive) for one of the treatment-experienced tips connected to a TDR cluster (see Figure 2c). The patient corresponding to this tip is assumed to be the source of the TDR cluster. The later DRM loss is possible if the treatment was changed to drugs that do not provoke the DRM in question. For other treated tips connected to this cluster, we assume that they received a resistant virus via TDR. Assuming their treatment was such that it could not provoke the DRM in question, they could later lose it (hence +DRM loss if they are sensitive);

    • ADR for a resistant tip not connected to a TDR cluster (Figure 2a);

    • transmission of a virus without the DRM if the above cases do not apply.

  3. For the tips whose treatment status is unknown, we consider both cases (naive or resistant) with equal probabilities (0.5).

For polymorphic DRMs we do not consider the treatment status (as such DRMs could appear independently of treatment) and calculate the source of each tip’s DRM status as follows:

  • ADR for a resistant tip not connected to a TDR cluster (as in Figure 2a, independently of the treatment status);

  • ADR (+DRM loss if the tip is sensitive) for one of the tips connected to a TDR cluster (as in Figure 2c, independently of the treatment status). The individual corresponding to this tip is assumed to be the source of the TDR cluster. For other tips connected to this cluster, we assume that they received a resistant virus via TDR. They could later lose it (hence +DRM loss if they are sensitive);

  • transmission of a virus without the DRM if the above cases do not apply.

We count the numbers of tip DRM status sources of each type (ADR: NADR, TDR: NTDR, or loss: Nloss (see Appendix B for details) and report the results in Tables 2 and 3. We count all the identified DRM loss events, all the identified (observed and hidden) TDR events, and only those of the ADR events that are not at the root of naive-only TDR clusters, as the latter happened in unsampled treatment-experienced patients (see Figure 2b,d).

Note that Nresistant tips = NADR + NTDR − Nloss. For example, in Figure 2c, all the events correspond to observed tips, so we count one ADR, three TDR, and one DRM loss events: Nresistant tips = 3 = 1 + 3 − 1. Figure 2d represents a more complex case: We count one hidden TDR event (as it led to the resistance status of one of the observed tips) and three observed TDR events (leading to resistance statuses of other observed tips). We do not count the ADR event (as it corresponds to an unobserved patient, whose virus is not in our data set). We also count one DRM loss event, which led to one of the tips regaining its sensitive state. Hence Nresistant tips = 3; NADR = 0; NTDR = 4; Nloss = 1; 3 = 0 + 4 − 1.

Times of DRM loss

We estimated the loss times for non-polymorphic DRMs, using survival analysis with an exponential (constant hazard) model (Weibull model with β = 1), implemented in Python3 package SurPyval (github.com/derrynknife/SurPyval, v0.10.10). This model takes as input observations about event durations and estimates the rate at which the event occurs. The input data might be left-, right- or interval-censored. Left-censored data represent times that are longer than the event occurrences, e.g. if the DRM loss occurred in exactly 2 years, but the observation was only made after 3 years, the 3-year duration represents a left-censored data point. Right-censored data represent times that are shorter than the event occurrences, e.g. if for the same DRM loss the only observation was made after 1 year (and observed no DRM loss yet), the 1-year duration represents a right-censored data point. Interval-censored data represents cases when both a left- and a right-censored data point are available, e.g. for the same DRM loss an interval-censored data might state that it occurred sometime between 1 and 3 years.

For each individual represented in our data set we extracted at most one data point for the loss survival analysis, as described below.

A right-censored data point represents the maximal observed duration during which a mutation loss did not occur. We extracted such points for the individuals who had several consecutive treatment-naive samples with the DRM of interest (and of the subtype of interest) in our metadata: we took the difference in sampling times of the last such sample and the first one.

A left-censored data point represents a duration that is longer than the mutation loss time. To estimate such a duration we needed to know not only (1) the time by which the individual’s virus lost the DRM, but also (2) the earliest time by which it could have acquired it (the difference making an upper limit on the loss duration). For (1) we used the time of the earliest sample without the DRM, provided it was preceded by samples with the DRM. For (2) we used either (2a) the time of the latest sample without the DRM preceding the aforementioned samples (where the DRM was present and then lost), if such sample existed in the metadata, or (2b) if the earliest metadata sample already had the DRM (which implies it corresponded to a resistant tip in the tree), the time of the tip’s most recent ancestral node whose status was sensitive (with marginal probability > 0.95).

For individuals for whom both a left- and a right-censored data point was present, we converted them to an interval-censored one.

We reported the resulting DRM loss time estimates (i.e. inverse of the loss rates) for non-polymorphic DRMs with at least 5 left-censored and 5 right-censored data points (interval-censored data points counted as both). We estimated confidence intervals (CIs) as the 2.5- and 97.5-percentiles of the loss times estimated on bootstrapped data points of the same size (with 1 000 repetitions).

3. Results

3.1. HIV in the UK

Antiretroviral therapy (ART) was introduced in the UK more than 30 years ago and transformed HIV from a fatal infection into a chronic, manageable condition [22–24]. It is accepted that successful ART results in an “undetectable” viral load which is protective from passing on the virus to others [25,26].

In the UK, a patient’s viral load is regularly monitored by the clinicians: Patients attend bi-annual or quarterly clinical visits, depending on how well they do on treatment. Moreover, the increase in viral load comes with symptoms (generally opportunistic infections that persist longer than they should). A suspicious increase from undetectable to detectable viral load (i.e. viral rebound) is the first sign of treatment failure.

In case of a viral rebound, the virus is sequenced to discriminate between resistance (presence of a known DRM) and poor adherence (failure without DRM, if a patient does not take the drugs regularly according to prescription). If resistance is the reason for a treatment failure, the treatment is changed.

Therefore, in the case of treatment failure, there is a window of opportunity for the virus to be transmitted: between the time the viral load increases to transmittable levels and the time when the clinician realizes it and changes treatment. The probability of transmission varies across patients, and depends on various factors [5].

The information collected from the HIV drug resistance tests carried out in the UK since 1996 is available in the UK HIV Drug Resistance Database. The database stores protease (PR) and reverse transcriptase (RT) sequences for about 50% of infected individuals in the UK.

3.2. UK HIV data set

We used the data from the UK HIV Drug Resistance Database containing samples from 1996 to 2016 to estimate transmission mechanisms for different common DRMs.

Out of 88 009 initial sequences obtained from the database, the majority were of subtypes B (58 569 sequences, 66.5%) and C (27 151 sequences, 30.1%), we also detected 8 D, 1 F, 2 G, and 3 K (< 0.0001%) sequences, and 2 276 potentially recombinant sequences (2.6%, in particular 494 A,B,G and 446 B,K-recombinants (0.5%)). We report these and other data set statistics in Table 1.

View this table:
  • View inline
  • View popup
  • Download powerpoint
Table 1.

Statistics on the B and C data sets. The “with DRM(s)” statistics count samples with at least one unambiguous resistant amino acid at any DRM position. Samples that contained either non-resistant or ambiguous amino acids at all DRM positions were considered as “without DRMs”. “p DRM(s)” stands for polymorphic DRMs, while “np DRMs” stands for non-polymorphic ones. Note that the same sequence might contain both p and np DRMs (at different positions).

View this table:
  • View inline
  • View popup
  • Download powerpoint
Table 2. DRMs in B data set

with prevalence > 0.5%. Polym. stands for polymorphic DRMs, for non-polymorphic DRMs the first ARV that could provoke it and its acceptance date are shown. Nresistant cases = NTDR + NADR − Nloss.

View this table:
  • View inline
  • View popup
  • Download powerpoint
Table 3. DRMs in C data set

with prevalence > 0.5%. Polym. stands for polymorphic DRMs, for non-polymorphic DRMs the first ARV that could provoke it and its acceptance date are shown.

Nresistant cases = NTDR + NADR − Nloss.

We focused our analysis on subtypes B and C, keeping the first sampled sequence for patients for whom multiple sequences were available. We hence obtained a 40 055-sequence data set for B, and a 19 139-sequence data set for C. We further filtered these data sets by removing temporal outliers (< 2% of sequences), as they could correspond to poorly sequenced samples or erroneous dates. The final data sets contained 39 159 sequences for B, and 18 809 for C.

We detected 161 DRMs found in at least one sequence of the B data set, and 146 DRMs for C. 31.4% of B and 27.4% of C sequences had at least one of these DRMs present, 18.5% of B and 16.9 % of C sequences had only one mutation, while the others had multiple DRMs present. While the subtypes B and C are different, as well as are the locations where these subtypes are most prevalent (African countries for C versus the UK and other European countries for B), we did not detect major differences in DRM distribution in the B and C data sets. Hence, while more C than B sequences correspond to imported cases, the UK health policies must play an important role on their DRM patterns, independently of the subtype. In a recent study Blassel et al. [27] compared DRMs in a UK and an African data sets. They reported that the median number of DRMs in resistant sequences differed between the two datasets (3 in the African sequences versus 1 in the UK sequences). In our case, there was no difference between B and C data sets if all DRMs were considered (median number of 1 DRM for both B and C data sets in resistant sequences); if we considered only non-polymorphic DRMs, a slight difference appeared (1 for B vs 2 for C). Detailed statistics on DRM number distributions are shown in Table A1. There was however a significant difference in the TDR distribution: more TDR could be suspected among the B samples (12% of treatment-naive sequences had non-polymorphic DRMs present, while in the C samples there were only 7% of such sequences).

3.3. Drug resistance analyses

We reconstructed time-scaled phylogenetic trees for B and C data sets and performed ancestral character reconstruction for each of the selected DRMs and positions to look at their transmission patterns. Consistently with what was previously reported in HIV-1 group M studies (of the pol gene [28] and of the full-genome [29]), we estimated a faster mutation rate (1.9 · 10−3 [mutations per site per year]) and a more recent root date (1965) for subtype B than for subtype C (1.4 · 10−3; 1944). More details on B and C datasets can be found in Table 1.

On the time-scaled trees we analyzed the transmission patterns of the DRMs found in at least 0.5% of sequences: 31 DRMs (on 26 different positions) for B and 21 (on 20 different positions) for C. The major drug resistance patterns found in B and C data sets are visualized in Figures 3 and 4. The statistics on these DRMs and their loss times are shown in Tables 2-4.

Figure 3.
  • Download figure
  • Open in new tab
Figure 3. Major resistance patterns found in B data set.

States of tree nodes are shown as labels, e.g.”PR_L90M” corresponds to the presence of DRM PR:L90M and absence of the other DRMs. The nodes are coloured by DRM found in them (if several DRMs are present, the colour of the (lexicographically) first DRM is used). The nodes with no DRM are coloured gray and labelled “sensitive”. The parts of the tree where no state change happens are clustered together into metanodes, their size corresponds to the number of samples (tips) they contain (shown in labels), e.g. “RT_K103N+RT_S68G+RT_T215S 36” (violet, on the bottom) corresponds to a transmitted resistance cluster containing 36 samples in the B data set, having three mutations. Configurations present several times are shown once and the number of occurrences is shown on the corresponding branch, e.g. a branch of size 247 leading to the metanode “RT_V179D 1-8” (salad green, on the top right) represents 247 cases of acquiring the mutation RT:V179D leading to small transmission clusters of sizes between 1 and 8. Configurations representing less than 34 samples are not shown to increase readability.

Figure 4.
  • Download figure
  • Open in new tab
Figure 4. Major resistance patterns found in C data set.

States of tree nodes are shown as labels, e.g. “RT_E138A” (salad green nodes in the middle) correspond to the presence of DRM RT:E138A and absence of the other DRMs. The nodes are coloured by DRM found in them (if several DRMs are present, the colour of the (lexicographically) first DRM is used). The nodes with no DRM found are coloured gray and labelled “sensitive”. The parts of the tree where no state change happens are clustered together into metanodes, their size corresponds to the number of samples (tips) they contain (shown in labels). Configurations present several times are shown once and the number of occurrences is shown on the corresponding branch, e.g. a branch of size 23 leading to the blue metanode “PR_L90M 1-3” (top left) represents 23 cases of acquiring the mutation PR:L90M leading to small resistance clusters of sizes 1–3. Configurations representing less than 11 samples are not shown to increase readability.

While some of the DRMs (e.g. RT:M184V) are comparably prevalent in B and C (4.8% of resistant cases vs 5.4%), others are very subtype-specific. For instance, the non-polymorphic mutation PR:L90M is present in 2.2% of B resistant cases and only in 0.6% of C. Another example is the mutations in position RT:106. In the B dataset the polymorphic DRM RT:V106I is present in 3.9% of resistant cases and is 30 times more prevalent than the non-polymorphic DRM RT:V106M, which was not selected for our analysis due to its low prevalence; while for the C data set we have the opposite distribution: RT:V106M is present in 2% of resistant cases and is 16 times more prevalent than RT:V106I, which was non selected for our analyses. More examples are given in Tables 2 and 3.

Using the metadata only, we can already see that there is a clear difference between polymorphic and non-polymorphic mutations. While the presence of most of the latter ones correlated with the treatment status (e.g. 86.5% of B sequences with the non-polymorphic mutation RT:M184V are from treatment-experienced patients), it is the opposite for the former, which are more prevalent in treatment-naive sequences (e.g. 78.1% of B sequences with RT:S68G are treatment-naive, see Table 2). Indeed, while the polymorphic DRMs can appear spontaneously, the non-polymorphic ones are selected by treatment, and carrying them often implies a fitness cost [9]. However, a few non-polymorphic DRM do not follow this pattern and are more prevalent in treatment-naive individuals: RT:T215D, RT:T215S, RT:K219N in B, and RT:V179E in C. RT:T215D/S area form of reversion and are are often developed in patients primarily infected with strains with RT:T215Y/F, and hence have a higher fitness [30]. It is further confirmed by our estimation of the loss times: the loss times of RT:T215D/S are long (9.3 and 6.8 years versus 1.1 and 1.8 years for RT:T215Y/F, see Table 4), which may explain their prevalence in treatment-naive patients. Similarly, we estimated a rather long loss time for RT:K219N (3.7 years). We did not have enough data to estimate the loss time of RT:V179E. While this mutation is generally considered as non-polymorphic [31], its natural presence in treatment-naive patients has been reported for the HIV-1 common recombinant form CRF55_01B [32].

View this table:
  • View inline
  • View popup
  • Download powerpoint
Table 4.

Loss times (with 95% CIs) for non-polymorphic DRMs found in B and C data sets with prevalence > 0.5% and at least 5 left- and 5 right-censored data points (the exact numbers of data points are shown in Table A3).

Using the information from the tree, we refined the mutation statistics further, classifying resistant mutations sources into TDR vs ADR, and detecting DRM loss events (see Tables 2 and 3). C data set featured smaller TDR clusters (apart from the polymorphic mutation RT:E138A) than B. This could be explained by multiple introductions of subtype C into different regions of the UK and different risk groups, particularly from Africa via immigration, which is consistent with the higher diversity of the C strains observed in our data (C: 0.019 vs B: 0.014 [mutations per site per branch], Table 1) and with the dates of origin of the two UK sub-epidemics (C: 1944 vs B: 1965, Table 1).

A large size (e.g. 78 individuals in the B data set for RT:K103N) of some of the TDR clusters and a rather high proportion of TDR cases among the resistant ones (see Tables 2 and 3) is clinically problematic, as it means a high level of resistant strain transmission, leading to decrease of treatment choice on the population level.

We further analyzed each mutation position over time (see Supplementary Tables S1-S46) and found a common pattern: The proportion of resistant cases with respect to all cases decreases over time, however the proportion of resistant cases in treatment-naive individuals and, consistently, the proportion of TDR with respect to ADR, increases. This pattern is well illustrated by the mutations in position RT:215 (see Figure 5 and Table A2). However, there are exceptions with respect to the decrease in the proportion of resistant cases over time, especially among the polymorphic DRMs, consistent with the fact that they have little or no fitness cost associated with them. For the polymorphic mutation RT:E138A this proportions has been increasing from 2001 to mid-March 2016 (the last sampling time in our data): from 1.9% to 2.2% in the B data set, and from 8.3% to 11.6% in the C data set (Tables S8, S31). Similarly the proportion of resistant cases with polymorphic RT:S68G has been increasing from 4.6% in 2001 to 8.1% in 2016 in B, and from 0.3% to 0.9% in C (Tables S21, S41). The proportion of resistant cases with polymorphic RT:V106I has been increasing in B: from 2% in 2001 to 2.7% in 2016, while the proportion of non-polymorphic RT:V106M (similar to RT:V106I) in C seems to have stabilized at 2% over the last five sampling years (2011-2016, Tables S23, S43). The proportion of resistant cases with polymorphic RT:V179D has been increasing in B: from 1.3% in 2001 to 2% in 2016, so did the proportion of non-polymorphic RT:V179E (similar to RT:V179D) in C: from 0.1% in 2006 to 0.6% in 2016, while the proportion of RT:V179D has stayed stable (∼ 1.5%) over the last 10 sampling years (2006-2016, Tables S25, S45). Finally, the proportion of resistant cases with polymorphic PR:Q58E has been increasing in subtype C: from 0.6% in 2006 to 0.8% in 2016 (Table S28, we did not analyze it for B due to its low prevalence). These results clearly indicate that the spread of polymorphic DRMs should become a subject of particular surveillance.

Figure 5.
  • Download figure
  • Open in new tab
Figure 5.

DRMs with prevalence > 0.5% in position RT:215 in the B data set (wildtype amino acid is T, non-polymorphic AZT-resistant mutations are D, F, S, and Y). ACR was performed for the RT position 215 with five possible states: D (lilac), F (salad green), S (light blue), Y (violet), and other (sensitive, gray). The parts of the tree where no state change happens are clustered together into metanodes, their size corresponds to the number of samples (tips) they contain (shown in labels), e.g. “RT_T215D 99” (lilac, top left) corresponds to a transmitted RT:T215D resistance cluster containing 99 samples in the B data set. Configurations present several times are shown once and the number of occurrences is shown on the corresponding branch, e.g. the branch of size 804 leading to the metanode “RT_T215Y 1-5” (violet, right) represents 804 cases of acquired RT:T215Y mutation leading to small transmission clusters of sizes between 1 and 5. Configurations representing less than 2 samples are not shown to increase readability.

3.3.1. DRM loss times

We estimated the times of DRM loss for non-polymorphic DRMs in our data sets and compared them to the estimates previously reported by Castro et al. [5]. Castro et al. [5] analyzed 313 patients from the UK Drug Resistance database, who were treatment-naive and had a DRM present in their first resistance test (performed between 1997 and 2009), mixing all the subtypes and using survival analysis. We also used survival analysis, but had a larger data set, included the information not only from the metadata, but also from the tree, and analyzed the subtypes separately. Out results and the comparison are shown in Table 4. Overall, out estimates are compatible with those by Castro et al. [5]: the CIs of the two studies intersect for all the DRMs but RT:K103N in the C dataset. The difference for RT:K103N could be explained by the fact that Castro et al. [5] analyzed different subtypes together (though the majority of samples used were from B), while we performed a subtype-specific analysis: our estimate for RT:K103N on the B data set (2.0-2.6 years) is compatible with the one by Castro et al. [5] (2.0-6.8 years). Our CIs are systematically narrower than those of Castro et al. [5].

The visualization using ACR for the DRMs at the position RT:T215 (Figure 5) is consistent with the estimated loss patterns. For two of the mutations found in this position (RT:T215D and RT:T215S) the loss times are long (9.3 and 6.8 years respectively), which allows them to form large TDR clusters (up to 99 and 45 sampled respectively, left and bottom part of Figure 5). For the other two mutations found in this position (RT:T215F and RT:T215Y) the loss times (including potential reversions to D or S) are rather short (1.8 and 1.1 years), which prevents them from forming significant TDR clusters.

4. Discussion

We proposed fast maximum-likelihood ACR methods for investigation of drug resistance patterns in large sequence data sets. Their application to ∼ 40 000 subtype B and ∼ 20 000 subtype C sequences from the UK HIV Drug resistance database allowed us to investigate the trends in drug resistance patterns between 1996 and 2016 and to estimate the loss times for 25 common non-polymorphic DRMs.

An important advantage of our methods is their applicability to very large data sets (dozens of thousands of sequences). Previous studies had to face an uncomfortable choice between using more complex models on filtered data [9] or using less accurate (e.g. parsimony) approaches on full data sets [4]. Our approach uses a robust maximum likelihood framework, and permits the extraction of global drug resistant patterns from all the available data.

While the proportion of resistant cases in the UK seems to decrease with time, the proportion of resistant cases in treatment-naive individuals (hence acquired via TDR) is increasing. In addition, our results show that polymorphic DRMs obey to a different scheme, with an increase of both the proportion of resistant cases and TDR, and large resistance clusters. The TDR cases form resistance clusters, which are clearly identifiable on phylogenetic trees. Locating these clusters within the UK regions and cities, and among risk groups would be an important step in stopping drug resistance spread. The global trend that we observe in the UK is visible in other high-income countries (e.g. Switzerland [33], Italy [34] and Portugal [35]), but differs from, for example, West Africa, where the prevalence of multiple resistance in the population is a major concern [36]. Furthermore, detailed analyses in high-income countries indicate that a high level of ADR is more frequently observed in certain risk groups (e.g. African origin, unemployment, mental illness, among others, in Switzerland [37]) that require special surveillance to prevent treatment failure and HIV-1 transmission.

Data Availability

All data produced are available online at https://github.com/evolbioinfo/HIV1-UK

https://github.com/evolbioinfo/HIV1-UK

Author Contributions

Conceptualization, A.Z. and O.G.; methodology, A.Z. and O.G.; formal analysis, A.Z.; resources, D.D. and the UK HIV Drug Resistance Database & the Collaborative HIV, Anti-HIV Drug Resistance Network; writing–original draft preparation, A.Z. and O.G.; writing– review and editing, A.Z., O.G., and D.D.; visualization, A.Z.; supervision, O.G. All authors have read and agreed to the published version of the manuscript.

Funding

O.G. was supported by PRAIRIE (ANR-19-P3IA-0001).

Data Availability Statement

The HIV-1 sequences and the metadata (anonymized patient id and gender) used in this study, were obtained from the UK HIV Drug Resistance Database [14] in 2017. The visualizations of the ACR results produced in this study are available at github.com/evolbioinfo/HIV1-UK

Conflicts of Interest

The authors declare no conflict of interest.

Disclaimer/Publisher’s Note

The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Acknowledgments

Authors thank Dr Stéphane Hué for valuable discussions on HIV spread in the UK.

Appendix A Analysis pipelines

Snakemake [38] pipelines and ad hoc Python3 scripts used for the analyses described above are available on github.com/evolbioinfo/HIV1-UK. Along with the subtyping, tree reconstruction, dating and ACR tools mentioned above, we used goalign (v0.3.6) and gotree (v0.3.0b) [39] for basic sequence alignment and tree manipulations, as well as ETE3 framework [40] for basic tree manipulations (format conversion, pruning, etc.). Survival analysis was performed with Python3 package SurPyval (github.com/derrynknife/SurPyval). Sierra web service [20] for ARV and DRM detection was used via Python3 package sierrapy (github.com/hivdb/sierra-client).

Appendix B

Algorithm for counting NADR, NTDR, Nloss in a tree 𝕋

Figure
  • Download figure
  • Open in new tab

Appendix C Additional Tables

View this table:
  • View inline
  • View popup
  • Download powerpoint
Table A1.

Resistant sequence counts in B and C data sets (after filtering by patient and temporal outlier removal).

View this table:
  • View inline
  • View popup
  • Download powerpoint
Table A2.

DRMs with prevalence > 0.5% found in position RT:T215 in B data set, and the evolution of their presence over time.

View this table:
  • View inline
  • View popup
  • Download powerpoint
Table A3.

Numbers of data points available for loss time calculation for non-polymorphic DRMs found in B and C data sets (see Table 4).

Footnotes

  • d.dunn{at}ucl.ac.uk

Abbreviations

The following abbreviations are used in this manuscript:

ADR
acquired drug resistance
ART
antiretroviral therapy
ARV
antiretroviral
AZT
zidovudine
CI
confidence interval
DDI
didanosine
DRM
drug resistance mutation
ETR
etravirine
MAP
maximum a posteriori
NFV
nelfinavir
NNRTI
non-nucleoside reverse transcriptase inhibitor
NRTI
nucleoside reverse transcriptase inhibitor
NVP
nevirapine
np DRM
non-polymorphic drug resistance mutation
PI
protease inhibitor
p DRM
polymorphic drug resistance mutation
PR
protease
RT
reverse transcriptase
SQV
saquinavir
TDF
tenofovir
TDR
transmitted drug resistance

References

  1. 1.↵
    Larder, B.A.; Kemp, S.D. Multiple mutations in HIV-1 reverse transcriptase confer high-level resistance to zidovudine (AZT). Science 1989, 246, 1155–1158. https://doi.org/10.1126/science.2479983.
    OpenUrlAbstract/FREE Full Text
  2. 2.↵
    Lepri, A.C.; Sabin, C.A.; Staszewski, S.; Hertogs, K.; Müller, A.; Rabenau, H.; Phillips, A.N.; Miller, V. Resistance Profiles in Patients with Viral Rebound on Potent Antiretroviral Therapy. The Journal of Infectious Diseases 2000, 181, 1143–1147. https://doi.org/10.1086/315301.
    OpenUrlCrossRefPubMedWeb of Science
  3. 3.↵
    Hué, S.; Gifford, R.J.; Dunn, D.; Fernhill, E.; Pillay, D.; UK Collaborative Group on HIV Drug Resistance. Demonstration of sustained drug-resistant human immunodeficiency virus type 1 lineages circulating among treatment-na{\”\i}ve individuals. J. Virol. 2009, 83, 2645–2654.
    OpenUrlAbstract/FREE Full Text
  4. 4.↵
    Mourad, R.; Chevennet, F.; Dunn, D.T.; Fearnhill, E.; Delpech, V.; Asboe, D.; Gascuel, O.; Hue, S.; UK HIV Drug Resistance Database & the Collaborative HIV, Anti-HIV Drug Resistance Network. A phylotype-based analysis highlights the role of drug-naive HIV-positive individuals in the transmission of antiretroviral resistance in the UK. AIDS (London, England) 2015, 29, 1917–25.
    OpenUrlCrossRefPubMed
  5. 5.↵
    Castro, H.; Pillay, D.; Cane, P.; Asboe, D.; Cambiano, V.; Phillips, A.; Dunn, D.T.; for the UK Collaborative Group on HIV Drug Resistance.; Aitken, C.; Asboe, D.; et al. Persistence of HIV-1 transmitted drug resistance mutations. The Journal of infectious diseases 2013, 208, 1459–63.
    OpenUrlCrossRefPubMed
  6. 6.↵
    Organization, T.W.H. Consolidated guidelines on the use of antiretroviral drugs for treating and preventing HIV infection: recommendations for a public health approach, 2nd ed.
  7. 7.↵
    Stadler, T.; Bonhoeffer, S. Uncovering epidemiological dynamics in heterogeneous host populations using phylogenetic methods. Philosophical Transactions of the Royal Society B: Biological Sciences 2013, 368, 20120198–20120198. https://doi.org/10.1098/rstb.2012.0198.
    OpenUrlCrossRefPubMed
  8. 8.↵
    Stadler, T.; Kühnert, D.; Bonhoeffer, S.; Drummond, A.J. Birth-death skyline plot reveals temporal changes of epidemic spread in HIV and hepatitis C virus (HCV). Proceedings of the National Academy of Sciences of the United States of America 2013, 110, 228–33.
    OpenUrlAbstract/FREE Full Text
  9. 9.↵
    Kühnert, D.; Kouyos, R.; Shirreff, G.; Pečerska, J.; Scherrer, A.U.; Böni, J.; Yerly, S.; Klimkait, T.; Aubert, V.; Günthard, H.F.; et al. Quantifying the fitness cost of HIV-1 drug resistance mutations through phylodynamics. PLOS Pathogens 2018, 14, e1006895. Publisher: Public Library of Science, https://doi.org/10.1371/journal.ppat.1006895.
    OpenUrlCrossRef
  10. 10.↵
    Ratmann, O.; Grabowski, M.K.; Hall, M.; Golubchik, T.; Wymant, C.; Abeler-Dörner, L.; Bonsall, D.; Hoppe, A.; Brown, A.L.; de Oliveira, T.; et al. Inferring HIV-1 transmission networks and sources of epidemic spread in Africa with deep-sequence phylogenetic analysis. Nature Communications 2019, 10, 1411. https://doi.org/10.1038/s41467-019-09139-4.
    OpenUrl
  11. 11.↵
    Fitch, W.M. Toward Defining the Course of Evolution: Minimum Change for a Specific Tree Topology. Syst. Biol. 1971, 20, 406–416.
    OpenUrlCrossRefPubMed
  12. 12.↵
    Kühnert, D.; Stadler, T.; Vaughan, T.G.; Drummond, A.J. Phylodynamics with Migration: A Computational Framework to Quantify Population Structure from Genomic Data. Molecular biology and evolution 2016, 33, 2102–16. https://doi.org/10.1093/molbev/msw064.
    OpenUrlCrossRefPubMed
  13. 13.↵
    Ishikawa, S.A.; Zhukova, A.; Iwasaki, W.; Gascuel, O. A Fast Likelihood Method to Reconstruct and Visualize Ancestral Scenarios. Molecular Biology and Evolution 2019. https://doi.org/10.1093/molbev/msz131.
  14. 14.↵
    Dunn, D.; Pillay, D. UK HIV drug resistance database: background and recent outputs. Journal of HIV therapy 2007, 12, 97–8.
    OpenUrlPubMed
  15. 15.↵
    Kuiken, C.; Korber, B.; Shafer, R.W. HIV sequence databases. AIDS reviews 2003, 5, 52–61. Publisher: NIH Public Access.
    OpenUrlCrossRefPubMed
  16. 16.↵
    Schultz, A.K.; Zhang, M.; Leitner, T.; Kuiken, C.; Korber, B.; Morgenstern, B.; Stanke, M. A jumping profile Hidden Markov Model and applications to recombination sites in HIV and HCV genomes. BMC Bioinformatics 2006, 7, 265. https://doi.org/10.1186/1471-2105-7-265.
    OpenUrlCrossRefPubMed
  17. 17.↵
    Kozlov, A.M.; Darriba, D.; Flouri, T.; Morel, B.; Stamatakis, A. RAxML-NG: a fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference. Bioinformatics 2019. https://doi.org/10.1093/bioinformatics/btz305.
  18. 18.↵
    To, T.H.; Jung, M.; Lycett, S.; Gascuel, O. Fast Dating Using Least-Squares Criteria and Algorithms. Systematic biology 2016, 65, 82–97.
    OpenUrlCrossRefPubMed
  19. 19.↵
    Shafer, R.W.; Jung, D.R.; Betts, B.J. Human immunodeficiency virus type 1 reverse transcriptase and protease mutation search engine for queries. Nat Med 2000, 6, 1290–1292. https://doi.org/10.1038/81407.
    OpenUrlCrossRefPubMedWeb of Science
  20. 20.↵
    Liu, T.F.; Shafer, R.W. Web Resources for HIV Type 1 Genotypic-Resistance Test Interpretation. Clinical Infectious Diseases 2006, 42, 1608–1618.
    OpenUrlCrossRefPubMedWeb of Science
  21. 21.↵
    Zhukova, A.; Voznica, J.; Felipe, M.D.; To, T.H.; Pérez, L.; Martínez, Y.; Pintos, Y.; Méndez, M.; Gascuel, O.; Kouri, V. Cuban history of CRF19 recombinant subtype of HIV-1. PLOS Pathogens 2021, 17, e1009786. Publisher: Public Library of Science, https://doi.org/10.1371/JOURNAL.PPAT.1009786.
    OpenUrl
  22. 22.↵
    Palmisano, L.; Vella, S. A brief history of antiretroviral therapy of HIV infection: success and challenges. Annali dell’Istituto superiore di sanita 2011, 47, 44–8.
    OpenUrlCrossRefPubMed
  23. 23.
    Hammer, S.M.; Squires, K.E.; Hughes, M.D.; Grimes, J.M.; Demeter, L.M.; Currier, J.S.; Eron, J.J.; Feinberg, J.E.; Balfour, H.H.; Deyton, L.R.; et al. A Controlled Trial of Two Nucleoside Analogues plus Indinavir in Persons with Human Immunodeficiency Virus Infection and CD4 Cell Counts of 200 per Cubic Millimeter or Less. New England Journal of Medicine 1997, 337, 725–733.
    OpenUrlCrossRefPubMedWeb of Science
  24. 24.↵
    Gulick, R.M.; Mellors, J.W.; Havlir, D.; Eron, J.J.; Gonzalez, C.; McMahon, D.; Richman, D.D.; Valentine, F.T.; Jonas, L.; Meibohm, A.; et al. Treatment with Indinavir, Zidovudine, and Lamivudine in Adults with Human Immunodeficiency Virus Infection and Prior Antiretroviral Therapy. New England Journal of Medicine 1997, 337, 734–739.
    OpenUrlCrossRefPubMedWeb of Science
  25. 25.↵
    Cohen, M.S.; Chen, Y.Q.; McCauley, M.; Gamble, T.; Hosseinipour, M.C.; Kumarasamy, N.; Hakim, J.G.; Kumwenda, J.; Grinsztejn, B.; Pilotto, J.H.; et al. Prevention of HIV-1 Infection with Early Antiretroviral Therapy. New England Journal of Medicine 2011, 365, 493–505.
    OpenUrlCrossRefPubMedWeb of Science
  26. 26.↵
    Rodger, A.J.; Cambiano, V.; Bruun, T.; Vernazza, P.; Collins, S.; van Lunzen, J.; Corbelli, G.M.; Estrada, V.; Geretti, A.M.; Beloukas, A.; et al. Sexual Activity Without Condoms and Risk of HIV Transmission in Serodifferent Couples When the HIV-Positive Partner Is Using Suppressive Antiretroviral Therapy. Jama 2016, 316, 171–81.
    OpenUrlCrossRefPubMed
  27. 27.↵
    Blassel, L.; Tostevin, A.; Villabona-Arenas, C.J.; Peeters, M.; Hué, S.; Gascuel, O.; Database, O.b.o.t.U.H.D.R. Using machine learning and big data to explore the drug resistance landscape in HIV. PLOS Computational Biology 2021, 17, e1008873. Publisher: Public Library of Science, https://doi.org/10.1371/journal.pcbi.1008873.
    OpenUrl
  28. 28.↵
    Wertheim, J.O.; Fourment, M.; Kosakovsky Pond, S.L. Inconsistencies in Estimating the Age of HIV-1 Subtypes Due to Heterotachy. Mol Biol Evol 2012, 29, 451–456. https://doi.org/10.1093/molbev/msr266.
    OpenUrlCrossRefPubMedWeb of Science
  29. 29.↵
    Bletsa, M.; Suchard, M.A.; Ji, X.; Gryseels, S.; Vrancken, B.; Baele, G.; Worobey, M.; Lemey, P. Divergence dating using mixed effects clock modelling: An application to HIV-1. Virus Evolution 2019, 5. vez036, https://doi.org/10.1093/ve/vez036.
  30. 30.↵
    Goudsmit, J.; de Ronde, A.; de Rooij, E.; de Boer, R. Broad spectrum of in vivo fitness of human immunodeficiency virus type 1 subpopulations differing at reverse transcriptase codons 41 and 215. J Virol 1997, 71, 4479–4484. https://doi.org/10.1128/JVI.71.6.4479-4484.1997.
    OpenUrlAbstract/FREE Full Text
  31. 31.↵
    Tambuyzer, L.; Azijn, H.; Rimsky, L.T.; Vingerhoets, J.; Lecocq, P.; Kraus, G.; Picchio, G.; de Béthune, M.P. Compilation and prevalence of mutations associated with resistance to non-nucleoside reverse transcriptase inhibitors. Antivir Ther 2009, 14, 103–109.
    OpenUrlPubMedWeb of Science
  32. 32.↵
    Liu, Y.; Li, H.; Wang, X.; Han, J.; Jia, L.; Li, T.; Li, J.; Li, L. Natural presence of V179E and rising prevalence of E138G in HIV-1 reverse transcriptase in CRF55_01B viruses. Infect Genet Evol 2020, 77, 104098. https://doi.org/10.1016/j.meegid.2019.104098.
    OpenUrl
  33. 33.↵
    Scherrer, A.U.; von Wyl, V.; Yang, W.L.; Kouyos, R.D.; Böni, J.; Yerly, S.; Klimkait, T.; Aubert, V.; Cavassini, M.; Battegay, M.; et al. Emergence of Acquired HIV-1 Drug Resistance Almost Stopped in Switzerland: A 15-Year Prospective Cohort Analysis. Clinical Infectious Diseases 2016, 62, 1310–1317. https://doi.org/10.1093/cid/ciw128.
    OpenUrlCrossRefPubMed
  34. 34.↵
    Rossetti, B.; Di Giambenedetto, S.; Torti, C.; Postorino, M.C.; Punzi, G.; Saladini, F.; Gennari, W.; Borghi, V.; Monno, L.; Pignataro, A.R.; et al. Evolution of transmitted HIV-1 drug resistance and viral subtypes circulation in Italy from 2006 to 2016. HIV Med 2018, 19, 619–628. Place: England, https://doi.org/10.1111/hiv.12640.
    OpenUrlPubMed
  35. 35.↵
    Pingarilho, M.; Pimentel, V.; Diogo, I.; Fernandes, S.; Miranda, M.; Pineda-Pena, A.; Libin, P.; Theys, K.; Martins, M.R.O.; Vandamme, A.M.; et al. Increasing Prevalence of HIV-1 Transmitted Drug Resistance in Portugal: Implications for First Line Treatment Recommendations. Viruses 2020, 12. Place: Switzerland, https://doi.org/10.3390/v12111238.
  36. 36.↵
    Villabona-Arenas, C.J.; Vidal, N.; Guichet, E.; Serrano, L.; Delaporte, E.; Gascuel, O.; Peeters, M. In-depth analysis of HIV-1 drug resistance mutations in HIV-infected individuals failing first-line regimens in West and Central Africa. AIDS 2016, 30, 2577–2589. Place: England, https://doi.org/10.1097/QAD.0000000000001233.
    OpenUrlCrossRef
  37. 37.↵
    Abela, I.A.; Scherrer, A.U.; Böni, J.; Yerly, S.; Klimkait, T.; Perreau, M.; Hirsch, H.H.; Furrer, H.; Calmy, A.; Schmid, P.; et al. Emergence of Drug Resistance in the Swiss HIV Cohort Study Under Potent Antiretroviral Therapy Is Observed in Socially Disadvantaged Patients. Clin Infect Dis 2020, 70, 297–303. Place: United States, https://doi.org/10.1093/cid/ciz178.
    OpenUrl
  38. 38.↵
    Köster, J.; Rahmann, S. Snakemake-a scalable bioinformatics workflow engine. Bioinformatics 2012, 28, 2520–2522. https://doi.org/10.1093/bioinformatics/bts480.
    OpenUrlCrossRefPubMedWeb of Science
  39. 39.↵
    Lemoine, F.; Gascuel, O. Gotree/Goalign: toolkit and Go API to facilitate the development of phylogenetic workflows. NAR Genomics and Bioinformatics 2021, 3. https://doi.org/10.1093/nargab/lqab075.
  40. 40.↵
    Huerta-Cepas, J.; Serra, F.; Bork, P. ETE 3: Reconstruction, Analysis, and Visualization of Phylogenomic Data. Molecular Biology and Evolution 2016, 33, 1635–1638. https://doi.org/10.1093/molbev/msw046.
    OpenUrlCrossRefPubMed
Back to top
PreviousNext
Posted April 25, 2023.
Download PDF

Supplementary Material

Data/Code
Email

Thank you for your interest in spreading the word about medRxiv.

NOTE: Your email address is requested solely to identify you as the sender of this article.

Enter multiple addresses on separate lines or separate them with commas.
Modelling drug resistance emergence and transmission in HIV-1 in the UK
(Your Name) has forwarded a page to you from medRxiv
(Your Name) thought you would like to see this page from the medRxiv website.
CAPTCHA
This question is for testing whether or not you are a human visitor and to prevent automated spam submissions.
Share
Modelling drug resistance emergence and transmission in HIV-1 in the UK
Anna Zhukova, David Dunn, Olivier Gascuel, Anti-HIV Drug Resistance Network
medRxiv 2023.04.24.23288580; doi: https://doi.org/10.1101/2023.04.24.23288580
Twitter logo Facebook logo LinkedIn logo Mendeley logo
Citation Tools
Modelling drug resistance emergence and transmission in HIV-1 in the UK
Anna Zhukova, David Dunn, Olivier Gascuel, Anti-HIV Drug Resistance Network
medRxiv 2023.04.24.23288580; doi: https://doi.org/10.1101/2023.04.24.23288580

Citation Manager Formats

  • BibTeX
  • Bookends
  • EasyBib
  • EndNote (tagged)
  • EndNote 8 (xml)
  • Medlars
  • Mendeley
  • Papers
  • RefWorks Tagged
  • Ref Manager
  • RIS
  • Zotero
  • Tweet Widget
  • Facebook Like
  • Google Plus One

Subject Area

  • HIV/AIDS
Subject Areas
All Articles
  • Addiction Medicine (349)
  • Allergy and Immunology (668)
  • Allergy and Immunology (668)
  • Anesthesia (181)
  • Cardiovascular Medicine (2648)
  • Dentistry and Oral Medicine (316)
  • Dermatology (223)
  • Emergency Medicine (399)
  • Endocrinology (including Diabetes Mellitus and Metabolic Disease) (942)
  • Epidemiology (12228)
  • Forensic Medicine (10)
  • Gastroenterology (759)
  • Genetic and Genomic Medicine (4103)
  • Geriatric Medicine (387)
  • Health Economics (680)
  • Health Informatics (2657)
  • Health Policy (1005)
  • Health Systems and Quality Improvement (985)
  • Hematology (363)
  • HIV/AIDS (851)
  • Infectious Diseases (except HIV/AIDS) (13695)
  • Intensive Care and Critical Care Medicine (797)
  • Medical Education (399)
  • Medical Ethics (109)
  • Nephrology (436)
  • Neurology (3882)
  • Nursing (209)
  • Nutrition (577)
  • Obstetrics and Gynecology (739)
  • Occupational and Environmental Health (695)
  • Oncology (2030)
  • Ophthalmology (585)
  • Orthopedics (240)
  • Otolaryngology (306)
  • Pain Medicine (250)
  • Palliative Medicine (75)
  • Pathology (473)
  • Pediatrics (1115)
  • Pharmacology and Therapeutics (466)
  • Primary Care Research (452)
  • Psychiatry and Clinical Psychology (3432)
  • Public and Global Health (6527)
  • Radiology and Imaging (1403)
  • Rehabilitation Medicine and Physical Therapy (814)
  • Respiratory Medicine (871)
  • Rheumatology (409)
  • Sexual and Reproductive Health (410)
  • Sports Medicine (342)
  • Surgery (448)
  • Toxicology (53)
  • Transplantation (185)
  • Urology (165)