How many relevant SARS-CoV-2 variants might we expect in the future?
====================================================================

* Roberto Littera
* Maurizio Melis

## Abstract

**Objectives** The emergence of new SARS-CoV-2 variants is a major challenge in the management of the Covid-19 pandemic. A crucial issue is to quantify the number of variants that may pose a risk to public health in the future.

**Methods** We fitted the data on the most relevant SARS-CoV-2 variants recorded by the World Health Organization (WHO). The function used for the fit depends on the total number of infected subjects in the world since the start of the epidemic.

**Results** We found that the number of relevant SARS-CoV-2 variants up to November 2021 was about 44. Moreover, the number of new relevant variants per ten million cases was 1.64 in November 2021, slightly lower than the value of 2.29 in March 2020.

**Conclusions** Our simple mathematical model can estimate the number of relevant SARS-CoV-2 variants as the cumulative number of cases increases worldwide and may be a useful tool in planning strategies to counter the pandemic effectively.

Keywords

* Covid-19
* SARS-CoV-2 variants
* infectious diseases
* epidemiology

## Introduction

Most mutations in the genome of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) are neutral or only mildly deleterious. However, a small proportion of mutations can increase infectivity and promote virus-host interactions that are critical to the establishment of persistent and more severe infection [1, 2]. For example, mutations in the spike protein, which mediates attachment of the virus to host cell-surface receptors [3], can have significant effects on virus behaviour. In order to control the pandemic effectively, it is imperative to investigate the emergence and spread of variants with an impact on disease transmission and human health [4].

SARS-CoV-2 sequences are shared daily on public databases such as the Global Initiative on Sharing All Influenza Data (GISAID) [5] or the European Centre for Disease Prevention and Control (ECDC) [6], which significantly contribute to surveillance of the pandemic. The World Health Organization has established that SARS-CoV-2 variants representing a possible risk to public health can be divided into three distinct categories [1]: variants under monitoring (VUMs), variants of interest (VOIs) and variants of concern (VOCs).

* *Variants Under Monitoring* (VUMs) are associated with genetic mutations which alter virus characteristics, although evidence of phenotypic or epidemiological impact is still unclear.
* *Variants of Interest* (VOIs) are associated with: *i)* genetic mutations which affect transmissibility, disease course, diagnostic or therapeutic escape; *ii)* relevant community transmission with an emerging risk to global public health.
* *Variants of Concern* (VOCs) are associated with one or more of the following characteristics: *i)* increase in transmissibility; *ii)* increase in virulence or change in disease severity; *iii)* decrease in effectiveness of social measures, diagnostics, vaccines and therapeutics.

Given the continuous evolution of the SARS-CoV-2 virus, variants may be reclassified over time.

In the present study, we fitted the WHO data [1, 2] by exploiting a function which exclusively depends on the number of infected cases worldwide. Our fit allows for a fairly good estimate of the number of relevant variants that can be expected to appear for a given number of infected subjects throughout the world. Furthermore, our approach can predict the number of new relevant variants per ten million cases in any epidemiological situation. The number of new relevant variants per ten million cases decreases very slowly as the cumulative number of Covid-19 cases increases.
Therefore, it becomes crucial to carefully monitor and reduce virus circulation in order to avoid the emergence of new variants which may not be suitably covered by the vaccines and drugs currently available [7].

Despite the huge efforts put forth by healthcare services in many countries, vaccination campaigns have not achieved a population coverage that is high enough to prevent the spread of SARS-CoV-2. The WHO estimate [8] is that in Europe and Central Asia alone the current fourth wave of the Covid-19 pandemic is likely to cause more than half a million deaths.

Although it is obvious that the number of relevant variants increases with the number of cases throughout the world, it is extremely difficult to determine the precise relationship between these two variables. Our model is a simple attempt to make a fairly reliable estimate of the risk of new variants that can impact public health as the virus continues to spread.

## Methods

By means of Wolfram Mathematica [9] we fitted the data on SARS-CoV-2 variants in order to evaluate the cumulative number *υ* of relevant SARS-CoV-2 variants versus the cumulative number *N* of infected subjects worldwide. The function *υ* fitting the WHO data must satisfy the following conditions:

1. The function *υ* varies from zero to infinity. If there is no infection, the number of variants is zero; conversely, if the virus replicates infinitely many times, the cumulative number of variants is also infinite.
2. *υ* increases monotonically, therefore the first derivative *υ*′(*N*) is positive. The cumulative number of variants increases with the number of infections, i.e. as the virus replications increase.
3. The first derivative of *υ* decreases monotonically, therefore the second derivative *υ*′′(*N*) is negative. As the cumulative number of variants increases with the total cases in the world, the emergence of new virus mutations turns out to be slightly less frequent.

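Conditions 2 and 3 above can be checked numerically for any candidate function. The following is a minimal sketch, not the authors' Mathematica code: it assumes central finite differences, uses the form *N*/log *N* adopted in this study as the candidate (with the constant set to 1, since a positive constant does not change the signs), and tests it on a hypothetical grid of case counts. All function names are illustrative.

```python
import math


def candidate(n_cases: float, k: float = 1.0) -> float:
    """Candidate function k * N / log N (natural logarithm)."""
    return k * n_cases / math.log(n_cases)


def first_derivative(f, x: float) -> float:
    """Central finite-difference estimate of f'(x)."""
    h = 1e-3 * x
    return (f(x + h) - f(x - h)) / (2 * h)


def second_derivative(f, x: float) -> float:
    """Central finite-difference estimate of f''(x)."""
    h = 1e-2 * x
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)


def satisfies_conditions(f, sample_points) -> bool:
    """Check growth, f' > 0 and f'' < 0 on the given sample points."""
    values = [f(x) for x in sample_points]
    grows = all(a < b for a, b in zip(values, values[1:]))
    increasing = all(first_derivative(f, x) > 0 for x in sample_points)
    concave = all(second_derivative(f, x) < 0 for x in sample_points)
    return grows and increasing and concave


# Hypothetical sample: cumulative case counts from 10^3 to 10^10.
SAMPLE = [10.0 ** p for p in range(3, 11)]
print(satisfies_conditions(candidate, SAMPLE))
```

The check only samples a finite grid, so it illustrates the conditions rather than proving them. For very small *N* the signs differ (the slope is negative below *N* = *e* and the curvature is positive below *N* = *e*²), which is irrelevant at pandemic-scale case counts.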
The fit of WHO data was obtained by means of the following function:

![Formula][1]

where *k* is the constant of the numerical fit and “log *N*” represents the natural logarithm of *N*. This function satisfies all the previous conditions, as shown below:

1. ![Graphic][2] and ![Graphic][3].
2. ![Graphic][4], where *e* ≅ 2.72 is Euler’s number.
3. ![Graphic][5], with *e*² ≅ 7.39.

The number *n* of new relevant variants per ten million cases (Δ*N* = 10⁷) turns out to be:

![Formula][6]

The relative variation |*n*′(*N*)| of new relevant variants per ten million cases decreases as the number *N* of cases increases: ![Graphic][7]

Instead of imposing the three conditions listed previously, the function *υ* used in the fit can also be justified through a more heuristic approach discussed in Appendix A.

## Results

We fitted the data recorded by WHO [1, 2] up to November 2021 by means of a specific code written with Wolfram Mathematica 12.1.3 [9]. Table 1 lists the characteristics of SARS-CoV-2 variants reported by WHO [1, 2]: date and country of the earliest detection, PANGO (Phylogenetic Assignment of Named Global Outbreak) and WHO classification, current relevance (variants of concern, variants of interest or under monitoring), total number of cases in the world at the end of the month of detection and cumulative number of variants. PANGO is a rule-based nomenclature system for naming and tracking SARS-CoV-2 genetic lineages [10].

The numerical fit of the WHO data was obtained by means of the function *υ*(*N*) = *k* · *N*/log *N*, where the constant of the numerical fit is *k* = 3.35 · 10⁻⁶. The 95% confidence interval (CI) of *k* is (2.91 – 3.79) · 10⁻⁶.

[Table 1.](http://medrxiv.org/content/early/2021/11/21/2021.11.17.21266463/T1)
Characteristics of SARS-CoV-2 variants recorded by WHO [1, 2]: date and country of the earliest detection, PANGO and WHO nomenclature, current relevance (concern, interest or under monitoring) and cumulative number of cases in the world at the end of the month of detection. The last column summarises the cumulative number of observed relevant variants.

The adjusted *R*-squared, measuring the goodness of the fit, turned out to be *R*² = 0.97. Further technical details on the fit of WHO data are presented in Appendix B.

Figure 1 represents the cumulative number *υ* of relevant SARS-CoV-2 variants versus the cumulative number *N* of cases in the world. The dots from 1 to 10 correspond to the data reported by WHO [1, 2] from March 2020 to May 2021; the solid line represents the function used in the fit: *υ* = *k* · *N*/log *N*.

[Figure 1.](http://medrxiv.org/content/early/2021/11/21/2021.11.17.21266463/F1) Cumulative number of relevant SARS-CoV-2 variants versus the cumulative number of cases in the world. The dots from 1 to 10 indicate the data reported by WHO [1, 2] from March 2020 to May 2021; the solid line represents the numerical fit *υ* = *k* · *N*/log *N* obtained with Wolfram Mathematica.

The analytical formula for *υ* used in the fit makes it possible to predict both the cumulative number of relevant variants and the number of new relevant variants for a given number of cases in the world. The total number of cases in the world up to the 14th of November 2021 was 252,826,597 [2]. The corresponding predicted cumulative number of relevant SARS-CoV-2 variants was 43.7, i.e. almost 19 more than in the last WHO report, dated May 2021, which listed 25 relevant variants [1].

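The prediction above, and the rate *n* introduced in the Methods, can be reproduced with a few lines of code. This is a minimal sketch rather than the authors' Mathematica code. It uses the fitted form *υ* = *k* · *N*/log *N* with the rounded constant *k* = 3.35 · 10⁻⁶, and it assumes that *n* equals the derivative *υ*′(*N*) multiplied by Δ*N* = 10⁷, a reconstruction made because the original expression is only available as an embedded image. Function and constant names are illustrative.

```python
import math

K_FIT = 3.35e-6               # rounded fit constant reported in the Results
CASES_NOV_2021 = 252_826_597  # cumulative cases on 14 Nov 2021 [2]
TEN_MILLION = 1e7             # step in cases used to define n


def cumulative_variants(n_cases: float, k: float = K_FIT) -> float:
    """Cumulative number of relevant variants, k * N / log N."""
    return k * n_cases / math.log(n_cases)


def new_variants_per_ten_million(n_cases: float, k: float = K_FIT) -> float:
    """Assumed rate n = v'(N) * 10^7, with v'(N) = k (log N - 1) / log^2 N."""
    log_n = math.log(n_cases)
    return TEN_MILLION * k * (log_n - 1.0) / log_n ** 2


variants = cumulative_variants(CASES_NOV_2021)
rate = new_variants_per_ten_million(CASES_NOV_2021)
print(f"cumulative variants: {variants:.1f}")
print(f"new variants per ten million cases: {rate:.2f}")
```

With the rounded constant the sketch gives roughly 43.8 cumulative variants, consistent with the reported 43.7 given the rounding of *k*, and *n* ≈ 1.64, matching the value quoted for November 2021.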
As discussed in the Methods section, the number *n* of new relevant variants per ten million (10⁷) cases is ![Graphic][8], which becomes ![Graphic][9] by substituting the numerical value of *k*. From March 2020 to November 2021 the number of new variants per ten million cases decreased by only 28.4%, from 2.29 to 1.64.

Figure 2 reports the number *n* of new relevant SARS-CoV-2 variants per ten million cases versus the cumulative number *N* of cases in the world. The analytical behaviour represented by the plot is given by the formula ![Graphic][10].

[Figure 2.](http://medrxiv.org/content/early/2021/11/21/2021.11.17.21266463/F2) Number *n* of new relevant SARS-CoV-2 variants per ten million cases versus the cumulative number of cases in the world. From March 2020 to November 2021 *n* decreased from 2.29 to 1.64.

The relative variation |*n*′(*N*)| of the number *n* of new relevant variants per ten million cases is given by ![Graphic][11], with |*n*′(*N*)| ≪ 1 for *N* ≫ 1. This analytical result implies that the number *n* of new relevant variants per ten million cases does not decrease significantly as the virus continues to circulate. For instance, *n* will remain above 1.40 until the cumulative number of cases in the world reaches about 8.5 billion.

Figure 3 represents the predicted increase of the cumulative number of relevant variants for each step of ten million infections, from 170 to 300 million cumulative cases in the world. In this range of cases, the number of relevant variants is predicted to increase from 30 to about 51.

[Figure 3.](http://medrxiv.org/content/early/2021/11/21/2021.11.17.21266463/F3)
Prediction of the cumulative number of SARS-CoV-2 variants from 170 to 300 million cases in the world. The dotted line represents the function *υ* = *k* · *N*/log *N*, while each step shown on the plot corresponds to the predicted number of new relevant variants per ten million cases.

As shown in Figure 3, the total infected cases were about 170 and 250 million in May 2021 and November 2021, respectively.

Our model only focuses on the relationship between the number of virus replications and the emergence of relevant variants. All other factors involved in the diffusion of new variants were not taken into account; for this reason, we could assume that the parameter *k* in the fit *υ* = *k* · *N*/log *N* was constant, although it actually varies with the factors affecting the emergence of relevant variants.

## Discussion

Since the start of the Covid-19 pandemic there has been an impressive global effort in investigating every aspect of the coronavirus epidemic [11], including immunogenetic [12, 13] and epidemiological [14] issues. In this study, we built a simple mathematical model to calculate the number of relevant SARS-CoV-2 variants from the number of infected cases in the world.

By fitting the WHO data listed in Table 1, we obtained an analytical formula which makes it possible to predict both the cumulative number of variants and the number of new relevant variants in a given epidemiological situation. For example, up to the 14th of November 2021, the cumulative number of cases worldwide was 252,826,597 [2], corresponding to 43.7 relevant variants, i.e. almost 19 more than in the last WHO report [1], dated May 2021. Analogously, we found that the number of new relevant variants per ten million cases was 1.64 on the 14th of November 2021, a decrease of only 28.4% compared with March 2020, when it was 2.29.

Our method depends critically on the efficiency of the WHO in tracking the most relevant SARS-CoV-2 variants.
A different approach would be to consider the total number of variants detected by genomic sequencing of SARS-CoV-2 and recorded in public databases such as GISAID [5]. This choice would be independent of the WHO selection of the most relevant variants but would be less interesting from a clinical viewpoint, since only the variants affecting virus transmission or disease severity are important for the control and management of the pandemic.

As shown in Figure 2, the number of new relevant variants per ten million cases decreases very slowly as the cumulative number of cases increases. Therefore, the persistence of virus circulation will always cause the emergence of new relevant SARS-CoV-2 variants.

Our model does not take into account the fact that the number of virus replications is different in each infected subject. However, the average number of replications can be assumed to be constant over a large number of cases, such as those recorded worldwide.

The diffusion of new relevant variants depends on a large variety of factors, for instance the ability to monitor all virus mutations, the effectiveness of containment measures, vaccination campaigns, the evolution of the viral genome and the evolution patterns of the virus [15]. All these factors were ignored in our model, which only focuses on the number of virus replications. The parameter *k* appearing in the fit *υ* = *k* · *N*/log *N* is not actually constant, as assumed in our model, but varies with all the other factors which can affect the emergence of relevant variants. Nonetheless, our model provides a fairly good estimate of what we can expect in the future.

SARS-CoV-2 vaccination has led to a decrease in hospitalisations and disease severity. Nevertheless, the current number of infections is still too high to prevent the appearance of new variants potentially dangerous to public health.
The number of variants increases with the number of cases *in the world*: this result underlines the urgency of making every effort to reduce the impact of virus replication in all geographical areas. The risk that new relevant variants may emerge anywhere in the world indicates that the winning strategy is not to leave any country behind in the battle against the virus.

The ability to predict the number of new relevant SARS-CoV-2 variants will become increasingly important in the future to ensure optimal planning of vaccination campaigns by healthcare services, united in the awareness that new variants can change the characteristics of the virus and greatly influence the global management of the pandemic.

## Supporting information

Appendices [supplements/266463_file02.pdf]

## Data Availability

All data referred to in the manuscript are available online at [https://www.who.int/en/activities/tracking-SARS-CoV-2-variants](https://www.who.int/en/activities/tracking-SARS-CoV-2-variants)

## Authors’ contributions

The authors contributed equally to the article.

## Conflicts of Interest

The authors declare that no competing interests exist.

## Funding

The authors received no specific funding for this work.

## Ethical approval

Not applicable.

## Acknowledgments

The authors are grateful to Anna Maria Koopmans for translations, professional writing assistance and preparation of the manuscript.

## Footnotes

* *E-mail addresses:* roby.litter{at}gmail.com (R. Littera), maurizio.melis{at}gmail.com (M. Melis).
* Received November 17, 2021.
* Revision received November 17, 2021.
* Accepted November 21, 2021.
* © 2021, Posted by Cold Spring Harbor Laboratory. This pre-print is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), CC BY-NC 4.0, as described at [http://creativecommons.org/licenses/by-nc/4.0/](http://creativecommons.org/licenses/by-nc/4.0/)

## References

1. [1].WHO (World Health Organization).
Tracking SARS-CoV-2 variants. Working Definitions and Actions Taken. [https://www.who.int/en/activities/tracking-SARS-CoV-2-variants/](https://www.who.int/en/activities/tracking-SARS-CoV-2-variants/). Accessed 14 Nov 2021.
2. [2].WHO (World Health Organization). Covid-19 Weekly Epidemiological and Operational Update. [https://www.who.int/emergencies/diseases/novel-coronavirus-2019/situation-reports/](https://www.who.int/emergencies/diseases/novel-coronavirus-2019/situation-reports/). Accessed 14 Nov 2021.
3. [3].Harvey WT, Carabelli AM, Jackson B, Gupta RK, Thomson EC, Harrison EM, et al. SARS-CoV-2 variants, spike mutations and immune escape. Nat Rev Microbiol 2021; 19: 409–24. [https://doi.org/10.1038/s41579-021-00573-0](https://doi.org/10.1038/s41579-021-00573-0).
4. [4].Otto SP, Day T, Arino J, et al. The origins and potential future of SARS-CoV-2 variants of concern in the evolving COVID-19 pandemic. Current Biology 2021; 31: R918–R929. [https://doi.org/10.1016/j.cub.2021.06.049](https://doi.org/10.1016/j.cub.2021.06.049).
5. [5].GISAID (Global Initiative on Sharing All Influenza Data). hCoV-19 tracking of variants. [https://www.gisaid.org/](https://www.gisaid.org/).
6. [6].ECDC (European Centre for Disease Prevention and Control). SARS-CoV-2 variants of concern. [https://www.ecdc.europa.eu/en/covid-19/variants-concern](https://www.ecdc.europa.eu/en/covid-19/variants-concern).
7. [7].Krause PR, Fleming TR, Longini IM, Peto R, Briand S, Heymann DL, et al. SARS-CoV-2 Variants and Vaccines. N Engl J Med 2021; 385: 179–86. [https://doi.org/10.1056/NEJMsr2105280](https://doi.org/10.1056/NEJMsr2105280).
8. [8].WHO (World Health Organization) Regional Office for Europe. Update on COVID-19: Europe and Central Asia again at the epicentre of the pandemic. [https://www.euro.who.int/en/media-centre/sections/statements/2021/statement-update-on-covid-19-europe-and-central-asia-again-at-the-epicentre-of-the-pandemic](https://www.euro.who.int/en/media-centre/sections/statements/2021/statement-update-on-covid-19-europe-and-central-asia-again-at-the-epicentre-of-the-pandemic). Accessed 4 Nov 2021.
9. [9].Wolfram Research, Inc. Mathematica 12.1.3 (Trial Version). Champaign, Illinois, US. Released in July 2021.
10. [10].Rambaut A, Holmes EC, O’Toole Á, Hill V, McCrone JT, Ruis C, et al. A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology. Nat Microbiol 2020; 5: 1403–7. [https://doi.org/10.1038/s41564-020-0770-5](https://doi.org/10.1038/s41564-020-0770-5).
11. [11].Lavezzo E, Franchin E, Ciavarella C, Cuomo-Dannenburg G, Barzon L, Del Vecchio C, et al. Suppression of a SARS-CoV-2 outbreak in the Italian municipality of Vo’. Nature 2020; 584: 425–9. [https://doi.org/10.1038/s41586-020-2488-1](https://doi.org/10.1038/s41586-020-2488-1).
12. [12].Littera R, Campagna M, Deidda S, Angioni G, Cipri S, Melis M, et al. Human Leukocyte Antigen Complex and Other Immunogenetic and Clinical Factors Influence Susceptibility or Protection to SARS-CoV-2 Infection and Severity of the Disease Course. The Sardinian Experience. Front Immunol 2020; 11: 605688. [https://doi.org/10.3389/fimmu.2020.605688](https://doi.org/10.3389/fimmu.2020.605688).
13. [13].Littera R, Chessa L, Deidda S, Angioni G, Campagna M, Lai S, et al. Natural killer-cell immunoglobulin-like receptors trigger differences in immune response to SARS-CoV-2 infection. PLoS ONE 2021; 16(8): e0255608. [https://doi.org/10.1371/journal.pone.0255608](https://doi.org/10.1371/journal.pone.0255608).
14. [14].Melis M, Littera R. Undetected infectives in the Covid-19 pandemic. Int J Infect Dis 2021; 104: 262–8. [https://doi.org/10.1016/j.ijid.2021.01.010](https://doi.org/10.1016/j.ijid.2021.01.010).
15. [15].Giovanetti M, Benedetti F, Campisi G, et al. Evolution patterns of SARS-CoV-2: Snapshot on its genome variants. Biochem Biophys Res Commun 2021; 538: 88–91. [https://doi.org/10.1016/j.bbrc.2020.10.102](https://doi.org/10.1016/j.bbrc.2020.10.102).

[1]: /embed/graphic-1.gif
[2]: /embed/inline-graphic-1.gif
[3]: /embed/inline-graphic-2.gif
[4]: /embed/inline-graphic-3.gif
[5]: /embed/inline-graphic-4.gif
[6]: /embed/graphic-2.gif
[7]: /embed/inline-graphic-5.gif
[8]: /embed/inline-graphic-6.gif
[9]: /embed/inline-graphic-7.gif
[10]: /embed/inline-graphic-8.gif
[11]: /embed/inline-graphic-9.gif