medRxiv

A Monte Carlo Estimation of the Narrow-Sense Heritability of COVID-19 Infection and Severity from AncestryDNA Survey Data

Amelia J. Averitt, Deepika Sharma, Michael Cantor
doi: https://doi.org/10.1101/2022.05.23.22275364
Amelia J. Averitt
1Regeneron Genetics Center, Tarrytown, NY
2Department of Biomedical Informatics, Columbia University, New York, NY
MPH MA PhD
  • For correspondence: amelia.averitt{at}regeneron.com
Deepika Sharma
1Regeneron Genetics Center, Tarrytown, NY
MHI
Michael Cantor
1Regeneron Genetics Center, Tarrytown, NY
MD MA

Abstract

Respiratory infectious diseases, such as COVID-19, demonstrate a host genetic component that contributes to interindividual differences of susceptibility and infection. At present, the relative effect of environmental and genetic factors of COVID-19 is unknown. This research presents a Monte Carlo (MC) estimation of the genetic narrow-sense heritability of COVID-19 infection and severity from AncestryDNA survey data. The results suggest a moderate genetic contribution to COVID-19 infection and a low genetic contribution for COVID-19 severity.

Introduction

Many regard the emergence of COVID-19, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), to be the greatest public health crisis of the past century. Effective management of infectious disease outbreaks, such as COVID-19, requires data-driven public health management. Often, public health management of infectious diseases is pathogen centered. This approach assumes that the pathogen itself dictates the course of the disease. However, research has shown that genetics of the host may also contribute to infectious disease burden (Klebanov 2018). Respiratory infectious diseases, like COVID-19, also demonstrate a host genetic component that contributes to inter-individual differences (Patarčić et al. 2015). This contribution of the host’s genetics, in tandem with environmental factors, determines disease susceptibility and infection. Understanding the relative effect of environmental and genetic factors of respiratory infectious diseases may help guide public health intervention.

Genetic heritability summarizes the variation in observable characteristics – or phenotype – that is attributable to a variation in genetics rather than the environment or random effects. High genetic heritability indicates a strong phenotypic similarity between parents and offspring that is due to genetics, while low heritability indicates a low phenotypic similarity. A subset of genetic heritability is narrow-sense heritability (h2). h2 is the average proportion of total phenotypic variance that is due to additive genetic factors that are passed from parents to offspring. When discussing respiratory infectious diseases, knowledge of h2 can help inform public health interventions. In situations of uncertainty, as is the case with COVID-19, h2 estimates of disease susceptibility and severity may provide evidence of how the disease will, on average, affect families. This may enable clinicians to appropriately intervene on or monitor family groups and aid in allocation of medical resources.
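For reference, the definition above can be written out in standard quantitative-genetics notation. The symbols V_A (additive genetic variance), V_P (total phenotypic variance), O (offspring phenotype) and MP (mid-parent phenotype) are assumptions introduced here, not notation taken from this paper. Under the usual textbook assumptions (random mating, equal phenotypic variance in both parents, no shared-environment covariance), the slope of an offspring-on-mid-parent regression equals h2:

```latex
h^2 = \frac{V_A}{V_P},
\qquad
b_{O,\mathrm{MP}}
  = \frac{\operatorname{Cov}(O,\mathrm{MP})}{\operatorname{Var}(\mathrm{MP})}
  = \frac{\tfrac{1}{2}V_A}{\tfrac{1}{2}V_P}
  = h^2
```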

Typically, heritability estimates, such as h2, are made via twin studies. In such studies, phenotypes of monozygotic (identical) twins are compared with phenotypes of dizygotic (fraternal) twins (Visscher, Hill, and Wray 2008). Though twin studies are robust to genetic and environmental confounders, they suffer from many limitations. They are inefficient and can overestimate heritability, and the study population may not be representative of the larger population, which could limit generalizability. Heritability estimates have also been made from electronic health records, but these often require lengthy patient recruitment, familial ascertainment, and phenotype identification (Polubriaginof et al. 2018).

Methods

The Method

This research presents a Monte Carlo (MC) estimation of the narrow-sense heritability of COVID-19 infection and severity from AncestryDNA survey data. These data come from a private collaboration between AncestryDNA and Regeneron Pharmaceuticals and were collected online from volunteer respondents. The survey was given to AncestryDNA customers (N ≈ 210k) and aimed to assess each respondent’s COVID-19 infection status, exposure, risk factors, and symptoms. The data include whether, and which, biological family members were infected with or hospitalized for COVID-19, which can be used for h2 estimation. Unlike traditional data for this task, the AncestryDNA survey data are drawn from a large and heterogeneous population, familial relationships are easily identified, and COVID-19 phenotypes are defined. Clinical data are preferred when making h2 estimates, but research demonstrates that self-reported data, such as survey data, yield concordant heritability estimates given a sufficiently large sample (Macgregor et al. 2006).

When a parent is infected with or hospitalized for COVID-19, the survey responses do not specify whether one or two parents were affected. This research uses an MC method to stochastically estimate the parents’ phenotypic variance using parameters that are grounded in real-world epidemiologic evidence. MC methods are a class of algorithms that rely on repeated sampling to make estimates that cannot be made directly. Over many sampling iterations, estimates derived from an MC procedure will theoretically converge to the true value. Our MC method is summarized in Algorithm 1. For this algorithm, if a respondent (proband) reports that one or more parents were infected or hospitalized, the count of afflicted parents is initialized to 1. Stochastic parameters for the simulation include the rate of two-parent households (η) (United States Census Bureau 2019); the rate of household co-infection (ρ) (Li et al. 2020; Grijalva et al. 2020; Fung et al. 2021); and the rate of hospitalization among COVID-infected individuals (ψ) (Garg et al. 2020). h2 is then estimated from the slope of a linear regression that models the simulated, mid-phenotypic value of the parents by the phenotypic value of the proband.
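The procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' Algorithm 1: the function name `mc_h2`, its arguments, the use of n_affected / 2 as the mid-parent value, the choice to regress the proband on the mid-parent value, and the percentile interval are all hypothetical choices made here. Only the infection phenotype is covered; the hospitalization rate ψ is omitted.

```python
import numpy as np

def mc_h2(proband, parent_flag, eta, rho, n_iter=1000, seed=0):
    """Hypothetical sketch of an MC estimate of a parent-offspring slope.

    proband     : 0/1 phenotype of each respondent
    parent_flag : 1 if the respondent reported affected "Parent(s)", else 0
    eta         : assumed probability of a two-parent household
    rho         : assumed probability that a second parent is co-infected
    """
    rng = np.random.default_rng(seed)
    proband = np.asarray(proband, dtype=float)
    parent_flag = np.asarray(parent_flag, dtype=bool)
    slopes = np.empty(n_iter)
    for i in range(n_iter):
        # Draw whether each household has two parents.
        two_parents = rng.random(proband.size) < eta
        # In two-parent households, draw whether the second parent is affected.
        second_affected = two_parents & (rng.random(proband.size) < rho)
        # Flagged probands start at 1 affected parent; unflagged stay at 0.
        n_affected = np.where(parent_flag, 1 + second_affected, 0)
        # Assumption: mid-parent value is the mean over two biological parents.
        midparent = n_affected / 2.0
        # Least-squares slope of proband phenotype on mid-parent value.
        cov = np.cov(midparent, proband, ddof=1)[0, 1]
        slopes[i] = cov / np.var(midparent, ddof=1)
    # Mean slope and a 95% percentile interval across iterations.
    return slopes.mean(), np.percentile(slopes, [2.5, 97.5])
```

The sketch needs at least one flagged and one unflagged proband, otherwise the mid-parent variance is zero and the slope is undefined. Covariate adjustment is not shown.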

Algorithm 1

Pseudocode for the Monte Carlo (MC) estimation, where N is the number of iterations, J is the number of observations in the data, and xj is the number of parents in the household for proband j.


Experimentation

This research presents (i) an application of the proposed method to simulated data, and (ii) an application to AncestryDNA survey data to estimate h2 of COVID-19 infection and severity.

Simulation

To evaluate the proposed method, we applied our estimation procedure to data in which the ground-truth h2 is known. Proband and parent phenotypes were simulated such that the ‘true’ h2 was hard-coded into the data. If 1 or 2 parents were affected, the data was then masked to only indicate if 1 or more was affected, which aligns with the uncertainty in the AncestryDNA data. The MC algorithm was applied to this masked, simulated data and an h2 estimate was made. This experimental set-up was repeated 1000 times using varying ‘true’ h2 values of 0.1, 0.3 and 0.5.
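The masking step in this set-up can be illustrated with a short Python sketch. The function name `mask_parent_counts` and the 0/1/2 count encoding are assumptions made for illustration; the paper does not describe its implementation.

```python
import numpy as np

def mask_parent_counts(n_affected):
    """Collapse a true count of affected parents (0, 1 or 2) into the
    'one or more parents affected' indicator available in the survey."""
    n_affected = np.asarray(n_affected)
    return (n_affected >= 1).astype(int)
```

For example, true counts of 0, 1, 2 and 2 affected parents would be masked to 0, 1, 1 and 1, so the estimator no longer sees whether one or both parents were affected.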

Results

When applied to simulated data, in which the ground-truth h2 is known, the method recovers h2 with high accuracy (Table 1). Provided that the simulation parameters are correct, this indicates that the proposed method can recover h2 with minimal bias.

Table 1: Mean MC estimates and 95% confidence intervals of h2 when applied to masked, simulated data with target h2 values of 0.1, 0.3, and 0.5.

Application to AncestryDNA Survey Data

To estimate h2 from the AncestryDNA data, proband phenotypes for COVID-19 infection and severity were binarized. Cases of proband COVID-19 infection (2.24%) were identified as respondents who reported any of the following: (i) a swab test for COVID-19 that was positive; (ii) no swab test for COVID-19, but flu-like symptoms after February 2020; or (iii) an antibody test for COVID-19 that was positive. From the population of infection cases, severe COVID-19 (0.25%) probands were defined as respondents who reported any of the following: (i) hospitalization due to COVID-19; (ii) hospitalization in the ICU with oxygen; or (iii) hospitalization in the ICU with a ventilator. Cases were assigned a phenotype of 1. In the absence of case requirements for either phenotype, the proband phenotype was 0. Parents of a proband who did not report that their “Parent(s)” were infected or hospitalized due to COVID-19 were assigned a phenotype of 0 for both outcomes. If a proband reported that “Parent(s)” were infected or hospitalized due to COVID-19, this data was passed into the MC simulation to stochastically model the parent phenotype, that is, whether 1 or 2 parents were infected (pi) and hospitalized (ph). Models for both phenotypes of interest were adjusted for the age, gender, and ever-smoking status of the proband. Given the high variability in reported household co-infection, ρ, the MC simulation was repeated using three reported statistics to inform this parameter (Li et al. 2020; Grijalva et al. 2020; Fung et al. 2021).
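The case definitions above can be expressed as a small Python sketch. The `Response` class and all of its field names are hypothetical and do not reflect the actual survey schema; the logic simply restates the criteria listed in the text.

```python
from dataclasses import dataclass

@dataclass
class Response:
    # Hypothetical field names; the real survey schema is not published.
    swab_tested: bool = False
    swab_positive: bool = False
    flu_like_symptoms_after_feb_2020: bool = False
    antibody_positive: bool = False
    hospitalized: bool = False
    icu_oxygen: bool = False
    icu_ventilator: bool = False

def infection_phenotype(r: Response) -> int:
    """1 if any infection criterion (i)-(iii) is met, else 0."""
    untested_with_symptoms = (not r.swab_tested) and r.flu_like_symptoms_after_feb_2020
    return int(r.swab_positive or untested_with_symptoms or r.antibody_positive)

def severity_phenotype(r: Response) -> int:
    """1 if the respondent is an infection case and meets any severity criterion."""
    if not infection_phenotype(r):
        return 0  # severity is only defined among infection cases
    return int(r.hospitalized or r.icu_oxygen or r.icu_ventilator)
```

Under these assumptions, a respondent with a positive antibody test and ICU ventilation is a case for both phenotypes, while a respondent with a negative swab test and flu-like symptoms is a case for neither.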

Results

The results of the application to AncestryDNA survey data indicate that h2 of COVID-19 infection susceptibility ranges from 0.1554 to 0.1833 and h2 of COVID-19 severity ranges from 0.0734 to 0.0751, depending on the value of ρ used (Figure 1). These results suggest a moderate genetic contribution to COVID-19 infection and a low genetic contribution to COVID-19 severity. For context, h2 of height is 0.8 (Macgregor et al. 2006).

Figure 1: MC estimates of h2 from AncestryDNA data, with varying parameters of household co-infection – 0.163 (Li et al. 2020), 0.170 (Grijalva et al. 2020), and 0.530 (Fung et al. 2021).

Discussion

This research provides evidence that variability in infection by SARS-CoV-2 is, in part, explained by inherited genetic factors. Clinicians and public health professionals should be mindful of inherited disease susceptibility and monitor biological family members of the infected. These estimates may also provide context for the genetic contribution to infection severity and hospitalization for disease variants and other pathogens. However, this approach does not account for recessive effects or gene-gene interaction and is reliant upon the estimated parameters. Furthermore, this research and abstract were completed in March 2020. COVID research, including our understanding of genetic mechanisms of disease and epidemiologic parameters, has matured since this time.

Ethical Statement

All data for this research project were from individuals who provided prior informed consent to AncestryDNA, as reviewed and approved by our external institutional review board, Advarra (formerly Quorum). All data were de-identified before use.

Data Availability

Due to privacy and ethical concerns, supporting data cannot be made openly available.

References

  1. [Fung et al. 2021].
    Fung, H. F.; Martinez, L.; Alarid-Escudero, F.; Salomon, J. A.; Studdert, D. M.; Andrews, J. R.; Goldhaber-Fiebert, J. D.; Chin, E. T.; Claypool, A. L.; Fernandez, M.; Gracia, V.; Luviano, A.; Rosales, R. I. M.; Reitsma, M.; Ryckman, T.; and Ryckman, T. 2021. The Household Secondary Attack Rate of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2): A Rapid Review. Clinical Infectious Diseases 73(Supplement_2):S138–S145.
  2. [Garg et al. 2020].
    Garg, S.; Kim, L.; Whitaker, M.; O’Halloran, A.; Cummings, C.; Holstein, R.; Prill, M.; Chai, S. J.; Kirley, P. D.; Alden, N. B.; Kawasaki, B.; Yousey-Hindes, K.; Niccolai, L.; Anderson, E. J.; Openo, K. P.; Weigel, A.; Monroe, M. L.; Ryan, P.; Henderson, J.; Kim, S.; Como-Sabetti, K.; Lynfield, R.; Sosin, D.; Torres, S.; Muse, A.; Bennett, N. M.; Billing, L.; Sutton, M.; West, N.; Schaffner, W.; Talbot, H. K.; Aquino, C.; George, A.; Budd, A.; Brammer, L.; Langley, G.; Hall, A. J.; and Fry, A. 2020. Hospitalization Rates and Characteristics of Patients Hospitalized with Laboratory-Confirmed Coronavirus Disease 2019 — COVID-NET, 14 States, March 1–30, 2020. MMWR. Morbidity and Mortality Weekly Report 69(15):458–464.
  3. [Grijalva et al. 2020].
    Grijalva, C. G.; Rolfes, M. A.; Zhu, Y.; McLean, H. Q.; Hanson, K. E.; Belongia, E. A.; Halasa, N. B.; Kim, A.; Reed, C.; Fry, A. M.; and Talbot, H. K. 2020. Transmission of SARS-COV-2 Infections in Households — Tennessee and Wisconsin, April–September 2020. MMWR. Morbidity and Mortality Weekly Report 69(44).
  4. [Klebanov 2018].
    Klebanov, N. 2018. Genetic Predisposition to Infectious Disease. Cureus 10(8).
  5. [Li et al. 2020].
    Li, W.; Zhang, B.; Lu, J.; Liu, S.; Chang, Z.; Peng, C.; Liu, X.; Zhang, P.; Ling, Y.; Tao, K.; and Chen, J. 2020. Characteristics of Household Transmission of COVID-19. Clinical Infectious Diseases 71(8):1943–1946.
  6. [Macgregor et al. 2006].
    Macgregor, S.; Cornes, B. K.; Martin, N. G.; and Visscher, P. M. 2006. Bias, precision and heritability of self-reported and clinically measured height in Australian twins. Human Genetics 120(4):571–580.
  7. [Patarčić et al. 2015].
    Patarčić, I.; Gelemanović, A.; Kirin, M.; Kolčić, I.; Theodoratou, E.; Baillie, K. J.; de Jong, M. D.; Rudan, I.; Campbell, H.; and Polašek, O. 2015. The role of host genetic factors in respiratory tract infectious diseases: systematic review, meta-analyses and field synopsis. Scientific Reports 5(1):16119.
  8. [Polubriaginof et al. 2018].
    Polubriaginof, F. C. G.; Vanguri, R.; Quinnies, K.; Belbin, G. M.; Yahi, A.; Salmasian, H.; Lorberbaum, T.; Nwankwo, V.; Li, L.; Shervey, M. M.; Glowe, P.; Ionita-Laza, I.; Simmerling, M.; Hripcsak, G.; Bakken, S.; Goldstein, D.; Kiryluk, K.; Kenny, E. E.; Dudley, J.; Vawdrey, D. K.; and Tatonetti, N. P. 2018. Disease Heritability Inferred from Familial Relationships Reported in Medical Records. Cell 173(7):1692–1704.e11.
  9. [United States Census Bureau 2019].
    United States Census Bureau. 2019. Families and Living Arrangements.
  10. [Visscher, Hill, and Wray 2008].
    Visscher, P. M.; Hill, W. G.; and Wray, N. R. 2008. Heritability in the genomics era — concepts and misconceptions. Nature Reviews Genetics 9(4):255–266.
Posted May 25, 2022.