medRxiv

Profiling SARS-CoV-2 mutation fingerprints that range from the viral pangenome to individual infection quasispecies

Billy T. Lau, Dmitri Pavlichin, Anna C. Hooker, Alison Almeda, Giwon Shin, Jiamin Chen, Malaya K. Sahoo, ChunHong Huang, Benjamin A. Pinsky, HoJoon Lee, Hanlee P. Ji
doi: https://doi.org/10.1101/2020.11.02.20224816
Billy T. Lau
1Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, 94305, United States
2Stanford Genome Technology Center West, Stanford University, Palo Alto, CA, 94304, United States
Dmitri Pavlichin
1Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, 94305, United States
Anna C. Hooker
1Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, 94305, United States
Alison Almeda
1Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, 94305, United States
Giwon Shin
1Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, 94305, United States
Jiamin Chen
1Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, 94305, United States
Malaya K. Sahoo
3Department of Pathology, Stanford University School of Medicine, Stanford, CA, 94305, United States
ChunHong Huang
3Department of Pathology, Stanford University School of Medicine, Stanford, CA, 94305, United States
Benjamin A. Pinsky
3Department of Pathology, Stanford University School of Medicine, Stanford, CA, 94305, United States
4Department of Medicine, Division of Infectious Diseases and Geographic Medicine, Stanford University School of Medicine, Stanford, CA, 94305, United States
HoJoon Lee
1Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, 94305, United States
  • For correspondence: genomics_ji{at}stanford.edu hojoon{at}stanford.edu
Hanlee P. Ji
1Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, 94305, United States
2Stanford Genome Technology Center West, Stanford University, Palo Alto, CA, 94304, United States
  • For correspondence: genomics_ji{at}stanford.edu hojoon{at}stanford.edu

ABSTRACT

Background The genome of SARS-CoV-2 is susceptible to mutations during viral replication because of errors introduced by the RNA-dependent RNA polymerase. These mutations enable SARS-CoV-2 to evolve into new strains. Viral quasispecies emerge from de novo mutations that occur in individual patients. In combination, these sets of viral mutations provide distinct genetic fingerprints that reveal patterns of transmission and have utility in contact tracing.

Methods Leveraging thousands of sequenced SARS-CoV-2 genomes, we performed a viral pangenome analysis to identify conserved genomic sequences. We used a rapid and highly efficient computational approach that relies on k-mers, short tracts of sequence, instead of conventional sequence alignment. Using this method, we annotated viral mutation signatures that were associated with specific strains. Based on these highly conserved viral sequences, we developed a rapid and highly scalable targeted sequencing assay to identify mutations, detect quasispecies and identify mutation signatures from patients. These results were compared to the pangenome genetic fingerprints.
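The alignment-free idea described in the Methods can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the value of k, the `min_fraction` threshold, and the toy sequences are all assumptions chosen for illustration. The sketch counts, for each k-mer, how many genomes contain it, and treats k-mers present in at least a chosen fraction of genomes as conserved.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping k-length substring of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def kmer_genome_counts(genomes, k):
    """For each k-mer, count how many genomes contain it at least once."""
    counts = Counter()
    for seq in genomes.values():
        # A set ensures a k-mer is counted once per genome.
        counts.update(set(kmers(seq, k)))
    return counts


def conserved_kmers(genomes, k, min_fraction=1.0):
    """Return k-mers found in at least min_fraction of the genomes."""
    counts = kmer_genome_counts(genomes, k)
    threshold = min_fraction * len(genomes)
    return {kmer for kmer, n in counts.items() if n >= threshold}


# Toy sequences for illustration only; not real SARS-CoV-2 data.
toy_genomes = {
    "genome_a": "ACGTACGTTAGC",
    "genome_b": "ACGTACGATAGC",  # differs from the others at one position
    "genome_c": "ACGTACGTTAGC",
}

shared = conserved_kmers(toy_genomes, k=4)
print(sorted(shared))
```

In this toy case the single substitution in `genome_b` removes every k-mer overlapping that position from the conserved set, which is why conserved k-mers mark mutation-free stretches that could serve as primer sites. A real analysis would use a much larger k and a scalable index rather than in-memory Python sets.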

Results We built a k-mer index for thousands of SARS-CoV-2 genomes and identified conserved genomic regions and the landscape of mutations across these virus genomes. We delineated mutation profiles spanning common genetic fingerprints (the combination of mutations in a viral assembly) and rare ones that occur in only a small fraction of patients. We developed a targeted sequencing assay by selecting primers from the conserved viral genome regions to flank frequent mutations. Using a cohort of SARS-CoV-2 clinical samples, we identified genetic fingerprints consisting of strain-specific mutations seen across populations and de novo quasispecies mutations localized to individual infections. We compared the mutation profiles of viral samples undergoing analysis with the features of the pangenome.
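The notion of a genetic fingerprint as the combination of mutations in a viral assembly, and the distinction between common and rare fingerprints, can be sketched as follows. This is an illustrative assumption about how such a tally might be organized, not the authors' implementation; the mutation labels, assembly names, and the `max_fraction` cutoff are hypothetical.

```python
from collections import Counter


def fingerprint(mutations):
    """Represent a fingerprint as the unordered set of mutations in one assembly."""
    return frozenset(mutations)


def fingerprint_prevalence(assemblies):
    """Count how many assemblies carry each distinct fingerprint."""
    return Counter(fingerprint(m) for m in assemblies.values())


def rare_fingerprints(prevalence, max_fraction):
    """Return fingerprints seen in at most max_fraction of all assemblies."""
    total = sum(prevalence.values())
    return {fp for fp, n in prevalence.items() if n / total <= max_fraction}


# Hypothetical mutation labels and assemblies, for illustration only.
toy_assemblies = {
    "asm1": ["C100T", "A200G"],
    "asm2": ["A200G", "C100T"],  # same fingerprint as asm1, different order
    "asm3": ["C100T"],
    "asm4": ["C100T", "A200G", "G300A"],
}

prevalence = fingerprint_prevalence(toy_assemblies)
rare = rare_fingerprints(prevalence, max_fraction=0.25)
print(prevalence.most_common(1))
print(len(rare))
```

Using a `frozenset` makes the fingerprint independent of the order in which mutations are listed, so two assemblies with the same mutations are counted together.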

Conclusions We analyzed the viral mutation profiles that form the basis of genetic fingerprints. Our study linked pangenome analysis with targeted deep sequencing of SARS-CoV-2 clinical samples. We identified quasispecies mutations occurring within individual patients, mutations demarcating dominant species, and the prevalence of mutation signatures, a significant number of which were relatively unique. Analysis of these genetic fingerprints may provide a way of conducting molecular contact tracing.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

The work is supported by the National Institutes of Health [2R01HG006137-04 to H.P.J., P01HG00205ESH to B.T.L. and H.P.J., U01HG010963 to HJ.L., D.P. and H.P.J., 1R35HG011292-01 to B.T.L.]. Additional support came from the Clayville Foundation.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

The Institutional Review Board (IRB) at Stanford University School of Medicine approved the study protocol (IRB-56088).

All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Data Availability

Sequence data is available at the National Institutes of Health's Sequence Read Archive under BioProject Accession ID of PRJNA663917.

  • ABBREVIATIONS

    bp: base pair
    CDC: Centers for Disease Control and Prevention
    cDNA: complementary DNA
    EUA: Emergency Use Authorization
    FDA: Food and Drug Administration
    GISAID: Global Initiative on Sharing All Influenza Data
    gnomAD: Genome Aggregation Database
    kb: kilobase
    NCBI: National Center for Biotechnology Information
    NGS: next-generation sequencing
    nsp: non-structural protein
    nt: nucleotide
    ORF: open reading frame
    RdRp: RNA-dependent RNA polymerase
    RT: reverse transcription
    ViPR: Virus Pathogen Resource
  • Copyright 
    The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
    Posted November 04, 2020.