medRxiv

Mapping structural variants to rare disease genes using long-read whole genome sequencing and trait-relevant polygenic scores

Cas LeMaster, Carl Schwendinger-Schreck, Bing Ge, Warren A. Cheung, Rebecca McLennan, Jeffrey J. Johnston, Tomi Pastinen, Craig Smail
doi: https://doi.org/10.1101/2024.03.15.24304216
Cas LeMaster
1Genomic Medicine Center, Children’s Mercy Research Institute and Children’s Mercy Kansas City, Kansas City, MO, USA
  • For correspondence: clemaster{at}cmh.edu
Carl Schwendinger-Schreck
1Genomic Medicine Center, Children’s Mercy Research Institute and Children’s Mercy Kansas City, Kansas City, MO, USA
Bing Ge
2McGill University, Montreal, Quebec, Canada
Warren A. Cheung
1Genomic Medicine Center, Children’s Mercy Research Institute and Children’s Mercy Kansas City, Kansas City, MO, USA
Rebecca McLennan
1Genomic Medicine Center, Children’s Mercy Research Institute and Children’s Mercy Kansas City, Kansas City, MO, USA
Jeffrey J. Johnston
1Genomic Medicine Center, Children’s Mercy Research Institute and Children’s Mercy Kansas City, Kansas City, MO, USA
Tomi Pastinen
1Genomic Medicine Center, Children’s Mercy Research Institute and Children’s Mercy Kansas City, Kansas City, MO, USA
Craig Smail
1Genomic Medicine Center, Children’s Mercy Research Institute and Children’s Mercy Kansas City, Kansas City, MO, USA

ABSTRACT

Recent studies have revealed the pervasive landscape of rare structural variants (rSVs) present in human genomes. rSVs can have extreme effects on the expression of proximal genes and, in a rare disease context, have been implicated in patient cases where no diagnostic single nucleotide variant (SNV) was found. To date, efforts to integrate rSVs have been targeted, focusing on known Mendelian rare disease genes. This approach is intractable for rare diseases with many causal loci or for patients with complex, multi-phenotype syndromes. We hypothesized that integrating trait-relevant polygenic scores (PGS) would substantially reduce the number of candidate disease genes in which to assess rSV effects. We further implemented a method for ranking PGS genes to define a set of core/key genes where an rSV has the potential to exert relatively larger effects on disease risk. Among a subset of patients enrolled in the Genomic Answers for Kids (GA4K) rare disease program (N=497), we used PacBio HiFi long-read whole genome sequencing (lrWGS) to identify rSVs intersecting genes in trait-relevant PGSs. Illustrating our approach in autism (N=54 cases), we identified 22,019 deletions, 2,041 duplications, 87,826 insertions, and 214 inversions overlapping putative core/key PGS genes. Additionally, by integrating genomic constraint annotations from gnomAD, we observed that rare duplications overlapping putative core/key PGS genes were frequently in higher-constraint regions compared to controls (P = 1 × 10⁻³). This difference was not observed in the lowest-ranked gene set (P = 0.15). Overall, our study provides a framework for annotating rSVs from lrWGS data and for prioritizing disease-linked genomic regions for downstream functional validation of rSV impacts. To enable reuse by other researchers, we have made SV allele frequencies and gene associations freely available.
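
The core operation described in the abstract, intersecting rare SVs with a gene set, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Interval` type, the function names, the 1% rarity threshold, and all coordinates and identifiers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    name: str


def overlaps(a: Interval, b: Interval) -> bool:
    """True when two half-open intervals on the same chromosome share a base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def rare_svs_in_genes(svs, genes, allele_freq, max_af=0.01):
    """Map gene name -> names of rare SVs overlapping that gene.

    max_af is an assumed rarity cutoff, not a value taken from the paper.
    """
    hits = {}
    for sv in svs:
        if allele_freq[sv.name] >= max_af:
            continue  # skip SVs that are not rare under the assumed cutoff
        for gene in genes:
            if overlaps(sv, gene):
                hits.setdefault(gene.name, []).append(sv.name)
    return hits


# Hypothetical toy data: two genes from a ranked PGS gene set and three SVs.
genes = [
    Interval("chr1", 100, 200, "GENE_A"),
    Interval("chr2", 500, 900, "GENE_B"),
]
svs = [
    Interval("chr1", 150, 400, "sv_del_1"),       # rare, overlaps GENE_A
    Interval("chr1", 180, 190, "sv_dup_common"),  # overlaps GENE_A but common
    Interval("chr2", 1000, 1200, "sv_inv_1"),     # rare, overlaps no gene
]
allele_freq = {"sv_del_1": 0.002, "sv_dup_common": 0.2, "sv_inv_1": 0.001}

hits = rare_svs_in_genes(svs, genes, allele_freq)
print(hits)
```

The nested loop is quadratic in the number of SVs and genes; at the scale reported in the abstract, an interval tree or a sorted sweep would be the more practical choice.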

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

C.S. is supported by NIH grant R35GM146966. We thank all the individuals who participated in making the GA4K study possible. This work was funded through internal institutional funds from Children's Mercy Research Institute and Children's Mercy Kansas City.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

GA4K study data can be found on the AnVIL platform at https://anvilproject.org/data/studies/phs002206/workspaces. The study complies with all relevant ethical regulations as approved by the Children's Mercy Institutional Review Board (IRB) (Study #11120514).

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Footnotes

  • rSVs now exclude homozygous reference variants, which were inflating allele frequencies. This change modifies group N values, frequency statistics, and significance tests. Enrichment analysis now identifies additional rSV-enriched genes. Table 1 has been modified to reflect the new overlaps and to indicate rSV type and locus data. Constraint results have remained consistent. A new co-author has been added. A new supplemental figure showing rSV counts across 20 MAF bins (0-1 MAF) has been added.
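
As a rough illustration of the revision described in this footnote, the Python sketch below computes an alternate-allele frequency per SV site, drops sites where every called genotype is homozygous reference, and counts the remaining sites in 20 equal-width frequency bins over 0 to 1. The genotype encoding (tuples of allele indices, `None` for a missing call), the function names, and the toy data are assumptions; the authors' actual filtering and binning may differ.

```python
def alt_allele_frequency(genotypes):
    """Alternate-allele frequency among called diploid genotypes.

    Each genotype is a tuple of allele indices such as (0, 1), or None
    when the call is missing (an assumed encoding).
    """
    called = [g for g in genotypes if g is not None]
    n_alleles = 2 * len(called)
    if n_alleles == 0:
        return None  # no called genotypes at this site
    n_alt = sum(1 for g in called for allele in g if allele > 0)
    return n_alt / n_alleles


def bin_index(freq, n_bins=20):
    """Index of the equal-width bin over [0, 1]; 1.0 falls in the last bin."""
    return min(int(freq * n_bins), n_bins - 1)


def count_sites_by_bin(sites, n_bins=20):
    """Count SV sites per frequency bin, skipping sites with no alternate allele."""
    counts = [0] * n_bins
    for genotypes in sites.values():
        freq = alt_allele_frequency(genotypes)
        if freq is None or freq == 0.0:
            continue  # all calls missing or homozygous reference
        counts[bin_index(freq, n_bins)] += 1
    return counts


# Hypothetical toy sites keyed by SV identifier.
sites = {
    "sv1": [(0, 0), (0, 1)],  # frequency 0.25
    "sv2": [(0, 0), (0, 0)],  # homozygous reference only, skipped
    "sv3": [(1, 1), (1, 1)],  # frequency 1.0
}
counts = count_sites_by_bin(sites)
print(counts)
```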

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted May 16, 2024.
Subject Area

  • Genetic and Genomic Medicine