Abstract
Background This study aimed to explore the risk of lifetime or early-onset cardiovascular diseases, and of other diseases within or beyond the circulatory system, in rare variant (RV) carriers of seven well-characterized monogenic cerebral small vessel disease (CSVD) risk genes (COL4A1, COL4A2, NOTCH3, HTRA1, TREX1, CTC1, and GLA), using a genotype-first approach within a hospital system population-based biobank cohort.
Method This was a retrospective longitudinal study. MyCode participants with sequenced exomes were temporally split into discovery (n=92,445) and replication (n=81,130) cohorts. A workflow was created to prioritize potentially pathogenic variants by integrating three variant annotation pipelines. After propensity score matching, the discovery cohort comprised 2738 carriers and 5490 noncarriers, and the replication cohort comprised 1695 carriers and 3410 noncarriers.
Result Most of the RVs identified were heterozygous and were disproportionately present in participants with African ancestry. Carriers showed an increased risk for early signs and symptoms of cerebrovascular disease. A Cox regression model showed that NOTCH3 (European), TREX1 (European), and COL4A1/2 (African) were associated with ischemic stroke. Circulatory diseases were overrepresented in both the discovery and replication cohorts, with an early onset of cardiovascular phenotypes irrespective of race. Sex- and race-dependent effects on cerebrovascular disease risk were detectable, particularly for NOTCH3 (HRearlyonset = 2.175 [1.391-3.403], p = 0.001) and TREX1 (HRearlyonset = 4.006 [1.797-8.931], p = 0.001); the estimate for COL4A1/2 was HRearlyonset = 2.163 [0.87-5.38], p = 0.097. HTRA1 (European) also showed a similar trend of association.
Conclusion Carriers of RVs in monogenic CSVD risk genes demonstrated an increased risk of lifetime or early-onset cerebrovascular disease and of diseases within or beyond the circulatory system, in some cases in a race- and sex-dependent manner. Our findings support the concept of developing a CSVD gene panel for population screening of patients with early-onset circulatory diseases.
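The propensity score matching step named in the Method can be illustrated with a minimal sketch. The abstract does not describe the matching algorithm, so everything here is an assumption made for illustration: greedy nearest-neighbor matching without replacement, roughly two noncarriers per carrier (suggested only by the reported cohort sizes), a hypothetical caliper of 0.05, and made-up participant IDs and scores. It is not the authors' implementation.

```python
def match_controls(carrier_scores, noncarrier_scores, ratio=2, caliper=0.05):
    """Greedy nearest-neighbor matching on a precomputed propensity score.

    carrier_scores / noncarrier_scores: dicts mapping a participant ID to a
    propensity score in [0, 1]. Matching is without replacement.
    The ratio and caliper values are illustrative assumptions.
    """
    available = dict(noncarrier_scores)  # noncarriers not yet used as a match
    matches = {}
    # Process carriers in order of increasing score for reproducibility.
    for cid, cscore in sorted(carrier_scores.items(), key=lambda kv: kv[1]):
        # Rank the remaining noncarriers by distance to this carrier's score.
        nearest = sorted(available, key=lambda nid: abs(available[nid] - cscore))
        # Keep up to `ratio` of them, dropping any outside the caliper.
        chosen = [nid for nid in nearest[:ratio]
                  if abs(available[nid] - cscore) <= caliper]
        for nid in chosen:
            del available[nid]
        matches[cid] = chosen
    return matches


# Hypothetical toy data: two carriers, five candidate noncarriers.
carriers = {"c1": 0.30, "c2": 0.62}
noncarriers = {"n1": 0.29, "n2": 0.33, "n3": 0.60, "n4": 0.64, "n5": 0.95}
matches = match_controls(carriers, noncarriers)
print(matches)
```

In this toy run each carrier receives two matched noncarriers, and the noncarrier with a score of 0.95 is left unmatched because no carrier lies within the caliper.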
Question Do rare variant carriers of seven well-characterized monogenic cerebral small vessel disease (CSVD) risk genes (COL4A1, COL4A2, NOTCH3, HTRA1, TREX1, CTC1, and GLA) have an increased risk of lifetime or early-onset cardiovascular diseases and of other diseases within or beyond the circulatory system?
Findings In this retrospective longitudinal study of 175K individuals, who were temporally split into discovery (2738 carriers/5490 matched noncarriers) and replication (1695 carriers/3410 matched noncarriers) cohorts, NOTCH3 (HRearlyonset = 2.175 for White), TREX1 (HRearlyonset = 4.006 for White), and COL4A1/2 (HRearlyonset = 2.163 for Black) were associated with ischemic stroke, particularly in female participants. HTRA1 also showed a similar trend. Circulatory diseases were overrepresented in both the discovery and replication cohorts, with an early onset of cardiovascular phenotypes irrespective of race.
Meaning Carriers of rare variants in monogenic CSVD risk genes demonstrated an increased risk of lifetime or early-onset cardiovascular disease and of diseases within or beyond the circulatory system; some of these associations are race- and sex-dependent.
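As a reading aid, the reported early-onset hazard ratios, intervals, and p-values can be checked for internal consistency. The sketch below assumes the bracketed ranges are 95% Wald confidence intervals that are symmetric on the log scale; the abstract does not state this, so the confidence level and the helper name `p_from_hr_ci` are assumptions. Under that assumption the standard error of log(HR) is (ln upper − ln lower) / (2 × 1.96), and a two-sided p-value follows from z = ln(HR) / SE.

```python
import math


def p_from_hr_ci(hr, lower, upper, z_crit=1.959964):
    """Recover SE, z, and a two-sided p-value from an HR and its CI.

    Assumes a Wald interval symmetric on the log scale; z_crit = 1.96
    corresponds to an assumed 95% confidence level.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values as reported in the abstract: (HR, lower, upper, reported p).
reported = {
    "NOTCH3": (2.175, 1.391, 3.403, 0.001),
    "TREX1": (4.006, 1.797, 8.931, 0.001),
    "COL4A1/2": (2.163, 0.87, 5.38, 0.097),
}

results = {}
for gene, (hr, lo, hi, p_rep) in reported.items():
    se, z, p = p_from_hr_ci(hr, lo, hi)
    results[gene] = p
    print(f"{gene}: SE={se:.3f}, z={z:.2f}, recomputed p={p:.4f}, reported p={p_rep}")
```

If the assumption holds, the recomputed p-values agree with the reported ones after rounding to three decimals. The COL4A1/2 interval includes 1, in line with its p-value above 0.05.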
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This study did not receive any funding.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The Geisinger Institutional Review Board approved this study as meeting the criteria for non-human subject research using de-identified information. All research was performed in accordance with relevant guidelines and regulations. Informed consent was obtained from all MyCode participants and/or their legal guardian(s).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
Conflict of Interest Disclosures: J Li, D Chaudhary, DJ Carey, O Goren, KE Wain, R Zand, and V Abedi report no disclosures relevant to the manuscript.
Minor changes have been made to the main text. A link to additional code for the data analysis has been appended. A vertical bar that was shifted by a previous upload error has been fixed.
Data Availability
All data produced in the present study are available upon reasonable request to the authors. The code is available on GitHub in the repository TheDecodeLab/Trajectory-of-Monogenic-Diseases (Development of a platform to observe and analyze the trajectory of monogenic diseases).
Glossary
- ACMG
- American College of Medical Genetics and Genomics
- AFR
- African ancestry
- CI
- confidence interval
- CoxPH
- Cox proportional-hazards model
- CSVD
- cerebral small vessel disease
- EHR
- Electronic Health Record
- eQTL
- expression quantitative trait loci
- EUR
- European ancestry
- GO
- Gene Ontology
- GWAS
- genome-wide association study
- HR
- hazard ratio
- HGMD
- Human Gene Mutation Database
- HWE
- Hardy-Weinberg Equilibrium
- ICD
- International Classification of Diseases
- IS
- ischemic stroke
- LAS
- large-artery stroke
- LD
- linkage disequilibrium
- MAF
- minor allele frequency
- ML
- machine learning
- OR
- odds ratio
- PSEA
- PheCode Set Enrichment Analysis
- PTV
- protein-truncating variant
- RD
- risk difference
- RR
- relative risk
- RV
- rare variant
- PCA
- Principal Component Analysis
- PRE
- percent relative effect
- PRS
- polygenic risk score
- SNP
- single nucleotide polymorphism
- TIA
- transient ischemic attack
- VEP
- Ensembl Variant Effect Predictor
- VUS
- variant of uncertain significance