ABSTRACT
Background Atrial fibrillation (AF) is a common and clinically heterogeneous arrhythmia. Machine learning (ML) algorithms can define data-driven disease subtypes in an unbiased fashion, but it is unclear whether AF subgroups defined in this way align with underlying mechanisms, such as high polygenic liability to AF or inflammation, or associate with clinical outcomes.
Methods We identified individuals with AF in a large biobank linked to electronic health records (EHR) and genome-wide genotyping. The phenotypic architecture of the AF cohort was defined using principal component analysis of 35 expert-curated, uncorrelated clinical features. We applied an unsupervised co-clustering machine learning algorithm to the same 35 features to identify distinct phenotypic AF clusters. The clinical inflammatory status of each cluster was defined using biomarkers (CRP, ESR, WBC, neutrophil %, platelet count, RDW) measured within 6 months of the first AF mention in the EHR. Polygenic risk scores (PRS) for AF and for cytokine levels were used to assess the genetic liability of each cluster to AF and to inflammation, respectively. Clinical outcomes were collected from the EHR up to the last medical contact.
Results The analysis included 23,271 subjects with AF, of whom 6,023 had available genome-wide genotyping. The machine learning algorithm identified 3 phenotypic clusters that were distinguished by increasing prevalence of comorbidities, particularly renal dysfunction and coronary artery disease. Polygenic liability to AF was highest in the low comorbidity cluster. Clinically measured inflammatory biomarkers were highest in the high comorbidity cluster, while genetically predicted levels of inflammatory biomarkers did not differ between clusters. Cluster assignment was associated with multiple clinical outcomes after AF diagnosis, including mortality, stroke, bleeding, and use of cardiac implantable electronic devices.
Conclusion Patient subgroups identified by unsupervised clustering were distinguished by comorbidity burden and associated with risk of clinically important outcomes. Polygenic liability to AF was greatest in the low comorbidity subgroup. Clinical inflammation, as reflected by measured biomarkers, was lowest in the subgroup with the lowest comorbidity burden. However, there were no differences in genetically predicted levels of inflammatory biomarkers, suggesting that the association between AF and inflammation is driven by acquired comorbidities rather than genetic predisposition.
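The abstract does not say how the polygenic risk scores were computed. As a rough illustration only, a PRS is commonly defined as a weighted sum of risk-allele dosages, which can then be standardized and averaged within each cluster. The sketch below assumes that standard definition. The effect sizes, dosages, cluster labels (A, B, C), and function names are all hypothetical and are not taken from the study.

```python
from statistics import mean, pstdev


def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1, or 2 copies per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def standardize(scores):
    """Convert raw scores to z-scores across the whole cohort."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


def mean_score_by_cluster(scores, clusters):
    """Average the scores of the individuals assigned to each cluster."""
    grouped = {}
    for score, cluster in zip(scores, clusters):
        grouped.setdefault(cluster, []).append(score)
    return {cluster: mean(values) for cluster, values in grouped.items()}


# Hypothetical per-variant effect sizes (for example, log odds ratios from a GWAS).
weights = [0.10, 0.05, 0.20]

# Hypothetical risk-allele dosages: one row per individual, one column per variant.
dosages = [
    [2, 1, 2],
    [1, 2, 1],
    [1, 0, 1],
    [0, 1, 1],
    [0, 0, 1],
    [0, 1, 0],
]

# Hypothetical cluster assignments, standing in for the output of a clustering step.
clusters = ["A", "A", "B", "B", "C", "C"]

raw_scores = [polygenic_risk_score(d, weights) for d in dosages]
z_scores = standardize(raw_scores)
mean_prs_by_cluster = mean_score_by_cluster(z_scores, clusters)

for cluster, value in sorted(mean_prs_by_cluster.items()):
    print(f"cluster {cluster}: mean standardized PRS = {value:.2f}")
```

In a real analysis the weights would come from published GWAS summary statistics, and between-cluster differences would be tested formally rather than read off the cluster means.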
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This project was supported by Vanderbilt's participation in the American Heart Association's Atrial Fibrillation Strategically Focused Research Network (18SFRN34110369 and 18SFRN34230089). This project also received support from NIH grants R01 HL149826 (DMR) and R01 HL142856 (JDM).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Vanderbilt University Medical Center (VUMC) Institutional Review Board (IRB#181403)
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
Data used in this study are available from the corresponding author upon request, subject to approval by institutional review.