Abstract
Coronary artery disease (CAD) remains the leading cause of death despite scientific advances. Elucidating shared CAD/pneumonia pathways may reveal novel insights regarding CAD pathways. We performed genome-wide pleiotropy analyses of CAD and pneumonia, examined the causal effects of the expression of genes near independently replicated SNPs and interacting genes with CAD and pneumonia, and tested interactions between disruptive coding mutations of each pleiotropic gene and smoking status on CAD and pneumonia risks. Identified pleiotropic SNPs were annotated to ADAMTS7 and IL6R. Increased ADAMTS7 expression across tissues consistently showed decreased risk for CAD and increased risk for pneumonia; increased IL6R expression showed increased risk for CAD and decreased risk for pneumonia. We similarly observed opposing CAD/pneumonia effects for NLRP3. Reduced ADAMTS7 expression conferred a reduced CAD risk without increased pneumonia risk only among never-smokers. Genetic immune-inflammatory axes of CAD linked to respiratory infections implicate ADAMTS7 and IL6R, and related genes.
Introduction
Despite continued advances, coronary artery disease (CAD) remains the leading cause of mortality and disability-adjusted life-years (DALYs) worldwide1. Acute respiratory infections such as pneumonia, from influenza, SARS-CoV-2, and other pathogens, are common among individuals with chronic CAD2, 3 and can precipitate CAD events or exacerbate CAD symptoms4, and influenza vaccination reduces CAD risk5, 6,7. Such observations implicate the operation of immune-related CAD pathways. Elucidating shared pathways may reveal novel insights toward addressing the sizable persistent CAD risk in the population.
Prior human genetic studies identified interleukin 6 receptor (IL6R) as a putative causal factor for CAD8, 9, bolstered by the CANTOS (Canakinumab Anti-Inflammatory Thrombosis Outcomes Study) trial showing that inhibition of interleukin 1B (IL1B), a cytokine immediately upstream of IL-6, reduces CAD risk10. However, key questions persist: since IL1B inhibition was associated with increased infection risk, does this extend to other targets in the IL1B signaling pathway? Are there other inflammatory pathways implicating CAD and pneumonia? Can environmental factors influence the CAD and pneumonia risk associated with these pathways?
Here, we leveraged the pleiotropy between CAD and pneumonia to dissect jointly the unaddressed axis of CAD linked to pneumonia. We started with genome-wide pleiotropy analyses of CAD and pneumonia using summary statistics from CARDIoGRAMplusC4D and FinnGen R4. Then we examined the causal effects of genes near the pleiotropic loci that were independently replicated, as well as other closely interacting genes according to the STRING database of protein-protein interactions, on CAD and pneumonia through Mendelian randomization (MR). Lastly, we examined for interactions between those genes and smoking status on incident CAD and pneumonia risks in the UK Biobank. Our observations support genetic immune-inflammatory axes of CAD linked to respiratory infections and implicate ADAMTS7 and IL6R, and related genes.
Results
Our primary study population included 450,899 participants in the UK Biobank, who were mean (standard deviation) age 56.6 (8.0) years, 53.9% female, 45.0% ever smokers, and 83.6% European ancestry. At baseline, the prevalences of CAD, pneumonia, hypertension, hypercholesterolemia, and type 2 diabetes were 3.1%, 2.1%, 29.3%, 15.1%, and 2.5% respectively. Over 11.1-year median follow up, the number of incident cases of CAD, pneumonia, and type 2 diabetes were 13,374 (3.1%), 18,574 (4.2%), and 23,007 (5.2%), respectively (Table 1). The workflow for our study is depicted in Supplemental Figure 1.
Genome-wide pleiotropy search
With applied the PLeiotropic Analysis under COmposite null hypothesis (PLACO) to summary statistics for CAD from CARDIoGRAMplusC4D 201511 and for pneumonia from FinnGen v4. A total of 115 SNPs with significant genetic pleiotropy between CAD and pneumonia defined as pPLACO<5 × 10−8 were identified. Among them, 88 also demonstrated evidence of association (p<0.05) with CAD and pneumonia risk in an independent dataset, which was a meta-analysis of the UK Biobank and Mass General Brigham Biobank. The CAD and pneumonia effects of all replicated SNPs were of opposite directions. These SNPs mapped to 2 distinct loci, and the annotated closest genes were ADAMTS7 and IL6R (Figure 1 and Supplemental Table 1).
Genetic pleiotropy is defined as PLACO p values below genome-wide significance (5.0 × 10−8) with input GWAS of CAD from CARDIoGRAMplusC4D and that of pneumonia from FinnGen. GWAS results of CAD and pneumonia are from meta-analysis of UK Biobank and Mass General Brigham Biobank. 115 SNPs with significant evidence of genetic pleiotropy between CAD and pneumonia are highlighted: 113 SNPs with opposite direction of effects on CAD and pneumonia (nearest genes: IL6R, CHRNB4, ADAMTS7, MORF4L1) are highlighted in blue; 2 SNPs with same direction of effects (nearest genes: SH2B3, ATXN2) are highlighted in purple. The black horizontal dashed line in (A) corresponds to genome-wide significance (5E-8) and that in (B) and (C) corresponds to nominal significance (0.05). CAD: coronary artery disease, GWAS: genome-wide association study.
Causal relations examination
We used eQTL data from the GTEx v8 to perform Mendelian randomization (MR) to assess the tissue specific effects of increased ADAMTS7 and IL6R expression on CAD and pneumonia risk12. Based on known biology, the genetic effects of tissue-specific expression of ADAMTS7 were assessed in aorta, tibial artery, and left ventricle and those of IL6R were assessed in tibial artery and whole blood. Across all those tissues, we observed that genetically increased ADAMTS7 expression consistently correlated with significantly decreased risk for CAD and significantly increased risk for pneumonia, and genetically increased IL6R expression correlated with significantly increased risk for CAD and significantly decreased risk for pneumonia. Using inverse variance–weighted fixed-effects (IVW-FE) method, one standard deviation increase in the normalized gene expression of ADAMTS7 was genetically associated with 0.85 odds (95% CI: 0.83-0.88; p=9.1 × 10−28) for CAD and 1.07 odds (95% CI: 1.03-1.11; p: 0.001) for pneumonia in tibial artery, and one standard deviation increase in genetic IL6R expression was associated with 1.18 odds (95% CI: 1.13-1.24; p=1.1 × 10−11) for CAD and 0.86 odds (95% CI: 0.80-0.93; p: 0.0001) for pneumonia in the same tissue (Figure 2). Sensitivity analyses using other MR methods, including summary-data-based MR (SMR), MR-Egger, weighted median, weighted mode, and MR RAPS, showed consistent results (Supplemental Figure 2 and Supplemental Table 2).
Mendelian randomization with inverse-variance weighted methods were used for this analysis. eQTL data were from GTEx v8. Relevant tissues included whole blood, spleen, small intestine terminal ileum, lung, liver, heart left ventricle, heart atrial appendage, artery tibial, artery coronary, artery aorta, adrenal gland, adipose visceral omentum, and adipose subcutaneous. Red bars indicate positive causal relations. Green bars indicate negative causal relations. The methods use was Black dashed lines indicate Z scores of 1.96, corresponding to p=0.05. CAD: coronary artery disease
Among the top ten genes whose corresponding proteins closely interact with ADAMTS7 protein according to STRING (Supplemental Figure 3A), three (ADAMTS13, ADAMTSL1, and ADAMTSL3) had valid eQTL gene expression for tibial artery in GTEx. None of the eQTL instruments for the three genes showed significant associations with either CAD or pneumonia (Figure 3A). As the expression of IL6R is well-recognized to be influenced by the concentrations of IL1B and NLRP3, we seeded STRING with IL6R, IL1B, and NLRP3 for interacting genes in this pathway. Combining the top ten genes whose corresponded proteins closely interact with each of IL6R, IL1B, and NLRP3 proteins resulted in a total of 28 distinct genes, among which 24 had valid eQTL instruments for their expression concentrations in whole blood of data from the eQTLGen consortium, including IL6, IL6ST, JAK1, JAK2, JAK3, STAT1, SAT3, STAT6, TYK2, IL1R1, IL1R2, IL1RAP, IL4, IL10, IL18, TNF, AIM2, DHX33, CASP1, CARD5, CARD8, NEK7, NLRC4, and TXNIP. We tested IL6R, IL1B, and NLRP3 together with the 24 genes, and found three of them had significant effects on both CAD and pneumonia: IL6R, NLRP3, and DHX33. Consistent with the results using tissue specific eQTL data from the GTEx, increased IL6R expression showed increased risk for CAD and decreased risk for pneumonia (pCAD=3.8 × 10−4; ppneumonia=0.03); the expression of NLRP3 also demonstrated opposing directions of effects but with increased expression leading to decreased CAD risk and increased pneumonia risk (pCAD=0.009; ppneumonia=0.004). ; decreased DHX33 gene expression was associated with decreased risk for both CAD and pneumonia (pCAD=0.009; ppneumonia=0.008) (Figure 3B-D). Our sensitivity analysis with using only cis-eQTL data for selecting instruments or using multiple MR methods yielded consistent results (Supplemental Figure 4-5 and Supplemental Table 3). When limiting to cis-eQTLs only, similar to NLRP3, decreased expression of the related AIM2 inflammasome was associated with decreased CAD risk and increased pneumonia risk (Supplementary Figure 4C).
eQTL data derived from tibial artery tissue for ADAMTS7 related genes were from the GTEx v8. eQTL data derived from whole blood for IL6R, IL1B, and NLRP3 related genes were from the eQTLGen consortium. Mendelian randomization with inverse-variance weighted methods were used for this analysis. Red dots indicate significant causal relations with both CAD and pneumonia. Green dots indicate significant causal relations with either CAD or pneumonia. Grey dots indicate significant causal relations with neither CAD nor pneumonia. Black dashed lines indicate Z scores of 1.96, corresponding to p=0.05. CAD: coronary artery disease, eQTL: expression quantitative trait loci.
Gene-environment interaction examination
Based on existing literature, we identified ADAMTS7 p.Ser214Pro (rs3825807, allele frequency=43.8%) as the genetic proxy for the function inhibition for ADAMTS713, 14 and IL6R p.Asp358Ala (rs2228145, allele frequency=40.1%) for IL6R8, 15, 16, both of which were also identified as pleiotropic SNPs by PLACO. In the UK Biobank, among participants who never smoked, the presence of ADAMTS7 p.Ser214Pro was associated with a significantly reduced risk of incident CAD (HR: 0.89; 95% CI: 0.84–0.93; p=1.1 × 10−5) for never smokers but not those who ever smoked (HR: 0.96; 95% CI: 0.92–1.01; p=0.14), revealing a statistical interaction between smoking status and ADAMTS7 p.Ser214Pro (pinteraction=0.01). In contrast, the presence of ADAMTS7 p.Ser214Pro was associated with a significantly increased risk of pneumonia only among those who ever smoked (HR: 1.08; 95% CI: 1.03–1.12; p=4.4 × 10−4) but not those who never smoked (HR: 0.96; 95% CI: 0.92-1.01; p=0.15; pinteraction=2.8 × 10−4) (Table 2). We did not observed statistically significant interactions between IL6R p.Asp358Ala and smoking status on either CAD or pneumonia.
Discussion
Using a genome-wide pleiotropy scan, we demonstrated a genetic immune-inflammatory axis of CAD linked to pneumonia implicating ADAMTS7 and IL6R, and related genes. For both genes as well as NLRP3, genetic analyses were consistent with opposing causal effects on CAD and pneumonia. We also found significant interactions between the expression of ADAMTS7 and smoking status on incident CAD as well as pneumonia risk, indicating a cardioprotective effect without heighted pneumonia risk among nonsmokers. Our study provides a framework leveraging genetic pleiotropy to yield new biological and therapeutic insights.
Our approach uses a hypothesis-driven approach to examine immunity and CAD through concomitant analysis of pneumonia, a common co-morbid infectious disease. Observational epidemiology and translational research efforts have led to significant progress in improving the understanding of the pathophysiology underlying CAD during the past decades17, 18. However, despite effective treatment for traditional risk factors such as lipids and hypertension, CAD remains the leading causes of mortality and DALYs among population above age 501. Recent pandemics of influenza, COVID-19, and other respiratory infections brought to the forefront their complex interplay with atherosclerosis, highlighting the potential separate risk of immune-inflammation on CAD2, 3. Probing the genetic correlation between CAD and pneumonia revealed the involvement of the immune-inflammatory genes ADAMTS7 and IL6R.
Our results may have important implications for the prevention of CAD. First, smoking status may provide important information towards precision medicine immunomodulatory approaches for CAD. ADAMTS7 belongs to the ADAMTS family of secreted metalloproteinases characterized with at least one thrombospondin type I repeat domains19, 20. Genome wide association studies (GWAS) previously revealed and independently replicated associations between multiple genetic variants at the ADAMTS7 locus and susceptibility to CAD21, 22, 23, among which rs3825807 represents an A to G coding SNP, resulting in replacement of Ser214 by Pro thus affecting protein function23. ADAMTS7 has also been validated as the causal gene by both in vitro and in vivo experiments24, 25. Expression of ADAMTS7 in vascular smooth muscle cells (VSMCs) rises with inflammation but not with hyperlipidemia25. ADAMTS7 concentrations in human atherosclerotic plaque link to histological characteristics of plaque instability26. Our study demonstrated that ADAMTS7 participates in CAD and pneumonia but in opposing directions. Prior analyses indicated that ADAMTS7 is an essential gene for influenza A viral replication27. Mice deficient for ADAMTS7 may also have greater influenza viral replication. Moreover, not only did we independently observe that ADAMTS7 p.Ser214Pro confers CAD protection in never-smokers and not those who ever smoked28, we also found that ADAMTS7 p.Ser214Pro significantly increased risk of pneumonia only among ever-smokers but not never-smokers. These findings indicate that therapeutic targeting of ADAMTS7 may be most effective in never-smokers – i.e., greater decrease in CAD risk without increased pneumonia risk.
Second, our approach aims to genetically prioritize therapeutic targets for CAD in the NLRP3/IL1B/IL6 pathway. Previously, the common IL6R p.Asp358Ala was shown to be associated with reduced cardiovascular disease risk16, 29. The Canakinumab Anti-inflammatory Thrombosis Outcomes Study (CANTOS) showed that, compared to placebo, patients with CAD and elevated high sensitivity C-reactive protein concentrations had reduced cardiovascular disease risk with canakinumab, a monoclonal antibody targeting IL1B10. Furthermore, the therapeutic effect was closely tied to the extent of IL6 concentration reduction30. Consistent with our human genetic observations, canakinumab versus placebo yielded a higher incidence of fatal infection in CANTOS10. Currently, there are multiple inhibitors of IL6 and NLRP3 that are in clinical development31, 32. We now provide human genetic validation for NLRP3 inhibition and CAD risk. However, our analyses indicate the possibility of increased infection risk as observed in CANTOS. While our germline genetic analyses model organism-wide perturbations, pharmacologic specificity may overcome such anticipated on-target adverse effects. In secondary analyses of interacting proteins, we observed that decreased DHX33 gene expression was associated with decreased risk for both CAD and pneumonia. Therefore, alternate related targets may more optimally modulate immunity for both CAD and infectious disease.
The current study is strengthened by our robust design for genome-wide pleiotropy scan and causal relationship examinations: (1) in addition to having statistically significant evidence on pleiotropy, defined as pPLACO<5 × 10−8, only SNPs whose associations with CAD and pneumonia replicated in independent cohorts were carried forward for further steps; (2) leveraging eQTL data from both GTEx and eQTLGen, our causal inference results of IL6R and ADAMTS7 on CAD and pneumonia are consistent across tissues and data resources; (3) sensitivity analysis with applying multiple MR methods on both cis- and trans-, and cis-only eQTL data also demonstrate the robustness of our findings. Our study also has limitations. (1) Due to the unavailability of valid instruments for measurement of ADAMTS7 in whole blood, the causal inferences for ADAMTS7 and its interacting genes were conducted using eQTL data of tibial artery from GTEx, which has smaller sample size and only cis-eQTL data versus the IL6R related analyses which used data from the eQTLGen consortium. (2) Our gene-smoking interaction exploration dichotomized smoking status to ever-smoker and never-smoker without examining for the potential dose-response of smoking on disease risk due to the large number of missingness of corresponding variables in the UK Biobank. Heterogeneity among ever-smokers may exist. (3) While our human genetic study provides strong causal inference, perturbation in controlled model systems and prospective human randomized controlled trials are necessary to definitively establish causality.
In conclusion, leveraging novel statistical methods, eQTL data from multiple tissues and resources, together with large-scale population cohort, we demonstrated genetic immune-inflammatory axes of CAD linked to respiratory infections implicate ADAMTS7 and IL6R, and related genes.
Methods
Study population
The UK Biobank is a prospective cohort consists of approximately 500,000 adult participants recruited between 2006 and 2010 from 22 assessment centers across the United Kingdom33. Our analysis included 450,899 unrelated individuals in the UK Biobank. Samples were excluded for the following reasons: genotypic sex did not match reported sex, kinship was not inferred, putative sex chromosome aneuploidy, consent withdrawn, or excessive heterozygosity or missingness, based on centralized sample quality control performed by UK Biobank. Excluded related individuals were defined as one individual in each pair within the third degree of relatedness determined based on kinship coefficients centrally calculated by UK Biobank 33. Outcomes of this analysis were incident CAD and pneumonia: CAD was defined as self-reported, hospitalization with, or death due to myocardial infarction; or hospitalization with OPCS4 (Office of Population Censuses and Surveys) codes for coronary artery bypass grafting (K40, K41, and K45) or coronary angioplasty with or without stenting (K49, K50.2, and K75). Pneumonia was defined as self-reported, hospitalization with, or death due to pneumonia. Prevalent cases were removed when analyzing the corresponding incident outcomes.
Mass General Brigham Biobank is an ongoing hospital-based cohort study of patients across the Mass General Brigham (formerly Partners Healthcare) health system. It is enriched with longitudinal electronic medical records (EMR) data, genomic data, and electronic health and lifestyle survey data34. The present secondary analyses were approved by the Massachusetts General Hospital Institutional Review Board.
Genome-wide pleiotropy search
We applied PLeiotropic Analysis under COmposite null hypothesis (PLACO) to conduct a genome wide search for SNPs influencing risks of both CAD and pneumonia. Briefly, PLACO is a novel statistical method for identifying pleiotropic loci between two traits through testing the composite null hypothesis that a locus is associated with zero or one of the traits. It only needs summary statistics as input, and the test statistic is formed as the product of the Z statistics of the SNP in each of the two studies that assumed to follow a mixture distribution35. Input genome-wide summary statistics for CAD were from CARDIoGRAMplusC4D (Coronary ARtery DIsease Genome wide Replication and Meta-analysis (CARDIoGRAM) plus The Coronary Artery Disease (C4D) Genetics) 201511 and those for pneumonia were from FinnGen R436. CARDIoGRAMplusC4D 2015 is a meta-analysis of GWAS studies including 60,801 CAD case and 123,504 control participants with mainly European, South Asian, and East Asian ancestries11. The FinnGen Study is a Finnish, nationwide GWAS meta-analysis that linked with EMR data from national health registries. In R4, there were 20389 pneumonia cases defined as phenotype ICD10-J10 pneumonia and 156510 controls36. There were no overlap of study populations underlying the CAD and pneumonia GWAS. We removed SNPs with Z2 > 80 as extremely large significance may disproportionately influence the analysis35, 37. For pleiotropic SNPs with statistically significant evidence of genome-wide pleiotropy, defined as PPLACO < 5 × 10−8, we independently assessed for nominal association with both CAD and pneumonia in a meta-analysis of UK Biobank38 and Mass General Brigham Biobank. We clumped variants in ±500 Kb radius and linkage disequilibrium (LD) threshold of r2>0.2 into a single genetic locus using FUMA (SNP2GENE function, v1.3.6a). The gene annotations for all loci are based on proximity to the most significant/lead SNPs as mapped by FUMA39.
Genetic relations examination
We evaluated the genetic relations of tissue-specific expression of genes annotated to the pleiotropic SNPs on CAD and pneumonia using MR toward causal inference. eQTL data of 13 tissues relevant to CAD and pneumonia, i.e., whole blood, spleen, small intestine terminal ileum, lung, liver, heart left ventricle, heart atrial appendage, artery tibial, artery coronary, artery aorta, adrenal gland, adipose visceral omentum, and adipose subcutaneous, was obtained from the GTEx project (v8; https://gtexportal.org/home/). These analyses were limited to cis-eQTL, which were defined as variants within 1 Mb of the transcription start site (TSS) of each gene, given data availability12. Summary statistics of CAD were from meta-analysis of CARDIoGRAMplusC4D and UK Biobank40 and that of pneumonia were from meta-analysis of FinnGen R436 and UK Biobank38.
We extended the causal relation evaluations for both identified pleiotropic genes to their interacting genes identified through STRING, a protein-protein interaction database, (v11; https://string-db.org/) using same MR approaches. For index pleiotropic gene(s) that are expressed in whole blood, we used eQTL data derived from whole blood in >31,000 individuals from the eQTLGen consortium (http://www.eqtlgen.org) for evaluating their and their interacting genes’ causal effects on CAD and pneumonia. As the eQTLGen consortium provided summary statistics of both cis- and trans-eQTL results, we included both cis- and trans-eQTL for primary analysis to maximize power and because recent studies showed that only a modest fraction of trait-heritability can be explained by cis-mediated bulk gene-expressions41. We also conducted sensitivity analysis with using cis-eQTL results only to address potential pleiotropy. As the eQTLGen consortium only included eQTL data derived from whole blood, for other index pleiotropic gene(s) and their interacting genes not highly expressed in whole blood, we used eQTL data of tibial artery, due to similarity to coronary artery, from the GTEx project. As the GTEx project only included cis-eQTL data, we used cis-eQTLs for primary analysis.
For both sets of causal relation analyses, we selected instruments from each dataset using independent (r2 < 0.05) SNPs that met genome-wide significance (P < 5 × 10−8). We applied two-sample MR to estimate the causal relationship between eQTLs and CVD and pneumonia using the MR-Base R package (version 0.5.5)42. For genes that were instrumented with a single SNP, Wald ratio MR method was used; for those with more than one valid instrument, we conducted MR analyses using inverse variance–weighted fixed-effects (IVW-FE) as the primary method43, as well as summary-data-based MR (SMR) method which is less prone to false-positive findings and other alternative MR methods (i.e., MR-Egger, weighted median, weighted mode, or MR RAPS) that may be more powerful under various model assumptions for sensitivity analyses44, 45, 46, 47, 48. As the use of SMR requires eQTL data in BESD format and provided GTEx v7 eQTL data in that format, we used GTEx v7 eQTL data when using SMR. A reference panel of European individuals from the phase 3 1000 Genomes project was used to compute linkage disequilibrium (LD) estimation for all analyses49.
Gene-environment interaction examination in human
Based on existing literature, we identified one common (allele frequency > 30%) disruptive coding mutations of each pleiotropic gene as a genetic proxy for the function inhibition of that gene – ADAMTS7 p.Ser214Pro and IL6R p.Asp358Ala. We performed a prospective, time-to-event analysis for the incident CAD and pneumonia outcomes in the UK Biobank stratified by the status of the instrument for each pleiotropic gene. Cox proportional hazard models were adjusted for age, age2, sex, race, first 10 genetic principal components (PCs), diagnosis of type 2 diabetes mellitus, and genotyping array. To test for proportional hazards assumption for the genetic instruments, we examined whether their corresponding scaled Schoenfeld residuals were independent of follow-up time and found no statistically significant evidence suggesting violation of the assumption. Race was determined by self-reported ancestry and followed by outlier detection based on genetic PCs centrally calculated by UK Biobank. Since smoking represents a key shared risk factor for both CAD and pneumonia, we evaluated the interaction between smoking status and each instrument on the primary outcomes. Analyses used R version 3.6.2 software (The R Foundation), two-tailed P-values, as well as statistical significance level of P < 0.05 except for the identification of genome-wide signals, which was P < 5 × 10−8.
Data Availability
UKB individual-level data are available for request by application (https://www.ukbiobank.ac.uk). Individual-level MGBB data are available from https://personalizedmedicine.partners.org/Biobank/Default.aspx, but restrictions apply to the availability of these data. All the summary-level data used are available for download at the public repositories.