Abstract
Genetic risk factors are occasionally shared between different neurodegenerative diseases. Previous studies have linked ANG, a gene encoding angiogenin, to both Parkinson’s disease (PD) and amyotrophic lateral sclerosis (ALS). Functional studies suggest ANG plays a neuroprotective role in both PD and ALS by reducing cell death. We further explored the genetic association between ANG and PD by analyzing genotype data from the International Parkinson’s Disease Genomics Consortium (IPDGC) (14,671 cases and 17,667 controls) and whole genome sequencing (WGS) data from the Accelerating Medicines Partnership - Parkinson’s disease initiative (AMP-PD, https://amp-pd.org/) (1,647 cases and 1,050 controls). Our analysis did not replicate the findings of previous studies and found no significant association between ANG variants and PD risk.
Introduction
Parkinson’s disease (PD) is a neurodegenerative disease characterized by loss of dopaminergic neurons in the substantia nigra, leading to symptoms of tremor, rigidity and slowed movement, and is the second most common neurodegenerative disease in the world. Both sporadic and familial forms of PD exist, and much work has been done to identify the environmental and genetic risk factors behind this disease. Over 20 genes have been associated with PD or parkinsonism in recent years, and the largest genome-wide association studies for PD risk have identified 92 PD risk variants across 80 loci, explaining 16-36% of the heritable risk of PD (Blauwendraat et al., 2020; Foo et al., 2020; Nalls et al., 2019).
It is not uncommon to find genetic variations associated with multiple neurodegenerative disorders, suggesting shared pathways between diseases (Tan et al., 2019). For example, common variations in MAPT have been associated with PD (Zabetian et al., 2007), amyotrophic lateral sclerosis (ALS) (Karch et al., 2018) and Alzheimer’s disease (AD) (Ferrari et al., 2017), and variations in GBA have been associated with PD (Sidransky et al., 2009) and Gaucher disease (Riboldi and Di Fonzo, 2019). Therefore, the interrogation of genes common to multiple neurodegenerative disorders is a logical next step in the identification of novel PD risk variants.
One such candidate exists in ANG, a gene thought to confer a large risk for both ALS and PD (Rayaprolu et al., 2012; van Es et al., 2011). However, studies in Asian populations have suggested there is no link between ANG variants and PD (Chen et al., 2014; Liu et al., 2013). ANG encodes angiogenin, a small protein that plays a role in the angiogenesis pathway, which forms new blood vessels. Angiogenin and its related pathway are thought to play a role in cancer and placental development (Amankwah et al., 2012; Pavlov et al., 2014). An in vitro study has shown that angiogenin has a neuroprotective effect on motor neurons (Subramanian et al., 2008). ALS associated ANG variants are suggested to potentiate neuronal death through inhibition of the PI3K-Akt pathway (Kieran et al., 2008). A PD mouse model has also shown this gene has a neuroprotective effect on dopaminergic neurons (Steidinger et al., 2011). This neuroprotective effect is suggested to be lost when ANG is mutated, decreasing the viability of motor neurons (Wu et al., 2007). These findings are of relevance because PD is characterized by the loss of dopaminergic neurons and ALS is characterized by the loss of motor neurons. Interestingly, angiogenin levels have been found to be elevated in the blood serum of ALS patients, but not in PD patients (van Es et al., 2014). This suggests angiogenin may play a larger role elsewhere, such as in the basal ganglia, a brain structure often associated with PD. Structural work has shown ten ANG coding variants are associated with a decrease in angiogenin activity, and one coding variant, p.Arg145Cys, is associated with an increase in activity (Bradshaw et al., 2017).
To date, ANG variants have not been associated with either ALS or PD through genome-wide association studies (GWAS) (Nalls et al., 2019; Nicolas et al., 2018), despite previous studies suggesting ANG is associated with risk for these diseases (Rayaprolu et al., 2012; van Es et al., 2011). Here we examine ANG variants in two large PD datasets to assess whether they contribute to PD risk in individuals of European ancestry.
Methods
We mined whole-genome sequencing (WGS) data from the Accelerating Medicines Partnership - Parkinson’s disease initiative (AMP-PD, https://amp-pd.org/), which included 1,647 cases and 1,050 healthy controls from cohorts including the Fox Investigation for New Discovery of Biomarkers (BioFIND), the Parkinson’s Progression Markers Initiative (PPMI), the Harvard Biomarker Study (HBS), and the Parkinson’s Disease Biomarkers Program (PDBP). We also examined ANG variants in genotype data from the International Parkinson’s Disease Genomics Consortium (IPDGC), which included 14,671 cases and 17,667 healthy controls. Variants from both datasets were annotated using ANNOVAR (Wang et al., 2010). Variant frequencies in non-Finnish European populations were obtained from the hg38 gnomAD v3.0 dataset (Karczewski et al., 2020). PLINK 1.9 was used to perform Fisher’s exact test to identify significant variants (Purcell et al., 2007). Rare variant burden tests were performed using RVTESTS (Zhan et al., 2016). We further analyzed existing summary statistics, including the latest GWAS meta-analyses for PD risk and age at onset (Blauwendraat et al., 2019; Nalls et al., 2019), and additionally assessed public summary statistics from the most recent ALS GWAS (Nicolas et al., 2018).
Variants reported by amino acid change in previous studies, including van Es et al. and Rayaprolu et al., did not initially match any variants identified in AMP-PD data because of differences in nomenclature. To resolve this, we mapped each reported amino acid change to the angiogenin protein sequence. This sequence was obtained from Ensembl using the ANG-201 ENST00000336811.10 transcript (Yates et al., 2020). We found that the amino acid positions reported by van Es et al. and Rayaprolu et al. were offset by 24 or 25 residues because of differences in how the signal peptide sequence was numbered, and we accounted for this in our analysis. The code we used for analysis is available on the IPDGC GitHub (https://github.com/ipdgc/IPDGC-Trainees/blob/master/ANG.md).
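The renumbering step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code in the IPDGC repository: the function names and the default offset of 24 are assumptions, and the inputs K17I and I46V are used only because, with an offset of 24, they correspond to the K41I and I70V changes listed in the Results. Whether a given published label needs 24 or 25 has to be checked against the protein sequence.

```python
import re

# Offset between mature-protein and precursor numbering. 24 is an assumed
# default; the text reports that 24 or 25 was needed depending on the study.
DEFAULT_OFFSET = 24

# Matches labels such as "K17I", "p.I70V", "p.Arg145Cys" or "p.G110=".
VARIANT_PATTERN = re.compile(r"^(?:p\.)?([A-Za-z]+?)(\d+)([A-Za-z]+|=|\*)$")


def parse_label(label):
    """Split a protein-level label into (reference, position, alternate)."""
    match = VARIANT_PATTERN.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized variant label: {label!r}")
    ref, position, alt = match.groups()
    return ref, int(position), alt


def to_precursor_numbering(label, offset=DEFAULT_OFFSET):
    """Shift a mature-protein label by the signal peptide offset."""
    ref, position, alt = parse_label(label)
    return f"p.{ref}{position + offset}{alt}"


def reference_residue_matches(sequence, label):
    """Check a single-letter label against the residue found in a sequence.

    `sequence` is the full precursor protein sequence (1-based positions).
    Three-letter labels are not handled by this check.
    """
    ref, position, _ = parse_label(label)
    return 1 <= position <= len(sequence) and sequence[position - 1] == ref.upper()


if __name__ == "__main__":
    for published in ("K17I", "I46V"):
        print(published, "->", to_precursor_numbering(published))
```

Running the residue check against the Ensembl protein sequence for each candidate offset is one way to decide between 24 and 25 for a given study: only the correct offset should place the reported reference residue at the shifted position.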
Results
We identified a total of 168 ANG variants in the AMP-PD WGS data. Nine of these variants were coding: two were synonymous and seven were nonsynonymous. The most significant variant in Fisher’s exact test (p=0.017) did not remain significant after Bonferroni correction for multiple testing (threshold p=0.05/168=2.98E-4) (Supplementary Table 1).
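To make the single-variant test and the multiple-testing threshold concrete, the sketch below computes a two-sided Fisher's exact p-value for a 2x2 table of allele counts and the Bonferroni threshold for 168 tests. It is a standard-library illustration, not PLINK's implementation, and the allele counts in the usage example are hypothetical values chosen only to show the call.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def table_probability(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    observed = table_probability(a)
    probabilities = [table_probability(x) for x in range(lowest, highest + 1)]
    # Sum every table that is no more likely than the observed one; the small
    # tolerance guards against floating-point ties.
    p_value = sum(p for p in probabilities if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


if __name__ == "__main__":
    threshold = bonferroni_threshold(0.05, 168)
    print(f"Bonferroni threshold: {threshold:.3g}")
    # Hypothetical counts: minor/major alleles in cases, then in controls.
    p = fisher_exact_two_sided(12, 3282, 2, 2098)
    print(f"p = {p:.3g}, significant: {p < threshold}")
```

With this threshold, the reported top p-value of 0.017 is well above the cutoff, which is why it is not considered significant.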
We compared the nine identified ANG coding variants to variants from two other studies (Rayaprolu et al., 2012; van Es et al., 2011) (Supplementary Table 2). All nonsynonymous variants were rare (MAF<0.01). Allele frequencies did not differ significantly from gnomAD non-Finnish European allele frequencies, although most variants were too rare to test reliably on their own.
After excluding the two synonymous variants, rs11701 (p.G110=) and rs2228653 (p.T121=), from these nine, we observed a frequency of 0.39% in PD cases and 0.48% in controls. Van Es et al. also removed two common variants, rs121909536 (p.K41I) and rs121909541 (p.I70V), from their analysis. After removing these variants from our data as well, the frequencies were 0.15% in PD cases and 0.19% in controls. This is in contrast with the previously reported 0.45% in PD cases and 0.04% in controls (van Es et al., 2011).
Burden tests restricted to variants with a minor allele frequency below 0.03 gave no significant results when all variants were included (N variants=72; CMC p=0.493, Fp p=0.509, MB p=0.880, Skat p=0.454, SkatO p=0.523, Zeggini p=0.395). Likewise, there were no significant results when the same tests were run on coding variants only (N variants=9; CMC p=0.866, Fp p=0.510, MB p=0.820, Skat p=0.436, SkatO p=0.556, Zeggini p=0.868).
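The idea behind a collapsing burden test can be illustrated with a short sketch: variants below the frequency threshold are collapsed into a single per-sample carrier indicator, and carrier counts are then compared between cases and controls. This is a simplified, CMC-style illustration and not the RVTESTS implementation. The function names are hypothetical, genotypes are assumed to be coded as 0/1/2 copies of the minor allele, and the toy data use a deliberately high threshold because five samples cannot produce a frequency below 0.03.

```python
def allele_frequency(genotypes):
    """Frequency of the counted allele from 0/1/2 genotype codes."""
    return sum(genotypes) / (2 * len(genotypes))


def collapse_rare_variants(genotype_matrix, maf_threshold=0.03):
    """Collapse rare variants into one 0/1 carrier indicator per sample.

    `genotype_matrix` is a list of per-sample rows, one column per variant.
    Returns the carrier indicators and the indices of the variants used.
    """
    n_variants = len(genotype_matrix[0])
    rare_columns = [
        j for j in range(n_variants)
        if 0 < allele_frequency([row[j] for row in genotype_matrix]) < maf_threshold
    ]
    carriers = [int(any(row[j] > 0 for j in rare_columns)) for row in genotype_matrix]
    return carriers, rare_columns


def carrier_table(carriers, is_case):
    """2x2 counts: (case carriers, case non-carriers, control carriers, control non-carriers)."""
    n_cases = sum(1 for case in is_case if case)
    n_controls = len(is_case) - n_cases
    case_carriers = sum(c for c, case in zip(carriers, is_case) if case)
    control_carriers = sum(c for c, case in zip(carriers, is_case) if not case)
    return (case_carriers, n_cases - case_carriers,
            control_carriers, n_controls - control_carriers)


if __name__ == "__main__":
    # Toy data: 5 samples x 2 variants (first variant rare, second common).
    genotypes = [[1, 1], [0, 1], [0, 1], [0, 0], [0, 2]]
    status = [True, True, False, False, False]
    carriers, used = collapse_rare_variants(genotypes, maf_threshold=0.2)
    print("variants used:", used)
    print("carrier table:", carrier_table(carriers, status))
```

The resulting 2x2 table can be passed to any test of association, and dividing the carrier counts by the group sizes gives carrier frequencies of the kind reported for cases and controls.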
Twenty-six ANG variants were found in the IPDGC imputed genotype data, all of which were non-coding (Supplementary Table 1). No significant association between ANG variants and PD risk (Figure 1A) or age at onset (Supplementary Figure 1) was found in data from the latest PD risk GWAS or the PD age at onset GWAS (Blauwendraat et al., 2019; Nalls et al., 2019). No variants had a minor allele frequency less than 0.03, so the threshold was increased to 0.05 for burden tests. Only two variants were included at this threshold, and these also gave no significant results (N=2; CMC p=0.893, Fp p=0.960, MB p=0.948, Skat p=0.980, SkatO p=1, Zeggini p=0.842). Additionally, no GWAS signal of interest was identified in the most recent ALS GWAS (Figure 1B) (Nicolas et al., 2018).
The -log10(p-value) of variants in or near ANG is shown on the y-axis, and the base-pair position of each variant is on the x-axis. P-values are taken from the PD risk GWAS (Figure 1A) and the ALS risk GWAS (Figure 1B). Variants are colored by their linkage disequilibrium (R2) with the variant with the lowest p-value on this plot. Recombination rates are shown in blue (Pruim et al., 2010).
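As a small illustration of the y-axis scale used in these plots, the sketch below converts p-values to -log10(p). The genome-wide significance threshold of 5E-8 is the conventional GWAS cutoff and is an assumption here, since it is not stated in the text. The function name is hypothetical.

```python
import math

# Conventional genome-wide significance threshold (assumed, not from the text).
GENOME_WIDE_THRESHOLD = 5e-8


def neg_log10(p_value):
    """Convert a p-value to the -log10 scale used on association plots."""
    if not 0 < p_value <= 1:
        raise ValueError("p-value must be in the interval (0, 1]")
    return -math.log10(p_value)


if __name__ == "__main__":
    for label, p in [("top single-variant p-value", 0.017),
                     ("genome-wide threshold", GENOME_WIDE_THRESHOLD)]:
        print(f"{label}: p = {p:g}, -log10(p) = {neg_log10(p):.2f}")
```

On this scale, larger values indicate stronger evidence of association, so a variant would need to rise above the line for the chosen threshold to be considered a signal.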
Discussion
Rare coding variants in ANG have been reported to be associated with PD (van Es et al., 2011). Here, our goal was to further explore the role of ANG in PD by analyzing large datasets from IPDGC and AMP-PD. Our study shows no significant enrichment of ANG single variants in PD cases or controls in either of these datasets. Rare variant burden tests also gave no significant results for ANG. Our analysis provides no evidence to support the hypothesis that genetic variation of ANG plays a role in PD risk or age at onset.
The nine coding ANG variants we identified were from AMP-PD WGS data. This dataset included fewer samples (1,647 PD; 1,050 controls) than the van Es et al. study (6,471 ALS; 3,146 PD; 7,668 controls), which identified a total of 29 unique ANG coding variants. However, the frequency of ANG coding variants detected in the AMP-PD data is 0.15% in PD cases and 0.19% in controls, which differs from the previously reported 0.45% in PD cases and 0.04% in controls (van Es et al., 2011). Using the Genetic Association Study Power Calculator, we calculated a detectable genotype relative risk of 3.7 at a statistical power of 0.8 (Supplementary Figure 2) (Johnson and Abecasis, n.d.). The detectable genotype relative risk increased to 6.7 at a statistical power of 0.95. This is comparable to the PD odds ratio of 6.7 from previous studies, suggesting we have the statistical power required to replicate these findings (van Es et al., 2011). However, the cumulative frequency of ANG variants identified in AMP-PD data did not differ significantly between cases and controls, in contrast to what was previously reported. A larger sample size may be needed to identify the missing coding variants so the role of ANG in PD can be assessed on an even larger scale.
Overall, despite some potentially interesting functional experiments supporting the neuroprotective effect of angiogenin, we cannot replicate the genetic association between ANG coding variants and PD. Therefore, we cannot conclude that ANG variants play a role in PD, which is in line with previous studies done in Asian populations (Chen et al., 2014; Liu et al., 2013).
Data Availability
Data are provided in the supplementary tables, and the code used to generate them is available on our GitHub (https://github.com/ipdgc/IPDGC-Trainees/blob/master/ANG.md).
Conflicts of Interest
The authors declare that they have no conflict of interest.
Acknowledgements
We would like to thank all of the subjects who donated their time and biological samples to be a part of this study. We also would like to thank all members of the International Parkinson’s Disease Genomics Consortium (IPDGC). For a complete overview of members, acknowledgements and funding, please see http://pdgenetics.org/partners. This work was supported in part by the Intramural Research Programs of the National Institute of Neurological Disorders and Stroke (NINDS), the National Institute on Aging (NIA), and the National Institute of Environmental Health Sciences (NIEHS), all part of the National Institutes of Health, Department of Health and Human Services; project numbers 1ZIA-NS003154, Z01-AG000949-02 and Z01-ES101986. In addition, this work was supported by the Department of Defense (award W81XWH-09-2-0128), and The Michael J Fox Foundation for Parkinson’s Research. Data used in the preparation of this article were obtained from the AMP PD Knowledge Platform. For up-to-date information on the study, visit https://www.amp-pd.org. AMP PD – a public-private partnership – is managed by the FNIH and funded by Celgene, GSK, the Michael J. Fox Foundation for Parkinson’s Research, the National Institute of Neurological Disorders and Stroke, Pfizer, and Verily. We would like to thank AMP-PD for the publicly available whole-genome sequencing data, including cohorts from the Fox Investigation for New Discovery of Biomarkers (BioFIND), the Parkinson’s Progression Markers Initiative (PPMI), and the Parkinson’s Disease Biomarkers Program (PDBP). The Parkinson’s Disease Biomarker Program (PDBP) consortium is supported by the National Institute of Neurological Disorders and Stroke (NINDS) at the National Institutes of Health. A full list of PDBP investigators can be found at https://pdbp.ninds.nih.gov/policy. 
Harvard Biomarker Study (HBS) is a collaboration of HBS investigators (full list of HBS investigators found at https://www.bwhparkinsoncenter.org/biobank) and funded through philanthropy and NIH and Non-NIH funding sources. The HBS Investigators have not participated in reviewing the data analysis or content of the manuscript. This work utilized the computational resources of the NIH HPC Biowulf cluster (http://hpc.nih.gov).