Scalable Bayesian functional GWAS method accounting for multivariate quantitative functional annotations with applications to studying Alzheimer’s disease

Junyu Chen; Lei Wang; Philip L. De Jager; David A. Bennett; Aron S. Buchman; Jingjing Yang

doi:10.1101/2022.08.12.22278704

Abstract

Existing methods for integrating functional annotations in GWAS to fine-map and prioritize potential causal variants are either limited to using non-overlapped categorical annotations, or limited by the computation burden of modeling genome-wide variants. To overcome these limitations, we propose a scalable Bayesian functional GWAS method to account for multivariate quantitative functional annotations (BFGWAS_QUANT), accompanied by a scalable computation algorithm enabling joint modeling of genome-wide variants. Simulation studies validated the performance of BFGWAS_QUANT for accurately quantifying annotation enrichment and improving GWAS power. Applying BFGWAS_QUANT to study five Alzheimer’s disease (AD) related phenotypes using individual-level GWAS data (n=∼1K), we found that histone modification annotations have higher enrichment than eQTL annotations for all considered phenotypes, with the highest enrichment in H3K27me3 (polycomb regression). We also found that cis-eQTL in microglia had higher enrichment than eQTL of bulk brain frontal cortex tissue for all considered phenotypes. A similar enrichment pattern was also identified using the IGAP summary-level GWAS data of AD (n=∼54K). The strongest known APOE E4 risk allele was identified for all five phenotypes and the APOE locus was validated using the IGAP data. BFGWAS_QUANT fine-mapped 32 significant variants from 1073 genome-wide significant variants in the IGAP data. We further demonstrated that the polygenic risk scores (PRS) using effect size estimates by BFGWAS_QUANT had similar prediction accuracy as other methods assuming a sparse causal model. Overall, BFGWAS_QUANT provides a useful GWAS tool for quantifying annotation enrichment and prioritizing potential causal variants.

Introduction

Although thousands of significant associations have been identified by single variant genome-wide association studies (GWAS) for complex traits and diseases, majority GWAS signals reside in the noncoding genome regions and have unknown biological meaning^1-3. Existing GWAS results based on single variant tests are still difficult to interpret with respect to the underlying biological mechanisms^{4; 5}. Promisingly, advancements in sequencing technology have made plenteous multi-omics data available which provide functional annotations of genetic variants available to the scientific community –– Combined Annotation-Dependent Depletion (CADD) score⁶, Roadmap Epigenomics Mapping Consortium⁷ providing DNA methylation, histone modification, and chromatin accessibility information of various human tissues, the Encyclopedia of DNA Elements (ENCODE)⁸ providing functional information of human and mouse genomes, and the Genotype-Tissue Expression (GTEx) project⁹ providing expression quantitative trait loci (eQTL) information of 54 human tissues. In particular, molecular quantitative trait loci (QTL) mapped from profiles of molecular phenotypes (e.g., gene expression from GTEx⁹, chromatin marks from Roadmap^{7; 10; 11}, and protein abundances¹²) and corresponding genomes (genotype data) have been shown to be enriched with GWAS signals and help interpretate the underlying biology for studying complex traits and diseases^13-15. Especially, these molecular QTL have been leveraged to prioritize GWAS associations of complex traits and diseases^16-20.

An intuitive but widely used ad-hoc approach is to fine-map and prioritize potential causal GWAS signals that are also molecular QTL²¹, or located in a region with histone modifications. Recently, advanced statistical methods have been proposed to integrate non-overlapped categorical functional annotation (assigning one function label per variant) with GWAS data to fine-map GWAS results^22-26. Additionally, PAINTOR was proposed to integrate multivariate quantitative functional annotations with GWAS summary statistics to fine-map GWAS loci with thousands of variants^{27; 28}; FunSPU²⁹ and STAAR³⁰ were proposed to incorporate multiple biological annotations for rare variant association tests. These existing methods have shown the feasibility and promising results of integrating multivariate quantitative functional annotations with GWAS data to fine-map GWAS results and prioritize potential causal variants. However, these methods were not developed for joint modeling millions of genome-wide variants with multivariate quantitative annotations, which would lead to less accurate quantification of annotation enrichment and reduced power of fine-mapping.

Here, we propose a scalable Bayesian Functional GWAS method for integrating multivariate QUANTitative functional annotations with GWAS data by a Bayesian hierarchical variable selection regression model, referred to as BFGWAS_QUANT. BFGWAS_QUANT assumes a hierarchical logistic prior for the causal probabilities of genetic variants in the standard Bayesian Variable Selection Regression (BVSR) based GWAS method³¹ to jointly model millions of genome-wide genetic variants. BFGWAS_QUANT adapts the scalable Expectation Maximization Markov Chain Monte Carlo (EM-MCMC) algorithm developed by the previous Bayesian functional GWAS (BFGWAS) method that only models non-overlapped categorical annotations (referred to as BFGWAS_CAT in this paper)²². BFGWAS_QUANT further improves the computation efficiency by pre-calculating linkage disequilibrium (LD) correlation matrix and single variant test Zscore statistics that are used in the MCMC algorithm, or using reference LD correlation matrix and summary-level GWAS data. Bayesian Causal Posterior Probability (CPP) and genetic effect size estimates will be generated by BFGWAS_QUANT, along with enrichment quantification of considered multivariate quantitative functional annotations. Furthermore, Bayesian estimates of genetic effect sizes can be used to derive polygenic risk scores (PRS) that account for functional annotations.

By simulation studies, we showed that our Bayesian estimates of functional enrichment converged and GWAS power was improved over the standard BVSR method without accounting for functional annotations³¹. We then applied BFGWAS_QUANT to real GWAS data for studying Alzheimer’s disease (AD) related phenotypes^{32; 33}, accounting for multivariate quantitative annotations with respect to Roadmap histone modifications⁷ of brain mid-frontal gyrus and eQTL of brain frontal cortex tissue^{32; 34} and microglia cell type^{35; 36}. We showed that BFGWAS_QUANT identified interesting enrichment pattern and generated fine-mapped GWAS results using both individual-level and summary-level GWAS data. We also showed that, PRS derived from BFGWAS_QUANT effect size estimates led to similar accurate AD risk prediction as other PRS methods assuming a sparse causal model.

In the following Material and Methods section, we describe an overview of the BFGWAS_QUANT method, Roadmap histone modification and eQTL based quantitative functional annotations, simulation study design, and application studies of AD. In the Results section, we describe the results of simulation and application studies. We then end with a Discussion about the advantages and limitations of BFGWAS_QUANT.

Material and Methods

Bayesian hierarchical variable selection regression model

BFGWAS_QUANT assumes a Bayesian hierarchical variable selection regression (BVSR)³¹ model for genome-wide variants, where y_n×1 is a vector of standardized phenotype with n subjects; X_n×p is the standardized genotype matrix with p genome-wide genetic variants and β_p×1 is the vector of the genetic effect sizes. Spike-and-slab variable selection prior is assumed per effect size β_i. That is, β_i has probability π_i to be non-zero and follows a normal distribution centered at zero and probability (1 − π_i) to be zero with a point-mass density function at 0, where π_i denotes the “casual” probability of the ith variant.

We assume a hierarchical logistic model for the causal probabilities of genetic variants to account for multivariate quantitative functional annotations, where π_i denotes the casual probability of the ith variant as in Equation 1, A_i = (1, A_i1, …, A_iJ)′ denotes the augmented annotation vector for the ith variant with an intercept term as the first element, and coefficient vector α = (α₀, α₁, …, α_J)′ denotes the intercept term α₀ and enrichment quantification with respect to functional annotation j = 1, …, J.

Further, we assume a fixed value in the domain of (0, 1] for τ_β, a fixed value for α₀ in the domain of (−13.8, -9), and a standard normal prior for enrichment parameters (α_j∼N(0, 1); j = 0, 1, … J). In particular, τ_β = 1 would assume the prior variance of effect sizes is the same as the marginal effect size estimates in a single variant regression model, and smaller τ_β values would inflate the magnitude of Bayesian effect size estimates. The lower bound value of α₀ = −13.8 would assume the prior causal probability is 10⁻⁶ when α_j = 0, j = 1, …, J (see Supplementary Text for model details).

Adapted EM-MCMC algorithm

To overcome heavy computational burden and poor mixing rate of posterior samplings by the standard MCMC algorithm³¹, we adapt the scalable EM-MCMC algorithm developed for the original BFGWAS method²². Specifically, we first segment genome-wide variants into approximately independent genome blocks with about 5,000∼10,000 variants based on LD structure^{22; 37}. Second, conditioning on given hyper parameters (α_j, j = 1, …, J), we conduct standard MCMC algorithm within each genome block to obtain Bayesian posterior estimates for genetic effect size (β_i) and causal probability (π_i) per variant (Expectation (E) step). Third, conditioning on the Bayesian estimates of (β_i, π_i, i = 1, …, p), we will update the values of hyper parameters (α_j, i = 1, …, J) by maximizing their conditional posterior likelihood (Maximization (M) step). Computational optimization algorithm Broyden-Fletcher-Goldfarb-Shanno (BFGS)³⁸ is used to obtain the maximum a posteriori (MAP) estimates for α_j, j = 1, …, J. The EM steps will be iterated (∼5 iterations) until the estimates of hyper parameters converge.

The Bayesian estimate for the causal probability (π_i) per variant is referred as CPP, and the Bayesian estimates for annotation coefficients (α_j, j = 1, …, J) are referred as quantified enrichment of multivariate functional annotations. SNPs with CPP > 0.1068 will be considered as significantly associated with the phenotype of interest, where the significance threshold was shown equivalent to p-value < 5 × 10⁻⁸ by previously BFGWAS_CAT paper²². Because a multivariate regression model is fitted per genome-block, the LD among all variants per genome-block is accounted for during the Bayesian inference of CPP and effect size β. The GWAS results obtained by BFGWAS_QUANT will be fine-mapped and variants with enriched annotations will be prioritized.

In particular, implementing the MCMC algorithm per genome-block can greatly reduce the search space (from genome-wide to a genome-block) and facilitate parallel computing (one core per genome-block), thus leading to efficient convergence rate and improved mixing rate. Computation efficiency is further improved by implementing the MCMC algorithm using pre-calculated LD correlation matrix per genome-block and single variant test Zscore statistics, or reference LD correlation matrix and GWAS Z-score statistics, which will save up to 90% computation time compared to using individual-level GWAS data³⁷. With 32 computation cores in one node, BFGWAS_QUANT can complete analyzing ∼10M SNPs in ∼4hours for 5 EM iterations.

eQTL based functional annotations

In this paper, we considered 5 real eQTL based quantitative functional annotations. Three of these annotations (Allcis-eQTL, 95%CredibleSet, MaxCPP) were constructed based on standard cis-eQTL data of brain frontal cortex tissue from GTEx data^{9; 34}: i) Binary annotation Allcis-eQTL was constructed by taking all SNPs that were identified as a significant cis-eQTL (false discovery rate (FDR) <5%; 1MB from transcription start site (TSS)) for at least one expression quantitative trait (one gene) as 1 and otherwise as 0; ii) For each gene expression trait that has at least one significant (FDR<5%) cis-eQTL, CAVIAR¹⁶ was used to calculate the causal posterior probability (cis-CPP) of each cis-SNP and identify a 95% credible set. SNPs that do not belong to any 95% Credible Set were taken as 0 or otherwise 1 for the annotation of 95%CredibleSet. iii) Maximum cis-CPP per SNP across all genes were taken as quantitative values of MaxCPP. In addition, we took the maximum Bayesian Genome-Wide CPP of being either cis- or trans-eQTL across all genes in brain frontal cortex tissue from the Religious Orders Study and Rush Memory and Aging Project (ROS/MAP)^{21; 33} as the fourth annotation BGW_MaxCPP, where cis- and trans-CPPs were estimated by Bayesian genome-wide TWAS (BGW-TWAS) method^{32; 37}. Lastly, we derived a fifth Microglia-eQTL annotation from two datasets of recent Microglia cell-type specific eQTL summary statistics^{35; 39}, where 1 indicates being identified as a cis-eQTL to any gene in either Microglia dataset or otherwise 0.

Histone modifications based functional annotation

We constructed 5 histone modification based functional annotations using the epigenomics data of core histone modifications in the brain mid frontal gyrus region from the ROADMAP Epigenomics database⁷ –– H3K4me1 (primed enhancers), H3K4me3 (promoters), H3K36me3 (gene bodies), H3K27me3 (polycomb regression) and H3K9me3 (heterochromatin). For each histone modification, peak regions from replicates of the same sample were first merged together and then overlapped with peak regions of other samples by bedtools (v 2.27.0)⁴⁰. If a genetic variant resides in the overlapped peak regions of a histone modification, 1 would be assigned to the function annotation of such modification for this variant or 0 would be assigned otherwise.

Simulation study design

We conducted simulation studies to validate the performance of BFGWAS_QUANT. Continuous phenotypes were simulated using the real whole genome sequence (WGS) data of Chromosomes 19 (122,745 SNPs with minor allele frequency (MAF) >0.01) for n=1,893 samples from the ROS/MAP cohort^{33; 41} and Mount Sinai Brain Bank (MSBB) study⁴². Phenotypes were simulated based on the multivariate linear additive model (Equation 1) with true genetic effect sizes β_p×1 generated based on the hierarchical logistic model with multivariate quantitative functional annotations (Equation 2). Scenarios with various numbers of true causals and heritability were considered.

Besides real cis-eQTL based functional annotations of Allcis-QTL, 95%CredibleSet, MaxCPP, we also considered a fourth artificial annotation randomly generated from N(0, 1) as a negative control. With chosen annotation enrichment parameters (α₀ = (−10.5, −9.5), α₁ = 4, α₂ = 1.5, α₃ = 0.5, α₄ = 0), we first calculated casual probabilities (π_i) for all considered p=122,745 SNPs by Equation 2, where α₀ was chosen to ensure the total number of true causal SNPs fall in (5, 10) with α₀ = −10.5 or (15, 30) with α₀ = −9.5. Second, a vector of binary indicator (γ_i) of true causal SNPs was generated from the corresponding Bernoulli distribution with probability (π_i) for i = 1, …, p. Third, genetic effect sizes were taken as 0 for SNPs with γ_i = 0 and generated from a normal distribution for SNPs with γ_i = 1. Finally, phenotypes were generated from Equation 1 with simulated genetic effect sizes and random errors ϵ∼N(0, (1 − h²)I), to ensure a target total heritability h² = (0.25, 0.5) equally explained by all true causal SNPs. A total of four scenarios were considered, including one with relatively sparse true causals in the range of (5, 10) and one with the number of true causals in the range of (15, 30), with respect to two different heritability values (0.25, 0.5).

In addition, we considered a null enrichment scenario where none of the annotations were enriched. In this scenario, we randomly selected 10 true causal SNPs and assigned them genetic effect sizes generated from a normal distribution. We simulated the phenotype as described above with a targeted h²=0.5.

We repeated 100 simulations per scenario to evaluate our Bayesian estimates for annotation enrichment, total heritability, and true causal SNPs with respect to sensitivity (power) and positive predictive values (PPV)⁴³. Sensitivity (power) is defined as the proportion of true positive findings among all true casual variants, and PPV is defined as the proportion of true positive findings among all identified significant associations. We took simulated true casual SNPs and those having R² > 0.3 with true casual SNPs with as true positive findings, following the significance rule used by the BFGWAS_CAT method²². The sensitivity and PPV are given by

We compared with the standard BVSR method⁴⁴ that does not account for functional annotations. We estimated the total heritability by the squared correlation between the simulated phenotypes and the PRSs based on Bayesian estimates of genetic effect sizes of SNPs with ,

Applications to ROS/MAP individual-level GWAS data

ROS/MAP are two prospective longitudinal community-based cohort studies, which recruit older adults without known dementia at baseline and follow participants up annually until the time of death^{33; 41; 45}. Participants all agreed to annual clinical evaluation and brain autopsy at the time of death, signing an informed consent and Anatomic Gift Act. All participants in this study also signed a repository consent to allow their data to be re-purposed. WGS data were profiled for 1200 samples by using the KAPA Hyper Library Preparation Kit and Illumina HiSeq X sequencer.

We applied BFGWAS_QUANT to account for five eQTL based and five histone modifications based quantitative annotations as described above to study five AD related phenotypes^{9; 32-34}, including the binary clinical diagnosis of late-onsite Alzheimer’s dementia (n=1087), three quantified postmortem pathology indices of AD (i.e., PHFtau tangle density with n=1105, β-amyloid load with n=1113, and a global measurement of AD pathology burden with n=1123), and a quantitative measurement of cognition decline rate with n=1049^{33; 41}. The cognition decline rate was constructed as the random slope per sample from a linear mixed model of annual longitudinal measurements of cognition function. Details about ROS/MAP and phenotypic traits were measured can be found in previously published paper⁴⁶. We also adjusted for the covariates of age, sex, smoking status, study index (ROS or MAP), and first 3 genotype principal components, by regressing these covariates out from the phenotypes and taking the corresponding regression residuals as the outcome in the BFGWAS_QUANT method.

Applications to IGAP summary-level GWAS data

We applied BFGWAS_QUANT to the stage 1 summary-level GWAS data of the International Genomics of Alzheimer’s Project (IGAP)⁴⁷, along with above 10 eQTL and histone modifications based functional annotations and reference LD generated from ROS/MAP. The stage 1 IGAP summary-level GWAS data were generated by meta-analyses consisting of 17,008 AD cases and 37,154 controls of European ancestry (n=∼54K).

AD risk prediction by PRS

To show the usefulness of PRS for risk prediction, we evaluated two sets of PRS that were respectively derived from BFGWAS_QUANT and BVSR summary statistics using ROS/MAP GWAS data. We used the independent test samples from Mayo Clinic Alzheimer’s Disease Genetics Studies (MCADGS)^{48; 49}. MCADGS contain 2,099 European-decent samples (844 AD cases and 1,255 controls) with microarray genotype data profiled that were further imputed to the 1000 Genome Project Phase⁵⁰.

We compared PRS using Bayesian effect size estimates with three commonly used PRS methods, the standard method using informed LD-Pruning and P-value thresholding (P+T)^{51; 52}, LDpred2⁵³, and PRS-CS⁵⁴. P-value thresholds of (10⁻², 10⁻³, 10⁻⁵, 5 × 10⁻⁸, 10⁻⁸) and LD thresholds of (0.1, 0.3, 0.5, 0.7,0.9) were considered by the P+T method. Reference LD derived from 1000 Genome data were used by LDpred and PRS-CS methods. AD risk prediction accuracy was evaluated using the area under the Receiver Operating Characteristic (ROC)⁵⁵ curve (AUC) for MCADGC test samples.

Ethics statement

ROS/MAP and MCADGS data analyzed in this study were generated under the improvement by the Institutional Review Board (IRB) of Rush University Medical Center, Chicago, IL and Mayo Clinic, respectively. All samples analyzed in this study were de-identified and all analyses were approved by the IRB of Emory University School of Medicine.

Results

Simulation results

For all considered simulation scenarios, our Bayesian estimates of functional annotation enrichment achieved convergence with 4 EM iterations, as shown in boxplots of 100 simulation replicates (Figure 1 (A, D); and Supplemental Figure 1). Although BFGWAS_QUANT overestimated α₁ and α₂, the Bayesian estimates still reflect the correct enrichment pattern among all considered annotations. By taking the 2.5^th and 97.5^th quantiles of these 100 Bayesian estimates to estimate the corresponding 2.5^th and 97.5^th quantiles of the estimator distributions, the true enrichment values (α₁ = 4, α₂ = 1.5, α₃ = 0.5, α₄ = 0) indeed fell within this range. For example, in the scenario with α₀ = −10.5 and h² = 0.25 (Figure 1A), the estimated 2.5^th and 97.5^th quantiles are (1.⁶2, 5.50) for α₁, (0.98, 4.57) for α₂, (0.02, 1.20) for α₃, and (0.00, 0.38) for α₄. We observed precise estimates of 0 enrichment for the artificial annotation (Figure 1; Supplemental Figure 1) and all annotations in the scenario with null enrichment (Supplement Figure 3A), which demonstrated the ability of BFGWAS_QUANT to identify null enrichment.

Figure 1. Bayesian enrichment estimates, heritability estimates, and sensitivities of simulation studies.

Panels (A, B, C) are for simulations with α₀ = −10.5 and panels (D, E, F) are for simulations with α₀ = −9.5. Bayesian estimates of annotation enrichment (α₁, α₂, α₃, α₄) out of 100 simulations with true heritability h² = 0.25 are shown in the respective box plot (panels A, D), where red dots denote true enrichment values. Comparable heritability estimates (panels B, E) and higher sensitivities (panels C, F) were obtained by BFGWAS_QUANT (red) vs. BVSR (blue).

By taking PRS as estimated phenotypes and taking the squared correlation between RPS and simulated phenotypes as the estimate of phenotype heritability, we obtained similar heritability estimates by BFGWAS_QUANT and BVSR, which are close to the true heritability (Figure 1 (B, E); Supplemental Figure 3B). For scenarios with true enrichment, BFGWAS_QUANT obtained substantially higher sensitivity (power) while similar positive predictive values (PPV) compared to BVSR for all scenarios (Figure 1 (C, F); Supplemental Figure 2; Supplementary Figures 4-5). For the scenario with null enrichment, both BFGWAS_QUANT and BVSR performed comparably (Supplementary Figure 3 (B, C, D)).

These simulation studies validated BFGWAS_QUANT for quantifying multivariate functional annotations, estimating phenotype heritability, and identifying true causal SNPs. By accounting for multivariate quantitative annotations, BFGWAS_QUANT showed improved performance than the standard BVSR method, especially with higher power for identifying true causal SNPs and accurate enrichment estimation.

Application GWAS results for studying AD

Applying BFGWAS_QUANT to the individual-level ROS/MAP and summary-level IGAP GWAS data, we obtained consistent patterns for Bayesian enrichment estimates of 5 eQTL based and 5 histone modifications based functional annotations (Figure 2; Supplement Figures 6-7). In particular, the histone modifications based functional annotations had higher enrichment than eQTL based annotations when studying the individual-level ROS/MAP GWAS data, with the highest enrichment for H3K27me3 and second highest for H3K4me1. Interestingly, BFGWAS_QUANT estimated the second highest enrichment for the Microglia_eQTL annotation when the IGAP GWAS data with a large sample size (n=∼54K). Even with a small sample size in the individual-level ROS/MAP GWAS data, the Microglia_eQTL annotation was still identified with higher enrichment than other annotations based on eQTL of the bulk brain frontal cortex tissue. There is mounting evidence showing that Microglia (composing <10% cells in the bulk brain frontal cortex tissues) plays important roles in the development and progression of AD pathology⁵⁶, and cell-type specific differential expression of GWAS risk genes of AD are only present in Microglia⁵⁷.

Figure 2. Bayesian estimates of functional annotation enrichment for Alzheimer’s dementia by using ROS/MAP individual-level (A) and IGAP summary-level (B) GWAS data.

Histone modification H3K27me3 (polycomb regression) and Microglia cis-eQTL annotations were found to be most enriched for association signals of AD.

Using the ROSMAP individual-level GWAS data, four significant SNPs with Bayesian CPP > 0.1068 were identified for AD (rs429358, rs10414043, rs769449 and rs7256200) by BFGWAS_QUANT (Table 1; Figure 3A; Supplemental Figures 8-9). In particular, SNP rs429358 (CPP = 0.144, p-value = 7.72 × 10⁻¹³, missense variant) is the famous known APOE E4 risk allele of AD⁵⁸ and have significant Bayesian CPP > 0.1068 for all 5 AD related phenotypes. SNPs rs10414043 (CPP = 0.111, p-value = 2.71 × 10⁻¹²) and rs7256200 (CPP = 0.315, p-value = 2.71 × 10⁻¹², regulatory variant) are upstream of the known risk gene APOC1 of AD and blood protein traits^{37; 59; 60}. Besides the missense APOE E4 risk alleles rs429358 and rs7412, one additional significant SNP rs1065853 is identified upstream of APOC1 for global AD pathology, which is a known GWAS signal for blood protein traits such as low-density lipoprotein^{60; 61}. Out of 11 significant SNPs identified for at least one AD related phenotype (Table 1), 2 of which are intergenic and the other 9 SNPs are intron, regulatory, missense, and stop gained variants.

View this table:

Table 1. Significant SNPs with Bayesian CPP > 0. 1068 by BFGWAS_QUANT for studying AD related phenotypes using the ROS/MAP individual-level GWAS data.

SNPs with single variant test P-value <5 × 10⁻⁸ were shaded in gray.

Figure 3. Manhattan plots of BFGWAS_QUANT results for studying Alzheimer’s dementia using ROS/MAP individual-level (A) and IGAP summary-level (B) GWAS data.

Single variant test p-values were plotted in -log10 scale in the y-axis. The dashed horizontal line denotes the genome-wide significant threshold 5 × 10⁻⁸. SNPs with Bayesian CPP > 0.1068 were colored according to the color scale of their Bayesian CPP values by BFGWAS_QUANT. SNPs with Bayesian CPP >0.5 were plotted in solid triangles.

Using the IGAP summary-level GWAS data, BFGWAS_QUANT fine-mapped 32 significant SNPs with Bayesian CPP > 0.1068 associated with AD (Table 2; Figure 3B). Multiple SNPs located in genes PVRL2, APOE/TOMM40, APOC1 in chromosome 19 were found to be associated with AD with CPP = 1. Interestingly, 10 out of 32 significant SNPs are cis-eQTL of Microglia, and all significant SNPs except one intergenic SNP (rs7110631) are intron, regulatory, downstream, upstream, 3’ UTR, and missense variants (Table 2). All significant SNPs except rs78959900 are located in the peak regions of histone modification H3K27me3 (polycomb regression) that has the highest enrichment. Further, several SNPs that did not pass the genome-wide significance threshold (P-value < 5 × 10⁻⁸) were identified by integrating 10 functional annotations in BFGWAS_QUANT. These SNPs are located in genes that were previously found to be genetically linked to AD, such as HLA-DRB1, NUP160, SLC24A4 and CD33^{47; 58; 62-65}.

View this table:

Table 2. Significant SNPs with Bayesian CPP > 0. 1068 by BFGWAS_QUANT for studying AD using the IGAP summary-level GWAS data.

SNPs with single variant test P-value <5 × 10⁻⁸ were shaded in gray.

Additionally, the summation of genome-wide Bayesian CPP of SNPs with CPP >0.01 can be used to estimate the number of total causal SNPs for the phenotype of interest. The threshold of CPP>0.01 is used to exclude adding CPP from random MCMC selections. Although the power is limited for analyzing the ROS/MAP individual-level GWAS data with a small sample size, BFGWAS_QUANT estimated a total of 54 potential causal SNPs for AD using the IGAP summary-level GWAS data (Table 3).

View this table:

Table 3. Estimates of total causal SNPs.

The summations of the Bayesian CPP estimates of SNPs with CPP>0.01 estimate the total number of causal SNPs.

AD risk prediction by PRS in MCADGS

To show the usefulness of BFGWAS_QUANT for studying complex traits and diseases, we derived PRS using the Bayesian effect size estimates by BFGWAS_QUANT for an independent GWAS cohort MCADGS (n = 2099), and compared the risk prediction accuracy with the PRS using effect size estimates by BVSR, P+T, LDpred2, and PRS-CS. When the ROS/MAP individual-level GWAS data were used as training data, comparable AUC was obtained by using Bayesian effect size estimates byBFGWAS_QUANT (0.69) and BVSR (0.68), which was similar to the one obtained by LDpred2-auto (0.68) but significantly higher than the ones obtained by P+T (0.55), LDpred2-inf (0.53), and PRS-CS (0.54) (Figure 4A). This showed the advantage of deriving PRS using Bayesian effect size estimates by BFGWAS_QUANT when individual-level GWAS data are available, especially when the sample size is small. When the IGAP summary-level GWAS data were used as training data, the AUC by BFGWAS_QUANT (0.75) was the same as by LDpred2-auto (0.75), but much lower than P + T (0.88 with p-value threshold 10⁻³), LDpred2-inf (0.94), and PRS-CS (0.93) (Figure 4B). These results show that an infinitesimal model⁶⁶ as assumed by PRS-CS and LDpred2-inf is more suitable for PRS development than the sparse model assumed by BFGWAS_QUANT and LDpred2-auto.

Figure 4. ROC plots comparing prediction accuracy of Alzheimer’s dementia in the test data of MCDGC by PRSs derived using the ROS/MAP individual-level (A) and IGAP summary-level (B) GWAS data as training data.

PRS derived using Bayesian effect size estimates by BFGWAS_QUANT has comparable prediction accuracy as the PRSs derived by BVSR and LDpred2 auto, for all assuming a sparse causal model. PRSs derived by PRS-CS and LDpred2-inf using IGAP summary-level GWAS data as training data have the highest prediction accuracy for assuming an infinitesimal causal model.

Discussion

We developed a scalable Bayesian functional GWAS method to account for multivariate quantitative functional annotations (BFGWAS_QUANT) for studying complex traits and diseases, which is based on a Bayesian hierarchical variable selection regression model and accompanied with a scalable EM-MCMC computation algorithm. BFGWAS_QUANT has the advantages of quantifying enrichment of functional annotations, as well as modeling LD to generate fine-mapped GWAS results that are also prioritized based on their functional annotations. BFGWAS_QUANT can be applied to both individual-level and summary-level GWAS data. In particular, the Bayesian effect size estimates can be used to derive PRS that accounts for functional annotations.

Our simulation studies validated the performance of BFGWAS_QUANT with respect to annotation enrichment quantification, GWAS association identification, and heritability estimation. Compared to BVSR, BFGWAS_QUANT method had higher sensitivities (i.e., power), while comparable PPV and accuracy of phenotype heritability estimation.

In the real studies of AD related phenotypes using both ROS/MAP individual-level and IGAP summary-level GWAS data, we showed that interesting enrichment patterns were identified, fine-mapped GWAS signals were identified, and predictive PRSs were derived. In particular, we found the histone modification H3K27me3 (polycomb regression) and Microglia cis-eQTL annotations were most enriched for association signals of AD. We also showed that SNPs with single variant test p-values < 5 × 10⁻⁸ could be identified for being prioritized due to their functional annotations.

Despite the above advantages, BFGWAS_QUANT does have its limitations. First, BFGWAS_QUANT model is developed for quantitative traits. However, following previous studies^{22; 67}, GWAS analysis can still be done for dichotomous traits by quantifying cases as 1’s and controls as 0’s, which will have similar performance as a probit model if samples are independent and population structure can be addressed by top genotype principal components. Extending the BFGWAS_QUANT method for studying dichotomous traits is also part of our ongoing research. Second, BFGWAS_QUANT assumes the summary-level GWAS data and reference LD were derived from populations of the same ancestry. One need to apply BFGWAS_QUANT per ancestry first and then meta-analyze the results for studying GWAS cohorts with multiple ancestries. Third, BFGWAS_QUANT assumes a sparse causal genetic architecture which is suitable for generating fine-mapped GWAS results but might lack of power for deriving PRS.

In conclusion, our work demonstrated the usefulness of integrating multivariate quantitative functional annotations in GWAS for quantifying the enrichment of multiple functional annotations and generating fine-mapped GWAS results with higher power. Specifically, accurate quantification of annotation enrichment would help prioritize GWAS signals (fine-mapping) and then help illustrate the underlying genomic etiology of complex traits and diseases. Especially, as publicly available molecular QTL based and epigenomic features functional annotations continue to grow, BFGWAS_QUANT provides a convenient tool for integrative multi-omics studies.

Data Availability

All data produced in the present study are available upon reasonable request to the authors

https://www.radc.rush.edu/

Supplemental Data

Supplemental Note includes 9 Supplemental Figures and Supplemental Methods with details of the model derivations.

Declaration of Interests

The authors declare no competing interests. David A. Bennett receives consulting fee from B4X inc for advising on neurodegenerative disease.

Web Resources

BFGWAS_QUANT Github directory, https://github.com/yanglab-emory/BFGWAS_QUANT

RADC Research Resource Sharing Hub, http://www.radc.rush.edu/

ROS/MAP data, https://www.synapse.org/#!Synapse:syn3219045

MCADGC data, https://www.synapse.org/#!Synapse:syn2910256

LDpred, https://github.com/bvilhjal/ldpred

PRS-CS, https://github.com/getian107/PRScs

Data and Code Availability

ROS/MAP data can be requested through Rush Alzheimer’s Disease Center (http://www.radc.rush.edu/) and synapse (https://www.synapse.org/#!Synapse:syn3219045). MCADGS data can be requested through synapse (https://www.synapse.org/#!Synapse:syn2910256). IGAP summary statistics are available from http://web.pasteur-lille.fr/en/recherche/u744/igap/igap_download.php. Source code of BFGWAS_QUANT is available through Github, https://github.com/yanglab-emory/BFGWAS_QUANT.

Acknowledgements

J.Y. is supported by NIH/NIGMS grant R35GM138313 and NIH/NIA grant R21AG070659. ROS/MAP study data were provided by the Rush Alzheimer’s Disease Center, Rush University Medical Center, Chicago, IL. Data collection was supported through funding by NIA grants P30AG10161, R01AG15819, R01AG17917, R01AG30146, R01AG36836, U01AG32984, U01AG46152, U01AG61356, the Illinois Department of Public Health, and the Translational Genomics Research Institute. The MCADGC led by Dr. Nilüfer Ertekin-Taner and Dr. Steven G. Younkin, Mayo Clinic, Jacksonville, FL uses samples from the Mayo Clinic Study of Aging, the Mayo Clinic Alzheimer’s Disease Research Center, and the Mayo Clinic Brain Bank. MCADGC data collection was supported through funding by NIA grants P50 AG016574, R01 AG032990, U01 AG046139, R01 AG018023, U01 AG006576, U01 AG006786, R01 AG025711, R01 AG017216, R01 AG003949, NINDS grant R01 NS080820, CurePSP Foundation, and support from Mayo Foundation.

Reference

1.↵
Visscher, P.M., Wray, N.R., Zhang, Q., Sklar, P., McCarthy, M.I., Brown, M.A., and Yang, J. (2017). 10 Years of GWAS Discovery: Biology, Function, and Translation. Am J Hum Genet 101, 5–22.
OpenUrl CrossRef PubMed
2.↵
Welter, D., MacArthur, J., Morales, J., Burdett, T., Hall, P., Junkins, H., Klemm, A., Flicek, P., Manolio, T., Hindorff, L., et al. (2014). The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Res 42, D1001–1006.
OpenUrl CrossRef PubMed Web of Science
3.↵
McCarthy, M.I., Abecasis, G.R., Cardon, L.R., Goldstein, D.B., Little, J., Ioannidis, J.P.A., and Hirschhorn, J.N. (2008). Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nature Reviews Genetics 9, 356–369.
OpenUrl CrossRef PubMed Web of Science
4.↵
Hindorff, L.A., Sethupathy, P., Junkins, H.A., Ramos, E.M., Mehta, J.P., Collins, F.S., and Manolio, T.A. (2009). Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proceedings of the National Academy of Sciences 106, 9362.
OpenUrl Abstract/FREE Full Text
5.↵
Gallagher, M.D., and Chen-Plotkin, A.S. (2018). The Post-GWAS Era: From Association to Function. Am J Hum Genet 102, 717–730.
OpenUrl CrossRef PubMed
6.↵
Rentzsch, P., Witten, D., Cooper, G.M., Shendure, J., and Kircher, M. (2019). CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res 47, D886–D894.
OpenUrl CrossRef PubMed
7.↵
Roadmap Epigenomics, C., Kundaje, A., Meuleman, W., Ernst, J., Bilenky, M., Yen, A., Heravi-Moussavi, A., Kheradpour, P., Zhang, Z., Wang, J., et al. (2015). Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330.
OpenUrl CrossRef PubMed
8.↵
Consortium, E.P. (2011). A user’s guide to the encyclopedia of DNA elements (ENCODE). PLoS Biol 9, e1001046.
OpenUrl CrossRef PubMed
9.↵
Consortium, G. (2020). The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330.
OpenUrl Abstract/FREE Full Text
10.↵
McVicker, G., van de Geijn, B., Degner, J.F., Cain, C.E., Banovich, N.E., Raj, A., Lewellen, N., Myrthil, M., Gilad, Y., and Pritchard, J.K. (2013). Identification of genetic variants that affect histone modifications in human cells. Science 342, 747–749.
OpenUrl Abstract/FREE Full Text
11.↵
Ernst, J., and Kellis, M. (2010). Discovery and characterization of chromatin states for systematic annotation of the human genome. Nature biotechnology 28, 817–825.
OpenUrl CrossRef PubMed Web of Science
12.↵
Robins, C., Liu, Y., Fan, W., Duong, D.M., Meigs, J., Harerimana, N.V., Gerasimov, E.S., Dammer, E.B., Cutler, D.J., Beach, T.G., et al. (2021). Genetic control of the human brain proteome. American journal of human genetics 108, 400–410.
OpenUrl
13.↵
Banovich, N.E., Lan, X., McVicker, G., van de Geijn, B., Degner, J.F., Blischak, J.D., Roux, J., Pritchard, J.K., and Gilad, Y. (2014). Methylation QTLs are associated with coordinated changes in transcription factor binding, histone modifications, and gene expression levels. PLoS Genet 10, e1004663.
OpenUrl CrossRef PubMed
14.
Waszak, S.M., Delaneau, O., Gschwind, A.R., Kilpinen, H., Raghav, S.K., Witwicki, R.M., Orioli, A., Wiederkehr, M., Panousis, N.I., Yurovsky, A., et al. (2015). Population Variation and Genetic Control of Modular Chromatin Architecture in Humans. Cell 162, 1039–1050.
OpenUrl CrossRef PubMed
15.↵
Heyn, H. (2016). Quantitative Trait Loci Identify Functional Noncoding Variation in Cancer. PLoS Genet 12, e1005826.
OpenUrl
16.↵
Hormozdiari, F., Kostem, E., Kang, E.Y., Pasaniuc, B., and Eskin, E. (2014). Identifying causal variants at loci with multiple signals of association. Genetics 198, 497–508.
OpenUrl Abstract/FREE Full Text
17.
Le, K.T.T., Matzaraki, V., Netea, M.G., Wijmenga, C., Moser, J., and Kumar, V. (2019). Functional Annotation of Genetic Loci Associated With Sepsis Prioritizes Immune and Endothelial Cell Pathways. Front Immunol 10, 1949.
OpenUrl CrossRef
18.
Matzaraki, V., Gresnigt, M.S., Jaeger, M., Ricaño-Ponce, I., Johnson, M.D., Oosting, M., Franke, L., Withoff, S., Perfect, J.R., Joosten, L.A.B., et al. (2017). An integrative genomics approach identifies novel pathways that influence candidaemia susceptibility. PLoS One 12, e0180824.
OpenUrl
19.
Ruffieux, H., Fairfax, B.P., Nassiri, I., Vigorito, E., Wallace, C., Richardson, S., and Bottolo, L. (2021). EPISPOT: An epigenome-driven approach for detecting and interpreting hotspots in molecular QTL studies. Am J Hum Genet 108, 983–1000.
OpenUrl
20.↵
Nicolae, D.L., Gamazon, E., Zhang, W., Duan, S., Dolan, M.E., and Cox, N.J. (2010). Trait-associated SNPs are more likely to be eQTLs: annotation to enhance discovery from GWAS. PLoS genetics 6, e1000888–e1000888.
OpenUrl
21.↵
Ng, B., White, C.C., Klein, H.U., Sieberts, S.K., McCabe, C., Patrick, E., Xu, J., Yu, L., Gaiteri, C., Bennett, D.A., et al. (2017). An xQTL map integrates the genetic architecture of the human brain’s transcriptome and epigenome. Nat Neurosci 20, 1418–1426.
OpenUrl CrossRef PubMed
22.↵
Yang, J., Fritsche, L.G., Zhou, X., Abecasis, G., and International Age-Related Macular Degeneration Genomics, C. (2017). A Scalable Bayesian Method for Integrating Functional Information in Genome-wide Association Studies. American journal of human genetics 101, 404–416.
OpenUrl CrossRef
23.
Pickrell, J.K. (2014). Joint analysis of functional genomic data and genome-wide association studies of 18 human traits. American journal of human genetics 94, 559–573.
OpenUrl CrossRef PubMed
24.
Trynka, G., Westra, H.-J., Slowikowski, K., Hu, X., Xu, H., Stranger, B.E., Klein, R.J., Han, B., and Raychaudhuri, S. (2015). Disentangling the Effects of Colocalizing Genomic Annotations to Functionally Prioritize Non-coding Variants within Complex-Trait Loci. Am J Hum Genet 97, 139–152.
OpenUrl CrossRef PubMed
25.
Iversen, E.S., Lipton, G., Clyde, M.A., and Monteiro, A.N.A. (2014). Functional annotation signatures of disease susceptibility loci improve SNP association analysis. BMC Genomics 15, 398–398.
OpenUrl
26.↵
Chen, G.K., and Witte, J.S. (2007). Enriching the analysis of genomewide association studies with hierarchical modeling. American journal of human genetics 81, 397–404.
OpenUrl CrossRef PubMed Web of Science
27.↵
Kichaev, G., Yang, W.-Y., Lindstrom, S., Hormozdiari, F., Eskin, E., Price, A.L., Kraft, P., and Pasaniuc, B. (2014). Integrating functional data to prioritize causal variants in statistical fine-mapping studies. PLoS genetics 10, e1004722–e1004722.
OpenUrl
28.↵
Kichaev, G., Roytman, M., Johnson, R., Eskin, E., Lindström, S., Kraft, P., and Pasaniuc, B. (2016). Improved methods for multi-trait fine mapping of pleiotropic risk loci. Bioinformatics 33, 248–255.
OpenUrl
29.↵
Ma, Y., and Wei, P. (2019). FunSPU: A versatile and adaptive multiple functional annotation-based association test of whole-genome sequencing data. PLoS genetics 15, e1008081.
OpenUrl
30.↵
Li, X., Li, Z., Zhou, H., Gaynor, S.M., Liu, Y., Chen, H., Sun, R., Dey, R., Arnett, D.K., Aslibekyan, S., et al. (2020). Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale. Nature genetics 52, 969–983.
OpenUrl CrossRef PubMed
31.↵
Guan, Y.T., and Stephens, M. (2011). Bayesian Variable Selection Regression for Genome-Wide Association Studies and Other Large-Scale Problems. Annals of Applied Statistics 5, 1780–1815.
OpenUrl
32.↵
De Jager, P.L., Ma, Y., McCabe, C., Xu, J., Vardarajan, B.N., Felsky, D., Klein, H.U., White, C.C., Peters, M.A., Lodgson, B., et al. (2018). A multi-omic atlas of the human frontal cortex for aging and Alzheimer’s disease research. Sci Data 5, 180142.
OpenUrl
33.↵
Bennett, D.A., Buchman, A.S., Boyle, P.A., Barnes, L.L., Wilson, R.S., and Schneider, J.A. (2018). Religious Orders Study and Rush Memory and Aging Project. J Alzheimers Dis 64, S161–S189.
OpenUrl PubMed
34.↵
Hormozdiari, F., Gazal, S., van de Geijn, B., Finucane, H.K., Ju, C.J., Loh, P.R., Schoech, A., Reshef, Y., Liu, X., O’Connor, L., et al. (2018). Leveraging molecular quantitative trait loci to understand the genetic architecture of diseases and complex traits. Nature genetics 50, 1041–1047.
OpenUrl CrossRef PubMed
35.↵
Bryois, J., Calini, D., Macnair, W., Foo, L., Urich, E., Ortmann, W., Iglesias, V.A., Selvaraj, S., Nutma, E., Marzin, M., et al. (2021). Cell-type specific cis-eQTLs in eight brain cell-types identifies novel risk genes for human brain disorders. medRxiv, 2021.2010.2009.21264604.
36.↵
Lopes, K.P., Snijders, G.J.L., Humphrey, J., Allan, A., Sneeboer, M.A.M., Navarro, E., Schilder, B.M., Vialle, R.A., Parks, M., Missall, R., et al. (2022). Genetic analysis of the human microglial transcriptome across brain regions, aging and disease pathologies. Nature genetics 54, 4–17.
OpenUrl
37.↵
Luningham, J.M., Chen, J., Tang, S., De Jager, P.L., Bennett, D.A., Buchman, A.S., and Yang, J. (2020). Bayesian Genome-wide TWAS Method to Leverage both cis- and trans-eQTL Information through Summary Statistics. American journal of human genetics 107, 714–726.
OpenUrl CrossRef
38.↵
Broyden, C.G. (1970). The Convergence of a Class of Double-rank Minimization Algorithms 1. General Considerations. IMA Journal of Applied Mathematics 6, 76–90.
OpenUrl CrossRef
39.↵
Lopes, K.d.P., Snijders, G.J.L., Humphrey, J., Allan, A., Sneeboer, M.A.M., Navarro, E., Schilder, B.M., Vialle, R.A., Parks, M., Missall, R., et al. (2022). Genetic analysis of the human microglial transcriptome across brain regions, aging and disease pathologies. Nature Genetics 54, 4–17.
OpenUrl
40.↵
Quinlan, A.R., and Hall, I.M. (2010). BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842.
OpenUrl CrossRef PubMed Web of Science
41.↵
Bennett, D.A., Schneider, J.A., Arvanitakis, Z., and Wilson, R.S. (2012). Overview and findings from the religious orders study. Curr Alzheimer Res 9, 628–645.
OpenUrl CrossRef PubMed Web of Science
42.↵
Wang, M., Beckmann, N.D., Roussos, P., Wang, E., Zhou, X., Wang, Q., Ming, C., Neff, R., Ma, W., Fullard, J.F., et al. (2018). The Mount Sinai cohort of large-scale genomic, transcriptomic and proteomic data in Alzheimer’s disease. Sci Data 5, 180185.
OpenUrl
43.↵
Trevethan, R. (2017). Sensitivity, Specificity, and Predictive Values: Foundations, Pliabilities, and Pitfalls in Research and Practice. Front Public Health 5, 307.
OpenUrl
44.↵
Guan, Y., and Stephens, M. (2011). Bayesian variable selection regression for genome-wide association studies and other large-scale problems. Ann Appl Stat 5, 1780–1815.
OpenUrl CrossRef
45.↵
Bennett, D.A., Schneider, J.A., Buchman, A.S., Barnes, L.L., Boyle, P.A., and Wilson, R.S. (2012). Overview and findings from the rush Memory and Aging Project. Curr Alzheimer Res 9, 646–663.
OpenUrl CrossRef PubMed Web of Science
46.↵
Bennett, D.A., Wilson, R.S., Boyle, P.A., Buchman, A.S., and Schneider, J.A. (2012). Relation of neuropathology to cognition in persons without cognitive impairment. Ann Neurol 72, 599–609.
OpenUrl CrossRef PubMed
47.↵
Lambert, J.-C., Ibrahim-Verbaas, C.A., Harold, D., Naj, A.C., Sims, R., Bellenguez, C., Jun, G., DeStefano, A.L., Bis, J.C., Beecham, G.W., et al. (2013). Meta-analysis of 74,046 individuals identifies 11 new susceptibility loci for Alzheimer’s disease. Nature Genetics 45, 1452–1458.
OpenUrl CrossRef PubMed
48.↵
Carrasquillo, M.M., Zou, F., Pankratz, V.S., Wilcox, S.L., Ma, L., Walker, L.P., Younkin, S.G., Younkin, C.S., Younkin, L.H., Bisceglio, G.D., et al. (2009). Genetic variation in PCDH11X is associated with susceptibility to late-onset Alzheimer’s disease. Nature genetics 41, 192–198.
OpenUrl CrossRef PubMed Web of Science
49.↵
Allen, M., Carrasquillo, M.M., Funk, C., Heavner, B.D., Zou, F., Younkin, C.S., Burgess, J.D., Chai, H.S., Crook, J., Eddy, J.A., et al. (2016). Human whole genome genotype and transcriptome data for Alzheimer’s and other neurodegenerative diseases. Sci Data 3, 160089.
OpenUrl
50.↵
1000 Genomes Project Consortium. (2012). An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65.
OpenUrl CrossRef PubMed Web of Science
51.↵
Choi, S.W., Mak, T.S., and O’Reilly, P.F. (2020). Tutorial: a guide to performing polygenic risk score analyses. Nat Protoc 15, 2759–2772.
OpenUrl PubMed
52.↵
Euesden, J., Lewis, C.M., and O’Reilly, P.F. (2015). PRSice: Polygenic Risk Score software. Bioinformatics (Oxford, England) 31, 1466–1468.
OpenUrl CrossRef PubMed
53.↵
Prive, F., Arbel, J., and Vilhjalmsson, B.J. (2020). LDpred2: better, faster, stronger. Bioinformatics.
54.↵
Ge, T., Chen, C.Y., Ni, Y., Feng, Y.A., and Smoller, J.W. (2019). Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nat Commun 10, 1776.
OpenUrl
55.↵
Linden, A. (2006). Measuring diagnostic and predictive accuracy in disease management: an introduction to receiver operating characteristic (ROC) analysis. J Eval Clin Pract 12, 132–139.
OpenUrl CrossRef PubMed Web of Science
56.↵
Hansen, D.V., Hanson, J.E., and Sheng, M. (2018). Microglia in Alzheimer’s disease. J Cell Biol 217, 459–472.
OpenUrl Abstract/FREE Full Text
57.↵
Mathys, H., Davila-Velderrain, J., Peng, Z., Gao, F., Mohammadi, S., Young, J.Z., Menon, M., He, L., Abdurrob, F., Jiang, X., et al. (2019). Single-cell transcriptomic analysis of Alzheimer’s disease. Nature 570, 332–337.
OpenUrl
58.↵
Kunkle, B.W., Grenier-Boley, B., Sims, R., Bis, J.C., Damotte, V., Naj, A.C., Boland, A., Vronskaya, M., van der Lee, S.J., Amlie-Wolf, A., et al. (2019). Genetic meta-analysis of diagnosed Alzheimer’s disease identifies new risk loci and implicates Aβ, tau, immunity and lipid processing. Nature genetics 51, 414–430.
OpenUrl CrossRef PubMed
59.↵
Takei, N., Miyashita, A., Tsukie, T., Arai, H., Asada, T., Imagawa, M., Shoji, M., Higuchi, S., Urakami, K., Kimura, H., et al. (2009). Genetic association study on in and around the APOE in late-onset Alzheimer disease in Japanese. Genomics 93, 441–448.
OpenUrl CrossRef PubMed Web of Science
60.↵
Wojcik, G.L., Graff, M., Nishimura, K.K., Tao, R., Haessler, J., Gignoux, C.R., Highland, H.M., Patel, Y.M., Sorokin, E.P., Avery, C.L., et al. (2019). Genetic analyses of diverse populations improves discovery for complex traits. Nature 570, 514–518.
OpenUrl CrossRef PubMed
61.↵
Andaleon, A., Mogil, L.S., and Wheeler, H.E. (2018). Gene-based association study for lipid traits in diverse cohorts implicates BACE1 and SIDT2 regulation in triglyceride levels. PeerJ 6, e4314.
OpenUrl CrossRef
62.↵
Naj, A.C., Jun, G., Beecham, G.W., Wang, L.-S., Vardarajan, B.N., Buros, J., Gallins, P.J., Buxbaum, J.D., Jarvik, G.P., Crane, P.K., et al. (2011). Common variants at MS4A4/MS4A6E, CD2AP, CD33 and EPHA1 are associated with late-onset Alzheimer’s disease. Nature Genetics 43, 436–441.
OpenUrl CrossRef PubMed
63.
Karch, C.M., Jeng, A.T., Nowotny, P., Cady, J., Cruchaga, C., and Goate, A.M. (2012). Expression of Novel Alzheimer’s Disease Risk Genes in Control and Alzheimer’s Disease Brains. PLOS ONE 7, e50976.
OpenUrl CrossRef PubMed
64.
Lu, R.C., Yang, W., Tan, L., Sun, F.R., Tan, M.S., Zhang, W., Wang, H.F., and Tan, L. (2017). Association of HLA-DRB1 polymorphism with Alzheimer’s disease: a replication and meta-analysis. Oncotarget 8, 93219–93226.
OpenUrl CrossRef
65.↵
Novikova, G., Kapoor, M., Tcw, J., Abud, E.M., Efthymiou, A.G., Chen, S.X., Cheng, H., Fullard, J.F., Bendl, J., Liu, Y., et al. (2021). Integration of Alzheimer’s disease genetics and myeloid genomics identifies disease risk regulatory elements and genes. Nature Communications 12, 1610.
OpenUrl
66.↵
Nelson, R.M., Pettersson, M.E., and Carlborg, O. (2013). A century after Fisher: time for a new paradigm in quantitative genetics. Trends Genet 29, 669–676.
OpenUrl CrossRef PubMed
67.↵
Zhou, X., Carbonetto, P., and Stephens, M. (2013). Polygenic modeling with bayesian sparse linear mixed models. PLoS genetics 9, e1003264.
OpenUrl

View the discussion thread.

Posted August 17, 2022.

Download PDF

Supplementary Material

Data/Code

Citation Tools

Subject Area

Genetic and Genomic Medicine

Subject Areas

All Articles

Addiction Medicine (349)
Allergy and Immunology (668)
Allergy and Immunology (668)
Anesthesia (181)
Cardiovascular Medicine (2648)
Dentistry and Oral Medicine (316)
Dermatology (223)
Emergency Medicine (399)
Endocrinology (including Diabetes Mellitus and Metabolic Disease) (942)
Epidemiology (12228)
Forensic Medicine (10)
Gastroenterology (759)
Genetic and Genomic Medicine (4103)
Geriatric Medicine (387)
Health Economics (680)
Health Informatics (2657)
Health Policy (1005)
Health Systems and Quality Improvement (985)
Hematology (363)
HIV/AIDS (851)
Infectious Diseases (except HIV/AIDS) (13695)
Intensive Care and Critical Care Medicine (797)
Medical Education (399)
Medical Ethics (109)
Nephrology (436)
Neurology (3882)
Nursing (209)
Nutrition (577)
Obstetrics and Gynecology (739)
Occupational and Environmental Health (695)
Oncology (2030)
Ophthalmology (585)
Orthopedics (240)
Otolaryngology (306)
Pain Medicine (250)
Palliative Medicine (75)
Pathology (473)
Pediatrics (1115)
Pharmacology and Therapeutics (466)
Primary Care Research (452)
Psychiatry and Clinical Psychology (3432)
Public and Global Health (6527)
Radiology and Imaging (1403)
Rehabilitation Medicine and Physical Therapy (814)
Respiratory Medicine (871)
Rheumatology (409)
Sexual and Reproductive Health (410)
Sports Medicine (342)
Surgery (448)
Toxicology (53)
Transplantation (185)
Urology (165)

[1] 1.↵
Visscher, P.M., Wray, N.R., Zhang, Q., Sklar, P., McCarthy, M.I., Brown, M.A., and Yang, J. (2017). 10 Years of GWAS Discovery: Biology, Function, and Translation. Am J Hum Genet 101, 5–22.
OpenUrl CrossRef PubMed

[2] 2.↵
Welter, D., MacArthur, J., Morales, J., Burdett, T., Hall, P., Junkins, H., Klemm, A., Flicek, P., Manolio, T., Hindorff, L., et al. (2014). The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Res 42, D1001–1006.
OpenUrl CrossRef PubMed Web of Science

[3] 3.↵
McCarthy, M.I., Abecasis, G.R., Cardon, L.R., Goldstein, D.B., Little, J., Ioannidis, J.P.A., and Hirschhorn, J.N. (2008). Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nature Reviews Genetics 9, 356–369.
OpenUrl CrossRef PubMed Web of Science

[4] 4.↵
Hindorff, L.A., Sethupathy, P., Junkins, H.A., Ramos, E.M., Mehta, J.P., Collins, F.S., and Manolio, T.A. (2009). Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proceedings of the National Academy of Sciences 106, 9362.
OpenUrl Abstract/FREE Full Text

[5] 5.↵
Gallagher, M.D., and Chen-Plotkin, A.S. (2018). The Post-GWAS Era: From Association to Function. Am J Hum Genet 102, 717–730.
OpenUrl CrossRef PubMed

[6] 6.↵
Rentzsch, P., Witten, D., Cooper, G.M., Shendure, J., and Kircher, M. (2019). CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res 47, D886–D894.
OpenUrl CrossRef PubMed

[7] 7.↵
Roadmap Epigenomics, C., Kundaje, A., Meuleman, W., Ernst, J., Bilenky, M., Yen, A., Heravi-Moussavi, A., Kheradpour, P., Zhang, Z., Wang, J., et al. (2015). Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330.
OpenUrl CrossRef PubMed

[8] 8.↵
Consortium, E.P. (2011). A user’s guide to the encyclopedia of DNA elements (ENCODE). PLoS Biol 9, e1001046.
OpenUrl CrossRef PubMed

[9] 9.↵
Consortium, G. (2020). The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330.
OpenUrl Abstract/FREE Full Text

[10] 10.↵
McVicker, G., van de Geijn, B., Degner, J.F., Cain, C.E., Banovich, N.E., Raj, A., Lewellen, N., Myrthil, M., Gilad, Y., and Pritchard, J.K. (2013). Identification of genetic variants that affect histone modifications in human cells. Science 342, 747–749.
OpenUrl Abstract/FREE Full Text

[11] 11.↵
Ernst, J., and Kellis, M. (2010). Discovery and characterization of chromatin states for systematic annotation of the human genome. Nature biotechnology 28, 817–825.
OpenUrl CrossRef PubMed Web of Science

[12] 12.↵
Robins, C., Liu, Y., Fan, W., Duong, D.M., Meigs, J., Harerimana, N.V., Gerasimov, E.S., Dammer, E.B., Cutler, D.J., Beach, T.G., et al. (2021). Genetic control of the human brain proteome. American journal of human genetics 108, 400–410.
OpenUrl

[13] 13.↵
Banovich, N.E., Lan, X., McVicker, G., van de Geijn, B., Degner, J.F., Blischak, J.D., Roux, J., Pritchard, J.K., and Gilad, Y. (2014). Methylation QTLs are associated with coordinated changes in transcription factor binding, histone modifications, and gene expression levels. PLoS Genet 10, e1004663.
OpenUrl CrossRef PubMed

[14] 14.
Waszak, S.M., Delaneau, O., Gschwind, A.R., Kilpinen, H., Raghav, S.K., Witwicki, R.M., Orioli, A., Wiederkehr, M., Panousis, N.I., Yurovsky, A., et al. (2015). Population Variation and Genetic Control of Modular Chromatin Architecture in Humans. Cell 162, 1039–1050.
OpenUrl CrossRef PubMed

[15] 15.↵
Heyn, H. (2016). Quantitative Trait Loci Identify Functional Noncoding Variation in Cancer. PLoS Genet 12, e1005826.
OpenUrl

[16] 16.↵
Hormozdiari, F., Kostem, E., Kang, E.Y., Pasaniuc, B., and Eskin, E. (2014). Identifying causal variants at loci with multiple signals of association. Genetics 198, 497–508.
OpenUrl Abstract/FREE Full Text

[17] 17.
Le, K.T.T., Matzaraki, V., Netea, M.G., Wijmenga, C., Moser, J., and Kumar, V. (2019). Functional Annotation of Genetic Loci Associated With Sepsis Prioritizes Immune and Endothelial Cell Pathways. Front Immunol 10, 1949.
OpenUrl CrossRef

[18] 18.
Matzaraki, V., Gresnigt, M.S., Jaeger, M., Ricaño-Ponce, I., Johnson, M.D., Oosting, M., Franke, L., Withoff, S., Perfect, J.R., Joosten, L.A.B., et al. (2017). An integrative genomics approach identifies novel pathways that influence candidaemia susceptibility. PLoS One 12, e0180824.
OpenUrl

[19] 19.
Ruffieux, H., Fairfax, B.P., Nassiri, I., Vigorito, E., Wallace, C., Richardson, S., and Bottolo, L. (2021). EPISPOT: An epigenome-driven approach for detecting and interpreting hotspots in molecular QTL studies. Am J Hum Genet 108, 983–1000.
OpenUrl

[20] 20.↵
Nicolae, D.L., Gamazon, E., Zhang, W., Duan, S., Dolan, M.E., and Cox, N.J. (2010). Trait-associated SNPs are more likely to be eQTLs: annotation to enhance discovery from GWAS. PLoS genetics 6, e1000888–e1000888.
OpenUrl

[21] 21.↵
Ng, B., White, C.C., Klein, H.U., Sieberts, S.K., McCabe, C., Patrick, E., Xu, J., Yu, L., Gaiteri, C., Bennett, D.A., et al. (2017). An xQTL map integrates the genetic architecture of the human brain’s transcriptome and epigenome. Nat Neurosci 20, 1418–1426.
OpenUrl CrossRef PubMed

[22] 22.↵
Yang, J., Fritsche, L.G., Zhou, X., Abecasis, G., and International Age-Related Macular Degeneration Genomics, C. (2017). A Scalable Bayesian Method for Integrating Functional Information in Genome-wide Association Studies. American journal of human genetics 101, 404–416.
OpenUrl CrossRef

[23] 23.
Pickrell, J.K. (2014). Joint analysis of functional genomic data and genome-wide association studies of 18 human traits. American journal of human genetics 94, 559–573.
OpenUrl CrossRef PubMed

[24] 24.
Trynka, G., Westra, H.-J., Slowikowski, K., Hu, X., Xu, H., Stranger, B.E., Klein, R.J., Han, B., and Raychaudhuri, S. (2015). Disentangling the Effects of Colocalizing Genomic Annotations to Functionally Prioritize Non-coding Variants within Complex-Trait Loci. Am J Hum Genet 97, 139–152.
OpenUrl CrossRef PubMed

[25] 25.
Iversen, E.S., Lipton, G., Clyde, M.A., and Monteiro, A.N.A. (2014). Functional annotation signatures of disease susceptibility loci improve SNP association analysis. BMC Genomics 15, 398–398.
OpenUrl

[26] 26.↵
Chen, G.K., and Witte, J.S. (2007). Enriching the analysis of genomewide association studies with hierarchical modeling. American journal of human genetics 81, 397–404.
OpenUrl CrossRef PubMed Web of Science

[27] 27.↵
Kichaev, G., Yang, W.-Y., Lindstrom, S., Hormozdiari, F., Eskin, E., Price, A.L., Kraft, P., and Pasaniuc, B. (2014). Integrating functional data to prioritize causal variants in statistical fine-mapping studies. PLoS genetics 10, e1004722–e1004722.
OpenUrl

[28] 28.↵
Kichaev, G., Roytman, M., Johnson, R., Eskin, E., Lindström, S., Kraft, P., and Pasaniuc, B. (2016). Improved methods for multi-trait fine mapping of pleiotropic risk loci. Bioinformatics 33, 248–255.
OpenUrl

[29] 29.↵
Ma, Y., and Wei, P. (2019). FunSPU: A versatile and adaptive multiple functional annotation-based association test of whole-genome sequencing data. PLoS genetics 15, e1008081.
OpenUrl

[30] 30.↵
Li, X., Li, Z., Zhou, H., Gaynor, S.M., Liu, Y., Chen, H., Sun, R., Dey, R., Arnett, D.K., Aslibekyan, S., et al. (2020). Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale. Nature genetics 52, 969–983.
OpenUrl CrossRef PubMed

[31] 31.↵
Guan, Y.T., and Stephens, M. (2011). Bayesian Variable Selection Regression for Genome-Wide Association Studies and Other Large-Scale Problems. Annals of Applied Statistics 5, 1780–1815.
OpenUrl

[32] 32.↵
De Jager, P.L., Ma, Y., McCabe, C., Xu, J., Vardarajan, B.N., Felsky, D., Klein, H.U., White, C.C., Peters, M.A., Lodgson, B., et al. (2018). A multi-omic atlas of the human frontal cortex for aging and Alzheimer’s disease research. Sci Data 5, 180142.
OpenUrl

[33] 33.↵
Bennett, D.A., Buchman, A.S., Boyle, P.A., Barnes, L.L., Wilson, R.S., and Schneider, J.A. (2018). Religious Orders Study and Rush Memory and Aging Project. J Alzheimers Dis 64, S161–S189.
OpenUrl PubMed

[34] 34.↵
Hormozdiari, F., Gazal, S., van de Geijn, B., Finucane, H.K., Ju, C.J., Loh, P.R., Schoech, A., Reshef, Y., Liu, X., O’Connor, L., et al. (2018). Leveraging molecular quantitative trait loci to understand the genetic architecture of diseases and complex traits. Nature genetics 50, 1041–1047.
OpenUrl CrossRef PubMed

[35] 35.↵
Bryois, J., Calini, D., Macnair, W., Foo, L., Urich, E., Ortmann, W., Iglesias, V.A., Selvaraj, S., Nutma, E., Marzin, M., et al. (2021). Cell-type specific cis-eQTLs in eight brain cell-types identifies novel risk genes for human brain disorders. medRxiv, 2021.2010.2009.21264604.

[36] 36.↵
Lopes, K.P., Snijders, G.J.L., Humphrey, J., Allan, A., Sneeboer, M.A.M., Navarro, E., Schilder, B.M., Vialle, R.A., Parks, M., Missall, R., et al. (2022). Genetic analysis of the human microglial transcriptome across brain regions, aging and disease pathologies. Nature genetics 54, 4–17.
OpenUrl

[37] 37.↵
Luningham, J.M., Chen, J., Tang, S., De Jager, P.L., Bennett, D.A., Buchman, A.S., and Yang, J. (2020). Bayesian Genome-wide TWAS Method to Leverage both cis- and trans-eQTL Information through Summary Statistics. American journal of human genetics 107, 714–726.
OpenUrl CrossRef

[38] 38.↵
Broyden, C.G. (1970). The Convergence of a Class of Double-rank Minimization Algorithms 1. General Considerations. IMA Journal of Applied Mathematics 6, 76–90.
OpenUrl CrossRef

[39] 39.↵
Lopes, K.d.P., Snijders, G.J.L., Humphrey, J., Allan, A., Sneeboer, M.A.M., Navarro, E., Schilder, B.M., Vialle, R.A., Parks, M., Missall, R., et al. (2022). Genetic analysis of the human microglial transcriptome across brain regions, aging and disease pathologies. Nature Genetics 54, 4–17.
OpenUrl

[40] 40.↵
Quinlan, A.R., and Hall, I.M. (2010). BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842.
OpenUrl CrossRef PubMed Web of Science

[41] 41.↵
Bennett, D.A., Schneider, J.A., Arvanitakis, Z., and Wilson, R.S. (2012). Overview and findings from the religious orders study. Curr Alzheimer Res 9, 628–645.
OpenUrl CrossRef PubMed Web of Science

[42] 42.↵
Wang, M., Beckmann, N.D., Roussos, P., Wang, E., Zhou, X., Wang, Q., Ming, C., Neff, R., Ma, W., Fullard, J.F., et al. (2018). The Mount Sinai cohort of large-scale genomic, transcriptomic and proteomic data in Alzheimer’s disease. Sci Data 5, 180185.
OpenUrl

[43] 43.↵
Trevethan, R. (2017). Sensitivity, Specificity, and Predictive Values: Foundations, Pliabilities, and Pitfalls in Research and Practice. Front Public Health 5, 307.
OpenUrl

[44] 44.↵
Guan, Y., and Stephens, M. (2011). Bayesian variable selection regression for genome-wide association studies and other large-scale problems. Ann Appl Stat 5, 1780–1815.
OpenUrl CrossRef

[45] 45.↵
Bennett, D.A., Schneider, J.A., Buchman, A.S., Barnes, L.L., Boyle, P.A., and Wilson, R.S. (2012). Overview and findings from the rush Memory and Aging Project. Curr Alzheimer Res 9, 646–663.
OpenUrl CrossRef PubMed Web of Science

[46] 46.↵
Bennett, D.A., Wilson, R.S., Boyle, P.A., Buchman, A.S., and Schneider, J.A. (2012). Relation of neuropathology to cognition in persons without cognitive impairment. Ann Neurol 72, 599–609.
OpenUrl CrossRef PubMed

[47] 47.↵
Lambert, J.-C., Ibrahim-Verbaas, C.A., Harold, D., Naj, A.C., Sims, R., Bellenguez, C., Jun, G., DeStefano, A.L., Bis, J.C., Beecham, G.W., et al. (2013). Meta-analysis of 74,046 individuals identifies 11 new susceptibility loci for Alzheimer’s disease. Nature Genetics 45, 1452–1458.
OpenUrl CrossRef PubMed

[48] 48.↵
Carrasquillo, M.M., Zou, F., Pankratz, V.S., Wilcox, S.L., Ma, L., Walker, L.P., Younkin, S.G., Younkin, C.S., Younkin, L.H., Bisceglio, G.D., et al. (2009). Genetic variation in PCDH11X is associated with susceptibility to late-onset Alzheimer’s disease. Nature genetics 41, 192–198.
OpenUrl CrossRef PubMed Web of Science

[49] 49.↵
Allen, M., Carrasquillo, M.M., Funk, C., Heavner, B.D., Zou, F., Younkin, C.S., Burgess, J.D., Chai, H.S., Crook, J., Eddy, J.A., et al. (2016). Human whole genome genotype and transcriptome data for Alzheimer’s and other neurodegenerative diseases. Sci Data 3, 160089.
OpenUrl

[50] 50.↵
1000 Genomes Project Consortium. (2012). An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65.
OpenUrl CrossRef PubMed Web of Science

[51] 51.↵
Choi, S.W., Mak, T.S., and O’Reilly, P.F. (2020). Tutorial: a guide to performing polygenic risk score analyses. Nat Protoc 15, 2759–2772.
OpenUrl PubMed

[52] 52.↵
Euesden, J., Lewis, C.M., and O’Reilly, P.F. (2015). PRSice: Polygenic Risk Score software. Bioinformatics (Oxford, England) 31, 1466–1468.
OpenUrl CrossRef PubMed

[53] 53.↵
Prive, F., Arbel, J., and Vilhjalmsson, B.J. (2020). LDpred2: better, faster, stronger. Bioinformatics.

[54] 54.↵
Ge, T., Chen, C.Y., Ni, Y., Feng, Y.A., and Smoller, J.W. (2019). Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nat Commun 10, 1776.
OpenUrl

[55] 55.↵
Linden, A. (2006). Measuring diagnostic and predictive accuracy in disease management: an introduction to receiver operating characteristic (ROC) analysis. J Eval Clin Pract 12, 132–139.
OpenUrl CrossRef PubMed Web of Science

[56] 56.↵
Hansen, D.V., Hanson, J.E., and Sheng, M. (2018). Microglia in Alzheimer’s disease. J Cell Biol 217, 459–472.
OpenUrl Abstract/FREE Full Text

[57] 57.↵
Mathys, H., Davila-Velderrain, J., Peng, Z., Gao, F., Mohammadi, S., Young, J.Z., Menon, M., He, L., Abdurrob, F., Jiang, X., et al. (2019). Single-cell transcriptomic analysis of Alzheimer’s disease. Nature 570, 332–337.
OpenUrl

[58] 58.↵
Kunkle, B.W., Grenier-Boley, B., Sims, R., Bis, J.C., Damotte, V., Naj, A.C., Boland, A., Vronskaya, M., van der Lee, S.J., Amlie-Wolf, A., et al. (2019). Genetic meta-analysis of diagnosed Alzheimer’s disease identifies new risk loci and implicates Aβ, tau, immunity and lipid processing. Nature genetics 51, 414–430.
OpenUrl CrossRef PubMed

[59] 59.↵
Takei, N., Miyashita, A., Tsukie, T., Arai, H., Asada, T., Imagawa, M., Shoji, M., Higuchi, S., Urakami, K., Kimura, H., et al. (2009). Genetic association study on in and around the APOE in late-onset Alzheimer disease in Japanese. Genomics 93, 441–448.
OpenUrl CrossRef PubMed Web of Science

[60] 60.↵
Wojcik, G.L., Graff, M., Nishimura, K.K., Tao, R., Haessler, J., Gignoux, C.R., Highland, H.M., Patel, Y.M., Sorokin, E.P., Avery, C.L., et al. (2019). Genetic analyses of diverse populations improves discovery for complex traits. Nature 570, 514–518.
OpenUrl CrossRef PubMed

[61] 61.↵
Andaleon, A., Mogil, L.S., and Wheeler, H.E. (2018). Gene-based association study for lipid traits in diverse cohorts implicates BACE1 and SIDT2 regulation in triglyceride levels. PeerJ 6, e4314.
OpenUrl CrossRef

[62] 62.↵
Naj, A.C., Jun, G., Beecham, G.W., Wang, L.-S., Vardarajan, B.N., Buros, J., Gallins, P.J., Buxbaum, J.D., Jarvik, G.P., Crane, P.K., et al. (2011). Common variants at MS4A4/MS4A6E, CD2AP, CD33 and EPHA1 are associated with late-onset Alzheimer’s disease. Nature Genetics 43, 436–441.
OpenUrl CrossRef PubMed

[63] 63.
Karch, C.M., Jeng, A.T., Nowotny, P., Cady, J., Cruchaga, C., and Goate, A.M. (2012). Expression of Novel Alzheimer’s Disease Risk Genes in Control and Alzheimer’s Disease Brains. PLOS ONE 7, e50976.
OpenUrl CrossRef PubMed

[64] 64.
Lu, R.C., Yang, W., Tan, L., Sun, F.R., Tan, M.S., Zhang, W., Wang, H.F., and Tan, L. (2017). Association of HLA-DRB1 polymorphism with Alzheimer’s disease: a replication and meta-analysis. Oncotarget 8, 93219–93226.
OpenUrl CrossRef

[65] 65.↵
Novikova, G., Kapoor, M., Tcw, J., Abud, E.M., Efthymiou, A.G., Chen, S.X., Cheng, H., Fullard, J.F., Bendl, J., Liu, Y., et al. (2021). Integration of Alzheimer’s disease genetics and myeloid genomics identifies disease risk regulatory elements and genes. Nature Communications 12, 1610.
OpenUrl

[66] 66.↵
Nelson, R.M., Pettersson, M.E., and Carlborg, O. (2013). A century after Fisher: time for a new paradigm in quantitative genetics. Trends Genet 29, 669–676.
OpenUrl CrossRef PubMed

[67] 67.↵
Zhou, X., Carbonetto, P., and Stephens, M. (2013). Polygenic modeling with bayesian sparse linear mixed models. PLoS genetics 9, e1003264.
OpenUrl

Scalable Bayesian functional GWAS method accounting for multivariate quantitative functional annotations with applications to studying Alzheimer’s disease

Abstract

Introduction

Material and Methods

Bayesian hierarchical variable selection regression model

Adapted EM-MCMC algorithm

eQTL based functional annotations

Histone modifications based functional annotation

Simulation study design

Applications to ROS/MAP individual-level GWAS data

Applications to IGAP summary-level GWAS data

AD risk prediction by PRS

Ethics statement

Results

Simulation results

Application GWAS results for studying AD

AD risk prediction by PRS in MCADGS

Discussion

Data Availability

Supplemental Data

Declaration of Interests

Web Resources

Data and Code Availability

Acknowledgements

Reference

Citation Manager Formats

Subject Area