Novel KITLG/SCF regulatory variants are associated with lung function in African American children with asthma

Angel CY Mak, Satria Sajuthi, Jaehyun Joo, Shujie Xiao, Patrick M Sleiman, Marquitta J White, Eunice Y Lee, Benjamin Saef, Donglei Hu, Hongsheng Gui, Kevin L Keys, Fred Lurmann, Deepti Jain, Gonçalo Abecasis, Hyun Min Kang, Deborah A. Nickerson, Soren Germer, Michael C Zody, Lara Winterkorn, Catherine Reeves, Scott Huntsman, Celeste Eng, Sandra Salazar, Sam S Oh, Frank D Gilliland, Zhanghua Chen, Rajesh Kumar, Fernando D Martínez, Ann Chen Wu, Elad Ziv, Hakon Hakonarson, Blanca E Himes, L Keoki Williams, Max A Seibold, Esteban G. Burchard
doi: https://doi.org/10.1101/2020.02.20.20019588
Angel CY Mak
1 Department of Medicine, University of California San Francisco, San Francisco, CA, USA
For correspondence: angelcymak{at}gmail.com
Satria Sajuthi
2 Center for Genes, Environment, and Health, National Jewish Health, Denver, CO, USA
Jaehyun Joo
3 Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
Shujie Xiao
4 Center for Individualized and Genomic Medicine Research, Department of Internal Medicine, Henry Ford Health System, Detroit, MI, USA
Patrick M Sleiman
5 Center for Applied Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
6 Department of Human Genetics, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
Marquitta J White
1 Department of Medicine, University of California San Francisco, San Francisco, CA, USA
Eunice Y Lee
1 Department of Medicine, University of California San Francisco, San Francisco, CA, USA
Benjamin Saef
3 Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
Donglei Hu
1 Department of Medicine, University of California San Francisco, San Francisco, CA, USA
Hongsheng Gui
4 Center for Individualized and Genomic Medicine Research, Department of Internal Medicine, Henry Ford Health System, Detroit, MI, USA
Kevin L Keys
1 Department of Medicine, University of California San Francisco, San Francisco, CA, USA
7 Berkeley Institute for Data Science, University of California, Berkeley, CA, USA
Fred Lurmann
8 Sonoma Technology Inc, Petaluma, CA, USA
Deepti Jain
9 Department of Biostatistics, University of Washington, Seattle, WA, USA
Gonçalo Abecasis
10 Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA
Hyun Min Kang
10 Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA
Deborah A. Nickerson
11 Department of Genome Sciences, University of Washington, Seattle, WA, USA
12 Northwest Genomics Center, Seattle, WA, USA
13 Brotman Baty Institute, Seattle, WA, USA
Soren Germer
14 New York Genome Center, New York, NY, USA
Michael C Zody
14 New York Genome Center, New York, NY, USA
Lara Winterkorn
14 New York Genome Center, New York, NY, USA
Catherine Reeves
14 New York Genome Center, New York, NY, USA
Scott Huntsman
1 Department of Medicine, University of California San Francisco, San Francisco, CA, USA
Celeste Eng
1 Department of Medicine, University of California San Francisco, San Francisco, CA, USA
Sandra Salazar
1 Department of Medicine, University of California San Francisco, San Francisco, CA, USA
Sam S Oh
1 Department of Medicine, University of California San Francisco, San Francisco, CA, USA
Frank D Gilliland
15 Department of Preventive Medicine, Division of Environmental Health, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
Zhanghua Chen
15 Department of Preventive Medicine, Division of Environmental Health, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
Rajesh Kumar
16 Ann and Robert H. Lurie Children’s Hospital of Chicago, Chicago, IL, USA
Fernando D Martínez
17 Asthma and Airway Disease Research Center, University of Arizona, Tucson, AZ, USA
Ann Chen Wu
18 Precision Medicine Translational Research (PRoMoTeR) Center, Department of Population Medicine, Harvard Medical School and Pilgrim Health Care Institute, Boston, MA, USA
Elad Ziv
1 Department of Medicine, University of California San Francisco, San Francisco, CA, USA
Hakon Hakonarson
5 Center for Applied Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
6 Department of Human Genetics, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
Blanca E Himes
3 Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
L Keoki Williams
4 Center for Individualized and Genomic Medicine Research, Department of Internal Medicine, Henry Ford Health System, Detroit, MI, USA
Max A Seibold
3 Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
Esteban G. Burchard
1 Department of Medicine, University of California San Francisco, San Francisco, CA, USA
19 Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, USA

ABSTRACT

Baseline lung function, quantified as forced expiratory volume in the first second of exhalation (FEV1), is a standard diagnostic criterion used by clinicians to identify and classify lung diseases. Using whole genome sequencing data from the National Heart, Lung, and Blood Institute TOPMed project, we identified a novel genetic association with FEV1 on chromosome 12 in 867 African American children with asthma (p = 1.26 × 10−8, β = 0.302). Conditional analysis within 1 Mb of the tag signal (rs73429450) yielded one major and two other weaker independent signals within this peak. We explored statistical and functional evidence for all variants in linkage disequilibrium with the three independent signals and identified 9 variants as the most likely candidates responsible for the association with FEV1. Hi-C data and eQTL analysis demonstrated that these variants physically interacted with KITLG (aka SCF) and that their minor alleles were associated with increased expression of the KITLG gene in nasal epithelial cells. Gene-by-air-pollution interaction analysis found that the candidate variant rs58475486 interacted with past-year SO2 exposure (p = 0.003, β = 0.32). This study identified a novel protective genetic association with FEV1, possibly mediated through KITLG, in African American children with asthma.

INTRODUCTION

Asthma, a chronic pulmonary condition characterized by reversible airway obstruction, is one of the hallmark diseases of childhood in the United States (World Health Organization 2017). Asthma is also the most disparate common disease in the pediatric clinic, with significant variation in prevalence, morbidity, and mortality among U.S. racial/ethnic groups (Oh et al. 2016). Specifically, African American children carry a higher asthma disease burden compared to their European American counterparts (Akinbami et al. 2014; Akinbami 2015). Forced expiratory volume in the first second (FEV1), a measurement of lung function, is a vital clinical trait used by physicians to assess overall lung health and diagnose pulmonary diseases such as asthma (Johnson and Theurer. 2014). We have previously shown that genetic ancestry plays an important role in FEV1 variation and that African Americans have lower FEV1 compared to European Americans regardless of asthma status (Kumar et al. 2010; Pino-Yanes et al. 2015). The disparity in lung function between populations may explain disparities in asthma disease burden. Understanding the factors that influence FEV1 variation among individuals with asthma could lead to improved patient care and therapeutic interventions.

Twin and family-based studies estimate that the heritability of FEV1 ranged from 26% to 81%, supporting the combined contribution by genetic and environmental factors in FEV1 variation (Chatterjee and Das. 1995; Chen et al. 1996; Hukkinen et al. 2011; Palmer et al. 2001; Sillanpaa et al. 2017; Tian et al. 2017; Yamada et al. 2015). Genome-wide association studies (GWAS) of FEV1, including among individuals with asthma, have identified many variants that contribute to lung function (Li et al. 2013; Liao et al. 2014; Repapi et al. 2010; Soler Artigas et al. 2011; Soler Artigas et al. 2015; Wain et al. 2017). A search in NHGRI-EBI GWAS Catalog (version e98_r2020-03-08) on baseline lung function (FEV1) alone revealed 349 associations (Buniello et al. 2019). Most of these previous GWAS, however, were performed in adult populations of European descent, and their results may not generalize across populations or across the life span of an individual (Carlson et al. 2013; Martin, A. R. et al. 2017; Wojcik et al. 2019). Previous GWAS results are also limited due to their reliance on genotyping arrays. In particular, variation in non-coding regions of the genome is not adequately covered by many genotyping arrays because they were not designed to account for the population-specific genetic variability of all populations (Kim, M. S. et al. 2018; Zhang and Lupski. 2015). Whole genome sequencing (WGS) is a newer technology that captures nearly all common variation from coding and non-coding regions of the genome and is unencumbered by genotype array design constraints and differences in linkage disequilibrium patterns among populations. To date, no large-scale WGS studies of lung function have been performed in African American children with asthma (Martin et al. 2017).

FEV1 is a complex trait that is significantly influenced by both genetic variation and environmental factors, such as air pollution (Chatterjee and Das. 1995; Hukkinen et al. 2011; Palmer et al. 2001; Sillanpaa et al. 2017; Tian et al. 2017; Yamada et al. 2015). Exposure to ambient air pollution has been consistently associated with poor respiratory outcomes, including reduced FEV1 (Barraza-Villarreal et al. 2008; Brunekreef and Holgate. 2002; Ierodiakonou et al. 2016; Wise 2019). We previously showed that exposure to sulfur dioxide (SO2), an air pollutant emitted by the burning of fossil fuels, is significantly associated with reduced FEV1 in African American children with asthma in the SAGE II study (Neophytou et al. 2016). Because the genetic variants associated with FEV1 thus far do not account for the majority of its estimated heritability, considering gene-environment (GxE) interactions, specifically gene-by-air-pollution, may improve our understanding of lung function genetics (Moore 2005; Moore and Williams. 2009). Here, we performed a genome-wide association analysis using WGS data to identify common genetic variants associated with FEV1 in African American children with asthma in SAGE II and investigated the effect of GxE (SO2) interactions on FEV1 associations.

METHODS

Study population

This study examined African American children between 8 and 21 years of age with physician-diagnosed asthma from the Study of African Americans, Asthma, Genes & Environments (SAGE II). All SAGE II participants were recruited from the San Francisco Bay Area. The inclusion and exclusion criteria have been described in detail previously (Oh et al. 2012; White et al. 2016). Briefly, participants were eligible if they were 8-21 years of age, self-identified as African American, and had four African American grandparents. Study exclusion criteria included the following: 1) any smoking within one year of the recruitment date; 2) 10 or more pack-years of smoking; 3) pregnancy in the third trimester; 4) history of lung diseases other than asthma (for cases) or chronic illness (for cases and controls). Baseline lung function, defined as forced expiratory volume in the first second (FEV1), was measured by spirometry prior to administering albuterol as previously described (Oh et al. 2012).

TOPMed whole genome sequencing data

SAGE II DNA samples were sequenced as part of the Trans-Omics for Precision Medicine (TOPMed) whole genome sequencing (WGS) program (Taliun et al. 2019). WGS was performed at the New York Genome Center and Northwest Genomics Center on a HiSeq X system (Illumina, San Diego, CA) using a paired-end read length of 150 base pairs (bp), with a minimum of 30x mean genome coverage. DNA sample handling, quality control, library construction, clustering and sequencing, read processing and sequence data quality control are described in detail in the TOPMed website (TOPMed 2019). Variant calls were obtained from TOPMed data freeze 8 VCF files corresponding to the GRCh38 assembly. Variants with a minimal read depth of 10 (DP10) were used for analysis unless otherwise stated.

Genetic principal components, global ancestry, and kinship estimation

Genetic principal components (PCs), global ancestry, and kinship estimation on genetic relatedness were computed using biallelic single nucleotide polymorphisms (SNPs) with a PASS flag from TOPMed freeze 8 DP10 data. PCs and kinship estimates were computed using the PC-Relate function from the GENESIS R package (Conomos et al. 2015; Conomos et al. 2016) using a workflow available from the Summer Institute in Statistical Genetics Module 17 course website (Summer Institute in Statistical Genetics 2019). African global ancestry was computed using the ADMIXTURE package (Alexander et al. 2009) in supervised mode using European (CEU), African (YRI) and Native American (NAM) reference panels as previously described (Mak, A. C. Y. et al. 2018).

FEV1 GWAS

Non-normality of the distribution of FEV1 values was tested with the Shapiro-Wilk test in R using the shapiro.test function. Since FEV1 was not normally distributed (p = 1.41 × 10−8 for FEV1 and p = 1.05 × 10−8 for log10 FEV1), FEV1 was regressed on all covariates (age, sex, height, controller medications, sequencing centers, and the first 5 genetic PCs) and the residuals were inverse-normalized. These inverse-normalized residuals (FEV1.res.rnorm) were the main outcome of the discovery GWAS. The controller medication covariate included the use of inhaled corticosteroids (ICS), long-acting beta-agonists (LABA), leukotriene inhibitors and/or an ICS/LABA combo in the 2 weeks prior to the recruitment date.
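The inverse-normalization step can be illustrated with a minimal Python sketch of a rank-based inverse normal transform. This is not the authors' code: the function name, the rank offset of 0.5, and the average-rank handling of ties are assumptions, since the text does not state which variant of the transform was used. The input is assumed to be the residuals from regressing FEV1 on the covariates.

```python
from statistics import NormalDist


def inverse_normalize(values, offset=0.5):
    """Rank-based inverse normal transform (hypothetical helper).

    Each value is replaced by the standard-normal quantile of its
    rank-derived probability; tied values share their average rank.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    # With offset 0.5 the probability is (rank - 0.5) / n, always in (0, 1).
    return [std_normal.inv_cdf((r - offset) / (n - 2 * offset + 1)) for r in ranks]
```

The transform keeps the ordering of the residuals but forces their distribution to be approximately standard normal, which is why the effect sizes reported for FEV1.res.rnorm are not in liters.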

Genome-wide single variant analysis was performed on the ENCORE server (https://github.com/statgen/encore) using the linear Wald test (q.linear) originally implemented in EPACTS (https://genome.sph.umich.edu/wiki/EPACTS) and TOPMed freeze 8 data (DP0 PASS) with a MAF filter of 0.1%. All pairwise relationships with degree 3 or more relatedness (kinship values > 0.044) were identified, and one participant of the related pair was subsequently chosen at random and removed prior to analysis. All covariates used to obtain FEV1.res.rnorm were also included as covariates in the GWAS as recommended in a recent publication (Sofer et al. 2019). The association analysis was repeated using untransformed FEV1 and FEV1 percent predicted (FEV1.perc.predicted). FEV1 percent predicted was defined as the percentage of measured FEV1 relative to predicted FEV1 estimated by the Hankinson lung function prediction equation for African Americans (Hankinson et al. 1999). A secondary analysis that included smoking-related covariates (smoking status and number of smokers in the family) was performed in PLINK 1.9 (version 1p9_2019_0304_dev) (Chang et al. 2015; Purcell and Chang. 2013). To study whether association with FEV1 is specific to SAGE II participants with asthma, we repeated the association analysis adjusting for age, sex, height and the first 5 genetic PCs in SAGE II participants without asthma on the ENCORE server. All of these participants were sequenced in the same center. Regional association results were plotted using LocusZoom 1.4 (Pruim et al. 2010) with a 500 kilobase (Kb) flanking region. Linkage disequilibrium (R2) was estimated in PLINK 1.9. LD plot was generated using recoded genotype files (plink --recode 12) in Haploview (Barrett et al. 2005).
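As an illustration of the relatedness filter described above, the following Python sketch removes one randomly chosen member of each pair whose kinship exceeds 0.044. It is a sketch under stated assumptions rather than the pipeline that was run: the function name, the input layout of (id, id, kinship) tuples, the fixed seed, and the rule of skipping a pair when one member has already been removed are all hypothetical.

```python
import random


def prune_related(pairs, threshold=0.044, seed=0):
    """Return the set of participant IDs to drop (hypothetical helper).

    pairs: iterable of (id_a, id_b, kinship) tuples.
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    removed = set()
    for id_a, id_b, kinship in pairs:
        if kinship <= threshold:
            continue  # pair is not related at the chosen threshold
        if id_a in removed or id_b in removed:
            continue  # an earlier removal already broke up this pair
        removed.add(rng.choice((id_a, id_b)))
    return removed
```

After pruning, no pair above the threshold has both members left in the analysis set.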

The function effectiveSize in the R package CODA was used to estimate the actual effective number of independent tests and CODA-adjusted statistical and suggestive significance p-value thresholds were defined as 0.05 and 1 divided by the effective number of tests, respectively (Duggal et al. 2008). We compared the CODA-adjusted statistical significance threshold and the widely used 5 × 10−8 GWAS genome-wide significance threshold (Pe’er et al. 2008) and selected the more stringent threshold for genome-wide significance.
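The threshold arithmetic in this paragraph is simple enough to write down directly. The Python sketch below assumes the effective number of tests has already been estimated (the estimation itself, done with CODA in R, is not reproduced here); the function name and the returned dictionary keys are hypothetical.

```python
def significance_thresholds(effective_tests, alpha=0.05, conventional=5e-8):
    """Derive p-value thresholds from an effective number of tests.

    Returns the adjusted genome-wide threshold (alpha / n_eff), the
    suggestive threshold (1 / n_eff), and the more stringent of the
    adjusted and conventional genome-wide thresholds.
    """
    adjusted = alpha / effective_tests
    suggestive = 1.0 / effective_tests
    return {
        "adjusted": adjusted,
        "suggestive": suggestive,
        "genome_wide": min(adjusted, conventional),
    }
```

For example, 17,755 effective tests, the figure used for the lookup of previously reported loci, gives a suggestive threshold of 1/17,755, or about 5.63 × 10−5.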

The following WGS quality control steps were applied to all reported variants from ENCORE to ensure WGS variant quality: (1) The variant had VCF FILTER = PASS; (2) Variant quality was confirmed via manual inspection on the BRAVO server based on TOPMed freeze 5 data (University of Michigan and NHLBI TOPMed. 2018); (3) Variants were reanalyzed with linear regression using PLINK 1.9 by applying the arguments --mac 5 --geno 0.1 --hwe 0.0001 using TOPMed freeze 8 DP10 PASS data.

To determine if the rs73429450 association with FEV1 was only identifiable using whole genome sequencing data, we repeated the linear regression association analysis on signals that passed the genome-wide significance threshold using PLINK 1.9 and genotype data generated with the Axiom Genome-Wide LAT 1 array (Affymetrix, Santa Clara, CA, dbGaP phs000921.v1.p1). These array genotype data were imputed using the following reference panels on the Michigan Imputation Server (Das et al. 2016): 1000 Genomes phase 3 version 5, Haplotype Reference Consortium (HRC) r1.1, the Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA), and TOPMed freeze 5. It should be noted that 500 SAGE II subjects were part of the TOPMed freeze 5 reference panel.

A total of 349 GWAS FEV1-associated entries were retrieved from the NHGRI-EBI GWAS Catalog version 1.0.2-associations_e98_r2020-03-08 (Buniello et al. 2019) using the trait names “Lung function (FEV1)”, “FEV1”, “Lung function (forced expiratory volume in 1 second)” or “Prebronchodilator FEV1”. After adding 100 Kb flanking regions to each of the 349 entries, a total of 230 non-overlapping regions were obtained. To look up whether we replicated previously reported GWAS loci while controlling the multiple testing penalty, we used only the 279,495 common variants (MAF >= 0.01) that overlapped with the 230 regions. These 279,495 common variants are equivalent to 17,755 effective tests based on CODA, and 5.63 × 10−5 (1/17,755) was used as the suggestive p-value threshold for replication.
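The collapse of flanked catalog entries into non-overlapping regions is a standard interval merge. The Python sketch below shows one way to do it, assuming each entry is reduced to a single (chromosome, position) pair and that the 100 Kb flank is added on both sides; the function name and input layout are hypothetical and the positions in any example are made up.

```python
def merge_flanked_regions(entries, flank=100_000):
    """Merge flanked positions into non-overlapping regions.

    entries: iterable of (chrom, position) pairs.
    Returns a sorted list of (chrom, start, end) tuples.
    """
    intervals = sorted(
        (chrom, max(0, pos - flank), pos + flank) for chrom, pos in entries
    )
    merged = []
    for chrom, start, end in intervals:
        last = merged[-1] if merged else None
        if last and last[0] == chrom and start <= last[2]:
            # Overlaps the previous region on the same chromosome: extend it.
            merged[-1] = (chrom, last[1], max(last[2], end))
        else:
            merged.append((chrom, start, end))
    return merged
```

Two entries closer than 200 Kb on the same chromosome end up in one region, which is how 349 entries can reduce to a smaller number of regions.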

Conditional analysis

Conditional analysis was performed to identify all independent signals in a GWAS peak using PLINK 1.9. All TOPMed freeze 8 DP10 variants within 1 megabase (Mb) of the tag association signal and with association p-value of 1 × 10−4 or smaller in the discovery GWAS were included in the analysis. Variants were first ordered by ascending p-value. A variant was considered to be an independent signal if the association p-value after conditioning (conditional p-value) on the tag signal was smaller than 0.05. Newly identified independent signals were included with the tag signal for conditioning on the next variant.
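The stepwise procedure can be sketched in Python as a loop over p-value-ordered variants with a growing conditioning set. This is only an outline of the control flow: the regression itself (run with PLINK 1.9 in the study) is abstracted into a caller-supplied function, and the names find_independent_signals and conditional_p are hypothetical.

```python
def find_independent_signals(tag, candidates, conditional_p, alpha=0.05):
    """Stepwise selection of independent signals around a tag variant.

    tag: ID of the tag association signal.
    candidates: list of (variant_id, discovery_p) pairs.
    conditional_p: callable(variant_id, conditioning_ids) -> p-value from a
        model that conditions on the variants in conditioning_ids.
    """
    signals = [tag]
    # Visit variants from the smallest to the largest discovery p-value.
    for variant, _discovery_p in sorted(candidates, key=lambda item: item[1]):
        if variant in signals:
            continue
        if conditional_p(variant, tuple(signals)) < alpha:
            # Still associated after conditioning: treat it as independent
            # and condition on it for all later variants.
            signals.append(variant)
    return signals
```

Because each newly accepted signal joins the conditioning set, a variant tested later must remain associated after accounting for every signal found before it.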

Region-based association analysis

Region-based association analyses were performed in 1 Kb sliding windows with 500 bp increments in a 1 Mb flanking region of the tag GWAS signal using the SKAT_CommonRare function from the SKAT R package v1.3.2.1 (Ionita-Laza et al. 2013). Default settings were used with method = “C” and test.type = “Joint”. A minor allele frequency (MAF) threshold of 0.01 was used as the cutoff to distinguish rare and common variants. Variants were annotated in TOPMed using the WGSA pipeline (Liu et al. 2016). Since SKAT imputes missing genotypes by default by assigning mean genotype values (impute.method=“fixed”), we chose to use low coverage genotypes instead of SKAT imputation, and hence, TOPMed freeze 8 DP0 variants with a VCF FILTER of PASS were included in the analysis. The function effectiveSize in the R package CODA (Plummer et al. 2006) was used to estimate the effective number of independent hypothesis tests for accurate Bonferroni multiple testing corrections. P-value thresholds for statistical significance and suggestive significance were defined as 0.05 and 1 divided by the effective number of tests, respectively (Duggal et al. 2008). If a region was suggestively significant, region-based analyses were repeated with functional variants and/or rare variants (MAF <= 0.01) to assess contribution of common, rare and/or functional variants. Region-based analyses using rare variants only were performed using SKAT-O (Lee et al. 2012). The WGSA annotation filters used to define functional variants are provided in File S1 (Supplementary Text 1). To study the contribution of individual variants to a region-based association p-value, drop-one variant analysis was performed by repeating the region-based analysis multiple times and dropping one variant only at a time.
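The window layout used for the region-based tests can be sketched as a small Python generator. The coordinate conventions are assumptions, since the text does not state them: positions are treated as 1-based and inclusive, and only windows that fit entirely inside the flanking region are emitted. The function name is hypothetical.

```python
def sliding_windows(center, flank=1_000_000, width=1_000, step=500):
    """Yield (start, end) windows covering center +/- flank.

    Windows are `width` bp long and advance by `step` bp, so consecutive
    windows overlap by width - step bp.
    """
    start = max(1, center - flank)
    stop = center + flank
    pos = start
    while pos + width - 1 <= stop:
        yield pos, pos + width - 1
        pos += step
```

Calling list(sliding_windows(88_846_435)) with the position of rs73429450 produces the windows around the tag signal; under these conventions each variant away from the region edges falls into two overlapping windows.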

Functional annotations and prioritization of genetic variants

The Hi-C Unifying Genomic Interrogator (HUGIN) (Ay et al. 2014; Martin, J. S. et al. 2017; Schmitt et al. 2016) was used to assign potential gene targets to each variant. HUGIN uses the Hi-C data generated from the primary human tissues from four donors used in the Roadmap Epigenomics Project (Schmitt et al. 2016). ENCODE annotations (ENCODE Project Consortium 2011; ENCODE Project Consortium 2012) were based on overlap of the variants with functional data downloaded from the UCSC Table Browser (Karolchik et al. 2004). These data included DNase I hypersensitivity peak clusters (hg38 wgEncodeRegDnaseClustered table), transcription factor ChIP-Seq clusters (hg38 encRegTfbsClustered table) and histone modification ChIP-Seq peaks (hg19 wgEncodeBroadHistone<cell type><histone>StdPk tables). For DNase I hypersensitivity and transcription factor binding sites, we focused on blood, bone marrow, lung and embryonic cells. For histone modification ChIP-Seq, we focused on H3K27ac and H3K4me3 modifications in human blood (GM12878), bone marrow (K562), lung fibroblast (NHLF), and embryonic stem cells (H1-hESC). The LiftOver tool (Hinrichs et al. 2006) was used to convert genomic coordinates from hg19 to hg38. Candidate cis-regulatory elements (ccREs) were a subset of representative DNase hypersensitivity sites with epigenetic activity further supported by histone modification (H3K4me3 and H3K27ac) or CTCF-binding data from the ENCODE project. Overlap of variants with ccREs was detected using the Search Candidate cis-Regulatory Elements by ENCODE (SCREEN) web interface (ENCODE Project Consortium 2011; ENCODE Project Consortium 2012).

Prioritization of genetic variants was based on the presence of statistical, functional and/or bioinformatic evidence as described in the Diverse Convergent Evidence (DiCE) prioritization framework (Ciesielski et al. 2014). The priority score of each variant was obtained by counting the number of statistical, functional, and/or bioinformatic evidences that support potential biological function for that variant.

Replication of GWAS associations

All replication analyses were performed in subjects with asthma. Replication of GWAS FEV1 associations was attempted on TOPMed whole genome sequencing data generated from four cohorts. These cohorts included Puerto Rican (n=1,109) and Mexican American (n=649) children in the Genes-Environments and Admixture in Latino Americans (GALA II) study (Oh et al. 2012), African American adults in the Study of Asthma Phenotypes and Pharmacogenomic Interactions by Race-Ethnicity (SAPPHIRE, n=3,428) (Levin et al. 2014) and African American children in Genetics of Complex Pediatric Disorders (GCPD-A, n=1,464) study (Ong et al. 2013). Age, sex, height, controller medications and the first 5 PCs were used as covariates.

Additionally, replication of GWAS FEV1 associations was attempted using data of black UK Biobank subjects who had asthma (n=627) while adjusting for age, sex, height and the first 5 principal components. Asthma status was defined by ICD code or self-reported asthma. UK Biobank genotype data was generated on Affymetrix UK BiLEVE axiom or UK Biobank Axiom array and imputed into the Haplotype Reference Consortium, 1000G and UK 10K projects (Bycroft et al. 2018; Canela-Xandri et al. 2018). Additional details on the UK Biobank study and the replication procedures are available in File S1 (Supplementary Text 2).

RNA sequencing and expression quantitative trait loci (eQTL) analysis

Whole-transcriptome libraries of 370 nasal brushings from GALA II Puerto Rican children with asthma were constructed using the Beckman Coulter FX automation system (Beckman Coulter, Fullerton, CA). Libraries were sequenced with the Illumina HiSeq 2500 system. Raw RNA-Seq reads were trimmed using Skewer (Jiang et al. 2014) and mapped to human reference genome hg38 using Hisat2 (Kim, D. et al. 2015). Reads mapped to genes were counted with htseq-count using the UCSC hg38 GTF file as reference (Anders et al. 2015). Cis-expression quantitative trait locus (eQTL) analysis of KITLG was performed as described in the Genotype-Tissue Expression (GTEx) project version 7 protocol (GTEx Consortium et al. 2017) using age, sex, BMI, global African and European ancestries and 60 PEER factors as covariates.

Gene-by-air-pollution interaction analysis

We hypothesized that the effect of genetic variation on lung function in our study population may differ by the levels of exposure to SO2 (Neophytou et al. 2016). To test for an interaction between a genetic variant and SO2, an additional multiplicative interaction term (variant × SO2 exposure) was included in the original GWAS model (see Methods Section “FEV1 GWAS”). The SO2 estimates used in the interaction analysis were first-year, past-year, and lifetime exposure to ambient SO2, which were estimated as described previously (Neophytou et al. 2016). Briefly, we obtained regional ambient daily air pollution data from the U.S. Environmental Protection Agency Air Quality System. SO2 estimates for the participant’s residential geographic coordinate were calculated as the inverse distance-squared weighted average from the four closest air pollution monitoring stations within 50 km of the participant’s residence. We estimated yearly exposure at the reported residential address by averaging all available daily measures (daily average of 1-hour SO2) in a given year. If the participant had a change of residential address in a given year, we estimated yearly exposure as a time-weighted estimate based on the number of months spent at each different address in that year. Average lifetime exposures were estimated using all available yearly average estimates over the lifetime of the participant until the day of spirometry testing. Since not all pollutants were measured daily, there are location- and pollutant-dependent missing values. Residuals of FEV1 were plotted against exposure to SO2 and stratified by the number of copies of the minor allele. Residuals of FEV1 were obtained as described in the Methods Section “FEV1 GWAS”.
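The two exposure calculations described above, the inverse distance-squared weighted average across monitors and the time-weighted yearly estimate across addresses, can be sketched in Python as follows. The function names and input layouts are hypothetical, distances are assumed to be precomputed in kilometers, and the treatment of the 50 km boundary as inclusive and of a monitor at zero distance are assumptions not stated in the text.

```python
def idw_exposure(monitors, max_km=50.0, n_closest=4):
    """Inverse distance-squared weighted average of monitor readings.

    monitors: list of (distance_km, so2_value) pairs for one residence.
    Returns None when no monitor lies within max_km.
    """
    eligible = sorted(m for m in monitors if m[0] <= max_km)[:n_closest]
    if not eligible:
        return None
    for distance, value in eligible:
        if distance == 0:
            return value  # monitor located at the residence itself
    weights = [1.0 / distance ** 2 for distance, _ in eligible]
    weighted_sum = sum(w * value for w, (_, value) in zip(weights, eligible))
    return weighted_sum / sum(weights)


def time_weighted_year(estimates):
    """Combine address-level estimates for one year.

    estimates: list of (months_at_address, exposure) pairs.
    """
    total_months = sum(months for months, _ in estimates)
    return sum(months * exposure for months, exposure in estimates) / total_months
```

Squaring the distance means that a monitor twice as far away contributes one quarter of the weight, so the nearest stations dominate the estimate.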

Data availability

Local institutional review boards approved the studies (IRB# 10-02877). All subjects and legal guardians provided written informed consent. TOPMed whole genome sequencing and phenotype data from SAGE II are available on dbGaP under accession number phs000921.v4.p1. Normalized gene count data for KITLG and supplemental materials are available at figshare.

RESULTS

Novel lung function associations

Subject characteristics of the 867 African American children with asthma included in this study are shown in Table 1, and the distribution of their FEV1 measurements (mean = 2.56 L, standard deviation = 0.79 L) is shown in Figure S1. The CODA-adjusted statistical significance thresholds 2.10 × 10−8 and 4.19 × 10−7 were used as the genome-wide and suggestive significance thresholds, respectively. According to these thresholds, one SNP on chromosome 12 (chr12:88846435, rs73429450, G>A) was associated with FEV1.res.rnorm (Figure 1, p = 9.01 × 10−9, β = 0.801) at genome-wide significance. The association between rs73429450 and lung function remained statistically significant when the analysis was repeated using untransformed FEV1 (p = 1.26 × 10−8, β = 0.302) as the outcome variable. The association between rs73429450 and lung function was suggestive using FEV1.perc.predicted (p = 1.69 × 10−7, β = 0.100). Twenty suggestive associations corresponding to 4 tag signals are reported in Supplementary File S2. None of the suggestive associations overlapped with any of the previously reported FEV1-associated loci. When considering only common variants and applying a p-value threshold of 5.63 × 10−5, we replicated 6 out of 230 previously reported FEV1 associations (Table S1). Our top FEV1 association, rs73429450, did not overlap with any previously reported loci and is a novel association with FEV1 in this study population.

Table 1.

Descriptive characteristics of 867 African American children with asthma included in this study.

Figure 1.

Manhattan and LocusZoom plots from genome-wide association study of lung function*. (A) Manhattan plot from genome-wide association study of lung function* using linear regression in ENCORE. Red horizontal line: CODA-adjusted genome-wide significance p-value of 2.10 × 10−8. Blue horizontal line: CODA-adjusted suggestive significance p-value of 4.19 × 10−7. (B) LocusZoom plot of rs73429450 (chr12:88846435) and 500 Kb flanking region. Colors show linkage disequilibrium in the study population. * FEV1.res.rnorm was used as the phenotype for the association testing.

Secondary analysis that included covariates correcting for smoking status and number of smokers in the family showed that smoking-related factors were not significantly associated with FEV1 in our pediatric SAGE cohort: using the 657 of 867 individuals with available smoking-related covariates, the FEV1.res.rnorm association p-values before and after including the smoking-related covariates were 2.01 × 10−6 and 1.89 × 10−6, respectively. Neither covariate was significant: smoking status (p = 0.27) and number of smokers in the family (p = 0.54).

Conditional analysis was performed on 45 variants with association p < 1 × 10−4 located within 1 Mb of the strongest association signal (rs73429450). Two weaker independent signals (rs17016065, rs58475486) were identified (Table S2). None of the 45 variants showed association with FEV1.res.rnorm in 251 SAGE II children without asthma (Table S3).

The minor allele frequency of rs73429450 in continental populations from the 1000 Genomes Project (1000G) is 3% in Africans (AFR) and < 1% in Admixed Americans (AMR), Europeans (EUR) and Asians (EAS and SAS) (1000 Genomes Project Consortium et al. 2015). Rs73429450 was not included on the Affymetrix LAT1 genotyping array on which SAGE participants were previously genotyped. To determine if the rs73429450 association with FEV1 was only identifiable using whole genome sequencing data, we attempted to reproduce our results by imputing the genotype of rs73429450 in 851 SAGE participants with available array data using the 1000G phase 3 (n = 2,504), HRC r1.1 (n = 32,470), CAAPA (n = 883) and TOPMed freeze 5 (n = 62,784) reference panels. Our results remained statistically significant when using the 1000G phase 3 (p = 4.97 × 10−8, β = 0.79, imputation R2 = 0.95) and TOPMed freeze 5 (p = 1.22 × 10−8, β = 0.80, imputation R2 = 0.98) reference panels, but lost statistical significance when rs73429450 genotypes were imputed using the HRC (p = 4.35 × 10−7, β = 0.68, imputation R2 = 0.94) and CAAPA (p = 1.95 × 10−7, β = 0.80, imputation R2 = 0.71) reference panels.

Region-based association analysis including all variants conditioned on the association signal from rs73429450 was performed in its 1 Mb flanking region (chr12:87846435-89846435). No windows were significantly associated after Bonferroni multiple testing correction (p < 2.80 × 10−4, Figure S2), but 20 windows were suggestively associated with FEV1.res.rnorm (p < 5.60 × 10−3, Table S5). Two of the 20 windows re-tested using only functional variants were suggestively significant (regions 4 and 16). Both of these windows were no longer suggestively significant after removing the common variants, indicating that the association signal from these regions was mostly driven by common variants. Further investigation of region 16 using drop-one analysis on the 2 rare and 1 common functional variants confirmed the major contribution of the common variant, rs1895710, as shown by the major increase in p-value (Table S6). The signal was also slightly driven by the singleton, rs990979778. Drop-one analysis was not performed on region 4 because it contained only 1 common and 1 rare variant.

A Hi-C assay couples a chromosome conformation capture (3C) assay with next-generation sequencing to capture long-range interactions in the genome. We identified a statistically significant long-range chromatin interaction between the GWAS peak and the KIT ligand (KITLG, also known as stem cell factor, SCF) gene in human fetal lung fibroblast cell line IMR90 (Table S7). The long-range interaction detected in human primary lung tissue was not significant, implying that the potential long-range interactions are specific to tissue type or developmental stage.

Potential regulatory role of FEV1-associated variants on KITLG expression

To further elucidate potential regulatory relationships between the GWAS association peak and KITLG, we analyzed whether variants in the peak were eQTLs of KITLG in previously published whole blood RNA-Seq data available from the same study participants (Mak et al. 2016). The whole blood RNA-Seq data, however, did not yield evidence of KITLG expression, consistent with results in GTEx. We subsequently used RNA-Seq data from nasal epithelial cells of 370 Puerto Rican children with asthma from the GALA II study, and found that five of the 45 variants were eQTLs of KITLG (Table S8). While Puerto Ricans are a different population than African Americans, both are admixed populations with substantial African genetic ancestry, and therefore could share eQTLs. All five eQTLs corresponded to one signal in a region with strong linkage disequilibrium (r2 > 0.8, Figure S3).

Replication of genetic association with FEV1

Subject characteristics of our four replication cohorts (SAPPHIRE, GCPD-A, UK Biobank and GALA II) are shown in Table S9. We attempted to replicate the association of the 45 SNPs in our primary FEV1 GWAS in each cohort. We used 0.05 as the suggestive p-value threshold and 0.0167 as the Bonferroni-corrected p-value threshold after correcting for 3 independent signals (see conditional analysis in Results Section). A total of 20 variants were replicated at p < 0.05 with consistent direction of effect in black UK Biobank participants; 14 variants in SAPPHIRE and 2 variants in GCPD-A were significant but had an opposite direction of effect (Table S10).

We attempted to replicate the FEV1.res.rnorm association in Mexican American (n = 649) and Puerto Rican (n = 1,109) children with asthma from the GALA II study. In Mexican Americans, we excluded 19 variants with MAF < 0.1% and associations for the remaining 26 variants did not replicate (Table S11). In Puerto Ricans, the associations were not replicated (Table S11).

Incorporating statistical and functional evidence for candidate variant prioritization

We combined and summarized all functional evidence for the top 45 variants, along with eQTL findings from nasal epithelial RNA-Seq and replication results (Figure 2, Tables 2 and S12). To facilitate interpretation of the variant association with FEV1, the effect sizes and p-values of both FEV1 (β and p) and FEV1.res.rnorm (βnorm and pnorm) associations are also reported. CADD functional prediction scores and ENCODE histone modification ChIP-Seq peaks in embryonic, blood, bone marrow, and lung-related tissues were also examined but not reported because none of the variants had a CADD score greater than 10 and none overlapped with histone modification sites. Rs73440122 received the highest priority score of 3 based on replication in the UK Biobank, overlap with a DNase I hypersensitivity site in B-lymphoblastoid cells (GM12865) and overlap with an SPI1 binding site in acute promyelocytic leukemia cells. Eight other variants were prioritized with score ≥ 2 or evidence of being an eQTL for KITLG in nasal epithelial cells (Table 2, score marked with ^ or # respectively). These nine candidate variants were selected for gene-by-air-pollution interaction analyses.

Table 2.

Genome-wide lung function association in SAGE II children with asthma.

Figure 2.

Integration of statistical and functional evidence for variant prioritization. Numbers and different shades of black in the LD plot represent LD in R2. The three independent signals identified in the conditional analysis are marked with *. Indels are marked with &. Nasal eQTL, variants that are eQTLs of KITLG in nasal epithelial cells. ccREs, candidate cis-regulatory elements in the SCREEN registry. ENCODE, DNase I hypersensitivity site and/or transcription factor ChIP-Seq peak overlapping with the variants. UK Biobank, SAPPHIRE, GCPD-A, replication results using Blacks in UK Biobank and African Americans in the SAPPHIRE and GCPD-A cohorts (R = replicated at p < 0.05; F = flip-flop association at p < 0.05). Candidate, candidate variants prioritized because of the presence of two or more lines of evidence or because they are nasal eQTLs. + indicates presence of evidence. Boxes in the top panel were shaded grey if results were not available.

Gene-by-air-pollution interaction of rs58475486

We previously found that first year of life and lifetime exposure to SO2 were associated with FEV1 in African American children (Neophytou et al. 2016). We investigated whether the effect of the nine prioritized genetic variants associated with lung function varied by SO2 exposure (first year of life, past year, and lifetime exposure). Since the nine variants represent three independent signals (see conditional analysis in the Results Section), the Bonferroni-corrected p-value threshold was set to p = 0.0056 (correction for nine tests: three signals and three exposure periods to SO2). We observed a single statistically significant interaction between the T allele of rs58475486 and past year exposure to SO2 that was positively associated with FEV1 (p = 0.003, β = 0.32, Table 3, Figure 3A). This interaction remained significant (p = 0.003, β = 0.32) in secondary analyses adjusted for smoking status or a multiplicative interaction term of rs58475486 and smoking status as additional covariates. Interestingly, six of the remaining eight variants also displayed interaction effects with past year exposure to SO2 that were suggestively associated (p < 0.05) with FEV1 (Table 3). We also found a suggestive interaction of the C allele of rs73440122 with first year exposure to SO2 that was associated with decreased FEV1 (p = 0.045, β = −0.32, Figure 3B). The same allele also showed an interaction with past year exposure to SO2 that was suggestively associated with FEV1 in the opposite direction (p = 0.051, β = 0.39).
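The multiple-testing arithmetic behind these thresholds can be checked with a short Python sketch. It assumes a family-wise error rate of 0.05, which is consistent with the reported thresholds but is an assumption of this example, and the function and variable names are illustrative only.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


ALPHA = 0.05            # assumed family-wise error rate
N_SIGNALS = 3           # independent signals from the conditional analysis
N_EXPOSURE_WINDOWS = 3  # first year of life, past year, lifetime

# Replication analysis: corrected for three independent signals.
replication_threshold = bonferroni_threshold(ALPHA, N_SIGNALS)
# Interaction analysis: three signals times three exposure windows.
gxe_threshold = bonferroni_threshold(ALPHA, N_SIGNALS * N_EXPOSURE_WINDOWS)

print(f"replication: {replication_threshold:.4f}")  # replication: 0.0167
print(f"GxE:         {gxe_threshold:.4f}")          # GxE:         0.0056

# Reported interaction p-values compared against the corrected threshold.
print(0.003 < gxe_threshold)  # True  (rs58475486, past year SO2)
print(0.045 < gxe_threshold)  # False (rs73440122, first year SO2; suggestive only)
```

Under this correction, only the rs58475486 interaction with past year SO2 exposure passes the threshold of 0.0056, while interactions with 0.0056 ≤ p < 0.05 are treated as suggestive.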

Table 3.

Gene-and-environment analysis on FEV1

Figure 3.

Gene-by-environment interaction analysis on FEV1. FEV1 residuals, residuals after FEV1 was regressed on the covariates age, sex, height, controller medications, sequencing centers and the first 5 genetic PCs. FEV1 residuals were plotted against (A) past year exposure to SO2 stratified by the number of copies of the T allele of rs58475486, (B) first year of life exposure to SO2 stratified by the number of copies of the C allele of rs73440122.

DISCUSSION

Variant rs73429450 (MAF = 0.030) was identified as the strongest association signal with FEV1. Each additional copy of the protective A allele of rs73429450 was associated with a 0.3 L increase in FEV1. We did not find any statistically significant contribution of rare variants to the association signal from a 1 Kb sliding window analysis in the 1 Mb flanking region centered on rs73429450. We were surprised to identify a novel common variant (MAF = 0.030) associated with lung function using whole genome sequence data in a population that was previously analyzed for associations with lung function using genotype array data. Further investigation revealed that our discovered variant, rs73429450, was not captured by the LAT1 genotyping array, and the association with lung function depended on the reference panel used to impute the variant into our population. More surprisingly, our statistically significant finding was only suggestively significant using data imputed from the CAAPA reference panel (p = 1.95 × 10−7, β = 0.80). Of the imputation reference panels that we assessed, CAAPA is one of the more relevant reference panels for our study population because it is based on African populations in the Americas. However, we note that the effect size estimated from CAAPA-imputed data was comparable to that generated from WGS data. While whole genome sequencing data is usually praised for enabling analysis of rare-variant contributions to phenotype variability, our results show the utility of whole genome sequencing data for the reliable analysis of common variants as well in the absence of relevant imputation panels.

Although rs73429450 had the lowest p-value from our whole genome sequencing association analysis, we did not find the required amount of functional evidence to prioritize this marker for inclusion in downstream gene-by-air-pollution analyses. Another variant, rs73440122, was in moderate to strong linkage disequilibrium (r2 = 0.76) with rs73429450 and had a similar MAF (0.027) in our study population, but was only suggestively associated with FEV1 in our association analysis (p = 2.08 × 10−7, Table 2). In contrast to rs73429450, there were multiple lines of evidence suggesting the functional relevance of rs73440122: rs73440122 received the highest priority score based on its replicated FEV1 association in black UK Biobank participants and overlap with ENCODE gene regulatory regions, making it one of the most likely drivers of FEV1 variability among individuals, possibly mediated through KITLG.

Bioinformatic interrogation of rs73440122 revealed that the variant overlapped with a ccRE (SCREEN accession EH37E0279310), a DNase I hypersensitivity site, and SPI1 ChIP-Seq clusters that were indicative of a candidate open chromatin gene regulatory region (Table S12). The binding evidence of SPI1 is highly relevant to the role of KITLG in type 2 inflammation (see below). Variant rs73440122 is located in a region that physically interacted with KITLG based on Hi-C data in fetal lung fibroblast cells. Additionally, five neighboring FEV1-associated variants were identified as eQTLs of KITLG, although they appeared to be an independent signal (r2 < 0.2). Overall, these results support regulatory interactions between our novel locus and KITLG.

Atopic or type 2 high asthma is the most common form of asthma in children (Comberiati et al. 2017). KITLG, more commonly known as stem cell factor (SCF), is a ligand of the KIT tyrosine kinase receptor. It plays an important role in type 2 inflammation in atopic asthma, especially in inflammatory processes mediated through mast cells, IgE and group 2 innate lymphoid cells (Da Silva and Frossard. 2005; Da Silva et al. 2006; Fonseca et al. 2019; Oliveira and Lukacs. 2003). In the airways, KITLG is expressed in bronchial epithelial cells, lung fibroblasts, bronchial smooth muscle cells, endothelial cells, peripheral blood eosinophils, dendritic cells and mast cells (Hsieh et al. 2005; Kassel et al. 1999; Oriss et al. 2014; Valent et al. 1992; Wen et al. 1996). KITLG is a major growth factor of mast cells (Reviewed in Broudy 1997; Da Silva et al. 2006; Galli et al. 1994; Galli et al. 1995). It promotes recruitment of mast cell progenitors into tissues (Reviewed in Oliveira and Lukacs. 2003), prevents mast cell apoptosis (Iemura et al. 1994; Mekori et al. 1993) and promotes release of inflammatory mediators such as proteases, histamine, chemotactic factors, cytokines (Reviewed in Amin 2012; Borish and Joseph. 1992). While KITLG promotes the production of cytokines like IL-13 upon IgE-receptor crosslinking on the surface of mast cells (Kobayashi et al. 1998), IL-13 was also reported to up-regulate KITLG (Rochman et al. 2015). Consistent with the critical role of KITLG for mast cells and type 2 inflammation, we found our prioritized variant, rs73440122, overlapped with a SPI1 (aka PU.1) ChIP-Seq cluster. The transcription factor SPI1 was demonstrated in SPI1 knockout mice to be necessary for the development of B cells, T cells, neutrophils, macrophages, dendritic cells, and mast cells (Anderson et al. 2000; Guerriero et al. 2000; McKercher et al. 1996; Scott et al. 1994; Scott et al. 1997; Walsh et al. 2002). 
It plays an essential role in macrophage differentiation in asthmatic and other allergic inflammation (Qian et al. 2015; Yashiro et al. 2019). It was also shown to regulate the cell fate between mast cells and monocytes (Ito et al. 2005; Ito et al. 2009; Nishiyama, Nishiyama, Ito, Masaki, Maeda et al. 2004; Nishiyama, Nishiyama, Ito, Masaki, Masuoka et al. 2004). The presence of a SPI1 binding site in a candidate regulatory region of KITLG is therefore highly relevant given the critical role of KITLG in mast cell survival and activation.

Higher levels of KITLG (Al-Muhsen et al. 2004; Da Silva et al. 2006; Tayel et al. 2017) and an increased number of mast cells in the lung (Cruse and Bradding. 2016; Fajt and Wenzel. 2013; Mendez-Enriquez and Hallgren. 2019) were detected in individuals with asthma. The percentage of a subpopulation of circulating blood mast cell progenitors (Lin+ CD34hi CD117int/hi FcεRI+) was higher in individuals with reduced lung function (Dahlin et al. 2016). These findings suggested that higher KITLG expression and/or number of mast cells may be a contributing factor to lower lung function. This notion was inconsistent with the association of our novel locus with higher KITLG expression and increased lung function in SAGE II children with asthma. Interestingly, a study of 20 subjects with severe asthma found that an increase in the number of chymase-positive mast cells in the small airway was associated with an increase in lung function (Balzar et al. 2005). Overall, while there is still controversy on the direction of effect, previous findings support the association of our novel KITLG locus with lung function, especially in patients with allergic asthma. Our novel locus likely represents part of a complex regulatory mechanism that modulates immune cell differentiation, survival, and activation in highly cell-specific and context-dependent manners. Further studies are required to determine how this locus is regulated in different airway and immune cells to affect lung function outcomes in the context of asthma.

GxE interactions likely account for a portion of the “missing” heritability of many complex phenotypes (Moore and Williams. 2009). We previously found that lung function in SAGE II participants was associated with first year of life and lifetime exposures to SO2 (1.66% decrease [95% CI = −2.92 to −0.37] for first year of life and 5.30% decrease [95% CI = −8.43 to −2.06] for lifetime exposures in FEV1 per 1 ppb increase in SO2) (Neophytou et al. 2016). We hypothesized that a significant portion of the heritability of lung function was due to gene-by-air-pollution (SO2) interaction effects. The interaction between rs58475486 and past year exposure to SO2 that was significantly associated with lung function supports our hypothesis. The T allele of rs58475486 is common (8-14%) in African populations and showed a protective effect on lung function in the presence of past year SO2 exposure. SNP rs58475486 is located in a ccRE (SCREEN accession EH37E0279296) and a FOXA1 binding site in the A549 lung adenocarcinoma cell line. FOXA1 has a known compensatory role with FOXA2 during lung morphogenesis in mice (Wan et al. 2005). Deletion of both FOXA1 and FOXA2 inhibited cell proliferation, epithelial cell differentiation, and branching morphogenesis in fetal lung tissue. Further functional validation of the effect of rs58475486 on the binding affinity of FOXA1 is necessary to confirm whether the role of FOXA1 in this ccRE is important for KITLG regulation and lung function.

The higher frequency of the protective alleles of both rs73440122 and rs58475486 in African populations appears to contradict previous findings that African ancestry was associated with lower lung function (Kumar et al. 2010). One possible explanation for this apparent inconsistency is that FEV1 is a complex trait whose variation is influenced by many genetic variants of small to moderate effect sizes whose influence on lung function may vary by exposure to environmental factors. We found suggestive evidence that the interaction between rs73440122 and first year exposure to SO2 reverses the positive association of rs73440122 with lung function to a negative one (Table 3). When assessed independently, our genetic association analysis showed that the protective A allele of rs73440122 was associated with higher lung function. However, with increasing levels of SO2 exposure in the first year of life, increasing copies of the A allele of rs73440122 were associated with decreased lung function. Air pollution is known to negatively impact lung function, and we have previously shown that the deleterious effects of air pollution on lung phenotypes may be significantly increased in African American children compared to other populations experiencing the same amount of exposure (Nishimura et al. 2013). It has also been reported that Latino and African American populations often live in neighborhoods with high levels of air pollution (Mott 1995). The increased susceptibility to negative pulmonary effects from air pollution exposure coupled with the disproportionate exposure to air pollution experienced by the African American population may also contribute to the lower lung function seen in this population despite the presence of protective alleles.
The overlap of the SPI1 binding site with rs73440122 further supports gene-by-SO2 interaction at this locus, since SPI1 played a critical role in the development of type 2 inflammation in the airways through macrophage polarization (Qian et al. 2015). We noted that the rs73440122 A allele also showed an interaction approaching suggestive threshold with past year exposure to SO2 that was positively associated with FEV1. The difference is not surprising because age of exposure may significantly impact the effect of air pollution on lung function (Reviewed in Usemann et al. 2019). Further studies are required to better understand the effect of this suggestive interaction on lung function.

One strength of this study is the interrogation of independent lung function associated signals at our novel locus. We identified evidence of three independent signals: the replicated signal that showed evidence of regulatory functions (an open chromatin region with a SPI1/PU.1 binding site), one signal that showed a statistically significant gene-by-SO2 interaction on lung function, and one signal that corresponds to KITLG eQTLs in nasal epithelial cells together with a suggestive gene-by-SO2 interaction. Our results offer a glimpse of the complicated genetic architecture behind complex traits.

One limitation of this study is that the FEV1 genetic association and the eQTL analyses with KITLG were performed in different populations due to data availability constraints. Although we did not have RNA-Seq data from lung tissues from our study subjects, we previously demonstrated that there is a high degree of overlap in gene expression profiles between nasal and bronchial epithelial cells (Poole et al. 2014). The direction of effect of the association was the same in GALA II Puerto Rican children with asthma but not statistically significant. This may be due in part to the significantly lower African ancestry in Puerto Ricans compared to African Americans.

We replicated 20 of 45 variants in black UK Biobank subjects and observed conflicting “flip-flop” associations in African Americans from the SAPPHIRE and GCPD-A studies. In the past, flip-flop associations were deemed spurious results. The traditional association testing approach studies the effect of each variant on phenotype independently, which increases the chance of detecting flip-flop associations between studies. Differences in study design, sampling variation that leads to variation in LD patterns, and lack of consideration of other disease-influencing genetic and/or environmental factors are all potential causes of flip-flop associations (Kraft et al. 2009; Lin et al. 2007). Hence, it is not surprising to observe flip-flop associations when gene-by-environment interactions were detected at our FEV1 GWAS locus. It was previously shown that flip-flop associations can occur between and within populations even in the presence of a genuine genetic effect (Kraft et al. 2009; Lin et al. 2007). Further functional analysis is thus required to validate the relationship between the candidate variants, KITLG and FEV1. This may include reporter assays to validate potential enhancer or repressor activity and CRISPR-based editing assays to validate the regulatory role of the candidate variants on KITLG. Although literature exists describing KIT signaling for lung function in mice (Lindsey et al. 2011), additional knockout experiments in a model animal system may be necessary to study how KITLG contributes to variation in lung function.

The average concentration of ambient SO2 exposure in our participants (Table 1) was lower than the National Ambient Air Quality Standards. It is possible that SO2 acted as a surrogate for other unmeasured toxic pollutants emitted from local point sources. Major sources of SO2 in the San Francisco Bay Area during the recruitment years of 2006 to 2011 include airports, petroleum refineries, gas and oil plants, calcined petroleum coke plants, electric power plants, cement manufacturing factories, chemical plants, and landfills (United States Environmental Protection Agency 2008; United States Environmental Protection Agency 2011). The Environmental Protection Agency’s national emissions inventory data also showed that these facilities emit volatile organic compounds, heavy metals (lead, mercury, chromium, arsenic), formaldehyde, ethyl benzene, acrolein, 1,3-butadiene, 1,4-dichlorobenzene, and tetrachloroethylene into the air along with SO2. These chemicals are highly toxic and inhaling even a small amount may contribute to poor lung function. Another possibility is that exposure to SO2 captured unmeasured confounding socioeconomic factors.

This study identified a novel protective allele for lung function in African American children with asthma. The protective association with lung function intensified with increased past year exposure to SO2. Our findings showcase the complexity of the relationship between genetic and environmental factors impacting variation in FEV1, highlight the utility of WGS data for genetic research of complex phenotypes, and underscore the importance of including diverse study populations in our exploration of the genetic architecture underlying lung function.

ACKNOWLEDGEMENTS

The Genes-Environments and Admixture in Latino Americans (GALA II) Study, the Study of African Americans, Asthma, Genes and Environments (SAGE) Study and E.G.B. were supported by the Sandler Family Foundation, the American Asthma Foundation, the RWJF Amos Medical Faculty Development Program, the Harry Wm. and Diana V. Hind Distinguished Professor in Pharmaceutical Sciences II, the National Heart, Lung, and Blood Institute (NHLBI) [R01HL117004, R01HL128439, R01HL135156, X01HL134589]; the National Institute of Environmental Health Sciences [R01ES015794]; the National Institute on Minority Health and Health Disparities (NIMHD) [P60MD006902, R01MD010443], the National Human Genome Research Institute [U01HG009080] and the Tobacco-Related Disease Research Program [24RT-0025]. MJW was supported by the NHLBI [K01HL140218]. JJ and BEH were supported by the NHLBI [R01HL133433, R01HL141992]. KLK was supported by the NHLBI [R01HL135156-S1], the UCSF Bakar Institute, the Gordon and Betty Moore Foundation [GBMF3834], and the Alfred P. Sloan Foundation [2013-10-27] grant to UC Berkeley through the Moore-Sloan Data Science Environment Initiative. ACW was supported by the Eunice Kennedy Shriver National Institute of Child Health and Human Development [1R01HD085993-01].

The SAPPHIRE study was supported by the Fund for Henry Ford Hospital, the American Asthma Foundation, the NHLBI [R01HL118267, R01HL141485, X01HL134589], the National Institute of Allergy and Infectious Diseases [R01AI079139], and the National Institute of Diabetes and Digestive and Kidney Diseases [R01DK113003].

The GCPD-A study was supported by an Institutional award from the Children’s Hospital of Philadelphia and by the NHLBI [X01HL134589].

Part of this research was conducted using the UK Biobank Resource under Application Number 40375. We would like to thank UK Biobank participants and researchers who contributed or collected data.

Whole genome sequencing (WGS) for the Trans-Omics in Precision Medicine (TOPMed) program was supported by the National Heart, Lung and Blood Institute (NHLBI). WGS for “NHLBI TOPMed: Gene-Environment, Admixture and Latino Asthmatics Study” (phs000920) and “NHLBI TOPMed: Study of African Americans, Asthma, Genes and Environments” (phs000921) was performed at the New York Genome Center (3R01HL117004-02S3) and the University of Washington Northwest Genomics Center (HHSN268201600032I). WGS for “NHLBI TOPMed: Study of Asthma Phenotypes & Pharmacogenomic Interactions by Race-Ethnicity” (phs001467) and “Genetics of Complex Pediatric Disorders-Asthma” (phs001661) was performed at the University of Washington Northwest Genomics Center (HHSN268201600032I). Centralized read mapping and genotype calling, along with variant quality metrics and filtering were provided by the TOPMed Informatics Research Center (3R01HL-117626-02S1; contract HHSN268201800002I). Phenotype harmonization, data management, sample-identity QC, and general study coordination were provided by the TOPMed Data Coordinating Center (3R01HL-120393-02S1; contract HHSN268201800001I). We gratefully acknowledge the studies and participants who provided biological samples and data for TOPMed.

WGS of part of GALA II was performed by New York Genome Center under The Centers for Common Disease Genomics of the Genome Sequencing Program (GSP) Grant (UM1 HG008901). The GSP Coordinating Center (U24 HG008956) contributed to cross-program scientific initiatives and provided logistical and general study coordination. GSP is funded by the National Human Genome Research Institute, the National Heart, Lung, and Blood Institute, and the National Eye Institute.

The TOPMed imputation panel was supported by the NHLBI and TOPMed study investigators who contributed data to the reference panel. The panel was constructed and implemented by the TOPMed Informatics Research Center at the University of Michigan (3R01HL-117626-02S1; contract HHSN268201800002I). The TOPMed Data Coordinating Center (3R01HL-120393-02S1; contract HHSN268201800001I) provided additional data management, sample identity checks, and overall program coordination and support. We gratefully acknowledge the studies and participants who provided biological samples and data for TOPMed.

The authors wish to acknowledge the following GALA II and SAGE study collaborators: Shannon Thyne, UCSF; Harold J. Farber, Texas Children’s Hospital; Denise Serebrisky, Jacobi Medical Center; Rajesh Kumar, Lurie Children’s Hospital of Chicago; Emerita Brigino-Buenaventura, Kaiser Permanente; Michael A. LeNoir, Bay Area Pediatrics; Kelley Meade, UCSF Benioff Children’s Hospital, Oakland; William Rodríguez-Cintrón, VA Hospital, Puerto Rico; Pedro C. Ávila, Northwestern University; Jose R. Rodríguez-Santana, Centro de Neumología Pediátrica; Luisa N. Borrell, City University of New York; Adam Davis, UCSF Benioff Children’s Hospital, Oakland; Saunak Sen, University of Tennessee.

The authors acknowledge the families and patients for their participation and thank the numerous health care providers and community clinics for their support and participation in GALA II and SAGE. In particular, the authors thank the recruiters who obtained the data: Duanny Alva, MD; Gaby Ayala-Rodríguez; Lisa Caine, RT; Elizabeth Castellanos; Jaime Colón; Denise DeJesus; Blanca López; Brenda López, MD; Louis Martos; Vivian Medina; Juana Olivo; Mario Peralta; Esther Pomares, MD; Jihan Quraishi; Johanna Rodríguez; Shahdad Saeedi; Dean Soto; and Ana Taveras.

The authors thank María Pino-Yanes for providing feedback on this study and Thomas W Blackwell for providing a critical review of this manuscript.

The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

LITERATURE CITED

  1. 1000 Genomes Project Consortium, A. Auton, L. D. Brooks, R. M. Durbin, E. P. Garrison et al., 2015 A global reference for human genetic variation. Nature 526: 68–74.
  2. Akinbami, L. J., 2015 Asthma Prevalence, Health Care use and Mortality: United States, 2003-05. [Online] Available at: http://www.cdc.gov/nchs/data/hestat/asthma03-05/asthma03-05.htm. [Accessed 2020 Jan 8].
  3. Akinbami, L. J., J. E. Moorman, A. E. Simon and K. C. Schoendorf, 2014 Trends in racial disparities for asthma outcomes among children 0 to 17 years, 2001-2010. J. Allergy Clin. Immunol. 134: 547-553.e5.
  4. Alexander, D. H., J. Novembre and K. Lange, 2009 Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19: 1655–1664.
  5. Al-Muhsen, S. Z., G. Shablovsky, R. Olivenstein, B. Mazer and Q. Hamid, 2004 The expression of stem cell factor and c-kit receptor in human asthmatic airways. Clin. Exp. Allergy 34: 911–916.
  6. Amin, K., 2012 The role of mast cells in allergic inflammation. Respir. Med. 106: 9–14.
  7. Anders, S., P. T. Pyl and W. Huber, 2015 HTSeq--a Python framework to work with high-throughput sequencing data. Bioinformatics 31: 166–169.
  8. Anderson, K. L., H. Perkin, C. D. Surh, S. Venturini, R. A. Maki et al., 2000 Transcription factor PU.1 is necessary for development of thymic and myeloid progenitor-derived dendritic cells. J. Immunol. 164: 1855–1861.
  9. Ay, F., T. L. Bailey and W. S. Noble, 2014 Statistical confidence estimation for Hi-C data reveals regulatory chromatin contacts. Genome Res. 24: 999–1011.
  10. Balzar, S., H. W. Chu, M. Strand and S. Wenzel, 2005 Relationship of small airway chymase-positive mast cells and lung function in severe asthma. Am. J. Respir. Crit. Care Med. 171: 431–439.
  11. Barraza-Villarreal, A., J. Sunyer, L. Hernandez-Cadena, M. C. Escamilla-Nunez, J. J. Sienra-Monge et al., 2008 Air pollution, airway inflammation, and lung function in a cohort study of Mexico City schoolchildren. Environ. Health Perspect. 116: 832–838.
  12. Barrett, J. C., B. Fry, J. Maller and M. J. Daly, 2005 Haploview: Analysis and visualization of LD and haplotype maps. Bioinformatics 21: 263–265.
  13. Borish, L., and B. Z. Joseph, 1992 Inflammation and the allergic response. Med. Clin. North Am. 76: 765–787.
  14. Broudy, V. C., 1997 Stem cell factor and hematopoiesis. Blood 90: 1345–1364.
  15. Brunekreef, B., and S. T. Holgate, 2002 Air pollution and health. Lancet 360: 1233–1242.
  16. Buniello, A., J. A. L. MacArthur, M. Cerezo, L. W. Harris, J. Hayhurst et al., 2019 The NHGRI-EBI GWAS catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47: D1005–D1012.
  17. Bycroft, C., C. Freeman, D. Petkova, G. Band, L. T. Elliott et al., 2018 The UK Biobank resource with deep phenotyping and genomic data. Nature 562: 203–209.
  18. Canela-Xandri, O., K. Rawlik and A. Tenesa, 2018 An atlas of genetic associations in UK Biobank. Nat. Genet. 50: 1593–1599.
  19. Carlson, C. S., T. C. Matise, K. E. North, C. A. Haiman, M. D. Fesinmeyer et al., 2013 Generalization and dilution of association results from European GWAS in populations of non-European ancestry: The PAGE study. PLoS Biol. 11: e1001661.
  20. Chang, C. C., C. C. Chow, L. C. Tellier, S. Vattikuti, S. M. Purcell et al., 2015 Second-generation PLINK: Rising to the challenge of larger and richer datasets. Gigascience 4: 7–8. eCollection 2015.
  21. Chatterjee, S., and N. Das, 1995 Lung function in Indian twin children: Comparison of genetic versus environmental influence. Ann. Hum. Biol. 22: 289–303.
  22. Chen, Y., S. L. Horne, D. C. Rennie and J. A. Dosman, 1996 Segregation analysis of two lung function indices in a random sample of young families: The Humboldt family study. Genet. Epidemiol. 13: 35–47.
  23. Ciesielski, T. H., S. A. Pendergrass, M. J. White, N. Kodaman, R. S. Sobota et al., 2014 Diverse convergent evidence in the genetic analysis of complex disease: Coordinating omic, informatic, and experimental evidence to better identify and validate risk factors. BioData Min. 7: 10–10. eCollection 2014.
  24. Comberiati, P., M. E. Di Cicco, S. D’Elios and D. G. Peroni, 2017 How much asthma is atopic in children? Front. Pediatr. 5: 122.
  25. Conomos, M. P., M. B. Miller and T. A. Thornton, 2015 Robust inference of population structure for ancestry prediction and correction of stratification in the presence of relatedness. Genet. Epidemiol. 39: 276–293.
  26. Conomos, M. P., A. P. Reiner, B. S. Weir and T. A. Thornton, 2016 Model-free estimation of recent genetic relatedness. Am. J. Hum. Genet. 98: 127–148.
  27. Cruse, G., and P. Bradding, 2016 Mast cells in airway diseases and interstitial lung disease. Eur. J. Pharmacol. 778: 125–138.
  28. Da Silva, C. A., and N. Frossard, 2005 Regulation of stem cell factor expression in inflammation and asthma. Mem. Inst. Oswaldo Cruz 100 Suppl 1: 145–151.
  29. Da Silva, C. A., L. Reber and N. Frossard, 2006 Stem cell factor expression, mast cells and inflammation in asthma. Fundam. Clin. Pharmacol. 20: 21–39.
  30. Dahlin, J. S., A. Malinovschi, H. Ohrvik, M. Sandelin, C. Janson et al., 2016 Lin-CD34hi CD117int/hi FcepsilonRI+ cells in human blood constitute a rare population of mast cell progenitors. Blood 127: 383–391.
  31. Das, S., L. Forer, S. Schonherr, C. Sidore, A. E. Locke et al., 2016 Next-generation genotype imputation service and methods. Nat. Genet. 48: 1284–1287.
  32. Duggal, P., E. M. Gillanders, T. N. Holmes and J. E. Bailey-Wilson, 2008 Establishing an adjusted p-value threshold to control the family-wide type 1 error in genome wide association studies. BMC Genomics 9: 516–516.
  33. ENCODE Project Consortium, 2012 An integrated encyclopedia of DNA elements in the human genome. Nature 489: 57–74.
  34. ENCODE Project Consortium, 2011 A user’s guide to the encyclopedia of DNA elements (ENCODE). PLoS Biol. 9: e1001046.
  35. Fajt, M. L., and S. E. Wenzel, 2013 Mast cells, their subtypes, and relation to asthma phenotypes. Ann. Am. Thorac. Soc. 10 Suppl: 158.
  36. Fonseca, W., A. J. Rasky, C. Ptaschinski, S. H. Morris, S. K. K. Best et al., 2019 Group 2 innate lymphoid cells (ILC2) are regulated by stem cell factor during chronic asthmatic disease. Mucosal Immunol. 12: 445–456.
  37. Galli, S. J., K. M. Zsebo and E. N. Geissler, 1994 The kit ligand, stem cell factor. Adv. Immunol. 55: 1–96.
  38. Galli, S. J., M. Tsai, B. K. Wershil, S. Y. Tam and J. J. Costa, 1995 Regulation of mouse and human mast cell development, survival and function by stem cell factor, the ligand for the c-kit receptor. Int. Arch. Allergy Immunol. 107: 51–53.
  39. GTEx Consortium, Laboratory, Data Analysis & Coordinating Center (LDACC)-Analysis Working Group, Statistical Methods groups-Analysis Working Group, Enhancing GTEx (eGTEx) groups, NIH Common Fund et al., 2017 Genetic effects on gene expression across human tissues. Nature 550: 204–213.
  40. Guerriero, A., P. B. Langmuir, L. M. Spain and E. W. Scott, 2000 PU.1 is required for myeloid-derived but not lymphoid-derived dendritic cells. Blood 95: 879–885.
  41. Hankinson, J. L., J. R. Odencrantz and K. B. Fedan, 1999 Spirometric reference values from a sample of the general U.S. population. Am. J. Respir. Crit. Care Med. 159: 179–187.
  42. Hinrichs, A. S., D. Karolchik, R. Baertsch, G. P. Barber, G. Bejerano et al., 2006 The UCSC genome browser database: Update 2006. Nucleic Acids Res. 34: 590.
  43. Hsieh, F. H., P. Sharma, A. Gibbons, T. Goggans, S. C. Erzurum et al., 2005 Human airway epithelial cell determinants of survival and functional phenotype for primary human mast cells. Proc. Natl. Acad. Sci. U. S. A. 102: 14380–14385.
  44. Hukkinen, M., J. Kaprio, U. Broms, A. Viljanen, D. Kotz et al., 2011 Heritability of lung function: A twin study among never-smoking elderly women. Twin Res. Hum. Genet. 14: 401–407.
  45. Iemura, A., M. Tsai, A. Ando, B. K. Wershil and S. J. Galli, 1994 The c-kit ligand, stem cell factor, promotes mast cell survival by suppressing apoptosis. Am. J. Pathol. 144: 321–328.
  46. Ierodiakonou, D., A. Zanobetti, B. A. Coull, S. Melly, D. S. Postma et al., 2016 Ambient air pollution, lung function, and airway responsiveness in asthmatic children. J. Allergy Clin. Immunol. 137: 390–399.
  47. Ionita-Laza, I., S. Lee, V. Makarov, J. D. Buxbaum and X. Lin, 2013 Sequence kernel association tests for the combined effect of rare and common variants. Am. J. Hum. Genet. 92: 841–853.
  48. Ito, T., C. Nishiyama, M. Nishiyama, H. Matsuda, K. Maeda et al., 2005 Mast cells acquire monocyte-specific gene expression and monocyte-like morphology by overproduction of PU.1. J. Immunol. 174: 376–383.
  49. Ito, T., C. Nishiyama, N. Nakano, M. Nishiyama, Y. Usui et al., 2009 Roles of PU.1 in monocyte- and mast cell-specific gene regulation: PU.1 transactivates CIITA pIV in cooperation with IFN-gamma. Int. Immunol. 21: 803–816.
  50. Jiang, H., R. Lei, S. W. Ding and S. Zhu, 2014 Skewer: A fast and accurate adapter trimmer for next-generation sequencing paired-end reads. BMC Bioinformatics 15: 182–182.
  51. Johnson, J. D., and W. M. Theurer, 2014 A stepwise approach to the interpretation of pulmonary function tests. Am. Fam. Physician 89: 359–366.
  52. Karolchik, D., A. S. Hinrichs, T. S. Furey, K. M. Roskin, C. W. Sugnet et al., 2004 The UCSC table browser data retrieval tool. Nucleic Acids Res. 32: 493.
  53. Kassel, O., F. Schmidlin, C. Duvernelle, B. Gasser, G. Massard et al., 1999 Human bronchial smooth muscle cells in culture produce stem cell factor. Eur. Respir. J. 13: 951–954.
  54. Kim, D., B. Langmead and S. L. Salzberg, 2015 HISAT: A fast spliced aligner with low memory requirements. Nat. Methods 12: 357–360.
  55. Kim, M. S., K. P. Patel, A. K. Teng, A. J. Berens and J. Lachance, 2018 Genetic disease risks can be misestimated across global populations. Genome Biol. 19: 179–7.
  56. Kobayashi, H., Y. Okayama, T. Ishizuka, R. Pawankar, C. Ra et al., 1998 Production of IL-13 by human lung mast cells in response to fcepsilon receptor cross-linkage. Clin. Exp. Allergy 28: 1219–1227.
  57. Kraft, P., E. Zeggini and J. P. Ioannidis, 2009 Replication in genome-wide association studies. Stat. Sci. 24: 561–573.
  58. Kumar, R., M. A. Seibold, M. C. Aldrich, L. K. Williams, A. P. Reiner et al., 2010 Genetic ancestry in lung-function predictions. N. Engl. J. Med. 363: 321–330.
  59. Lee, S., M. J. Emond, M. J. Bamshad, K. C. Barnes, M. J. Rieder et al., 2012 Optimal unified approach for rare-variant association testing with application to small-sample case-control whole-exome sequencing studies. Am. J. Hum. Genet. 91: 224–237.
  60. Levin, A. M., Y. Wang, K. E. Wells, B. Padhukasahasram, J. J. Yang et al., 2014 Nocturnal asthma and the importance of race/ethnicity and genetic ancestry. Am. J. Respir. Crit. Care Med. 190: 266–273.
  61. Li, X., G. A. Hawkins, E. J. Ampleford, W. C. Moore, H. Li et al., 2013 Genome-wide association study identifies TH1 pathway genes associated with lung function in asthmatic patients. J. Allergy Clin. Immunol. 132: 313-20.e15.
  62. Liao, S. Y., X. Lin and D. C. Christiani, 2014 Genome-wide association and network analysis of lung function in the Framingham heart study. Genet. Epidemiol. 38: 572–578.
  63. Lin, P. I., J. M. Vance, M. A. Pericak-Vance and E. R. Martin, 2007 No gene is an island: The flip-flop phenomenon. Am. J. Hum. Genet. 80: 531–538.
  64. Lindsey, J. Y., K. Ganguly, D. M. Brass, Z. Li, E. N. Potts et al., 2011 C-kit is essential for alveolar maintenance and protection from emphysema-like disease in mice. Am. J. Respir. Crit. Care Med. 183: 1644–1652.
  65. Liu, X., S. White, B. Peng, A. D. Johnson, J. A. Brody et al., 2016 WGSA: An annotation pipeline for human genome sequencing studies. J. Med. Genet. 53: 111–112.
  66. Mak, A. C. Y., M. J. White, W. L. Eckalbar, Z. A. Szpiech, S. S. Oh et al., 2018 Whole-genome sequencing of pharmacogenetic drug response in racially diverse children with asthma. Am. J. Respir. Crit. Care Med. 197: 1552–1564.
  67. Mak, A. C., M. J. White, C. Eng, D. Hu, S. Huntsman et al., 2016 Whole Genome Sequencing to Identify Genetic Variation Associated with Bronchodilator Response in Minority Children with Asthma.
  68. Martin, A. R., C. R. Gignoux, R. K. Walters, G. L. Wojcik, B. M. Neale et al., 2017 Human demographic history impacts genetic risk prediction across diverse populations. Am. J. Hum. Genet. 100: 635–649.
  69. Martin, J. S., Z. Xu, A. P. Reiner, K. L. Mohlke, P. Sullivan et al., 2017 HUGIn: Hi-C unifying genomic interrogator. Bioinformatics 33: 3793–3795.
  70. McKercher, S. R., B. E. Torbett, K. L. Anderson, G. W. Henkel, D. J. Vestal et al., 1996 Targeted disruption of the PU.1 gene results in multiple hematopoietic abnormalities. EMBO J. 15: 5647–5658.
  71. Mekori, Y. A., C. K. Oh and D. D. Metcalfe, 1993 IL-3-dependent murine mast cells undergo apoptosis on removal of IL-3. Prevention of apoptosis by c-kit ligand. J. Immunol. 151: 3775–3784.
  72. Mendez-Enriquez, E., and J. Hallgren, 2019 Mast cells and their progenitors in allergic asthma. Front. Immunol. 10: 821.
  73. Moore, J. H., 2005 A global view of epistasis. Nat. Genet. 37: 13–14.
  74. Moore, J. H., and S. M. Williams, 2009 Epistasis and its implications for personal genetics. Am. J. Hum. Genet. 85: 309–320.
  75. Mott, L., 1995 The disproportionate impact of environmental health threats on children of color. Environ. Health Perspect. 103 Suppl 6: 33–35.
  76. Neophytou, A. M., M. J. White, S. S. Oh, N. Thakur, J. M. Galanter et al., 2016 Air pollution and lung function in minority youth with asthma in the GALA II (Genes-Environments and Admixture in Latino Americans) and SAGE II (Study of African Americans, Asthma, Genes, and Environments) studies. Am. J. Respir. Crit. Care Med. 193: 1271–1280.
  77. Nishimura, K. K., J. M. Galanter, L. A. Roth, S. S. Oh, N. Thakur et al., 2013 Early-life air pollution and asthma risk in minority children. The GALA II and SAGE II studies. Am. J. Respir. Crit. Care Med. 188: 309–318.
  78. Nishiyama, C., M. Nishiyama, T. Ito, S. Masaki, N. Masuoka et al., 2004 Functional analysis of PU.1 domains in monocyte-specific gene regulation. FEBS Lett. 561: 63–68.
  79. Nishiyama, C., M. Nishiyama, T. Ito, S. Masaki, K. Maeda et al., 2004 Overproduction of PU.1 in mast cell progenitors: Its effect on monocyte- and mast cell-specific gene expression. Biochem. Biophys. Res. Commun. 313: 516–521.
  80. Oh, S. S., M. J. White, C. R. Gignoux and E. G. Burchard, 2016 Making precision medicine socially precise. Take a deep breath. Am. J. Respir. Crit. Care Med. 193: 348–350.
  81. Oh, S. S., H. Tcheurekdjian, L. A. Roth, E. A. Nguyen, S. Sen et al., 2012 Effect of secondhand smoke on asthma control among black and Latino children. J. Allergy Clin. Immunol. 129: 1478-83.e7.
  82. Oliveira, S. H., and N. W. Lukacs, 2003 Stem cell factor: A hemopoietic cytokine with important targets in asthma. Curr. Drug Targets Inflamm. Allergy 2: 313–318.
  83. Ong, B. A., J. Li, J. M. McDonough, Z. Wei, C. Kim et al., 2013 Gene network analysis in a pediatric cohort identifies novel lung function genes. PLoS One 8: e72899.
  84. Oriss, T. B., N. Krishnamoorthy, P. Ray and A. Ray, 2014 Dendritic cell c-kit signaling and adaptive immunity: Implications for the upper airways. Curr. Opin. Allergy Clin. Immunol. 14: 7–12.
  85. Palmer, L. J., M. W. Knuiman, M. L. Divitini, P. R. Burton, A. L. James et al., 2001 Familial aggregation and heritability of adult lung function: Results from the Busselton health study. Eur. Respir. J. 17: 696–702.
  86. Pe’er, I., R. Yelensky, D. Altshuler and M. J. Daly, 2008 Estimation of the multiple testing burden for genomewide association studies of nearly all common variants. Genet. Epidemiol. 32: 381–385.
  87. Pino-Yanes, M., N. Thakur, C. R. Gignoux, J. M. Galanter, L. A. Roth et al., 2015 Genetic ancestry influences asthma susceptibility and lung function among Latinos. J. Allergy Clin. Immunol. 135: 228–235.
  88. Plummer, M., N. Best, K. Cowles and K. Vines, 2006 CODA: Convergence diagnosis and output analysis for MCMC. R News 6: 7–11.
  89. Poole, A., C. Urbanek, C. Eng, J. Schageman, S. Jacobson et al., 2014 Dissecting childhood asthma with nasal transcriptomics distinguishes subphenotypes of disease. J. Allergy Clin. Immunol. 133: 670-8.e12.
  90. Pruim, R. J., R. P. Welch, S. Sanna, T. M. Teslovich, P. S. Chines et al., 2010 LocusZoom: Regional visualization of genome-wide association scan results. Bioinformatics 26: 2336–2337.
  91. Purcell, S., and C. Chang, 2013 Plink 1.9. [Online] Available at: www.cog-genomics.org/plink/1.9/. [Accessed 2019 Mar].
  92. Qian, F., J. Deng, Y. G. Lee, J. Zhu, M. Karpurapu et al., 2015 The transcription factor PU.1 promotes alternative macrophage polarization and asthmatic airway inflammation. J. Mol. Cell. Biol. 7: 557–567.
  93. Repapi, E., I. Sayers, L. V. Wain, P. R. Burton, T. Johnson et al., 2010 Genome-wide association study identifies five loci associated with lung function. Nat. Genet. 42: 36–44.
  94. Rochman, M., A. V. Kartashov, J. M. Caldwell, M. H. Collins, E. M. Stucke et al., 2015 Neurotrophic tyrosine kinase receptor 1 is a direct transcriptional and epigenetic target of IL-13 involved in allergic inflammation. Mucosal Immunol. 8: 785–798.
  95. Schmitt, A. D., M. Hu, I. Jung, Z. Xu, Y. Qiu et al., 2016 A compendium of chromatin contact maps reveals spatially active regions in the human genome. Cell. Rep. 17: 2042–2059.
  96. Scott, E. W., M. C. Simon, J. Anastasi and H. Singh, 1994 Requirement of transcription factor PU.1 in the development of multiple hematopoietic lineages. Science 265: 1573–1577.
  97. Scott, E. W., R. C. Fisher, M. C. Olson, E. W. Kehrli, M. C. Simon et al., 1997 PU.1 functions in a cell-autonomous manner to control the differentiation of multipotential lymphoid-myeloid progenitors. Immunity 6: 437–447.
  98. Sillanpaa, E., S. Sipila, T. Tormakangas, J. Kaprio and T. Rantanen, 2017 Genetic and environmental effects on telomere length and lung function: A twin study. J. Gerontol. A Biol. Sci. Med. Sci. 72: 1561–1568.
  99. Sofer, T., X. Zheng, S. M. Gogarten, C. A. Laurie, K. Grinde et al., 2019 A fully adjusted two-stage procedure for rank-normalization in genetic association studies. Genet. Epidemiol. 43: 263–275.
  100. Soler Artigas, M., L. V. Wain, S. Miller, A. K. Kheirallah, J. E. Huffman et al., 2015 Sixteen new lung function signals identified through 1000 Genomes Project reference panel imputation. Nat. Commun. 6: 8658.
  101. Soler Artigas, M., D. W. Loth, L. V. Wain, S. A. Gharib, M. Obeidat et al., 2011 Genome-wide association and large-scale follow up identifies 16 new loci influencing lung function. Nat. Genet. 43: 1082–1090.
  102. Summer Institute in Statistical Genetics, 2019 PC-Relate. [Online] Available at: https://uw-gac.github.io/SISG_2019/pc-relate.html. [Accessed 2019 Jul 25].
  103. Taliun, D., D. N. Harris, M. D. Kessler, J. Carlson, Z. A. Szpiech et al., 2019 Sequencing of 53,831 diverse genomes from the NHLBI TOPMed program. bioRxiv 563866.
  104. Tayel, S. I., S. M. El-Hefnway, E. M. Abd El Gayed and G. A. Abdelaal, 2017 Association of stem cell factor gene expression with severity and atopic state in patients with bronchial asthma. Respir. Res. 18: 21–2.
  105. Tian, X., C. Xu, Y. Wu, J. Sun, H. Duan et al., 2017 Genetic and environmental influences on pulmonary function and muscle strength: The Chinese twin study of aging. Twin Res. Hum. Genet. 20: 53–59.
  106. TOPMed, 2019 TOPMed Whole Genome Sequencing Methods: Freeze 8. [Online] Available at: https://www.nhlbiwgs.org/topmed-whole-genome-sequencing-methods-freeze-8. [Accessed 2019 Dec 13].
  107. United States Environmental Protection Agency, 2011 National Emissions Inventory (NEI) 2011 Data. [Online] Available at: https://www.epa.gov/air-emissions-inventories/2011-national-emissions-inventory-nei-data. [Accessed 2020 Jan 8].
  108. United States Environmental Protection Agency, 2008 National Emissions Inventory (NEI) 2008 Data. [Online] Available at: https://www.epa.gov/air-emissions-inventories/2008-national-emissions-inventory-nei-data. [Accessed 2020 Jan 8].
  109. University of Michigan, and NHLBI TOPMed, 2018 BRAVO Variant Browser. [Online] Available at: https://bravo.sph.umich.edu/freeze5/hg38/. [Accessed 2019 Aug].
  110. Usemann, J., F. Decrue, I. Korten, E. Proietti, O. Gorlanova et al., 2019 Exposure to moderate air pollution and associations with lung function at school-age: A birth cohort study. Environ. Int. 126: 682–689.
  111. Valent, P., E. Spanblochl, W. R. Sperr, C. Sillaber, K. M. Zsebo et al., 1992 Induction of differentiation of human mast cells from bone marrow and peripheral blood mononuclear cells by recombinant human stem cell factor/kit-ligand in long-term culture. Blood 80: 2237–2245.
  112. Wain, L. V., N. Shrine, M. S. Artigas, A. M. Erzurumluoglu, B. Noyvert et al., 2017 Genome-wide association analyses for lung function and chronic obstructive pulmonary disease identify new loci and potential druggable targets. Nat. Genet. 49: 416–425.
  113. Walsh, J. C., R. P. DeKoter, H. J. Lee, E. D. Smith, D. W. Lancki et al., 2002 Cooperative and antagonistic interplay between PU.1 and GATA-2 in the specification of myeloid cell fates. Immunity 17: 665–676.
  114. Wan, H., S. Dingle, Y. Xu, V. Besnard, K. H. Kaestner et al., 2005 Compensatory roles of Foxa1 and Foxa2 during lung morphogenesis. J. Biol. Chem. 280: 13809–13816.
  115. Wen, L. P., J. A. Fahrni, S. Matsui and G. D. Rosen, 1996 Airway epithelial cells produce stem cell factor. Biochim. Biophys. Acta 1314: 183–186.
  116. White, M. J., O. Risse-Adams, P. Goddard, M. G. Contreras, J. Adams et al., 2016 Novel genetic risk factors for asthma in African American children: Precision medicine and the SAGE II study. Immunogenetics 68: 391–400.
  117. Wise, J., 2019 Air pollution is linked to infant deaths and reduced lung function in children. BMJ 366: 5772.
  118. Wojcik, G. L., M. Graff, K. K. Nishimura, R. Tao, J. Haessler et al., 2019 Genetic analyses of diverse populations improves discovery for complex traits. Nature 570: 514–518.
  119. World Health Organization, 2017 Asthma. [Online] Available at: http://www.who.int/mediacentre/factsheets/fs307/en/. [Accessed 2020 Jan 8].
  120. Yamada, H., Y. Yatagai, H. Masuko, T. Sakamoto, H. Iijima et al., 2015 Heritability of pulmonary function estimated from genome-wide SNPs in healthy Japanese adults. Respir. Investig. 53: 60–67.
  121. Yashiro, T., S. Nakano, K. Nomura, Y. Uchida, K. Kasakura et al., 2019 A transcription factor PU.1 is critical for Ccl22 gene expression in dendritic cells and macrophages. Sci. Rep. 9: 1161–9.
  122. Zhang, F., and J. R. Lupski, 2015 Non-coding genetic variants in human disease. Hum. Mol. Genet. 24: 102.
Posted April 21, 2020.