medRxiv

A genome-wide association study of polycystic ovary syndrome identified from electronic health records

Yanfei Zhang, Kevin Ho, Jacob M. Keaton, Dustin N. Hartzel, Felix Day, Anne E. Justice, Navya S. Josyula, Sarah A. Pendergrass, Ky’Era Actkins, Lea K. Davis, Digna R. Velez Edwards, Brody Holohan, Andrea Ramirez, Ian B. Stanaway, David R. Crosslin, Gail P. Jarvik, Patrick Sleiman, Hakon Hakonarson, Marc S. Williams, Ming Ta Michael Lee
doi: https://doi.org/10.1101/2019.12.12.19014761
Yanfei Zhang
1Genomic Medicine Institute, Geisinger, Danville, PA, USA
MD, PhD
  • For correspondence: yzhang1{at}geisinger.edu mlee2{at}geisinger.edu
Kevin Ho
2Kidney Research Institute, Geisinger, Danville, PA, USA
MD
Jacob M. Keaton
3Division of Epidemiology, Department of Medicine; Institute for Medicine and Public Health; Vanderbilt University Medical Center, Nashville, TN, USA
4Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, USA
PhD
Dustin N. Hartzel
5Phenomic Analytics and Clinical Data Core, Geisinger, PA, USA
BS
Felix Day
6The International PCOS Consortium
PhD
Anne E. Justice
7Department of Population Health Sciences, Geisinger, Danville, PA, USA
PhD
Navya S. Josyula
7Department of Population Health Sciences, Geisinger, Danville, PA, USA
MS
Sarah A. Pendergrass
7Department of Population Health Sciences, Geisinger, Danville, PA, USA
PhD
Ky’Era Actkins
4Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, USA
8Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
9Department of Microbiology, Immunology, and Physiology, Meharry Medical College, Nashville, TN, USA
BS
Lea K. Davis
4Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, USA
8Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
10Department of Psychiatry and Behavioral Sciences; Vanderbilt University Medical Center, Nashville, TN, USA
11Department of Biomedical Informatics, Data Sciences Institute, Vanderbilt University Medical Center, Nashville, TN, USA
PhD
Digna R. Velez Edwards
4Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, USA
11Department of Biomedical Informatics, Data Sciences Institute, Vanderbilt University Medical Center, Nashville, TN, USA
12Division of Quantitative Science, Department of Obstetrics and Gynecology, Vanderbilt University Medical Center, Nashville, TN, USA
PhD
Brody Holohan
13Marshfield Clinic Research Institute, Marshfield, WI, USA
PhD
Andrea Ramirez
14Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
MD, MS
Ian B. Stanaway
PhD
David R. Crosslin
PhD
Gail P. Jarvik
16Departments of Medicine (Medical Genetics) and Genome Sciences, School of Medicine, University of Washington, Seattle, WA, USA
MD, PhD
Patrick Sleiman
17Children’s Hospital of Philadelphia, Philadelphia, PA, USA
PhD
Hakon Hakonarson
17Children’s Hospital of Philadelphia, Philadelphia, PA, USA
MD, PhD
Marc S. Williams
1Genomic Medicine Institute, Geisinger, Danville, PA, USA
MD
Ming Ta Michael Lee
1Genomic Medicine Institute, Geisinger, Danville, PA, USA
PhD
  • For correspondence: yzhang1{at}geisinger.edu mlee2{at}geisinger.edu

Abstract

Background Polycystic ovary syndrome (PCOS) is the most common endocrine disorder affecting women of reproductive age. Previous studies have identified genetic variants associated with PCOS as defined by different diagnostic criteria. The Rotterdam criteria are the broadest and identify the most PCOS cases.

Objectives To identify novel associated genetic variants, we extracted PCOS cases and controls from electronic health records (EHR) based on the Rotterdam criteria and performed a genome-wide association study (GWAS).

Study Design We developed a PCOS phenotyping algorithm based on the Rotterdam criteria and applied it to three EHR-linked biobanks to identify cases and controls for genetic study. In the discovery phase, we performed individual GWAS in Geisinger's MyCode cohort and the eMERGE cohort, and then meta-analyzed the results. We attempted validation of the significantly associated loci (P<1×10⁻⁶) in the BioVU cohort. All association analyses used logistic regression, assuming an additive genetic model, and were adjusted for principal components to control for population stratification. An inverse-variance fixed-effect model was adopted for the meta-analyses. Additionally, we tested the top variants for association with each criterion in the phenotyping algorithm. We used STRING to identify protein-protein interaction networks.
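The inverse-variance fixed-effect model named in the Study Design can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `fixed_effect_meta` and the example inputs are hypothetical, and it assumes each cohort contributes a log odds ratio and its standard error for a single variant.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance fixed-effect meta-analysis for one variant.

    betas: per-cohort log odds ratios (hypothetical inputs)
    ses:   per-cohort standard errors of those log odds ratios
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se ** 2) for se in ses]
    total_weight = sum(weights)

    # Pooled effect is the weighted mean of the cohort effects.
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    # Pooled variance is the reciprocal of the summed weights.
    pooled_se = math.sqrt(1.0 / total_weight)

    z = pooled_beta / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


if __name__ == "__main__":
    # Made-up values for two discovery cohorts, for illustration only.
    beta, se, z, p = fixed_effect_meta([0.30, 0.35], [0.07, 0.10])
    print(f"pooled OR = {math.exp(beta):.3f}, SE = {se:.4f}, P = {p:.2e}")
```

Because the weights are inverse variances, the larger cohort (smaller standard error) dominates the pooled estimate, and the pooled standard error is always smaller than that of any single cohort.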

Results We identified 2,995 PCOS cases and 53,599 controls in total (2,742 cases and 51,438 controls in the discovery phase; 253 cases and 2,161 controls in the validation phase). GWAS identified one novel genome-wide significant variant, rs17186366 (OR=1.37 [1.23,1.54], P=2.8×10⁻⁸), located near SOD2. Two loci with suggestive association were also identified: rs113168128 (OR=1.72 [1.42,2.10], P=5.2×10⁻⁸), an intronic variant of ERBB4 that is independent of the previously published variants, and rs144248326 (OR=2.13 [1.52,2.86], P=8.45×10⁻⁷), a novel intronic variant in WWTR1. In further association tests of the top three SNPs against each criterion in the PCOS algorithm, rs17186366 was associated with polycystic ovaries and hyperandrogenism, while rs113168128 and rs144248326 were mainly associated with oligomenorrhea or infertility. Besides ERBB4, we also validated the association with DENND1A.
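As a reading aid for the odds ratios and intervals above, the sketch below shows how a reported OR with its 95% confidence interval maps back to a log odds ratio, a standard error, and a Wald test. The function name `or_ci_to_stats` is hypothetical, and the calculation assumes the interval is a symmetric Wald interval on the log scale. Because the published values are rounded, the recovered P-value should land near the reported one but is not expected to match it exactly.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def or_ci_to_stats(odds_ratio, ci_low, ci_high):
    """Recover (beta, se, z, two_sided_p) from an OR and its 95% CI.

    Assumes a Wald interval that is symmetric on the log-odds scale.
    """
    beta = math.log(odds_ratio)
    # The interval spans 2 * Z95 standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z95)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


if __name__ == "__main__":
    # Values as reported for rs17186366 in the Results paragraph.
    beta, se, z, p = or_ci_to_stats(1.37, 1.23, 1.54)
    print(f"beta = {beta:.4f}, SE = {se:.4f}, z = {z:.2f}, P = {p:.1e}")
```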

Conclusion Through a discovery-validation GWAS of PCOS cases and controls identified from EHR using an algorithm based on the Rotterdam criteria, we identified and validated a novel association with variants within ERBB4. We also identified novel associations near SOD2 and WWTR1. Together with the previously identified PCOS-associated locus YAP1, the ERBB4-YAP1-WWTR1 network implicates the epidermal growth factor receptor (EGFR) and Hippo pathways in the multifactorial etiology of PCOS.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

MyCode® was funded by Geisinger and Regeneron Genomics Center; eMERGE III was funded by NIH U01HG8679 (Geisinger Clinic). The funding sources were not involved in the interpretation of the results or in the choice of journal for submission.

Author Declarations

All relevant ethical guidelines have been followed; any necessary IRB and/or ethics committee approvals have been obtained and details of the IRB/oversight body are included in the manuscript.

Yes

All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Footnotes

  • # Both Kevin Ho (currently employed by Sanofi Genzyme) and Sarah A. Pendergrass (currently employed by Genentech) worked on this study while employed by Geisinger.

Data Availability

Summary data are available through collaboration with Geisinger.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted December 15, 2019.
medRxiv 2019.12.12.19014761; doi: https://doi.org/10.1101/2019.12.12.19014761
Subject Area

  • Genetic and Genomic Medicine