medRxiv

Machine learning enables new insights into clinical significance of and genetic contributions to liver fat accumulation

Mary E. Haas, James P. Pirruccello, Samuel N. Friedman, Connor A. Emdin, Veeral H. Ajmera, Tracey G. Simon, Julian R. Homburger, Xiuqing Guo, Matthew Budoff, Kathleen E. Corey, Alicia Y. Zhou, Anthony Philippakis, Patrick T. Ellinor, Rohit Loomba, Puneet Batra, Amit V. Khera
doi: https://doi.org/10.1101/2020.09.03.20187195
Mary E. Haas
1 Program in Medical and Population Genetics, Broad Institute, Cambridge, MA, 02142
2 Department of Molecular Biology, Department of Medicine, Massachusetts General Hospital, Boston, MA 02114
James P. Pirruccello
1 Program in Medical and Population Genetics, Broad Institute, Cambridge, MA, 02142
3 Center for Genomic Medicine, Department of Medicine, Massachusetts General Hospital, Boston, MA 02114
4 Department of Medicine, Harvard Medical School, Boston, MA 02114
5 Cardiology Division, Department of Medicine, Massachusetts General Hospital, Boston, MA 02114
Samuel N. Friedman
6 Data Sciences Platform, Broad Institute, Cambridge, MA, 02142
Connor A. Emdin
1 Program in Medical and Population Genetics, Broad Institute, Cambridge, MA, 02142
4 Department of Medicine, Harvard Medical School, Boston, MA 02114
Veeral H. Ajmera
7 NAFLD Research Center, Department of Medicine, University of California San Diego, La Jolla, California 92103
Tracey G. Simon
8 Liver Center, Division of Gastroenterology, Department of Medicine, Massachusetts General Hospital, Boston 02114
Julian R. Homburger
9 Color Genomics, Burlingame, CA 94010
Xiuqing Guo
10 The Lundquist Institute for Biomedical Innovation and Department of Pediatrics, Harbor-UCLA Medical Center, Torrance, California 90502
Matthew Budoff
10 The Lundquist Institute for Biomedical Innovation and Department of Pediatrics, Harbor-UCLA Medical Center, Torrance, California 90502
Kathleen E. Corey
8 Liver Center, Division of Gastroenterology, Department of Medicine, Massachusetts General Hospital, Boston 02114
Alicia Y. Zhou
9 Color Genomics, Burlingame, CA 94010
Anthony Philippakis
6 Data Sciences Platform, Broad Institute, Cambridge, MA, 02142
Patrick T. Ellinor
1 Program in Medical and Population Genetics, Broad Institute, Cambridge, MA, 02142
4 Department of Medicine, Harvard Medical School, Boston, MA 02114
5 Cardiology Division, Department of Medicine, Massachusetts General Hospital, Boston, MA 02114
Rohit Loomba
7 NAFLD Research Center, Department of Medicine, University of California San Diego, La Jolla, California 92103
Puneet Batra
6 Data Sciences Platform, Broad Institute, Cambridge, MA, 02142
Amit V. Khera
1 Program in Medical and Population Genetics, Broad Institute, Cambridge, MA, 02142
3 Center for Genomic Medicine, Department of Medicine, Massachusetts General Hospital, Boston, MA 02114
4 Department of Medicine, Harvard Medical School, Boston, MA 02114
For correspondence: avkhera{at}mgh.harvard.edu

Abstract

Excess accumulation of liver fat – termed hepatic steatosis when fat accounts for > 5.5% of liver content – is a leading risk factor for end-stage liver disease and is strongly associated with important cardiometabolic disorders. Using a truth dataset of 4,511 UK Biobank participants with liver fat previously quantified via abdominal MRI imaging, we developed a machine learning algorithm to quantify liver fat with correlation coefficients of 0.97 and 0.99 in hold-out testing datasets and applied this algorithm to raw imaging data from an additional 32,192 participants. Among all 36,703 individuals with abdominal MRI imaging, median liver fat was 2.2%, with 6,250 (17%) meeting criteria for hepatic steatosis. Although individuals afflicted with hepatic steatosis were more likely to have been diagnosed with conditions such as obesity or diabetes, a prediction model based on clinical data alone without imaging could not reliably estimate liver fat content. To identify genetic drivers of variation in liver fat, we first conducted a common variant association study of 9.8 million variants, confirming three known associations for variants in the TM6SF2, APOE, and PNPLA3 genes and identifying five new variants associated with increased hepatic fat in or near the MARC1, ADH1B, TRIB1, GPAM and MAST3 genes. A polygenic score that integrated information from each of these eight variants was strongly associated with future clinical diagnosis of liver diseases. Next, we performed a rare variant association study in a subset of 11,021 participants with gene sequencing data available, identifying an association between inactivating variants in the APOB gene and substantially lower LDL cholesterol, but more than 10-fold increased risk of steatosis. Taken together, these results provide proof of principle for the use of machine learning algorithms on raw imaging data to enable epidemiologic studies and genetic discovery.
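
As a rough illustration of two quantities the abstract relies on, the sketch below applies the stated 5.5% liver fat cutoff for hepatic steatosis and computes a Pearson correlation between reference and model-estimated liver fat. It is not the authors' pipeline: the function names and the toy values are hypothetical, and only the 5.5% threshold and the reported counts (6,250 of 36,703) come from the abstract.

```python
from math import sqrt

# Threshold taken from the abstract: steatosis when liver fat exceeds 5.5%.
STEATOSIS_THRESHOLD_PCT = 5.5


def has_steatosis(liver_fat_pct: float) -> bool:
    """Return True when liver fat (percent) exceeds the 5.5% cutoff."""
    return liver_fat_pct > STEATOSIS_THRESHOLD_PCT


def prevalence(liver_fat_values: list[float]) -> float:
    """Fraction of measurements that meet the steatosis cutoff."""
    flagged = sum(1 for v in liver_fat_values if has_steatosis(v))
    return flagged / len(liver_fat_values)


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy values for illustration only; these are NOT study data.
truth = [1.8, 2.2, 3.0, 6.1, 12.4]
predicted = [1.9, 2.1, 3.2, 5.9, 12.0]

if __name__ == "__main__":
    print("toy prevalence:", prevalence(truth))
    print("toy correlation:", round(pearson_r(truth, predicted), 4))
    # Counts reported in the abstract: 6,250 of 36,703 participants.
    print("reported prevalence (%):", round(6250 / 36703 * 100))
```

The last line only re-derives the percentage reported in the abstract from its own counts; the correlation on the toy lists says nothing about the model's actual performance.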

Competing Interest Statement

J.P.P. has served as a consultant for Maze Therapeutics. R.L. serves as a consultant or advisory board member for Arrowhead Pharmaceuticals, AstraZeneca, Boehringer-Ingelheim, Bristol-Myers Squibb, Celgene, Cirius, CohBar, Galmed, Gemphire, Gilead, Glympse bio, Intercept, Ionis, Inipharma, Merck, Metacrine, Inc., NGM Biopharmaceuticals, Novo Nordisk, Pfizer, and Viking Therapeutics. In addition, his institution has received grant support from Allergan, Boehringer-Ingelheim, Bristol-Myers Squibb, Eli Lilly and Company, Galmed Pharmaceuticals, Genfit, Gilead, Intercept, Janssen, Madrigal Pharmaceuticals, NGM Biopharmaceuticals, Novartis, Pfizer, pH Pharma, and Siemens. He is also co-founder of Liponexus, Inc. J.R.H. and A.Y.Z. are employees of Color Genomics. K.E.C. serves on the advisory boards of Novo Nordisk and BMS, has consulted for Gilead and has received grant funding from BMS, Boehringer-Ingelheim and Novartis. T.G.S. has served as a consultant for Aetion. A.P. is employed as a Venture Partner at GV, a venture capital group within Alphabet; he is also supported by a grant from Bayer AG to the Broad Institute focused on machine learning for clinical trial design. S.N.F. and P.B. are supported by grants from Bayer AG and IBM applying machine learning in cardiovascular disease. P.B. has served as a consultant to Novartis. P.T.E. is supported by a grant from Bayer AG to the Broad Institute focused on the genetics and therapeutics of cardiovascular diseases. P.T.E. has also served on advisory boards or consulted for Bayer AG, Quest Diagnostics, MyoKardia and Novartis. A.V.K. has served as a consultant to Sanofi, Medicines Company, Maze Pharmaceuticals, Navitor Pharmaceuticals, Verve Therapeutics, Amgen, and Color; received speaking fees from Illumina, MedGenome, and the Novartis Institute for Biomedical Research; received a sponsored research agreement from the Novartis Institute for Biomedical Research; and reports a pending patent related to a genetic risk predictor (20190017119).

Funding Statement

This research has been conducted using the UK Biobank resource, application 7089. Funding support was provided by NIH grants 1K08HG010155 (to A.V.K.) from the National Human Genome Research Institute, 1R01HL092577, R01HL128914, K24HL105780 (to P.T.E.), R01HL071739 (to M.B.) from the National Heart, Lung and Blood Institute, 5P42ES010337 (to R.L.) from the National Institute of Environmental Health Sciences, 5UL1TR001442 (to R.L.) from the National Center for Advancing Translational Sciences, R01DK106419, P30DK120515 (to R.L.), K23 DK122104 (to T.G.S.) from the National Institute of Diabetes and Digestive and Kidney Diseases, CA170674P2 (to R.L.) from the Department of Defense's Peer Reviewed Cancer Research Program, a Hassenfeld Scholar Award from Massachusetts General Hospital (to A.V.K.), a Merkin Institute Fellowship from the Broad Institute of MIT and Harvard (to A.V.K.), a John S LaDue Memorial Fellowship (to J.P.P.), a sponsored research agreement from IBM Research (to A.P., A.V.K.), and American Association for the Study of Liver Diseases Foundation Clinical and Translational Research Awards (to V.A. and T.G.S.). MESA and the MESA SHARe projects are conducted and supported by the National Heart, Lung, and Blood Institute (NHLBI) in collaboration with MESA investigators. Support for MESA is provided by contracts 75N92020D00001, HHSN268201500003I, N01-HC-95159, 75N92020D00005, N01-HC-95160, 75N92020D00002, N01-HC-95161, 75N92020D00003, N01-HC-95162, 75N92020D00006, N01-HC-95163, 75N92020D00004, N01-HC-95164, 75N92020D00007, N01-HC-95165, N01-HC-95166, N01-HC-95167, N01-HC-95168, N01-HC-95169, UL1-TR-000040, UL1-TR-001079, UL1-TR-001420, and supported in part by the National Center for Advancing Translational Sciences, CTSI grant UL1TR001881, and the National Institute of Diabetes and Digestive and Kidney Disease Diabetes Research Center (DRC) grant DK063491 to the Southern California Diabetes Endocrinology Research Center.
The authors thank the other investigators, the staff, and the participants of the MESA study for their valuable contributions. A full list of participating MESA investigators and institutions can be found at http://www.mesa-nhlbi.org.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

The UK Biobank study was approved by the Research Ethics Committee (reference 16/NW/0274) and informed consent was obtained from all participants. Analysis of UK Biobank data was conducted under application 7089 and was approved by the Mass General Brigham institutional review board (protocol 2013P001840). Framingham Heart Study and MESA genotype and phenotype data were retrieved for analysis from NCBI dbGAP under procedures approved by the Mass General Brigham institutional review board (protocol 2016P002395). Mass General Brigham Biobank participants each provided written informed consent and analysis was approved by the Mass General Brigham institutional review board.

All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Data Availability

Summary statistics for the liver fat CVAS, as well as the machine learning model architectures and learned weights will be available at the Cardiovascular Disease Knowledge Portal (http://broadcvdi.org/home/portalHome) and the ML4CVD modeling framework will be available via GitHub repository at time of publication.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
Posted September 03, 2020.
Subject Area

  • Genetic and Genomic Medicine