medRxiv

Artificial intelligence analytics applied to body mass index global burden of disease worldwide cohort data derives a multiple regression formula with population attributable fraction risk factor coefficients testable by all nine Bradford Hill causality criteria

David K Cundiff, Chunyi Wu
doi: https://doi.org/10.1101/2020.07.27.20162487
David K Cundiff
1Long Beach, California, USA
3Volunteer collaborators with the Institute of Health Metrics and Evaluation, Seattle, Washington, USA
MD
Roles: independent researcher
  • For correspondence: davidkcundiff{at}gmail.com
Chunyi Wu
2Area Specialist Lead in Epidemiology and Statistics, Michigan Medicine, Ann Arbor, Michigan, USA
3Volunteer collaborators with the Institute of Health Metrics and Evaluation, Seattle, Washington, USA
PhD
Roles: Research Epidemiologist/Statistician

Summary

Background Artificial intelligence (AI) analytics have not been applied to global burden of disease (GBD) risk factor data to study population health. The comparative risk assessment (CRA) methodology, which is based on systematic literature reviews and calculates population attributable fractions (PAFs, in percent), has not been utilised to quantify dietary and other risk factors for body mass index (BMI, kg/m²).

Methods Institute for Health Metrics and Evaluation (IHME) staff and volunteer collaborators analysed over 12,000 GBD risk factor surveys of people from 195 countries and synthesised the data into representative mean cohort BMI and risk factor values. We formatted the IHME GBD data relevant to BMI and its associated risk factors. We empirically explored the univariate and multiple regression correlations of BMI risk factors with worldwide BMI to derive a BMI multiple regression formula (BMI formula). The main outcome measure was the performance of the BMI formula when tested against all nine Bradford Hill causality criteria, each scored on a 0-5 scale (0=negative to 5=very strong support).
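The univariate and multiple regression steps described in the Methods can be illustrated with a minimal sketch. This is not the authors' SAS code: the data are synthetic, the three predictors and their effect sizes are hypothetical, and the helper names (pearson, fit_ols) are introduced here only for illustration. It shows the general technique of computing univariate correlations, fitting an ordinary least squares model, and correlating the fitted values with the observed outcome.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: one list of predictor values per cohort (intercept column added here).
    Returns [intercept, b1, b2, ...].
    """
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * p for a, p in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Synthetic cohorts (hypothetical values, NOT GBD data).
random.seed(0)
n = 500
red_meat = [random.uniform(0, 200) for _ in range(n)]
vegetables = [random.uniform(0, 300) for _ in range(n)]
activity = [random.uniform(0, 50) for _ in range(n)]
bmi = [22 + 0.02 * m - 0.01 * v - 0.03 * a + random.gauss(0, 0.5)
       for m, v, a in zip(red_meat, vegetables, activity)]

# Univariate correlations of each risk factor with BMI.
univariate = {
    "red_meat": pearson(red_meat, bmi),
    "vegetables": pearson(vegetables, bmi),
    "activity": pearson(activity, bmi),
}

# Multiple regression, then correlation of fitted versus observed BMI.
rows = list(zip(red_meat, vegetables, activity))
coefs = fit_ols(rows, bmi)
fitted = [coefs[0] + sum(c * x for c, x in zip(coefs[1:], r)) for r in rows]
r_fit = pearson(fitted, bmi)

print(univariate)
print(coefs, r_fit)
```

The final correlation between fitted and observed values plays the same role as the "BMI formula versus BMI" correlation reported in the Findings, although here it is computed on made-up data.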

Findings In the derived BMI formula, all foods were expressed in kilocalories/day (kcal/day), and the risk factor coefficients were adjusted to equal their PAFs. BMI increasing foods had “+” signs and BMI decreasing foods “-” signs. Total BMI formula PAF=80.96%.

BMI formula=(0.37%*processed meat + 4.23%*red meat + 0.02%*fish + 2.24%*milk + 5.67%*poultry + 1.77%*eggs + 0.34%*alcohol + 0.99%*sugary beverages + 0.04%*corn + 0.72%*potatoes + 8.48%*saturated fatty acids + 3.89%*polyunsaturated fatty acids + 0.27%*trans fatty acids - 2.99%*fruit - 4.07%*vegetables - 0.37%*nuts and seeds - 0.45%*whole grains - 1.49%*legumes - 8.62%*rice - 0.10%*sweet potatoes - 7.45%*physical activity (METs/week) - 20.38%*child underweight + 6.02%*sex (male=1, female=2))*0.05012 + 21.77.

BMI formula versus BMI: r=0.907, 95% CI: 0.903 to 0.911, p<0.0001. Bradford Hill causality criteria test scores (0-5): (1) strength=5, (2) experimentation=5, (3) consistency=5, (4) dose-response=5, (5) temporality=5, (6) analogy=4, (7) plausibility=5, (8) specificity=5, and (9) coherence=5. Total score=44/45.
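The structure of the BMI formula can be sketched in a few lines of Python. The coefficients, the multiplier 0.05012, and the constant 21.77 are copied from the Findings. Everything else is an assumption made for illustration: the dictionary keys and the function name bmi_formula are hypothetical, the percent coefficients are used as plain numbers (0.37 rather than 0.0037), and the abstract does not say how the input values are scaled before the coefficients are applied, so the sketch should not be expected to give plausible BMIs from raw kcal/day inputs.

```python
# Coefficients as printed in the abstract, in percent (PAF) units.
# Positive values raise the predicted BMI, negative values lower it.
BMI_COEFFICIENTS = {
    "processed_meat": 0.37,
    "red_meat": 4.23,
    "fish": 0.02,
    "milk": 2.24,
    "poultry": 5.67,
    "eggs": 1.77,
    "alcohol": 0.34,
    "sugary_beverages": 0.99,
    "corn": 0.04,
    "potatoes": 0.72,
    "saturated_fatty_acids": 8.48,
    "polyunsaturated_fatty_acids": 3.89,
    "trans_fatty_acids": 0.27,
    "fruit": -2.99,
    "vegetables": -4.07,
    "nuts_and_seeds": -0.37,
    "whole_grains": -0.45,
    "legumes": -1.49,
    "rice": -8.62,
    "sweet_potatoes": -0.10,
    "physical_activity": -7.45,   # METs/week in the abstract
    "child_underweight": -20.38,
    "sex": 6.02,                  # coded male=1, female=2 in the abstract
}
SCALE = 0.05012      # multiplier applied to the weighted sum
INTERCEPT = 21.77    # constant term


def bmi_formula(values):
    """Evaluate the weighted sum, then apply the scale and the constant.

    values: dict with one entry per key in BMI_COEFFICIENTS
    (input scaling is an assumption; the abstract does not specify it).
    """
    missing = BMI_COEFFICIENTS.keys() - values.keys()
    if missing:
        raise KeyError(f"missing risk factors: {sorted(missing)}")
    weighted_sum = sum(coef * values[name]
                       for name, coef in BMI_COEFFICIENTS.items())
    return weighted_sum * SCALE + INTERCEPT


# Sum of absolute coefficients, for comparison with the reported total PAF.
total_paf = sum(abs(c) for c in BMI_COEFFICIENTS.values())

# With every input at zero, only the constant term remains.
baseline = bmi_formula({name: 0.0 for name in BMI_COEFFICIENTS})

print(round(total_paf, 2), baseline)
```

The absolute values of the 23 coefficients add up to about 80.97, in line with the reported total PAF of 80.96% once rounding of the individual terms is allowed for.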

Interpretation Nine Bradford Hill causality criteria strongly supported a causal relationship between the BMI formula derived and mean BMIs of worldwide cohorts. The artificial intelligence methodology introduced could inform individual, clinical, and public health strategies regarding overweight/obesity prevention/treatment and other health outcomes.

Funding None

Evidence before this study Comparative risk assessment (CRA) systematic literature review-based methodology has been used in worldwide global burden of disease (GBD) analysis to determine population attributable fraction(s) (PAF(s)) for one or more risk factors for various health outcomes. So far, CRA has not been applied to derive PAFs for dietary and other risk factors for worldwide BMI. Artificial intelligence (AI) analytics has not yet been applied to worldwide GBD data as an alternative to the CRA methodology for determining risk factor PAFs for health outcomes.

Added value of this study A multiple regression-derived BMI formula (BMI formula) including PAFs of 20 dietary risk factors, physical activity, childhood severe underweight, and sex satisfied all nine Bradford Hill causality criteria. The BMI formula also plausibly predicted the long-term BMI outcomes related to various dietary and physical activity scenarios. All 23 of the BMI formula’s risk factor PAFs were consistent in sign (+ or -) with the preponderance of previously published studies on those risk factors related to BMI.

Implications of all the available evidence The AI analytics methodology of GBD data modeling of BMI and associated risk factors infers causality of the BMI formula estimates with BMI worldwide and BMIs of subsets. This methodology may enable multiple regression formulas for risk factors of health outcomes for a range of non-communicable diseases—testable by Bradford Hill causality criteria.

Competing Interest Statement

The authors have declared no competing interest.

Clinical Trial

The data came from over 12,000 surveys compiled by the Institute for Health Metrics and Evaluation.

Funding Statement

This research received no grant from any funding agency in the public, commercial or not-for-profit sectors. The Bill and Melinda Gates Foundation funded the acquisition of the data for this analysis by the IHME. The data were provided to the authors as volunteer collaborators with IHME.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

Ethics committee and IRB approval: NA. This study is based solely on data from IHME GBD database, which we have IHME permission to use.

All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Footnotes

  • Title changed, manuscript revised and edited along with the tables.

Data Availability

The raw, unformatted data used in this analysis is now out of date. The 2019 GBD data on all the variables in this analysis may be obtained from the IHME by volunteer collaborating researchers. The formatted database, SAS codes and Excel spreadsheets on which this analysis is based are posted on the Mendeley data repository.

https://data.mendeley.com/v1/datasets/publish-confirmation/g6b39zxck4/6

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license.
Posted August 26, 2021.
Subject Area

  • Public and Global Health