RT Journal Article
SR Electronic
T1 County-level Socio-Environmental Factors and Obesity Prevalence in the United States
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2023.12.13.23299918
DO 10.1101/2023.12.13.23299918
A1 Salerno, Pedro R.V.O.
A1 Qian, Alice
A1 Dong, Weichuan
A1 Deo, Salil
A1 Nasir, Khurram
A1 Rajagopalan, Sanjay
A1 Al-Kindi, Sadeer
YR 2023
UL http://medrxiv.org/content/early/2023/12/14/2023.12.13.23299918.abstract
AB Aims: There are substantial geographical variations in obesity prevalence. Sociodemographic and environmental determinants of health (SEDH), understood as upstream determinants of obesogenic behaviors, may be contributing to this disparity. Thus, we investigated high-risk SEDH potentially associated with adult obesity in American counties using machine learning (ML) techniques.
Materials and methods: We performed a cross-sectional analysis of county-level adult obesity prevalence (BMI ≥30 kg/m²) in the U.S. using data from the Diabetes Surveillance System 2017. We compiled 49 county-level SEDH factors, which a Classification and Regression Trees (CART) model used to identify county-level clusters. CART was validated using a “hold-out” set of counties, and variable importance was evaluated using Random Forest.
Results: Overall, we analyzed 2,752 counties in the U.S., identifying a national median obesity prevalence of 34.1% (IQR, 30.2, 37.7). CART identified 11 clusters with a 60.8% relative increase in prevalence across the spectrum. Additionally, 7 key SEDH variables were identified by CART to guide the categorization of clusters, including Physically Inactive (%), Diabetes, Severe Housing Problems (%), Food Insecurity (%), Uninsured (%), Population over 65 years (%), and Non-Hispanic Black (%).
Conclusion: There is significant county-level geographical variation in obesity prevalence in the United States, which can in part be explained by complex SEDH factors.
The use of ML techniques to analyze these factors can provide valuable insights into the importance of these upstream determinants of obesity and, therefore, aid in the development of geo-specific strategic interventions and optimize resource allocation to help battle the obesity pandemic.
Article Highlights
Why did we undertake this study? To improve the understanding of the association between complex sociodemographic and environmental determinants of health (SEDH) and obesity prevalence in the U.S.
What specific question did we want to answer? What are the SEDH associated with obesity prevalence?
What did we find? Seven key SEDH variables were identified by CART to guide the categorization of clusters, including Physically Inactive (%), Diabetes, Severe Housing Problems (%), Food Insecurity (%), Uninsured (%), Population over 65 years (%), and Non-Hispanic Black (%).
What are the implications of our findings? Our study shows the importance of SEDH for the regional variation of obesity prevalence and aids in the development of geo-specific strategies to reduce disparities.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This study did not receive any funding.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The study used (or will use) ONLY openly available human data that were originally located at the Diabetes Surveillance System (DSS) from the Centers for Disease Control and Prevention (CDC), and County Health Rankings & Roadmaps.
Link: https://www.cdc.gov/diabetes/data/index.html
Link: https://www.countyhealthrankings.org/explore-health-rankings/rankings-data-documentation
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes
All data produced in the present study are available upon reasonable request to the authors.
https://www.cdc.gov/diabetes/data/index.html
https://www.countyhealthrankings.org/explore-health-rankings/rankings-data-documentation
https://www.epa.gov/ejscreen/download-ejscreen-data
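The Materials and methods summary in this record says a CART model grouped counties using SEDH factors. As a rough illustration of the core step in a regression tree (choosing the one binary split that most reduces the squared error of the outcome), here is a minimal standard-library sketch. It is not the authors' pipeline: the function names `sse` and `best_split`, the feature names, and the four toy counties are all hypothetical and invented for illustration, and the hold-out validation and Random Forest importance steps are not shown.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, target):
    """Find the single binary split with the lowest total squared error.

    rows:   list of dicts mapping feature name -> numeric value
    target: list of outcome values, aligned with rows
    Returns (feature, threshold, sse_after_split).
    """
    best = None
    for feature in rows[0]:
        distinct = sorted({r[feature] for r in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(distinct, distinct[1:]):
            threshold = (lo + hi) / 2
            left = [y for r, y in zip(rows, target) if r[feature] <= threshold]
            right = [y for r, y in zip(rows, target) if r[feature] > threshold]
            score = sse(left) + sse(right)
            if best is None or score < best[2]:
                best = (feature, threshold, score)
    return best


# Hypothetical toy counties; these values are NOT taken from the study.
rows = [
    {"physically_inactive_pct": 18.0, "uninsured_pct": 12.0},
    {"physically_inactive_pct": 20.0, "uninsured_pct": 8.0},
    {"physically_inactive_pct": 30.0, "uninsured_pct": 11.0},
    {"physically_inactive_pct": 32.0, "uninsured_pct": 9.0},
]
prevalence = [28.0, 29.0, 37.0, 38.0]  # hypothetical obesity prevalence (%)

feature, threshold, score = best_split(rows, prevalence)
print(feature, threshold, score)
```

A full tree would apply `best_split` recursively to each resulting subset until a stopping rule is met; the leaves of that tree would then play the role of the county clusters the abstract describes.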