medRxiv

A satellite-based spatio-temporal machine learning model to reconstruct daily PM2.5 concentrations across Great Britain

Rochelle Schneider dos Santos, Ana M. Vicedo-Cabrera, Francesco Sera, Pierre Masselot, Massimo Stafoggia, Kees de Hoogh, Itai Kloog, Stefan Reis, Massimo Vieno, Antonio Gasparrini
doi: https://doi.org/10.1101/2020.07.19.20157396
Rochelle Schneider dos Santos
1Department of Public Health, Environments and Society, London School of Hygiene & Tropical Medicine, London WC1H 9SH, UK
2The Centre on Climate Change and Planetary Health, London School of Hygiene and Tropical Medicine, London WC1H 9SH, UK
3Forecast Department, European Centre for Medium-Range Weather Forecast, Reading, UK
  • For correspondence: rochelle.schneider{at}lshtm.ac.uk
Ana M. Vicedo-Cabrera
4Institute of Social and Preventive Medicine, University of Bern, 3012 Bern, Switzerland
5Oeschger Center for Climate Change Research, University of Bern, 3012 Bern, Switzerland
Francesco Sera
1Department of Public Health, Environments and Society, London School of Hygiene & Tropical Medicine, London WC1H 9SH, UK
Pierre Masselot
1Department of Public Health, Environments and Society, London School of Hygiene & Tropical Medicine, London WC1H 9SH, UK
Massimo Stafoggia
6Department of Epidemiology, Lazio Regional Health Service, Rome 00147, Italy
Kees de Hoogh
7Swiss Tropical and Public Health Institute, Basel, Switzerland
8University of Basel, Basel, Switzerland
Itai Kloog
9Department of Geography and Environmental Development, Ben-Gurion University of the Negev, P.O.B. 653 Beer Sheva, Israel
Stefan Reis
10UK Centre for Ecology & Hydrology, Bush Estate, Penicuik, Edinburgh, Midlothian, EH26 0QB, UK
11University of Exeter Medical School, Knowledge Spa, Truro, TR1 3HD, United Kingdom
Massimo Vieno
10UK Centre for Ecology & Hydrology, Bush Estate, Penicuik, Edinburgh, Midlothian, EH26 0QB, UK
Antonio Gasparrini
1Department of Public Health, Environments and Society, London School of Hygiene & Tropical Medicine, London WC1H 9SH, UK
2The Centre on Climate Change and Planetary Health, London School of Hygiene and Tropical Medicine, London WC1H 9SH, UK
12Centre for Statistical Methodology, London School of Hygiene and Tropical Medicine, London WC1E 7HT, UK

Abstract

Epidemiological studies on the health effects of air pollution usually rely on measurements from fixed ground monitors, which provide limited spatio-temporal coverage. Data from satellites, reanalysis, and chemical transport models offer additional information that can be used to reconstruct pollution concentrations at high spatio-temporal resolution. The aim of this study is to develop a multi-stage satellite-based machine learning model to estimate daily fine particulate matter (PM2.5) levels across Great Britain during 2008-2018. This high-resolution model consists of random forest (RF) algorithms applied in four stages. Stage-1 augments monitor-PM2.5 series using co-located PM10 measures. Stage-2 imputes missing satellite aerosol optical depth observations using atmospheric reanalysis models. Stage-3 integrates the output from the previous stages with spatial and spatio-temporal variables to build a prediction model for PM2.5. Stage-4 applies Stage-3 models to estimate daily PM2.5 concentrations over a 1 km grid. The RF architecture performed well in all stages, with results from Stage-3 showing an average cross-validated R2 of 0.767 and minimal bias. The model performed better in the temporal than in the spatial component, but both showed good accuracy, with R2 values of 0.795 and 0.658, respectively. The high spatio-temporal resolution and relatively high precision allow this dataset (approximately 950 million points) to be used in epidemiological analyses to assess health risks associated with both short- and long-term exposures to PM2.5.
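The abstract reports an overall, a spatial, and a temporal cross-validated R2 but does not define how the split is made. The sketch below illustrates one common convention, which is an assumption here rather than the authors' stated method: spatial R2 compares per-monitor means of observed and predicted values, and temporal R2 compares daily deviations from those means. R2 is taken as the squared Pearson correlation. The function names and the toy records are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def r_squared(observed, predicted):
    """Squared Pearson correlation between two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    mean_obs, mean_pred = mean(observed), mean(predicted)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    return cov * cov / (var_obs * var_pred)


def decompose_r2(records):
    """Split held-out accuracy into overall, spatial, and temporal R2.

    records: iterable of (monitor_id, observed, predicted) tuples, assumed
    to come from cross-validation folds (hypothetical input format).
    """
    records = list(records)
    by_monitor = defaultdict(list)
    for monitor, obs, pred in records:
        by_monitor[monitor].append((obs, pred))

    # Per-monitor means carry the between-site (spatial) signal.
    obs_mean = {m: mean(o for o, _ in rows) for m, rows in by_monitor.items()}
    pred_mean = {m: mean(p for _, p in rows) for m, rows in by_monitor.items()}
    monitors = list(by_monitor)
    spatial = r_squared(
        [obs_mean[m] for m in monitors],
        [pred_mean[m] for m in monitors],
    )

    # Deviations from the monitor mean carry the day-to-day (temporal) signal.
    obs_dev = [obs - obs_mean[m] for m, obs, _ in records]
    pred_dev = [pred - pred_mean[m] for m, _, pred in records]
    temporal = r_squared(obs_dev, pred_dev)

    overall = r_squared(
        [obs for _, obs, _ in records],
        [pred for _, _, pred in records],
    )
    return {"overall": overall, "spatial": spatial, "temporal": temporal}


if __name__ == "__main__":
    # Toy values for three hypothetical monitors, not data from the study.
    toy = [
        ("A", 9.0, 9.5), ("A", 12.0, 11.0), ("A", 15.0, 13.5),
        ("B", 18.0, 19.0), ("B", 22.0, 20.5), ("B", 25.0, 26.0),
        ("C", 6.0, 7.5), ("C", 7.0, 6.0), ("C", 11.0, 10.0),
    ]
    for name, value in decompose_r2(toy).items():
        print(f"{name}: {value:.3f}")
```

Under this convention, a model can rank sites correctly (high spatial R2) while tracking day-to-day variation poorly, or the reverse, which is why the two components are reported separately.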

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This study was supported by the Medical Research Council-UK (Grant ID: MR/M022625/1), the Natural Environment Research Council UK (Grant ID: NE/R009384/1), and the European Union's Horizon 2020 Project Exhaustion (Grant ID: 820655). EMEP4UK Model results and contributions by S.R. and M.V. were supported by award number NE/R016429/1 as part of the UK-SCAPE programme delivering National Capability.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

The study is based on publicly available data and it does not make use of sensitive data.

All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Footnotes

  • Novel definition of cross-validated predictors for monitor-based variables; results have changed accordingly, with lower cross-validated R2, especially in the spatial part; Stage-1 model reverted to annual, which led to the exclusion of years 2003-2007 due to the low number of PM2.5 monitors; small changes in the selection of monitors, with revision of the inclusion/exclusion criteria; figures updated to better reflect annual and daily variation in space and time.

Data Availability

All data used to perform the analysis are in the public domain. References and sources are provided in the text.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted September 17, 2020.
A satellite-based spatio-temporal machine learning model to reconstruct daily PM2.5 concentrations across Great Britain
Rochelle Schneider dos Santos, Ana M. Vicedo-Cabrera, Francesco Sera, Pierre Masselot, Massimo Stafoggia, Kees de Hoogh, Itai Kloog, Stefan Reis, Massimo Vieno, Antonio Gasparrini
medRxiv 2020.07.19.20157396; doi: https://doi.org/10.1101/2020.07.19.20157396
Subject Area

  • Occupational and Environmental Health