
Distribution of gestational age by maternal and infant characteristics in US birth certificate data: informing gestational age assumptions when clinical estimates are not available

Andrea V Margulis, Brian Calingaert, Alison T Kawai, Elena Rivero-Ferrer, Mary S Anthony
doi: https://doi.org/10.1101/2022.10.19.22281268
Andrea V Margulis
1FISPE, RTI Health Solutions, Barcelona, Spain
MD ScD
  • For correspondence: amargulis{at}rti.org
Brian Calingaert
2RTI Health Solutions, Research Triangle Park, NC, United States
MS
Alison T Kawai
3RTI Health Solutions, Waltham, MA, United States
ScD
Elena Rivero-Ferrer
1FISPE, RTI Health Solutions, Barcelona, Spain
MD MPH
Mary S Anthony
2RTI Health Solutions, Research Triangle Park, NC, United States
PhD

ABSTRACT

We aimed to describe the distribution of gestational age at birth (GAB) to inform the estimation of GAB when clinical or obstetric estimates are not available for perinatal epidemiologic research. We estimated GAB (median, mode, mean, standard deviation) and percentage born at each gestational week in groups based on plurality and other variables for live births in CDC’s US birth data.

In 2020, 3,617,213 newborns had birth certificates with nonmissing GAB. Among singletons (3,501,693), median and mode GAB were both 39 weeks. Singleton groups with lower median GAB were births to women with eclampsia (37 weeks) or women receiving intensive care (37 weeks); newborns receiving intensive care (37 weeks); infants with birth weight < 2,500 grams (35 weeks), < 1,500 grams (28 weeks), or < 1,000 grams (25 weeks); and newborns not discharged alive (23 weeks). Among twins (112,633), median GAB was 36 weeks (mode, 37 weeks). Additional twin groups with lower median GAB were pregnancies with 7-8 prenatal visits (median, 35 weeks) or 0-6 prenatal visits (median, 34 weeks) and pregnancies in women aged 15-19 years (median, 35 weeks).

Some maternal and infant groups had distinct GAB distributions in the US. This information can be useful in estimating GAB when individual-level clinical estimates are not available.

INTRODUCTION

In observational studies, researchers who use existing data sources to ascertain medication use or other exposures or events in pregnancy need to know when each pregnancy in the study population started. Whenever possible, researchers use obstetric or clinical estimates; otherwise, they typically use coded information available in their data. International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) Z3A codes for gestational age facilitate the process in claims data sources in the United States (US) to a certain extent [1]. For pregnancies without the relevant clinical or obstetric information and without informative codes, researchers use other estimation methods. Often, such pregnancies are assigned a fixed duration based on the observed mode or median gestational age at birth (GAB) of pregnancies with some known characteristic, such as 34 [2], 35 [1,3,4], or 36 [5] weeks for preterm live births; 39 [1,4] or 40 [2,3,5] weeks for term live births; or 37 weeks for multifetal pregnancies [5]. Then, the assigned GAB is subtracted from the delivery or birth date (which is usually available) to estimate the pregnancy start date and assess the timing of exposure relative to pregnancy start.
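
The subtraction step described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any cited algorithm: the function name `estimate_pregnancy_start`, the category labels, and the choice of 35, 39, and 37 weeks (one option each from the ranges cited above) are assumptions made for the example.

```python
from datetime import date, timedelta

# Fixed durations in completed weeks. These are example choices taken from the
# ranges cited in the text (assumption); a real study would justify its own.
ASSUMED_GAB_WEEKS = {
    "preterm": 35,
    "term": 39,
    "multifetal": 37,
}


def estimate_pregnancy_start(birth_date: date, category: str) -> date:
    """Subtract the assumed gestational age at birth from the birth date."""
    weeks = ASSUMED_GAB_WEEKS[category]
    return birth_date - timedelta(weeks=weeks)


# A term live birth on 1 October 2020 is assigned a start date 39 weeks earlier.
print(estimate_pregnancy_start(date(2020, 10, 1), "term"))  # 2020-01-02
```

Keeping the assumed durations in one lookup table makes it easy to swap in group-specific values when a data source supports finer groups.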

Healthcare claims and electronic health records contain information that might be used to identify groups of pregnancies with specific characteristics for which the GAB distribution differs from that of the general population of pregnancies; these distributions can in turn be used to estimate pregnancy start more accurately in those groups. The objective of this work was to describe the distribution of GAB in US birth certificates in groups defined by maternal or newborn characteristics that may also be captured in US healthcare claims or other data sources—e.g., plurality (singleton, twin, etc.), maternal age, race/ethnicity, smoking during pregnancy, body mass index (BMI) categories, birth weight—to inform the estimation of GAB when clinical or obstetric estimates are not available.

METHODS

The completed checklist for methods reporting in perinatal pharmacoepidemiology [6] is presented in Appendix A, Table A-1.

Data source

We used US birth data files of the Centers for Disease Control and Prevention (CDC) [7,8] for years 2019 (the most recent year before the COVID-19 pandemic) and 2020 (the most recent available data). These files are publicly available for download. Each row corresponds to 1 live birth and contains information on the mother, the pregnancy, and the offspring. We included live births to foreign residents [9,10]; this is why our totals are about 0.25% larger than the ones in CDC’s final reports, which did not include these live births [11,12]. Variables correspond to the fields in US birth certificates; GAB is an obstetric estimate. Fields that might facilitate identification of individuals are not included in the downloadable data. No linkage with other data sources was sought in this study.

Study population

The study population included all pregnancies ending in a live birth with nonmissing GAB; pregnancies with fetuses with chromosomal abnormalities or congenital malformations (minor or major) and fetuses from multifetal pregnancies were included. Women may have contributed pregnancies in 1 or both years; this information is not directly available from the data source; intrafamily correlation was not considered in the analyses. Each analysis included pregnancies with nonmissing values for the variables used in that analysis.

Variables

Study variables are listed in Appendix A, Table A-2. Variables for this study were a subset of the variables included in the data source [9,10]. GAB is provided as the number of completed weeks at the time of birth (range, 17-47).

Statistical analysis

The unit of analysis in this study was live births, but, for clarity, maternal or pregnancy characteristics were described in terms of women or pregnancies; offspring characteristics were described in terms of newborns.

We estimated summary statistics for GAB and the percentage of infants born at each gestational week (e.g., 0.1% born at 17 weeks, 0.2% born at 18 weeks) in various groups of live births, separately for 2019 and 2020. The GAB distribution is presented for all live births, for singletons only, and for twins only. Because these tables are sizable, the complete tables are presented with the supplemental information (see Appendix B: Table B-1, results for all live births; Table B-2, results for singleton live births; Table B-3, results for twin live births).
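
As an illustration of these group-wise summaries, the following Python sketch computes the median, mode, mean, SD, and percentage born at each gestational week for each level of a grouping variable. The records are toy values, not CDC data, and the field names (`plurality`, `gab`), the tie-break for multiple modes, and the use of the population SD are assumptions made for the example.

```python
from collections import Counter, defaultdict
from statistics import mean, median, multimode, pstdev

# Toy records (hypothetical, not CDC data). None marks a missing GAB.
births = [
    {"plurality": "singleton", "gab": 39},
    {"plurality": "singleton", "gab": 39},
    {"plurality": "singleton", "gab": 40},
    {"plurality": "singleton", "gab": 38},
    {"plurality": "singleton", "gab": 37},
    {"plurality": "singleton", "gab": None},
    {"plurality": "twin", "gab": 36},
    {"plurality": "twin", "gab": 37},
    {"plurality": "twin", "gab": 37},
    {"plurality": "twin", "gab": 35},
]


def summarize_gab(records, group_key):
    """Summarize GAB by group, using only records with nonmissing values."""
    by_group = defaultdict(list)
    for rec in records:
        if rec.get("gab") is None or rec.get(group_key) is None:
            continue  # each analysis keeps only nonmissing values
        by_group[rec[group_key]].append(rec["gab"])

    summary = {}
    for group, values in by_group.items():
        n = len(values)
        counts = Counter(values)
        summary[group] = {
            "n": n,
            "median": median(values),
            "mode": min(multimode(values)),  # smallest week on ties (assumption)
            "mean": mean(values),
            "sd": pstdev(values),  # population SD (assumption)
            "pct_by_week": {w: 100 * c / n for w, c in sorted(counts.items())},
        }
    return summary


summary = summarize_gab(births, "plurality")
print(summary["singleton"]["pct_by_week"])
```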

To explore whether the median, mode, or mean would result in a smaller estimation error, we calculated 2 metrics for each of the 3 summary statistics: the mean squared error and the mean absolute value of the error. The mean squared error using the median in group X (e.g., singletons born small for gestational age) was calculated as follows: the observed GAB for live birth i in group X minus the median GAB in group X, squared, averaged across all newborns in group X. Similar calculations were conducted for the mean and mode. The mean absolute value of the error was calculated similarly, applying the absolute value instead of the square. A smaller mean squared error or mean absolute value of the error reflects a more precise estimation. More details on the methods are presented in Appendix A, and results are presented in Appendix B, Table B-4.
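
The two metrics can be written out directly from this description. The sketch below is an illustration on a small hypothetical left-skewed group (not study data); the function names and the tie-break for multiple modes are assumptions.

```python
from statistics import mean, median, multimode


def error_metrics(gab_values, estimate):
    """Return (mean squared error, mean absolute error) for one fixed estimate."""
    n = len(gab_values)
    mse = sum((g - estimate) ** 2 for g in gab_values) / n
    mae = sum(abs(g - estimate) for g in gab_values) / n
    return mse, mae


def compare_statistics(gab_values):
    """Evaluate the median, mode, and mean as the fixed GAB estimate."""
    candidates = {
        "median": median(gab_values),
        "mode": min(multimode(gab_values)),  # smallest week on ties (assumption)
        "mean": mean(gab_values),
    }
    return {name: error_metrics(gab_values, est) for name, est in candidates.items()}


# Hypothetical group X with a tail of short pregnancies (completed weeks).
group_x = [30, 36, 38, 39, 39, 39, 40]
results = compare_statistics(group_x)
for name, (mse, mae) in results.items():
    print(f"{name}: MSE={mse:.2f}, MAE={mae:.2f}")
```

In this toy group the mean gives the smaller squared error and the median gives the smaller absolute error.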

RESULTS

Overall

In 2019, 3,757,582 liveborn infants were issued birth certificates in the US; 3,755,044 (99.9%) birth certificates had information on GAB (Table 1 and Appendix B, Table B-1). Median and mode GAB were both 39 weeks, and the mean (standard deviation [SD]) was 38.4 (2.1) weeks (Appendix B, Table B-1). In 2020, there were 3,619,826 live births; 3,617,213 (99.9%) had information on GAB (Table 1 and Appendix B, Table B-1). The GAB median, mode, mean, and SD were nearly the same as in 2019 (Appendix B, Table B-1). Results for 2019 and 2020 were very similar (Table 1 and Figure 1); for further descriptions, we use data from 2020, the latest available information at the time of study conduct.

Table 1. Characteristics of study population, USA 2019 and 2020
Figure 1. Distribution of live births by gestational age at birth, USA 2020 and 2019

In 2020, 92.0% of live births occurred in women aged 20 to 39 years (Table 1). Overall, 51% of live births occurred in non-Hispanic White women, 24.1% in Hispanic women, and 14.6% in non-Hispanic Black women. Almost 87% of women completed high school or further studies, and 52.4% were married (11.6% had unknown marital status). Over 93% did not smoke during pregnancy; 3.3% smoked 10 or more cigarettes daily during at least 1 trimester. In total, 56.1% of women had BMI ≥ 25 kg/m²; 8.8% of women had preexisting or gestational diabetes, and 10.9% had preexisting or gestational hypertension. In 2020, 96.8% (3,501,693) of live births were singletons, 3.1% (112,633) were twins, 0.1% (2,750) were triplets, and 137 were quadruplets or higher order (Table 1; Figure 2 shows the distribution of gestational age at birth by plurality). Of all live births, 10.1% were preterm (GAB < 37 completed weeks) (median, 35 weeks; mode, 36 weeks; mean, 33.8 weeks) (Appendix B, Table B-1).

Figure 2. Distribution of live births by gestational age at birth, by plurality, USA 2020

Singletons

Among singletons (total, 3,501,693), the median and mode GAB were 39 weeks in most groups; no groups had larger median or mode GAB (Table 2; Appendix B, Table B-2); 8.4% of singletons (8.1% of live births) were preterm (median, 35 weeks; mode, 36 weeks; mean, 33.9 weeks). The following groups had lower median or mode GAB (in descending order of frequency): newborns admitted to a neonatal intensive care unit (290,056 [8.3% of singletons, 8.0% of live births]; median, 37 weeks; mode, 39 weeks; mean, 35.8 weeks), newborns with low birth weight, < 2,500 grams (233,500 [6.7% of singletons, 6.5% of live births]; median, 35 weeks; mode, 37 weeks; mean, 34.4 weeks), newborns with very low birth weight, < 1,500 grams (37,177 [1.1% of singletons, 1.0% of live births]; median and mode, 28 weeks; mean, 27.6 weeks), newborns with extremely low birth weight, < 1,000 grams (17,861 [0.5% of singletons and live births]; median and mode, 25 weeks; mean, 24.9 weeks), women with eclampsia (9,263 [0.3% of singletons and live births]; median and mode, 37 weeks; mean, 36.5 weeks), newborns not discharged alive (6,730 [0.2% of singletons and live births]; median, 23 weeks; mode, 22 weeks; mean, 25.6 weeks), and women admitted to an intensive care unit as a complication of delivery or labor (5,498 [0.2% of singletons and live births]; median, 37 weeks; mode, 39 weeks; mean, 35.8 weeks).

Table 2. Gestational age at birth in completed weeks, singletons, USA 2019 and 2020

Twins

Among twins in 2020 (total, 112,633), median GAB was 36 weeks and mode was 37 weeks in most groups (Appendix B, Table B-3); 59.9% of twins (1.9% of live births) were preterm (median, 35 weeks; mode, 36 weeks; mean, 33.5 weeks). As with singletons, no groups had a larger median or mode GAB. The characteristics that identified groups with lower median or mode GAB among singletons also did so among twins. Additional groups with lower median or mode GAB were (in descending order of frequency) pregnancies with 0 to 6 prenatal visits (15,530 live births [13.8% of twins, 0.4% of live births]; median, 34 weeks; mode, 36 weeks; mean, 32.5 weeks), pregnancies with 7 or 8 prenatal visits (13,013 live births [11.6% of twins, 0.4% of live births]; median, 35 weeks; mode, 36 weeks; mean, 34.3 weeks), pregnancies in which the mother smoked 10 or more cigarettes per day during pregnancy (3,965 [3.5% of twins, 0.1% of live births]; median and mode, 36 weeks; mean, 34.6 weeks), pregnancies with maternal age 15 to 19 years (2,499 live births [2.2% of twins, 0.1% of live births]; median, 35 weeks; mode, 37 weeks; mean, 34.1 weeks), pregnancies in which the mother smoked 1 to 9 cigarettes per day during pregnancy (2,497 [2.2% of twins, 0.1% of live births]; median and mode, 36 weeks; mean, 35.0 weeks), and pregnancies with maternal age < 15 years (19 live births [0.02% of twins, 0.001% of all live births]; median, 35 weeks; mode, 36 weeks; mean, 32.2 weeks). As among singletons, newborns not discharged alive had the smallest median and mode GAB.

Other results

Generally, pregnancies with characteristics that can be considered healthy had a narrower GAB distribution; for example, 77.8% of singletons of women with BMI from 18.5 to less than 25 kg/m² had a GAB within 1 week of the mode (38 through 40 weeks), while only 35.9% of pregnancies in which newborns were admitted to the neonatal intensive care unit had a GAB within 1 week of the mode (also 38 through 40 weeks). Newborns with birth weight < 1,500 grams had practically the same GAB distribution, regardless of whether they were singletons or twins, and these distributions were broader than that for singletons with birth weight ≥ 1,500 grams (Appendix A, Figure A-1; Appendix B, Table B-3). Among newborns not discharged alive, GAB had a mode at 21 weeks (13.2% of 8,162 newborns) and a small increase at 37 weeks (3.7%) (Appendix A, Figure A-2, and Appendix B, Table B-1).
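
The "within 1 week of the mode" measure of spread used above can be computed as in the following Python sketch. The two input lists are hypothetical values chosen to contrast a narrow and a broad distribution; they are not study data, and the function name and tie-break rule are assumptions.

```python
from collections import Counter


def pct_within_one_week_of_mode(gab_values):
    """Return (mode, percentage of births with GAB in mode-1 .. mode+1)."""
    counts = Counter(gab_values)
    top = max(counts.values())
    # Smallest week wins if several weeks share the top count (assumption).
    mode = min(week for week, c in counts.items() if c == top)
    within = sum(c for week, c in counts.items() if abs(week - mode) <= 1)
    return mode, 100 * within / len(gab_values)


# Hypothetical completed-week values for a narrow and a broad group.
narrow_group = [37, 38, 39, 39, 39, 40, 40, 41]
broad_group = [28, 33, 34, 35, 36, 37, 38, 39, 39, 40]

print(pct_within_one_week_of_mode(narrow_group))  # (39, 75.0)
print(pct_within_one_week_of_mode(broad_group))   # (39, 40.0)
```

Both toy groups share the same mode, but the share of births close to it differs, which is the pattern described for the BMI and neonatal intensive care groups.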

The means displayed more variation than the medians and modes; SDs often increased as means decreased. For example, the mean (SD) GAB was 38.6 (1.7) weeks for singletons of women who did not smoke during pregnancy in 2020; 38.2 (1.9) weeks for women who smoked < 10 cigarettes daily during pregnancy, and 38.1 (2.2) weeks for women who smoked 10 or more cigarettes daily in at least 1 trimester; however, the median and mode were 39 weeks for these 3 groups (Table 2; Appendix B, Table B-2).

Mean squared errors were smaller when calculated using the mean than when using the median or mode; in contrast, mean absolute values of the errors were smaller when calculated using the median than when using the mean or the mode (when the median and mode were the same, mean absolute values of the errors were smaller when calculated using them than when calculated using the mean; Appendix B, Table B-4).

DISCUSSION

Main results

Birth certificates from the US in 2019 and 2020 indicated that newborns overall and in most groups defined by maternal and newborn characteristics had a median and mode GAB of 39 weeks; this was driven by the GAB in singletons (96.8% of live births with nonmissing GAB). Among singletons, live births (including live births in women of any age, who smoked or did not smoke during pregnancy, and with any BMI) had a median and mode GAB of 39 weeks; median or mode GAB lower than 39 weeks was observed in pregnancies with complications in the mother or the offspring. Multifetal pregnancies had lower GAB: for twins, overall and in several groups, the median was 36 weeks and the mode was 37 weeks; GAB was shorter for triplets. Among twins, additional groups with lower median or mode GAB were defined by number of prenatal care visits, maternal age, and smoking during pregnancy. We observed larger variability of GAB across groups in twins than in singletons.

How can these results be useful?

Data sources with valuable medication exposure information may lack some pregnancy-specific information; an example is US claims data sources, which are often used for perinatal pharmacoepidemiologic research. When individual-level clinical or obstetric estimates of duration of pregnancy are lacking, researchers often use a fixed number of weeks to estimate pregnancy duration and date of pregnancy start, to then ascertain the timing of drug or other exposures relative to the start of pregnancy. Our results can be used in several ways in this process. First, our results can be used to refine pregnancy-identifying or pregnancy-dating algorithms, allowing researchers to identify smaller groups for more individualized GAB estimation based on characteristics of each pregnancy or newborn that may be available in their data source. For example, a singleton pregnancy with known eclampsia could be assigned 37 weeks (mode and median, 2020), instead of 39 weeks (overall mode and median, 2020), thus reducing misclassification of exposure. Second, our results can be used to probabilistically impute 1 GAB value for each pregnancy as a random draw from the appropriate GAB distribution. For example, a singleton pregnancy in a 30-year-old woman would have a 16.8% probability of being imputed a GAB of 38 weeks, a 39.9% probability of 39 weeks (the mode), a 19.5% probability of 40 weeks, and so on (2020 data). The multiple imputation version of this process would further reflect the uncertainty around the duration of pregnancies in the face of missing information.

Researchers could draw several GAB values per pregnancy (producing, for example, 10 completed data sets), conduct all the downstream analyses for each completed data set, and finally combine the results. In addition, we provide current GAB distributions and information on which statistic can reduce errors in GAB estimates.
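
The random-draw idea can be sketched with Python's standard `random` module. This is an illustration under stated assumptions, not a full multiple imputation procedure: only the percentages for 38, 39, and 40 weeks come from the example above, the remaining probability mass is assigned to 37 and 41 weeks purely as a placeholder, and the function names are hypothetical. A real analysis would load the complete week-by-week distribution for the relevant group from Appendix B.

```python
import random

# ILLUSTRATIVE distribution (percent by completed week). Weeks 38-40 use the
# values quoted in the text; weeks 37 and 41 are placeholders (assumption)
# that absorb the remaining mass so the weights sum to 100.
GAB_PCT = {37: 12.0, 38: 16.8, 39: 39.9, 40: 19.5, 41: 11.8}


def impute_gab(dist_pct, rng):
    """Draw one GAB value with probability proportional to its percentage."""
    weeks = list(dist_pct)
    weights = list(dist_pct.values())
    return rng.choices(weeks, weights=weights, k=1)[0]


def multiply_impute(n_pregnancies, dist_pct, m=10, seed=0):
    """Return m completed data sets, each with one drawn GAB per pregnancy."""
    rng = random.Random(seed)  # fixed seed keeps the draws reproducible
    return [
        [impute_gab(dist_pct, rng) for _ in range(n_pregnancies)]
        for _ in range(m)
    ]


datasets = multiply_impute(n_pregnancies=5, dist_pct=GAB_PCT, m=10, seed=42)
print(len(datasets), "completed data sets of", len(datasets[0]), "pregnancies")
```

Each completed data set would then go through the downstream analysis separately, and the results would be combined afterward; that pooling step is outside the scope of this sketch.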

US birth files have 1 observation per liveborn infant: multifetal pregnancies are represented multiple times. Despite this, our results are applicable to studies whose unit of observation is pregnancies because we provide results for singletons and, separately, for twins, and because GAB distributions (percentages by gestational week, mean, median, and mode) are the same for newborns and for the corresponding pregnancies within each of those groups (assuming all twins are born alive).

Research groups have used the median [1,2,4] or the mode [1] GAB to estimate pregnancy duration; one may wonder whether the median, the mode, or even the mean GAB is most appropriate for estimations. We found that the median and the mode are the same for many groups. The mean squared distance, which penalizes large differences, always favored using the mean over the median or the mode. On the other hand, the mean absolute value of the distance always favored the median (and the mode, when they were the same). Researchers wanting to minimize the number of days (i.e., linear distance) between the imputed and the true value, based on our results, could use the median GAB; researchers wanting to minimize the squared distance could use the mean GAB.
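
That the mean always won on squared distance and the median on absolute distance is what standard results predict for any group, not only for these data. As a short worked derivation (the symbol c for an arbitrary fixed estimate and the overline for the group mean are notation introduced here), the squared error around c splits into the variance plus a penalty for moving away from the mean:

```latex
\frac{1}{n}\sum_{i=1}^{n}\left(\mathrm{GAB}_i - c\right)^2
  = \frac{1}{n}\sum_{i=1}^{n}\left(\mathrm{GAB}_i - \overline{\mathrm{GAB}}\right)^2
  + \left(\overline{\mathrm{GAB}} - c\right)^2

\frac{d}{dc}\,\frac{1}{n}\sum_{i=1}^{n}\left|\mathrm{GAB}_i - c\right|
  = \frac{1}{n}\Bigl(\#\{i : \mathrm{GAB}_i < c\} - \#\{i : \mathrm{GAB}_i > c\}\Bigr)
```

The first line is smallest when c equals the mean, because the cross term vanishes and the penalty term is zero only there. The second line (valid where c differs from every observed value) changes sign where as many observations lie below c as above it, that is, at the median.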

Our results highlight that most singleton groups had a median and mode GAB of 39 weeks, including groups determined by BMI and smoking, data that may not be available in the claims data sources often used for perinatal pharmacoepidemiology research. While some groups with lower GAB were small (e.g., live births from women with eclampsia were 0.3% of all births), they may be the target of interventions and research, as these groups often were pregnancies with maternal or newborn complications.

Generalizability

Our results are generalizable to subpopulations within the US and to populations elsewhere with similar healthcare practices; for example, in England, the mode GAB has been reported as 39 weeks (29.1% of live births with known duration of pregnancy), with 68% of deliveries taking place from 38 to 40 completed weeks in April 2020 through March 2021 [13] (compared with 38.8% of live births at 39 weeks and 73.9% born at 38 to 40 weeks in the US in 2020). In populations where healthcare practices differ considerably from the US, the distribution of GAB might vary. For example, 32% of pregnancies were estimated to result in cesarean section in North America in 2018 (also observed in the data that we used in 2019 and 2020; Appendix B, Table B-1), but cesarean sections comprised 43% of pregnancies in Latin America and the Caribbean and 16% in Southeast Asia [14]. Temporal changes in GAB in the US have been documented, with the most common duration of singleton livebirth pregnancies with spontaneous delivery shifting from 40 weeks in 1992 to 39 weeks in 2002 [15]; our analyses show that 39 weeks was still the mode (and also median) GAB among all live births and live births born via cesarean section in the US in 2020.

Missing data

The lack of information on pregnancy start date or GAB can be seen as a missing data problem. Using information from maternal and infant characteristics to estimate GAB (such as using group-specific GAB distributions), as we propose, makes the missing-at-random assumption more plausible [16]. Our proposal to use information obtained after delivery/birth to estimate GAB (e.g., using information on whether the mother or the newborn received intensive care) is appropriate because what happens downstream from an unobserved event contains information on the unobserved event (as wet cars in the street can be an indication of unobserved earlier rain) and is consistent with established multiple imputation approaches [17].

Strengths and limitations

For these analyses, we used a very large population-based data source that contains information on GAB and maternal and newborn characteristics. This allowed us to explore combinations of variables and still have a large number of observations within groups. Furthermore, US birth certificates have been found to be a valid source of information on duration of pregnancy or GAB [18-20] and have been used as the gold standard in validating claims-based algorithms estimating GAB [21-23]. However, they have been reported as not reliable for other data elements such as maternal weight [24], smoking during pregnancy [25], and other characteristics [26]. Birth weight, mode of delivery, and presence of some maternal chronic conditions have also been found to be reliable [18,26], and linkage to birth certificates has been advocated for research on drug safety in pregnancy in healthcare databases [27]. Another strength of our study is that our results might be used to mitigate misclassification of GAB among shorter pregnancies, a known limitation of some previous research [21,28-30].

Limitations of this study include the aforementioned birth certificate shortcomings and the fact that US birth files include only live births. Despite the large size of the data source, the number of triplets and higher order multifetal pregnancies was small, and we did not explore them separately. For similar practical reasons, we explored only 2-variable combinations. In the original data source, GAB is presented in completed weeks; finer granularity is not provided. Some characteristics that identify groups with lower GAB, such as birth weight, may not be available in some data sources.

CONCLUSIONS

Most singleton live births, including live births in women of any age, who smoked or did not smoke during pregnancy, and with any BMI, had a median and mode GAB of 39 weeks. Some live birth groups had distinct GAB distributions; these groups can be identified from characteristics recorded in many existing data sources used for observational epidemiologic research. The GAB distributions provided here can be useful in estimating GAB when clinical estimates are not available in those data sources.

Data Availability

We used US birth data files of the Centers for Disease Control and Prevention (CDC) for years 2019 (the most recent year before the COVID-19 pandemic) and 2020 (the most recent available data). These files are publicly available for download. References with links: CDC. US Centers for Disease Control and Prevention. Vital statistics online data portal: downloadable data files. 13 May 2022. https://www.cdc.gov/nchs/data_access/vitalstatsonline.htm. Accessed 15 May 2022. CDC. US Centers for Disease Control and Prevention. Guide to completing the facility worksheets for the certificate of live birth and report of fetal death (2003 revision, update September 2019). September 2019. https://www.cdc.gov/nchs/data/dvs/GuidetoCompleteFacilityWks.pdf. Accessed 23 December 2021.


ACKNOWLEDGEMENTS

Editorial services were provided by Adele Monroe, ELS, and graphic art services were provided by Bethan Pickering, both employees of RTI Health Solutions. We would like to thank Abenah Harding, also an employee of RTI Health Solutions, for her help preparing this manuscript. Development of this manuscript was supported financially by RTI Health Solutions.

Appendix

Appendix A. Additional information

This is supplemental information for the paper

Checklist for reporting in perinatal pharmacoepidemiology

Table A-1. Checklist for reporting in perinatal pharmacoepidemiology

Study variables

Table A-2. Study variables

Methods to explore whether the median, mode, or mean would result in a smaller error in estimating gestational age at birth

Analyses in this study were aimed at informing the estimation of gestational age at birth or duration of pregnancy in other data sources. One way of doing this is, using an example, to assign all singletons born small for gestational age whose gestational age is unknown the median gestational age observed in 2019 in the present study. To support recommendations on which summary statistic should be used (i.e., median, mode, or mean), we calculated two values:

  1. The mean squared error for subgroups. The mean squared error using the median was calculated as follows:

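    $$\mathrm{MSE}_{\mathrm{median}} = \frac{1}{n}\sum_{i=1}^{n}\left(\mathrm{GAB}_i - \mathrm{median}(\mathrm{GAB})\right)^2$$ for every observation i in the group of size n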

    In words, following the same example: the mean squared error using the median was the gestational age at birth for live birth i (singleton born small for gestational age) minus the median gestational age at birth among singletons born small for gestational age, squared, averaged across all singletons born small for gestational age.

    This mean squared error was calculated for the median, mode, and mean. A smaller mean squared error reflects a better estimation.

  2. The mean absolute value of the error for selected subgroups:

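    $$\mathrm{MAE}_{\mathrm{median}} = \frac{1}{n}\sum_{i=1}^{n}\left|\mathrm{GAB}_i - \mathrm{median}(\mathrm{GAB})\right|$$ for every observation i in the group of size n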

    In words, following the same example: the mean absolute error using the median was the absolute value of gestational age at birth for live birth i (singleton born small for gestational age) minus the median gestational age at birth among singletons born small for gestational age, averaged across all singletons born small for gestational age.

    This statistic was calculated for the median, mode, and mean. A smaller value reflects a better estimation.

Additional figures

Figure A-1. Distribution of live births by gestational age at birth by plurality and birth weight, USA 2020
Figure A-2. Distribution of live births by gestational age at birth in newborns not discharged alive, USA 2020

Footnotes

  • amargulis{at}rti.org;

  • bcalingaert{at}rti.org;

  • akawai{at}rti.org;

  • erivero{at}rti.org;

  • manthony{at}rti.org;

  • Previous Presentations: Part of this work has been presented as a poster at the 2022 International Conference on Pharmacoepidemiology & Therapeutic Risk Management (24-28 August 2022; Copenhagen, Denmark).

  • Funding Statement: This work was supported by RTI Health Solutions.

  • Conflict(s) of Interest: The authors have no conflict of interest for this publication.

  • Ethics Review Statement: The RTI International institutional review board reviewed the protocol and determined that the study did not constitute research involving human subjects (RTI IRB STUDY00021950).

REFERENCES

  1. 1.↵
    Bertoia ML, Phiri K, Clifford CR, Doherty M, Zhou L, Wang LT, et al. Identification of pregnancies and infants within a US commercial healthcare administrative claims database. Pharmacoepidemiol Drug Saf. 2022 Aug;31(8):863–74. doi:http://dx.doi.org/10.1002/pds.5483.
    OpenUrl
  2. 2.↵
    Hornbrook MC, Whitlock EP, Berg CJ, Callaghan WM, Bachman DJ, Gold R, et al. Development of an algorithm to identify pregnancy episodes in an integrated health care delivery system. Health Serv Res. 2007 Apr;42(2):908–27. doi:http://dx.doi.org/10.1111/j.1475-6773.2006.00635.x.
    OpenUrlCrossRefPubMedWeb of Science
  3. 3.↵
    Matcho A, Ryan P, Fife D, Gifkins D, Knoll C, Friedman A. Inferring pregnancy episodes and outcomes within a network of observational databases. PLoS One. 2018;13(2):e0192033. doi:http://dx.doi.org/10.1371/journal.pone.0192033.
    OpenUrlCrossRef
  4. 4.↵
    Margulis AV, Setoguchi S, Mittleman MA, Glynn RJ, Dormuth CR, Hernández-Díaz S. Algorithms to estimate the beginning of pregnancy in administrative databases. Pharmacoepidemiol Drug Saf. 2013 Jan;22(1):16–24. doi:http://dx.doi.org/10.1002/pds.3284.
    OpenUrlCrossRefPubMed
  5. 5.↵
    Minassian C, Williams R, Meeraus WH, Smeeth L, Campbell OMR, Thomas SL. Methods to generate and validate a Pregnancy Register in the UK Clinical Practice Research Datalink primary care database. Pharmacoepidemiol Drug Saf. 2019 Jul;28(7):923–33. doi:http://dx.doi.org/10.1002/pds.4811.
    OpenUrlCrossRefPubMed
  6. 6.↵
    Margulis AV, Kawai AT, Anthony MS, Rivero-Ferrer E. Perinatal pharmacoepidemiology: how often are key methodological elements reported in publications? Pharmacoepidemiol Drug Saf. 2022 Jan;31(1):61–71. doi:http://dx.doi.org/10.1002/pds.5353.
    OpenUrl
  7. 7.↵
    CDC. US Centers for Disease Control and Prevention. Vital statistics online data portal: downloadable data files. 13 May 2022. https://www.cdc.gov/nchs/data_access/vitalstatsonline.htm. accessed 15 May 2022.
  8. 8.↵
    CDC. US Centers for Disease Control and Prevention. Guide to completing the facility worksheets for the certificate of live birth and report of fetal death (2003 revision, update September 2019). September 2019. https://www.cdc.gov/nchs/data/dvs/GuidetoCompleteFacilityWks.pdf. accessed 23 December 2021.
  9. 9.↵
    CDC. US Centers for Disease Control and Prevention. User’s guide: birth data files, 2019. 2020. https://www.cdc.gov/nchs/data_access/vitalstatsonline.htm, https://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/DVS/natality/UserGuide2019-508.pdf Accessed 23 December 2021.
  10. 10.↵
    CDC. US Centers for Disease Control and Prevention. User’s guide: birth data files, 2020. 2021. https://www.cdc.gov/nchs/data_access/vitalstatsonline.htm, https://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/DVS/natality/UserGuide2020.pdf Accessed 23 December 2021.
  11. 11.↵
    Martin JA, Hamilton BE, Osterman MJK, Driscoll AK. Births: final data for 2019. Natl Vital Stat Rep. 2021 Apr;70(2):1–51.
    OpenUrlPubMed
  12. 12.↵
    Osterman M, Hamilton B, Martin JA, Driscoll AK, Valenzuela CP. Births: final data for 2020. Natl Vital Stat Rep. 2021 Feb;70(17):1–50.
    OpenUrlPubMed
  13. 13.↵
    NHS Digital. NHS maternity statistics, England - 2020-21: HES NHS maternity statistics tables. 25 Nov 2021. https://digital.nhs.uk/data-and-information/publications/statistical/nhs-maternity-statistics/2020-21 [summary report]; https://files.digital.nhs.uk/AE/46F775/hosp-epis-stat-mat-hesnational-2020-21.xlsx [tables]. Accessed 28 May 2022.
  14.
    Betran AP, Ye J, Moller AB, Souza JP, Zhang J. Trends and projections of caesarean section rates: global and regional estimates. BMJ Glob Health. 2021 Jun;6(6). doi:http://dx.doi.org/10.1136/bmjgh-2021-005671.
  15.
    Davidoff MJ, Dias T, Damus K, Russell R, Bettegowda VR, Dolan S, et al. Changes in the gestational age distribution among U.S. singleton births: impact on rates of late preterm birth, 1992 to 2002. Semin Perinatol. 2006 Feb;30(1):8–15. doi:http://dx.doi.org/10.1053/j.semperi.2006.01.009.
  16.
    Sterne JA, White IR, Carlin JB, Spratt M, Royston P, Kenward MG, et al. Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. BMJ. 2009 Jun 29;338:b2393. doi:http://dx.doi.org/10.1136/bmj.b2393.
  17.
    Moons KG, Donders RA, Stijnen T, Harrell FE Jr. Using the outcome for imputation of missing predictor values was preferred. J Clin Epidemiol. 2006 Oct;59(10):1092–101. doi:http://dx.doi.org/10.1016/j.jclinepi.2006.01.009.
  18.
    Ziogas C, Hillyer J, Saftlas AF, Spracklen CN. Validation of birth certificate and maternal recall of events in labor and delivery with medical records in the Iowa health in pregnancy study. BMC Pregnancy Childbirth. 2022 Mar 22;22(1):232. doi:http://dx.doi.org/10.1186/s12884-022-04581-7.
  19.
    Dietz PM, Bombard JM, Hutchings YL, Gauthier JP, Gambatese MA, Ko JY, et al. Validation of obstetric estimate of gestational age on US birth certificates. Am J Obstet Gynecol. 2014 Apr;210(4):335.e1-.e5. doi:http://dx.doi.org/10.1016/j.ajog.2013.10.875.
  20.
    Andrade SE, Scott PE, Davis RL, Li DK, Getahun D, Cheetham TC, et al. Validity of health plan and birth certificate data for pregnancy research. Pharmacoepidemiol Drug Saf. 2013 Jan;22(1):7–15. doi:http://dx.doi.org/10.1002/pds.3319.
  21.
    Zhu Y, Hampp C, Wang X, Albogami Y, Wei YJ, Brumback BA, et al. Validation of algorithms to estimate gestational age at birth in the Medicaid Analytic eXtract-Quantifying the misclassification of maternal drug exposure during pregnancy. Pharmacoepidemiol Drug Saf. 2020 Nov;29(11):1414–22. doi:http://dx.doi.org/10.1002/pds.5126.
  22.
    Li Q, Jenkins DD, Kinsman SL. Birth settings and the validation of neonatal seizures recorded in birth certificates compared to Medicaid claims and hospital discharge abstracts among live births in South Carolina, 1996-2013. Matern Child Health J. 2017 May;21(5):1047–54. doi:http://dx.doi.org/10.1007/s10995-016-2200-0.
  23.
    Eworuke E, Hampp C, Saidi A, Winterstein AG. An algorithm to identify preterm infants in administrative claims data. Pharmacoepidemiol Drug Saf. 2012 Jun;21(6):640–50. doi:http://dx.doi.org/10.1002/pds.3264.
  24.
    Bodnar LM, Abrams B, Bertolet M, Gernand AD, Parisi SM, Himes KP, et al. Validity of birth certificate-derived maternal weight data. Paediatr Perinat Epidemiol. 2014 May;28(3):203–12. doi:http://dx.doi.org/10.1111/ppe.12120.
  25.
    Land TG, Landau AS, Manning SE, Purtill JK, Pickett K, Wakschlag L, et al. Who underreports smoking on birth records: a Monte Carlo predictive model with validation. PLoS One. 2012;7(4):e34853. doi:http://dx.doi.org/10.1371/journal.pone.0034853.
  26.
    Josberger RE, Wu M, Nichols EL. Birth certificate validity and the impact on primary cesarean section quality measure in New York state. J Community Health. 2019 Apr;44(2):222–9. doi:http://dx.doi.org/10.1007/s10900-018-0577-y.
  27.
    Huybrechts KF, Bateman BT, Hernandez-Diaz S. Use of real-world evidence from healthcare utilization data to evaluate drug safety during pregnancy. Pharmacoepidemiol Drug Saf. 2019 Jul;28(7):906–22. doi:http://dx.doi.org/10.1002/pds.4789.
  28.
    Li Q, Andrade SE, Cooper WO, Davis RL, Dublin S, Hammad TA, et al. Validation of an algorithm to estimate gestational age in electronic health plan databases. Pharmacoepidemiol Drug Saf. 2013 May;22(5):524–32. doi:http://dx.doi.org/10.1002/pds.3407.
  29.
    Toh S, Mitchell AA, Werler MM, Hernandez-Diaz S. Sensitivity and specificity of computerized algorithms to classify gestational periods in the absence of information on date of conception. Am J Epidemiol. 2008 Mar 15;167(6):633–40. doi:http://dx.doi.org/10.1093/aje/kwm367.
  30.
    Margulis AV, Palmsten K, Andrade SE, Charlton RA, Hardy JR, Cooper WO, et al. Beginning and duration of pregnancy in automated health care databases: review of estimation methods and validation results. Pharmacoepidemiol Drug Saf. 2015 Apr;24(4):335–42. doi:http://dx.doi.org/10.1002/pds.3743.

References

  1.
    CDC. US Centers for Disease Control and Prevention. Defining adult overweight & obesity. 7 June 2021. https://www.cdc.gov/obesity/adult/defining.html. Accessed 23 December 2021.
  2.
    CDC. US Centers for Disease Control and Prevention. User’s guide: birth data files, 2019. 2020. https://www.cdc.gov/nchs/data_access/vitalstatsonline.htm, https://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/DVS/natality/UserGuide2019-508.pdf. Accessed 23 December 2021.
  3.
    CDC. US Centers for Disease Control and Prevention. User’s guide: birth data files, 2020. 2021. https://www.cdc.gov/nchs/data_access/vitalstatsonline.htm, https://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/DVS/natality/UserGuide2020.pdf. Accessed 23 December 2021.
Posted October 21, 2022.
Subject Area

  • Epidemiology