ABSTRACT
To characterize Coronavirus Disease 2019 (COVID-19) transmission dynamics in each of the metropolitan statistical areas (MSAs) surrounding Dallas, Houston, New York City, and Phoenix in 2020 and 2021, we extended a previously reported compartmental model accounting for effects of multiple distinct periods of non-pharmaceutical interventions by adding consideration of vaccination and Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) variants Alpha (lineage B.1.1.7) and Delta (lineage B.1.617.2). For each MSA, we found region-specific parameterizations of the model using daily reports of new COVID-19 cases available from January 21, 2020 to October 31, 2021. In the process, we obtained estimates of the relative infectiousness of Alpha and Delta as well as their takeoff times in each MSA (the times at which sustained transmission began). The estimated infectiousness of Alpha ranged from 1.1x to 1.4x that of viral strains circulating in 2020 and early 2021. The estimated relative infectiousness of Delta was higher in all cases, ranging from 1.6x to 2.1x. The estimated Alpha takeoff times ranged from February 1 to February 28, 2021. The estimated Delta takeoff times ranged from June 2 to June 26, 2021. Estimated takeoff times are consistent with genomic surveillance data.
One-Sentence Summary Using a compartmental model parameterized to reproduce available reports of new Coronavirus Disease 2019 (COVID-19) cases, we quantified the impacts of vaccination and Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) variants Alpha (lineage B.1.1.7) and Delta (lineage B.1.617.2) on regional epidemics in the metropolitan statistical areas (MSAs) surrounding Dallas, Houston, New York City, and Phoenix.
INTRODUCTION
In 2020, Coronavirus Disease 2019 (COVID-19) transmission dynamics were significantly influenced by non-pharmaceutical interventions [1–7]. In 2021, other factors arose with significant impacts on disease transmission, namely, vaccination [8–9] and emergence of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) variants [10–11].
Mass vaccination in the United States (US) began on December 14, 2020 [12], with demonstrable reduction of disease burden within vaccinated populations [13]. As the vaccination campaign progressed into March 2021, there was widespread reduction in disease incidence [14] and relaxation of state-mandated non-pharmaceutical interventions [15].
In early 2021, SARS-CoV-2 variant Alpha (lineage B.1.1.7) spread across the US and became the dominant circulating strain [16]. By the end of July 2021, the Delta variant (lineage B.1.617.2) had supplanted Alpha [17], concomitant with increases in new COVID-19 case detection [14]. Both Alpha and Delta have been estimated to be more transmissible than strains circulating earlier [18–23], and it was determined that vaccinated persons infected with Alpha and Delta were capable of transmitting disease [24, 25].
The Alpha variant was first detected in Kent, England in September 2020 [26], and was declared a variant of concern (VOC) on December 18, 2020 [26]. The Delta variant was first detected in Maharashtra, India in October 2020 [27] and was declared a VOC on May 6, 2021 [26]. In the literature, previous modeling works have investigated the relative transmissibility and timing of SARS-CoV-2 variants Alpha, Delta, and Omicron in multiple regions/countries, including England, Greece, Iran, China, and the US [28–36]. Modeling approaches applied in these studies involved deterministic [32–34] and stochastic [28, 30] compartmental models of COVID-19 transmission. Other approaches involved statistical models and various forms of regression, e.g., multinomial logistic regression [29] and multivariable binary hyperbolastic regression [36]. Bayesian methods, including the sequential Bayesian method [32], a Bayesian evidence synthesis framework [30], a Bayesian approach to estimate the effective reproduction number Rt [31] and Markov Chain Monte Carlo (MCMC) sampling [28, 33], were predominantly used. The data considered in these studies included case data [28, 30–35] viral load data [28], seroprevalence data [30], information on deaths and hospital admissions [33], age-specific vaccine coverage data [31], sequencing data [29], and biospecimen data [35].
In earlier work, we demonstrated that new COVID-19 case detection over various periods in 2020 can be faithfully reproduced for 280 (out of the 384) metropolitan statistical areas (MSAs) in the US and all 50 states by region-specific parameterizations of a compartmental model that accounts for time-varying non-pharmaceutical interventions [5–7]. We found that the multiple surges in disease incidence seen in 2020 [14] could be explained by changes in protective disease-avoiding behaviors, which we will refer to collectively as social-distancing.
However, in 2021, the model lost its ability to capture disease transmission dynamics, presumably because of the impacts of vaccination and the emergence of more transmissible SARS-CoV-2 variants, namely, Alpha and Delta. Here, to quantify the impacts of vaccination and SARS-CoV-2 variants Alpha and Delta on COVID-19 transmission dynamics, we extended the model of Lin et al. [5] by adding consideration of vaccination and variants with increased transmissibility. We then found region-specific parameterizations of the model using vaccination and surveillance case data available for the MSAs surrounding Dallas, Houston, New York City, and Phoenix.
METHODS
Data
Daily reports of new confirmed COVID-19 cases were obtained from the GitHub repository maintained by The New York Times newspaper [37]. Daily reports of newly completed vaccinations were obtained from the Covid Act Now database for the MSAs surrounding New York City and Phoenix [38]. Because of reporting gaps in the Covid Act Now database, we used a different source of vaccination data for the MSAs surrounding Dallas and Houston, the Democrat and Chronicle newspaper [39]. County-level surveillance and vaccination data were aggregated to obtain daily case and vaccination counts for the MSAs surrounding Dallas, Houston, New York City, and Phoenix. In the case of a missing daily report, we imputed the missing information as described in the Appendix.
Compartmental Model for Disease Transmission Dynamics
We used the compartmental model illustrated in Figure 1 (and Appendix Figure 1) to analyze data available for each MSA of interest. The model consists of ordinary differential equations (ODEs) describing the dynamics of 40 populations (state variables) (Appendix Equations 1–38). The state variables are each defined in Appendix Table 1. Model parameters are defined in Tables 1–3. Key features of the model are described below, and a full description of the model is provided in the Appendix.
Illustration of compartmental model. The independent variable of the model is time t. The 40 dependent state variables of the model are populations, which are represented as boxes with rounded corners. A description of each state variable is given in Appendix Table 1. The 15 highlighted boxes (on the blue background) represent state variables introduced to capture the effects of vaccination and the Alpha and Delta variants. The other 25 boxes represent state variables considered in the model of Lin et al. [5]. Arrows connecting boxes represent transitions. Each transition represents the movement of persons from one population to another. The arrows highlighted in orange represent transitions introduced to capture the effects of vaccination and the Alpha and Delta variants. Other arrows represent transitions considered in the model of Lin et al. [5]. Each arrow is associated with one or more parameters that characterize a rate of movement; these parameters are not shown here but are shown in Appendix Figure 1. A full description of the model is given in the Appendix.
Time t = 0 corresponds to midnight on January 21, 2020. Inferences were conditioned on the compartmental model of Appendix Equations 1–38, consideration of two viral variants (Alpha and Delta, m = 2), and n + 1 distinct social-distancing periods in total (n = 4 for Dallas and Phoenix; n = 3 for New York City and Houston), the fixed parameter estimates of Table 3, and the initial condition I0 and S0 at time t = t0 given in Table 3. The choice of two variants and the setting for n were chosen through a model-selection procedure described in the Appendix. With m = 2 and n = 3, there are 18 adjustable model parameters: t0, β, θ1, y1, θ), y2, σ, p0, λ0, τ1, p1, λ1, τ), p), λ), τ*, p*, and λ*. With m = 2 and n = 4, there are 21 adjustable model parameters: t0, β, θ1, y1, θ), y), σ, p0, λ0, τ1, p1, λ1, τ), p), λ), τ*, p*, λ*, τ(, p(, and λ(. These parameters were jointly inferred together with fD, the parameter of the measurement model (i.e., the fraction of new cases detected and reported) (Appendix Equation 40), and τ, the dispersion parameter of the statistical model for noise in case detection and reporting (i.e., the adjustable parameter of the negative binomial likelihood function) (Appendix Equations 41–43). We assumed a uniform proper prior, as described in the Appendix.
We extended the model of Lin et al. [5] by including 15 new populations and 28 new transitions. Briefly, new parts of the model can be described as follows. Vaccination is modeled by moving susceptible persons (in the SM and SPpopulations) into the V1 population. Another consequence of vaccination is the movement of recovered unvaccinated persons (in the RU population) into the RV population. The rate of vaccination is changed as needed to match the empirical daily rate of vaccination. Recovered and susceptible persons have the same per capita probability of vaccination. Persons in SM are mixing (i.e., not practicing social-distancing) and persons in SP are practicing social-distancing (and thereby protected from infection to a degree). The series of transitions involving the populations V1, …, V& was introduced to model the immune response to vaccination (i.e., the amount of time required for vaccination to induce neutralizing antibodies). With this approach, the time from vaccination to appearance of neutralizing antibodies is a random variable characterized by an Erlang distribution. Persons in V1, …, V& may be infected. Persons in V& transition to one of the following four populations: SV,1, …, SV,(. These populations represent persons with varying degrees of immune protection. Persons in SV,1 are not protected against productive infection (i.e., an infection that can be transmitted to others) by any viral strain. Persons in SV,) are protected against productive infection by viral strains present before the emergence of Alpha but not Alpha or Delta. Persons in SV,* are protected against productive infection by viral strains present before the emergence of Alpha and also Alpha but not Delta. Persons in SV,( are protected against productive infection by all of the viral strains considered up to October 31, 2021. Vaccinated persons who become infected move into EV. The time spent in EV corresponds to the length of the incubation period for vaccinated persons. The mean duration of the incubation period is taken to be the same for vaccinated and unvaccinated persons; however, as a simplification, for vaccinated persons, the time spent in the incubation period is taken to consist of a single stage and consequently is exponentially distributed (vs. Erlang distributed for unvaccinated persons). Non-quarantined exposed persons in populations EV and E+,M and E+,P for i = 2, …, 5 are taken to be infectious. Persons exiting EV leave the incubation period and enter the immune clearance phase of infection, during which they may be asymptomatic (AV) or symptomatic with mild disease (IV). Non-quarantined asymptomatic persons in populations AV, AM, and AP are taken to be infectious. Persons in AV are assumed to eventually recover (i.e., to enter RV). Persons with mild symptomatic disease may recover (i.e., enter RV) or experience severe disease, at which point they move to HV (in hospital or isolated at home). Vaccinated persons have a diminished probability of severe disease in comparison to unvaccinated persons. Persons in HV either recover (move to RV) or die from COVID-19 complications (move to D). For a person with severe disease, the probability of death is independent of vaccination status. We assume that vaccinated persons do not participate in social-distancing, quarantine, or self-isolation driven by symptom awareness.
The new compartments and transitions, which are highlighted in Figure 1, capture vaccination among susceptible persons, recovered persons and infected non-quarantined persons without symptoms at a time-varying per capita rate μ(t). The value of μ(t) changes daily for consistency with MSA-specific daily reports of completed vaccinations (Appendix Equation 37) from the COVID Act Now database and the Democrat and Chronicle COVID-19 vaccine tracker. The model also captures immune responses to vaccination yielding varying degrees of protection and consequences of breakthrough infection of vaccinated persons. Vaccine protection against transmissible infection was taken to be variant-dependent.
We introduced a dimensionless step function, denoted Yθ(t) (Appendix Equation 38), which multiplies the disease-transmission rate constant β to account for m variants. In this study, m = 2 (see Appendix Equations 1–4, 18–22, and 24). Thus, in the new model, the quantity Yθ(t)β (vs. β alone) characterizes disease transmissibility at time t. The step function Yθ(t) was initially assigned a value of y0 = 1, and the value of Y1(t) was allowed to increase at times t = θ1 and t = θ) (Appendix Equation 38). The disease transmission rate constant of the Alpha variant was considered by introducing a step increase from y0β ≡ 1β to y1β (with y1 > 1) at time t = θ1 (the Alpha takeoff time). Similarly, the disease transmission rate constant of the Delta variant was considered by introducing a step increase from y1β to y)β (with y) > y1) at time t = θ) > θ1 (the Delta takeoff time). We will refer to y1 and y) as the Alpha and Delta transmissibility factors, respectively.
As in the original model of Lin et al. [5], the extended model accounts for a series of n + 1 distinct social-distancing periods (an initial period and n additional periods). Social-distancing periods are characterized by two step functions: Pτ(t) and Λτ(t). The values of these functions change coordinately at a set of switch times τ = (σ, τ1, …, τn) (Appendix Equations 35 and 36), where σ is the start time of the initial social-distancing period and τ+ is the start time of the ith social-distancing period after the initial social-distancing period. The values of Pτ(t) and Λτ(t) are zero before time t = σ. The value of Pτ(t) defines the steady-state setpoint fraction of the susceptible population practicing social-distancing at time t, and the value of Λτ(t) defines a time scale for establishment of the steady state. The value of Λτ(t) is an eigenvalue equal to a sum of social-distancing rate constants [5]. The non-zero values of Pτ(t) and Λτ(t) are denoted p0, …, pn and λ1, …, λn. We assume that vaccinated persons do not practice social-distancing. Recall that we use the term “social-distancing” to refer to behaviors adopted to protect against infection. These behaviors are assumed to reduce the risk of infection by a factor mb.
Parameters
As indicated in Tables 1 and 2, we used MSA-specific case reporting data available up to October 31, 2021 to infer MSA-specific values for parameters characterizing the start time of the local epidemic (t0), local disease transmissibility of ancestral viral strains (β), local social-distancing dynamics (σ, λ1, p1, τ+, λ+ and p+, for i = 1, …, n), local emergence of variants (θ1, y1, θ), y)), the local rate of new case detection (fD), and noise in local case detection and reporting (τ). Values for other parameters were fixed (Table 3); inferences are conditioned on these fixed parameter estimates. There are 18 fixed parameters taken to be applicable for all MSAs. The total regional population S2, which is taken to be fixed, was set on the basis of census data. The real-time per capita vaccination rate μ(t), a piecewise linear function, was set for consistency with the current empirical per capita rate of vaccination [38]. We adopted the fixed parameter estimates of Lin et al. [5]. New fixed parameter estimates made in this study for m2, f0, f1, f2, and kV are explained in the Appendix. The m2 parameter characterizes vaccine protection against severe disease, the f0, f1, and f) parameters account for differential vaccine effectiveness against the three viral strains considered in this study (ancestral, Alpha, and Delta), and the kV parameter characterizes the waiting time between vaccination and the acquisition of vaccine-induced immunity. Our model does not account for gradual loss of immunity over time. We let μG(t) = μ+ for times t throughout the ith day after January 21, 2020 (Appendix Equation 37), where μ+ is the fraction of the local population reported to complete vaccination over the 1-d surveillance period [38]. We then defined μ(t) as the piecewise linear interpolant to μG(t). In summary, for a given inference, the number of adjustable parameters was 2m + 3n + 4, where m is the number of variants under consideration (m = 2 in this study) and n is the number of distinct social-distancing periods being considered beyond an initial social-distancing period. The setting for n was determined through a model-selection procedure described in the Appendix.
Auxiliary Measurement Model
We assumed that state variables of the compartmental model (Figure 1, Appendix Table 1) are related to the expected number of new cases reported on a given calendar date through an auxiliary measurement model (Appendix Equations 39 and 40). The measurement model has one parameter: fD, the region-specific fraction of new symptomatic infections detected. Thus, fD ∈ [0, 1]. As a simplification, we considered fD to be time-invariant. This simplification means that we assumed, for example, that case detection was neither limited nor strongly influenced by testing capacity, which varied over time. This assumption is reasonable if, for example, case detection is mainly determined by presentation for testing and, moreover, the motivations and societal factors that influence presentation remained roughly constant over the period of interest. One can also interpret fD as the time-averaged case detection rate. The measurement-model parameter fDwas inferred jointly with the adjustable model parameters and the likelihood parameter τ (see below).
Statistical Model for Noise in Case Detection and Reporting
We assumed that noise in new case detection and reporting on the ith day after January 21, 2020 is captured by a negative binomial distribution NB(τ, qi) centered on I(t+, t+31), the expected number of new cases detected over the ith day after January 21, 2020 as given by the compartmental model and the auxiliary measurement model (Appendix Equations 1–40). These and other assumptions led to the likelihood function used in inference (Appendix Equations 41– 43). We determined the probability parameter q+ in NB(τ, q+) using Appendix Equation 43; the dispersion parameter τ was taken to be a time-invariant adjustable parameter applicable for all days of case reporting. The likelihood parameter τ was inferred jointly with the adjustable model parameters and the adjustable measurement-model parameter fD.
Computational Procedures
We determined the intervals of the step functions Yθ(t), Pτ(t), and Λτ(t) (i.e., θ and τ) using a model-selection procedure described in the Appendix. Simulations and Bayesian inferences were performed as previously described [5–7] and in the Appendix. Files needed to reproduce inferences using the software package PyBioNetFit [40] are available online [41]. The files include case data, vaccination data, and diagnostic plots related to Bayesian inference using Markov chain Monte Carlo (MCMC) sampling, including trace plots [41]. Summary diagnostics characterizing the sampling for each MSA are given in Table 4. Briefly, we computed the stable Gelman-Rubin statistic using the methodology of Vats and Knudson [42]. An advantage of this statistic (over the original Gelman-Rubin statistic) is that only one Markov chain is required in sampling. In addition, the stable Gelman-Rubin statistic and effective sample size (ESS) have a one-to-one relationship.
RESULTS
Explanatory power of the model
As illustrated in Figures 2A, 3A, 4A, and 5A for the Dallas, Houston, New York City, and Phoenix MSAs respectively, the new model is able to explain surveillance case data over the period starting on January 21, 2020 and ending on October 31, 2021. The surveillance case data—daily reports of newly detected COVID-19 cases—for each of the 4 MSAs of interest largely lie within the 95% credible interval of the posterior predictive distribution for new case detection, which indicates that each regional model has explanatory power for the period of interest. Parameter estimates are summarized in Tables 1–3. Table 1 provides region-specific maximum a posteriori (MAP) estimates obtained through Bayesian inference enabled by Markov chain Monte Carlo (MCMC) sampling. Table 4 provides metrics that measure the quality of MCMC sampling. The diagnostic results of Table 4 indicate that sampling was acceptable, because all multivariate potential scale reduction factors (PSRFs) are very close to 1 and all multivariate effective sample sizes (ESSs) are at least 100. Diagnostic plots are provided online [41].
Inference results obtained for the MSA surrounding Dallas using regional surveillance data—daily reports of new COVID-19 cases—available for January 21, 2020 to October 31, 2021. A) Credible intervals of the time-dependent posterior predictive distribution for detected and reported new cases are shown. The monochrome bands from innermost to outermost indicate the 20%, 50%, and 95% credible intervals. This colored region all together indicates the 95% credible interval and can be expected to cover approximately 95% of the data. Empirical case reports are indicated by black symbols. Alpha prevalence is indicated with a shaded light-yellow background. Delta prevalence is indicated with a shaded light-green background. B) The social-distancing stationary setpoint is given by Pτ(t) and the variant transmissibility factor is given by Yθ(t). Note that the values of Pτ(t) and Yθ(t) are dimensionless. Credible intervals corresponding to 1000 samples from the time-dependent posterior predictive distributions are shown for Pτ(t) and Yθ(t). The curve corresponding to Yθ(t) is monotonically increasing with an initial value of 1. The curve corresponding to Pτ(t) is decreasing from left to right after the start of social-distancing. C) Inferred changes in the distribution of persons amongst five selected subpopulations over the course of the local COVID-19 epidemic. The five populations sum to a constant, S0, the total population. Results shown here are based on the parameter values given in Tables 1 and 3. The five populations are defined as follows: the population of susceptible unvaccinated persons (blue area) is given by SM + SP, the population of susceptible vaccinated persons (green area) is given by ∑& V+ + SV,1 + Uθ (t)SV,) + Uθ (t)SV,*, the population of actively infected persons (orange area) is given by HU + HV + EV + Σ5i=1(Ei,m + Ei,p + Ei,Q + Σx∈{M,P,Q,V|(Ax + Ix), the population of removed unvaccinated persons (gray area) is given by RU + D, and the population of removed vaccinated persons (yellow area) is given by RV + (1 − U1(t))SV,) + (1 − U2(t))SV,* + SV,(. Except for U1(t) and U2(t), the terms in the above definitions refer to state variables of the compartmental model of Figure 1 and Appendix Figure 1, which are defined in Appendix Table 1. U1(t) and U2(t) are unit step functions (Appendix Equation 33 and 34), which change value from 0 to 1 at time t = θ1 and t = θ), respectively. Recall that θ1 and θ) are the Alpha and Delta takeoff times. The sum ∑+(μ+ × 1 d) (purple dots), which is the empirical cumulative number of completed vaccinations, is shown as a function of time t.
Quantification of the impacts of non-pharmaceutical interventions and emergence of variants
Each regional model (parameterized to reproduce MSA-specific case reports) provides insight into the impacts of social-distancing behaviors and the emergence of the Alpha and Delta variants; compare panel A and the corresponding panel B in Figures 2–5. For example, as can be seen in Figure 4A, the New York City MSA experienced four notable surges in disease incidence over the period of interest. Figure 4B suggests that the first surge ended because of adoption of social-distancing behaviors, the second surge occurred because of relaxation of social-distancing behaviors, the third surge was caused by Alpha, and the fourth surge was caused by Delta.
Inference results obtained for the MSA surrounding Houston using regional surveillance data—daily reports of new COVID-19 cases—available for January 21, 2020 to October 31, 2021. It should be noted that four anomalous (negative) empirical case counts are not shown in the plot. A case count of 14,300 cases on September 21, 2020 is not shown in the plot. See the caption of Figure 2 for additional information.
Inference results obtained for the MSA surrounding New York City using regional surveillance data—daily reports of new COVID-19 cases—available for January 21, 2020 to October 31, 2021. It should be noted that a single anomalous (negative) empirical case count is not shown in the plot. See the caption of Figure 2 for additional information.
Inference results obtained for the MSA surrounding Phoenix using regional surveillance data—daily reports of new COVID-19 cases—available for January 21, 2020 to October 31, 2021. It should be noted that two anomalous (negative) empirical case counts are not shown in the plot. See the caption of Figure 2 for additional information.
Interestingly, in other MSAs, Alpha had relatively little impact on disease incidence (compare Figures 4A and 4B to Figures 2A and 2B, 3A and 3B, and 5A and 5B). These differences are attributable to various factors, including the lower contagiousness of COVID-19 in Dallas, Houston, and Phoenix relative to New York City (see entries for β in Table 1), the later arrival of Alpha in Houston and Phoenix relative to New York City, and further progress of mass vaccination in Phoenix relative to New York City. Alpha takeoff, as well as Delta takeoff, is tracked locally in panel B (rightmost vertical axis) of Figures 2–5, and vaccination progress is tracked locally in panel C of Figures 2–5.
Impacts of vaccination
On the basis of region-specific parameterizations, we estimated the immune and susceptible fractions of each MSA population, as well as the fractions that achieved immunity through infection and/or vaccination (see panel C in Figures 2–5). Each of these panels shows the time evolution of five different populations in an MSA. In each MSA, only a minority of the population remained susceptible to infection (with Delta) on October 31, 2021, with a sizable fraction of the susceptible population being protected to a degree against severe disease by having completed vaccination.
Estimates of transmissibility factors and takeoff times of Alpha and Delta
Our inferences provide quantitative insights into the transmissibility factors of Alpha and Delta (y1 and y)) and their takeoff times (θ1 and θ)) in each of the 4 MSAs of interest. Figure 6 shows the marginal posteriors of y1, y), θ1, and θ), which were found on the basis of surveillance data for each MSA (daily case counts) reported between January 21, 2020 and October 31, 2021. Maximum a posteriori (MAP) estimates and 95% credible intervals for θ1 and θ) for each MSA are also shown in Figure 6. The MAP estimates for y1 range from 1.1 (for the Dallas MSA) to 1.4 (for the New York City MSA). The MAP estimates for y) range from 1.6 (for the Dallas MSA) to 2.1 (for the New York City MSA). The MAP estimates for θ1 range from February 1, 2021 (for the Dallas MSA) to February 28, 2021 (for the Phoenix MSA). The MAP estimates for θ) range from June 2, 2021 (for the Dallas MSA) to June 26, 2021 (for the Houston and Phoenix MSAs). The estimated takeoff times are consistent with regional genomic surveillance data [43], which are summarized by the shaded regions of panel A in Figures 2–5.
Marginal posterior distributions of transmissibility factors for Alpha in dark-green and Delta in light-green (left panels) and takeoff times for Alpha and Delta (right panels) in four MSAs centered on (A,B) Dallas, (C,D) Houston, (E,F) New York City, and (G,H) Phoenix. Inferences are based on daily reports of new cases from January 21, 2020 to October 31, 2021. For each of the right panels, the 95% credible intervals for Alpha and Delta takeoff times are indicated in parentheses, and the MAP estimate for a given region is indicated to the left of the credible interval.
DISCUSSION
We extended a model for COVID-19 transmission dynamics that already incorporated time-varying changes in non-pharmaceutical interventions [5] to include the effects of vaccination and new variants. This model together with its region-specific parameterizations based on case data available through October 31, 2021 provide quantitative insights into the relative infectiousness of SARS-CoV-2 variants Alpha (lineage B.1.1.7) and Delta (lineage B.1.617.2). The increased transmissibility of Alpha and Delta in comparison to ancestral strains is characterized by the marginal posteriors for the transmissibility factors y1 and y) shown in Figure 6 (panels A, C, E, and G). The maximum a posteriori (MAP) estimates of the transmissibility factors were similar across the four metropolitan statistical areas (MSAs) of interest (centered on Dallas, Houston, New York City, and Phoenix). The averages of our y1 and y) MAP estimates indicate that Alpha was 1.2x more infectious than ancestral strains, whereas Delta was 1.9x more infectious (corresponding to Delta being approximately 50% more infectious than Alpha). These estimates are consistent with estimates provided in other studies [18–23]. Using a formula for the basic reproduction number R0 derived earlier [7] and parameter estimates of Tables 1 and 3, we obtain an R1 estimate of 12 for Delta in the New York City MSA, which places this SARS-CoV-2 variant among the more infectious viruses known.
We also obtained estimates of precisely when sustained transmission of Alpha and Delta began in each of the four geographically distinct MSAs. In earlier work, takeoff times for Alpha and Delta were estimated only for regions outside the US (e.g., Denmark [44] and Mexico [45]), for regions in the US different from those considered here (e.g., New England [46] and two MSAs in Northern California [47]), and for the entire US [48]. The takeoff times for the four MSAs in the present study are characterized by the marginal posteriors for θ1 and θ) shown in Figure 6 (panels B, D, F, and H). The estimated takeoff times are consistent with the observed prevalence of Alpha and Delta sequences detected in regional genomic surveillance [43], as can be seen by comparing the two shaded regions in panel A of Figures 2–5 against the changes in transmissibility Y(t) depicted in the corresponding panel B of Figures 2–5. It should be noted that the case data shown in Figures 2–5 are from the MSAs of interest (through aggregation of county-level data), whereas the genomic surveillance data are from larger regions [43].
Our study has notable limitations, starting with the obvious uncertainties related to model assumptions and fixed parameter estimates, which are discussed in some detail in the Appendix. For example, our model assumes a constant rate of case detection and neglects gradual loss of sterilizing immunity over time. This latter simplifying assumption is supported by a study indicating that protection against re-infection from pre-Omicron variants is significant and remains high even after 40 weeks [49]. Moreover, we caution that the inference jobs performed in this study were challenging because of the relatively high-dimensional parameter spaces involved (in comparison to typical inferences involving an ODE model-constrained likelihood function). Diagnostics indicate good sampling (Table 4, [41]), but we cannot be entirely certain that the samples obtained fully characterize the parameter posteriors of interest. It can be seen in trace plots [41] that mixing across parameters was not uniform, and moreover, in some cases, there are indications of trends (manifesting as a trace plot that lacks the appearance of a horizontally extended, fuzzy caterpillar). The likely cause of these trace plots is poor local performance of the proposal kernel, which was optimized in the adaptive sampling scheme for global performance. The consequences of poor mixing may include biased parameter and uncertainty estimates. Another concern is that the model incorporates redundant disease-incidence surge mechanisms. In the model, an increase in viral infectiousness caused by the emergence of a new variant can be mimicked, to some extent, by a relaxation of social-distancing, and vice versa. For the MSAs considered here, inferred social-distancing levels were low at the time of Alpha and Delta emergence, so the inferred transmissibility factors probably reflect, at least mostly, changes in intrinsic viral infectiousness.
It is known that viral transmissibility does not depend only on viral features but also on population features [7]. This study looked at multiple regions to ascertain whether population features varied significantly. Our analysis indicates that population features influencing transmission did not vary dramatically across these similar urban regions because we obtained similar region-specific estimates for the Alpha and Delta transmissibility factors. The Alpha transmissibility factors for the Dallas, Houston, New York City, and Phoenix MSAs were 1.1x, 1.2x, 1.4x, and 1.3x, respectively. the Delta transmissibility factors for these MSAs were 1.6x, 2.0x, 2.1x, and 1.8x, respectively.
We show that it is possible to gain insights into variant dynamics through a mechanistic modeling approach. This study provides an example for modelers interesting in understanding the impacts of a mass vaccination campaign and emergence of variants with altered transmissibility during an epidemic of an aerosol-transmitted respiratory disease similar to COVID-19.
In summary, this report provides estimates of Alpha and Delta transmissibility for specific regions within the US as well as their takeoff times.
Data Availability
Files needed to reproduce the Bayesian inference results and MCMC sampling diagnostic plots are available online at https://github.com/lanl/PyBNF/tree/master/examples/Vax_and_Variants
APPENDIX
Imputation of Missing Daily Case Counts
By October 31, 2021, many regions in the US were not reporting new detected COVID-19 cases on a strictly daily basis. When one or more daily case counts were not available, we imputed daily case counts on the basis of a linear fit to the two nearest available cumulative case counts. This approach has the effect of evenly distributing case counts across the days for which daily reports are unavailable.
Equations of the Compartmental Model
The compartmental model, which is illustrated in Figure 1 and Appendix Figure 1, consists of the following 40 ordinary differential equations (ODEs):
In these equations, the independent variable is time t, and the state variables (SM, SP, E1,M, …, E=,M, E1,P, …, E=,P, E),:, …, E=,:, AM, AP, AQ, IM, IP, IQ, HU, D, RU, V1, …, V&, SV,1, …, SV,(, EV, AV, IV, HV, and RV) represent 40 (sub)populations (Appendix Table 1), which change over time. Thus, each ODE in Equations (1)–(28) defines the time-rate of change of a population, i.e., the time-rate of change of a state variable. Note that Equations (5), (6), (8) and (19) define 4, 4, 3, and 5 ODEs of the model, respectively. The model is formulated such that S0, the total population, is a constant. Thus, the model does not account for birth, death for reasons other than COVID-19, immigration, or emigration.
The initial condition associated with Equations (1)–(28) is taken to be SM(t0) = S1, IM(t0) = I1 = 1, and all other populations (SP, E1,M, …, E=,M, E1,P, …, E=,P, E),:, …, E=,:, AM, AP, AQ, IP, IQ, HU, D, RU, V1, …, V&, SV,1, …, SV,(, EV, AV, IV, HV, and RV) are equal to 0. Recall that the parameter S0 denotes the total region-specific population size. Thus, we assume that the entire population is susceptible at the start of the local epidemic at time t = t0 > 0, where time t = 0 corresponds to 00:00 hours (midnight) on January 21, 2020. The parameter I0 denotes the number of infectious symptomatic persons at the start of the regional epidemic.
In the model, the parameters β, kL, kQ, jQ, τA, τI, τH, and kV are positive-valued rate constants (all with units of d-1), and the parameters mb, m2, fA, fH, fR, f0 ≥ f1, f1 ≥ f), and f) are (dimensionless) fractions. Brief definitions of parameters are given in Tables 1 and 2.
In the model, the quantities φM(t, ρ), φP(t, ρ), and φV(t) are functions of (time-dependent) state variables (as defined below), which represent the population of infectious persons who are mixing freely (i.e., not practicing social-distancing), the population of infectious persons who are practicing social-distancing (i.e., adopting disease-avoiding behaviors), and the population of persons eligible for vaccination, respectively. The quantities φM(t, ρ) and φP(t, ρ) are also functions of ρ ≡ (ρE, ρA), where ρE (ρA) is a dimensionless ratio representing the infectiousness of persons in the incubation phase of infection (the infectiousness of asymptomatic persons in the immune clearance phase of infection) relative to the infectiousness of symptomatic persons with the same social-distancing behavior. The quantity φV(t) represents the population of persons eligible for vaccination.
The state variables that appear in these equations represent time-varying populations. Recall that state variables are defined in Appendix Table 1.
In the model, the quantities Uσ(t), U1(t), and U2(t) are unit step functions. The values of these functions change from 0 to 1 at the times indicated by the subscripts: σ, the onset time of the initial social-distancing period; θ1, the takeoff time of SARS-CoV-2 variant Alpha; and θ), the takeoff time of SARS-CoV-2 variant Delta.
As indicated in Appendix Figure 1, transitions from SM to SP, for example, become possible at time t = σ, transitions from SV,) to EV become possible at time t = θ1, and transitions from SV,* to EV become possible at time t = θ).
In the model, the quantities Pτ(t), and Λτ(t) are step functions that characterize changes in social-distancing. The value of Pτ(t) determines a setpoint steady-state fraction of susceptible persons who are practicing social-distancing. The value of Λτ(t) determines a time scale for approach to the setpoint steady state. Changes in the values of Pτ(t) and Λτ(t) occur coordinately. These changes occur at times t = σ, τ1, …, τn, where n is the number of distinct social-distancing periods beyond an initial social-distancing period. Initially, we took n = 7 (i.e., 8 total social-distancing stages). The value of n is decremented by 1 (at an inferred time) if n ← n − 1 is determined to be admissible by a model-selection procedure, which is described below. It should be noted that p0, p1, …, pn are parameters of Pτ(t) and that λ1, λ1, …, λn are parameters of Λτ(t). These parameters determine the non-zero values of the step functions over different periods. For example, p1 is the value of Pτ(t) and λ1 is the value of Λτ(t) for the period t ∈.
The values of Pτ(t) and Λτ(t) are 0 for t < σ.
In the model, the quantity μ(t) is a piecewise linear interpolant to a function μG(t) that characterizes the current rate of vaccination. The value of μG(t) is determined by the empirical daily rate of vaccination, and thus, can vary from day to day. Daily vaccination data were extracted from the Covid Act Now database and the Democrat and Chronicle newspaper [2] using the Covid Act Now Data ApI [1] and web scraping. We will use μ+ to refer to the value of μG(t) for t ∈ [t+, t+31), where time t+ corresponds to midnight on the ith day after January 21, 2020.
Settings for μ+ were made such that μ+S0 × 1 d is the number of vaccinations completed in the nearest past 1-d period according to Covid Act Now data and the Democrat and Chronicle newspaper.
In the model, the quantity Y(t) is a step function that quantifies how disease transmissibility increases upon emergence of SARS-CoV-2 variants Alpha and Delta. Initially, Y(t) = 1. The value of Y(t) is increased (by an inferred factor greater than 1 at an inferred time, θ1 or θ)) if the change is determined to be admissible by a model-selection procedure, which is described below. It should be noted that y1 and y) are parameters of Y(t). These parameters determine the values of the step function Y(t) over different periods: y1 is the value of Y(t) for the period t ∈ [θ1, θ)) and y) is the value of Y(t) for the remaining period of concern (with Delta as the dominant circulating viral strain).
where m is the number of viral variants that have emerged up to the current time, θ0 ≡ t1, and y0 ≡ 1. Here, we consider m = 2.
Equations of the Auxiliary Measurement Model
As in the study of Lin et al. [3], we assumed that only symptomatic persons are detected in testing. The accumulation of symptomatic persons is governed by
where CS(t) is the cumulative number of symptomatic persons (cases) at time t. Here, unlike in the study of Lin et al. [3], the expression for CS(t) accounts for exposed persons in quarantine. Initially, CS = 0. We numerically integrated Appendix Equation (39) together with the ODEs of the compartmental model. From the trajectory for CS, we derive a prediction for the expected number of new COVID-19 cases reported on calendar date Di, I(t+, t+31), using the following equation:
where fDis an adjustable region-specific parameter characterizing the time-averaged fraction of symptomatic cases detected and reported, t+ corresponds to midnight on the ith day after January 21, 2020, and t+31 − t+ is the reporting interval (1 d). We compare I(t+, t+31) to δC+, the number of new cases reported on calendar date Di.
Definition of the Likelihood Function
Bayesian inference relies on the definition of a likelihood, which here serves the purpose of assessing the compatibility of available surveillance data with adjustable (free) parameter values. Let us use {δC+}di=0 to denote the daily case reporting data available between 0 and d days after midnight on January 21, 2020 (the date of the first case report in the US) and let D = {δC+}d. Let us use θC(n, m) to denote the set of adjustable (free) parameter values. The number of adjustable parameters, |θC|, depends on n, the number of social-distancing periods considered beyond an initial social-distancing period, and m, the number of SARS-CoV-2 variants under consideration. As in the study of Lin et al. [3], we assume that δC+, the number of new COVID-19 cases detected over a 1-d period and reported on calendar date Di for a given region, is a random variable and its expected value follows a model-derived deterministic trajectory given by I(t+, t+31) (Equation 40). We assume that day-to-day fluctuations in the random variable are independent and characterized by a negative binomial distribution NB(τ, q+), which has two parameters, τ > 0 and q+ ∈ (0,1). Note that D[NB(τ, q+)] = τ(1 − q+)/q+. We assume that this distribution has the same dispersion parameter τ across all case reports. With these assumptions, we arrive at the following likelihood function:
where
and
In these equations, i is an integer indicating the date D+ and period (t+, t+31); nbinom(δC+; τ, q+) is the probability mass function of the negative binomial distribution NB(τ, q+), and θC(n, m) = {t0, β, σ, τ1, …, τn, p0, p1, …, pn, λ0, λ1, …, λn, θ1, …, θm, y1, …, ym, fD, τ} for n ≥ 1 and m ≥ 1; θC(0,0) = {t0, β, p0, λ1, fD, τ}.
parameters
Each model parameter is briefly described in Tables 1–3. These parameters have either fixed values or adjustable values (i.e., values inferred from surveillance data). The fixed values may be universal (i.e., applicable to all MSAs of interest) or MSA-specific. All inferred parameter values are MSA-specific. In addition, the measurement model (Appendix Equations 39 and 40) has one adjustable MSA-specific parameter, fD, and the likelihood function (Appendix Equations 41–43) has one adjustable MSA-specific parameter, τ. Values of the other likelihood parameters, q0, …, qd, are constrained and are determined using Appendix Equation 43.
Original model of Lin et al.
The model shares 19 + 3n parameters with the model of Lin et al. [3], including parameters that define the initial condition (t0, I1, and S0). (Recall that n is the number of social-distancing periods being considered beyond the initial social-distancing period.) The shared parameters are t0, I1, S2, β, σ, τ1, …, τn, p0, …, pn, λ0, …, λn, ρA, ρE, mb, fA, fH, fR, kL, kQ, jQ, τA, τH, and τI. As in the study of Lin et al. [3], we inferred MSA-specific values for the following parameters: t0, β, σ, p1, …, pn, and λ1, …, λn. We also inferred MSA-specific values for τ1, …, τn provided that n ≥ 1. As in the study of Lin et al. [3], the remaining 14 parameters shared between the old and new models (I0, S1, ρA, ρE, mb, fA, fH, fR, kL, kQ, jQ, τA, τH, and τI) were taken to have fixed values, and we adopted the settings of Lin et al. [3] for these parameters (Table 3). These settings are universal except for the setting for S1, the total population, which is MSA-specific.
Extension of the model of Lin et al.
Our extension of the model of Lin et al. [3] introduces 5 + 2(m + 1) + (d + 1) parameters, where m (= 0, 1 or 2) is the number of SARS-CoV-2 variants being considered and d is the number of days since January 21, 2020: θ1, …, θm, y0, …, ym, m2, f0, f1, f), kV, and μ0, …, μd. The θ and y parameters are variant takeover times and transmissibility factors, respectively, except that the value of θ1 is defined as t0 and the value of y0 is defined as 1. The Alpha transmissibility factor y1, the Alpha takeoff time θ1, the Delta transmissibility factor y), and the Delta takeoff time θ) were inferred for each MSA with m = 2 (cf. Table 1). The transmissibility factors were each constrained to be greater than or equal to 1. The settings for μ1, …, μd are empirical and MSA-specific. Each μ+ is set such that μ+S0 × 1 d is the number of vaccinations completed over the past 1-d period nearest to the ith day after January 21, 2020. As noted earlier, the number of completed vaccinations was obtained for each MSA from Covid Act Now and the Democrat and Chronicle newspaper [2] using the Covid Act Now Data ApI [1] and web scraping. In the spreadsheet accessed daily from Covid Act Now, the ‘metrics.vaccinationsCompletedRatio’ column gives the percentage of the total population that has received the recommended number of doses: one dose for Ad26.CoV2.S (Janssen, Johnson & Johnson) or two doses for mRNA-1273 (Moderna) and BNT162b2 (pfizer-BioNTech). As a simplification, we considered all completed vaccinations to be equivalent. The parameters m2, f0, f1, f), and kV were assigned fixed universal estimates (Table 3). Each of these estimates is explained below.
Estimation of kV
The rate constant kV characterizes the rate of transition out of compartment V+ for i = 1, …, nV. Recall that, in the model, susceptible persons enter V1 upon vaccination (Figure 1, Appendix Figure 1). The values of nV (= 6) and kV (= 0.3 d-1) were selected so that the time a person spends in V1, …, Vn), which we will denote as tV, is distributed approximately the same as, the waiting time between vaccination of a previously uninfected person and detection of vaccine-induced SARS-CoV-2-specific IgG antibodies [4] (Appendix Figure 2). According to the model, the time that a person spends in V1, …, Vn)is distributed according to the probability density function f(tV; nV, kV) = kn)tn)–1d-k)t)/(nV − 1)!, i.e., tV is Erlang distributed with shape parameter nV = 6 and rate parameter kV = 0.3 d-1. As can be seen in Appendix Figure 2, the cumulative distribution function of this Erlang distribution reasonably captures the empirical cumulative distribution of waiting times observed in the longitudinal study of Korodi et al. [4].
Thus, in the model, passage through V1, …, V& with rate constant kV = 0.3 d-1 accounts for the variable and significantly non-zero amount of time required for development of a protective antibody response after vaccination.
Estimation of f0, f1, and f2
The parameters f0> f1, f1 > f), and f) are fractions that characterize the average effectiveness of vaccines used in the US and that determine the sizes of (mutually exclusive) subpopulations of vaccinated persons having different susceptibilities to productive infection (i.e., an infection that can be transmitted to others): SV,1, SV,), SV,*, and SV,( (Figure 1, Appendix Figure 1). We assume that persons in the SV,1 subpopulation are susceptible to productive infection by any of the viral strains considered, and in contrast, we assume that persons in the SV,( subpopulation are susceptible to productive infection by none of the viral strains considered. persons in the SV,) subpopulation are taken to be susceptible to productive infection by the Alpha and Delta variants but not viral strains in circulation before the emergence of Alpha. persons in the SV,* subpopulation are taken to be susceptible to productive infection by the Delta variant but not the Alpha variant or viral strains in circulation before the emergence of Alpha. The quantity 1 − f0 defines the fraction of vaccinated persons who enter the SV,1subpopulation after exiting V&, the quantity f0 − f1 defines the fraction of vaccinated persons who enter the SV,) subpopulation after exiting V&, the quantity f1 − f) defines the fraction of vaccinated persons who enter the SV,* subpopulation after exiting V&, and f) defines the fraction of vaccinated persons who enter the SV, (subpopulation after exiting V&. We take f0 to characterize vaccine effectiveness before the emergence of Alpha. According to Thompson et al. [5], vaccine effectiveness was initially 90%. Thus, we set f1 = 0.9. We take f1 to characterize vaccine effectiveness after the emergence of Alpha but before the emergence of Delta. According to puranik et al. [6], in May 2021, vaccine effectiveness was 81%. Thus, we set f1 = 0.81. We take f) to characterize vaccine effectiveness after the emergence of Delta. According to Tang et al. [7], the effectiveness of two doses of the pfizer-BioNTech vaccine (BNT162b2) against Delta is 53.5% and the effectiveness of two doses of the Moderna vaccine (mRNA-1273) against Delta is 84.8%. Taking the average of these figures, we set f) = 0.69.
Estimation of mh
The parameter m2 characterizes the reduced risk of severe disease for a vaccinated person in the case of a breakthrough infection. We set m2 = 0.04, i.e., we assumed that there is a 25-fold reduction in the risk of severe disease for infected persons who have been vaccinated, which is consistent with the observations of Lopez Bernal et al. [8].
Notable New Modeling Assumptions
It should be noted that we treat the incubation period for newly infected (exposed) vaccinated persons differently than for newly infected (exposed) unvaccinated persons (Figure 1, Appendix Figure 1). For unvaccinated persons, as in the study of Lin et al. [3], we divide exposed persons in the incubation period of infection into five subpopulations: E1,M, …, E=,Mfor exposed persons who are mixing (i.e., persons who are not practicing social-distancing), E1,P, …, E=,P for exposed persons who are practicing social-distancing, and E1,:, …, E=,: for exposed quarantined persons. persons move through the five stages of the incubation period sequentially. In contrast, as a simplification, for vaccinated persons, we consider only a single exposed population: EV. We take persons to exit EV with rate constant kL/5 (Appendix Figure 1). With this choice, the duration of the incubation period of infection is the same, on average, for both vaccinated and unvaccinated persons. The average duration is 5/kL (about 5 d) in both cases. The difference is that the duration of the incubation period is Erlang distributed for unvaccinated persons, as discussed by Lin et al. [3], but exponentially distributed for vaccinated persons.
As indicated in Equation (29), we take vaccinated persons with productive infections to be equally as infectious as unvaccinated persons.
As noted earlier, we take all vaccinated persons to be mixing (i.e., to not be practicing social-distancing). Thus, populations of infected vaccinated persons (EV, IV, and AV) contribute to φM(t) (Appendix Equation 29) but not φP(t) (Appendix Equation 30).
As indicated in Appendix Equation 31, we consider pre-symptomatic exposed and asymptomatic unvaccinated persons to be eligible for vaccination and, thus, these persons contribute to the consumption of vaccine doses (i.e., these persons account for a portion of the number of completed vaccinations on a given day i, μ+S0 × 1 d). However, we do not move these persons to vaccinated compartments. The reason is that exposed and asymptomatic persons are expected to develop immunity faster through recovery from infection (i.e., movement to RU) than from vaccination.
As indicated in Appendix Equation 31, we do not consider symptomatic, quarantined, severely ill and hospitalized/isolated-at-home, and deceased persons to be eligible for vaccination.
Inference Approach
Recall that θC denotes the set of all adjustable parameters. As in the study of Lin et al. [3], for each MSA, we inferred MSA-specific adjustable parameter values θC using all MSA-specific surveillance data available up to a specified day of inference DR(i.e., the dth day after January 21, 2020). We took a Bayesian inference approach, meaning that, for a given dataset, we generated parameter posterior samples (a collection of θC’s) through Markov chain Monte Carlo (MCMC) sampling. The parameter posterior samples provide a probabilistic characterization of the adjustable parameter values consistent with the dataset used in inference. By drawing samples from the parametric posterior distribution, we generated a posterior predictive distribution for I(t+, t+31) for each i of interest. We considered all days from January 21, 2020 to October 31, 2021. In other words, for each i of interest, a prediction for I(t+, t+31) was generated for each of many θC’s drawn randomly (with uniform probability) from the parametric posterior distribution. The resulting distribution of I(t+, t+31) values is the posterior predictive distribution for I(t+, t+31). Recall that I(t+, t+31) is given by Appendix Equation 40 and corresponds to D[δC+], the expected number of new COVID-19 cases detected over a 1-d surveillance interval and reported for the ith day after January 21, 2020. Observation noise was injected into the posterior predictive distributions by replacing each sampled value for I(t+, t+31) with Xi∼NB(τ, q+), where τ is a member of the sampled set of parameter values θC used to generate the prediction of I(t+, t+31) and q+ is given by Equation 43.
According to Bayes’ theorem, given surveillance data D = {δC+}d, the parametric posterior is given as
where ℙ{θC} is the prior (which is formulated to capture knowledge of θC external to D or to express lack of such knowledge), ℙ{D|θC} is the likelihood defined by Appendix Equations 41– 43, and Z is a normalizing constant. We assumed a proper uniform prior, i.e., for each adjustable parameter, we assumed that all values between specified lower and upper bounds are equally likely before consideration of D. We used the same bounds as in the study of Lin et al. [3]. Then, we used an adaptive MCMC (aMCMC) algorithm [9] to generate samples from ℙ{D|θC} ℙ{θC}, which is proportional to ℙ{θC|D}. Thus, the relative probabilities of parameter sets θC according to ℙ{θC|D} are correctly represented by the samples.
Specifically, the adaptive MCMC algorithm [9] generates samples from the multivariate parametric posterior for adjustable model parameters (t0, β, and parameters for variant emergence and social-distancing), the measurement model parameter fD, and the likelihood parameter τ (Tables 1 and 2). This algorithm is available within the pyBioNetFit software package [10]. Use of the algorithm was performed as described by Lin et al. [3]. The report of Neumann et al. [10] includes helpful general usage advice, which was followed in this study.
Inference jobs were executed on a computer cluster. For each inference job, a total of 25 chains were generated, and the chain with the best mixing and convergence properties was selected for subsequent analyses.
Each inference was conditioned on the compartmental model of Figure 1 (Appendix Equations 1–38), settings for the structural parameters m (the number of SARS-CoV-2 variants under consideration) and n (the number of social-distancing periods under consideration beyond an initial social-distancing period), the measurement model (Appendix Equations 39 and 40), and fixed parameter estimates (Tables 1 and 2), including empirical daily per capita vaccination rates (i.e., the settings for μ+ in Appendix Equation 37). We assumed a proper uniform prior for each adjustable parameter [3] and a negative binomial likelihood function (Appendix Equations 41– 43). Use of proper uniform priors means that MAp estimates are maximum likelihood estimates (MLEs). In each inference, the data entering the likelihood function, D = {δC+}d (Appendix Equation 41), were MSA-specific daily reports of newly detected COVID-19 cases available up to the date of inference DR (i.e., the d-th day after January 21, 2020). Thus, all inferences are region-specific and time-dependent.
Use of Model Selection to Determine Intervals of Step Functions
Variant takeover times, θ = (θ1, θ)), and start times of social-distancing periods, τ = (σ, τ1, …, τn), were inferred from data; however, changes of the associated time-dependent step functions, Y(t), Pτ(t), and Λτ(t), were introduced only when an increase in model complexity was deemed to be justified. Each decision to introduce variant takeover or start of a new social-distancing period (beyond the initial period) was made using a model-selection procedure, which is described below. It should be noted that y1 and y) are parameters of Y(t), p0, p1, …, pn are parameters of Pτ(t), and λ1, λ1, …, λnare parameters of Λτ(t). These parameters determine the values of the step functions over different periods. For example, for n ≥ 1, p1 is the value of Pτ(t) and λ1 is the value of Λτ(t) for the period t ∈ [τ1, τ)). Similarly, y1 is the value of Y(t) for the period t ∈ [θ1, θ)), and y) is the value of Y(t) for the period t ∈ [θ), tfinal), where tfinal corresponds to the date of inference (October 31, 2021).
We started with a setting of n = 7 for each MSA of interest (i.e., 8 total social-distancing stages). To determine if n could be reduced, we conducted parsimony checks. In a parsimony check, 100 MLE curves, each constituting a fit to the data, were generated via optimization jobs. In these jobs, the total number of social-distancing stages in the model (n + 1) was set at 1 less than for the current proposed best fit. Each of the 100 fits was visually inspected to determine whether or not the following criteria held: 1) the quality of fit is acceptable (i.e., comparable to what is obtained with the current proposed best fit); 2) Alpha and Delta surges (identified by sequencing data) are explained at least partly by increased transmissibility; 3) social-distancing setpoint parameter values are feasible (i.e., each setpoint parameter takes on a value between 0.2 and 0.8); and 4) social-distancing changes proximal to an Alpha or Delta surge (if any) occur only after an increase in transmissibility. If one or more of these conditions was not satisfied, we accepted the proposed best fit as the most parsimonious fit to the data. If all conditions held, we updated the proposed best fit, and the parsimony check was repeated for a model with one less social-distancing stage. In the Appendix, we show examples of violations of parsimony-check criteria for the MSA surrounding Houston (Appendix Figure 3).
Simulations
After specification of parameter values (Tables 1–3), we used the Scipy (https://www.scipy.org) interface to LSODA [12] to numerically integrate the system of coupled ODEs consisting of the 40 ODEs of the compartmental model and the 1 ODE of the measurement model (Appendix Equations 1–39). The initial condition was defined by settings for t0, I1, and S2 (Tables 1–3). Integration combined with use of Appendix Equation 40 yielded a prediction of the expected number of new cases detected for each 1-d surveillance period of interest in the past or future: I(t+, t+31), where t+ corresponds to midnight on the ith day after January 21, 2020. To account for randomness in case detection and reporting, we replaced I(t+, t+31) with X+∼NB(τ, q+), where q+ is given by Appendix Equation 43.
Expanded illustration of the new compartmental model. In the extended model, vaccination was considered by allowing susceptible and recovered persons to transition into a vaccinated compartment, either V1 and RV. Susceptible (recovered) persons who have completed vaccination move into the V1(RV) compartment. The susceptible persons who move into V1are drawn from SM(populated by susceptible persons who are mixing and unprotected by social-distancing) and from SP (populated by susceptible persons who are protected by social-distancing). After susceptible persons enter V1, they can move through a series of additional compartments (V)through V&), which are included to capture the time needed for immunity to develop after completion of vaccination. We estimate that the time needed to acquire immunity after vaccination is approximately three weeks based on longitudinal studies of anti-spike protein IgG levels [4]. persons who exit the V& compartment without becoming infected enter one of the following compartments: SV,1, SV,), SV,*, or SV,(. persons in SV,1 are taken to remain susceptible to productive infection by all SARS-CoV-2 strains of interest (Alpha, Delta, and ancestral strains). persons in SV,) are taken to be susceptible to SARS-CoV-2 Alpha and Delta variants. persons in SV,* are taken to be susceptible to Delta. persons in SV,( are taken to be protected against all strains of interest. Infection of persons in SV,* is only allowed if Delta is present, i.e., at times t > θ). Infection of persons in SV,) is only allowed if Alpha or Delta is present, i.e., at times t > θ1. Vaccinated persons in compartments V1 through V& and compartment SV,1 are allowed to become infected at any time, at which point they transition to compartment EV, consisting of vaccinated persons who were exposed before development of vaccine-induced immunity. persons in compartment SV,) are allowed to become infected if t ≥ θ1. Similarly, persons in compartment SV,* are allowed to become infected if t ≥ θ). possible outcomes for persons in EVare taken to be the same as those for unvaccinated exposed persons; however, the incubation period is taken to be distinct. persons in EV can experience asymptomatic disease (upon entering AV) or they can become symptomatic (upon entering IV). persons in AV eventually recover, entering compartment RV. persons in IV can progress to severe disease (upon entering HV) or recover (upon entering RV). persons in HV either recover (moving into RV) or die (moving into D). persons who have recovered from infection, in the RU compartment, move directly into the RV compartment upon vaccination. persons in the RU and RV compartments are taken to have full immunity. The vaccination rate at which susceptible and recovered persons move into vaccinated compartments is updated daily for consistency with the empirical overall rate of vaccination, which we extract daily from the COVID Act Now Data ApI [1] and the Democrat and Chronicle newspaper [2]. The relative values of the vaccination rate are set such that each person eligible for vaccination has the same probability of being vaccinated. All unvaccinated persons are taken to be eligible for vaccination except symptomatic persons (in compartments IM and IP), persons who are hospitalized or severely ill at home (in compartment H), quarantined persons (in the various compartments labeled with a Q subscript), and deceased persons (in compartment D). It should be noted that asymptomatic, non-quarantined persons (in compartments AM and AP) and presymptomatic, non-quarantined persons (in the E compartments) are taken to be eligible (and to influence the vaccination rate constants) but, as a simplification, these persons are not explicitly tracked as vaccinated or unvaccinated because each of these persons will eventually enter either the D compartment or the RU compartment, at which point they will have immunity. In the model, the effects of SARS-CoV-2 variants are captured by a time-dependent dimensionless multiplier Y(t) of the rate constant β. This rate constant, which appears in Appendix Equations 1–4, 18–22, and 24, determines the rate of disease transmission within the subpopulation unprotected by social-distancing behaviors when Y(t) = 1. We take Y(t) = 1 for times t < θ1, i.e., for the initial period of the COVID-19 pandemic in the US that we take to have started on January 21, 2020. We take Y(t) to have the form of a step function with distinct values greater than 1 for periods starting at t = θ1 and θk31 > θk for k = 1, …, m. Thus, the model allows for m distinct periods of variant strain dominance delimited by a set of start times θ = {θ1, …, θm}. We considered m = 2. We assume that variants differ only in transmissibility.
Comparison of an Erlang cumulative distribution function with shape parameter nV = 6 and rate parameter kV = 0.3 d-1 and the empirical cumulative distribution of waiting times (tV values) observed in the longitudinal study of Korodi et al. [4]. The waiting time t°V is the time between vaccination of a previously uninfected person and detection of vaccine-induced SARS-CoV-2-specific IgG antibodies.
Example of a parsimony check with criteria that are violated for the MSA surrounding Houston. (A) Alpha and Delta surges (identified by sequencing data) are not explained, at least in part, by increased transmissibility, while all other criteria are satisfied. Each broken vertical black line from left to right indicates the date of onset of a social-distancing stage (i.e., with n = 3 for this illustration). Each broken vertical red line from left to right indicates the takeoff date of a variant (namely, Alpha or Delta). (B) Not all social-distancing setpoint parameter values are feasible, while all other criteria are satisfied. (C) Social-distancing changes proximal to an Alpha or Delta surge (if any) precede an increase in transmissibility, while all other criteria are satisfied. In panel A, we see that the orange segment between the two broken vertical red lines, which denotes the relative transmissibility of Alpha, is no larger than the orange segment to its left, which denotes the relative transmissibility of ancestral strains of SARS-CoV-2. In panel B, the rightmost blue segment, which denotes the social-distancing setpoint parameter value for the final social-distancing stage, has a value of roughly 0.16, which is deemed infeasible by our criterion. Finally, in panel C, the rightmost broken vertical black line is proximal to the leftmost broken vertical red line, which corresponds to the takeoff time of the Alpha surge, and the social-distancing setpoint parameter value changes prior to the Alpha surge. Note that all three panels show MLE curves, each of which constitutes an acceptable fit to the data.
ACKNOWLEDGMENTS
W.S.H., Y.T.L., and A.M. were supported by the LDRD program at Los Alamos National Laboratory (project 20220268ER). Y.C., Z.H., E.F.M., J.N., K.E.N., and R.G.P. were supported by a grant from the National Institute of General Medical Sciences of the National Institutes of Health (grant R01GM111510). A.M. was supported by the 2020 Mathematical Sciences Graduate Internship program, which is sponsored by the Division of Mathematical Sciences of the National Science Foundation, and the Center for Nonlinear Studies at Los Alamos National Laboratory. Computational resources used in this study included Northern Arizona University’s Monsoon computer cluster, which is funded by Arizona’s Technology and Research Initiative Fund, and the FARM computer cluster at the University of California, Davis. Y.C. thanks Song Chen (University of Wisconsin, La Crosse, Wisconsin, USA) for technical assistance.
Footnotes
Table 4 was updated. Figures 2-5 were modified for greater legibility. Various minor revisions