Interpretable, non-mechanistic forecasting using empirical dynamic modeling and interactive visualization
=========================================================================================================

* Lee Mason
* Amy Berrington de Gonzalez
* Montserrat Garcia-Closas
* Stephen J Chanock
* Blànaid Hicks
* Jonas S Almeida

## Abstract

Forecasting methods are notoriously difficult to interpret, particularly when the relationship between the data and the resulting forecasts is not obvious. Interpretability is an important property of a forecasting method because it allows the user to complement the forecasts with their own knowledge, a process which leads to more applicable results. In general, mechanistic methods are more interpretable than non-mechanistic methods, but they require explicit knowledge of the underlying dynamics. In this paper, we introduce a tool which performs interpretable, non-mechanistic forecasts using interactive visualization and a simple, data-focused forecasting technique. To keep the work FAIR and to preserve privacy, we have released the tool as an entirely in-browser web application.

## 1. Introduction

It is difficult to predict the future. Many forecasts fail to beat even basic benchmarks such as not projecting any changes at all (1), despite the fact that forecasters have access to more data and methods than ever before. The main issue is that the real world is very complicated: it involves many elements interacting in complex, non-linear ways (2–4). Leading forecasting methods are often ineffective at dealing with this complexity, in part because they impose unrealistic or over-simplistic assumptions upon the data (5,6). On the other hand, there is limited evidence that complex methods lead to better forecasts than simple methods (7). One way in which people attempt to make better forecasts is by using explicit, mechanistic models of the system dynamics which they wish to predict (8). For example, in infectious disease epidemiology, an SIR (susceptible-infected-recovered) model aims to find the rates of change between three population groups: susceptible, infected, and recovered (9). However, a mechanistic approach requires the forecaster to choose an appropriate model and correctly parameterize it, which is only possible if they have sufficient knowledge of the dynamics. Insufficient knowledge may lead the forecaster to choose an incorrect or over-simplistic mechanistic model, resulting in poor forecasts (10–13). Indeed, even when the correct mechanistic model is chosen, it may be outperformed by a well-tuned model-free (non-mechanistic) method (14).

Non-mechanistic methods differ from mechanistic methods in that they do not require an explicit model of the dynamics (15,16). As such, they can be similarly applied to data from any domain so long as the data meets the method’s assumptions. Non-mechanistic methods include statistical modeling (e.g. ARIMA, exponential smoothing) (17), empirical dynamic modeling (e.g. simplex) (18), and deep learning (e.g. LSTM) (19,20). Non-mechanistic methods are powerful and flexible, but they are generally less interpretable than mechanistic methods because their parameters lack a domain-specific meaning (15). Interpretability is useful because it makes the relationship between the data and the forecasts more transparent. If a forecasting method is complex and difficult to interpret, it can give the user an inflated confidence in the quality of the forecasts that cannot be easily checked by the representation of the data (7). When a complex method makes a surprising forecast, it is difficult for the user to assess whether that forecast is a realistic consequence of some hidden pattern in the data, or merely a fit to the noise. Similarly, if a forecasting method is interpretable with interactive visual reference to the data, it is easier for a forecaster to incorporate their knowledge into the forecasts (i.e. to “own the forecasts”), leading to more applicable results and thus more effective applications (21–23).

In this paper, we introduce EpiForecast: a web tool which produces detailed forecasts using an easy-to-understand, non-mechanistic approach. The tool uses interactive visualization to ensure interpretability, which allows the user to assess and improve the forecasts using their own knowledge. To ensure EpiForecast is accessible to a variety of users, we have released it as an open-source, in-browser web application. EpiForecast can work privately with local data or pull data from online data sources, allowing users to engage with live datasets. We have also made the tool available in a reactive notebook (24), complete with additional interactive explanations of the underlying principles.

## 2. Results

### 2.1 Overview

We have produced a web tool which uses interactive visualization and empirical dynamic modeling to make interpretable, non-mechanistic forecasts. The tool is available at https://episphere.github.io/forecast and is also available, with additional explanations, in an Observable Notebook at https://observablehq.com/@siliconjazz/edm-interpretable-forecasting. We encourage readers to use these resources before reading the rest of this section because doing so will make the following explanations clearer. The tool is built on vanilla, client-side JavaScript using a small number of libraries, most notably D3. The tool consists of four interactive plots with some additional controls (see Figure 1). The basic principles of the tool are based on empirical dynamic modeling (EDM), a simple and intuitive non-mechanistic technique which is used for forecasting, especially of non-linear systems (18,25). The EDM process is relatively straightforward. It first searches for the nearest dynamic neighbors: moments in the past where the dynamics most closely resemble the recent dynamics. It then looks at what happened after each of these moments and uses that to predict what may happen next. Each neighbor is weighted by how closely it resembles the most recent dynamics. This information can be used to generate point forecasts. The simplest way to accomplish this, used here, is by calculating the weighted mean of all the neighbors’ futures (see simplex method in Methods). However, in this tool we use visualization to deemphasize point predictions in favor of a more detailed representation of the range of potential futures which arise from the different dynamic neighbors.

![Figure 1:](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2022/10/25/2022.10.21.22281384/F1.medium.gif)

Figure 1: 
Screenshot of the tool, available at [https://episphere.github.io/forecast](https://episphere.github.io/forecast). The example data depicted here is all-cause mortality from the Centers for Disease Control and Prevention (CDC). The user can also upload their own data or point the tool to data at a URL. **a)** The chronological time series, which shows the data accompanied by relevant information from the method. The blue dot is the current point, which is configurable by the user using the time-slider below the plot. The red dots are the nearest dynamic neighbors; their color is proportional to the respective neighbor’s weight. The purple shading to the right of the current point is the forecast area, a continuous combination of the neighbors’ futures. Darker areas in the forecast area are higher weighted, meaning they occur in more or higher weighted neighbors’ futures. The purple dashed line is drawn by joining the point forecasts in the forecast horizon; the point forecasts are a weighted average of the neighbors’ futures. **b)** An offset time-series plot which shows each of the neighbors and their futures, all offset to be on top of each other. This plot is useful for directly comparing the embedded neighbors and their futures. **c)** A phase-space plot, useful for inspecting dynamic patterns in the data (such as cycles). The phase-space plot also clearly shows when the data has entered a new dynamic space, a case in which the EDM method will be less useful. **d)** Hyperparameters for the methods used by the tool, as well as a couple of additional visual options. **e)** A bar plot showing the distance of each neighbor to the current embedded point. This is useful for seeing explicitly which neighbors are closest, as well as seeing how that relates to their assigned color across the plots. The dashed line is the mean distance of all neighbors up to the selected current point in time. 
If a user hovers over one of the neighbors in any of the plots, then it is highlighted in all other plots. This allows the user to gain an immediate and concurrent sense of several aspects of the neighbor.

### 2.2 Details of the Tool

The main plot of the tool is a time-series marked with additional information (Figure 1a). The data line is drawn using two different shades of grey; past data which the forecasting method can “see” (i.e., data before the current point) are drawn with the darker shade. There are several colored points marked on the time-series: the current point in time is a purple dot, and the nearest neighbors are red dots. Each neighbor’s shade of red is proportional to the weight of that neighbor, so higher weighted neighbors are darker. Immediately after the current point there is a shaded forecast area which extends *tp* timesteps into the future (see Methods for more details about the forecasting method parameters). The purpose of this area is to summarize, at a glance, the offset futures of all the neighbors. In brief, the higher opacity parts of this area are closer to the offset futures of (higher-weighted) neighbors. More detail about how this area is generated can be found in *Methods*.

Optionally, the user can choose to display point forecasts calculated using the simplex method, which are represented by a dashed purple line. Point forecasts are another way to summarize the neighbors’ futures, but they are less detailed than the visual representation provided by the forecast area. For example, take the case in Figure 2, where the EDM method seems to have identified two potential futures. By necessity, point forecasts will fall somewhere in the middle of these two futures and the forecaster will be unaware that two different possibilities were identified. One way in which we have addressed this concern in the tool is by generating a shaded forecast area which summarizes the neighbors’ futures in a more detailed way (see Methods). The user could simply view all potential futures together, as is the default for the tool’s offset time-series plot, but the shaded forecast area makes it easier to quickly see the weighted contribution of the futures, especially when there are a lot of neighbors.

![Figure 2:](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2022/10/25/2022.10.21.22281384/F2.medium.gif)

Figure 2: 
This snapshot of the tool (using the default mortality data and with time point t = 87 selected) shows an example where point forecasts (the dashed purple line) are an ineffective summary of the forecasting results. The method appears to predict two similarly weighted potential futures. In this case, point forecasts necessarily must fall between these two futures, leading to a misleading result where all the information about the two possibilities is lost in favor of a line which is not representative of the method’s more nuanced results.

The chronological time-series plot with a time slider beneath it (Figure 1a) is the natural starting point for the user to begin interacting with the tool. It is accompanied by the offset time-series plot (Figure 1b); this plot shows the neighbors, the neighbors’ futures, the recent dynamics, the forecast area, and (optionally) the point forecasts. It is, essentially, a zoomed-in version of the chronological plot, with each dynamic neighbor offset to the present such that the neighbors and their ensuing futures can be directly compared. This allows the user to get a better idea of how point forecasts and the forecasting area are produced from the neighbors and their futures. The user can then use the phase-space plot (Figure 1c) to inspect dynamic patterns, such as the cyclic patterns present in the mortality data. It also helps highlight when the dynamics deviate from previous patterns; for example, the emergence of COVID-19 resulted in a new region of the mortality data phase space being populated. Finally, there is a bar plot which shows the Euclidean distances between each dynamic neighbor and the recent dynamics (Figure 1e). The dashed line provides the mean distance of all previous neighbors to their respective current points, which allows the forecaster to contextualize the distance values.

Interaction is the critical feature of this tool. The central tenet argued in this report is that engaging multiple direct representations of the forecasts interactively leads to a deeper understanding of the underlying dynamics. To this end, each of the plots can be interacted with and this interaction is reflected in the other plots. The plots are accompanied by a panel of additional controls (Figure 1d) which allow the user to control the hyperparameters of the EDM method, the relative width of the gaussian kernels used to generate the shaded forecasting area, and some minor visual options. The user can also disable nearest neighbors, which removes them from the shaded forecast area and the point forecasts calculation. The user can export the results as either a simple series of point forecasts or as multiple parallel series, one for each neighbor. Instructions for interacting with the tool are available in a video linked on EpiForecast’s web page and we encourage readers to play around with the interactive forecasts themselves to develop an understanding of how the interactivity facilitates better understanding.

### 2.3 Example

Using Figure 1, we will illustrate how a detailed and deconstructed representation of forecasts can help an analyst generate applicable forecasts. In this example, it is 2019/09/06 and a public health official wishes to forecast US all-cause mortality rates (the example data for the tool). The forecaster notices that the method is drawing from several different situations in the past: the decreasing side of a peak in 2014, the increasing side of peaks in 2016 and 2017, and the bottom of a peak in 2018. The forecaster knows that unnormalized mortality counts tend to increase over time, and they therefore decide that the 2014 neighbor’s contribution to the forecasts is not relevant, and thus they disable that neighbor. A number of influences in 2017/18 led to a particularly deadly flu season (hence the larger peak in deaths). If the forecaster believed these influences were present again in 2019, they could disable the remaining neighbors in order to favor the neighbors from 2017. However, the forecaster is not sure, and thus leaves in the neighbors from 2016, 2017, and 2018. Finally, the forecaster exports the forecasts as 7 separate forecast series, one for each enabled neighbor. These exported series could now be analyzed in another environment using a suitable forecasting technique of the forecaster’s choice, or they could form the basis of judgmental forecasts.
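
The effect of disabling a neighbor can be illustrated with a short Python sketch. This is not the tool's JavaScript code or its export format: the function name `combine_enabled`, the `enabled` flags, and the toy numbers are all hypothetical. It assumes the point forecast is the weighted mean of the enabled neighbors' futures, as described for the simplex method.

```python
def combine_enabled(futures, weights, enabled):
    """Recompute point forecasts from the neighbors that are still enabled.

    futures: one list of future values per neighbor (all the same length)
    weights: one weight per neighbor
    enabled: one boolean per neighbor (hypothetical stand-in for the UI toggle)
    Returns (point forecasts, per-neighbor series kept for export).
    """
    kept = [(z, w) for z, w, on in zip(futures, weights, enabled) if on]
    if not kept:
        raise ValueError("at least one neighbor must stay enabled")

    total = sum(w for _, w in kept)
    horizon = len(kept[0][0])
    # Weighted mean of the remaining futures at each step of the horizon.
    point = [sum(w * z[k] for z, w in kept) / total for k in range(horizon)]
    # One parallel series per enabled neighbor, e.g. for analysis elsewhere.
    series = [list(z) for z, _ in kept]
    return point, series


# Toy values only: three neighbors, the first one judged irrelevant and disabled.
futures = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
weights = [1.0, 1.0, 2.0]
point, series = combine_enabled(futures, weights, [False, True, True])
print(series)  # [[3.0, 4.0], [5.0, 6.0]]
```

Disabled neighbors drop out of both the point forecast and the exported series, so the remaining weights are renormalized by their own sum.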

### 2.4 Working Privately with Data

By default, the application shows the all-cause US mortality dataset from the Centers for Disease Control and Prevention (26,27), showcased in Figure 1, but users can supply their own data to the tool by uploading a CSV or JSON file, or by pointing the application to a corresponding URL. See the supplementary video or the tool’s web page for more instructions on how to upload your own data to the tool. It is important to note that all visualizations and computations take place within the safety of the user’s browser sandbox. No data or analytics circulate outside the user’s Web browser, fully preserving privacy in order to enable the visualization of both public and sensitive data.

## 3. Discussion

We have produced a FAIR web tool which allows users to produce interpretable forecasts with their own data and then explore them in detail. This tool seeks to illustrate the role of interpretability via simplicity and interactivity, a key but often overlooked element of forecasting. If the forecaster can understand how a forecasting method produces results, then they can better assess the relevance and reliability of those results. As such, the primary goal of EpiForecast is not to produce more accurate point forecasts, but instead to further the “explainability” of forecasts by conveying a more detailed representation of the results: a representation which helps the user understand how the resulting forecasts were generated from the input data (21). In a sense, this work treats forecasting as an exploratory process rather than an analytic one. Concrete numeric results are replaced by a more nuanced and detailed understanding of dynamic structures in the data and how these structures may be informative when forecasting. One way to achieve explainability is by choosing a method which in some way mirrors intuitive human reasoning. EDM is suitable for this because it is based on the reasonable premise that immediate futures of similar pasts may provide insight into the future. EDM executes steps which are analogous to how a human may perform forecasting and, in doing so, produces a lot of information which is easy for a human to interpret. EpiForecast takes this information, visualizes it in multiple different ways, and makes it explorable through interaction.

EpiForecast uses the shaded forecasting area (see Figure 2) as a way to provide a more detailed representation of the information generated by EDM than would be gained from point forecasts. Interactivity is also used for this purpose. Point forecasts and the shaded forecast area are both ways to summarize the information from the EDM method, but with interactivity there is no need to summarize; we can just include all this information and the forecaster can view it on request. This provides the forecaster with a more complete understanding of the results and could thus reduce the bias which may arise from a static summary. The idea of using interactive visualizations to improve analytical insight is the core tenet of the field of visual analytics. There has been recent interest in using the principles of visual analytics to improve forecasting, and in particular to improve the forecaster’s understanding of the forecasting method (21,28).

Our accompanying tool has, nevertheless, several notable limitations. The EDM method requires a lot of data (29) and is most useful when the embedded space is well populated in the region of the current embedded point. Therefore, the tool will not work as well when the data has a substantial trend because the current embedded point will often fall in an uninhabited region of embedded space and therefore neighbors may not represent meaningfully similar pasts. As such, the tool will also be less effective for time-series where the long-term trend dominates the short-term dynamics, which usually occurs for long-term time-series with few points. To some extent, this can be addressed by detrending the data, and the explainability of our tool makes it easier to identify other potentially useful preprocessing steps. Furthermore, the explainability allows a user to quickly see when the method is producing inappropriate forecasts, reducing the chance that they will be misled. EDM has several parameters which must be tuned, all of which can have a substantial effect on the visualization. However, these parameters are easy to interpret and can be quickly configured in the tool. This issue could potentially be addressed further by introducing an automatic parameter selection algorithm. A potential impediment to adoption of our tool is the fact that, paradoxically, complexity is often associated with depth and accuracy, which leads users to place more trust in complex and difficult-to-interpret methods than in simple and interpretable ones (7). Finally, at present this tool works only on univariate data, but the EDM method can be extended to handle multivariate data so the tool could be updated to support this in the future.
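
As one illustration of the detrending step mentioned above, the sketch below removes a straight-line trend fitted by ordinary least squares before the series is passed to a forecasting method. The choice of a linear trend and the function name `linear_detrend` are assumptions made for the example; the tool itself does not prescribe a particular detrending method.

```python
def linear_detrend(y):
    """Fit y ~ intercept + slope * t by least squares and return the residuals.

    Returns (residuals, slope, intercept) so the trend can be added back
    to any forecast made on the residual series. Requires len(y) >= 2.
    """
    n = len(y)
    t_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    residuals = [v - (intercept + slope * t) for t, v in enumerate(y)]
    return residuals, slope, intercept


# A cycle riding on a rising trend (toy values, not the mortality data).
series = [0.5 * t + [0.0, 1.0, 0.0, -1.0][t % 4] for t in range(40)]
residuals, slope, intercept = linear_detrend(series)
```

A forecast made on `residuals` for time step `t` would then be shifted back by `intercept + slope * t` to return to the original scale.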

We found a single example of another tool using interactive visualization and nearest neighbors for forecasting (30). However, this tool does not find the neighbors in the embedded dynamic space, but instead in the multivariate space. The data has several variables which are relevant to forecasting energy demand, and the neighbors are single points in time which are closest to the current point in time along these variables. Like our tool, the energy tool uses visualization to highlight how, exactly, the neighbors are like the current time point. But because the tool does not consider dynamics the interactive plots differ substantially from our approach. However, it also suggests that approaching the multivariate dynamic space is a natural evolution of the work reported here.

To ensure that this tool is FAIR (31) and preserves the privacy of the data, we have developed the tool as an in-browser JavaScript application. Only the code needs to be hosted and computation is done on the client side, which means the application is easy and inexpensive to host. The web itself is a natural environment to host FAIR applications due to the ubiquity of the web-browser as both an interactive platform and an execution engine. Finally, due to the in-browser, client-side nature of the tool, the privacy of the user’s data is ensured by default. In order to explore the participative modularity of this tool, its basic elements framed by interactive explanations are also made available in an Observable Notebook at https://observablehq.com/@siliconjazz/edm-interpretable-forecasting, which has important advantages over traditional notebooks (24,32).

In conclusion, we have produced an in-browser web tool which allows users to make forecasts and explore those forecasts using interactive visualization. We hope that this work will further the idea of using visual analytics to improve the interpretability of forecasting methods, and ultimately lead to more applicable, relevant, and accurate forecasts.

## 4. Methods

### 4.1 Empirical Dynamic Modeling (EDM)

Empirical dynamic modeling (EDM) is a time-series method which aims to empirically reconstruct the state-space of a system using delay embedding (18). EDM can be used for a number of tasks, such as estimating the non-linearity of a system, but for our purposes we are interested in its use for forecasting. EDM has proven especially effective on complex, nonlinear systems, such as ecological models, but it requires a lot of data. EDM is effective at modeling univariate and multivariate data (18), but in this paper we will explore the univariate implementation. We have a value $y_t$ for each evenly spaced time point $t = 1$ to $t = T$. The first step is to perform the delay embedding: packaging each value in a vector with the $E - 1$ values which precede it in time: $\mathbf{x}_t = (y_t, y_{t-\tau}, \ldots, y_{t-(E-1)\tau})$. The $\tau$ value is a hyperparameter which allows the user to leave gaps in the embedding; if $\tau = 1$ then each value is packaged with the values which immediately precede it. Next, the algorithm finds the $n$ nearest neighbors of the most recent embedded vector $\mathbf{x}_T$ using the Euclidean distance. Here we define $t_i$ to be the time-step of the $i$th neighbor, and $x_i$ to be the embedded vector of the $i$th neighbor (i.e. $x_i = \mathbf{x}_{t_i}$). Then, the future vector for each neighbor is retrieved, which is a vector of the values which succeed each neighbor: $z_i = (z_{i,T+1}, \ldots, z_{i,T+\mathit{tp}})$, where $z_{i,T+k} = y_{t_i+k}$. $E$, $n$, and $\mathit{tp}$ are configurable hyperparameters; $\mathit{tp}$ is the forecast horizon. The algorithm then calculates a weight for each neighbor: $w_i = \exp(-\theta d_i / \bar{d})$, where $d_i$ is the Euclidean distance between $\mathbf{x}_T$ and the neighbor $x_i$, and $\bar{d}$ is the mean Euclidean distance between $\mathbf{x}_T$ and each neighbor. Another hyperparameter is $\theta$, which specifies how much the weight is affected by the distance; if $\theta = 0$ then all neighbors are weighted equally at 1. This covers the information used by the tool to draw the forecast area. 
However, this information can also be used to generate point forecasts. There are a few ways to accomplish this, but we have chosen the simplex method due to its simplicity. The simplex method calculates a series of forecasts by taking a weighted average of the neighbors’ future vectors: $\hat{y}_{T+k} = \frac{\sum_{i=1}^{n} w_i z_{i,T+k}}{\sum_{i=1}^{n} w_i}$ for $k = 1, \ldots, \mathit{tp}$.
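
The steps in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tool's JavaScript implementation: the function names `delay_embed` and `simplex_forecast` are hypothetical, time indices are 0-based, and the weight is assumed to take the exponential form exp(-θ·d_i / mean distance), which matches the description of θ and the mean distance given above.

```python
import math


def delay_embed(y, E, tau):
    """Map each 0-based time index t with a full history to (y[t], y[t-tau], ..., y[t-(E-1)*tau])."""
    start = (E - 1) * tau
    return {t: tuple(y[t - j * tau] for j in range(E)) for t in range(start, len(y))}


def simplex_forecast(y, E=2, tau=1, n=4, tp=2, theta=1.0):
    """Return (neighbor times, weights, future vectors, point forecasts)."""
    T = len(y) - 1                      # index of the most recent observation
    embedded = delay_embed(y, E, tau)
    current = embedded[T]

    # Only past points whose tp-step future has already been observed can be neighbors.
    candidates = [t for t in embedded if t + tp <= T]
    if not candidates:
        raise ValueError("series too short for the chosen E, tau and tp")

    dist = {t: math.dist(embedded[t], current) for t in candidates}
    neighbors = sorted(candidates, key=dist.get)[:n]

    # Assumed weight form: exp(-theta * d_i / mean distance); theta = 0 gives weights of 1.
    d_bar = sum(dist[t] for t in neighbors) / len(neighbors)
    weights = [math.exp(-theta * dist[t] / d_bar) if d_bar > 0 else 1.0 for t in neighbors]

    # Future vector of each neighbor: the tp values that followed it.
    futures = [[y[t + k] for k in range(1, tp + 1)] for t in neighbors]

    # Simplex point forecast: weighted mean of the neighbors' futures at each step.
    total = sum(weights)
    point = [sum(w * z[k] for w, z in zip(weights, futures)) / total for k in range(tp)]
    return neighbors, weights, futures, point


# A perfectly periodic toy series: the four nearest neighbors are exact matches.
series = [0, 1, 2, 1] * 5
neighbors, weights, futures, point = simplex_forecast(series, E=2, tau=1, n=4, tp=2)
print(neighbors)  # [3, 7, 11, 15]
print(point)      # [0.0, 1.0]
```

Returning the neighbors, weights and future vectors alongside the point forecasts mirrors the emphasis of the tool: the intermediate quantities, not only the weighted mean, are what the visualizations display.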

### 4.2 Drawing the Shaded Forecast Area

The shaded forecasting area is generated using a method based on weighted kernel density estimation. At t = T we have n neighbors, each with a future vector $z_i$ and a weight $w_i$. To keep this description simple, we shall only consider how the area is generated for a single future timestep t = T + 1, but this is easily repeatable for each t in the forecast horizon. The basic idea behind this method is that we place a Gaussian kernel centered at each neighbor’s relevant future value $z_{i,T+1}$. The height of a neighbor’s kernel is proportional to that neighbor’s weight and the width is controlled by a hyperparameter. Specifically, for the ith neighbor we have a kernel function: ![Formula][12] where f is the density function of the normal distribution with mean 0 and variance 1, s is a scaling parameter set to ensure the sum of all kernel peaks would be equal to 1 if all neighbors had a weight equal to 1, ɑ is the desired width of the kernel, and ![Graphic][13] is a constant which ensures the desired width. The “width” is the width of the density function f at probability density c. We set c = 0.0001 and ɑ = σ · kernel\_width where σ is the standard deviation of the data and kernel\_width is a hyperparameter with a default value of 1. Then, to get the overall value at a point v, we sum up the value of each kernel: ![Graphic][14]. Finally, this value is mapped linearly to a color on a color scale which gives the color in the forecast area at v for t = T + 1. This is repeated for all future time steps. For an example of how the shaded forecasting area is generated, see Figure 3.
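
The weighted kernel density idea can be sketched in Python as follows. The exact kernel formula is not reproduced in the text above, so the form used here is an assumption: each kernel is a standard normal density whose argument is rescaled so that its full width at density c equals ɑ, multiplied by the neighbor's weight and by a scale s chosen so that n unit-weight kernels have peaks summing to 1. The names `std_normal_pdf` and `forecast_density` are hypothetical.

```python
import math


def std_normal_pdf(x):
    """Density of the normal distribution with mean 0 and variance 1."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def forecast_density(v, future_values, weights, alpha, c=0.0001):
    """Summed kernel value at v for one future time step.

    future_values: each neighbor's value at that time step
    weights:       each neighbor's weight
    alpha:         desired kernel width (e.g. data standard deviation * kernel_width)
    c:             density at which the width of the standard normal is measured
    """
    n = len(future_values)
    # The standard normal density equals c at +/- half_width, so its full width there is b.
    half_width = math.sqrt(-2.0 * math.log(c * math.sqrt(2.0 * math.pi)))
    b = 2.0 * half_width
    # Scale so that n unit-weight kernels have peaks summing to 1.
    s = 1.0 / (n * std_normal_pdf(0.0))
    return sum(
        s * w * std_normal_pdf(b * (v - z) / alpha)
        for z, w in zip(future_values, weights)
    )


# Two equally weighted groups of futures (toy values): the weighted mean, 15,
# falls in a region where the summed density is close to zero.
values = [10.0, 10.0, 20.0, 20.0]
weights = [1.0, 1.0, 1.0, 1.0]
at_mode = forecast_density(10.0, values, weights, alpha=4.0)
at_mean = forecast_density(15.0, values, weights, alpha=4.0)
print(at_mode > at_mean)  # True
```

Evaluating `forecast_density` on a grid of v values for each future time step, then mapping the result linearly to opacity, gives a shaded band in which well-separated groups of futures remain visibly distinct rather than being averaged into one line.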

![Figure 3:](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2022/10/25/2022.10.21.22281384/F3.medium.gif)

Figure 3: 
An example of how the weighted kernel density estimation works, and how the resulting values are mapped onto colors for the shaded forecasting area. The example is from t = 82 and tp = 8 on the default mortality data, with the relative kernel width parameter, kernel_width, set to 1.0. A Gaussian kernel is placed at each neighbor’s value; higher weighted neighbors (represented with a darker color) correspond to taller and thinner kernels. The sum of the kernels forms the forecast distribution (black line) which is then linearly mapped onto an opacity value used to draw the color gradient.

### 4.3 Data Availability

The example data used in the tool is a combination of two CDC datasets: ‘Weekly Counts of Deaths by State and Select Causes, 2014-2019’, available at [https://data.cdc.gov/NCHS/Weekly-Counts-of-Deaths-by-State-and-Select-Causes/3yf8-kanr](https://data.cdc.gov/NCHS/Weekly-Counts-of-Deaths-by-State-and-Select-Causes/3yf8-kanr), and ‘Weekly Provisional Counts of Deaths by State and Select Causes, 2020-2022’, available at [https://data.cdc.gov/NCHS/Weekly-Provisional-Counts-of-Deaths-by-State-and-S/muzy-jte6](https://data.cdc.gov/NCHS/Weekly-Provisional-Counts-of-Deaths-by-State-and-S/muzy-jte6).

### 4.4 Code Availability

The code is available at [https://github.com/episphere/forecast](https://github.com/episphere/forecast) under the MIT license. This includes the specific code for the EpiForecast website, reusable functions to perform EDM on the web, and JavaScript classes for the interactive plots. An Observable Notebook which shows how the code can be imported and used is available at [https://observablehq.com/@siliconjazz/edm-interpretable-forecasting](https://observablehq.com/@siliconjazz/edm-interpretable-forecasting).

## Acknowledgements

None

*   Received October 21, 2022.
*   Revision received October 21, 2022.
*   Accepted October 25, 2022.


*   © 2022, Posted by Cold Spring Harbor Laboratory

This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license

## References

1.  1.Wright JH. Some observations on forecasting and policy. Int J Forecast. 2019 Jul;35(3):1186–92.
    
    
2.  2.Rutter H, Savona N, Glonti K, Bibby J, Cummins S, Finegood DT, et al. The need for a complex systems model of evidence for public health. The Lancet. 2017 Dec;390(10112):2602–4.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/s0140-6736(17)31267-9&link_type=DOI) 

3.  3.Mohammadi N, Taylor JE. Thinking fast and slow in disaster decision-making with Smart City Digital Twins. Nat Comput Sci. 2021 Dec;1(12):771–3.
    
    
4.  4.Dolfin M, Leonida L, Outada N. Modeling human behavior in economics and social science. Phys Life Rev. 2017 Dec;22–23:1–21.
    
    
5.  5.Liu C, Hoi SC, Zhao P, Sun J. Online arima algorithms for time series prediction. In: Thirtieth AAAI conference on artificial intelligence. 2016.
    
    
6.  6.Lo JH. A study of applying ARIMA and SVM model to software reliability prediction. In: 2011 International Conference on Uncertainty Reasoning and Knowledge Engineering [Internet]. Bali, Indonesia: IEEE; 2011 [cited 2022 Mar 3]. p. 141–4. Available from: [http://ieeexplore.ieee.org/document/6007794/](http://ieeexplore.ieee.org/document/6007794/)
    
    
7.  7.Green KC, Armstrong JS. Simple versus complex forecasting: The evidence. J Bus Res. 2015 Aug;68(8):1678–85.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.jbusres.2015.03.026&link_type=DOI) 

8.  8.Kandula S, Yamana T, Pei S, Yang W, Morita H, Shaman J. Evaluation of mechanistic and statistical methods in forecasting influenza-like illness. J R Soc Interface. 2018 Jul;15(144):20180174.
    
    
9.  9.Weiss HH. The SIR model and the foundations of public health. Mater Mat. 2013;0001–17.
    
    
10. 10.Moein S, Nickaeen N, Roointan A, Borhani N, Heidary Z, Javanmard SH, et al. Inefficiency of SIR models in forecasting COVID-19 epidemic: a case study of Isfahan. Sci Rep. 2021 Dec;11(1):4725.
    
    
11. 11.Urban MC, Bocedi G, Hendry AP, Mihoub JB, Pe’er G, Singer A, et al. Improving the forecast for biodiversity under climate change. Science. 2016 Sep 9;353(6304):aad8466.
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6Mzoic2NpIjtzOjU6InJlc2lkIjtzOjE2OiIzNTMvNjMwNC9hYWQ4NDY2IjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjIvMTAvMjUvMjAyMi4xMC4yMS4yMjI4MTM4NC5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 

12. Funk S, Camacho A, Kucharski AJ, Lowe R, Eggo RM, Edmunds WJ. Assessing the performance of real-time epidemic forecasts: A case study of Ebola in the Western Area Region of Sierra Leone, 2014–15 [Internet]. Epidemiology; 2017 Aug [cited 2022 Mar 3]. Available from: [http://biorxiv.org/lookup/doi/10.1101/177451](http://biorxiv.org/lookup/doi/10.1101/177451)

13. Holmdahl I, Buckee C. Wrong but Useful — What Covid-19 Epidemiologic Models Can and Cannot Tell Us. N Engl J Med. 2020 Jul 23;383(4):303–5.
    

14. Perretti CT, Munch SB, Sugihara G. Model-free forecasting outperforms the correct mechanistic model for simulated and experimental data. Proc Natl Acad Sci. 2013 Mar 26;110(13):5253–7.
    

15. Lagergren J, Reeder A, Hamilton F, Smith RC, Flores KB. Forecasting and Uncertainty Quantification Using a Hybrid of Mechanistic and Non-mechanistic Models for an Age-Structured Population Model. Bull Math Biol. 2018 Jun;80(6):1578–95.

16. Sundar S, Schwab P, Tan JZH, Romero-Brufau S, Celi LA, Wangmo D, et al. Forecasting the COVID-19 Pandemic: Lessons learned and future directions [Internet]. Public and Global Health; 2021 Nov [cited 2022 Mar 3]. Available from: [http://medrxiv.org/lookup/doi/10.1101/2021.11.06.21266007](http://medrxiv.org/lookup/doi/10.1101/2021.11.06.21266007)

17. Hyndman RJ, Athanasopoulos G. Forecasting: principles and practice. OTexts; 2018.

18. Chang CW, Ushio M, Hsieh C hao. Empirical dynamic modeling for beginners. Ecol Res. 2017 Nov;32(6):785–96.

19. Siami-Namini S, Tavakoli N, Siami Namin A. A Comparison of ARIMA and LSTM in Forecasting Time Series. In: 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA) [Internet]. Orlando, FL: IEEE; 2018 [cited 2022 Mar 3]. p. 1394–401. Available from: [https://ieeexplore.ieee.org/document/8614252/](https://ieeexplore.ieee.org/document/8614252/)

20. Siami-Namini S, Tavakoli N, Namin AS. The Performance of LSTM and BiLSTM in Forecasting Time Series. In: 2019 IEEE International Conference on Big Data (Big Data) [Internet]. Los Angeles, CA, USA: IEEE; 2019 [cited 2022 Mar 3]. p. 3285–92. Available from: [https://ieeexplore.ieee.org/document/9005997/](https://ieeexplore.ieee.org/document/9005997/)

21. Lu Y, Steptoe M, Buchanan V, Cooke N, Maciejewski R. Evaluating Forecasting, Knowledge, and Visual Analytics. In: 2021 IEEE Workshop on TRust and EXpertise in Visual Analytics (TREX) [Internet]. New Orleans, LA, USA: IEEE; 2021 [cited 2022 Mar 3]. p. 32–9. Available from: [https://ieeexplore.ieee.org/document/9619888/](https://ieeexplore.ieee.org/document/9619888/)

22. Arvan M, Fahimnia B, Reisi M, Siemsen E. Integrating human judgement into quantitative forecasting methods: A review. Omega. 2019 Jul;86:237–52.

23. Alvarado-Valencia J, Barrero LH, Önkal D, Dennerlein JT. Expertise, credibility of system forecasts and integration methods in judgmental demand forecasting. Int J Forecast. 2017 Jan;33(1):298–313.

24. Perkel JM. Reactive, reproducible, collaborative: computational notebooks evolve. Nature. 2021 May 6;593(7857):156–7.
    

25. Ye H, Beamish RJ, Glaser SM, Grant SCH, Hsieh C hao, Richards LJ, et al. Equation-free mechanistic ecosystem forecasting using empirical dynamic modeling. Proc Natl Acad Sci. 2015 Mar 31;112(13):E1569–76.
    

26. Center for Disease Control. Weekly Provisional Counts of Deaths by State and Select Causes, 2020-2022 [Internet]. 2022 [cited 2022 Mar 3]. Available from: [https://data.cdc.gov/NCHS/Weekly-Provisional-Counts-of-Deaths-by-State-and-S/muzy-jte6](https://data.cdc.gov/NCHS/Weekly-Provisional-Counts-of-Deaths-by-State-and-S/muzy-jte6)

27. Center for Disease Control. Weekly Counts of Deaths by State and Select Causes, 2014-2019 [Internet]. Available from: [https://data.cdc.gov/NCHS/Weekly-Counts-of-Deaths-by-State-and-Select-Causes/3yf8-kanr](https://data.cdc.gov/NCHS/Weekly-Counts-of-Deaths-by-State-and-Select-Causes/3yf8-kanr)

28. Nowak S, Bartram L, Haegeli P. Designing for Ambiguity: Visual Analytics in Avalanche Forecasting. In: 2020 IEEE Visualization Conference (VIS) [Internet]. Salt Lake City, UT, USA: IEEE; 2020 [cited 2022 Mar 3]. p. 81–5. Available from: [https://ieeexplore.ieee.org/document/9331311/](https://ieeexplore.ieee.org/document/9331311/)

29. Hsieh C, Anderson C, Sugihara G. Extending Nonlinear Analysis to Short Ecological Time Series. Am Nat. 2008 Jan;171(1):71–80.
    

30. Grimaldo AI, Novak J. Combining Machine Learning with Visual Analytics for Explainable Forecasting of Energy Demand in Prosumer Scenarios. Procedia Comput Sci. 2020;175:525–32.

31. Wilkinson MD, Dumontier M, Aalbersberg IjJ, Appleton G, Axton M, Baak A, et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data. 2016 Dec;3(1):160018.

32. Pimentel JF, Murta L, Braganholo V, Freire J. A Large-Scale Study About Quality and Reproducibility of Jupyter Notebooks. In: 2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR) [Internet]. Montreal, QC, Canada: IEEE; 2019 [cited 2022 Mar 3]. p. 507–17. Available from: [https://ieeexplore.ieee.org/document/8816763/](https://ieeexplore.ieee.org/document/8816763/)

 [1]: /embed/inline-graphic-1.gif
 [2]: /embed/inline-graphic-2.gif
 [3]: /embed/inline-graphic-3.gif
 [4]: /embed/inline-graphic-4.gif
 [5]: /embed/inline-graphic-5.gif
 [6]: /embed/inline-graphic-6.gif
 [7]: /embed/inline-graphic-7.gif
 [8]: /embed/inline-graphic-8.gif
 [9]: /embed/inline-graphic-9.gif
 [10]: /embed/graphic-3.gif
 [11]: /embed/inline-graphic-10.gif
 [12]: /embed/graphic-4.gif
 [13]: /embed/inline-graphic-11.gif
 [14]: /embed/inline-graphic-12.gif