Real time scalable data acquisition of COVID-19 in six continents through PySpark - a big data tool =================================================================================================== * Tanvi S. Patel * Daxesh P. Patel * Chirag N. Patel ## Abstract Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) was declared as a global emergency in January 2020 due to its pandemic outbreak. To examine this Coronavirus disease 2019 (COVID-19) effects various data are being generated through different platforms. This study was focused on the clinical data of COVID-19 which relied on python programming. Here, we proposed a machine learning approach to provide a insights into the COVID-19 information. PySpark is a machine learning approach which also known as Apache spark an accurate tool for the searching of results with minimum time intervals as compare to Hadoop and other tools. World Health Organization (WHO) started gathering corona patients’ data from last week of the February 2020. On March 11, 2020, the WHO declared COVID-19 a global pandemic. The cases became more evident and common after mid-March. This paper used the live owid (our world in data) dataset and will analyse and find out the following details on the live COVID-19 dataset. (1) The daily Corona virus scenario on various continents using PySpark in microseconds of Processor time. (2) After the various antibodies have been implemented, how they impact new cases on a regular basis utilizing various graphs. (3) Tabular representation of COVID-19 new cases in all the continents. Key words * COVID-19 * apache spark * PySpark * live data analysis * big data analysis * continents ## Introduction In early December 2019, an unidentified virus known as coronavirus was discovered in Wuhan, China, among a group of patients suffering from severe tuberculosis, and it quickly spread to other cities. The modern infection severe intense respiratory syndrome coronavirus 2 (SARS-CoV-2) was declared by the World Health Organization, and the illness caused by this unused infection was called coronavirus disease 2019 (COVID-19)(You et al. 2020)(Patel et al. 2020). In January 2020, the World Health Organization (WHO) declared a global emergency, and the situation quickly escalated, prompting WHO to declare COVID-19 a global pandemic on March 11, 2020 (Cucinotta and Vanelli 2020; Patel et al. 2021a). COVID-19 is spread predominantly from person to person, most commonly through respiratory droplets from coughing, sneezing, or talking. All the researchers, biologist, Pharmaceuticals and doctors were trying to find the medicine, vaccines and treatment to control COVID-19 (Patel et al. 2021b). In December 2020, The U.S. Food and Drug Administration (USFDA) granted an Emergency Use Authorization (EUA) for the Pfizer-BioNTech COVID-19 Vaccine. The FDA granted an EUA for the use of the Moderna COVID-19 Vaccine on December 18, 2020. The present study provides an insights of the COVID-19 new cases curve after antibody vaccines were introduced using machine learning approach (Ong et al. 2020). Since MS Excel has a 255-row limit for processing and visualizing data, we can use tableau, Hadoop MapReduce, or Apache Spark for big data analysis (Gupta et al. 2017). However, when compared to other machine learning tools, Apache Spark provides us with quick, reliable, and accurate results. Apache spark can also handle Parquet, ORC, JSON, CSV, and text format. There are many ways of data visualization and analysis, such as theoretical, graphical and tabular forms. Graphs are striking psychological resources because they rapidly and efficiently display information. Often, when provided by a graph, data can be better interpreted than by a hypothesis since a pattern or comparison can be unveiled by the graph and chart. The Python scripting language offers numerous libraries, such as pandas, to construct very powerful graphs and charts (Gupta et al. 2017; Stancin and Jovic 2019). PySpark includes comprehensive and quality ways to run zeta bytes and peta bytes of data on parallel clusters a lot faster than conventional python applications for machine learning applications. PySpark offers methods for statistical analysis and a ML (Machine Learning) pipeline. In specific, Apache spark is emerged as popular open framework that support the various data visualization techniques like different charts (Bar, line, pie, etc). Apache Spark provides JAVA, Scala, R and python API’s but among all of these, python is simple scripting language and good for machine learning due to its versatile existence. Python’s popularity in the field of data science has reached new heights, especially in terms of freely accessible frameworks and libraries. Among the many free libraries available, pandas, NumPy, and Matplotlib are well-known and widely used in the field of data analysis (Ghaffar et al. 2015; Assefi et al. 2017; Gupta et al. 2017; Magar et al. 2021). In this COVID-19 challenge, everyday millions of data is being generated and these data are continuously employed to established different preventions and control steps. COVID-19 datasets are increasing in size day by day as well as it is becoming sophisticated too. It is necessary to build solutions using statistical methods to leverage this wealth of data. Due to the rise in data and new data on a regular basis, new scripting codes every day need to be analysed to create new visualizations of the data as well as new graphs. In order to solve this issue, this paper incorporates live data analysis so that visualization is often updated every day as data changes and there is no need for new scripts every day. This study focuses on the daily cases and daily deaths in various continents due to COVID-19 and makes statistical graphs for all continents. This work also highlights confirmed cases and death ratios in a tabular format, which allows for easy comparison of the entire world’s continents in a single table. Using PySpark, compare all the Earth’s continents in a graph and table. ## Methodology ### Data curation In order to perform the statistical analysis, the data of COVID-19 collected from all over in world. It includes confirmed cases, new cases, deaths, hospitalizations, testing, and vaccinations, throughout the duration of the COVID-19 pandemic. COVID-19 cases and death were collected from COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University (Badr et al. 2020; Magar et al. 2021). European countries vaccination against COVID-19 details were obtained from the European Surveillance System (TESSy) webpage (Ammon and Faensen 2009). Vaccinations against COVID-19 data was retrieved by the Data team from official reports of all across the globe. Other data was selected from a variety of sources like United Nations, World Bank, Global Burden of Disease, Blavatnik School of Government and more. Web-based application was used for the utilization of the COVID-19 data with live monitoring (Ghaffar et al. 2015). Figure-1 depicts the workflow of proposed study which comprises of various data collection and administered. The raw COVID-19 data was customized using text editor with normalization with the filtered by continent accountability. ![Figure-1](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2021/07/06/2021.07.04.21259983/F1.medium.gif) [Figure-1](http://medrxiv.org/content/early/2021/07/06/2021.07.04.21259983/F1) Figure-1 Flowchart of proposed study. ### Machine learning Approach: Apache Spark with API python Apache Spark is a fast and trustable general-purpose cluster computing framework. This framework offers API (application programming interfaces) in Java, Python, Scala and R script. Spark is an Apache incubator cluster computing system that is focused in making data processing easier with top agile application in coding as well as running. It is very fast at both running and coding (Salloum et al. 2016; Assefi et al. 2017). This applicability was applied for the current study which proposed novel data structure of COVID-19 using Spark core and upper-level libraries such as Spark’s MLlib for machine learning, GraphX for graph analysis, Spark Streaming for stream processing, and Spark SQL for structure data processing make up Apache Spark (Assefi et al. 2017; Gupta et al. 2017). Real-time data, such as streaming, video, and social media data, was also analysed by Apache Spark. The promising and accurate data processing was performed through Apache Spark which includes the 7500 rows at a single instance or operation with meticulously. Figure-2 shows the Apache spark architecture incorporates four APIs with correlation to COVID-19 dataset to analyse it on case by case basis with numbers of deaths and vaccines are being employed till now. The average speed of the Apache Spark is 75 thousand rows and 59 columns in 4.46 μs of total time and 589 μs of wall time (Binti Hamzah et al. 2020). ![Figure-2](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2021/07/06/2021.07.04.21259983/F2.medium.gif) [Figure-2](http://medrxiv.org/content/early/2021/07/06/2021.07.04.21259983/F2) Figure-2 Apache Spark and API (Application Programming Interface). Python’s prominence is increasing, widely in the all areas like Bio informatics and data analysis as well as in programming too (Latif et al. 2020). As a result, the number of free frameworks and libraries for use is growing. Various Python big data research libraries, such as pandas, matplotlib, NumPy, and seaborn, are also used in this analysis (Figure-3). Python libraries are commonly used to visualize data, and visualization tool is becoming simpler with the use of these free libraries. ![Figure-3](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2021/07/06/2021.07.04.21259983/F3.medium.gif) [Figure-3](http://medrxiv.org/content/early/2021/07/06/2021.07.04.21259983/F3) Figure-3 Python and it’s used libraries. ***Formulas*** ![Formula][1] ### Script used to find charts The given script shows how a small amount of code can create a graph with real time data rows of data in a short amount of time. Real time data set was obtained from [https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/owid-covid-data](https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/owid-covid-data) .csv. PySpark Scripting code was uploaded on [https://github.com/pateltanvi2992/Data-Analysis-of-Covid-19-using-Spark-(Patel2021)](https://github.com/pateltanvi2992/Data-Analysis-of-Covid-19-using-Spark-(Patel2021)). ## Results and Discussion The vast volume of daily produced data has pushed us beyond the limits of conventional processing capacities, necessitating the adoption of advanced computing infrastructures that can process data in parallel and distributed ways. The generated data consists of huge in size and requires innumerable memory space, so Apache Spark was adopted to rectify this problem. The Apache spark platform and its library provides the exploration of qualitative and quantitative attributes collection. With the help of Python large-scale real-world data experiments of COVID-19 were applied for this study. These data were represented in six different continents. Given demographic table-1 is describe the whole scenario of the COVID-19 in six continents. The limitation of this study in our analysis we did not consider data from the seventh continent Antarctica as there is not any permeant resident staying although there is at least 36 people confirmed with COVID infection ([https://www.barrons.com/](https://www.barrons.com/); [https://www.coolantarctica.com/](https://www.coolantarctica.com/)). View this table: [Table-1](http://medrxiv.org/content/early/2021/07/06/2021.07.04.21259983/T1) Table-1 Covid-19 Tabular data (till May-15th-2021). ### Asia Asia has a total of 23225625.0 COVID-19 positive cases and 22836076.0 recovered cases as of May 15th, 2021. According to this, the mortality rate will be 1.6359 percent until May 15th, 2021. From February 24, 2020 to May 15th, 2021, the daily recover cases and daily confirm cases are depicted in Figure-4 (B) (scatter graph). On September 16, 2020, the highest number of new cases were registered in India, with 97894.0 (new cases) out of 1136613.0 (new tests). The COVID-19 cases peaked in September 2020, according to Figure-5 (B) (histogram), and then started to decline. Total tests performed in Asia were 373625749.0, with 23225625.0 confirmed cases. As a result of this, 6.2163 percent of the cases tested positive. ![Figure-4](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2021/07/06/2021.07.04.21259983/F4.medium.gif) [Figure-4](http://medrxiv.org/content/early/2021/07/06/2021.07.04.21259983/F4) Figure-4 COVID-19 cases distribution based on recovery rate from various continents. ### Africa Africa has total COVID-19 positive cases till May 15th 2021 are 4027046.0 and recover are 3912657.0. On basis of that the death ratio is 2.6701 %. Figure-4 (E) is showing the daily recover cases and positive cases. Figure-5 (E) is showing the positive cases going from 25th February 2020 to May 15th 2021. Although Africa has 17.2% of the world’s population, it only has about 5% of COVID-19 cases diagnosed and 3% of COVID-19-related mortality (Bamgboye et al. 2020). ![Figure-5](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2021/07/06/2021.07.04.21259983/F5.medium.gif) [Figure-5](http://medrxiv.org/content/early/2021/07/06/2021.07.04.21259983/F5) Figure-5 Histographic representation of COVID-19 updated cases in various regions. ### Europe Europe has total COVID-19 positive cases till May 15th 2021 are 36470789.0 and recover cases are 35620243.0. On basis of that the death ration in Europe was 2.3321%. The maximum cases reported on 2nd November 2020 in France which was 106091.0. Figure-4 (C) shows the recover cases and positive cases through the scatter graph. The total test occur in Europe was 448349376.0. Figure-5 (C) is showing the graph of positive cases from 11th March 2020 to 13th March 2021. ### North America North America has total COVID-19 positive cases till May 15th, 2021 are 33850076.0 and recover cases are 33070581.0. The percentage of the death ratio in North America was 2.29889 %. Figure-4 (A) (scattered graph) is showing the daily new cases and daily recover cases from 24th February 2020 to May 15th, 2021. The maximum new confirm cases reported in North America on 2nd January 2021 in United states was 300416.0 among 108349.0 new tests. The total test occurred in North America till May 15th, 2021 was 363400300.0. So, 9.3148% positive cases among all tests. Figure-5 (A) (histogram) is showing the cases going from February 24th, 2020 to May 15th, 2021 (Control 2021). ### South America South America has total COVID-19 positive cases till May 15th, 2021 are 19191051.0 and recover cases are 18693169.0. Based on this the death ratio of the South America was 2.5853%. Figure-4 (D) (scatter graph) is showing the daily new cases and daily recover cases. The maximum new cases reported in South America 87843.0 in Brazil on 7th, January 2021. The total test occur in South America was 4245255498.0. Figure-5 (D) is showing the graph going for daily positive cases in South America from 1st January 2020 to May 15th, 2021 (Control 2021). ### Oceania Oceania has total COVID-19 positive cases till May 15th, 2021 are 33454.0 and recover cases are 31901.0. The death ratio of the COVID-19 in Oceania was 2.8636%. Figure-4 (F) shows the recover cases and positive cases. The total test occur was 11088450.0 so the ration of positive cases among tests are 0.3017%. Figure-5 (F) is showing the graph cases occur from 26th January 2020 to May 15th, 2021. ### Effect of COVID-19 Vaccine and out of vaccination campaigns The COVID-19 Vaccine Program in the United States was started on December 14, 2020. About 87.49 million individuals, or 26.4% of the population of the United States, have received at least one dose of vaccine as of March 26, 2021. Around 39.9 million people, or 12.07% of the US population, have received fully vaccination. Figure-6 (A) is showing that still 73.6 % US population requires COVID-19 vaccination. Figure-6 (B) and (C) (line chart) reveals that after Jan-2021 the chart of new cases is curve and new cases are decreasing day by day. Mostly there are three kind of vaccines available in the USA namely Pfizer-BioNTech, Moderna And J&J (Control 2021). ![Figure-6](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2021/07/06/2021.07.04.21259983/F6.medium.gif) [Figure-6](http://medrxiv.org/content/early/2021/07/06/2021.07.04.21259983/F6) Figure-6 A comparison of case-based antibodies of COVID-19 observed in the U.S. ## Conclusion The current work represents a novel code - scripting big data analysis of COVID-19 using Apache Spark and Python, as well as a machine learning approach that any researcher can use. The advantage of this study is real time data set rows were examined and endeavoured at the speed of microseconds of time. These data comprised of six distinct continents of the world. Based on this data we concluded the death ratio, positive COVID cases rate changes were observed very thoroughly, Although Africa has the lowest death rate, it is not statistically significantly lower than other continents. Interestingly, we observed that the Oceanian continent has a higher death rate than the other continents, even though the total number of COVID testing is very low. In Asia, 6% of the total COVID testing population tested positive for COVID, whereas 9% of the total COVID testing population tested positive for COVID in North America. Even though the total number of COVID testing population in North as well South America is greater than either Asia or other continents. We also observed that after vaccination, positive COVID cases begin to decrease dramatically, which is a result of the ongoing successful vaccination campaign program in the United States and the North American continent (Figure-6). In Conclusion, this Apache Spark and Python based a machine learning approach implements the time dependent data availability with live coverage of updates with personal hygiene and precautions towards COVID-19. Represented machine learning-based PySpark analysis is also useful for healthcare research institutes in clinical and biostatistics rapid analysis, allowing them to conduct a quick, efficient, visualizing, and agile study on the COVID-19 outbreak. ## Data Availability https://github.com/owid/covid-19-data/blob/master/public/data ## Compliance with Ethical Standards The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. This article does not contain any studies with human participants or animals performed by any of the authors. This work relies on the online data available from Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. ## Footnotes * **Author Approvals** All authors have seen and approved the manuscript, and that it hasn’t been accepted or published elsewhere * **Authors have no competing interests to declare** * Received July 4, 2021. * Revision received July 4, 2021. * Accepted July 6, 2021. * © 2021, Posted by Cold Spring Harbor Laboratory This pre-print is available under a Creative Commons License (Attribution-NonCommercial-NoDerivs 4.0 International), CC BY-NC-ND 4.0, as described at [http://creativecommons.org/licenses/by-nc-nd/4.0/](http://creativecommons.org/licenses/by-nc-nd/4.0/) ## References 1. [1]. M. You, Z. Wu, Y. Yang, J. Liu, D. Liu, Open Forum Infectious Diseases Spread of Coronavirus 2019 From Wuhan to Rural Villages in the Hubei Province, Open Forum Infect. Dis. (2020). [https://doi.org/10.1093/ofid/ofaa228](https://doi.org/10.1093/ofid/ofaa228). 2. [2]. C.N. Patel, S.P. Kumar, H.A. Pandya, R.M. Rawal, Identification of potential inhibitors of coronavirus hemagglutinin-esterase using molecular docking, molecular dynamics simulation and binding free energy calculation, Mol. Divers. (2020). [https://doi.org/10.1007/s11030-020-10135-w](https://doi.org/10.1007/s11030-020-10135-w). 3. [3]. C.N. Patel, D. Goswami, D.G. Jaiswal, R.M. Parmar, H.A. Solanki, H.A. Pandya, Pinpointing the potential hits for hindering interaction of SARS-CoV-2 S-protein with ACE2 from the pool of antiviral phytochemicals utilizing molecular docking and molecular dynamics (MD) simulations., J. Mol. Graph. Model. 105 (2021) 107874. [https://doi.org/10.1016/j.jmgm.2021.107874](https://doi.org/10.1016/j.jmgm.2021.107874). 4. [4]. D. Cucinotta, M. Vanelli, WHO Declares COVID-19 a Pandemic, Acta Biomed. 91 (2020) 157–160. [https://doi.org/10.23750/abm.v91i1.9397](https://doi.org/10.23750/abm.v91i1.9397). [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.23750/abm.v91i1.9397&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F07%2F06%2F2021.07.04.21259983.atom) 5. [5]. C.N. Patel, D. Goswami, K. Sivakumar, H.A. Pandya, Repurposing of anticancer phytochemicals for identifying potential fusion inhibitor for SARS-CoV-2 using molecular docking and molecular dynamics (MD) simulations, (2021). [https://doi.org/10.1080/07391102.2021.1902393](https://doi.org/10.1080/07391102.2021.1902393). 6. [6]. E. Ong, M.U. Wong, A. Huffman, Y. He, COVID-19 coronavirus vaccine design using reverse vaccinology and machine learning, BioRxiv. (2020). [https://doi.org/10.1101/2020.03.20.000141](https://doi.org/10.1101/2020.03.20.000141). 7. [7]. A. Gupta, H.K. Thakur, R. Shrivastava, P. Kumar, S. Nag, A Big Data Analysis Framework Using Apache Spark and Deep Learning, IEEE Int. Conf. Data Min. Work. ICDMW. 2017-Novem (2017) 9–16. [https://doi.org/10.1109/ICDMW.2017.9](https://doi.org/10.1109/ICDMW.2017.9). 8. [8]. I. Stancin, A. Jovic, An overview and comparison of free Python libraries for data mining and big data analysis, 2019 42nd Int. Conv. Inf. Commun. Technol. Electron. Microelectron. MIPRO 2019 - Proc. (2019) 977–982. [https://doi.org/10.23919/MIPRO.2019.8757088](https://doi.org/10.23919/MIPRO.2019.8757088). 9. [9]. A. Ghaffar S. & Tariq, R. Soomro, A.G. Shoro, & Tariq, Big Data Analysis: Ap Spark Perspective, Type Double Blind Peer Rev. Int. Res. J. Publ. Glob. Journals Inc. 15 (2015). 10. [10]. M. Assefi, E. Behravesh, G. Liu, A.P. Tafti, Big data machine learning using apache spark MLlib, Proc. - 2017 IEEE Int. Conf. Big Data, Big Data 2017. 2018-Janua (2017) 3492–3498. [https://doi.org/10.1109/BigData.2017.8258338](https://doi.org/10.1109/BigData.2017.8258338). 11. [11]. R. Magar, P. Yadav, A. Barati Farimani, Potential neutralizing antibodies discovered for novel corona virus using machine learning, Sci. Rep. 11 (2021) 5261. [https://doi.org/10.1038/s41598-021-84637-4](https://doi.org/10.1038/s41598-021-84637-4). 12. [12]. H.S. Badr, H. Du, M. Marshall, E. Dong, M.M. Squire, L.M. Gardner, Association between mobility patterns and COVID-19 transmission in the USA: a mathematical modelling study, Lancet Infect. Dis. 20 (2020) 1247–1254. [https://doi.org/10.1016/S1473-3099(20)30553-3](https://doi.org/10.1016/S1473-3099(20)30553-3). 13. [13]. A. Ammon, D. Faensen, Surveillance von Infektionskrankheiten auf europäischer Ebene, Bundesgesundheitsblatt - Gesundheitsforsch. - Gesundheitsschutz. 52 (2009) 176–182. [https://doi.org/10.1007/s00103-009-0759-y](https://doi.org/10.1007/s00103-009-0759-y). 14. [14]. S. Salloum, R. Dautov, X. Chen, P.X. Peng, J.Z. Huang, Big data analytics on Apache Spark, Int. J. Data Sci. Anal. 1 (2016) 145–164. [https://doi.org/10.1007/s41060-016-0027-9](https://doi.org/10.1007/s41060-016-0027-9). 15. [15]. F.A. Binti Hamzah, C.H. Lau, H. Nazri, D.C. Ligot, G. Lee, C.L. Tan, et al., CoronaTracker: World-wide Covid-19 outbreak data analysis and prediction, Bull. World Health Organ. (2020) Submitted. 16. [16]. S. Latif, M. Usman, W. Iqbal, J. Qadir, S. Manzoor, G. Tyson, I. Castro, A. Razi, M.N. Kamel Boulos, A. Weller, J. Crowcroft, Leveraging Data Science To Combat COVID- 19: A Comprehensive Review TESCON (Tools for Enforcement of Smart Contracts) View project CONTRIVE View project Leveraging Data Science To Combat COVID- 19: A Comprehensive Review, 1 (2020) 85–103. 17. [17]. T. Patel, GitHub - pateltanvi2992/Data-Analysis-of-Covid-19-using-Spark-: Data Analysis of Covid-19(corona virus data) using Spark(Anaconda-Jupyter notebook), (2021). [https://github.com/pateltanvi2992/Data-Analysis-of-Covid-19-using-Spark](https://github.com/pateltanvi2992/Data-Analysis-of-Covid-19-using-Spark) (accessed March 29, 2021). 18. [18].[https://www.coolantarctica.com/](https://www.coolantarctica.com/), The Antarctic Population - Who lives in Antarctica?, (n.d.). [https://www.coolantarctica.com/Antarcticafactfile/science/can\_you\_live\_in\_antarctica.php](https://www.coolantarctica.com/Antarcticafactfile/science/can\_you_live_in_antarctica.php) (accessed March 29, 2021). 19. [19].[https://www.barrons.com/](https://www.barrons.com/), The Stock Market Is in a Manic Bull Market. How Long It Will Last—and What to Buy. | Barron’s, (n.d.). [https://www.barrons.com/articles/the-stock-market-is-in-a-manic-bull-market-how-long-it-will-lastand-what-to-buy-51616760007](https://www.barrons.com/articles/the-stock-market-is-in-a-manic-bull-market-how-long-it-will-lastand-what-to-buy-51616760007) (accessed March 29, 2021). 20. [20]. E.L. Bamgboye, J.A. Omiye, O.J. Afolaranmi, M.R. Davids, E.K. Tannor, S. Wadee, A. Niang, A. Were, S. Naicker, COVID-19 Pandemic: Is Africa Different?, J. Natl. Med. Assoc. (2020) 1–12. [https://doi.org/10.1016/j.jnma.2020.10.001](https://doi.org/10.1016/j.jnma.2020.10.001). 21. [21]. D. Control, First Month of COVID-19 Vaccine Safety Monitoring — United States, 70 (2021) 283–288. [1]: /embed/graphic-4.gif