Abstract
Background Incidence of long COVID in the elderly is difficult to estimate and can be under-reported. While long COVID is sometimes considered a novel disease, many viral or bacterial infections have been known to cause prolonged illnesses. We postulate that some influenza patients might develop residual symptoms that would satisfy the diagnostic criteria for long COVID, a condition we call “long Flu”. In this study, we estimate the incidence of long COVID and long Flu among Medicare patients using the World Health Organization (WHO) consensus definition. We compare the incidence, symptomatology, and healthcare utilization between long COVID and long Flu patients.
Methods and Findings This is a cohort study of Medicare (the U.S. federal health insurance program) beneficiaries over 65. ICD-10-CM codes were used to capture COVID-19, influenza and residual symptoms. Long COVID was identified by a) the designated long COVID-19 code B94.8 (code-based definition), or b) any of 11 symptoms identified in the WHO definition (symptom-based definition), from one to 3 months post infection. A symptom would be excluded if it occurred in the year prior to infection. Long Flu was identified in influenza patients from the combined 2018 and 2019 Flu seasons by the same symptom-based definition for long COVID. Long COVID and long Flu were compared in four outcome measures: a) hospitalization (any cause), b) hospitalization (for long COVID symptom), c) emergency department (ED) visit (for long COVID symptom), and d) number of outpatient encounters (for long COVID symptom), adjusted for age, sex, race, region, Medicare-Medicaid dual eligibility status, prior-year hospitalization, and chronic comorbidities. Among 2,071,532 COVID-19 patients diagnosed between April 2020 and June 2021, symptom-based definition identified long COVID in 16.6% (246,154/1,479,183) and 29.2% (61,631/210,765) of outpatients and inpatients respectively. The designated code gave much lower estimates (outpatients 0.49% (7,213/1,479,183), inpatients 2.6% (5,521/210,765)). Among 933,877 influenza patients, 17.0% (138,951/817,336) of outpatients and 24.6% (18,824/76,390) of inpatients fit the long Flu definition. Long COVID patients had higher incidence of dyspnea, fatigue, palpitations, loss of taste/smell and neurocognitive symptoms compared to long Flu. Long COVID outpatients were more likely to have any-cause hospitalization (31.9% (74,854/234,688) vs. 26.8% (33,140/123,736), odds ratio 1.06 (95% CI 1.05-1.08, p<0.001)), and more outpatient visits than long Flu outpatients (mean 2.9(SD 3.4) vs. 2.5(SD 2.7) visits, incidence rate ratio 1.09 (95% CI 1.08-1.10, p<0.001)). 
There were fewer ED visits in long COVID patients, probably because of reduced ED usage during the pandemic. The main limitation of our study is that the diagnosis of long COVID is not independently verified.
Conclusions Relying on specific long COVID diagnostic codes results in significant under-reporting. We observed that about 30% of hospitalized COVID-19 patients developed long COVID. In a similar proportion of patients, long COVID-like symptoms (long Flu) can be observed after influenza, but there are notable differences in symptomatology between long COVID and long Flu. The impact of long COVID on healthcare utilization is higher than long Flu.
Why was this study done?
The quoted incidence of long COVID varies widely because of differences in definition and measurement method. Long COVID in the elderly is likely to be under-reported because they are less likely to respond to surveys, and symptoms may be confused with other chronic diseases. We describe a method of identifying long COVID in the elderly using a standard definition.
Lingering ill health after infections is not limited to COVID-19. We postulate that some patients may fit the diagnostic criteria of long COVID after a bout of influenza. We call this condition “long Flu”. Comparing and contrasting long COVID and long Flu may shed light on the understanding of long COVID, a disease still shrouded in mystery.
What did the researchers do and find?
We applied the World Health Organization consensus clinical definition to identify long COVID among 2 million Medicare patients who were diagnosed with COVID-19 between April 2020 and June 2021.
We applied the symptom-based long COVID definition to almost 900,000 influenza patients during the 2018 and 2019 Flu seasons to identify long Flu.
Long COVID occurred in 16.6% of outpatients and 29.2% of inpatients. The corresponding rates for long Flu were 17% and 24.6%. If one had relied solely on the designated diagnostic code to identify long COVID, the estimated rates of long COVID would be 0.5% and 2.6% among outpatients and inpatients respectively, far below the rates reported in most studies.
Despite the similar overall incidence rates, long COVID patients suffered more often from difficulty in breathing, fatigue, palpitations, loss of taste/smell, memory problems, cognitive impairment and sleep disturbance than long Flu patients.
Long COVID patients were also more likely to be admitted to hospital and had more outpatient visits on average than long Flu patients.
What do these findings mean?
The use of designated long COVID diagnostic codes alone is likely to result in gross under-reporting.
A similar proportion of influenza patients suffer from a prolonged illness resembling long COVID, but there are notable differences in the incidence of individual symptoms between long COVID and long Flu.
Long COVID is associated with a higher level of healthcare utilization than long Flu. This means that long COVID is likely to have a bigger impact on the individual’s health as well as on society as a whole.
1. Introduction
Most Coronavirus Disease 2019 (COVID-19) patients recover completely after an infection with the SARS-CoV-2 virus. However, a proportion suffer from persistent health issues after the acute phase of COVID-19 [1-7]. Various names have been used to describe this condition, including long COVID, long-haulers, long-term effects of COVID-19, post-COVID syndrome, chronic COVID syndrome, post-COVID conditions and post-acute sequelae of SARS-CoV-2 infection (PASC). We shall use long COVID in this report. Symptoms reported by long COVID patients range from fatigue, dyspnea and loss of smell to “brain fog”. The reported incidence of long COVID varies widely between studies, with the majority falling between 10% and 30% [8-19]. According to one estimate, up to 23 million people in the U.S. may have developed long COVID as of February 2022 [20]. Another study estimated that at least 3-5 million U.S. adults have activity-limiting long COVID [21].
While it is known that elderly patients are more prone to develop severe COVID-19, some studies have also identified age as a risk factor for long COVID [22, 23]. So far, relatively few long COVID studies have focused on the elderly. Long COVID can be under-reported in the elderly population because older patients may be less troubled by the symptoms, or less ready to report them, than younger people [24]. They may also be less likely to participate in Internet-based research or respond to questionnaires. Moreover, long COVID symptoms may be masked by or attributed to existing chronic diseases. One study finds that almost a third of COVID-19 patients over 65 years developed one or more new or persistent clinical sequelae [25]. Another study reports significant deterioration in quality of life and functional decline in elderly patients 6 months after COVID-19 [26]. More information is needed to understand the impact of long COVID on the elderly population.
While long COVID is sometimes considered a novel disease, it is hardly a totally unexpected phenomenon. Many viral or bacterial infections have been known to cause prolonged illnesses in a subset of patients [27]. Rheumatic fever following infection by Streptococcus pyogenes is a well-known example. Herpesviruses and enteroviruses are implicated in the cause of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS), characterized by fatigue, musculoskeletal pain and post-exertional malaise [28]. First reports of prolonged symptoms after contracting Russian influenza dated back to the 19th century [10, 29]. More recently, the Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV-1) and the Middle East Respiratory Syndrome Coronavirus (MERS-CoV) have been associated with post-acute phase persistent symptoms that affected approximately one-third of patients [30].
There are a lot of similarities between COVID-19 and influenza. Both diseases are caused by easily transmissible single-stranded RNA viruses primarily affecting the respiratory tract, with significant systemic manifestations. Both diseases affect millions of patients every year, presenting substantial medical and socioeconomic challenges. It is conceivable that, in some patients, a persistent state of ill health, similar to long COVID, can occur after influenza [31]. Given the similarities between COVID-19 and influenza, we postulate that there is considerable overlap in the symptomatology of the post-viral syndromes that they are associated with. This means that some patients suffering from post-flu syndrome might meet the diagnostic criteria for long COVID, had the primary infection been COVID-19 instead of influenza. For lack of a better name, we shall call this condition “long Flu”. Conceptually, long Flu patients can serve as an “influenza comparator group” for long COVID. We think that comparing long COVID and long Flu would bring new insights to the understanding of long COVID.
In this study, we develop a pragmatic algorithm to identify long COVID patients based on the clinical definition proposed by the World Health Organization (WHO). Furthermore, we apply the same algorithm to identify patients who may be suffering from long Flu in two previous influenza seasons (2018 and 2019) and compare them with long COVID patients in three aspects: incidence, symptomatology, and impact on healthcare utilization.
2. Materials and methods
2.1 Study population
The primary cohort of this study was all Medicare patients over 65 diagnosed to have COVID-19 between April 2020 and June 2021. A control cohort of non-COVID-19 patients was identified by 1 to 5 matching for the same period. An influenza comparator cohort was identified from two pre-pandemic influenza seasons in 2018 and 2019. Through the Virtual Research Data Center (VRDC) [32] of the Centers for Medicare and Medicaid Services (CMS), we accessed de-identified encounter data of all Medicare beneficiaries from 2016-2021. Medicare is the U.S. federal government’s health insurance program that primarily covers people 65 and older, and certain younger people with disabilities or kidney failure. Most individuals become eligible for Medicare when they reach 65 [33]. By one estimate, almost all (93%) of non-institutionalized persons 65 and over, about 52 million in 2017, are covered by Medicare [34]. We focused our analysis only on Medicare beneficiaries aged ≥ 65, since younger Medicare beneficiaries are not representative of the general population aged <65 as they need qualifying disability conditions to enroll. To ensure we have sufficient data for symptom look-up (see below for method to identify long COVID and long Flu), we excluded patients a) with less than one year of Medicare coverage, b) with no encounters in a year prior to COVID-19 or influenza diagnosis, c) who were continuously enrolled in Medicare Advantage plans, mostly private health maintenance organization (HMO) plans that are not original Medicare fee-for-service (FFS) plans in the period between 1 year before and 12 weeks after the COVID-19 or influenza diagnosis. The last exclusion is necessary because Medicare claims data are potentially incomplete for patients enrolled in non-FFS plans. This study was declared not human subject research by the Office of Human Research Protection at the National Institutes of Health and by the CMS’s Privacy Board. 
There was no prospective analysis protocol submitted before the commencement of this study.
2.2 Identifying long COVID and long Flu
2.2.1 Long COVID
We identified COVID-19 patients based on the International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) code U07.1 COVID-19, in either inpatient or outpatient claims between April 1, 2020, and June 30, 2021. We stopped at June 2021 to ensure that we had acquired complete data for long COVID analysis, because claims take several months to mature. We separated COVID-19 patients into two mutually exclusive groups: outpatient and inpatient. For outpatients, the first COVID-19 diagnosis must be an outpatient coding, and the patient must not be admitted to an inpatient facility (acute care hospital or skilled nursing facility (SNF)) for COVID-19 within 4 weeks of COVID-19 diagnosis. An SNF is an inpatient rehabilitation and medical treatment center which provides a wide range of medical care including physical therapy, intravenous therapy, injections, monitoring of vital signs and medical equipment. For inpatients, either a) the first COVID-19 diagnosis must be an inpatient coding, or b) the first COVID-19 diagnosis is an outpatient coding, and the patient is admitted to an inpatient facility for COVID-19 within 4 weeks of the COVID-19 diagnosis.
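The outpatient/inpatient rule above can be sketched as a small function. This is only an illustration under stated assumptions, not the study's actual code: the `IndexDiagnosis` record, its field names and the setting labels are hypothetical, and treating "within 4 weeks" as an inclusive 28-day window is our assumption. The later exclusion of SNF patients is not handled here.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Assumption: "within 4 weeks" is read as an inclusive 28-day window.
ADMISSION_WINDOW = timedelta(weeks=4)


@dataclass
class IndexDiagnosis:
    """Hypothetical per-patient summary built from claims."""
    first_dx_date: date                          # date of first U07.1 claim
    first_dx_setting: str                        # "outpatient" or "inpatient"
    covid_admission_date: Optional[date] = None  # first inpatient admission for COVID-19


def classify_patient(dx: IndexDiagnosis) -> str:
    """Assign a patient to one of the two mutually exclusive groups."""
    # Rule a): the first diagnosis was coded in an inpatient setting.
    if dx.first_dx_setting == "inpatient":
        return "inpatient"
    # Rule b): outpatient diagnosis followed by admission within 4 weeks.
    admitted = dx.covid_admission_date
    if admitted is not None:
        if dx.first_dx_date <= admitted <= dx.first_dx_date + ADMISSION_WINDOW:
            return "inpatient"
    # Otherwise the patient stays in the outpatient group.
    return "outpatient"
```

For example, a patient first coded as an outpatient on May 1 and admitted for COVID-19 on May 20 would fall in the inpatient group, whereas the same patient admitted on July 1 would remain an outpatient.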
For identification of long COVID, we tried two approaches. The first was based on the recommended ICD-10-CM code for long COVID, B94.8 Sequelae of other specified infectious and parasitic diseases, during our study period (code-based definition). B94.8 can potentially be used in non-COVID-19 infections, but significant use of this code in Medicare data occurred only after April 2020 (usage increased over 20-fold), making it reasonably specific for long COVID. Note that after our study, a new specific code for long COVID, U09.9 Post COVID-19 condition, became available from October 2021.
During peer review of this paper, additional analysis was suggested to study the impact of the new long COVID code on the diagnosis of long COVID. This was done using additional data from September to December 2021 which had become available after our study was concluded. The second approach was based on a constellation of symptoms (symptom-based definition). We followed the WHO’s clinical definition that was developed through a consensus process involving over 200 experts, researchers and patients [35]:
“Post COVID-19 condition occurs in individuals with a history of probable or confirmed SARS-CoV-2 infection, usually 3 months from the onset of COVID-19 with symptoms that last for at least 2 months and cannot be explained by an alternative diagnosis. Common symptoms include fatigue, shortness of breath, cognitive dysfunction but also others which generally have an impact on everyday functioning. Symptoms may be new onset, following initial recovery from an acute COVID-19 episode, or persist from the initial illness. Symptoms may also fluctuate or relapse over time.”
We used ICD-10-CM codes to identify the 11 symptoms that at least 50% of the participants in the WHO’s consensus building process thought were critical to include (see S1 Table for their incidence and temporal trend). Long COVID was defined as presence of any of the 11 symptoms unless they were excluded (see below for exclusion criteria). We also did a sensitivity analysis using only the top 3 symptoms (fatigue, shortness of breath, cognitive dysfunction) that reached 70% agreement in the WHO’s consensus building. For outpatients, we looked for symptoms from 4 to 12 weeks after the COVID-19 diagnosis. For inpatients, we started looking for long COVID symptoms after they were discharged to their original place of residence, following the recommendation from Amenta et al [36]. The observation period for inpatients was from 2 to 10 weeks after discharge to compensate for the median hospitalization period of 2 weeks.
We did not include SNF patients (9% of all COVID-19 patients) in the inpatient group because the proportion of COVID-19 patients admitted to SNF was unusually high (32.6%) compared to influenza patients in two previous Flu seasons (4.2%). More importantly, 88.6% of SNF COVID-19 patients were not discharged home but transferred to different inpatient facilities at the end of our study period. Two CMS policy changes in response to the pandemic may explain these phenomena: a) waiving of the three-day requirement of prior hospital stay for admission to SNF, and b) the extension of SNF coverage for an additional 100 days [37]. Since our identification of long COVID for inpatients started after their discharge, most SNF patients did not satisfy the inclusion criteria.
To satisfy the requirement that “[the symptom] cannot be explained by an alternative diagnosis”, we tried two approaches:
Exclusion by history – the long COVID symptom must not be present from 2 weeks to 1 year before the COVID-19 diagnosis. We started the look-back period 2 weeks prior to COVID-19 diagnosis because COVID-19-related symptoms started to increase from 2 weeks before the COVID-19 diagnosis, indicating a lag in diagnosis reporting (see S1 Table). Note that this exclusion only applied to individual symptoms, not to the patient as a whole. For example, if a patient had dyspnea 6 weeks after COVID-19 diagnosis, but the patient also complained of dyspnea 6 months before COVID-19 diagnosis, then dyspnea would not be counted as a long COVID symptom. However, the patient was not excluded and might still be identified as long COVID due to other symptoms.
Exclusion by history and comorbidities – in addition to the exclusion by history described above, we excluded symptoms that could be explained by a known comorbidity, e.g., dyspnea was excluded in the presence of chronic obstructive pulmonary disease (see S2 Table for excluded symptoms). We used the chronic condition onset dates in the Medicare database to identify comorbidities [38].
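The exclusion-by-history rule for outpatients can be sketched as follows. This is a minimal illustration, not the study's implementation: the function name, the `(symptom, date)` input format and the inclusive window boundaries are assumptions, and the comorbidity-based exclusion and the post-discharge window for inpatients are deliberately left out.

```python
from datetime import date, timedelta


def long_covid_symptoms(index_date, symptom_claims,
                        window_start=timedelta(weeks=4),
                        window_end=timedelta(weeks=12),
                        lookback_start=timedelta(days=365),
                        lookback_end=timedelta(weeks=2)):
    """Return symptoms that count toward the symptom-based definition.

    symptom_claims: iterable of (symptom_name, claim_date) pairs
    (hypothetical input format). A symptom counts if it is coded
    4-12 weeks after the index diagnosis and was not coded between
    1 year and 2 weeks before it.
    """
    prior, candidate = set(), set()
    for symptom, when in symptom_claims:
        if index_date - lookback_start <= when <= index_date - lookback_end:
            prior.add(symptom)       # seen in the look-back period
        elif index_date + window_start <= when <= index_date + window_end:
            candidate.add(symptom)   # seen in the observation window
    # Exclusion applies per symptom, not per patient.
    return candidate - prior


index = date(2020, 6, 1)
claims = [
    ("dyspnea", index + timedelta(weeks=6)),   # post-infection dyspnea
    ("dyspnea", index - timedelta(days=180)),  # same symptom 6 months earlier
    ("fatigue", index + timedelta(weeks=8)),   # new symptom, no history
]
remaining = long_covid_symptoms(index, claims)
is_long_covid = bool(remaining)  # any remaining symptom qualifies the patient
```

In this example dyspnea is dropped because of the earlier claim, but the patient is still flagged because fatigue remains, mirroring the dyspnea example given in the text.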
To estimate the false positive rate of the symptom-based definition, we matched each outpatient COVID-19 case to 5 controls (never had the code U07.1 COVID-19 or Z86.16 Personal history of COVID-19) on age in years, race, sex, dual-eligibility status (a surrogate for income), geographic region and Charlson comorbidity index [39] (within a range of +/- 2). We applied the same symptom-based definition on controls to assess the proportion of patients who would be falsely identified as long COVID.
2.2.2 Long Flu
Since the incidence of influenza diminished significantly during the COVID pandemic, we used data from two pre-pandemic Flu seasons, October 2017 – May 2018 (2018 season) and October 2018 – May 2019 (2019 season) to estimate the incidence of the postulated long Flu. We identified influenza patients using the ICD-10-CM codes J09, J10 and J11. Similar to COVID-19, we separated influenza patients into outpatient and inpatient groups. We used the same list of long COVID symptoms and symptom exclusion criteria to identify long Flu. The observation period was the same as for long COVID, i.e., from 4 to 12 weeks after influenza diagnosis for outpatients, and 2 to 10 weeks after discharge for inpatients. We excluded all SNF influenza inpatients (0.5% of total) from the long Flu cohort to be comparable with the long COVID cohort.
2.3 Comparing long COVID and long Flu
We compared the incidence and the distribution of symptoms between long COVID and long Flu patients. To estimate the impact of long COVID or long Flu on healthcare utilization, we analyzed four outcomes: a) hospitalization (any cause), b) hospitalization (due to any long COVID symptom), c) emergency department (ED) visit (due to any long COVID symptom), and d) number of outpatient (excluding ED) encounters (due to any long COVID symptom). We analyzed each of the four outcomes separately for outpatients and hospitalized patients. For outpatients, we observed the outcomes for the period 4-12 weeks post COVID-19 or influenza diagnosis. For inpatients, the observation period was 2-10 weeks post discharge.
2.4 Statistical analysis
We included all patients with COVID-19 or influenza, including those diagnosed with both. When we had to compare COVID-19 and influenza patients statistically, we excluded patients who had both conditions to ensure independence between groups. To test the difference in incidence of each specific symptom between long COVID and long Flu patients, we first applied a Chi-square test to the two-by-two contingency table [40], and used the Hochberg method to control the familywise error rate arising from multiple hypothesis testing [41]. We then used a multiple logistic regression model [42] to compare the two groups further, controlling for age, sex, race, region, dual eligibility and Charlson comorbidity index, and reported the adjusted odds ratios. We did not have to deal with missing data, as demographic and socioeconomic data were always present.
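The two-step symptom comparison (a two-by-two Chi-square test followed by Hochberg adjustment) can be sketched in plain Python. This is an illustrative sketch rather than the study's code: the function names are ours, and the absence of a continuity correction is an assumption, since the text does not say whether one was applied.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson Chi-square test for the table [[a, b], [c, d]].

    Assumption: no continuity correction. Returns (statistic, p_value).
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def hochberg_adjust(p_values):
    """Hochberg step-up adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Visit p-values from largest to smallest.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted values are capped at 1
    for rank, i in enumerate(order):
        # The largest p-value is multiplied by 1, the next by 2, and so on.
        running_min = min(running_min, p_values[i] * (rank + 1))
        adjusted[i] = running_min
    return adjusted
```

A symptom would then be called significantly different between long COVID and long Flu when its adjusted p-value falls below the chosen familywise level (for example 0.05).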
To test the difference in the first three outcomes of healthcare utilization (hospitalization any cause, hospitalization with long COVID symptoms, ED visit with long COVID symptoms), we implemented a generalized linear model with logit link adjusting for all available patient characteristics (age, sex, race, geographical region, dual eligibility status, history of any hospitalization in prior year, and 55 chronic conditions) as covariates. We adjusted for these covariates because demographics, comorbidities and socioeconomic factors are known to affect healthcare utilization. To compare the number of outpatient visits with long COVID symptoms, we used a generalized linear model with a log link function (i.e., Poisson regression or log-linear regression analysis) with the same set of covariates as adjusters. We used the generalized estimating equations (GEE) method to account for overdispersion in the Poisson regression model.
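To give a sense of what the covariate adjustment does, the unadjusted (crude) odds ratio can be computed directly from a pair of counts. The sketch below uses the any-cause hospitalization counts for outpatients quoted in the abstract; the function name and the Wald-type 95% confidence interval are our choices for illustration and are not part of the study's adjusted model.

```python
import math


def crude_odds_ratio(events_a, total_a, events_b, total_b, z=1.96):
    """Unadjusted odds ratio of group A vs. group B with a Wald interval."""
    a, b = events_a, total_a - events_a   # group A: events, non-events
    c, d = events_b, total_b - events_b   # group B: events, non-events
    odds_ratio = (a / b) / (c / d)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Any-cause hospitalization, long COVID vs. long Flu outpatients (abstract counts).
or_crude, ci_low, ci_high = crude_odds_ratio(74_854, 234_688, 33_140, 123_736)
```

The crude odds ratio from these counts is about 1.28, noticeably larger than the adjusted odds ratio of 1.06 reported in the abstract, which suggests that much of the raw difference is accounted for by the covariates in the model.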
The primary goal of this study is to estimate the incidence of long COVID in the elderly by various approaches (code-based and symptom-based). Furthermore, we aim to compare the incidence, symptomatology, and healthcare utilization of long COVID with the hypothetical condition of long Flu. This study is reported as per the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guideline (S1 Checklist).
3. Results
3.1 Patient characteristics
We began with 2,434,154 COVID-19, 1,033,515 influenza and 8,755,799 control Medicare beneficiaries aged ≥ 65. After applying the exclusion criteria discussed in section 2.1, we were left with 2,071,532 COVID-19 patients, of whom 1,479,183 (71.4%) were outpatients (Fig 1). We combined data from the two Flu seasons (2018 and 2019) because they were quite similar in terms of patients’ health status and demographics (see S3 Table). There were 933,877 influenza patients, of whom 817,336 (87.5%) were outpatients. All cases and controls were followed up for 2 months to identify long COVID or long Flu, making a total follow-up time of 1.5 million patient-years.
Participant inclusion, exclusion, and matching
Table 1 shows the characteristics of the COVID-19 and influenza patients. Note that in comparing the two groups, 91,796 COVID-19 patients who also had influenza in the previous two Flu seasons were excluded (3% of total). Among the hospitalized patients, influenza patients were older and sicker (higher prior hospitalization rate and Charlson comorbidity index) than COVID-19 patients, but the difference was much smaller among outpatients.
Characteristics of COVID-19 and influenza patients (CCI – Charlson comorbidity index, SD – standard deviation, IQR – inter-quartile range, LIS – low-income subsidy)
a combined 2018 & 2019 Flu seasons
b excluding 91,796 patients with both COVID-19 and influenza
c any prior hospitalization within one year of COVID-19 or influenza diagnosis
Each COVID-19 outpatient was matched to 5 non-COVID-19 patients. Overall, there were 6,286,633 controls, with an unmatched rate (COVID-19 patients unable to be matched) of 1.1%. The cases and controls were generally well matched in terms of demographics and Charlson comorbidity index. Due to the large sample size, most differences between cases and controls were statistically significant, but the absolute standardized differences of <0.25 indicated good balance between them (S4 Table) [43].
3.2 Incidence of long COVID and long Flu
Among the hospitalized COVID-19 and influenza patients, 52.8% (210,765/399,124) and 68.4% (76,390/111,721) respectively were discharged to their original place of residence by the end of our study period. These patients were used to estimate the incidence of long COVID and long Flu in inpatients. Based on the ICD-10-CM diagnosis code B94.8 (i.e., the code-based definition), only 0.49% of outpatients and 2.6% of hospitalized patients were identified as having developed long COVID (Table 2), far lower than in most published reports. Using the symptom-based definition, the estimated incidence of long COVID was closest to other studies when we applied the definition of “any of 11 symptoms with history exclusion only”. By this definition, the incidence of long COVID was 16.6% and 29.2% in outpatients and hospitalized patients respectively. If we added the comorbidity exclusion to this definition, the rates would drop to 5.7% (outpatient) and 8.8% (inpatient). We shall use the “any of 11 symptoms with history exclusion only” definition as our main result in subsequent discussion.
Incidence of long COVID and long Flu by various definitions
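The headline incidence figures follow directly from the counts reported in the abstract. The short sketch below recomputes them and, as an illustration of the scale of under-reporting, the ratio of symptom-based to code-based cases; the dictionary layout and function name are ours.

```python
# (cases, denominator) pairs taken from the counts quoted in the abstract.
counts = {
    ("long COVID, symptom-based", "outpatient"): (246_154, 1_479_183),
    ("long COVID, symptom-based", "inpatient"): (61_631, 210_765),
    ("long COVID, code-based", "outpatient"): (7_213, 1_479_183),
    ("long COVID, code-based", "inpatient"): (5_521, 210_765),
    ("long Flu, symptom-based", "outpatient"): (138_951, 817_336),
    ("long Flu, symptom-based", "inpatient"): (18_824, 76_390),
}


def incidence_pct(cases, denominator):
    """Incidence as a percentage of the eligible patients."""
    return 100 * cases / denominator


for (definition, setting), (cases, denom) in counts.items():
    print(f"{definition:26s} {setting:10s} {incidence_pct(cases, denom):6.2f}%")

# How many symptom-based cases there are per code-based case.
ratio_outpatient = 246_154 / 7_213
ratio_inpatient = 61_631 / 5_521
```

By these counts the symptom-based definition identifies roughly 34 times as many outpatient cases and 11 times as many inpatient cases as the designated code.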
3.3 Difference in symptomatology of long COVID and long Flu
After excluding 91,796 patients with both COVID-19 and influenza (3% of all COVID-19 and influenza patients), we had 293,172 long COVID patients (outpatient 234,688, inpatient 58,484) and 140,697 long Flu patients (outpatient 123,736, inpatient 16,961) (Table 3). Rates of dyspnea, fatigue, palpitations, and loss of taste/smell were significantly higher in long COVID than in long Flu, among both outpatients and hospitalized patients. In contrast, cough, chest pain, headache, and muscle/joint pain were more frequent in long Flu in both outpatient and inpatient groups. The incidences of memory problems, cognitive impairment and sleep disturbance were significantly higher in long COVID among outpatients only.
Symptoms in long COVID and long Flu patients (OR – odds ratio of developing each specific symptom for long COVID compared to long Flu, CI – confidence interval)
a adjusted for multiple hypothesis testing from Chi-Square test
b adjusted for age, sex, race, region, dual eligibility and Charlson comorbidity index
3.4 Difference in healthcare utilization in long COVID and long Flu
For both outpatients and hospitalized patients, long COVID was associated with a significantly higher chance of any-cause hospitalization and more outpatient visits than long Flu (Table 4). The difference is especially notable among hospitalized patients, even though hospitalized influenza patients were sicker and older at baseline than hospitalized COVID-19 patients, which should normally translate into more healthcare utilization. In contrast, the likelihood of an ED visit was significantly higher among long Flu patients.
Healthcare utilization of long COVID and long Flu patients (ED – emergency department, SD – standard deviation, OR – odds ratio, IRR – incidence rate ratio, CI – confidence interval)
a with any of the 11 symptoms of long COVID
b adjusted for age, sex, race, geographical region, dual eligibility status, history of any hospitalization in prior year, and chronic comorbidities
4. Discussion
One major hurdle in long COVID-19 research is the difficulty in identifying long COVID cases. Due to the wide range of definitions, the results of studies often cannot be compared or generalized [12]. The development of a consensus on clinical definition by the WHO partially fills this gap [35]. Our study has developed a method to operationalize this clinical definition. The WHO definition depends on the presence of a constellation of symptoms but requires that the symptoms do not have an alternative explanation. Identifying such explanations can be done in observational studies. We compared two approaches – exclusion by history (the symptom did not occur within the previous year) and exclusion by history and comorbidities. The use of exclusion by history alone yields estimates of long COVID of 16.6% in outpatients and 29.2% in hospitalized patients, which are close to many published results. A recent large meta-analysis covering 194 studies shows that on average, at least 45% of COVID-19 survivors (non-hospitalized patients 34.5%, hospitalized patients 52.6%) continue to experience at least one unresolved symptom [44]. This means that our results are likely to be an under-estimation. One potential problem of exclusion of individual symptoms based on history is that common symptoms are more likely to be excluded, even if they are indeed related to long COVID. In our study, the incidence of cognitive impairment and memory problem among long COVID patients is lower than that reported in the literature [44]. Since these problems are generally more common among elderly patients, they are more likely to be excluded by coincidence. Another factor that can be at play is that the threshold of seeking help for these problems may be higher in elderly patients. Adding the exclusion based on comorbidities cuts the estimates significantly to 5.7% for outpatients and 8.8% for hospitalized patients. 
We speculate that the comorbidity exclusion is probably too strict because some comorbidities are very prevalent in our study population of elderly Medicare beneficiaries. For example, high prevalence of heart failure (43%), chronic obstructive pulmonary disease (41%), and fibromyalgia chronic pain and fatigue (54%) would lead to exclusion of cough, dyspnea, and fatigue.
A recent study shows promise in the use of machine learning to identify long COVID [45]. However, the model in Pfaff et al. was built on only a very small subset (0.6% of 97,995 COVID-19 patients) who attended a long COVID clinic, highlighting the difficulty of identifying all patients suffering from long COVID. For machine learning methods to be effective, a large number of cases along with high-quality data is required. Achieving this goal can be difficult because long COVID tends to be under-reported and under-coded [46]. In our study, code-based identification of long COVID yields an estimate of only 0.49% in outpatients and 2.6% in hospitalized patients, far below published results. A specific long COVID code, U09.9, was introduced in October 2021; however, healthcare providers were slow to adopt it [47]. Based on additional data from September to December 2021, which became available after our study concluded, we found that the new long COVID code (U09.9) was used more often than the old one (B94.8), whose usage dropped off significantly. Using the new code in our code-based definition, 2.0% (11,424/573,965) of outpatients and 9.2% (6,253/68,030) of hospitalized patients developed long COVID. This is still considerably lower than most reports in the literature. Researchers should be aware of the potential under-reporting if they rely solely on specific codes to identify long COVID.
We postulate the existence of “long Flu” based on reports of post-infectious sequelae after influenza [29]. Long COVID has attracted special attention because of the pandemic, but the possible occurrence of prolonged symptoms from influenza should not be overlooked. Long COVID is still a poorly understood disease. Comparing and contrasting long COVID with long Flu may offer new insights into its pathogenesis and treatment. The pathogenesis of long COVID is likely to be complex and more than one mechanism may be implicated in some clinical manifestations [28, 48]. Evidence suggests that prolonged inflammation probably plays a key role. In addition, it is known that, like other coronaviruses, SARS-CoV-2 can invade the blood-brain barrier and access the central nervous system through peripheral or olfactory neurons [49, 50]. This could explain the greater incidence of psychoneurological symptoms (e.g., cognitive impairment, loss of taste or smell, memory problem and sleep disturbance) in long COVID compared to long Flu in our study. Another special feature of COVID-19 is the high incidence of thromboembolism, probably as a result of endothelial injury and heightened inflammation, which can lead to organ or tissue injury [51, 52]. Investigators have found high incidence of significant radiological and functional abnormalities indicative of lung parenchymal and small airway disease after the acute phase of COVID-19, which could give rise to dyspnea and easy fatigue [53].
Based on our estimates, the incidence of long COVID and long Flu is comparable among outpatients (16.6% vs. 17.0%), and somewhat higher for long COVID among inpatients (29.2% vs. 24.6%). However, incidence alone does not tell the whole story about the impact of the two diseases. Our model of healthcare utilization shows that long COVID patients are more likely to seek outpatient care and to be hospitalized, after controlling for demographics, socioeconomic factors, and comorbidities. This suggests that long COVID is a more serious illness than long Flu and has greater societal impact. One unexpected finding is the lower rate of ED visits in long COVID patients. This is probably an anomaly and could be explained by the overall decline in ED visits during the pandemic, possibly because patients fearing exposure to COVID-19 avoided the ED for conditions for which they otherwise would have sought emergency care. Long Flu data came from 2017-2019, and we observed that overall ED usage among Medicare senior beneficiaries dropped by 18% in 2020 compared with 2017-2019.
We recognize the following limitations. Based on our exclusion criteria, 15% of COVID-19 patients, 10% of influenza patients and 28% of matched controls were excluded, which may affect the generalizability of our findings. Not all COVID-19 diagnoses are captured in Medicare claims data. Our previous study showed that up to one-third of COVID-19 cases could be missed [54]. Medicare claims-based data may miss services or treatment paid for by private insurance or other means. The code-based definition of long COVID is based on the recommended code available for the study period, which may not be specific for long COVID. The symptom-based definition relies on the symptoms reported as “diagnosis” at the healthcare encounter and may not be sensitive because providers may not routinely code all symptoms. Using claims data, we cannot easily ascertain the duration of the symptoms as stipulated in the WHO definition. There is no independent confirmation that patients identified by our method are indeed suffering from long COVID. However, we can offer rough estimates of our error rates. The false positive rate can be estimated by the positive rate in controls (10.5%). Among the 12,734 patients specifically coded as long COVID (code-based definition), 8,239 (64.7%) were also identified as long COVID by the symptom-based definition, so the false negative rate can be estimated to be about 35%. If we adjust our results by these estimated error rates, the incidence of long COVID would be 9% for outpatients and 28% for inpatients, still not far from other studies. Our observation period ends at 12 weeks after the COVID-19 or influenza diagnosis. Some patients may present with long COVID or long Flu after that period. Among the inpatients, we exclude SNF patients because a significant proportion of them are not discharged within the study period. We assume that long Flu is similar to long COVID and use the same symptom-based definition. There may be symptoms in post-influenza syndrome that are not common in long COVID, and patients with those symptoms may not be identified as long Flu in our study.
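The error-rate adjustment mentioned in the limitations above can be illustrated with a small calculation. The exact formula is not stated in this section, so the sketch below assumes one simple correction: subtract the false positive rate from the observed incidence, then divide by the sensitivity (one minus the false negative rate). The function name `adjust_incidence` and this choice of formula are assumptions for illustration, not necessarily the method the authors used.

```python
# Inputs quoted in the text.
CODED_LONG_COVID = 12_734        # patients with the designated long COVID code
ALSO_SYMPTOM_POSITIVE = 8_239    # of those, also flagged by the symptom-based definition
FALSE_POSITIVE_RATE = 0.105      # positive rate in matched controls

# Sensitivity of the symptom-based definition, using coded patients as the reference.
sensitivity = ALSO_SYMPTOM_POSITIVE / CODED_LONG_COVID
false_negative_rate = 1.0 - sensitivity


def adjust_incidence(observed, false_positive_rate, sens):
    """Assumed correction: remove the expected false-positive share,
    then scale up for cases the definition misses."""
    return (observed - false_positive_rate) / sens


# Observed symptom-based incidence from the study.
adjusted_outpatient = adjust_incidence(0.166, FALSE_POSITIVE_RATE, sensitivity)
adjusted_inpatient = adjust_incidence(0.292, FALSE_POSITIVE_RATE, sensitivity)

if __name__ == "__main__":
    print(f"sensitivity: {sensitivity:.3f}")
    print(f"false negative rate: {false_negative_rate:.3f}")
    print(f"adjusted outpatient incidence: {adjusted_outpatient:.3f}")
    print(f"adjusted inpatient incidence: {adjusted_inpatient:.3f}")
```

Under this assumed formula the adjusted values come out at about 9% for outpatients and just under 29% for inpatients, in the same range as the figures quoted above. Small differences may reflect rounding or a different correction in the original analysis.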
Based on a constellation of symptoms identified in the WHO’s consensus definition, we estimate that long COVID occurs in 16.6% and 29.2% of elderly COVID-19 outpatients and inpatients respectively. The corresponding incidence for long Flu, identified by the same constellation of symptoms for two pre-pandemic influenza seasons, is similar (17.0% and 24.6%). Long COVID patients have a significantly higher incidence of dyspnea, fatigue, palpitations, loss of taste or smell, and neurocognitive symptoms. Compared to patients with long Flu, patients with long COVID are hospitalized more often and have more outpatient visits, suggesting that long COVID is a more serious illness and has higher societal impact.
Data Availability
The minimal data set is included in the Supporting information. All data in the Supporting information can be used without restriction. This includes the precise values used to build the long COVID symptom trend graphs (S1 Table) and the detailed statistical data obtained in the logistic and Poisson regressions (S5 Table), from which the odds ratios and incidence rate ratios can be derived. As for raw data, CMS does not permit us to download or distribute any patient-level data. The data remain on CMS systems, where we analyze them with software that CMS provides. Researchers who wish to access the raw data can contact the CMS Virtual Research Data Center. Data access requires payment of a fee.
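As a minimal sketch of how an odds ratio or incidence rate ratio is derived from regression output, the snippet below exponentiates a coefficient and its Wald confidence limits. This is the standard relationship for logistic and Poisson models. The example coefficient and standard error are hypothetical values back-calculated from the odds ratio reported in the abstract (1.06, 95% CI 1.05-1.08). They are not taken from S5 Table, and the function name `ratio_with_ci` is an assumption.

```python
import math


def ratio_with_ci(beta, se, z=1.96):
    """Convert a log-scale coefficient and its standard error into a ratio
    (odds ratio for logistic, incidence rate ratio for Poisson regression)
    with an approximate 95% Wald confidence interval."""
    point = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return point, lower, upper


# Hypothetical inputs chosen to reproduce the reported any-cause
# hospitalization odds ratio; not actual values from S5 Table.
example_beta = math.log(1.06)
example_se = 0.0072

odds_ratio, ci_low, ci_high = ratio_with_ci(example_beta, example_se)

if __name__ == "__main__":
    print(f"OR {odds_ratio:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

The same function applies unchanged to Poisson coefficients, where the exponentiated value is read as an incidence rate ratio.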
Supporting information
S1 Checklist Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guideline checklist
S1 Table Long COVID symptoms and temporal trends
S2 Table Exclusion of symptoms by comorbidities
S3 Table Patient characteristics in the 2018 and 2019 Flu seasons
S4 Table COVID-19 outpatient cases and controls
S5 Table Detailed statistical data of the logistic and Poisson regression models for healthcare utilization