ABSTRACT
Background High-value care aims to enhance meaningful patient outcomes while reducing costs. Curating data across healthcare systems with common data models (CDMs) would help these systems move towards high-value healthcare. However, meaningful patient outcomes, such as function, must be represented in commonly used CDMs, such as the Observational Medical Outcomes Partnership (OMOP) model. Yet the extent to which functional assessments are included in the OMOP CDM is unclear.
Objective Examine the extent to which functional assessments used in neurologic and orthopaedic conditions are included in the OMOP CDM.
Methods After identifying functional assessments from clinical practice guidelines, two reviewer teams independently mapped the neurologic and orthopaedic assessments into the OMOP CDM. After this mapping, we measured agreement within each reviewer team as the number of assessments mapped by both reviewers, by one reviewer but not the other, or by neither reviewer. The reviewer teams then reconciled disagreements, after which we again examined agreement and the average number of concept ID numbers per assessment.
Results Of the 81 neurologic assessments, 48.1% were initially mapped by both reviewers, 9.9% were mapped by one reviewer but not the other, and 42.0% were unmapped. After reconciliation, 46.9% of neurologic assessments were mapped by both reviewers and 53.1% were unmapped. Of the 79 orthopaedic assessments, 46.8% were initially mapped by both reviewers, 12.7% were mapped by one reviewer but not the other, and 40.5% were unmapped. After reconciliation, 48.1% of orthopaedic assessments were mapped by both reviewers and 51.9% were unmapped. Most assessments that were mapped had more than one concept ID number (neurologic assessments: 2.2±1.3; orthopaedic assessments: 4.3±4.4).
Conclusions The OMOP CDM includes a portion of functional assessments recommended for use in neurologic and orthopaedic conditions. Many assessments did not have any term in the OMOP CDM. Thus, expanding the OMOP CDM to include recommended functional assessments and creating guidelines for mapping functional assessments would improve our ability to harmonize these data across healthcare systems.
1. BACKGROUND AND SIGNIFICANCE
The United States spends twice as much annually per person on healthcare compared to other high-income countries, yet we obtain worse health outcomes (e.g., shorter life expectancy, lower quality of life).[1, 2] This necessitates a shift towards high-value healthcare, where meaningful patient outcomes are improved while lowering the costs associated with these outcomes (i.e., better outcomes per dollar spent).[3–5] Large real-world data about interventions, outcomes, and costs are needed to move towards high-value care because they allow us to identify what interventions result in the best outcomes at the lowest cost. The outcomes needed for value-based care initiatives must be meaningful to patients.[3–6] While there are a number of important outcomes, function is particularly important because it is salient across all health diagnoses.[3–6] Thus, large-scale real-world databases that include assessments of function, which are routinely collected by rehabilitation professionals, are needed to improve the value of healthcare.
The widespread adoption of electronic medical records (EMRs) has allowed for the generation of these types of real-world databases. These real-world databases are even more useful for value-based care initiatives when data is aggregated across healthcare systems. Unfortunately, the same metric, such as a functional assessment, is frequently referred to differently across healthcare systems (Table 1, Healthcare System 1 and 2 columns), presenting a barrier to aggregating data across systems. To overcome this barrier, a number of common data models (CDMs) have been developed (e.g., Sentinel, the National Patient-Centered Clinical Research Network [PCORnet], Informatics for Integrating Biology and the Bedside [i2b2], the Observational Health Data Sciences and Informatics’ (OHDSI) Observational Medical Outcomes Partnership [OMOP]).[7] CDMs provide a standard set of terms (i.e., vocabulary) and structure so that data from different EMRs can be harmonized (Table 1, Harmonized Data column) and then aggregated.[8–10] Of these CDMs, the OMOP CDM[11, 12] has gained popularity due to its broad coverage of standard terminologies, flexibility and simplicity, and robust open-source tools.[9, 13–16]
To leverage the OMOP CDM, healthcare systems must map their data into that CDM through an extract-transform-load (ETL) process. In this process, key data elements are first identified or extracted. These key data elements are then mapped into the OMOP CDM. Finally, the data from the EMR is loaded into the OMOP CDM. Thus, the utility of the OMOP CDM for creating robust datasets that can be used for high-value care is dependent on how well functional assessments are represented in the OMOP CDM. As a result, many medical fields have examined the extent to which key metrics in their field are included in the OMOP CDM.[14, 15, 17, 18] However, despite the importance of function to patients and to value-based care, the extent to which key functional assessments are included in the OMOP CDM has not been examined and, therefore, these important assessments are rarely included in large, real-world datasets.
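The extract, transform, and load steps described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not an OHDSI tool: the local assessment labels, the `SOURCE_TO_CONCEPT` lookup, and the concept ID values are hypothetical placeholders rather than real OMOP concept IDs, and a value of 0 is used here to flag an assessment with no matching concept.

```python
# Hypothetical lookup from local labels to a concept ID.
# The IDs are placeholders for illustration, NOT real OMOP concept IDs.
SOURCE_TO_CONCEPT = {
    "10mwt": 9000001,
    "10 meter walk test": 9000001,
    "ten metre walk": 9000001,
}


def extract(emr_rows):
    """Extract: keep only rows that carry a functional assessment label."""
    return [row for row in emr_rows if row.get("assessment")]


def transform(rows):
    """Transform: replace each local label with a concept ID (0 = unmapped)."""
    records = []
    for row in rows:
        key = row["assessment"].strip().lower()
        records.append({
            "person_id": row["person_id"],
            "concept_id": SOURCE_TO_CONCEPT.get(key, 0),
            "value": row["value"],
        })
    return records


def load(records, target):
    """Load: append harmonized records to the target table (a list here)."""
    target.extend(records)
    return target


# Two healthcare systems that label the same assessment differently.
system_1 = [{"person_id": 1, "assessment": "10MWT", "value": 1.1}]
system_2 = [
    {"person_id": 2, "assessment": "Ten Metre Walk", "value": 0.9},
    {"person_id": 3, "assessment": "Functional Gait Assessment", "value": 22},
]

harmonized = load(transform(extract(system_1 + system_2)), [])
for record in harmonized:
    print(record)
```

In this sketch the two differently labeled walk tests end up under one concept ID, while an assessment without a vocabulary term cannot be harmonized and is carried forward as unmapped.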
2. OBJECTIVES
Our purpose was to determine the extent to which assessments of function that are recommended for use in neurologic and orthopaedic patient populations are contained within the OMOP CDM. We selected these two patient groups because they represent a large portion of patients seen in rehabilitation settings and of health care spending in the United States.[19, 20] We included assessments of function that are recommended in existing guidelines from the American Physical Therapy Association because physical therapy is an area of medicine that focuses specifically on function. We hypothesized that 1) there would be good agreement between reviewers related to the mapping of the assessments and 2) less than half of the included assessments would be included in the OMOP CDM. This work serves as a first step towards ensuring that measures of function can be harmonized across healthcare systems and included in databases that facilitate our shift towards high-value healthcare.
3. METHODS
This work was determined to not involve human subjects by the University of Utah IRB.
3.1 Assessment Selection
Because there are numerous assessments for function and quality of life in individuals with neurologic and orthopaedic conditions, we used clinical practice guidelines to select assessments with strong psychometric properties and with strong recommendations for use in these patient groups.
Two authors (H.H. and M.A.F.) with over 30 years of combined experience in neurologic rehabilitation selected the neurologic assessments for inclusion. They leveraged the American Physical Therapy Association Academy of Neurologic Physical Therapy’s Evidence Database to Guide Effectiveness (EDGE) documents for individuals with multiple sclerosis,[21] Parkinson disease,[22] spinal cord injury,[23] stroke,[24] traumatic brain injury,[25] and vestibular disorders[26] to guide the selection process. These documents provide recommendations regarding what functional assessments should be used in each specific patient population based on a thorough review of the assessments’ psychometric properties (i.e., validity, reliability) by content experts. The recommendations within each EDGE document are “highly recommended,” “recommended,” “unable to recommend,” or “not recommended.” The documents further make recommendations regarding the assessments with which entry-level physical therapy practitioners should be familiar. Assessments that were 1) highly recommended or recommended in more than one neurologic patient population and 2) recommended for entry-level physical therapy practitioners were included.
Two authors (A.T. and J.M.) with over 48 years of combined experience in orthopaedic rehabilitation selected the orthopaedic assessments for inclusion. Because there are no EDGE documents for orthopaedic conditions, they leveraged the American Physical Therapy Association Academy of Orthopaedic Physical Therapy’s Clinical Practice Guidelines (CPGs) for Achilles tendinopathy,[27] ankle instability,[28] carpal tunnel syndrome,[29] heel pain,[30] ACL injury prevention,[31] knee ligamentous instability,[32] meniscal and cartilage lesions,[33] anterior knee pain,[34] hamstring injury,[35] hip fracture,[36] hip osteoarthritis,[37] non-arthritic hip pain,[38] lateral elbow pain,[39] low back pain,[40, 41] concussion,[42] neck pain,[43] occupational injury,[44] pelvic girdle pain,[45] and adhesive capsulitis.[46] These CPGs provide recommendations on assessments for the targeted patient population based on the current best evidence. Recommendations are graded as “A - Strong Evidence,” “B - Moderate Evidence,” “C - Weak Evidence,” “D - Conflicting Evidence,” “E - Theoretical Foundation Evidence,” or “F - Expert Opinion.” All recommendations graded A, B, or C were included in the current work. The one exception was the concussion assessments, which were graded as “F - Expert Opinion” but were included because they were the only measures recommended for concussion assessment.
If an assessment was identified as both a neurologic and orthopaedic assessment, only one team of reviewers completed the mapping process described below. These assessments are identified in Appendix A.
3.2 Initial Mapping
After identifying the functional assessments, we followed a similar process to that of previous work to assess their content coverage in OMOP.[14, 15, 17] Two reviewers independently mapped the neurologic (M.A.F. and H.H.) and the orthopaedic (P.H. and A.T.) assessments into the OMOP CDM using Usagi.[47] Usagi is one of the many open-source tools developed by OHDSI to support the use of the OMOP CDM. Usagi uses term similarity to provide an automated suggestion for the standard OMOP CDM vocabulary term, called a concept identification (ID) number, for each functional assessment.[47] Each reviewer independently reviewed the concept ID number suggested by Usagi to determine 1) if the concept ID number was correct and 2) if there were other concept ID numbers onto which the assessment should be mapped. Reviewers selected all concept ID numbers that were standard concepts, as opposed to non-standard concepts, in the OMOP CDM and that were appropriate for each assessment.
3.3 Metrics of Initial Agreement
After the initial mapping, each reviewer exported their mappings from Usagi for analysis. Two primary metrics were calculated separately for the neurologic assessments and for the orthopaedic assessments using R (version 4.2.1)[48]. The first metric was assessment agreement, in which each assessment was labeled as being 1) mapped by both reviewers, 2) mapped by reviewer A but not reviewer B, 3) mapped by reviewer B, but not reviewer A, or 4) unmapped by both reviewers. The second metric was the concept ID number agreement. This metric looked specifically at the assessments that were mapped by both reviewers to determine if the two reviewers mapped each assessment to the same concept ID number. Each concept ID number was categorized into one of three categories: 1) mapped by both reviewers, 2) mapped by reviewer A but not reviewer B, or 3) mapped by reviewer B, but not reviewer A. There was no unmapped category for this metric as we focused on only measures that were mapped by both reviewers to at least one concept ID number.
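The two metrics above amount to simple set comparisons, which the following Python sketch illustrates. It is an illustration only: the analysis in this work was run in R, and the function names, the dictionary layout of the exported mappings, and the example assessments and concept IDs are all hypothetical. The sketch also assumes that concept ID agreement is tallied per assessment and then summed.

```python
from collections import Counter


def assessment_agreement(assessments, map_a, map_b):
    """Label each assessment by which reviewer(s) mapped it to any concept ID."""
    counts = Counter()
    for name in assessments:
        mapped_a = bool(map_a.get(name))
        mapped_b = bool(map_b.get(name))
        if mapped_a and mapped_b:
            counts["both"] += 1
        elif mapped_a:
            counts["a_only"] += 1
        elif mapped_b:
            counts["b_only"] += 1
        else:
            counts["neither"] += 1
    return counts


def concept_agreement(map_a, map_b):
    """Compare concept IDs only for assessments that both reviewers mapped."""
    counts = Counter()
    for name in map_a.keys() & map_b.keys():
        ids_a, ids_b = set(map_a[name]), set(map_b[name])
        if not ids_a or not ids_b:
            continue  # not mapped by both reviewers, so out of scope here
        counts["both"] += len(ids_a & ids_b)
        counts["a_only"] += len(ids_a - ids_b)
        counts["b_only"] += len(ids_b - ids_a)
    return counts


# Hypothetical exports: assessment name -> set of selected concept IDs.
assessments = ["Test W", "Test X", "Test Y", "Test Z"]
reviewer_a = {"Test W": {101, 102}, "Test X": {201}}
reviewer_b = {"Test W": {101, 103}, "Test Y": {301}}

print(assessment_agreement(assessments, reviewer_a, reviewer_b))
print(concept_agreement(reviewer_a, reviewer_b))
```

Note that `concept_agreement` never produces an unmapped category, which mirrors the imbalance in the concept ID number metric described above.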
3.4 Reconciliation
After the analysis of the initial mapping, each reviewer team reviewed the following: 1) assessments that were unmapped by both reviewers, 2) assessments that were mapped by one but not the other reviewer, and 3) concept ID numbers that were mapped by one but not the other reviewer. The reviewer teams reconciled any differences in the mappings in these three areas. Following reconciliation, the same metrics described in the Metrics of Initial Agreement were calculated on the reconciled mappings.
3.5 Statistical Analysis
For assessment agreement and concept ID number agreement, we report 1) the portion of metrics in each category, 2) Cohen’s kappa,[49] 3) the percent overall agreement, and 4) Gwet’s agreement coefficient.[50, 51] We calculate Gwet’s agreement coefficient because there is a well-documented paradox of having a high percent overall agreement with a low kappa when the observations are imbalanced, as expected in the analysis of concept ID number agreement (i.e., we expect no observations in the unmapped category).[50, 52–54] Gwet’s agreement coefficient is interpreted similarly to Cohen’s kappa, with 0.81 to 1.00 being very good agreement, 0.61 to 0.80 being good agreement, 0.41 to 0.60 being moderate agreement, 0.21 to 0.40 being fair agreement, and less than or equal to 0.20 being poor agreement.[50] We also present Sankey plots to visualize changes in the mapping category from the initial mapping to reconciliation for each of these metrics. Lastly, we provide descriptive statistics of the number of concept ID numbers per assessment after reconciliation.
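For readers who want to see how these three statistics relate, the sketch below computes them from a two-reviewer, two-category table. It is a hedged illustration rather than the analysis code used here (which was written in R): the function name is hypothetical, and Gwet’s coefficient is assumed to be the first-order form (AC1) for two raters and two categories. The example uses the initial neurologic assessment counts (39 mapped by both, 8 by one reviewer only, 34 by neither); the even 4/4 split of the 8 disagreements between the two reviewers is an assumption, because only their total is reported.

```python
import math


def agreement_stats(both, a_only, b_only, neither):
    """Percent agreement, Cohen's kappa, and Gwet's AC1 for a 2x2 table."""
    n = both + a_only + b_only + neither
    p_o = (both + neither) / n   # observed agreement
    p_a = (both + a_only) / n    # share of items mapped by reviewer A
    p_b = (both + b_only) / n    # share of items mapped by reviewer B

    # Cohen's kappa: chance agreement from each reviewer's own marginals.
    p_e_kappa = p_a * p_b + (1 - p_a) * (1 - p_b)
    kappa = (p_o - p_e_kappa) / (1 - p_e_kappa) if p_e_kappa < 1 else math.nan

    # Gwet's AC1 (assumed form): chance agreement from the average marginal.
    pi = (p_a + p_b) / 2
    p_e_ac1 = 2 * pi * (1 - pi)
    ac1 = (p_o - p_e_ac1) / (1 - p_e_ac1)

    return {"percent_agreement": 100 * p_o, "kappa": kappa, "ac1": ac1}


# Initial neurologic assessment agreement; the 4/4 split is an assumption.
stats = agreement_stats(both=39, a_only=4, b_only=4, neither=34)
print({name: round(value, 2) for name, value in stats.items()})
```

With this assumed split, the function gives about 90.1% overall agreement and coefficients of about 0.80 for both kappa and AC1. The two coefficients differ only in how chance agreement is estimated, which is why they diverge when one category is nearly empty.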
4. RESULTS
4.1 Assessment Agreement
We identified 160 unique assessments, 81 neurologic assessments and 79 orthopaedic assessments, to include in the analysis. During the initial mapping for the neurologic assessments, 48.1% (39/81) of assessments were mapped by both reviewers, 9.9% (8/81) were mapped by one but not the other reviewer, and 42% (34/81) were unmapped by both reviewers (Figure 1A). This resulted in a 90.1% overall agreement with a Cohen’s kappa and Gwet’s agreement coefficient of 0.80, indicating good agreement. After reconciliation, both reviewers mapped 46.9% of assessments (38/81), while the remaining 53.1% (43/81) were determined to be unmapped (Figure 1A). This resulted in a 100% overall agreement and a Cohen’s kappa and Gwet’s agreement coefficient of 1, indicating perfect agreement.
During the initial mapping of the orthopaedic assessments, 46.8% (37/79) of the assessments were mapped by both reviewers, 12.7% (10/79) were mapped by one reviewer but not the other, and 40.5% (32/79) were unmapped by both reviewers (Figure 1B). This resulted in 87.3% overall agreement with a Cohen’s kappa and Gwet’s agreement coefficient of 0.74 and 0.75, respectively, indicating good agreement. After reconciliation, both reviewers mapped 48.1% of assessments (38/79), while the remaining 51.9% (41/79) were determined to be unmapped (Figure 1B). This resulted in a 100% overall agreement with a Cohen’s kappa and Gwet’s agreement coefficient of 1, indicating perfect agreement.
4.2 Concept ID Number Agreement
For the neurologic assessments that were initially mapped by both reviewers, there were 233 unique concept ID numbers identified. Of these, 33.5% (78/233) were the same between both reviewers, while 66.5% (155/233) were mapped by only one reviewer (Figure 2A). This resulted in a 33.5% overall agreement, with Cohen’s kappa and Gwet’s agreement coefficient indicating poor agreement (-0.8 and -0.2, respectively). During reconciliation, it was determined that this poor agreement was in large part due to one reviewer selecting individual questionnaire items for several large testing batteries in addition to the full test. These item-level concept ID numbers were determined to be incorrect; thus, after completing the reconciliation process, 36.0% (84/233) of the concept ID numbers were determined to be correct, while 64.0% (149/233) were determined to be incorrect (Figure 2A). After reconciliation, there was a 100% overall agreement with a Cohen’s kappa and Gwet’s agreement coefficient of 1, indicating perfect agreement. For the 38 neurologic assessments that were mapped, there were 84 unique concept ID numbers. Ten neurologic assessments had a single unique concept ID number, with an average number of concept ID numbers per neurologic assessment of 2.2 (1.3) and a median of 2 (Figure 3A). Importantly, a number of assessments had concept ID numbers in multiple OMOP domains, typically in the measurement and observation domains (Appendix A).
For the orthopaedic assessments that were initially mapped, there were 196 unique concept ID numbers identified. Of these, 77.0% (151/196) were the same between both reviewers, while 23.0% (45/196) were mapped by only one reviewer (Figure 2B). The overall percent agreement was 77.0%. Cohen’s kappa and Gwet’s agreement coefficient were -0.09 and 0.71, respectively. This demonstrates the well-documented paradox that can occur with Cohen’s kappa when observations are imbalanced,[50, 52–54] necessitating the inclusion of Gwet’s agreement coefficient.[51] After reconciliation, 83.2% (163/196) of the concept ID numbers were determined to be correct, while 16.8% (33/196) were determined to be incorrect (Figure 2B). After reconciliation, overall percent agreement was 100% with a Cohen’s kappa and Gwet’s agreement coefficient of 1, indicating perfect agreement. The final mapping for orthopaedic assessments resulted in 163 unique concept ID numbers for the 38 orthopaedic measures that were mapped. Only four orthopaedic assessments had a single unique concept ID number. The average number of concept ID numbers per orthopaedic assessment was 4.3 (4.4) with a median of 2.5 (Figure 3B). Many of the assessments with a large number of potential concept ID numbers were the Patient-Reported Outcomes Measurement Information System (PROMIS) measures, as there are numerous versions of these assessments in the OMOP CDM (Appendix A). As with the neurologic assessments, many measures had at least one concept ID number in both the measurement and observation domains of the OMOP CDM.
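The kappa paradox for the orthopaedic concept ID numbers can be traced with a short worked calculation. The notation is introduced here for illustration: a is the number of concept ID numbers selected by both reviewers (151), b and c are the numbers selected by only reviewer A or only reviewer B (b + c = 45), d is the number selected by neither (0 by construction), and n = 196. Gwet’s coefficient is assumed to be the first-order form (AC1) for two raters and two categories.

```latex
\begin{align*}
p_o &= \frac{a + d}{n} = \frac{151}{196} \approx 0.770 \\
\pi &= \frac{2a + b + c}{2n} = \frac{2(151) + 45}{2(196)} = \frac{347}{392} \approx 0.885 \\
p_e^{\mathrm{AC1}} &= 2\pi(1 - \pi) \approx 0.203 \\
\mathrm{AC1} &= \frac{p_o - p_e^{\mathrm{AC1}}}{1 - p_e^{\mathrm{AC1}}}
  \approx \frac{0.770 - 0.203}{1 - 0.203} \approx 0.71 \\
\kappa &= \frac{2(ad - bc)}{(a + b)(b + d) + (a + c)(c + d)}
  \;\overset{d = 0}{=}\; \frac{-2bc}{(a + b)\,b + (a + c)\,c} \;\le\; 0
\end{align*}
```

AC1 depends only on the total number of disagreements, so it can be reproduced from the reported counts. Kappa, in contrast, depends on how the 45 disagreements are split between the reviewers, and with an empty “neither” cell it can never exceed zero regardless of how often the reviewers agree.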
5. DISCUSSION
This study examined the extent to which functional assessments are included in the OMOP CDM specifically for neurologic and orthopaedic conditions. Our hypotheses were supported such that there was high agreement between reviewers and that less than 50% of assessments were included in the OMOP CDM. We also found that there were multiple standard OMOP CDM concept ID numbers, or vocabulary terms, for most assessments.
We found that 46.9% and 48.1% of the neurologic and orthopaedic assessments, respectively, were included in the OMOP CDM after reconciliation. The optimistic conclusion from this finding is that almost half of the assessments that are recommended to measure function in these two broad patient groups are already in the OMOP CDM; thus, because of the importance of these constructs to all patients, these assessments should be included in datasets that leverage the OMOP CDM. Of note, although we selected these assessments based on recommendations for neurologic and orthopaedic conditions, many of the assessments are valid in other patient populations and, therefore, can be used across an even broader group of patients. For example, the European Quality of Life was included as an orthopaedic assessment. This assessment, however, has been validated in neurologic[55, 56] and general populations[57] as well. Similarly, the 10 meter walk test, which is used to calculate gait speed, was included as a neurologic assessment; however, this metric has been called the “sixth vital sign”[58] and is a critical metric across all patient diagnoses. This highlights that the mapping of functional assessments has significance across medical diagnoses and that the OMOP CDM supports the harmonization of some of these assessments.
The more pessimistic interpretation of our findings is that over half of the assessments we selected based on clinical practice guidelines and recommendations from the American Physical Therapy Association are not included in the OMOP CDM. Some of the assessments not included in the OMOP CDM are particularly troublesome because of their importance in guiding treatment and understanding patient outcomes. For example, the Functional Gait Assessment is a predictor of falls in neurologic patient groups[59–61] and is a recommended functional assessment in all individuals with neurologic diagnoses.[62] This tool is also reliable, valid, and predictive of falls in geriatric populations.[63, 64] Yet, this assessment is not mapped into the OMOP CDM. Similarly, the Fear Avoidance Beliefs Questionnaire is recommended with strong evidence in 3 CPGs[41, 44, 65] and with weak evidence in 1 other CPG.[28] Further, fear avoidance is known to significantly affect quality of life and function,[66, 67] yet it is not mapped into the OMOP CDM. Our findings and these specific examples demonstrate the need for expanding the OMOP CDM vocabulary to include more assessments of function.
A significant limitation in the coverage of functional assessments in the OMOP CDM that we identified is that many assessments had multiple standard concept ID numbers. This lack of a unique concept ID number is problematic because individuals undertaking an ETL at one site may select a different concept ID number than the individual conducting the ETL process at a different site. This challenge limits our ability to harmonize these assessments across healthcare systems with the OMOP CDM. Agreement and guidelines about which concept ID number should be used for functional assessments are needed. Providing these recommendations is beyond the scope of this work and will require collaboration across the OHDSI community. Based on this work, there are two primary areas that these guidelines should address. First, there should be clarification regarding onto which domain of the OMOP CDM the functional assessment should be mapped. This clarification is needed because many assessments had a single unique concept ID number when constrained to a single domain of the OMOP CDM. For example, we identified two appropriate concept ID numbers for the Short Physical Performance Battery, which is a commonly used functional assessment in orthopaedic and geriatric patient groups; one of the concept ID numbers was in the measurement domain, while the other was in the observation domain (Appendix A). This situation occurred in a number of assessments; thus, clarification on which domain is correct would help mitigate the challenge with multiple concept ID numbers. Second, guidelines should address assessments that have multiple concept ID numbers within the same domain. For example, the 6-minute walk test is a commonly used assessment of walking endurance and has three vocabulary terms in the OMOP CDM (Appendix A). This could be addressed by providing clear recommendations for which concept ID number to use or by revising the OMOP CDM vocabulary to minimize redundancy within the CDM.
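The effect of the first proposed guideline, fixing the target domain, can be illustrated with a small Python sketch. Everything in it is hypothetical and for illustration only: the `resolve_concept` helper, the candidate lists, the concept ID values (placeholders, not real OMOP concept IDs), and the domain assigned to each 6-minute walk test candidate.

```python
def resolve_concept(candidates, preferred_domain):
    """Return the single concept ID in the preferred domain, else None.

    None means the domain rule alone does not settle the mapping, either
    because no candidate is in that domain or because several are.
    """
    in_domain = [c for c in candidates if c["domain"] == preferred_domain]
    if len(in_domain) == 1:
        return in_domain[0]["concept_id"]
    return None


# Placeholder candidates; IDs and domain assignments are illustrative only.
CANDIDATES = {
    "Short Physical Performance Battery": [
        {"concept_id": 9100001, "domain": "Measurement"},
        {"concept_id": 9100002, "domain": "Observation"},
    ],
    "6-minute walk test": [
        {"concept_id": 9200001, "domain": "Measurement"},
        {"concept_id": 9200002, "domain": "Measurement"},
        {"concept_id": 9200003, "domain": "Measurement"},
    ],
}

for name, candidates in CANDIDATES.items():
    chosen = resolve_concept(candidates, preferred_domain="Measurement")
    status = chosen if chosen is not None else "still ambiguous"
    print(f"{name}: {status}")
```

A domain rule resolves the first case but not the second, which is why guidance on choosing among concept ID numbers within a single domain is also needed.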
While the current work significantly contributes to our ability to use the OMOP CDM, there are several limitations. First, we focused on two broad patient diagnosis categories: neurologic and orthopaedic. While these patient groups make up a significant portion of healthcare spending, there are numerous other diagnoses for which a similar exploration should be performed. Importantly, however, many of the assessments included in this analysis can be and are used in other patient populations. Additionally, there are numerous assessments of function that we did not include in our analysis. Instead, we used clinical practice guidelines and other recommendations to guide the selection of the assessments we included, yet there may be other assessments of function that researchers and clinical teams may want to map into the OMOP CDM. Finally, we focused primarily on physical function in this work; however, other domains of function, such as cognition and mood, are important to patients and to value-based care initiatives. Therefore, similar work with these other domains of function would be valuable.
6. CONCLUSION
In this work, we found that the OMOP CDM includes a portion of functional assessments that are recommended for use in clinical practice with individuals with neurologic or orthopaedic conditions. While these findings are encouraging, many of the assessments that were mapped did not have a single unique term in the OMOP CDM. To include functional assessments in databases that allow us to understand and improve the value of healthcare, we must 1) ensure that the assessments that are already mapped into the OMOP CDM are included in the development of large databases, 2) work towards guidelines for the ETL process of functional assessments into the OMOP CDM terminologies, and 3) continue to expand the OMOP CDM such that it includes all key functional assessments.
DATA AVAILABILITY
All data produced in this work are included in the manuscript.
CONFLICT OF INTEREST
The authors declare that they have no conflicts of interest in the research.
APPENDICES
Appendix A