Abstract
Background Core patient characteristic sets (CPCS) are increasingly developed to identify variables that should be reported to describe the target population of epidemiological studies in the same medical area, while keeping the additional burden on the data collection acceptable.
Methods We conduct a systematic review of primary studies and study protocols that aim to develop a CPCS, using the PubMed database. We focus in particular on the study design and the characteristics of the proposed CPCS. The quality of Delphi studies is assessed with a tool proposed in the literature. All results are reported descriptively.
Results Among 23 eligible studies, the Delphi survey is the most frequently used technique to obtain consensus in CPCS development (69.6%, n=16). Most studies do not include patients as stakeholders. The final CPCS rarely include socioeconomic factors. Definitions of items and recommended measurement methods are provided in 60.9% (n=14) and 31.6% (n=6) of studies, respectively.
Conclusion This study identified considerable variation and suboptimality in many methodological aspects of CPCS studies. To enhance the credibility and adoption of CPCS, a standard for conducting and reporting CPCS studies is warranted.
Funding No funds, grants, or other support were received during the preparation of this manuscript.
Registration This review was not pre-registered.
Introduction
In epidemiological research, collecting and reporting patient characteristics are of crucial importance. These data allow one to assess the generalizability (or external validity) of the obtained findings, by looking at how closely the study samples match patients in a realistic healthcare setting (1). When comprehensive patient characteristics data are available, the difference between a study sample and a clinically relevant patient population can even be statistically accounted for, to improve the applicability of the findings in clinical practice (2).
Beyond external validity, patient characteristic data are also helpful for improving internal validity. For instance, by assessing the balance of important prognostic factors for the outcome across different treatment groups in a trial, one can assess whether there might be imperfect randomization. This is highly important when trials have a small sample size (such as in oncology, where algorithms like minimization-based methods are often used to determine the treatment assignment for each patient) or a specific design (such as cluster randomization) (3, 4). In pragmatic trials, detailed patient characteristic data are also strongly needed to account for adherence and drop-out, especially when the aim is to estimate per-protocol treatment effects or to handle missing data (5). Likewise, in observational studies, assessing the balance of exposed and unexposed groups after propensity score-based stratification or matching, for instance, requires extensive data on patient characteristics (6).
In systematic reviews and evidence synthesis, when the eligible studies collect and report data on a common set of patient characteristics, the assessment of the target population (factor P in the PICO criteria) across studies will be facilitated. A more insightful evaluation of the heterogeneity observed among trial results will also be possible (7, 8). Recently, novel methods for causally interpretable meta-analysis have been proposed (8–11). These frameworks also rely on having a rich set of (prognostic) patient characteristics collected across individual studies.
Despite its importance in practice, the collection and reporting of patient characteristic data remains inconsistent and suboptimal. Cahan et al. (2017) recently showed that among 186,941 trials on ClinicalTrials.gov, only 8.9% reported baseline participant measures, and up to 85% of those measures were reported only once in the entire registry (12). Lack of adequate reporting of important prognostic factors was also highlighted by Wertli et al. (2013), who assessed 84 low back pain trials and found that almost half of them incompletely reported variables of prognostic importance, even easily obtainable variables such as age or comorbidities (13). Similar issues are also prevalent in many other medical fields, including asthma, diabetes, hypertension, and colorectal cancer (14–18).
In recent years, significant efforts have been made to standardize the collection and reporting of patient characteristics in epidemiological research. Across many therapeutic areas, a so-called core patient characteristic set (CPCS) is specifically developed to identify all key prognostic factors that should be commonly collected and reported (among studies and databases evaluating a target medical condition), while keeping the additional burden of implementation acceptable (Fig. 1). Above and beyond the variables proposed in the core set, researchers are free to measure and report additional patient characteristics that are of relevance to their topic. This CPCS concept is inspired by (and hence closely related to) the concept of the core outcome set (COS) proposed in clinical research. However, while the methodology for COS development is increasingly enriched in the literature, little attention has been given so far to CPCS and how to develop it in practice.
Fig. 1. The use of core patient characteristic sets in epidemiological research
In this paper, we aim to describe the methodology of studies aiming to establish a core set of patient characteristics that should be commonly measured and reported in epidemiological studies and/or in large medical cohorts. By shedding light on current practice and challenges in CPCS development, this review could pave the way for future recommendations and guidelines on methodological standards of CPCS, thus enhancing the adoption of this concept in epidemiological research.
Methods
Study design
We conduct a methodological systematic review conforming to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 statement (19).
Eligibility criteria
We include primary studies or study protocols that aim to establish a core set of patient characteristics that should be commonly measured and reported in epidemiological studies and/or databases of a pre-specified medical condition, published between 01/01/2001 and 11/08/2022. We thus exclude studies that establish patient characteristic sets for other purposes, such as guiding therapeutic decision-making in clinical practice. Conference abstracts, editorials, commentaries, and letters to the editor are excluded. Non-English publications and articles without accessible full text are also excluded from our review.
Search strategy
A structured search in the PubMed database is undertaken by P.H.T. Tu on 12/08/2022. The full search strategy is available in Appendix S1. This search strategy is first developed by two reviewers (P.H.T. Tu and K.L. Duong), then further optimized by a senior researcher (T.-T. Vo) and a librarian specialized in epidemiological systematic reviews. We also manually screen the reference lists of the eligible articles to identify additional eligible studies.
Study selection
The search results are downloaded into EndNote and imported into the Rayyan web-based software (20). Duplicates are removed with the duplicate search function in EndNote and by manually reviewing the list of records. Four reviewers (P.H.T. Tu, K.L. Duong, M.L. Vuong and T.H.T. Nguyen) independently screen titles and abstracts of retrieved records to select eligible papers based on the inclusion criteria. Each reviewer screens 25% of the total number of records and double-checks 20% of the work of another reviewer. Disagreements are resolved by discussion among the four reviewers and consultation with a senior researcher (T.-T. Vo).
Data extraction and assessment
The data extraction form is developed by M.L. Vuong and P.H.T. Tu, then pilot-tested and refined by K.L. Duong and T.-T. Vo (Appendix S2). The structure of the data extraction form is partially adapted from Boulkedid et al. (2011) and Diamond et al. (2014) (21, 22). Data extraction is performed by M.L. Vuong, P.H.T. Tu and K.L. Duong. Each reviewer extracts 33% and double-checks 33% of the total number of records. Any discrepancy is resolved by discussion among the three reviewers.
We extract the following information from the eligible studies: [1] publication year, [2] target medical conditions, [3] purposes of the developed CPCS (to use in epidemiological studies or in registry settings), [4] study design (consensus-reaching or non-consensus methods), and [5] geographical scope of the study (international or nationwide).
As the Delphi technique is the most frequently used method among the eligible studies, we evaluate the methodological and reporting quality of Delphi studies more thoroughly. The following characteristics of Delphi studies are extracted: [1] study participants (number, response rate, types, selection criteria of participants, and whether authors report how representativeness of participants is ensured), [2] method to establish the primary list of items before the Delphi rounds, [3.1] questionnaire round characteristics: number of rounds, purpose of each round, question formulation (rating scale or open question), whether the rating scale (if used) is well-defined (i.e. the number and meaning of the levels in the scale are specified), whether the questionnaire’s content is publicly available and piloted in advance, summary information sent to respondents after each round, and methods used to encourage participants to complete the questionnaires, [3.2] in-person meeting characteristics: number and purposes of meetings, form of rating scale (if used) and whether the rating scale is well-defined, whether participants from the questionnaire rounds are all invited to the meetings or only selectively, and the timing of meetings, [3.3] whether new items are allowed to be added between rounds, and [4] how consensus is defined and attained, and how the Delphi process is terminated.
In the absence of a standardized, validated quality score for Delphi studies, we roughly assess the quality of those studies by using the checklist proposed by Diamond et al. (2014) (22). The four items in the checklist are [1] the reproducibility of the criteria for participant selection, and whether [2] the number of Delphi rounds, [3] the criteria for dropping items at each round and [4] the criteria to stop the Delphi process are stated and prespecified. The number of items satisfied in each Delphi study is then reported as its quality score. Three reviewers (M.L. Vuong, P.H.T. Tu and K.L. Duong) independently assess the quality of all Delphi studies with this tool and reach a final consensus.
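The scoring rule described above amounts to counting the checklist items that a study satisfies. A minimal Python sketch of that count follows; the item names and the example assessment are hypothetical labels chosen for illustration, not wording taken from the checklist itself or data from any reviewed study.

```python
# Hypothetical short labels for the four checklist items described in the text.
CHECKLIST = (
    "participant_selection_reproducible",
    "number_of_rounds_stated",
    "item_dropping_criteria_stated",
    "stopping_criteria_stated",
)


def quality_score(assessment):
    """Count how many of the four checklist items a study satisfies (0 to 4)."""
    missing = [item for item in CHECKLIST if item not in assessment]
    if missing:
        # Refuse to score a study whose assessment is incomplete.
        raise ValueError(f"unassessed checklist items: {missing}")
    return sum(bool(assessment[item]) for item in CHECKLIST)


# Made-up assessment of a single Delphi study, for illustration only.
example_study = {
    "participant_selection_reproducible": True,
    "number_of_rounds_stated": True,
    "item_dropping_criteria_stated": True,
    "stopping_criteria_stated": False,
}
score = quality_score(example_study)
print(score)
```

Raising an error on unassessed items, rather than silently counting them as unsatisfied, keeps a missing judgement from being mistaken for a low score.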
For the remaining studies, we narratively describe the study design, the number and type of participants and how they are organized, and the method used to establish the final CPCS. For non-Delphi, consensus-reaching studies, we also extract information on the method used to establish the primary list of items and on the definition and attainment of consensus.
Finally, we extract details of the final CPCS obtained. These include [1] whether a description of the item flow is reported, [2] whether only the final set or also intermediate results are reported, [3] whether the items in the final set are ranked and how, [4] the number of items in the final set, [5] whether the definition and measurement of the included items are given and [6] the domains of the items in the CPCS (demographic, clinical, patient history, socioeconomic or healthcare setting factors).
Data synthesis
Continuous variables are presented as median and interquartile range (IQR). Categorical variables are summarized with frequencies and percentages. To investigate the content pattern of the final lists of items across eligible studies, we perform a hierarchical, complete-linkage clustering analysis (23). For each final list, we first calculate the percentage of items belonging to each domain. The resulting domain profile of each study is then used to calculate the matrix of between-study Euclidean distances. Finally, the clustering result is visualized as a tree-structured graphic.
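As an illustration of the clustering steps described above, the following Python sketch builds a domain profile for each core set, computes Euclidean distances between profiles, and merges clusters by complete linkage. It is not the R code used for the analysis; the function names, the short domain labels, and the three example sets with their item counts are hypothetical.

```python
from math import dist

# Short hypothetical labels for the five variable domains.
DOMAINS = ("demographic", "clinical", "history", "socioeconomic", "setting")


def domain_profile(counts):
    """Convert per-domain item counts into percentages of the whole set."""
    total = sum(counts.get(d, 0) for d in DOMAINS)
    return [100.0 * counts.get(d, 0) / total for d in DOMAINS]


def complete_linkage(profiles):
    """Agglomerative clustering with complete linkage.

    Returns the merge history as (members_a, members_b, height) tuples, where
    height is the largest Euclidean distance between a member of cluster a
    and a member of cluster b.
    """
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                height = max(dist(profiles[a], profiles[b])
                             for a in clusters[i] for b in clusters[j])
                if best is None or height < best[0]:
                    best = (height, i, j)
        height, i, j = best
        merges.append((clusters[i], clusters[j], height))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Made-up item counts per domain for three fictional core sets.
example_sets = {
    "set_A": {"demographic": 2, "history": 6},
    "set_B": {"demographic": 3, "history": 9},
    "set_C": {"demographic": 2, "clinical": 8},
}
profiles = {name: domain_profile(c) for name, c in example_sets.items()}
merges = complete_linkage(profiles)
for members_a, members_b, height in merges:
    print(members_a, members_b, round(height, 2))
```

Because the profiles are percentages rather than raw counts, two sets with different numbers of items but the same domain mix (here `set_A` and `set_B`) sit at distance zero and merge first.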
Data analysis was performed using Microsoft Excel 365 and R version 4.1.1.
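The descriptive summaries can be sketched in the same spirit. This is an illustrative Python example rather than the analysis code: the item counts are made up, and the quartile convention (`method="inclusive"`) is an assumption, since other conventions give slightly different quartiles. The percentage shown corresponds to 16 of 23 studies.

```python
from statistics import median, quantiles


def median_iqr(values):
    """Return (median, first quartile, third quartile) of a numeric sample."""
    # Assumed convention: quartiles interpolated between sample min and max.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


def percentage(count, total):
    """Percentage of `count` out of `total`, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Hypothetical numbers of items in five fictional core sets.
item_counts = [10, 17, 23, 12, 30]
summary = median_iqr(item_counts)
delphi_share = percentage(16, 23)
print(summary, delphi_share)
```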
Results
Study selection
The PRISMA flow diagram summarizing the screening process is presented in Fig. 2. Of all 5819 references identified, 23 articles met the inclusion criteria for review.
Fig. 2. Study selection PRISMA flowchart.
PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses
General characteristics of included studies
The general characteristics of all the studies included (24–42) are provided in Table 1. Among 23 eligible studies, 73.9% (n=17) develop a CPCS for epidemiological studies, 21.7% (n=5) develop such a set for healthcare registries and one study (4.3%) develops a CPCS for both registries and epidemiological studies. About 91% of studies (n=21) were published in the last ten years, and 78.3% of studies (n=18) have an international scope. Regarding methodology, 87.0% of studies (n=20) use a consensus-reaching method to develop the core set, with Delphi being the most frequently used technique (69.6%, n=16). The non-consensus methods used are systematic review (8.7%, n=2) and conceptual analysis (4.3%, n=1).
Table 1. General characteristics of eligible studies (N=23)
Methodological characteristics of Delphi studies
The methodological characteristics of 16 eligible Delphi studies are provided in Table 2 and Appendix S3. Remarkably, almost all studies involve healthcare professionals (93.8%, n=15) or researchers (81.3%, n=13), while only one study (8.3%) involves patients or patient representatives. The criteria for selecting participants vary considerably across studies, but are most commonly based on scientific renown, publication record and/or level of expertise (58.3%, n=7). Although the acceptance rate in the eligible studies is relatively low (median of 25 participants versus 40 invitations), only 41.7% of studies (n=5) report how they ensure the representativeness of participants.
Table 2. Methodological characteristics of studies using Delphi consensus approaches (N=16)
Across all studies (100%, n=12), rating scales are used to judge the importance of items during the questionnaire rounds. These scales range from two-point to ten-point, with five-point scales being the most commonly used (33.3%, n=4). The scale is deemed as well-defined in 83.3% of studies (n=10). Apart from item rating, open-ended questions are also included in 66.7% of studies (n=8), mostly to collect qualitative feedback from participants (58.3%, n=7). Besides, 41.7% of studies (n=5) report the use of a specific method to encourage participants to complete the questionnaires (e.g., by sending them reminders or vouchers).
In six (modified) Delphi studies, in-person meetings or teleconferences are additionally organized. The median number of meetings is two (IQR 1–4). The aim of these meetings is to have discussions among participants before rerating the existing items (41.7%, n=5) and adding new items (8.3%, n=1). The rating scales used in these meeting rounds are mainly binary scales (25.0%, n=3), and are well-defined in four studies (33.3%). Meetings are scheduled at different timepoints, either before (8.3%, n=1), in between (25.0%, n=3) or after the questionnaire rounds (16.7%, n=2).
Finally, 16.7% of studies (n=2) do not report the criteria for selecting or dropping an item (Appendix S3). In 91.7% of studies (n=11), the Delphi process is terminated when the preplanned rounds are completed, regardless of the stability of responses or whether consensus has been obtained for all items. In one study, the reason for termination is unclear. As stopping the Delphi process without regard to response stability or consensus is deemed suboptimal (22), all studies are penalized for this in the subsequent quality assessment. More precisely, 50% (n=6) of studies have a quality score of three, and 50% (n=6) of studies have a quality score of one or two, on the four-point quality scoring system proposed by Diamond et al. (2014) (22) (Appendix S3).
Methodological characteristics of non-Delphi studies
The methodological characteristics of seven non-Delphi studies are provided in Table 3. In general, only one study (14.3%) reports the types of stakeholders participating in the construction of the CPCS, and no study reports the number or distribution of stakeholders. Similarly, no study reports the criteria for selecting or dropping each item, or how consensus is reached after each round and at the end.
Table 3. Other studies of non-Delphi methods for core patient characteristic set (CPCS) construction
Characteristics of the final lists of patient characteristics
The reporting of results and the characteristics of the final CPCSs are provided in Table 4 and Fig. 3. Almost all studies (91.3%, n=21) report the final CPCS. CPCSs developed for registries often have more items than those developed for epidemiological studies (median [IQR]: 26 [10-31] vs 17 [10-23]) (Table 4).
Table 4. The reporting of results among all eligible studies (N=23)
Fig. 3. Hierarchical clustering of 21/23 CPCS based on five variable domains, namely [1] Demographic factors (age, gender, race), [2] Clinical factors (e.g., disease severity, signs and symptoms, laboratory test), [3] Patient history factors (e.g., lifestyle factors, comorbidities, family history), [4] Socioeconomic factors (e.g., level of education, income, occupation), [5] Healthcare setting factors (e.g., ambulatory care, inpatient, or ICU). Each slice of the chart represents one CPCS.
The characteristics of the final CPCSs are provided in Fig. 3. Most CPCSs contain demographic factors (age, gender, race), clinical factors (e.g., disease severity, presence of a symptom, laboratory test) and patient history factors (e.g., lifestyle, comorbidities, family history). In contrast, socioeconomic factors and healthcare setting factors are absent from most final lists.
Items included are defined in 60.9% of CPCS (n=14). Besides, 34.8% (n=8) and 26.1% (n=6) of CPCS have specific recommendations on the scale and measurement method for non-obvious items, respectively (Table 4).
The sectors in each chart indicate which types of variables are included in each CPCS, with the area of each sector corresponding to the proportion of that variable type within the CPCS. For instance, the CPCS developed by Khalil et al. (2019) consists of two variable domains: demographic factors and patient history factors, which make up 25% and 75% of the CPCS, respectively. The blue lines starting from the center of the chart define how the tools are divided into the six clusters. Clusters #3 and #4, and #5 and #6, are grouped as sub-nodes of two major nodes, meaning that the tools in these sub-nodes have more similar domain profiles than the tools in other clusters.
Discussion
The call for better patient characteristics collection and reporting in epidemiological research is not new. The Consolidated Standards of Reporting Trials (CONSORT) statement is one of the first initiatives aiming to improve the reporting of trials, including the selection criteria (item 4a) and the description of the resulting samples (item 15). A table showing baseline demographic and clinical characteristics for each treatment group, including the baseline measurement of the outcome, is required. However, the CONSORT statement provides no further indication of which patient characteristics to report. Extensions of the CONSORT statement specify that information on socioeconomic variables should be added, and that all relevant prognostic variables should be reported, but only one CONSORT extension explicitly asks to include comorbidity. Another initiative is the mandate of the Food and Drug Administration Amendments Act (FDAAA), which requires all covered studies to report results (including participants’ age, gender, race or ethnicity, and the baseline measures of the primary outcome) within 1 year of completion (43).
Constructing core patient characteristic sets is increasingly considered a new way to further improve the collection and reporting of patient characteristics. Most CPCSs have been developed within the last ten years. Beyond improving the internal and external validity of epidemiological studies, many CPCSs are also developed to increase the quality of patient characteristics data in registries. This is essential because registries are becoming important data sources for epidemiological research.
In this review, we identify many different methods to construct a CPCS. Among these methods, consensus-reaching techniques such as the Delphi survey are the most frequently used. Indeed, the Delphi survey is well suited to collecting expert-based judgements when the available knowledge is incomplete, which is often the case in CPCS or COS development (44).
Most Delphi studies in our review do not include patients as stakeholders. This is probably because CPCS development requires specialized knowledge of the prognostic factors of a given disease, so involving patients may be perceived as bringing little benefit to the process. However, embracing patients’ perspective on certain variables in the final set could be helpful, especially when these variables concern private information such as socioeconomic status, income or family history. Methods for patient engagement have recently been proposed for core outcome sets, and could be adapted for the development of CPCS (45, 46). Besides, many CPCS studies do not report how the representativeness of participants is ensured. Such information is important to determine the quality of the obtained CPCS and its uptake, and hence should be better reported in future practice.
Our review has identified a wide range of consensus definitions employed by Delphi studies, with the most common definitions based on the pre-defined cut-offs of percentage of participants voting certain rating levels. This is in line with findings from previous reviews (22, 47). Earlier studies also acknowledged the difficulty of ascertaining the validity of consensus definitions, and there has been no specific guidance on best consensus definition method, which could explain the observed variability in our study (22). However, the minimum standard is to report comprehensibly how consensus is defined and achieved throughout the process. This is not satisfied by one-sixth of eligible studies. Lack of clarity on this could render the studies susceptible to bias and arbitrariness during data collection, analysis and interpretation (47).
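A percentage cut-off rule of the kind described above can be sketched as follows. The specific rule used here (an item reaches consensus when at least 70% of participants rate it 4 or 5 on a five-point scale) and the ratings are hypothetical, chosen only to show why the definition needs to be stated explicitly; they are not taken from any reviewed study.

```python
def reaches_consensus(ratings, high_levels=(4, 5), cutoff=0.70):
    """Apply a percentage cut-off rule to one item's ratings.

    Returns (decision, share), where share is the fraction of participants
    whose rating falls in `high_levels`. The default levels and cut-off are
    illustrative assumptions, not a recommended definition.
    """
    if not ratings:
        raise ValueError("an item needs at least one rating")
    share = sum(r in high_levels for r in ratings) / len(ratings)
    return share >= cutoff, share


# Made-up ratings from ten participants for two fictional items.
item_kept = reaches_consensus([5, 4, 4, 3, 5, 4, 2, 5, 4, 4])
item_dropped = reaches_consensus([3, 3, 4, 2, 5, 1, 3, 4, 2, 3])
print(item_kept, item_dropped)
```

Changing either the rating levels that count as agreement or the cut-off can flip the decision for the same ratings, which is why both need to be reported.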
Most studies stop the Delphi process after completing a prespecified number of rounds, regardless of whether consensus has been attained. Considering the scarcity and/or divergence of evidence, perfect consensus for 100% of items may not be achievable. Indeed, it has been shown that the evidence for many prognostic variables suffers greatly from a high risk of publication bias, selective reporting bias, poor statistical analyses and so forth (48). As a compromise, many CPCS studies choose to group items into different sets with different priorities (based on level of evidence and/or consensus), so that researchers are also informed about the quality of the variables in the final set. On the other hand, it is important to update the CPCS over time as novel evidence for new (and current) prognostic factors becomes available in the literature.
Regarding non-Delphi studies, the methodological reporting is relatively weak. Many important factors such as characteristics of experts, method to establish the final list and consensus attainment were often not reported. This raises concern about the rigor of the CPCSs obtained from these studies.
Our review also provides many important remarks on the final core sets across studies. First, while demographic, clinical and patient history factors are dominant in all final sets, socioeconomic and healthcare setting factors are often overlooked. This is suboptimal. Indeed, the socioeconomic gradient in health is ubiquitous, and has been described across pathologies, in life expectancy and mortality (49–51). Meanwhile, describing the healthcare setting is important to assess the applicability of any epidemiological findings into practice. These two types of factors are not any less important than other clinical factors often included in the CPCSs.
Second, the number of (final) items in CPCSs for registries is often higher than in CPCSs for epidemiological studies. This could be because registries are of large scale and have more (financial and human) resources for data collection than traditional epidemiological studies (41). The disparity between CPCSs for registries and for epidemiological studies, however, could imply challenges in the interoperability between these two settings, and in the adoption of a CPCS from one setting in the other within one medical field.
Finally, apart from a list of important patient characteristics to collect and report, many CPCSs also provide recommendations on the measurement methods and scales for non-obvious or subjective items. Doing so could further reduce the heterogeneity and inconsistency in data collection practice, as the conversion between different scales for many variables is not straightforward. The downside of this, however, could be that the applicability of the proposed CPCS is reduced in practice. For instance, the recommended measurement methods might not be widely used or might have a high cost. These practical factors should be taken into account when making recommendations on the core set.
It is important to acknowledge some limitations of our study. First, we limited the eligibility criteria to articles published in English, hence relevant studies that were not published in English might have been excluded. Second, the large difference between the number of records identified from the literature and the number of eligible studies may arise from the fact that the specificity and coverage of our search strategy are not optimal. Such a challenge stems from the inconsistent and non-standardized terminology for CPCS, as opposed to COS. We mitigate this issue by consulting a librarian specialized in epidemiological systematic reviews to optimize the search strategy, and by manually searching the reference lists of the identified eligible studies for additional eligible studies. Finally, we were not able to conduct a formal quality assessment for Delphi studies or for CPCS studies in general, as tools for this purpose are not yet available in the literature.
Conclusion
This methodological systematic review has revealed the suboptimality of the conduct and reporting of CPCS studies, particularly in participant characteristics, method to obtain the final CPCS, and coverage and detail of the CPCS obtained. A conduct and reporting standard for CPCS studies is thus warranted, to further enhance the quality of CPCSs and promote the adoption of this concept in epidemiological research.
Data Availability
All data produced in the present study are available upon reasonable request to the authors
Declarations
Ethics approval
Our study results were based on previously published data, none of which is stated as research involving human or animal subjects. Ethical approval is thus not required.
Data and computing code availability
All data extracted for the systematic review and R code are made available as supplementary information (Appendix S4&S5).
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Registration and protocol
This review includes no clinical studies and therefore no protocol was pre-registered.
Conflicting Interests
The Authors declare that there is no conflict of interest.
Footnotes
Email: Myluong.1710{at}gmail.com. Tel: 32 485 46 16 31.
Email: trang.tu{at}uantwerpen.be. Tel: 32 486977358.
Email: 202160462{at}o.cnu.ac.kr. Tel: 82 10 3928 3663.
Email: tatthang{at}wharton.upenn.edu. Tel: 215 898 8222.