Effect of Sociodemographic Characteristics on Longitudinal Progression of Multimorbidity: A Multistate Modelling Analysis of a Large Primary Care Records Dataset ================================================================================================================================================================= * Sida Chen * Tom Marshall * Christopher Jackson * Jennifer Cooper * Francesca Crowe * Krish Nirantharakumar * Catherine L Saunders * Paul Kirk * Sylvia Richardson * Duncan Edwards * Simon Griffin * Christopher Yau * Jessica K Barrett ## Abstract **Background** Multimorbidity presents a growing public health challenge. While existing research has mainly focused on cross-sectional patterns of multimorbidity, there remains a need to better understand the longitudinal accumulation of diseases and the effects of important sociodemographic characteristics on the accumulation and progression of chronic conditions. **Methods and findings** We use a spline-based parametric multistate modelling (MSM) framework for investigating the role of several sociodemographic characteristics such as ethnicity, deprivation, gender, and age on the rate of disease transitions along different pathways to multimorbidity accumulation. The model is highly flexible and permits a parallel estimation framework that scales well to very large datasets. Focusing on cardiometabolic, renal conditions and mental conditions, we analysed a dataset of about 13 million English electronic primary care records. Our results revealed complex interactions between temporal trajectories of multimorbidity and the sociodemographic factors, with effects that varied depending on the comorbidity status of the subject. These factors tended to have a larger impact on initial multimorbidity development while a smaller impact on later stages. Mental health as a comorbidity often strengthened the adverse effects of sociodemographic characteristics on disease accumulation. **Conclusion** Our findings underscore the potential importance of earlier screening and intervention of chronic conditions, particular for mental health conditions, in the higher-risk populations. This may have important implications for the management of multimorbidity. ## Background Multimorbidity or multiple long-term conditions is defined as the co-existence of two or more chronic conditions in an individual [1]. Multimorbidity increases with age and because of the complexity of managing those with multiple health conditions, it is of increasing importance in countries with ageing populations [2]. Existing research indicates that multimorbidity is associated with poorer quality of life, worsened health outcomes, increased healthcare use and higher mortality [3, 4]. Multimorbidity develops sequentially, as individuals accumulate diagnoses, but to date, research has mainly described cross-sectional patterns of multimorbidity, primarily focusing on identifying multimorbidity clusters [5–7]. We need to understand how diseases accumulate over time and the effects of potentially important sociodemographic characteristics on the accumulation and progression of diagnoses. This may help identify those at high risk of developing further diseases. A recent systematic scoping review summarised research on longitudinal modelling and analysis of multimorbidity patterns, see [8]. One of these analyses investigated the sequence of occurrence of diseases using a visualisation approach, stratified by deprivation status, age and ethnicity [9]. They reported that the more socially deprived had a greater risk for multimorbidity and that ethnicity was associated with the development of multimorbidity. The multistate model (MSM) extends the traditional survival model to encompass more than two states, providing a robust framework for modelling longitudinal event history data and making predictions. MSMs have been effectively used in studies of multimorbidity, where states represent the individual’s health status. In recent studies, MSMs have been applied to examine the progression of chronic conditions [10] and the impact of lifestyle factors on the accumulation of diseases [11]. In [12] the authors evaluated the effect of the order of disease acquisition on patient mortality using a 17-state MSM, while in [11] a 5-state MSM is used to study the impact of lifestyle factors on cancer and cardiometabolic diseases. MSMs have also been used to analyse socioeconomic inequalities in life expectancy with and without multimorbidity [13], the effects of sociodemographic and clinical factors on different stages of multimorbidity development [14], and the potentially differing impacts of lifestyle risk factors on cardiometabolic multimorbidity [15]. MSMs have also been used to examine the role of midlife factors on different cardiometabolic disease trajectories [16]. Most of the existing MSM models of multimorbidity have included only small numbers of conditions (e.g. three) or represent multimorbidity status as counts of conditions rather than combinations of specific conditions. The resulting MSMs thus permit a relatively simple structure and are specified either using simple parametric models such as the Weibull model, which has limited flexibility, or Cox semi-parametric models, which are incapable of absolute risk quantifications or predictions over time. While some risk factors such as lifestyle factors have been considered, detailed studies on the effects of sociodemographic characteristics on the risk of disease-specific multimorbidity progression remain scarce. In addition, previous analyses are mostly based on cohorts of relatively small sample sizes, which may limit the power of their findings relative to the number of states of the models and subgroups. Here we propose a flexible parametric multistate modelling framework to investigate the role of several sociodemographic risk factors on the temporal progression of 5 chronic conditions and 33 states, based on a dataset of approximately 13 million English electronic primary care records. We have focused on cardiometabolic, renal conditions and mental conditions because they are common, important conditions and are known to frequently co-occur, especially in socio-economically deprived areas [17]. The considered conditions are cardiovascular disease (CVD), type 2 diabetes (T2D), chronic kidney disease (CKD), mental health problems (MH) and heart failure (HF). Our MSM extends previous works by allowing for more comorbidity states and disease transition types, providing a more detailed characterization of the pattern of progression of these five diseases. ## Methods ### Data sources The study investigated the temporal sequence of diagnosed chronic conditions in a large anonymised dataset of electronic primary care records, the Clinical Practice Research Datalink (CPRD) [18]. The CPRD Aurum database contains routinely-collected primary care electronic health records from 19 million patients across the 738 English general practices that use EMIS Web® software [18]. It is representative of the English population in terms of age, gender, regional geographical spread and deprivation. It contains information on individual patient demographics. Clinical observations, diagnoses and treatments are recorded as Read Version 2, SNOMED-CT, and EMIS Web® clinical codes. The full data resource profile has been described elsewhere [18]. ### Conditions Five chronic conditions were considered in this study, namely: 1) cardiovascular disease (CVD), including ischaemic heart disease (IHD), peripheral arterial disease (PAD) and stroke or transient ischaemic attack (CVA); 2) type-2 diabetes (T2D); 3) chronic kidney disease (CKD); 4) mental health problems (MH), including depression or anxiety; and 5) heart failure (HF). The presence of a chronic condition was determined by the occurrence of one of the relevant clinical codes (code lists are provided in the supplementary material) in the records, and the date of diagnosis of a condition was taken as the date of the first occurrence of any associated clinical codes. An absence of the code was considered to indicate an absence of the condition. ### Sociodemographic characteristics Sociodemographic characteristics include ethnicity, social and material deprivation, age (calculated from the year of birth) and sex (male/female). Ethnicity was initially classified into five different groups, including White, South Asian (South Asian or British Asian), Black (Black African, Black Carribean or Black British), Mixed and Others (ethnicity codings are provided in the supplementary material) [19]. In our multistate analysis described below, Mixed was merged into Others due to their infrequency in many disease transitions. Social and material deprivation was quantified by the English Index of Multiple Deprivation (IMD), a widely used area-based composite measure of individual-level deprivation [20]. IMD is derived from income, employment, education and living environment and in the CPRD Aurum data, each subject was assigned an IMD decile from 1 (least deprived) to 10 (most deprived). For our analysis, we used IMD quintiles. ### Cohort description Participants were eligible for inclusion in the study if they were registered in a general practice contributing to CPRD between the study start date and the study end date. The study start date was defined as the latest of date the subject turned 18, 01/01/2005, or six months after the patient registration start date. The study end date was defined as the earliest of the patient registration end date, date of death, or the last collection date for practice (11/05/2020). The minimum period of registration ensures time for patient characteristics to be recorded and prior diagnoses to be transferred/recorded. Patients were followed up continuously during the study period. For conditions where code lists were combined the earliest valid code in any code list was used as the date of incidence, and long-term conditions recorded before the study start date were treated as prevalent. For our analysis dataset, we excluded patients with gender classified as indeterminate, those aged >90 years at the study start date and those with missing data on ethnicity or IMD (the vast majority is due to missing information on ethnicity), see supplemental Figure A for more details on sample selection and supplemental Figure C for results of a sensitivity analysis. The resulting cohort consists of a total of 13.48 million individuals. ### Statistical method To examine the effects of ethnicity, deprivation, age and sex on the rate of different pathways of multimorbidity accumulation, we consider a MSM with a total of 33 states and 112 allowable direct transitions (see Figure 1 for an illustration). The states incorporate all possible combinations of the five conditions a person may have during the process of disease progression and the permitted instantaneous state-to-state transitions were indicated by a direct solid line with an arrow indicating the direction of the transition. The death state is the terminal state and can be directly reached from any other state. Here we made the simplifying assumption that no more than one event per person (diagnosis or death) can happen on the same day (so the number of transitions to be estimated is greatly reduced). It is extremely rare for two or more events to take place on a single day and even the most common transitions on the same day were no more than about 5000 among the 13.47 million subjects (i.e. less than 0.04%) in the cohort. Disease transitions recorded on the same day are also likely to be artificially increased by same-day diagnostic tests or recording of diagnoses, so the accuracy of the timing of these recorded transitions is questionable. To flexibly specify the MSM, we followed the suggestions of [21] to use a separate time-to-event model for each transition, and we used a clock-reset time scale (i.e. the time scale resets to zero after every transition) so that the outcome of interest is the time from one disease diagnosis to the next diagnosis or death. More specifically, we modelled the logarithm of the cumulative transition-specific intensities as a spline function of log time [22], from which age, sex, ethnicity and deprivation adjusted hazard ratios (HR) were estimated, assuming proportional hazards and no interactions between these predictors. For each sociodemographic characteristic, we report hazard ratios (and the associated 95% confidence intervals) adjusted for the other sociodemographic characteristics; e.g. hazard ratios for the ethnic group are adjusted for age, sex and IMD. We were able to conduct inference for such a complex MSM by taking advantage of the factorization property of the model likelihood and parallel computing; see the supplementary file for additional technical details. Model fitting for different transitions is performed in parallel via maximum likelihood estimation using the flexsurv package [23] in the R software environment. ![Figure 1:](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2023/03/06/2023.03.06.23286491/F1.medium.gif) [Figure 1:](http://medrxiv.org/content/early/2023/03/06/2023.03.06.23286491/F1) Figure 1: The multistate diagram. The comorbidity states are represented in black boxes where we use red, yellow, green, blue and purple colours to distinguish between cardiovascular disease (CV), type-2 diabetes (T2), chronic kidney disease (CK), mental health (MH) and heart failure (HF), respectively. The transitions are represented using coloured arrows, with colours indicating the disease that is diagnosed when the transition takes place. ## Results ### Summary statistics The cohort baseline characteristics are shown in Table 1. At study entry, 78.5% (n=10,577,667) of the eligible participants had no diagnosis of any of the five conditions of interest. By the end of the follow-up, 83.7% (n=8,856,098) of these had not developed any of the five conditions, 13.8% (n=1,462,966) had developed one, 2% (n=209,237) had developed two, 0.4% (n=42504) had developed three, 0.06% (n=6399) four and 0.0044% (n=463) five of the conditions. Five percent of the study population (n=670,753) died during the study period. Detailed summary statistics for each comorbidity state in our MSM are presented in Supplementary Table A. A summary of the incidence rate for each disease transition in our MSM is provided in Supplementary Table B. The incidence rates of transitions to the first morbidity were notably lower compared to the other transitions, and generally, the incidence rates for transitions increased with the number of comorbidities. It should be noted that the relatively lower incidence rates for transitions from MH compared to other disease transitions could be due to the fact that subjects in the MH state were generally younger (this age effect is adjusted in our MSM analysis). View this table: [Table 1.](http://medrxiv.org/content/early/2023/03/06/2023.03.06.23286491/T1) Table 1. Baseline sociodemographic characteristics of study participants ### Effect of ethnicity on the rate of disease progression There was no consistent association pattern between ethnicity and disease accumulation as this was also modified by comorbidity status (Figure 2). For CVD diagnosis among different comorbidity groups (panel (a)), the hazard ratios tended to be higher for South Asian but lower for black compared to white ethnicity. For T2D (panel (b)), hazard ratios for South Asian, black and other ethnicities compared to White were higher than 1 for most comorbidity groups, but the differences between each of the non-white and white groups decreased with increasing numbers of comorbidities. Generally, differences were less pronounced between the four ethnic groups for CKD (panel (c)), HF (panel (d)) and MH (panel (e)) diagnosis among groups with different comorbidities. For a few transitions, people from the three non-white groups had a slightly reduced rate of disease progression compared to the white ethnicity. For MH, the hazard ratios for black compared to white were mostly lower than 1. ![Figure 2.](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2023/03/06/2023.03.06.23286491/F2.medium.gif) [Figure 2.](http://medrxiv.org/content/early/2023/03/06/2023.03.06.23286491/F2) Figure 2. Effect of ethnicity on the rate of disease transition by comorbidity. Estimated hazard ratios are shown in black dots and the associated 95% confidence intervals are represented with coloured bands. White ethnicity is treated as the reference category. ### Effect of deprivation on the rate of disease progression For the diagnosis of CVD, T2D, CKD, HF or MH, hazard ratios were greater for the more deprived groups (IMD>1) compared with the least deprived group (IMD=1), with a higher level of deprivation associated with a higher rate of disease transition (Figure 3). The strength of association with deprivation was stronger for T2D (HR between 1 to 2) and HF (HR between 1 to 1.7), while it was weaker for CKD, MH, and CVD. The effect of greater deprivation on all five conditions tended to decrease with increasing numbers of comorbidities. ![Figure 3.](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2023/03/06/2023.03.06.23286491/F3.medium.gif) [Figure 3.](http://medrxiv.org/content/early/2023/03/06/2023.03.06.23286491/F3) Figure 3. Effect of deprivation on the rate of disease transition by comorbidity. Estimated hazard ratios are shown in black dots and the associated 95% confidence intervals are represented with coloured bands. Quintile 1 is treated as the reference category and represents the least deprived. ### Effect of gender on the rate of disease progression Adjusted hazard ratios for men (versus women) were statistically significantly higher for CVD, T2D and HF diagnosis but lower for CKD and MH diagnosis across different comorbidity groups (Figure 4). The difference between men and women for CVD and T2D decreased with increasing numbers of comorbidities (HR for men compared to women was still statistically significantly higher than 1 for most transitions). In contrast, for CKD and HF diagnosis (except for transitions from baseline or MH), there was little difference between men and women regardless of the number of comorbidities. The lower hazards of MH diagnosis for men were slightly reduced with increasing number of comorbidities. ![Figure 4.](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2023/03/06/2023.03.06.23286491/F4.medium.gif) [Figure 4.](http://medrxiv.org/content/early/2023/03/06/2023.03.06.23286491/F4) Figure 4. Effect of gender (male) on the rate of disease transition by comorbidity. Estimated hazard ratios are shown in red dots and the associated 95% confidence intervals are represented with black bands. The female gender is treated as the reference category. ### Effect of age on the rate of disease progression A 10-year increase in age was associated with an elevated rate of CVD (HR between 1 to 2), CKD (HR between 1.3 to 2.5) or HF (HR between 1.2 to 2.4) diagnosis (Figure 5). Adjusted hazard ratios for a 10-year increase in age were greatest for those with a single condition, particularly MH, and approached the null for those with increasing numbers of comorbidities. By contrast, adjusted hazard ratios for age were generally statistically significantly lower than 1 for T2D or MH diagnosis, with little difference across the number of comorbidities. ![Figure 5.](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2023/03/06/2023.03.06.23286491/F5.medium.gif) [Figure 5.](http://medrxiv.org/content/early/2023/03/06/2023.03.06.23286491/F5) Figure 5. Effect of age at entry to the current state (in 10 years) on the rate of next disease transition by comorbidity. Estimated hazard ratios are shown in red dots and the associated 95% confidence intervals are represented with black bands. ### Comparing the effects of the sociodemographic characteristics For CVD, CKD, MH and HF diagnosis, a 10-year increase in age, higher deprivation levels, and the sex of the individual (male for CVD and HF, female for CKD and MH) tended to be more strongly associated with an increased rate of diagnosis of a condition compared to ethnic group differences (White as the reference group). Among these four characteristics, the age effect was strongest for CKD and HF. For T2D, belonging to the South Asian ethnic group was associated with the greatest increase in the rate of the diagnosis. ### Mental health The impact of the sociodemographic characteristics on the rate of diagnosis of a condition was generally more significant (in the direction of increasing the transition intensity) for individuals with current diagnoses of MH compared to those with other comorbidities. For age and deprivation (particularly age), this pattern is observed for transitions to all four conditions except MH. Even the slightly protective effect of higher age on transitions to T2D is reversed in the presence of a comorbidity of MH. The adverse effect of being male on transitions to CVD, T2D, or HF was amplified with a MH comorbidity. For ethnicity, the pattern is only observed for transitions to T2D, where the hazard ratio for South Asian ethnicity compared to White was higher for those with a MH diagnosis. ### Sensitivity analysis We conducted a sensitivity analysis to examine the impact of data exclusion on the inferences (here we focused on missing ethnicity as it accounts for the vast majority of missing data). More specifically, we expanded our cohort by including subjects with missing ethnicity, who were assigned to an artificial ethnic group named “missing”. We then re-fitted the MSM in the same manner as before. The results, including estimated HRs and associated confidence intervals for each transition, are presented in supplemental Figure C. Upon comparison with Figures 2-5 in the main paper, we observed that there were no significant differences and the key patterns remained unchanged. ## Discussion ### Main findings We analysed the progression of multimorbidity, defined as the combination of five chronic conditions, including CVD, T2D, CKD, MH and HF, in a cohort of more than 13 million English adults. To our knowledge, this is the largest cohort and the highest number of chronic conditions and disease transitions analysed using a flexible multistate modelling framework. The proposed MSM-based approach provides a detailed picture of the effects of sociodemographic characteristics, including ethnicity, deprivation, gender and age, on the rate of disease transitions along different pathways to multimorbidity accumulation. Our study revealed complex interactions between trajectories of multimorbidity progression and ethnicity, deprivation, gender and age, with effects that varied depending on mental and physical health conditions. Generally, the effect of these characteristics was greater for those with fewer morbidities than those with many comorbidities. This suggests sociodemographic characteristics have a larger impact on initial multimorbidity development while a smaller impact on those with many long-term conditions. Notably, MH, as a co-morbidity, was typically an “outlier” as often exacerbated the effects of sociodemographic characteristics on the transition to the next disease. We also found that the relative magnitude of the effects of these sociodemographic characteristics on the rate of diagnosis of a new condition also depended greatly on the comorbidity status of the subject. ### Ethnicity Compared to the White UK population, a higher risk of diabetes in those of South Asian and Black ethnicity has been observed in other epidemiological research [24]. This research also found higher risk and lower risks of CVD in those of South Asian and Black ethnicity respectively. CKD is underdiagnosed in UK primary care, therefore diagnosis is likely to be influenced by the frequency of diagnostic testing [25]. Rates of testing are strongly associated with the presence of chronic conditions but less strongly with ethnicity [26]. In the past, the estimation of CKD prevalence in Black patients and women may have been overestimated due to adjustments for Black ethnicity and female sex in the calculation of the estimated glomerular filtration rate (eGFR) [27]. Epidemiological research has suggested that depression (the most common mental health condition) is more common in older UK adults of Black and South Asian ethnicity than those of White ethnicity [28]. Our finding of slightly lower rates of mental health problems for those in the Black and South Asian ethnic groups may suggest underascertainment of mental health problems, particularly for these groups. ### Deprivation Our results show that higher levels of deprivation tended to be associated with an elevated risk of disease progression for all conditions and types of transitions. This is in keeping with a previous study, which showed that multimorbidity occurred 10–15 years earlier and physical-mental comorbidity was more than twice as prevalent in the most socioeconomically deprived areas of Scotland compared to the least deprived areas [17]. We found that the effect of deprivation was less strong as the number of comorbidities increased, suggesting that deprivation may have a stronger influence on the initial trajectory of multimorbidity. ### Sex As seen in other studies, men tend to have an increased risk of CVD, T2D, and HF, but a lower risk of MH problems and CKD [29]. Our results echo previous findings and additionally demonstrate that the effect of sex on disease progression is dependent on the underlying comorbidity status. For instance, the association between the male gender and the rate of transition to CVD or T2D was most significant for those with the least comorbidities. For MH, the greater the number of comorbidities, the weaker the gender difference in the rate of MH accumulation. For HF and CKD, the gender effect is less influenced by the level of multimorbidity. Note that our results may be affected by gender disparities in the presentation to healthcare, where women are more likely to present for screening and investigations [30]. On the other hand, those with known health conditions are more likely to be invited for (and maybe more inclined to present for) further monitoring and screening for other health conditions. Thus this gender difference regarding clinical presentation is expected to be smaller with increasing comorbid conditions. ### Age As expected, age was associated with increased risk of CVD, CKD and HF. In addition, we found that age amplified the effect of MH on the likelihood of transition to each of these conditions. Increasing age had a much greater impact on the likelihood of transition to a new condition on those with a single condition than those with multiple conditions. Multimorbidity is increasingly prevalent with increasing age [17]. ### Mental health In comparison to physical health conditions such as CVD, T2D, HF, and CKD, MH was a notable outlier in its effect on the rates of disease transition. Our results suggest that the adverse effect of sociodemographic factors on the rate of transitioning to the next condition (CVD, T2D, CKD or HF) tended to be enhanced for trajectories starting with MH compared to those starting with other conditions. The pattern may be partly explained from biological and behavioural points of view. Those with MH as a starting condition tend to fall into the younger age groups (see Supplemental Table A) and mental ill health is associated with unhealthy lifestyle habits such as smoking, alcohol or substance abuse etc [31], which are also known to be important risk factors for cardiometabolic diseases and contribute to the worsening of multimorbidity [32, 11]. On the other hand, this adverse impact of MH is attenuated with additional comorbidities, which could be influenced by the additional medical intervention/treatment received by the subject and relevant lifestyle modifications. ### Strengths and limitations This study benefits from a large dataset from primary health care which is broadly representative of the English population. The conditions considered in this study are included in an incentive scheme for the management of chronic conditions, which means they are likely to be more accurately coded [33]. The use of a highly flexible multistate model allows for a detailed characterization of the effect of key sociodemographic characteristics on 112 different transitions with a very good overall fit for the vast majority of the transitions, demonstrating the utility and computational feasibility of the approach. There are some limitations with the current study. One primary limitation is the possibility of underdiagnosis or ascertainment bias of the onset of chronic conditions. For instance, diseases like CKD and T2D are often asymptomatic in the early stages, and diagnosis may occur during routine health check/screening or following the diagnosis of a related condition (once diagnosed, general practices actively follow up patients for further disease monitoring). The uptake of such checkups is generally less than 50%, with lower figures for men and some ethnic groups [34]. For CKD, renal function testing is lower in minority ethnic groups [26, 35]. For MH, certain groups (e.g. those in the Black or South Asian ethnic groups) are less likely to present to the GP even with symptoms developed [36]. The implication is that the identified date of diagnosis or the transition time between diagnoses of the conditions in the electronic health record data may not reflect the corresponding true onset or transition times. Additionally, the ethnicity categories used in this study crudely group together communities with diverse cultural practices and sociodemographic characteristics, which may mask differences in outcomes within each category. It is also important to note that time-varying factors associated with the disease progression such as treatment information, lifestyle modifications, and change in deprivation status are not available from the present dataset and thus they were not taken into account in the present analysis. Finally, the median follow-up time for the study participants was about 5 years, with a maximum period of about 15 years, which limits our ability to observe complete trajectories for most subjects and gives rise to a very high right censoring rate for most transitions. A longer follow-up period would be desirable for obtaining better inference for transitions in later stages of multimorbidity. ### Unanswered questions and future research Clinical research implications of this analysis are to investigate and better understand the biopsychosocial factors driving progression to multimorbidity. The greatest differences in transition rates to multimorbidity between different groups are in the development of the first few conditions, which suggests the early stages in the pathway to multimorbidity may be particularly important. The MSM framework used here could permit broader types of inference, such as the probability of transition between two comorbidity states over a given time interval (conditional on sociodemographic characteristics), which can be inferred by simulating trajectories conditional on the fitted model. We could also analyse the effect of disease transition history on subsequent disease progression by representing previous transitions as additional covariates in the model. There is also scope to study the association between concurrently measured longitudinal covariates (e.g. biomarkers) and multimorbidity progression using a joint modelling framework for both the longitudinal marker and the multistate processes [37]. Further methodological research is required to overcome the computational challenge of fitting this model. ### Conclusion Multimorbidity is a growing public health challenge and there is a need of moving from a single disease-oriented to a more individualised and holistic approach. In this study, we investigated the influence of key sociodemographic characteristics on the development and progression of multimorbidity, revealing distinctive association patterns for different levels of multimorbidity and disease transitions. Our results indicate that the effects of these factors tend to be weakened with increasing levels of multimorbidity. The difference of mental health as a comorbidity is also highlighted as it often strengthens the adverse effects of sociodemographic characteristics on disease accumulation. These findings underscore the need for earlier screening and intervention of mental health conditions, especially for those in the higher-risk populations, which may have important implications for the management of multimorbidity. ## Supporting information code lists [[supplements/286491_file02.zip]](pending:yes) ## Data Availability The data were provided under license by the Clinical Practice Research Datalink (CPRD), and so are not publicly available. Access to CPRD data is subject to protocol approval via CPRD's Research Data Governance (RDG) Process. ## Author Declarations ### Competing interest statement JB has previously received funding for unrelated work from F. Hoffmann-La Roche Ltd. KN reports grants from NIHR, MRC, Diabetes UK, Vifor, and AstraZeneca, and personal fees from Merck Sharp & Dohme, Sanofi, and Boehringer Ingelheim, outside of and unrelated to the submitted work. TM has previously received funding for unrelated work: from Cancer Research UK and from the Health Research Board (Ireland) for advisory board participation and travel. SJG received payment from Novo Nordisk, Napp and Astra Zeneca for lectures at 6 educational events over the last 3 years. Napp supported SJG’s attendance at EASD 2018 and he received payment from Eli Lilly associated with membership of an independent data monitoring committee for a randomised trial of a medication to lower glucose. PK, SR and KN are co-investigators on the HDRUK grant titled Measuring & Understanding Multimorbidity using Routine Data in the UK (MUrMuR-UK). CY has received remuneration for unrelated consultancy services to F. Hoffmann-La Roche and Singula Bio. The remaining authors have no conflicts of interest. ### Data availability statement The data were provided under license by the Clinical Practice Research Datalink (CPRD), and so are not publicly available. Access to CPRD data is subject to protocol approval via CPRD’s Research Data Governance (RDG) Process. ### Funding statements This work is funded by the MRC/NIHR grant MR/S027602/1 (BIRM-CAM). TM is supported by the National Institute for Health Research Collaboration Applied Research Collaboration West Midlands (NIHR ARC WM). KN is funded by a Health Data Research UK Fellowship. CY is funded by a UKRI Turing AI Fellowship (EP/V023233/2). JB was funded by MRC unit programme MC\_UU\_00002/5. CJ was funded by the Medical Research Council programme number MRC\_MC\_UU\_00002/11. PK was funded by MRC unit programme MC\_UU\_00002/13. SR was funded by MRC unit programme MC\_UU_00002/10. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. ### Disclaimers The views expressed in this publication are those of the authors and not necessarily those of the NIHR, NHS, the UK Department of Health and Social Care or other funding institutions. ### Ethics approval statement The Clinical Practice Research Datalink (CPRD) Research Database has ethics approval from the Health Research Authority to support research using anonymised patient data (East Midlands – Derby Research Ethics Committee, reference 21/EM/0265). Data was approved for this study by the CPRD Independent Scientific Advisory Committee (protocol 19_265). ## Supplementary file ![Supplemental Figure A.](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2023/03/06/2023.03.06.23286491/F6.medium.gif) [Supplemental Figure A.](http://medrxiv.org/content/early/2023/03/06/2023.03.06.23286491/F6) Supplemental Figure A. Flowchart for sample selection ### Supplemental Technical Details #### The multistate modelling approach In this section we provide a brief introduction to multistate modelling (MSM), and we refer to [38] and references therein for more in-depth coverage of the topic. A multistate model is built from a mathematical object named a multistate process. A multistate process is a stochastic process {*X(t)*, *t* ≥ 0} with a finite state space *S* = {1, …, *N* } (here we focus on the setting where *t* is continuous and *X*(*t*) is continuously observed). The law of a multistate process can be fully characterised via the transition intensities, which directly describe the instantaneous dynamic of the process ![Formula][1] where *F**t*− represents the history of the process over [0, *t*). There are two special classes of the multistate processes that have been thoroughly studied. The first is the Markov processes, which relies on the Markov assumption such that risks at a particular time are independent of the history, so that ![Formula][2] The second class is the so-called clock-reset semi-Markov processes, which relaxes the Markov property by assuming that ![Formula][3] where *B*(*t*) is the time since entry to the current state h. Therefore, *a**hj* depends on the history only through *B*(*t*), and in effect, the time scale after each transition is reset to zero. In the former scenario, the transition probabilities between two states can be derived from the intensities by solving the Kolmogorov differential equations, whereas for the latter case, no explicit formulation exists. In such scenarios, one could estimate transition probabilities from intensities via simulation-based techniques. A multistate model can be fitted to data representing the evolution of *n* subjects (defined on the same state space) in a population, each of whom is followed up continuously for a potentially different length of time (may be subject to delayed entry or right censoring). To model the covariate-dependent transition intensities for subject ![Graphic][4], Markov or semi-Markov assumptions are commonly adopted and semi-parametric or parametric proportional hazard regression framework is often used. If we in addition assume that each individual’s trajectory, *X**i*(*t*), is independent, then it can be shown using counting process theory that the MSM likelihood factorizes as ![Formula][5] where *K* is the total number of allowed transition types in the multistate process (e.g. if *a*hj ≠ 0 then *h* → *j* represents one type of transition), *L**i* is the likelihood obtained by considering transition type *i* as a survival problem with data based on all subjects at risk for this transition and *#x03B8;* is the parameter vector consisting of parameters from all transitions. An important consequence of this factorization property is that if no parameters are shared across transitions, i.e. *θ* = (*θ*1, …, *θ**K*), then we could perform maximum likelihood estimation of the MSM by maximising each *L**i* independently with respect to its parameter set *θ**i* (see e.g. [39]). In our analysis we specify the MSM by using a separate time-to-event model for each permitted *h* → *j* transition. To ensure flexibility, we use the spline-based Royston-Parmar model [22] for the transition-specific intensities,![Graphic][6], ![Formula][7] where *u* refers to the time since entry to the state *h*, ![Graphic][8] is the cumulative transition intensity associated with ![Graphic][9] is a natural cubic spline in *log u* with coefficient vector *r**hj* (see [22] for more details on the construction of the spline function), *Z**i* denotes the vector of the exogenous covariates for subject *i* and *β*hj** is the vector of the associated linear effects. Note that the resulting time-to-event model retains a proportional hazard structure so that elements of *β**hj* are still interpreted as log-hazard ratios. In our context, the covariates are IMD, ethnicity, gender and age (at the time of the previous transition; in 10 years). The first three variables are categorical and are represented via dummy variables with reference groups set to female, IMD=1 (the least deprived group) and White (the dominant ethnic group), respectively. Age is approximated by a piecewise constant process with jumps occurring at the transition times (hence is treated as a time constant covariate in a transition model). A practical issue here is to determine the number and location of the internal knots for the spline term *s**hj*, which would influence the flexibility of the resulting model. We follow the suggestions in [22] to use the Akaike information criterion (AIC) to determine the number of internal knots in the spline from the candidate values of {0, …, 5}. Note that when there are zero knots the model reduces to a Weibull model. Conditional on the knot number, the location of these knots is set to equally spaced percentiles of the logarithm of the uncensored transition times for *h* → *j* (empirical study indicates that the knot position is not a key driver of model fit). #### Model checking We used the Cox-Snell residuals to graphically examine the goodness of fit of each of our transition models [40, 41]. For a fitted time-to-event model, the Cox-Snell residuals are defined as, ![Graphic][10] where ![Graphic][11] is the estimated cumulative hazard function for subject *i* under the fitted model and *T**i* is the observed event time. If the model fits the data well, we expect the ![Graphic][12] to be close to draws from a unit exponential distribution. This diagnostic is motivated by the fact that for any random variable *T* with survival function *S*(*t*) = *P*(*T* > *t*), the transformed random variable is ![Graphic][13] distributed as an exponential random variable with unit mean. To appropriately take censored event times into account, in practice we estimate the cumulative hazard of the residuals, i.e ![Graphic][14] where ![Graphic][15] is the Kaplan-Meier estimate of the survival function associated with ![Graphic][16] (for censored event times Ti the corresponding ![Graphic][17] are treated as censored as well). A good fit is suggested if a plot of ![Graphic][18] against ![Graphic][19] is close to a straight line with an intercept of zero and a slope of one. For our fitted MSM, we computed the Cox-Snell residuals for each transition, and we observed satisfactory results for of transitions, see Figure B for selective results. Here we observed a slight lack of fit for transitions MH->MH+T2 and MH->MH+CK. We found that residuals that significantly deviated from the reference line (in red) correspond to subjects who had a long transition time, but their sociodemographic characteristics suggest that the transition should have happened earlier (e.g. diagnosed MH at a high age and with high deprivation levels). The presence of such “outliers” may not be surprising given the complexity of the disease progression mechanisms and the inter-subject heterogeneity in such a large cohort. On the other hand, it should be noted that these outliers only account for an extremely small proportion of the data involved in these 112 transitions. Out of the 112 transitions these two were among the most ill-fitting. View this table: [Supplemental Table A.](http://medrxiv.org/content/early/2023/03/06/2023.03.06.23286491/T2) Supplemental Table A. Summary of disease states by sociodemographic characteristics. View this table: [Supplemental Table B.](http://medrxiv.org/content/early/2023/03/06/2023.03.06.23286491/T3) Supplemental Table B. Summary of disease transitions ![Supplemental Figure B.](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2023/03/06/2023.03.06.23286491/F7.medium.gif) [Supplemental Figure B.](http://medrxiv.org/content/early/2023/03/06/2023.03.06.23286491/F7) Supplemental Figure B. Plots of the estimated cumulative hazard of the Cox-Snell residuals. ![](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2023/03/06/2023.03.06.23286491/F8/graphic-22.medium.gif) [](http://medrxiv.org/content/early/2023/03/06/2023.03.06.23286491/F8/graphic-22) ![](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2023/03/06/2023.03.06.23286491/F8/graphic-23.medium.gif) [](http://medrxiv.org/content/early/2023/03/06/2023.03.06.23286491/F8/graphic-23) ![](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2023/03/06/2023.03.06.23286491/F8/graphic-24.medium.gif) [](http://medrxiv.org/content/early/2023/03/06/2023.03.06.23286491/F8/graphic-24) ![](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2023/03/06/2023.03.06.23286491/F8/graphic-25.medium.gif) [](http://medrxiv.org/content/early/2023/03/06/2023.03.06.23286491/F8/graphic-25) Supplemental Figure C. Effect of the four sociodemographic characteristics on the rate of disease transition by comorbidity. Results are obtained based on the cohort including those with missing ethnicity information. Graphical settings are the same as in Figures 2-5 of the main paper. ## Acknowledgements This work used data from the Clinical Practice Research Datalink (CPRD) for ISAC protocol 19_265. The data were provided by patients and collected by the NHS as part of their care and support. For the purpose of open access, the author has applied a Creative Commons Attribution (CC BY) licence to any Author Accepted Manuscript version arising from this submission. * Received March 6, 2023. * Revision received March 6, 2023. * Accepted March 6, 2023. * © 2023, Posted by Cold Spring Harbor Laboratory This pre-print is available under a Creative Commons License (Attribution 4.0 International), CC BY 4.0, as described at [http://creativecommons.org/licenses/by/4.0/](http://creativecommons.org/licenses/by/4.0/) ## References 1. 1.van den Akker M, Buntinx F, Knottnerus JA. Comorbidity or multimorbidity. Eur J Gen Pract. 1996;2: 65–70. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.3109/13814789609162146&link_type=DOI) 2. 2.Wallace E, Salisbury C, Guthrie B, Lewis C, Fahey T, Smith SM. Managing patients with multimorbidity in primary care. BMJ. 2015;350: h176. 3. 3.Soley-Bori M, Ashworth M, Bisquera A, Dodhia H, Lynch R, Wang Y, et al. Impact of multimorbidity on healthcare costs and utilisation: a systematic review of the UK literature. Br J Gen Pract. 2021;71: e39– e46. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NDoiYmpncCI7czo1OiJyZXNpZCI7czoxMDoiNzEvNzAyL2UzOSI7czo0OiJhdG9tIjtzOjUwOiIvbWVkcnhpdi9lYXJseS8yMDIzLzAzLzA2LzIwMjMuMDMuMDYuMjMyODY0OTEuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9) 4. 4.Nunes BP, Flores TR, Mielke GI, Thumé E, Facchini LA. Multimorbidity and mortality in older adults: A systematic review and meta-analysis. Arch Gerontol Geriatr. 2016;67: 130–138. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.archger.2016.07.008&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=27500661&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2023%2F03%2F06%2F2023.03.06.23286491.atom) 5. 5.Identifying longitudinal clusters of multimorbidity in an urban setting: A population-based cross-sectional study. The Lancet Regional Health – Europe. 2021;3: 100047. 6. 6.Busija L, Lim K, Szoeke C, Sanders KM, McCabe MP. Do replicable profiles of multimorbidity exist? Systematic review and synthesis. Eur J Epidemiol. 2019;34: 1025–1053. [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2023%2F03%2F06%2F2023.03.06.23286491.atom) 7. 7.Zhu Y, Edwards D, Mant J, Payne RA, Kiddle S. Characteristics, service use and mortality of clusters of multimorbid patients in England: a population-based study. BMC Med. 2020;18: 78. 8. 8.Cezard G, McHale CT, Sullivan F, Bowles JKF, Keenan K. Studying trajectories of multimorbidity: a systematic scoping review of longitudinal approaches and evidence. BMJ Open. 2021;11: e048485. 9. 9.Ashworth M, Durbaba S, Whitney D, Crompton J, Wright M, Dodhia H. Journey to multimorbidity: longitudinal analysis exploring cardiovascular risk factors and sociodemographic determinants in an urban setting. BMJ Open. 2019;9: e031649. 10. 10.Siriwardhana C, Lim E, Davis J, Chen JJ. Progression of diabetes, ischemic heart disease, and chronic kidney disease in a three chronic conditions multistate model. BMC Public Health. 2018;18: 1–14. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1186/s12889-018-6293-9&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2023%2F03%2F06%2F2023.03.06.23286491.atom) 11. 11.Freisling H, Viallon V, Lennon H, Bagnardi V, Ricci C, Butterworth AS, et al. Lifestyle factors and risk of multimorbidity of cancer and cardiometabolic diseases: a multinational cohort study. BMC Med. 2020;18: 1–11. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1186/s12916-019-1443-1&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=31898501&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2023%2F03%2F06%2F2023.03.06.23286491.atom) 12. 12.Owen R, Lyons J, Akbari A, Guthrie B, Agrawal U, Alexander D, et al. Temporal sequencing in multimorbidity using population-scale linked data for 1.7 million individuals with 20-year follow-up. Research Square. 2022. doi:10.21203/rs.3.rs-1537576/v1 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.21203/rs.3.rs-1537576/v1&link_type=DOI) 13. 13.Chan MS, van den Hout A, Pujades-Rodriguez M, Jones MM, Matthews FE, Jagger C, et al. Socio-economic inequalities in life expectancy of older adults with and without multimorbidity: a record linkage study of 1.1 million people in England. Int J Epidemiol. 2019;48: 1340–1351. [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2023%2F03%2F06%2F2023.03.06.23286491.atom) 14. 14.Bisquera A, Turner EB, Ledwaba-Chapman L, Dunbar-Rees R, Hafezparast N, Gulliford M, et al. Inequalities in developing multimorbidity over time: A population-based cohort study from an urban, multi-ethnic borough in the United Kingdom. Lancet Reg Health Eur. 2022;12: 100247. 15. 15.Han Y, Hu Y, Yu C, Guo Y, Pei P, Yang L, et al. Lifestyle, cardiometabolic disease, and multimorbidity in a prospective Chinese study. Eur Heart J. 2021;42: 3374–3384. 16. 16.Singh-Manoux A, Fayosse A, Sabia S, Tabak A, Shipley M, Dugravot A, et al. Clinical, socioeconomic, and behavioural factors at age 50 years and risk of cardiometabolic multimorbidity and mortality: A cohort study. PLoS Med. 2018;15: e1002571. 17. 17.Barnett K, Mercer SW, Norbury M, Watt G, Wyke S, Guthrie B. Epidemiology of multimorbidity and implications for health care, research, and medical education: a cross-sectional study. Lancet. 2012;380: 37–43. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/S0140-6736(12)60240-2&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=22579043&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2023%2F03%2F06%2F2023.03.06.23286491.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000306115900031&link_type=ISI) 18. 18.Wolf A, Dedman D, Campbell J, Booth H, Lunn D, Chapman J, et al. Data resource profile: Clinical Practice Research Datalink (CPRD) Aurum. Int J Epidemiol. 2019;48: 1740–1740g. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/ije/dyz034&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2023%2F03%2F06%2F2023.03.06.23286491.atom) 19. 19.List of ethnic groups. [cited 21 Feb 2023]. Available: [https://www.ethnicity-facts-figures.service.gov.uk/style-guide/ethnic-groups](https://www.ethnicity-facts-figures.service.gov.uk/style-guide/ethnic-groups) 20. 20.Ministry of Housing, Communities, Local Government. English indices of deprivation 2019. GOV.UK; 2019. Available: [https://www.gov.uk/government/statistics/english-indices-of-deprivation-2019](https://www.gov.uk/government/statistics/english-indices-of-deprivation-2019) 21. 21.Crowther MJ, Lambert PC. Parametric multistate survival models: Flexible modelling allowing transition-specific distributions with application to estimating clinically useful measures of effect differences. Stat Med. 2017;36: 4719–4742. 22. 22.Royston P, Parmar MKB. Flexible parametric proportional-hazards and proportional-odds models for censored survival data, with application to prognostic modelling and estimation of treatment effects. Stat Med. 2002;21: 2175–2197. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/sim.1203&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=12210632&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2023%2F03%2F06%2F2023.03.06.23286491.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000177051300004&link_type=ISI) 23. 23.Jackson CH. flexsurv: A Platform for Parametric Survival Modeling in R. J Stat Softw. 2016;70. doi:10.18637/jss.v070.i08 24. 24.Tillin T, Hughes AD, Mayet J, Whincup P, Sattar N, Forouhi NG, et al. The relationship between metabolic risk factors and incident cardiovascular disease in Europeans, South Asians, and African Caribbeans: SABRE (Southall and Brent Revisited) -- a prospective population-based study. J Am Coll Cardiol. 2013;61: 1777–1786. [FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6MzoiUERGIjtzOjExOiJqb3VybmFsQ29kZSI7czo0OiJhY2NqIjtzOjU6InJlc2lkIjtzOjEwOiI2MS8xNy8xNzc3IjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjMvMDMvMDYvMjAyMy4wMy4wNi4yMzI4NjQ5MS5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 25. 25.Hirst JA, Hill N, O’Callaghan CA, Lasserson D, McManus RJ, Ogburn E, et al. Prevalence of chronic kidney disease in the community using data from OxRen: a UK population-based cohort study. Br J Gen Pract. 2020;70: e285–e293. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NDoiYmpncCI7czo1OiJyZXNpZCI7czoxMToiNzAvNjkzL2UyODUiO3M6NDoiYXRvbSI7czo1MDoiL21lZHJ4aXYvZWFybHkvMjAyMy8wMy8wNi8yMDIzLjAzLjA2LjIzMjg2NDkxLmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 26. 26.Feakins B, Oke J, McFadden E, Aronson J, Lasserson D, O’Callaghan C, et al. Trends in kidney function testing in UK primary care since the introduction of the quality and outcomes framework: a retrospective cohort study using CPRD. BMJ Open. 2019;9: e028062. 27. 27.Tomlinson LA, Clase CM. Sex and the Incidence and Prevalence of Kidney Disease. Clin J Am Soc Nephrol. 2019;14: 1557–1559. [FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiRlVMTCI7czoxMToiam91cm5hbENvZGUiO3M6ODoiY2xpbmphc24iO3M6NToicmVzaWQiO3M6MTA6IjE0LzExLzE1NTciO3M6NDoiYXRvbSI7czo1MDoiL21lZHJ4aXYvZWFybHkvMjAyMy8wMy8wNi8yMDIzLjAzLjA2LjIzMjg2NDkxLmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 28. 28.Williams ED, Tillin T, Richards M, Tuson C, Chaturvedi N, Hughes AD, et al. Depressive symptoms are doubled in older British South Asian and Black Caribbean people compared with Europeans: associations with excess co-morbidity and socioeconomic disadvantage. Psychol Med. 2015;45: 1861– 1871. [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2023%2F03%2F06%2F2023.03.06.23286491.atom) 29. 29.McManus S, Bebbington P, Jenkins R, Brugha T. Mental Health and Wellbeing in England: the Adult Psychiatric Morbidity Survey 2014. McManus S, Bebbington P, Jenkins R, Brugha T, editors. 2016 [cited 18 Dec 2022]. Available: [https://openaccess.city.ac.uk/id/eprint/23646/](https://openaccess.city.ac.uk/id/eprint/23646/) 30. 30.Vaidya V, Partha G, Karmakar M. Gender differences in utilization of preventive care services in the United States. J Womens Health . 2012;21: 140–145. 31. 31.Jha MK, Qamar A, Vaduganathan M, Charney DS, Murrough JW. Screening and Management of Depression in Patients With Cardiovascular Disease: JACC State-of-the-Art Review. J Am Coll Cardiol. 2019;73: 1827–1845. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/S0735-1097(19)32433-7&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.ncbi.nlm.nih.gov/pubmed/30975301&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2023%2F03%2F06%2F2023.03.06.23286491.atom) 32. 32.Bonnet F, Irving K, Terra J-L, Nony P, Berthezène F, Moulin P. Anxiety and depression are associated with unhealthy lifestyle in patients at risk of cardiovascular disease. Atherosclerosis. 2005;178: 339– 344. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.atherosclerosis.2004.08.035&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=15694943&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2023%2F03%2F06%2F2023.03.06.23286491.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000227200800016&link_type=ISI) 33. 33.QOF 2021-22. [cited 4 Nov 2022]. Available: [https://qof.digital.nhs.uk/](https://qof.digital.nhs.uk/) 34. 34.Robson J, Garriga C, Coupland C, Hippisley-Cox J. NHS Health Checks: an observational study of equity and outcomes 2009-2017. Br J Gen Pract. 2021;71: e701–e710. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NDoiYmpncCI7czo1OiJyZXNpZCI7czoxMToiNzEvNzEwL2U3MDEiO3M6NDoiYXRvbSI7czo1MDoiL21lZHJ4aXYvZWFybHkvMjAyMy8wMy8wNi8yMDIzLjAzLjA2LjIzMjg2NDkxLmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 35. 35.Pal K, Horsfall L, Sharma M, Nazareth I, Petersen I. Time trends in the incidence of clinically diagnosed type 2 diabetes and pre-diabetes in the UK 2009–2018: a retrospective cohort study. BMJ Open Diabetes Research and Care. 2021;9: e001989. 36. 36.Weich S, Nazroo J, Sproston K, McManus S, Blanchard M, Erens B, et al. Common mental disorders and ethnicity in England: the EMPIRIC study. Psychol Med. 2004;34: 1543–1551. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1017/S0033291704002715&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=15724884&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2023%2F03%2F06%2F2023.03.06.23286491.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000225680700016&link_type=ISI) 37. 37.Rizopoulos D. Joint Models for Longitudinal and Time-to-Event Data: With Applications in R. CRC Press; 2012. 38. 38.Cook RJ, Lawless JF. Multistate models for the analysis of life history data. [cited 22 Oct 2022]. doi:10.1201/9781315119731/multistate-models-analysis-life-history-data-richard-cook-jerald-lawless 39. 39.Ieva F, Jackson CH, Sharples LD. Multi-state modelling of repeated hospitalisation and death in patients with heart failure: The use of large administrative databases in clinical epidemiology. Stat Methods Med Res. 2017;26: 1350–1372. 40. 40.Cox DR, Snell EJ. A general definition of residuals. J R Stat Soc. 1968;30: 248–265. 41. 41.Collett D. Modelling Survival Data in Medical Research. CRC Press; 2015. [1]: /embed/graphic-8.gif [2]: /embed/graphic-9.gif [3]: /embed/graphic-10.gif [4]: /embed/inline-graphic-1.gif [5]: /embed/graphic-11.gif [6]: /embed/inline-graphic-2.gif [7]: /embed/graphic-12.gif [8]: /embed/inline-graphic-3.gif [9]: /embed/inline-graphic-4.gif [10]: /embed/inline-graphic-5.gif [11]: /embed/inline-graphic-6.gif [12]: /embed/inline-graphic-7.gif [13]: /embed/inline-graphic-8.gif [14]: /embed/inline-graphic-9.gif [15]: /embed/inline-graphic-10.gif [16]: /embed/inline-graphic-11.gif [17]: /embed/inline-graphic-12.gif [18]: /embed/inline-graphic-13.gif [19]: /embed/inline-graphic-14.gif