Choosing questions before methods in dementia research with competing events and causal goals
=============================================================================================

* L. Paloma Rojas-Saunero
* Jessica G. Young
* Vanessa Didelez
* M. Arfan Ikram
* Sonja A. Swanson

## ABSTRACT

Several of the hypothesized or studied exposures that may affect dementia risk are known to increase the risk of death. This may explain counterintuitive results, where exposures that are known to be harmful for mortality risk sometimes seem protective for the risk of dementia. Authors have attempted to explain these counterintuitive results as biased, but the bias associated with a particular analytic method cannot be defined or assessed if the causal question is not explicitly specified. Indeed, we can consider several causal questions when competing events like death, which cannot be prevented by design, are present. Current dementia research guidelines have not explicitly considered what constitutes a meaningful causal question in this setting or, more generally, how this choice justifies and should drive particular analytic decisions. To contextualize current practices, we first perform a systematic review of the conduct and interpretation of longitudinal studies focused on dementia outcomes where death is a competing event. We then describe and demonstrate how to address different causal questions (referred here as “the total effect” and “the controlled direct effect”) with traditional analytic approaches under explicit assumptions. Our application focuses on smoking cessation in late-midlife. To illustrate core concepts, we discuss this example both in terms of a hypothetical randomized trial and with an emulation of such a trial using observational data from the Rotterdam Study.

Search Terms
*   All epidemiology [52]
*   Cohort studies [54]
*   Alzheimer’s disease [26]
*   All cognitive disorders/dementia [25]
*   Mortality

## 1. INTRODUCTION

Much research on dementia etiology focuses on understanding the role of biological factors in the pathophysiological process, and the impact of modifiable factors that could prevent or delay the onset of the disease[1]. As such, the field is often interested in causal questions. However, proper causal inference in dementia research faces many methodological and substantive challenges[2]. One of these challenges, which arises even in randomized trials, is that individuals at risk of dementia may die of other causes prior to its onset. In this setting, death is a *competing event* because an individual who dies from another cause prior to dementia onset cannot subsequently experience dementia[3].

Several of the hypothesized or studied exposures that may affect dementia risk can also increase the risk of death. This may explain counterintuitive results, where exposures that are known to be harmful for mortality risk, such as smoking[4,5] or history of cancer[6], sometimes seem protective for the risk of dementia. Authors have attempted to make sense of these counterintuitive results by naming biases such as “competing risk bias” or “survival bias”[7,8]. However, the bias associated with a particular analytic method cannot be defined or assessed if the causal question is not explicitly specified. We can envisage several causal questions when competing events like death are present, as we will discuss below.

Current dementia research guidelines[2] have not explicitly considered what constitutes a meaningful causal question in this setting or, more generally, how this choice justifies and should drive particular analytic decisions. Previous recommendations in the statistical and epidemiologic methods literature have advocated for the *cause-specific hazard* ratio when the aim is “etiologic”[9–12], relying on the popular Cox proportional hazards (PH) model. However, such recommendations cannot be justified (or criticized) without reference to the causal question which the analysis seeks to answer. The lack of explicit consideration of questions has contributed to confusion about the interpretation of different approaches to data analysis, including the misconception that “*censoring”* competing events is equivalent to somehow “*ignoring”* them. Recently, Young and colleagues placed core statistical concepts like *risk* and *hazard* within a formal causal inference framework and mapped common analytic strategies for competing events to questions about an intervention’s effect either with and without elimination of the competing event[13]. This work clarifies the role of censoring with respect to question formulation, assumptions needed for causal interpretation given real-world data, and particular analytic choices.

The goals of the current study are two-fold. To contextualize current practices, we first perform a systematic review of the conduct and interpretation of longitudinal studies focused on dementia outcomes where death is a competing event. Second, to contextualize the ideas formalized by Young and colleagues[13] in the setting of dementia research, we describe and demonstrate how to translate different causal questions (what we will refer to as questions concerning “the total effect” and “the controlled direct effect”) to popular analytic approaches under explicit assumptions. Our application focuses on smoking cessation in late-midlife, one of the twelve modifiable risk factors for dementia prevention described in the 2020 report of the Lancet Commission[1]. To illustrate core concepts, we discuss this example both in terms of a hypothetical randomized trial and with an emulation of such a trial using observational data from the Rotterdam Study[14].

## 2. SYSTEMATIC REVIEW OF LONGITUDINAL STUDIES OF DEMENTIA

### 2.1. Methods

We conducted a systematic review of recent original research articles with dementia outcome. We aimed to describe how death during follow-up is handled in the design, analysis, reporting, and interpretation.

Eligibility for our systematic review included: original research with longitudinal data on dementia or Alzheimer’s disease outcomes, published between January 2018 to December 2019. We limited our systematic review to nine journals in applied dementia or applied general medical research: Alzheimer’s and Dementia, Annals of Neurology, BMJ, JAMA, JAMA Neurology, Lancet, Lancet Neurology, Neurology, New England Journal of Medicine. We searched PubMed for papers that contained the words in the abstract: Alzheimer’s disease or dementia; longitudinal or cohort; hazard or risks (**Supplemental Data 1**).

We collected the following information from each eligible article: (1) reported study characteristics (type of exposure, median length of follow-up, study aim); (2) reporting on death and loss to follow-up (number of people who died over time, number of people who died over time by level of an exposure of interest, number of people lost to follow-up); (3) information on specific methodologic considerations (explicitly mentions how the competing event of death is handled in the analysis plan, primary target parameter, primary statistical method, explicitly mentions the assumptions needed for valid inference given the competing event of death and additional analyses and measures reported); and (4) interpretation (valid interpretation of the primary result given the competing event of death, discusses mortality in discussion).

### 2.2. Results

We retrieved 210 papers using the specified terms, 78 of which met eligibility criteria (**Supplemental Data 1**). Though we intended to classify articles according to the type of aim (descriptive, predictive and causal), this was not possible since most articles are not specific, and more frequently the term “association” is used to represent an ambiguous aim. Over 80% of the studies (n=63) reported associations between dementia and a single measure of a time-fixed or time-varying exposure (**Table 1**). Mean or median follow-up was over 5 years for 70% (n=55) of the studies. The number or proportion of individuals who died over time was reported in 41 (53%) papers; 12 (15%) presented these numbers by exposure level and 41 (53%) reported losses to follow up. Only 27% (n=21) explicitly reported how the competing event of death was handled in the methods section. The vast majority presented estimates of a hazard ratio (88%) based on a Cox proportional hazard model (85%). A table with the additional reported summary measures is present in **Supplemental Data 1**. Of all papers, only four explicitly mentioned assumptions pertaining to the presence of deaths by other causes (i.e., competing events). No papers that reported estimated coefficients of a (cause-specific) Cox proportional hazard model considered their explicit interpretation (see below)[3,13,15–18]. Overall, only one-third of the publications (n=25) mentioned death in some context (e.g., interpretation, limitation) in the discussion section.

View this table:
[Table 1.](http://medrxiv.org/content/early/2021/06/03/2021.06.01.21258142/T1)

Table 1. Current reporting practices relevant to competing events in dementia research (N=78 articles)

With the findings of this systematic review in mind, we next turn to describing important features of data structures with competing events, causal questions that can be posed, and under what assumptions and using what methods those questions can be answered.

## 3. FROM QUESTIONS TO METHODS IN DEMENTIA STUDIES WHERE SOME INDIVIDUALS DIE DURING STUDY: A PEDAGOGIC EXAMPLE

### 3.1. Observed data structure

Consider the effect of smoking cessation (versus continuing) in late-midlife on developing dementia after 20 years of follow-up. In order to focus on the challenges to causal inference created by competing events, let us begin by considering an idealized randomized trial such that middle-aged smokers are randomly assigned to a strategy to quit smoking versus continue smoking. Dementia onset is rigorously measured through constant screening, with date of death during follow-up recorded through linkage with municipal records. Further, suppose in the idealized trial we have complete follow-up (all individuals remain in the study until end of follow-up or until death) and perfect adherence.

Trial participants will be observed to follow different possible event trajectories through the study period: death without developing dementia; dementia onset (some dying after dementia onset); or remaining alive and dementia-free until end of follow-up. For those individuals who died without developing dementia, after the time of death, they cannot subsequently develop dementia. This is the key implication of competing events: they make it impossible for the event of interest to subsequently occur. It is precisely this determinism that makes choosing a causal question more difficult than when competing events are absent (e.g., when the event of interest is all-cause mortality rather than dementia).

The causal diagram in **Figure 1** represents some key features of this data structure, where an arrow from one node A into another node B on a causal diagram reflects that A may cause B[19]. Gaining familiarity with the key features in this causal diagram will illuminate the tradeoffs in interpretation of different definitions of a causal effect on dementia in the presence of death, as well as the assumptions used for identifying these effects from observable data. In this graph, *Smoking* represents an individual’s smoking status, and *Death**19* and *Dementia**20* represent indicators of death by 19 years of follow-up and dementia risk by 20 years of follow-up, respectively. By randomization, we know there are no shared causes of *Smoking* and other variables represented on the graph (the only cause of quitting smoking is a “coin flip”). However, we have no such guarantee for death and dementia status over the follow-up; therefore, the graph depicts shared causes *C* of dementia and death (such as cardiovascular comorbidities) that may or may not be measured. The arrows from *Smoking* to *Death**19* and *Smoking* to *Dementia**20* illustrate that smoking may affect both dementia and death through different mechanisms. The bold arrow from *Death**19* to *Dementia**20* represents the key feature of a competing events data structure: an individual who dies by year 19 of follow-up *cannot* subsequently develop dementia at the next time point, with the boldness here indicating the determinism. Though we present death and dementia at years 19 and 20 respectively, the causal diagram could be expanded to include their assessments in previous years as well. This simplified causal diagram is sufficient, however, for our consideration of the causal question.

![Figure 1.](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2021/06/03/2021.06.01.21258142/F1.medium.gif)

[Figure 1.](http://medrxiv.org/content/early/2021/06/03/2021.06.01.21258142/F1)

Figure 1. 
A causal directed acyclic graph representing some key causal features of the data structure. *Smoking* represents the exposure status (quit smoking vs. continue smoking), *Death(19)* and *Dementia(20)* represent indicators of death by 19 years of follow-up and dementia by 20 years of follow-up, respectively. *C* represents possible shared causes of dementia and death (such as cardiovascular comorbidities). The key relations are: 1) smoking may independently affect both the risk of dementia and death over time through different mechanisms; 2) dying over the first 19 years of follow-up (without prior onset of dementia) determines that the indicator of dementia at 20 years of follow-up is zero (the bold arrow representing this key determinism induced by competing events); and 3) dementia and death can have shared causes.

### 3.2. Choosing a causal question: the total and controlled direct effect

We say the study we have conceptualized is “ideal” because, in the case of a randomized trial with no loss to follow-up and perfect adherence, we can identify the exposure effect on the outcome of interest through all possible pathways: the *total effect*. In our example, the following is a question about a total effect: *What would the difference in dementia risk by 20-year follow-up be had all individuals in the study population quit smoking versus, instead, had all individuals continued smoking?* This dementia risk is an example of a “cause-specific cumulative incidence” or “crude risk”[3,17].

Unfortunately, the total effect captures all pathways by which exposure affects dementia, including those mediated by death. In the causal diagram in Figure 1 this includes both the *direct* effect on dementia (Smoking→Dementia20) and indirect of smoking via mortality (Smoking→Death19→Dementia20). This indirect effect is necessarily “protective” since participants who die due to smoking at an earlier time point are “protected” from developing dementia. This “pathological mediation” structure gives the total effect a potentially problematic interpretation, since smoking cessation may increase the risk of dementia but primarily or solely because it delays death. Thus, the total effect may not answer a desirable causal question in these settings, especially when there is an arrow between the exposure and competing event. Empirical support for this arrow and, in turn, for quantifying the concerns about “pathological mediation” of the total effect, can be obtained by also estimating the effect of smoking cessation on all-cause mortality.

Instead of a total effect, a direct effect of smoking on the risk of dementia (that does not also capture the pathways mediated by death) may be of interest. There are multiple ways to define a direct effect[20–22]. Here we will consider one definition that has been historically considered and may lead to familiar statistical methods as will be described in the next section: the *controlled direct effect*. In our example, this question is phrased as: *What would the difference in dementia risk by 20-year follow-up be had all individuals in the study population quit smoking and not died throughout the study period versus, instead, had all individuals continued smoking and not died throughout the study period*? This dementia risk (under elimination of death) is an example of a “net risk” or “marginal cumulative incidence”[3,17]. This effect only captures the direct effect of smoking on dementia because it refers to a hypothetical setting in which somehow death could be eliminated. As we discuss in the next section, while our idealized trial allows us to identify the total effect by design, it does not guarantee this for the controlled direct effect.

The risk differences above both quantify causal effects because they both refer to a comparison of outcome distributions under different interventions but in the same individuals. In contrast, while cause-specific hazard ratios are the basis of the majority of analyses in dementia studies, these generally do not quantify causal effects, even under the conditions of an ideal trial. Unlike risks, hazards are defined conditional on not yet having had the outcome or competing event. This conditioning means that hazard contrasts do not compare outcomes under different exposures in the same individuals when exposure affects these events[13,23,24]. Therefore, in dementia studies, cause-specific hazard ratios will not generally have a causal interpretation when exposure affects *either* dementia *or* death (directly or indirectly). For this reason, we will focus on risks even though many studies in our literature review reported hazard ratios.

In sum, there is no single way to define “the” causal effect on dementia when deaths occur. Choosing either of these research questions should be done in a case-by-case basis, and we will address this, as well as introduce other alternative questions, in the discussion section. Presenting information on the relation between exposure and mortality can complement both questions.

### 3.3. Identifying the total versus controlled direct effect in a real-world study

In this section we consider assumptions that help us connect our causal quantity of interest to observable data (i.e., identification). Consider again Figure 1: because exposure was randomized, there are no non-causal paths connecting *Smoking* and *Dementia**20*[19,25]. This is consistent with the assumption of “no confounding”, allowing identification of the total effect. In contrast, to identify the *controlled direct effect* of smoking cessation on the risk of dementia, we need to make additional assumptions; that is, “no confounding” ensured by randomization of the exposure is not sufficient.

In Figure 1, we observe the non-causal path between death and dementia through their shared cause C, Dementia20←C→Death19. Thus, even in our ideal trial, we need to measure C to identify this effect. The reason we need to measure and adjust for C when interest is in the controlled direct effect is because death is a form of *censoring* for this question[13]. Censoring is a type of *missingness* in the outcome of interest. Therefore, what constitutes censoring depends on the question of interest. A, as such the controlled direct effect is a question about dementia outcomes in hypothetical settings where death is eliminated. When an individual dies prior to dementia onset, dementia onset “under elimination of death” is missing for that individual. While many researchers equate “death” with “censoring”, these terms are not synonymous: death is *only* a type of censoring (leading to missingness of the dementia outcome) when the question of interest is about outcomes “under elimination of death”.

In turn, measuring and including the shared cause C in Figure 1 of dementia and death is consistent with an assumption often referred to as *conditional independent censoring* (here, conditional on C) [3,10,11,13,17,18,26]. This assumption becomes more plausible in most studies if time-varying shared causes are included rather than only baseline covariates. Assuming that there are no shared causes between death and dementia (i.e., assuming the absence of the dotted arrows from C to Death19 and Dementia20 in Figure 1) coincides with the assumption of *unconditional independent censoring*. This is implausible for nearly all dementia research since both events are related to the aging process and consequences of it. Thus, even in our ideal trial, given interest in the controlled direct effect, we must measure a rich set of covariates related to both dementia and mortality risk at baseline and repeatedly throughout follow-up.

On a separate note, loss to follow-up is a form of censoring (missingness) whether interest is in total or direct effects. Though loss to follow-up can in principle be prevented by design, in trials or other studies of dementia, mechanisms of loss to follow-up might be related to impaired cognition and dementia[27], and as such, shared causes of loss to follow-up and dementia should be measured as well for similar reasons as for measuring C[28,29]. Further details on censoring and graphical identification of both effects, including scenarios with loss to follow-up, can be found in Young et al.[13].

### 3.4. Statistical methods to estimate the total effect or the controlled direct effect

Choosing an appropriate statistical method depends jointly on the choice of causal effect and the identifying assumptions we make. In an ideal trial, the *total effect* can be trivially estimated by simply comparing two proportions: the proportion diagnosed with dementia at 20-year follow-up in the “quit smoking” arm versus the proportion diagnosed with dementia at 20-year follow-up in the “do not quit smoking” arm. In both proportions, individuals who die before developing dementia will contribute to the denominator but never to the numerator. Likewise, these quantities can be estimated with the Aalen-Johansen estimator[13,17,18], which extends to settings with loss to follow-up under the assumption of unconditional independent censoring by loss to follow-up.

In contrast, the controlled direct effect requires covariate adjustment on the shared causes of death and dementia, even in an ideal trial. For example, the controlled direct effect could be estimated by comparing the risk estimates from the complement of a weighted version of the Kaplan-Meier estimator[13], where weights represent the inverse probability of censoring by death conditional on covariates[13,26,30–32]. These covariates should be those assumed to ensure the conditional independent censoring assumption for this form of censoring (e.g., the covariates C in Figure 1).

We note that the historic survival-analysis terminology classifies this structure as “semi-competing events” since death is a competing event for dementia but not the other way around. Therefore, we can estimate the risk of all-cause mortality using standard methods like the Kaplan-Meier estimator. In all cases, straightforward extensions exist for covariate adjustment (e.g., by inverse probability weighting) to address loss to follow-up as well as for confounding [13,29,33–35]. As such, these methods can be used in realistic trials and in observational studies, though our consideration of estimation in an ideal trial helps illuminate the unique feature of competing events.

## 4. APPLICATION TO THE ROTTERDAM STUDY

We now illustrate an application of inverse probability weighted methods to estimate total and controlled direct effects of smoking cessation on dementia using data collected from the Rotterdam Study. Briefly, the Rotterdam Study is a population-based prospective cohort study among persons living in the Ommoord district in Rotterdam, the Netherlands[14]. Participants older than 55 years underwent questionnaire administration, physical and clinical examinations, and blood sample collection at baseline (1990-1993) and at follow-up visits from 1993-1995, 1997-1999, 2002-2005, and 2009-2011. Smoking habits were assessed through questionnaires at study entry via self-reported status as “former”, “current smoker’’ or “never smoker”. Dementia diagnosis was collected by screening at each visit and through continuous automated linkage with digitized medical records and regional registries (**Supplemental Data 2**). Death certificates were obtained via municipal population registries and through general practitioners’ and hospitals’ databases, with complete linkage. This ascertainment method means the Rotterdam Study has functionally no loss to follow-up with respect to dementia diagnosis and death.

Individuals ages 55-70 years who reported smoking (current or former) and who did not have history of dementia at cohort entry were eligible for the current study. To emulate the ideal trial described in Section 3, we contrast former and current smokers. This contrast has some limitations when viewed as an emulation of the ideal trial described in Section 2. For example, there may be unmeasured confounding, selection bias due to misaligning “time zero”[36,37], and measurement error[19]. A thorough consideration of these other issues would be critical for evaluating the effect size of smoking cessation on dementia risk, but go beyond the scope of this exercise. For didactic purposes, we therefore focus our discussion on how the competing event of death affects the interpretation, analytic decisions, and assumptions evoked.

### 4.1. Methods

To estimate the total effect of smoking cessation on dementia risk, we compared a weighted Aalen-Johannsen estimator in current versus former smokers with weights defined as a product of inverse probability of treatment weights[19] to adjust for the following possible confounders: age at study entry, sex, APOE ε4 status, and educational attainment. Briefly, the weight for a current smoker is defined as the inverse of the probability of smoking conditional on confounders, and for a former smoker as the inverse of quitting conditional on covariates. We estimated these probabilities with a logistic regression model for smoking as a function of the above-mentioned covariates. Specific modeling specifications and weights assessment are presented as **Supplemental Data 3**.

To estimate the controlled direct effect, we compared the complement of a weighted Kaplan-Meier survival estimator in smokers versus former smokers with time indexed in years. The weights in this case are time-varying by follow-up year, defined as a product of the time-fixed weights above and a year-specific inverse probability of censoring by death weights. For an individual still alive in year t, the time t censoring weight is the product of the inverse probability of surviving in each year prior to t, conditional on measured shared causes of death and dementia (that is, variables such as C in Figure 1). For an individual who has died by time t, the year t censoring weight is zero. We estimated survival probabilities using a logistic regression model for death as a function of baseline and time-varying covariates. Baseline covariates included smoking status, age at study entry, sex, APOE ε4 status, and educational attainment; time-varying covariates included systolic blood pressure, BMI, and prevalent and incident comorbid heart disease, cancer, stroke, and diabetes.

We also estimated the total effect of smoking on mortality risk applying the Kaplan-Meier estimator with the weights calculated for handling confounding. We therefore are assuming the same set of measured confounders used to estimate the total effect of smoking on dementia risk are sufficient for addressing confounding of the total effect of smoking on mortality risk.

Estimates of the total and controlled direct effect at 20 years of follow-up are presented as risk differences (RD) and risk ratios (RR). All 95% confidence intervals were calculated using percentile-based bootstrapping based on 500 bootstrap samples. All analysis were performed using R, code is available in [https://github.com/palolili23/competing\_risks_dementia](https://github.com/palolili23/competing_risks_dementia).

#### Standard protocol approvals, registrations, and patient consents

The Rotterdam Study has been approved by the Medical Ethics Committee of the Erasmus MC (registration number MEC 02.1015) and by the Dutch Ministry of Health, Welfare and Sport (Population Screening Act WBO, license number 1071272-159521-PG). The Rotterdam Study has been entered into the Netherlands National Trial Register (NTR; [www.trialregister.nl](http://www.trialregister.nl)) and into the WHO International Clinical Trials Registry Platform (ICTRP; [www.who.int/ictrp/network/primary/en/](http://www.who.int/ictrp/network/primary/en/)) under shared catalogue number NTR6831. All participants provided written informed consent to participate in the study and to have their information obtained from treating physicians.

#### Data availability

Rotterdam Study can be obtained via requests directed toward the management team of the Rotterdam Study (secretariat.epi{at}erasmusmc.nl), which has a protocol for approving data requests. Because of restrictions based on privacy regulations and informed consent of the participants, data cannot be made freely available in a public repository.

### 4.2. Results

Out of 10994 individuals included in the Rotterdam Study, 4179 individuals met eligibility criteria (55-70 years who reported smoking history at baseline and who did not have history of dementia at study entry). The mean age was 62 years and 1870 (44.7%) were women (**Table 2**). In total, 368 (8.8%) developed dementia and 1318 (31.5%) died over 20 years of follow-up. The median time to dementia was 15.5 years and the median time to death was 13.1 years. Overall, from 1572 who were current smokers at baseline, 117 (7.4%) developed dementia and 630 (40.1%) died; of the 2607 former smokers, 251 (9.6%) developed dementia and 688 (26.4%) died.

View this table:
[Table 2.](http://medrxiv.org/content/early/2021/06/03/2021.06.01.21258142/T2)

Table 2. Descriptive characteristics of former and current smokers in the Rotterdam Study

We estimated a total effect of smoking cessation (compared to continued smoking) on 20-year dementia risk of 2.1 (95%CI: −0.1, 4.2) percentage points (**Table 3; Figure 2**). This slightly harmful effect estimate of quitting smoking (with wide confidence intervals) includes all causal pathways, including that through death. The presence of this pathway is evidenced in the estimated total effect of quitting smoking on 20-year mortality risk: −17.4 (95%CI: −20.5, −14.5) percentage points. Alternatively, we estimated a controlled direct effect of quitting smoking on 20-year dementia risk had death been fully prevented during the study period as −1.9 (−5.1, 1.4) percentage points.

View this table:
[Table 3.](http://medrxiv.org/content/early/2021/06/03/2021.06.01.21258142/T3)

Table 3. 
**Total effect and controlled direct effect of smoking cessation (compared to continued smoking) on the risk of dementia, and the total effect on risk of mortality, at 20 years of follow-up**

![Figure 2.](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2021/06/03/2021.06.01.21258142/F2.medium.gif)

[Figure 2.](http://medrxiv.org/content/early/2021/06/03/2021.06.01.21258142/F2)

Figure 2. Risk of dementia and death by smoking cessation status over 20 years of follow-up

## 5. DISCUSSION

In longitudinal (randomized and observational) studies where dementia is the main outcome and deaths occur during follow-up, understanding the different causal questions and making explicit the assumptions required for answering them can lead to better interpretation of results, and a deeper understanding about plausible sources and magnitudes of bias. We considered two causal questions that can be addressed with common statistical methods, beginning with the total effect which captures all causal pathways including those mediated by death. This makes the total effect difficult to interpret as illustrated in our example: the small estimated harmful total effect of smoking cessation on dementia risk necessarily captures some “protection” against dementia by death. *This is not a “bias” but rather a problematic feature of the total effect as the research question*.

The controlled direct effect does not have this problematic feature, and in our example, we estimated that smoking cessation reduced the risk of dementia if death was eliminated. However, residual bias from failing to adjust for a sufficient set of shared causes of death and dementia can remain. In addition, the independent censoring assumption cannot be verified empirically, though bounding can be used to assess extreme scenarios of dependency[3,30,38–40]. Furthermore, the controlled direct effect is not an ideal measure of direct effect because it refers to a fictional scenario where everyone remains alive and therefore it generally will not provide useful information for decision-making.

Since both of these questions can seem unsatisfactory, we note that there are yet further alternative questions that can potentially be posed. For example, the so-called “survivor average treatment effect” quantifies the effect of a treatment on a subgroup of individuals who would not die during the study period under either level of treatment. Although this option has been widely considered in the methodology literature[22,41], the utility of this question is questionable in public health and clinical dementia research as this subgroup is not observable and may not even exist. One can also consider a combined outcome endpoint, such as the effect on dementia *or* death, but this too is unsatisfactory in many cases: in our example, the effect of smoking on risk of death would overwhelm. A novel alternative, the so-called separable effects avoid evoking consideration of implausible scenarios that “eliminate death” or unobservable subpopulations[21]. The separable effects are effects of modified treatments that are assumed to operate like the study treatment but with particular mechanisms removed. These may be a physical decomposition of the exposure assumed to operate on dementia and death through separate pathways or completely different treatments that operate like the study treatment. While identifying these effects similarly rely on strong assumptions, including necessitating measuring a rich set of shared causes of dementia and death, estimates of these effects can be confirmed with future studies on these modified treatments. In this work we focused on the two questions because of their relation to commonly used estimators in dementia research, but we suspect in the future that separable effects will become a more explicit question of interest with the increasing development of useable tools for aiding reasoning.

Too often, we start by defining the statistical method that appears to fit the complexity of data, and we let this decision define the target parameter and thus implicitly determine the research question to be answered. In a setting with competing events, there is no “one size fits all”. Through our discussion and application, we hope that readers will see an opportunity to re-conceptualize how to ask clearer questions in the context of competing events and let the question define the methods that best suit the research aim.

## Supporting information

Supplemental material [[supplements/258142_file02.pdf]](pending:yes)

## Data Availability

Rotterdam Study can be obtained via requests directed toward the management team of the Rotterdam Study (secretariat.epi{at}erasmusmc.nl), which has a protocol for approving data requests. Because of restrictions based on privacy regulations and informed consent of the participants, data cannot be made freely available in a public repository.

## Footnotes

*   **Author Email Addresses:** L. Paloma Rojas-Saunero: l.rojassaunero{at}erasmusmc.nl
    
    Jessica G. Young: jyoung{at}hsph.harvard.edu
    
    Vanessa Didelez: didelez{at}leibniz-bips.de
    
    M. Arfan Ikram: m.a.ikram{at}erasmusmc.nl
    
    Sonja A. Swanson: s.swanson{at}erasmusmc.nl

*   **Financial Disclosure:** L. Paloma Rojas-Saunero: Reports no disclosure
    
    Jessica G. Young: Reports no disclosure
    
    Vanessa Didelez: Reports no disclosure
    
    M. Arfan Ikram: Reports no disclosure
    
    Sonja A. Swanson: Reports no disclosure

*   **Study funded:** This study was partly funded by ZonMW Memorabel (projectnr 73305095005) and Alzheimer Nederland through the Netherlands Consortium of Dementia Cohorts (NCDC) in the context of Deltaplan Dementie.

*   Received June 1, 2021.
*   Revision received June 1, 2021.
*   Accepted June 3, 2021.


*   © 2021, Posted by Cold Spring Harbor Laboratory

This pre-print is available under a Creative Commons License (Attribution-NonCommercial-NoDerivs 4.0 International), CC BY-NC-ND 4.0, as described at [http://creativecommons.org/licenses/by-nc-nd/4.0/](http://creativecommons.org/licenses/by-nc-nd/4.0/)

## References

1.  [1].Livingston G, Huntley J, Sommerlad A, Ames D, Ballard C, Banerjee S, Brayne C, Burns A, Cohen-Mansfield J, Cooper C, Costafreda SG, Dias A, Fox N, Gitlin LN, Howard R, Kales HC, Kivimäki M, Larson EB, Ogunniyi A, Orgeta V, Ritchie K, Rockwood K, Sampson EL, Samus Q, Schneider LS, Selbæk G, Teri L, Mukadam N (2020) Dementia prevention, intervention, and care: 2020 report of the Lancet Commission. Lancet 396, 413–446.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/S0140-6736(20)30367-6&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=32738937&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F06%2F03%2F2021.06.01.21258142.atom) 

2.  [2].Weuve J, Proust-Lima C, Power MC, Gross AL, Hofer SM, Thiébaut R, Chêne G, Glymour MM, Dufouil C (2015) Guidelines for reporting methodological challenges and evaluating potential bias in dementia research. Alzheimer’s Dement 11, 1098–1109.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.jalz.2015.06.1885&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=26397878&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F06%2F03%2F2021.06.01.21258142.atom) 

3.  [3].Tsiatis AA (1975) A nonidentifiability aspect of the problem of competing risks. Proc Natl Acad Sci 72, 20–22.
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NDoicG5hcyI7czo1OiJyZXNpZCI7czo3OiI3Mi8xLzIwIjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjEvMDYvMDMvMjAyMS4wNi4wMS4yMTI1ODE0Mi5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 

4.  [4].Hernán MA, Alonso A, Logroscino G (2008) Cigarette smoking and dementia: Potential selection bias in the elderly. Epidemiology 19, 448–450.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1097/EDE.0b013e31816bbe14&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=18414087&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F06%2F03%2F2021.06.01.21258142.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000255314400016&link_type=ISI) 

5.  [5].Abner EL, Nelson PT, Jicha GA, Cooper GE, Fardo DW, Schmitt FA, Kryscio RJ (2019) Tobacco Smoking and Dementia in a Kentucky Cohort: A Competing Risk Analysis. J Alzheimer’s Dis 68, 625–633.
    
    
6.  [6].Driver JA (2014) Inverse association between cancer and neurodegenerative disease: review of the epidemiologic and biological evidence. Biogerontology 15, 547–557.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1007/s10522-014-9523-2&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=25113739&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F06%2F03%2F2021.06.01.21258142.atom) 

7.  [7].Ospina-Romero M, Glymour MM, Hayes-Larson E, Mayeda ER, Graff RE, Brenowitz WD, Ackley SF, Witte JS, Kobayashi LC (2020) Association Between Alzheimer Disease and Cancer With Evaluation of Study Biases: A Systematic Review and Meta-analysis. JAMA Netw open 3, e2025515.
    
    
8.  [8].Hayes-Larson E, Ackley SF, Zimmerman SC, Ospina-Romero M, Glymour MM, Graff RE, Witte JS, Kobayashi LC, Mayeda ER (2020) The competing risk of death and selective survival cannot fully explain the inverse cancer-dementia association. Alzheimer’s Dement 1–8.
    
    
9.  [9].Lau B, Cole SR, Gange SJ (2009) Competing risk regression models for epidemiologic data. Am J Epidemiol 170, 244–256.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/aje/kwp107&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=19494242&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F06%2F03%2F2021.06.01.21258142.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000267887900014&link_type=ISI) 

10. [10].Austin PC, Lee DS, Fine JP (2016) Introduction to the Analysis of Survival Data in the Presence of Competing Risks. Circulation 133, 601–609.
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6MTQ6ImNpcmN1bGF0aW9uYWhhIjtzOjU6InJlc2lkIjtzOjk6IjEzMy82LzYwMSI7czo0OiJhdG9tIjtzOjUwOiIvbWVkcnhpdi9lYXJseS8yMDIxLzA2LzAzLzIwMjEuMDYuMDEuMjEyNTgxNDIuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9) 

11. [11].Austin PC, Fine JP (2017) Accounting for competing risks in randomized controlled trials: a review and recommendations for improvement. Stat Med 36, 1203–1209.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/sim.7215&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=28102550&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F06%2F03%2F2021.06.01.21258142.atom) 

12. [12].Koller MT, Raatz H, Steyerberg EW, Wolbers M (2012) Competing risks and the clinical community: Irrelevance or ignorance? Stat Med 31, 1089–1097.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/sim.4384&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=21953401&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F06%2F03%2F2021.06.01.21258142.atom) 

13. [13].Young JG, Stensrud MJ, Tchetgen EJT, Hernán MA (2020) A causal framework for classical statistical estimands in failure time settings with competing events. Stat Med 1, 1–38.
    
    
14. [14].Ikram MA, Brusselle G, Ghanbari M, Goedegebure A, Ikram MK, Kavousi M, Kieboom BCT, Klaver CCW, de Knegt RJ, Luik AI, Nijsten TEC, Peeters RP,  van Rooij Fja, Stricker BH, Uitterlinden AG, Vernooij MW, Voortman T (2020) Objectives, design and main findings until 2020 from the Rotterdam Study, Springer Netherlands.
    
    
15. [15].Andersen PK, Geskus RB, De witte T, Putter H (2012) Competing risks in epidemiology: Possibilities and pitfalls. Int J Epidemiol.
    
    
16. [16].Andersen PK, Keiding N (2012) Interpretability and importance of functionals in competing risks and multistate models. Stat Med 31, 1074–1088.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/sim.4385&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=22081496&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F06%2F03%2F2021.06.01.21258142.atom) 

17. [17].Geskus RB (2016) Data analysis with competing risks and intermediate states, Chapman & Hall/CRC Biostatics Series.
    
    
18. [18].Geskus RB (2020) Competing risks: Aims and methods, Elsevier B.V.
    
    
19. [19].Hernán MA, Robins JM (2019) Causal Inference: What If (Harvard book).
    
    
20. [20].Robins JM, Greenland S (1992) Identifiability and exchangeability for direct and indirect effects. Epidemiology 3, 143–155.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1097/00001648-199203000-00013&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=1576220&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F06%2F03%2F2021.06.01.21258142.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=A1992HH93800013&link_type=ISI) 

21. [21].Stensrud MJ, Young JG, Didelez V, Robins JM, Hernán MA (2020) Separable Effects for Causal Inference in the Presence of Competing Events. J Am Stat Assoc , 1–23.
    
    
22. [22].Frangakis CE, Rubin DB (2002) Principal stratification in causal inference. Biometrics.
    
    
23. [23].Stensrud MJ, Aalen JM, Aalen OO, Valberg M (2019) Limitations of hazard ratios in clinical trials. Eur Heart J 40, 1378–1383.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/eurheartj/ehy77&link_type=DOI) 

24. [24].Stensrud MJ, Robins JM, Sarvet A, Tchetgen Tchetgen EJ, Young JG (2020) Conditional separable effects. arXiv 1–59.
    
    
25. [25].Pearl J (1995) Causal Diagrams for Empirical Research. Biometrika 82, 669.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/biomet/82.4.669&link_type=DOI) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=A1995TT22800001&link_type=ISI) 

26. [26].Willems SJW, Schat A, van Noorden MS, Fiocco M (2018) Correcting for dependent censoring in routine outcome monitoring data by applying the inverse probability censoring weighted estimator. Stat Methods Med Res 27, 323–335.
    
    
27. [27].Vega S, Benito-León J, Bermejo-Pareja F, Medrano MJ, Vega-Valderrama LM, Rodríguez C, Louis ED (2010) Several factors influenced attrition in a population-based elderly cohort: Neurological disorders in Central Spain Study. J Clin Epidemiol 63, 215–222.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.jclinepi.2009.03.005&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=19473811&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F06%2F03%2F2021.06.01.21258142.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000274062400018&link_type=ISI) 

28. [28].Hernán MA, Hernández-Díaz S, Robins JM (2004) A structural approach to selection bias. Epidemiology 15, 615–625.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1097/01.ede.0000135174.63482.43&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=15308962&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F06%2F03%2F2021.06.01.21258142.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000223382600020&link_type=ISI) 

29. [29].Howe CJ, Cole SR, Lau B, Napravnik S, Eron JJ (2016) Selection Bias Due to Loss to Follow Up in Cohort Studies. Epidemiology 27, 91–97.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1097/EDE.0000000000000409&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F06%2F03%2F2021.06.01.21258142.atom) 

30. [30].van Geloven N, Geskus RB, Mol BW, Zwinderman AH (2014) Correcting for the dependent competing risk of treatment using inverse probability of censoring weighting and copulas in the estimation of natural conception chances. Stat Med 33, 4671–4680.
    
    
31. [31].Satten GA, Datta S (2001) The Kaplan-Meier estimator as an inverse-probability-of-censoring weighted average. Am Stat 55, 207–210.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1198/000313001317098185&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=28845048&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F06%2F03%2F2021.06.01.21258142.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000170299300007&link_type=ISI) 

32. [32].Robins JM, Finkelstein DM (2000) Correcting for noncompliance and dependent censoring in an AIDS clinical trial with inverse probability of censoring weighted (IPCW) log-rank tests. Biometrics 56, 779–788.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1111/j.0006-341X.2000.00779.x&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=10985216&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F06%2F03%2F2021.06.01.21258142.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000089069500018&link_type=ISI) 

33. [33].Cole SR, Lau B, Eron JJ, Brookhart MA, Kitahata MM, Martin JN, Mathews WC, Mugavero MJ (2015) Estimation of the standardized risk difference and ratio in a competing risks framework: Application to injection drug use and progression to AIDS after initiation of antiretroviral therapy. Am J Epidemiol 181, 238–245.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/aje/kwu122&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=24966220&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F06%2F03%2F2021.06.01.21258142.atom) 

34. [34].Xu S, Shetterly S, Powers D, Raebel MA, Tsai TT, Ho PM, Magid D (2012) Extension of Kaplan-Meier methods in observational studies with time-varying treatment. Value Heal 15, 167–174.
    
    
35. [35].Howe CJ, Cole SR, Chmiel JS, Muñoz A (2011) Limitation of inverse probability-of-censoring weights in estimating survival in the presence of strong selection bias. Am J Epidemiol 173, 569–577.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/aje/kwq385&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=21289029&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F06%2F03%2F2021.06.01.21258142.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000287746200012&link_type=ISI) 

36. [36].Hernán MA, Sauer BC, Hernández-Díaz S, Platt R, Shrier I (2016) Specifying a target trial prevents immortal time bias and other self-inflicted injuries in observational analyses. J Clin Epidemiol 79, 70–75.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=doi:10.1016/j.jclinepi.2016.04.014&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=27237061&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F06%2F03%2F2021.06.01.21258142.atom) 

37. [37].Howe CJ, Robinson WR (2018) Survival-related selection bias in studies of racial health disparities: the importance of the target population and study design. EpidemiologyEpidemiology 29, 524–524.
    
    
38. [38].Dignam JJ, Weissfeld LA, Anderson SJ (1998) Methods for bounding the marginal survival distribution. 14, 1985–1998.
    
    
39. [39].Peterson A V. (1976) Bounds for a joint distribution function with fixed sub distribution functions: Application to competing risks. Proc Natl Acad Sci U S A 73, 11–13.
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NDoicG5hcyI7czo1OiJyZXNpZCI7czo3OiI3My8xLzExIjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjEvMDYvMDMvMjAyMS4wNi4wMS4yMTI1ODE0Mi5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 

40. [40].Van Geloven N, Le Cessie S, Dekker FW, Putter H (2017) Transplant as a competing risk in the analysis of dialysis patients. Nephrol Dial Transplant 32, ii53–ii59.
    
    
41. [41].Tchetgen Tchetgen EJ (2014) Identification and estimation of survivor average causal effects. Stat Med 33, 3601–3628.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/sim.6181&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=24889022&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F06%2F03%2F2021.06.01.21258142.atom)