Enhanced Classification Performance using Deep Learning Based Segmentation for Pulmonary Embolism Detection in CT Angiography
=============================================================================================================================

* Ali Teymur Kahraman
* Tomas Fröding
* Dimitris Toumpanakis
* Christian Jamtheim Gustafsson
* Tobias Sjöblom

## Abstract

**Purpose** To develop a deep learning-based algorithm that automatically and accurately classifies patients as either having pulmonary emboli or not in CT pulmonary angiography (CTPA) examinations.

**Materials and Methods** For model development, 700 CTPA examinations from 652 patients performed at a single institution were used, of which 149 examinations contained 1497 PE traced by radiologists. The nnU-Net deep learning-based segmentation framework was trained using 5-fold cross-validation. To enhance classification, we applied logical rules based on PE volume and probability thresholds. External model evaluation was performed in 770 and 34 CTPAs from two independent datasets.

**Results** A total of 1483 CTPA examinations were evaluated. In internal cross-validation and test set, the trained model correctly classified 123 of 128 examinations as positive for PE (sensitivity 96.1%; 95% C.I. 91-98%; *P* < .05) and 521 of 551 as negative (specificity 94.6%; 95% C.I. 92-96%; *P* < .05). In the first external test dataset, the trained model correctly classified 31 of 32 examinations as positive (sensitivity 96.9%; 95% C.I. 84-99%; *P* < .05) and 2 of 2 as negative (specificity 100%; 95% C.I. 34-100%; *P* < .05). In the second external test dataset, the trained model correctly classified 379 of 385 examinations as positive (sensitivity 98.4%; 95% C.I. 97-99%; *P* < .05) and 346 of 385 as negative (specificity 89.9%; 95% C.I. 86-93%; *P* < .05).

**Conclusion** Our automatic pipeline achieved beyond state-of-the-art diagnostic performance of PE in CTPA using nnU-Net for segmentation and volume- and probability-based post-processing for classification.

*   An nnU-Net segmentation framework was applied to patient-level classification in CTPA examinations.

*   The proposed algorithm can enable prioritization of patients with PE for rapid review in emergency radiology.

*   The proposed algorithm showed outstanding performance on both internal and two publicly available external testing datasets (AUC, 98.3%; n=1355).

Keywords
*   Computed tomography pulmonary angiography
*   pulmonary embolism
*   nnU-Net
*   deep learning

## 1. Introduction

Pulmonary embolism (PE) is a potentially life-threatening occlusion of pulmonary arteries caused by blood clotting and is associated with significant morbidity and mortality [1]. PE affects >400,000 patients in Europe [2] and between 300,000 and 600,000 patients in the US [3] causing an estimated >100,000 deaths annually [4]. PE is a significant cause of preventable hospital deaths in the world [5], demanding rapid clinical management [6]. The computed tomography pulmonary angiography (CTPA) imaging modality is the current gold standard for PE diagnosis [7]. The CTPA is a CT scan performed after intravenous injection of iodinated contrast medium. As the emboli do not absorb contrast medium they can be recognized as dark filling defects in the pulmonary arteries [8]. Thoroughly examining every CT slice and identification of PE in CTPA is time-consuming for the radiologist and requires considerable training and attentiveness, and the inter-observer variability is high for small, sub-segmental emboli [9]. An automated solution for detection of PE in CTPA has potential to assist the radiologist by reducing reading times and the risk of emboli being overlooked.

Developing a general solution for automatic detection of PE has proven challenging because of anatomical variation, motion and breathing artifacts, inter-patient variability in contrast medium concentration, and concurrent pathologies. Over the past two decades, automated PE detection has been attempted using deterministic models, such as image processing and analysis techniques [10, 11], or probabilistic/statistical models such as machine learning [12–14] and deep convolutional neural networks [15, 16]. Yet, the accuracies of these solutions have been insufficient for clinical use due to low sensitivity [10, 13, 15] and high false positive rate [10, 11, 13, 14], potentially caused by training on small datasets [10, 11, 13–15]. The state-of-the-art is a residual neural network (ResNet) classification architecture on 1465 CTPA examinations with sensitivity of 92.7% and specificity of 95.5% [17]. To mitigate the limited dataset size challenges hampering the training of AI models for PE classification, an alternative approach is to employ a fine-tuned U-Net-like semantic segmentation model. The U-Net model has demonstrated its effectiveness in several medical image segmentation tasks. [18]. The no-new U-Net framework (nnU-Net) successfully addresses challenges of finding the best U-net model and fine-tuning its hyperparameters [19].

Here, we sought to take advantage of the segmentation performance of the nnU-Net framework in an algorithm that automatically classifies routine patient CTPA examinations as having PE or not with higher sensitivity and specificity than the current state-of-the-art performance.

## 2. Materials and Methods

### 2.1. Internal dataset

The single-institution (Nyköping Hospital, Sweden) retrospective dataset consisted of 700 non-ECG-gated CTPA examinations performed between 2014 and 2018 (n=149 positive for PE); 383 CTPA examinations from 353 women (age range 16-97 years; median age 73 years; interquartile range 20 years) and 317 from 299 men (age range 19-100 years; median age 71 years; interquartile range 15 years) [20]. The CTPAs were clinical routine examinations exported in chronological order from a history list in the institution’s Picture Archiving and Communication System (PACS). The only disruption in the order were a few inserted time gaps, which allowed for a larger number of CT scanners to be included as new CT scanners were installed during the time period. The CTPAs were acquired on five different CT scanners (Somatom Definition Flash, Siemens Healthcare, Erlangen, Germany; LightSpeed VCT, General Electric (GE) Healthcare Systems, Waukesha, WI, USA; Brilliance 64, Ingenuity Core and Ingenuity CT, Philips Medical Systems, Eindhoven, the Netherlands). As contrast medium, Omnipaque 350 mg I/ml (GE Healthcare Systems, Waukesha, WI, USA) was used. Collection and analysis of CTPA examinations was approved by the Swedish Ethical Review Authority (EPN Uppsala Dnr 2015/023 and 2015/023/1). The CTPA data was anonymized and exported in Digital Imaging and Communications in Medicine (DICOM) format, using a hardware solution (Dicom2USB). The CTPAs were reviewed and annotated using the open-source software Medical Imaging Interaction Toolkit (MITK) [21] by two radiologists (DT and TF) with 6 and 16 years of experience. Each CTPA was annotated by either DT or TF. All blood clots in 149 CTPA examinations positive for PE were manually segmented in axial view, image by image, resulting in 36,471 segmentations.

### 2.2. External datasets

Two publicly available datasets were used for external evaluation; the Ferdowsi University of Mashhad’s PE dataset (FUMPE) [22] and the RSNA-STR Pulmonary Embolism CT (RSPECT) Dataset [23]. The FUMPE dataset contains 35 CTPAs with voxel-level PE annotation by radiologists. Of the 35 CTPAs, two were negative for PE, 32 were positive and one was excluded for lack of ground truth annotation (Supp. materials). The RSPECT dataset consisted of a training (n=7279) and a test (n=2167) set and image-level annotations were provided for the training set by several subspecialist thoracic radiologists. 385 CTPAs were selected from the RSPECT training dataset out of a total of 398 that had central PE, and 13 examinations were excluded due to errors during the DICOM to NIfTI format conversion process. Of the 4877 CTPAs without PE or other true filling defect, 385 examinations were randomly selected. An overview of our internal and external datasets is shown in Figure 1.

![Figure 1.](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2024/04/11/2023.04.21.23288861/F1.medium.gif)

[Figure 1.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/F1)

Figure 1. Internal and external datasets for training and evaluation of a segmentation-based classification model for pulmonary embolism detection.
Positive examinations refer to the patient having pulmonary embolism (PE) and negative examinations are patients without PE. True filling defect refers to tumor invasion, stump thrombus, catheter, embolized wire, or other obvious non-PE condition as defined in the RSPECT dataset.

### 2.3. nnU-Net Model training and validation

For model training, the nnU-net DL open-source framework, implemented in a Docker container (Docker Inc., Palo Alto, California, USA), was used [19]. The nnU-Net is a semantic segmentation method, and when provided with a training dataset, it automatically configures an end-to-end experimental pipeline (Supp. materials). The PE positive examinations from the internal dataset (n=149) were randomly assigned to training (80%, n=119) and validation sets (20%, n=30) using 5-fold cross-validation during model training (Supp. materials).

### 2.4. Automated Classification Algorithm

After model training and validation, the validated model was embedded in a classification algorithm consisting of three steps, pre-processing, image segmentation inference, and post-processing (Figure 2). Notably, nnU-Net necessitates the utilization of the Neuroimaging Informatics Technology Initiative (NIfTI) file format for model inference. Thus, in the pre-processing step, all DICOM data were converted to the NIfTI format using an in-house Python script. Next, the nnU-Net model inference was performed. Since the nnU-Net model is a volumetric segmentation model, its inference yields a segmentation mask that predicts pulmonary emboli. The segmentation output was transformed into patient-level classification during the post-processing step by applying a threshold to the predicted segmentations, which was based on the total predicted emboli volume. Consequently, a patient-level classification distinguishing between PE and non-PE cases was achieved.

![Figure 2.](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2024/04/11/2023.04.21.23288861/F2.medium.gif)

[Figure 2.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/F2)

Figure 2. Training and evaluation of a segmentation-based classification model for pulmonary embolism detection.
700 CTPA examinations were collected and annotated by either of two radiologists. Of these, all 149 PE positive examinations were used for training and the PE negative were kept for later evaluation (a). The 3D U-Net deep learning model, which is generated by the nnU-Net framework, was trained with the 149 PE-positive CTPAs using 5-fold cross-validation. The convolution layer used a 3×3×3 filter size by default, followed by an instance normalization (IN) layer and a leaky rectified Linear Unit (lRELU) layer (b). The *softmax* probabilities were obtained from the model inference for fine-tuning classification into PE or non-PE voxel classes and for calculating the predicted PE volume. By thresholding the predicted volumes and applying a set of logical rules, accurate patient-level classification for PE was achieved (c). The final model was evaluated on 804 external CTPA examinations from two publicly available datasets (d).

The softmax activation function in the final layer of the U-Net-like architecture provides a probability distribution across predicted classes. By strategically selecting different softmax probability thresholds, it was possible to generate segmentation masks with varying volumes. As such, we established rules that incorporated different softmax probability thresholds (ranging from 0.75 to 0.95 in 0.05 intervals) and volumetric thresholds (ranging from 0 mm³ to 200 mm³ in 10 mm³ intervals). This approach was instrumental in improving the model’s performance and fine-tuning the differentiation between PE and non-PE voxel classes. Considering these rules, we formulated two distinct strategies. The strategy that offered the best trade-off between sensitivity and specificity was denoted as Strategy 1, while the one delivering the highest specificity was denoted as Strategy 2 (Supp. materials).

### 2.5. Statistical Analysis

Sensitivity and specificity of our trained model for binary classification for PE/non-PE were assessed on a per-patient basis. Matthew’s correlation coefficient (MCC) was used to find the optimal balance between sensitivity and specificity. The area under the receiver operating characteristic (AUROC) curve was used to determine classification performance during model training, validation, and evaluation. Statistical analysis was performed with Microsoft Office Excel (Microsoft Corporation, Washington, USA, Office Professional Plus 2016) and statsmodels package (version 0.13.5) in Python (version 3.8.10; Python Software Foundation). A *p*-value less than .05 was defined as statistically significant and for C.I., the Wilson score interval was used.

## 3. Results

### 3.1. Model training and performance evaluation on the internal dataset

For model training, 2,439,000 voxels of 1497 PE were annotated by two radiologists in all 149 PE positive CTPAs of the internal dataset (Table 1). Acute as well as chronic PEs were annotated, and no distinction was made between them. Consequently, the model did not distinguish between the two types. An nnU-net model was trained with 5-fold cross-validation with 119 training and 30 validation CTPAs per set in 4 sets and 120 training and 29 validation CTPAs in the fifth set without data overlap between the validation sets. To assess model performance, 21 PE positive exams with small PEs (M = 22.9 mm³, SD = 11.6 mm³) with a total volume of less than 50 mm³ were excluded. The remaining 128 PE positive CTPAs and 551 PE negative CTPAs constituted the internal cross-validation and test set. Training and validation was performed on a single Nvidia RTX 2080 TI GPU card which took ∼1 week in total for all cross-validation folds. The classification performance of the trained nnU-Net model on internal and external test datasets was explored over different threshold volumes, with and without post-processing strategies. Without the post-processing strategy and by setting the threshold volume to 20 mm³, a Matthews correlation coefficient score (MCC) of 63.9% was achieved, correctly classifying 128 out of 128 positive examinations as having PE and 433 out of 551 negative examinations as non-PE. With the post-processing strategy 1 (Supp. materials) and threshold volume of 20 mm³, the best MCC (84.9%) was obtained with 123 of 128 positive examinations correctly classified as PE, and 521 of 551 negative examinations correctly classified as non-PE. Further, the model achieved an AUROC of 96.4% and 94.9% with and without post-processing respectively (Figure 3). The trained nnU-Net model thus achieved a sensitivity of 96.1% (95% C.I. 91-98%, *P* < .05) and 100% (95% C.I. 97-100%, *P* < .05), and a specificity of 94.6% (95% C.I. 92-96%, *P* < .05) and 78.6% (95% C.I. 75-82%, *P* < .05) in the internal dataset with and without the post-processing strategies, respectively (Table 2).

![Figure 3.](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2024/04/11/2023.04.21.23288861/F3.medium.gif)

[Figure 3.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/F3)

Figure 3. Classification performance of the trained nnU-Net model.
Area Under the Curves (AUC) without (a) and with (b) post-processing. Black, internal dataset (n = 679, 128 PE and 551 non-PE); Blue, the FUMPE datasets (n = 34, 32 PE and 2 non-PE); Red, the RSNA PE dataset (n = 770, 385 PE and 385 non-PE). TPR, true positive rate; FPR, false positive rate. AUC values are in percentages.

View this table:
[Table 1.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/T1)

Table 1. Ground truth annotation of 149 internal CTPAs with PE.

View this table:
[Table 2.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/T2)

Table 2. Diagnostic performance of the trained model

### 3.2. Model performance on external datasets

For external evaluation, the trained model was applied to a total of 804 CTPAs from two publicly available datasets. First, 34 PE positive CTPAs and 2 PE negative CTPAs from the FUMPE dataset were analyzed. With post-processing strategy 1, an MCC score of 80.4% was obtained with 31 of 32 positive examinations correctly classified as PE, and 2 of 2 negative examinations correctly classified as non-PE. The trained model achieved AUROC 98.5% (Figure 3) with sensitivity of 96.9% (95% C.I. 84-99%, *P* < .05) and specificity of 100% (95% C.I. 34-100%, *P* < .05) (Table 2, Supp. Table 5). Focusing on central PE, where the annotations can be assumed to be more consistent, we used 385 CTPAs annotated as having at least one central PE and 385 PE negative CTPAs from the RSPECT pulmonary embolism CT dataset for model evaluation. With the post-processing strategy 1 (Supp. Materials), an MCC of 88.6% was obtained with 379 of 385 positive examinations correctly classified as PE, and 346 of 385 negative examinations correctly classified as non-PE. The trained model achieved an AUROC of 98.6% (Figure 3) with sensitivity of 98.4% (95% C.I. 97-99%, *P* < .05) and a specificity of 89.9% (95% C.I. 86-93%, *P* < .05) (Table 2, Supp. Table 6). Without the post-processing strategy and by setting the threshold volume to 20 mm³, MCC of 100% and 73.3% were obtained with 32 (n=32) and 385 (n=385) positive examinations correctly classified as PE, and 2 (n=2) and 269 (n= 385) negative examinations correctly classified as non-PE in the first and second external datasets, respectively (Table 2, Supp. Table 2 and 3). Moreover, the model achieved an AUROC of 100% and 94.2% (Figure 3) with a sensitivity of 100% (95% C.I. 89-100%, *P* < .05) and 100% (95% C.I. 99-100%, *P* < .05), and a specificity of 100% (95% C.I. 34-100%, *P* < .05) and 69.9% (95% C.I. 65-74%, *P* < .05) in the first and second datasets, respectively (Table 2, Supp. Table 2 and 3).

The output of the automated classification algorithm is shown (Figure 4, Supp. Figures 1-3). Performing model inference within the nnU-Net framework for a single CTPA volume examination, utilizing a singular Nvidia RTX 4090 GPU card, required 240-300 s. Furthermore, disabling the test time augmentation (TTA) yielded a decrease in inference time to 60-70 s with an average accuracy decline of 1% (Supp. Tables 1-12).

![Figure 4.](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2024/04/11/2023.04.21.23288861/F4.medium.gif)

[Figure 4.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/F4)

Figure 4. Representative segmentation results of the trained model.
Axial, coronal, and sagittal planes from the same CTPA examinations from the external FUMPE dataset with the same window setting (width = 800 HU, level = 100 HU) are shown. Red, pulmonary embolism annotation; blue, model segmentation; purple, overlay of annotation and model segmentation.

### 3.3. Benchmarking of model performance

As mentioned above, post-processing strategy 1 was used to find out the best balance between sensitivity and specificity and 20 mm³ was determined as the optimal threshold volume. Aiming for the highest specificity and the lowest patient level false positive rate, we used post-processing strategy 2 where 50 mm³ was determined as optimal threshold volume. For size comparison, 20 mm³, 50 mm³, and other threshold volumes (Figure 5a) are compared to a segmented reference pulmonary artery (Figure 5b). In the internal test dataset, the highest specificity (96.7%; 95% C.I. 95-98%, *P* < .05) was obtained with a sensitivity of 87.5% (95% C.I. 81-92%, *P* < .05) with post-processing strategy 2 (Supp. materials, Supp. Table 7) and threshold volume of 50 mm³. For the external datasets, the highest specificity 100% (95% C.I. 34-100%, *P* < .05) and 96.9% (95% C.I. 95-98%, *P* < .05) was obtained with a sensitivity of 90.6% (95% C.I. 76-97%, *P* < .05) and 96.6% (95% C.I. 94-98%, *P* < .05) in the FUMPE and RSPECT datasets, respectively (Supp. Table 8 and 9). Moreover, we examined the source of false positives by implementing post-processing strategy 2, which led to a minimum number of FPs per dataset. The most frequent false positives were due to low contrast medium in pulmonary arteries (Table 3). Whereas 18% of FPs occurred on the outside of the thoracic cavity, in the upper abdomen, or in the superior vena cava, the remaining FPs occurred within or close to the pulmonary vessel network. Besides, most of the false negatives (FN) occurred in the RSPECT dataset, and the primary cause of these FNs is chronic PEs. We next compared model performance to those of previous studies (Table 4). With post-processing strategy 2, the proposed algorithm achieved a sensitivity of 96.2% (95% C.I. 94-98%, *P* < .05) and a specificity of 96.8% (95% C.I. 95-98%, *P* < .05) on the combined (internal and external) testing set (Supp. Table 12). While investigating the causes of false positives, we observed that 3 CTPAs from the RSPECT dataset that were annotated as PE negative were actually PE positive. Considering this correction, the proposed algorithm achieved a specificity of 97.1%.

![Figure 5.](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2024/04/11/2023.04.21.23288861/F5.medium.gif)

[Figure 5.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/F5)

Figure 5. An illustration of a segmented reference pulmonary artery with reference threshold volumes.
Patient orientation of 3D volumes with reference threshold volumes (20, 50, 200, 1000, and 10000 mm³) (a). Manually segmented reference pulmonary artery (volume of 113 cm³) from a male patient without PE (b). All volumetric images are isotropic (1 mm × 1 mm × 1 mm).

View this table:
[Table 3.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/T3)

Table 3. Sources of false positives in PE negative CTPA examinations from internal and external datasets

View this table:
[Table 4.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/T4)

Table 4. Model performance comparison for patient-level classification for PE in CTPA examinations.

## 4. Discussion

Detection of PE in CTPA using deep convolutional neural networks (DCNN) was first demonstrated by Tajbakhsh et al. with a sensitivity of 83% and 34.6% at 2 FPs per examinations on 121 internal and 20 external test examinations, respectively [26]. Rajan et al. proposed a two-stage solution where a 2D U-Net model was used for PE candidate generation, followed by a convolutional long short-term memory (LSTM) network coupled with multiple instance learning to detect PE lesions, with AUROC of 70% for subsegmental and segmental PE and 0.85 for saddle and main pulmonary artery PE on a test dataset of 512 CTPA examinations [16]. In a study by Huang et al. [27], a 3D CNN model termed PENet was developed which achieved a sensitivity of 75% and specificity of 81% on an external test dataset of 200 CTPA examinations. However, all these studies have major limitations such as small testing dataset sizes or low specificity rates. The current state-of-the-art results were recently achieved using the Resnet architecture on 1465 CTPA examinations with a sensitivity of 92.7% and specificity of 95.5% at the patient level [18]. Taken together, the performance of AI systems for PE detection is now at a point where clinical utility can be expected, but further gains in sensitivity and specificity are still warranted.

In this study, we developed an algorithm that classifies CTPA examinations for presence of PE consisting of two main stages, PE candidate selection and post-processing. For PE candidate selection, we trained and validated a semantic segmentation model, nnU-Net, on our internal dataset. The nnU-Net is a medical image segmentation framework based on the U-Net architecture and has outperformed state-of-the-art models by competing in 53 segmentation tasks from 11 international biomedical image segmentation challenges and taking first place in 33 of them [19]. To our knowledge, this is the first use of nnU-Net for classification for PE. To transform the segmentation model into a classification model, we developed rules based on probability and minimum volume thresholds as a post-processing stage. We defined two post-processing strategies, one for the best trade-off between sensitivity and specificity and one for achieving the highest specificity. At the best trade-off between sensitivity and specificity, the patient-level classification performance of the trained model achieved a sensitivity of 98.3% and specificity of 92.6% on the combined testing dataset using a threshold volume of 20 mm³, compared to specificity of 75.1% with sensitivity of 100% without post-processing. Thus, by sacrificing 1.7% of sensitivity, the model gained 17.4% in specificity using post-processing. The model outperformed the current state-of-the-art using the strategy of highest specificity, achieving 96.2% sensitivity and a specificity of 96.8% on the combined testing dataset of 1355 CTPA examinations with a total emboli volume threshold of 50 mm³.

Treatment for small PEs is debated and controversial [24,25] and PE volumes in this context are rarely measured and reported in the literature and are subject to inter-observer variability. We decided to set a cut-off volume to exclude the very smallest and in some cases doubtful PEs in our internal dataset. Radiologists DT and TF segmented all 1497 PEs and found a total volume of 50 mm³ to be a reasonable cut-off for exclusion. In a scenario where the total PE volume of 50 mm³ was represented by a single, equidimensional embolus, the cut-off size would amount to a cylinder with a diameter and height of 4 mm. Many of the PEs in the 21 excluded exams were much smaller eccentric or tiny webs, likely chronic.

Although the nnU-Net based model presented here is superior to the state-of-the-art, there are some limitations and opportunities for future enhancement. First, the model was trained on data from a single institution, although derived from five different CT scanners. Second, the training dataset, even though the proportion was not analyzed in detail, was predominantly comprised of acute PEs with a limited representation of chronic PEs. However, our observations suggest that chronic PEs within the RSPECT dataset are a significant factor contributing to false negatives. Third, the RSPECT validation dataset lacks voxel level annotation of PE by radiologists, which precludes final determination of sensitivity and specificity until a review has been completed. Finally, the activation of test time augmentation (TTA) extends the model inference duration to ∼300 s per CTPA examination. Conversely, deactivating the TTA reduces the model inference time to a range of 60 to 70 seconds for a single CTPA examination. However, when TTA is disabled, sensitivity and specificity decrease by approximately 0.3% and 1.7%, respectively.

In conclusion, nnU-Net deep learning based binary classification for PE holds potential to assist radiologists in the reading of CTPA examinations. Preferentially, such a system could prioritize PE positive cases in the work list, identifying high-priority cases for swift review [28], or provide a second opinion.

## Data Availability

Data generated or analyzed during the study are available from the corresponding author by request.

## Data sharing statement

Data generated or analyzed during the study are available from the corresponding author by request.

## Funding information

The project was supported by a grant from Analytic Imaging Diagnostics Arena (AIDA), [https://medtech4health.se/aida-en/](https://medtech4health.se/aida-en/), to Tobias Sjöblom. Tomas Fröding and Dimitrios Toumpanakis were supported by clinical fellowships from AIDA. Tomas Fröding was supported by the Centre for Clinical Research Sörmland, Uppsala University, Eskilstuna, Sweden.

## Supplementary Materials

### S.1. Dataset

#### S.1.1 Internal dataset CT Acquisition Protocols

All 700 CTPAs were performed with bolus tracking technique with the region of interest (ROI) in the pulmonary trunk. Different Hounsfield unit thresholds and delays were used. Contrast medium doses were recorded for 191 (range 20 ml – 114 ml, mean 62 ml) and injection rates for 158 (range 2,4 ml/s – 6,1 ml/s, mean 3,6 ml/s) CTPAs, respectively. The most frequently used CT image acquisition parameters were slice thickness 0.625 mm (range 0.625 mm - 2.0 mm), pixel spacing 0.7 mm (range 0.59 mm - 0.98 mm), tube voltage 100 kV (range 80 kV - 120 kV) and scanning direction caudal to cranial. The CTPAs acquired on the Siemens Somatom Definition Flash CT were in the majority of cases performed with dual-energy source acquisition with tube settings 80 kV / 140 kV and the images used in the dataset were post-processed blended images from a weighting factor 0.5.

#### S.1.2 Distribution of CT Pulmonary Angiography examinations from the same patient in internal datasets

The internal dataset comprises 700 CT Pulmonary Angiography (CTPA) examinations involving 652 patients. Among them, 41 patients underwent CTPA twice, 2 patients thrice, and 1 patient four times. Of the 149 pulmonary embolism (PE) -positive examinations, 142 patients were involved, and of the 551 PE-negative examinations, 520 patients were included. Ten patients had both PE-negative and PE-positive CT examinations. Since the CT scans acquired from the same patient were performed at different occasions, several anatomical aspects depending on breath hold level, angle of spine and pulmonary disease status were different (Supp. Figures 4-5). This means that there were differences at the voxel level and also differences in the data label (positive/negative PE), depending on the scan session. The examinations were therefore used and analyzed as if they had been obtained from different patients. They were therefore randomly distributed to cross-validation dataset, regardless of whether they belonged to the same patient. In the external datasets, all CTPAs were obtained from different patients and thus truly statistically independent.

#### S.1.3 Ferdowsi University of Mashhad’s Pulmonary Embolism Dataset

The Ferdowsi University of Mashhad’s PE dataset (FUMPE) is a publicly available dataset consisting of 35 CTPA examinations with voxel-level PE annotations by radiologists. One PE-positive examination was excluded due to a lack of ground truth annotation. Out of the 34 CTPA examinations, 32 were PE-positive and 2 were PE-negative. When examining the ground truth, we noticed that the slice locations of PE annotations were incorrect in 8 CTPA examinations. Specifically, in these cases, PE annotations that should have been located in slice 101 were mistakenly placed in slice 11. As a result, we relocated the PE annotations from slice 11 to slice 101.

### S.2. Environmental Settings and Versions

#### S.2.1 Model Training Environment

The environmental settings employed for both model training and cross-validation in this study encompass specific software versions: Ubuntu 22.04.3 LTS as the operating system, Docker version 24.0.7 for containerization, and CUDA Version 12.1, along with Nvidia driver version 530.30.02, are employed to facilitate seamless interactions with the NVIDIA GeForce RTX 2080 Ti GPUs. The programming language employed is Python, specifically version 3.8.10. The deep learning framework PyTorch is leveraged in version 2.0.0, and the semantic segmentation method nnU-Net is implemented in version 1.7.1.

#### S.2.2 Model Inference Environment

For model inference, another workstation was utilized, with specific environmental settings and software versions. These settings include Ubuntu 22.04.3 LTS as the operating system, Docker version 24.0.7 for containerization, CUDA Version 12.2, and Nvidia driver version 535.129.03, facilitating seamless interactions with the NVIDIA GeForce RTX 4090 GPU. The programming language employed is Python version 3.10.6. PyTorch, the deep learning framework, is utilized in version 2.1.0, and the semantic segmentation method nnU-Net is implemented in version 1.7.1.

### S.3. The nnU-Net Deep Learning Framework

#### S.3.1 Hyperparameters

In the training of our deep learning model, the nnU-Net framework employed a specific set of hyperparameters to optimize the learning process. The chosen optimizer is Stochastic Gradient Descent (SGD) with Nesterov momentum, utilizing a momentum value of 0.99. Additionally, weight decay was incorporated with a coefficient of 3e-05 to regulate the model’s complexity during training. The initial learning rate was set to 0.01, providing a starting point for the optimization process. To enhance the training procedure, a learning rate scheduler was implemented with a patience parameter of 30 epochs and a threshold of 1e-06. Lastly, the maximum number of training epochs was defined as 1000. These carefully selected hyperparameters contribute to the fine-tuning of the model, optimizing its performance over the course of training.

#### S.3.2 Data Augmentation

The nnU-Net framework employed a set of data augmentation techniques to generalize the models to prevent overfitting to the training data set. Elastic deformation, a spatial transformation technique, was introduced with an alpha range of (0.0, 200.0) and a sigma range of (9.0, 13.0), implemented with a probability of occurrence set at 0.2. Scaling transformations were applied within the range of (0.7, 1.4). To introduce variability in the orientation of the input data, rotational transformations along the X, Y, and Z axes were implemented with specified ranges. Gamma correction, an intensity transformation, was applied with a probability of 0.3 and a gamma range of (0.7, 1.5). Mirroring along axes (0, 1, 2) were applied. Furthermore, a cascaded random binary transformations and additive brightness adjustments were employed with specified probabilities and parameters.

#### S.3.3 3D U-Net Architecture

The nnU-Net framework is configured to generate a 3D U-Net architecture for semantic segmentation tasks. The 3D U-Net architecture is characterized by a symmetrical design with a decoder path that uses transposed convolutions for up sampling. The decoder path of the network consists of five transposed convolutional layers. Each layer employs 3D transposed convolutional operation with varying input and output channel sizes, effectively increasing spatial resolution. Starting with the first layer, it utilizes a transposed convolution operation with 320 input channels, 320 output channels, a 2x2x2 kernel size, and a stride of 2 in all spatial dimensions. Subsequent other layers follow a similar structure, progressively decreasing the number of input channels while maintaining the up-sampling strategy. Additionally, the architecture includes an encoder path with five convolutional layers, where each 3D convolutional operation employs a 1x1x1 kernel with a stride of 1. These layers reduce the channel depth and capture hierarchical features. Each convolutional layer followed by 3D instance normalization and leaky rectified linear unit activation.

## S.4. Post-processing step

The nn-Unet *softmax* activation function of the final layer of the U-Net architecture can be used to scale network output into probabilities. Hence, the probabilities could be gathered, and not only final pixel class values. We developed a set of logical rules based on different *softmax* probability thresholds (0.75 - 0.95) and threshold volumes per examinations (0 mm³ to 200 mm³ in 10 mm³ intervals) to reduce false positives (FPs) and convert nnU-Net inference segmentation output into a patient-level classification output. By setting different *softmax* probability thresholds, we obtained different predicted PE volumes. If the model is well-trained to distinguish between PE and non-PE classes, the number of predicted voxels (false positive voxels) that do not belong to the PE class will decrease when the *softmax* probabilities are set to higher thresholds. Therefore, we developed the formulas below to decide whether the total predicted PE volume is sufficient to determine the patient as PE positive/negative.

Proposition 1: ![Formula][1]</img>  where *P*0.75 is the volume of total PE predicted by the trained model at a softmax probability of 0.75, *P*0.90 the ratio factor, which was fixed at 15. The *softmax* probability value range and the ratio factor were optimized by systematic exploration.

Proposition 2: ![Formula][2]</img>  where *Pi* is the volume of total PE predicted by the trained model at a *softmax* probability of *i* between 0.75 to 0.95 with 0.05 intervals, *v* is the threshold volume between 0 and 200 mm³ at 10 mm³ intervals and is the condition factor (min value is 0, max value is 4) that refers to the total number of true conditions satisfying *Pi < v* equation.

Then, the final decision is made as follows:

![Formula][3]</img> 
According to the propositions above, we defined two post-processing strategies. Strategy 1 (Rule-in classification for PE) aimed to find the exact threshold volume value and value for the best trade-off between sensitivity and specificity by checking the Matthew’s correlation coefficient (MCC) value. And strategy 2 (Rule-out classification for PE) aimed to find the exact threshold volume and values for the highest specificity alongside the highest MCC value.

*Strategy 1:*

By systematic exploration, setting the threshold volume value to 20 mm³ and the k value to 1 gives the highest MCC value (84.9%, Supplementary Table 4).

*Strategy 2*

By systematic exploration, setting the threshold volume value to 50 mm³ and the k value to 0 gives the highest specificity alongside the highest MCC value (83.7%, Supplementary Table 7).

![Supplemental Figure 1.](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2024/04/11/2023.04.21.23288861/F6.medium.gif)

[Supplemental Figure 1.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/F6)

Supplemental Figure 1. Representative segmentation results from the FUMPE dataset (patient 03).
Axial, coronal, and sagittal planes from the same CTPA examination from the external FUMPE dataset with the same window setting (width = 800 HU, level = 100 HU) are shown. Red, pulmonary embolism annotation; blue, model segmentation; purple, overlay of annotation and model segmentation.


![Supplemental Figure 2.](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2024/04/11/2023.04.21.23288861/F7.medium.gif)

[Supplemental Figure 2.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/F7)

Supplemental Figure 2. Representative segmentation results from the FUMPE dataset (patient 04).
Axial, coronal, and sagittal planes from the same CTPA examination from the external FUMPE dataset with the same window setting (width = 800 HU, level = 100 HU) are shown. Red, pulmonary embolism annotation; blue, model segmentation; purple, overlay of annotation and model segmentation.

![Supplemental Figure 3.](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2024/04/11/2023.04.21.23288861/F8.medium.gif)

[Supplemental Figure 3.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/F8)

Supplemental Figure 3. Representative segmentation results from the FUMPE dataset (patient 05).
Axial, coronal, and sagittal planes from the same CTPA examination from the external FUMPE dataset with the same window setting (width = 800 HU, level = 100 HU) are shown. Red, pulmonary embolism annotation; blue, model segmentation; purple, overlay of annotation and model segmentation.

![Supplemental Figure 4.](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2024/04/11/2023.04.21.23288861/F9.medium.gif)

[Supplemental Figure 4.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/F9)

Supplemental Figure 4. Representative examples from two CT Pulmonary Angiography (CTPA) examinations of the same patient, both having pulmonary embolism.
Two CTPA examinations from the same patient within the internal dataset are presented, featuring identical window settings (width = 1500 HU, level = -400 HU) and depicted at three distinct anatomical levels. The patient exhibited pulmonary embolism in both CTPA examinations.

![Supplemental Figure 5.](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2024/04/11/2023.04.21.23288861/F10.medium.gif)

[Supplemental Figure 5.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/F10)

Supplemental Figure 5. Representative examples of two CT Pulmonary Angiography (CTPA) examinations of the same patient with a pulmonary embolism in one examination but not in the other.
Two CTPA examinations from the same patient within the internal dataset are presented, featuring identical window settings (width = 1500 HU, level = -400 HU) and depicted at three distinct anatomical levels. The patient exhibited pulmonary embolism in examination 2.

View this table:
[Supplementary Table 1.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/T5)

Supplementary Table 1. Diagnostic performance of the trained model without post-processing in the internal dataset

View this table:
[Supplementary Table 2.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/T6)

Supplementary Table 2. Diagnostic performance of the trained model without post-processing in the external FUMPE dataset

View this table:
[Supplementary Table 3.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/T7)

Supplementary Table 3. Diagnostic performance of the trained model without post-processing in the external RSPECT Dataset

View this table:
[Supplementary Table 4.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/T8)

Supplementary Table 4. Diagnostic performance of the trained model with post-processing strategy 1 in the internal dataset

View this table:
[Supplementary Table 5.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/T9)

Supplementary Table 5. Diagnostic performance of the trained model with post-processing strategy 1 in the external FUMPE Dataset

View this table:
[Supplementary Table 6.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/T10)

Supplementary Table 6. Diagnostic performance of the trained model with post-processing strategy 1 in the external RSPECT Dataset

View this table:
[Supplementary Table 7.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/T11)

Supplementary Table 7. Diagnostic performance of the trained model with post-processing strategy 2 in the internal Dataset

View this table:
[Supplementary Table 8.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/T12)

Supplementary Table 8. Diagnostic performance of the trained model with post-processing strategy 2 in the eternal FUMPE Dataset

View this table:
[Supplementary Table 9.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/T13)

Supplementary Table 9. Diagnostic performance of the trained model with post-processing strategy 2 in the external RSPECT Dataset

View this table:
[Supplementary Table 10.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/T14)

Supplementary Table 10. Diagnostic performance of the trained model without post-processing strategy in the combined testing dataset

View this table:
[Supplementary Table 11.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/T15)

Supplementary Table 11. Diagnostic performance of the trained model with post-processing strategy 1 in the combined testing dataset

View this table:
[Supplementary Table 12.](http://medrxiv.org/content/early/2024/04/11/2023.04.21.23288861/T16)

Supplementary Table 12. Diagnostic performance of the trained model with post-processing strategy 2 in the combined testing dataset

## Acknowledgements

The project was supported by a grant from Analytic Imaging Diagnostics Arena (AIDA) to Tobias Sjöblom. Tomas Fröding and Dimitrios Toumpanakis were financially supported by clinical fellowships from AIDA. Tomas Fröding was financially supported by the Centre for Clinical Research Sörmland, Uppsala University.

## Footnotes

*   * Co-first authors

*   ** Co-Senior authors

*   Abstract and Supplemental materials are updated. Tables 4 are updated.

## Abbreviations

PE
:   pulmonary embolism
CTPA
:   computed tomography pulmonary angiography
nnU-Net
:   no-new-U-Net
DL
:   deep learning
CADe
:   computer-aided detection.

*   Received April 21, 2023.
*   Revision received April 11, 2024.
*   Accepted April 11, 2024.


*   © 2024, Posted by Cold Spring Harbor Laboratory

The copyright holder for this pre-print is the author. All rights reserved. The material may not be redistributed, re-used or adapted without the author's permission.

## References

1.  [1].Sista AK, Kuo WT, Schiebler M, Madoff DC. Stratification, Imaging, and Management of Acute Massive and Submassive Pulmonary Embolism. Radiology 2017;284:5–24. doi:10.1148/radiol.2017151978.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1148/radiol.2017151978&link_type=DOI) 

2.  [2].Raskob GE, Angchaisuksiri P, Blanco AN, Buller H, Gallus A, Hunt BJ, et al. Thrombosis: A Major Contributor to Global Disease Burden. ATVB 2014;34:2363–71. doi:10.1161/ATVBAHA.114.304488.
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NzoiYXR2YmFoYSI7czo1OiJyZXNpZCI7czoxMDoiMzQvMTEvMjM2MyI7czo0OiJhdG9tIjtzOjUwOiIvbWVkcnhpdi9lYXJseS8yMDI0LzA0LzExLzIwMjMuMDQuMjEuMjMyODg4NjEuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9) 

3.  [3].Elenizi K, Alharthi R, Galinier M. Pulmonary embolism originating from germ cell tumor causes severe left ventricular dysfunction in a healthy young adult with full recovery: a case report. BMC Cardiovasc Disord 2021;21:260. doi:10.1186/s12872-021-02066-7.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1186/s12872-021-02066-7&link_type=DOI) 

4.  [4].Sista AK, Horowitz JM, Tapson VF, Rosenberg M, Elder MD, Schiro BJ, et al. Indigo Aspiration System for Treatment of Pulmonary Embolism. JACC: Cardiovascular Interventions 2021;14:319–29. doi:10.1016/j.jcin.2020.09.053.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.jcin.2020.09.053&link_type=DOI) 

5.  [5].Jha AK, Larizgoitia I, Audera-Lopez C, Prasopa-Plaizier N, Waters H, Bates DW. The global burden of unsafe medical care: analytic modelling of observational studies. BMJ Qual Saf 2013;22:809–15. doi:10.1136/bmjqs-2012-001748.
    
    [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6MzoicWhjIjtzOjU6InJlc2lkIjtzOjk6IjIyLzEwLzgwOSI7czo0OiJhdG9tIjtzOjUwOiIvbWVkcnhpdi9lYXJseS8yMDI0LzA0LzExLzIwMjMuMDQuMjEuMjMyODg4NjEuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9) 

6.  [6].Rivera-Lebron B, McDaniel M, Ahrar K, Alrifai A, Dudzinski DM, Fanola C, et al. Diagnosis, Treatment and Follow Up of Acute Pulmonary Embolism: Consensus Practice from the PERT Consortium. Clin Appl Thromb Hemost 2019;25:107602961985303. doi:10.1177/1076029619853037.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1177/1076029619853037&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F04%2F11%2F2023.04.21.23288861.atom) 

7.  [7].Konstantinides SV, Meyer G, Becattini C, Bueno H, Geersing G-J, Harjola V-P, et al. 2019 ESC Guidelines for the diagnosis and management of acute pulmonary embolism developed in collaboration with the European Respiratory Society (ERS): The Task Force for the diagnosis and management of acute pulmonary embolism of the European Society of Cardiology (ESC). European Heart Journal 2020;41:543–603. doi:10.1093/eurheartj/ehz405.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/eurheartj/ehz405&link_type=DOI) 

8.  [8].Hendriks BMF, Eijsvoogel NG, Kok M, Martens B, Wildberger JE, Das M. Optimizing Pulmonary Embolism Computed Tomography in the Age of Individualized Medicine: A Prospective Clinical Study. Invest Radiol 2018;53:306–12. doi:10.1097/RLI.0000000000000443.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1097/RLI.0000000000000443&link_type=DOI) 

9.  [9].Wittenberg R, Peters JF, Sonnemans JJ, Prokop M, Schaefer-Prokop CM. Computer-assisted detection of pulmonary embolism: evaluation of pulmonary CT angiograms performed in an on-call setting. Eur Radiol 2010;20:801–6. doi:10.1007/s00330-009-1628-7.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1007/s00330-009-1628-7&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=19862534&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F04%2F11%2F2023.04.21.23288861.atom) 

10. [10].Zhou C, Chan H-P, Hadjiiski LM, Chughtai A, Patel S, Cascade PN, et al. Automated detection of pulmonary embolism (PE) in computed tomographic pulmonary angiographic (CTPA) images: multiscale hierachical expectation-maximization segmentation of vessels and PEs. In: Giger ML, Karssemeijer N, editors., San Diego, CA: 2007, p. 65142F. doi:10.1117/12.713769.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1117/12.713769&link_type=DOI) 

11. [11].Özkan H, Osman O, Şahin S, Boz AF. A novel method for pulmonary embolism detection in CTA images. Computer Methods and Programs in Biomedicine 2014;113:757–66. doi:10.1016/j.cmpb.2013.12.014.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.cmpb.2013.12.014&link_type=DOI) 

12. [12].Myers MH, Beliaev I, Lin K-I. Machine Learning Techniques in Detecting of Pulmonary Embolisms. 2007 International Joint Conference on Neural Networks, Orlando, FL, USA: IEEE; 2007, p. 385–90. doi:10.1109/IJCNN.2007.4370987.
    
    
13. [13].Wang X, Song X, Chapman BE, Zheng B. Improving performance of computer-aided detection of pulmonary embolisms by incorporating a new pulmonary vascular-tree segmentation algorithm. In: van Ginneken B, Novak CL, editors., San Diego, California, USA: 2012, p. 83152U. doi:10.1117/12.911301.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1117/12.911301&link_type=DOI) 

14. [14].Ozkan H, Tulum G, Osman O, Sahin S. Automatic Detection of Pulmonary Embolism in CTA Images Using Machine Learning. ElAEE 2017;23:63–7. doi:10.5755/j01.eie.23.1.17585.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.5755/j01.eie.23.1.17585&link_type=DOI) 

15. [15].Tajbakhsh N, Shin JY, Gotway MB, Liang J. Computer-aided detection and visualization of pulmonary embolism using a novel, compact, and discriminative image representation. Medical Image Analysis 2019;58:101541. doi:10.1016/j.media.2019.101541.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.media.2019.101541&link_type=DOI) 

16. [16].Rajan D, Beymer D, Abedin S, Dehghan E. Pi-PE: A Pipeline for Pulmonary Embolism Detection using Sparsely Annotated 3D CT Images. Proceedings of the Machine Learning for Health NeurIPS Workshop, PMLR; 2020, p. 220–32.
    
    
17. [17].Weikert T, Winkel DJ, Bremerich J, Stieltjes B, Parmar V, Sauter AW, et al. Automated detection of pulmonary embolism in CT pulmonary angiograms using an AI-powered algorithm. Eur Radiol 2020;30:6545–53. doi:10.1007/s00330-020-06998-0.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1007/s00330-020-06998-0&link_type=DOI) 

18. [18].Siddique N, Paheding S, Elkin CP, Devabhaktuni V. U-Net and Its Variants for Medical Image Segmentation: A Review of Theory and Applications. IEEE Access 2021;9:82031–57. doi:10.1109/ACCESS.2021.3086020.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1109/ACCESS.2021.3086020&link_type=DOI) 

19. [19].Isensee F, Jaeger PF, Kohl SAA, Petersen J, Maier-Hein KH. nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nat Methods 2021;18:203–11. doi:10.1038/s41592-020-01008-z.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/s41592-020-01008-z&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=33288961&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F04%2F11%2F2023.04.21.23288861.atom) 

20. [20].Kahraman AT, Fröding T, Toumpanakis D, Sladoje N, Sjöblom T. Automated detection, segmentation and measurement of major vessels and the trachea in CT pulmonary angiography. Sci Rep 2023;13:18407. doi:10.1038/s41598-023-45509-1.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/s41598-023-45509-1&link_type=DOI) 

21. [21].Wolf I, Vetter M, Wegner I, Böttger T, Nolden M, Schöbinger M, et al. The Medical Imaging Interaction Toolkit. Medical Image Analysis 2005;9:594–604. doi:10.1016/j.media.2005.04.005.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.media.2005.04.005&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=15896995&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F04%2F11%2F2023.04.21.23288861.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000232945700008&link_type=ISI) 

22. [22].Masoudi M, Pourreza H-R, Saadatmand-Tarzjan M, Eftekhari N, Zargar FS, Rad MP. A new dataset of computed-tomography angiography images for computer-aided detection of pulmonary embolism. Sci Data 2018;5:180180. doi:10.1038/sdata.2018.180.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/sdata.2018.180&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=30179235&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F04%2F11%2F2023.04.21.23288861.atom) 

23. [23].Colak E, Kitamura FC, Hobbs SB, Wu CC, Lungren MP, Prevedello LM, et al. The RSNA Pulmonary Embolism CT Dataset. Radiology: Artificial Intelligence 2021;3:e200254. doi:10.1148/ryai.2021200254.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1148/ryai.2021200254&link_type=DOI) 

24. [24].Fernández□Capitán C, Rodriguez Cobo A, Jiménez D, Madridano O, Ciammaichella M, Usandizaga E, et al. Symptomatic subsegmental versus more central pulmonary embolism: Clinical outcomes during anticoagulation. Research and Practice in Thrombosis and Haemostasis 2021;5:168–78. doi:10.1002/rth2.12446.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/rth2.12446&link_type=DOI) 

25. [25]. Den Exter PL, Kroft LJM, Gonsalves C, Le Gal G, Schaefer-Prokop CM, Carrier M, et al. Establishing diagnostic criteria and treatment of subsegmental pulmonary embolism: A Delphi analysis of experts. Research and Practice in Thrombosis and Haemostasis 2020;4:1251–61. doi:10.1002/rth2.12422.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1002/rth2.12422&link_type=DOI) 

26. [26].Tajbakhsh N, Gotway MB, Liang J. Computer-Aided Pulmonary Embolism Detection Using a Novel Vessel-Aligned Multi-planar Image Representation and Convolutional Neural Networks. In: Navab N, Hornegger J, Wells WM, Frangi A, editors. Medical Image Computing and Computer-Assisted Intervention -- MICCAI 2015, vol. 9350, Cham: Springer International Publishing; 2015, p. 62–9. doi:10.1007/978-3-319-24571-3_8.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1007/978-3-319-24571-3_8&link_type=DOI) 

27. [27].Huang S-C, Kothari T, Banerjee I, Chute C, Ball RL, Borus N, et al. PENet—a scalable deep-learning model for automated diagnosis of pulmonary embolism using volumetric CT imaging. Npj Digit Med 2020;3:61. doi:10.1038/s41746-020-0266-y.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/s41746-020-0266-y&link_type=DOI) 

28. [28].Wittenberg R, Berger FH, Peters JF, Weber M, van Hoorn F, Beenen LFM, et al. Acute Pulmonary Embolism: Effect of a Computer-assisted Detection Prototype on Diagnosis—An Observer Study. Radiology 2012;262:305–13. doi:10.1148/radiol.11110372.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1148/radiol.11110372&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=22190659&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F04%2F11%2F2023.04.21.23288861.atom) 

29. [29].Stein PD, Fowler SE, Goodman LR, Gottschalk A, Hales CA, Hull RD, et al. Multidetector Computed Tomography for Acute Pulmonary Embolism. N Engl J Med 2006;354:2317–27. doi:10.1056/NEJMoa052367.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1056/NEJMoa052367&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=16738268&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F04%2F11%2F2023.04.21.23288861.atom) 
    
    [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000237918500004&link_type=ISI) 

30. [30].Maizlin ZV, Vos PM, Godoy MB, Cooperberg PL. Computer-aided Detection of Pulmonary Embolism on CT Angiography: Initial Experience. Journal of Thoracic Imaging 2007;22:324–9. doi:10.1097/RTI.0b013e31815b89ca.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1097/RTI.0b013e31815b89ca&link_type=DOI) 
    
    [PubMed](http://medrxiv.org/lookup/external-ref?access_num=18043386&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2024%2F04%2F11%2F2023.04.21.23288861.atom) 

31. [31].Lahiji K, Kligerman S, Jeudy J, White C. Improved Accuracy of Pulmonary Embolism Computer-Aided Detection Using Iterative Reconstruction Compared With Filtered Back Projection. American Journal of Roentgenology 2014;203:763–71. doi:10.2214/AJR.13.11838.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.2214/AJR.13.11838&link_type=DOI) 

32. [32].Huhtanen H, Nyman M, Mohsen T, Virkki A, Karlsson A, Hirvonen J. Automated detection of pulmonary embolism from CT-angiograms using deep learning. BMC Med Imaging 2022;22:43. doi:10.1186/s12880-022-00763-z.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1186/s12880-022-00763-z&link_type=DOI) 

33. [33].Ma X, Ferguson EC, Jiang X, Savitz SI, Shams S. A multitask deep learning approach for pulmonary embolism detection and identification. Sci Rep 2022;12:13087. doi:10.1038/s41598-022-16976-9.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/s41598-022-16976-9&link_type=DOI) 

34. [34].Wiklund P, Medson K, Elf J. Incidental pulmonary embolism in patients with cancer: prevalence, underdiagnosis and evaluation of an AI algorithm for automatic detection of pulmonary embolism. Eur Radiol 2022;33:1185–93. doi:10.1007/s00330-022-09071-0.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1007/s00330-022-09071-0&link_type=DOI) 

35. [35].Djahnine A, Lazarus C, Lederlin M, Mulé S, Wiemker R, Si-Mohamed S, et al. Detection and severity quantification of pulmonary embolism with 3D CT data using an automated deep learning-based artificial solution. Diagnostic and Interventional Imaging 2024;105:97–103. doi:10.1016/j.diii.2023.09.006.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.diii.2023.09.006&link_type=DOI) 

36. [36].Islam NU, Zhou Z, Gehlot S, Gotway MB, Liang J. Seeking an optimal approach for Computer-aided Diagnosis of Pulmonary Embolism. Medical Image Analysis 2024;91:102988. doi:10.1016/j.media.2023.102988.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.media.2023.102988&link_type=DOI) 

37. [37].Belkouchi Y, Lederlin M, Ben Afia A, Fabre C, Ferretti G, De Margerie C, et al. Detection and quantification of pulmonary embolism with artificial intelligence: The SFR 2022 artificial intelligence data challenge. Diagnostic and Interventional Imaging 2023;104:485–9. doi:10.1016/j.diii.2023.05.007.
    
    [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.diii.2023.05.007&link_type=DOI)

 [1]: /embed/graphic-10.gif
 [2]: /embed/graphic-11.gif
 [3]: /embed/graphic-12.gif