Abstract
Breast cancer is one of the most widespread and closely monitored pathologies in high-income countries. After breast biopsy, histological tissue is stored in paraffin, sectioned and mounted. Conventional inspection of tissue slides under benchtop light microscopes involves paraffin removal and staining, typically with H&E. Expert pathologists are then called to judge the stained slides. However, paraffin removal and staining are operator-dependent, time- and resource-consuming processes that can generate ambiguities due to non-uniform staining. Here we propose a novel method that works directly on paraffined stain-free slides. The method is automatic, operator-independent, and classifies different portions of the tissue image with very high accuracy. In addition, it returns a guide map that helps pathologists judge the different tissue portions based on the likelihood that they are associated with a breast cancer or fibroadenoma biomarker. We use Fourier Ptychography as a quantitative phase-contrast microscopy method, which gives access to a very wide field of view (i.e., square millimeters) in a single image while guaranteeing high lateral resolution (i.e., 0.5 microns). This imaging method is multi-scale, since it allows looking at the big picture, i.e. the complex tissue structure and connections, while retaining the possibility to zoom in down to the single-cell level. To handle this informative image content, we introduce elements of fractal geometry as a multi-scale analysis method. We show the effectiveness of fractal features in describing fibroadenoma and breast cancer in tissue slides from six patients with very high accuracy. The proposed method could significantly simplify the steps of tissue analysis and make it independent of the sample preparation, the skills of the lab operator and the pathologist.
Introduction
Breast cancer represents one of the most monitored pathologies for women due to its high mortality and morbidity rate. In fact, the five-year survival rate in metastatic breast cancer is less than 30%. Recent data produced by the IARC (International Agency for Research on Cancer) report that, across 185 examined countries, 2.3 million new cases of breast cancer (11.7%) were found, with a mortality rate of 6.9%.1 Also, the incidence of breast cancer is higher in high-income countries (571/100,000) than in low-income countries (95/100,000). Breast cancer encompasses a group of diseases characterized by different biological subtypes, each with its own molecular profile and specific clinical-pathological characteristics.2 The diagnosis of breast cancer is based on clinical examination combined with imaging and confirmed by pathological assessment. The comprehensive pathological assessment of breast cancer should be performed in alignment with the World Health Organization (WHO) classification3 and the eighth edition of the American Joint Committee on Cancer (AJCC) Tumour, Node, Metastasis (TNM) staging system,4 and includes not only anatomical considerations but also crucial prognostic insights tied to tumor biology, such as tumor grade, estrogen receptor (ER), progesterone receptor (PgR), human epidermal growth factor receptor 2 (HER2), and available gene expression data.5
In clinical practice, the pathological assessment of breast tissue is usually performed on samples obtained through needle aspiration, biopsy, or surgical excision. Histological and immunohistochemical investigations, such as the classical hematoxylin and eosin (H&E) staining, staining with specific antibodies and other useful molecular tests, are used to characterize breast cancer. Moreover, the diagnosis of breast cancer is made by pathologists and requires time and experience. Hence, diagnostic results do not always coincide, as they depend on several factors such as previous experience and sample preparation. Currently, the accuracy of diagnosis is limited to 75%.6 Although recent progress in the image acquisition of tissue slides with modern microscopes has brought a high degree of automation, the evaluation remains extremely complex, time-consuming and labor-intensive. Therefore, reliable, rapid, automatic and less operator-dependent methods for breast cancer diagnosis are still challenging to obtain and far from meeting the actual needs. Essentially, in the era of digital analysis and big data, the bottleneck in the diagnosis lies in the visual examination of very large images, which implies a tiring workload for histopathologists. One possibility is to apply straightforward approaches to obtain classifiers working on texture and morphological features through a computational analysis of standard microscope images of stained tissue slides. This would be a step forward in classifying benign and malignant tumors automatically and in overcoming the subjectivity of image analysis.7,8 Recently, interesting developments have been introduced in the histopathology field by digital pathology. The widespread use of slide scanning systems is mainly associated with the reduced costs of scanning technology and digital storage.
Nowadays, the technological advances offered by modern microscope apparatuses for Whole Slide Imaging (WSI) have opened the route to several new possibilities.9,10 WSI allows very fast, high-resolution acquisition of entire tissue slides, thus making images of the biopsies available in digital format within times compatible with clinical practice.11 Access to such a huge amount of data has favoured the use of Artificial Intelligence (AI), with Neural Networks fed with examples in the form of images, to enhance research and clinical practice in oncology. Quantitative data can also be extracted from digitized histopathology images of whole tissue slides. Such information can support researchers, physicians and pathologists in the accurate analysis of a patient’s slide and can make their work faster and less cumbersome.12-15
Several aspects can significantly affect the quality of immunohistochemistry or, more generally, of the entire preparation process for tissue slides, such as storage time, oxidation, hydrolysis, tissue processing time, and fixation time and type. Within this framework, the current protocols for clinical pathology are based on the inspection of stained slides, so that a pathologist is used to judging a slide based on her/his knowledge of how the morphology of a healthy tissue section appears when observed through the intrinsic filter introduced by the stain. Hence, the slide preparation and staining process is crucial for an effective diagnosis. Besides, fixation in 10% neutral buffered formalin for approximately 15 hours and slide storage in paraffin are of great importance to preserve the tissue samples. Usually, the tissue slides are coated with, or dipped in, paraffin in order to embed and seal them and reduce oxidation. Then, to image the sample slides using a light microscope (LM), paraffin has to be removed in an incubator and the tissue slide has to be stained, as sketched in Fig. 1(a).16 However, staining the tissue, for example with H&E, is a process that can lead to misinterpretations, since the results of the staining process strictly depend on the lab operator. Thus, an image section can appear “brighter” or “darker” depending on the laboratory where the staining has been applied, and the interpretation of the pathologist called to judge the slide can be affected as well. Similarly, algorithms for automatic image analysis and even deep learning architectures can be affected by such stain-induced ambiguities.17 In general, staining a sample is a process characterized by a large failure rate, so that a large number of stained slides cannot be used for histopathology purposes, e.g. due to uneven staining.18 Uneven staining can also be caused by the process of paraffin removal, or by incorrect sectioning, overly dehydrated tissue, poorly infiltrated tissue, or water provoking under-staining of cytoplasmic structures.19 Also, the use of formalin can cause over-drying and searing of the outer edges of the tissue when the slide is excessively exposed to sunlight, thus provoking an incorrect appearance and loss of nuclear details.19 Microscopy observation of the slides under a LM returns false colour images showing the stained areas with higher contrast, while the inner tissue structures that do not bind to the stain are returned with poor contrast. This type of observation, although widely accepted and used in clinical practice, is intrinsically ambiguous due to the above-mentioned operator dependency and to the lack of an absolute reference for comparing images whose pixel values cannot be linked to a physical measure.6
Fig. 1. (a) Conventional light microscopy analysis. (b) Proposed stain-free method.
With the aim of avoiding the ambiguities associated with staining, label-free methods have emerged. For example, Raman spectroscopy can provide information about biomolecular alterations in the sample in a non-destructive and label-free way.20 Furthermore, Raman spectroscopy can be combined with machine learning (ML) to make the spectral analysis automatic and more objective. Another way to image breast tissue is based on high-definition Fourier Transform Infrared (FT-IR) imaging, used to find a spectroscopic signature for cancer classification.21
Among these label-free techniques, Quantitative Phase Imaging (QPI) has recently emerged as a class of methods that can image label/stain-free biological samples while providing the necessary contrast for downstream analysis, physiology and histopathology observations.22 QPI methods measure the optical path delay introduced by the biological sample on the light probe, which is linked to biophysical quantities. In QPI, the optical readout is the phase-contrast map. Unlike LM images of stained specimens, each pixel of a phase-contrast map is proportional to the optical thickness, i.e. the integral of the refractive index along the optical axis direction across the physical thickness of the sample. Similarly, the dry mass is a quantitative parameter that can be calculated for each pixel in the image. Both these quantities depend on the local density of the specimen (e.g. cell nuclei show higher phase contrast and dry mass than cytoplasmic compartments).23 This mechanism provides the contrast needed for analyzing tissue slides without ambiguities. Different QPI approaches have been proposed to inspect biological tissue slides in stain-free mode, including Digital Holography (DH),24-26 Fourier Ptychographic Microscopy (FPM),27-33 micro-optical coherence tomography,34 Spatial Light Interference Microscopy (SLIM),35 and the above-mentioned WSI.9,10 Among the different QPI approaches, FPM is preferred because it relaxes the optical constraint that limits the obtainable space-bandwidth product. In FPM, phase-contrast imaging over a wide Field of View (FoV) is made accessible by selecting an optical configuration that adopts low-magnification microscope objectives (MOs). For most benchtop microscopes, this choice means sacrificing lateral resolution.
In FPM, angle diversity is introduced in the illumination pattern with the aim of enhancing the resolution according to a synthetic aperture principle.27-33 A set of bright-field and dark-field images is captured, each one transferring a subset of the spatial frequencies of the sample. In particular, the large-angle light probes (dark-field images) convey the high-frequency details into the MO Numerical Aperture (NA), thus bringing them within the system bandpass cutoff. Then, a phase-retrieval process estimates the high-resolution complex amplitude from the set of low-resolution intensities. The result is an image with a mm2-cm2 FoV and submicron lateral resolution. This is ideal for investigating histopathology slides, where small tissue portions are not necessarily representative of the condition of the patient undergoing biopsy and the inspection of the entire slide is sometimes necessary. FPM has been used with various coded illumination schemes36 for imaging cell cultures and tissue slides27,37 in applications ranging from biology research and drug testing to mechanobiology.27,31,38 Recently, deep learning methods have been employed to speed up the reconstruction process33,39,40 and to make FPM microscopes more robust against misalignments, thus helping the ongoing process of translating FPM to clinical practice.32,41
Here we use FPM to image and analyse breast tissue slides from six patients in stain-free modality. We accurately distinguish the tissue portions exhibiting breast cancer from the fibroadenoma areas. ML is applied by extracting meaningful features from the wrapped (i.e., modulo 2π) FPM phase-contrast maps. The features are used to train a classifier that infers the class each image patch belongs to. Then, by using a max-voting approach specifically developed for digital histopathology, the proposed method provides classification at the single-patch level, image level, and patient level with a progressively lower classification error (on average, 21.6%, 7.7%, and 0.0%, respectively). As a result of this analysis, we provide a very accurate overall classification of each patient’s slide to furnish a first automatic indication to the pathologist. Besides, we create a heatmap of the most relevant parameter for classification, which can serve as a guide to establish the areas where the breast cancer phenotype is more or less expressed.
The stain-free process we propose is sketched in Fig. 1(b). It is important to note that, in our FPM imaging of tissue slides, it is not necessary to remove paraffin from the slides. Paraffin preserves the tissue slides. However, it can be detrimental to FPM phase imaging. Indeed, paraffin can act as a layer with a large refractive index that introduces an additional optical path delay and provokes severe phase wrapping. Besides, refraction from the paraffin can change the illumination vector in an unpredictable way, so that the actual illumination of the tissue portion is not consistent with the nominal illumination used in the FPM phase-retrieval algorithm. Nevertheless, we demonstrate that, in FPM, paraffin does not affect the analysis when this is carried out relying on fractal biomarkers. Essentially, we show that the phase wrapping pattern obtained from FPM can be used as a fingerprint to characterize and classify the different portions of the image. In particular, with the aim of obtaining features to classify FPM images where the tissue morphology cannot be inferred, we rely on a recently developed analysis framework based on elements of fractal geometry. Fractal geometry is a branch of mathematics particularly suitable for describing natural objects and their complexity.42,43 In microscopy, it has been applied to various problems, e.g. to describe the capillary system in angiography,44 to describe the structure of neuron networks in LM images of brain tissue slides,45 to phenotype tumour cells,46 and, in scattering-based cytometry, to characterize the complexity of the scattering patterns of single cells47 and their link to the intracellular composition and distribution of organelles, e.g. the mitochondrial network in healthy and precancerous epithelial cells.48 Recently, we applied fractal analysis to wrapped holographic phase-contrast maps of marine microalgae and microplastics to define a fingerprint of microplastic items and identify them in water samples.49 Indeed, complexity descriptors like fractal dimension and lacunarity44,50,51 are particularly useful to characterize the distribution of the phase jumps within each single cell.49 Here we apply such elements of fractal geometry to the FPM wrapped phase-contrast images, i.e. we describe the structure of phase jumps (or “lacunes”) at the whole-image level to train the classifier. Such analysis is made possible by the “multi-scale” nature of both FPM imaging and fractal geometry. We show the effectiveness of fractal descriptors in classifying the stain-free digital images of breast tissue biopsies recorded by FPM without removing paraffin, as sketched in Fig. 1(b).
Material and Methods
FPM complex amplitude estimate
FPM27,28 is a non-interferometric QPI technique29 that provides phase images relying on phase retrieval algorithms.30 The main feature of FPM is the generation of high-resolution phase-contrast images over a large FoV. This is possible thanks to the principle of synthetic numerical aperture (PSNA),31 which makes it possible to overcome the trade-off between high lateral resolution and large FoV that is typical of conventional microscopy. In fact, in conventional microscopy, the larger the NA, the higher the resolution and the smaller the FoV, and vice versa.
The FPM system is built according to the PSNA, by suitably choosing the illumination source and the MO. In our configuration, the MO has a low numerical aperture (NA) to ensure a wide FoV, while the illumination source is a planar LED array. The system setup and the working principle of the acquisition are sketched in Fig. 2. By sequentially turning on each LED, the object on the sample plane is probed by different light sources with illumination angles that depend on the LED position in the source array. The central LEDs probe the object perpendicularly to the sample plane and generate bright-field intensity images on the camera, while the outermost ones provide a beam that hits the sample at a large oblique angle, generating dark-field intensity images on the camera. The farther an LED is from the optical axis, the more inclined its beam, i.e. the greater the illumination angle. In the frequency domain, a light beam with a high illumination angle shifts the illumination NA towards high frequencies.
Fig. 2. (a) Experimental setup, where the sketch of the sample tissue slide in the acquisition plane is intentionally enlarged. (b) Sketch of the Fourier synthetic spectrum that shows the NA enhancement. (c) Top: example of a bright-field image corresponding to the central LED. Bottom: zoom-in detail of the area marked by the yellow box. Bottom left: low-resolution bright-field intensity. Bottom right: corresponding high-resolution wrapped phase-contrast map.
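Mathematically, if O(r) represents the object on the sample plane (with r the spatial coordinate) and $e^{j2\pi f r}$ the complex field of a single LED (with f the spatial frequency), the complex field transmitted through the object is
$$O'(r) = O(r)\, e^{j 2\pi f r}, \qquad f = \frac{\sin\theta}{\lambda}, \tag{1}$$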
where θ refers to the illumination angle and λ to the LED wavelength. In Eq. 1, the relation between the illumination angle and the frequency highlights the influence of the angle variation on the frequency values. Hence, by properly combining the LED NAs in the Fourier spectrum, a larger NA can be synthesized that covers a wide frequency range (Fig. 2(b)), i.e.
$$\mathrm{SNA} = \mathrm{NA}_{MO} + \max_i \mathrm{NA}_i, \tag{2}$$
where SNA is the synthetic numerical aperture, $\mathrm{NA}_{MO}$ the numerical aperture of the MO, and $\mathrm{NA}_i$ the numerical aperture of the i-th LED.
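As a rough numerical illustration of how the LED position translates into an illumination NA and a spectrum shift, the following minimal sketch uses the geometry reported in the experimental setup subsection (4 mm LED spacing, 4.67 cm source-to-sample distance, 632 nm wavelength, 0.1 NA objective). It assumes the commonly used convention that the synthetic NA is the objective NA plus the largest illumination NA. The function names and the maximum offset of 7 LEDs are hypothetical, since the layout of the lit LEDs is not specified here.

```python
import math

def led_illumination_na(ix, iy, pitch_mm=4.0, height_mm=46.7):
    """sin(theta) for the LED at grid offset (ix, iy) from the optical axis."""
    lateral = pitch_mm * math.hypot(ix, iy)
    return lateral / math.hypot(lateral, height_mm)

def spatial_frequency_shift(na_led, wavelength_um=0.632):
    """Spectrum shift f = sin(theta) / lambda, in cycles per micron."""
    return na_led / wavelength_um

def synthetic_na(na_mo, max_offset_leds, **geometry):
    # Assumption: SNA = NA_MO + largest illumination NA (common FPM convention).
    return na_mo + led_illumination_na(max_offset_leds, 0, **geometry)

# Hypothetical example: LEDs lit up to 7 positions away from the centre.
for offset in (0, 1, 4, 7):
    na = led_illumination_na(offset, 0)
    print(offset, round(na, 3), round(spatial_frequency_shift(na), 3))
print("synthetic NA:", round(synthetic_na(0.1, 7), 3))
```

Under these assumptions the synthetic NA is about 0.61, which corresponds to a half-pitch resolution λ/(2 SNA) of roughly 0.51 µm. A different LED pattern would change this figure.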
The central LEDs contribute to imaging the basic structure of the object (i.e. the low spatial frequency content), while the external LEDs provide the finest details (dark-field images). The images captured for each LED have low resolution, and their intensities can be estimated as follows
$$I_{LR}(r) = \left| \mathrm{FT}^{-1}\left\{ \tilde{O}'\, H \right\} \right|^2, \tag{3}$$
where $\mathrm{FT}^{-1}$ is the inverse Fourier Transform (FT), $\tilde{O}'$ the FT of O′(r) and H the FT of the system impulse response, i.e. the transfer function. An example of a bright-field image of one of the breast cancer tissue slides, acquired by switching on the central LED, is reported in Fig. 2(c), together with a zoom-in detail of the area marked by the yellow box. In particular, we show the enlarged detail of the low-resolution bright-field intensity and the corresponding high-resolution wrapped phase-contrast map.
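The image formation model described above (tilted illumination, low-pass filtering by the transfer function, intensity detection) can be sketched numerically as follows. This is a simplified illustration, not the reconstruction code used in this work: the pupil is assumed to be an ideal circular low-pass filter, the tilt is expressed as an integer number of cycles across the field (hypothetical parameters `kx`, `ky`), and the full pixel grid is kept, so no camera down-sampling is modelled.

```python
import numpy as np

def low_res_intensity(obj, kx, ky, pupil_radius):
    """Intensity recorded for one LED under an idealised FPM forward model.

    obj          : complex 2-D object transmission O(r)
    kx, ky       : illumination tilt, in cycles across the field of view
    pupil_radius : cutoff of the circular pupil H, in frequency bins
    """
    n_rows, n_cols = obj.shape
    y, x = np.mgrid[0:n_rows, 0:n_cols]
    # Tilted plane wave multiplies the object: O'(r) = O(r) exp(j 2 pi f r)
    tilted = obj * np.exp(2j * np.pi * (kx * x / n_cols + ky * y / n_rows))
    spectrum = np.fft.fftshift(np.fft.fft2(tilted))
    fy = np.arange(n_rows) - n_rows // 2
    fx = np.arange(n_cols) - n_cols // 2
    pupil = (fx[None, :] ** 2 + fy[:, None] ** 2) <= pupil_radius ** 2
    # Low-pass filtering by the pupil, then intensity detection
    field = np.fft.ifft2(np.fft.ifftshift(spectrum * pupil))
    return np.abs(field) ** 2

# A fine grating (12 cycles) is invisible on axis but detected under a tilt.
x = np.arange(64)
grating = np.tile(1 + 0.5 * np.cos(2 * np.pi * 12 * x / 64), (64, 1)).astype(complex)
on_axis = low_res_intensity(grating, 0, 0, pupil_radius=8)
tilted = low_res_intensity(grating, 6, 0, pupil_radius=8)
print(on_axis.std(), tilted.std())
```

In this toy example the on-axis image is flat because the grating frequency lies outside the pupil, whereas the tilted illumination shifts one grating order inside the pupil and the modulation reappears. This is the mechanism that the angle diversity exploits.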
The relationship between spatial domain (left side Eq. 3) and frequency domain (right side Eq. 3) is pivotal in the phase retrieval algorithm that is based on an iterative updating of the estimated complex amplitude between both domains until the convergence of the metric used is reached. After several iterations, the high-resolution complex amplitude is obtained, whose phase distribution is given by
where ILR,0 is the initial guess of the iterative algorithm and Ô is the high-resolution complex field.
FPM experimental setup
The experimental apparatus for FPM is sketched in Fig. 2. Here, we use a ×4 plan achromatic MO (Plan N, 0.1 NA, Olympus) and a 32×32 RGB LED array (LEDs 4 mm apart), set at the red wavelength (632 nm) with a bandwidth of ∼20 nm. A 400 mm tube lens focuses the transmitted light beam onto a charge-coupled device (CCD) camera (Photometrics Evolve 512, 12-bit quantization) with a 4.54 μm pixel pitch. The sample plane is 4.67 cm away from the illumination source.
An Arduino board controlled by MATLAB® code drives the sequential illumination of 177 LEDs. The acquired low-resolution images have a ×4.29 magnification and a size of 1460×1940 pixels. To promote the convergence of the phase retrieval algorithm and to satisfy the plane-wave assumption, the images are cropped into patches of 100×100 pixels, obtaining 266 patches. The final high-resolution image reaches a size of 7000×9500 pixels (each high-resolution patch being 500×500 pixels in size). The spatial resolution of our system is demonstrated to reach 0.5 µm over a ∼3 mm2 FoV area.
The entire FPM process for one image takes ∼32 min on an Intel i7-4790 CPU running at 3.60 GHz with 16 GB of RAM. For each patch, 7.2 s are needed to complete the 60 iterations of the FPM phase-retrieval process.
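The figures quoted in this subsection can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check under two assumptions: the 14×19 patch grid is obtained by integer division of the raw frame by the patch size, and the up-sampling factor of 5 is inferred from the 100-pixel and 500-pixel patch sizes. The function and key names are hypothetical.

```python
def fpm_bookkeeping(raw_shape=(1460, 1940), patch=100, upsampling=5,
                    pixel_pitch_um=4.54, magnification=4.29,
                    seconds_per_patch=7.2):
    """Derive patch grid, output size, FoV area and runtime from the setup values."""
    rows, cols = raw_shape[0] // patch, raw_shape[1] // patch
    n_patches = rows * cols
    high_res_shape = (rows * patch * upsampling, cols * patch * upsampling)
    # Size of one low-resolution camera pixel projected onto the sample plane
    sample_pixel_um = pixel_pitch_um / magnification
    fov_mm2 = (rows * patch * sample_pixel_um) * (cols * patch * sample_pixel_um) * 1e-6
    minutes = n_patches * seconds_per_patch / 60.0
    return {"grid": (rows, cols), "n_patches": n_patches,
            "high_res_shape": high_res_shape, "fov_mm2": fov_mm2,
            "minutes": minutes}

summary = fpm_bookkeeping()
print(summary)
```

With the reported values, this gives a 14×19 grid (266 patches), a 7000×9500 output, an imaged area close to 3 mm2 and a total time close to 32 min, in agreement with the text.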
Fractal analysis of wrapped FPM maps
Examples of wrapped FPM maps related to a fibroadenoma tissue slide and a breast cancer tissue slide are displayed in Figs. 3(a,c), respectively. Due to the presence of paraffin in the imaged tissue biopsies, a dense distribution of phase jumps characterizes the wrapped FPM maps. Nevertheless, differences between the underlying tissue structures can be inferred from the wrapped FPM maps, as highlighted in the red insets in Figs. 3(a,c). In order to characterize them quantitatively, an ad hoc feature set based on fractal geometry theory was measured. To this end, since the fractal parameters are computed on a binary map consisting of full and empty areas, a zero-threshold was applied to the wrapped FPM maps. The corresponding binary FPM maps are shown in Figs. 3(b,d) for the fibroadenoma and the cancer tissue slides, respectively. Then, the binary FPM maps, made of 7000×9500 square pixels, were divided into a grid of 14×19 non-overlapping binary patches of 500×500 square pixels each, as shown by the yellow grid in Figs. 3(b,d). Moreover, as the numerical methods usually employed to compute fractal parameters work with powers of 2, each binary patch was zero-padded up to 512×512 square pixels, thus obtaining the hole patches. In addition, for each hole patch, the corresponding support patch was created as a 512×512 square-pixel array of 1 values. Finally, for each of the 14×19 patches, the corresponding hole patch and support patch were used to compute the 13 fractal parameters defined in Ref. 49, namely the fractal dimension, lacunarity index, fill ratio, regularity index, vertex density, vertex lacunarity index, vertex regularity index, fractal dimension contrast, lacunarity contrast, vertex lacunarity contrast, fractal dimension RMSE, lacunarity RMSE, and vertex lacunarity RMSE. Furthermore, for each patch, 2 other features were added that can be related to the fractal behaviour of the wrapped FPM maps, i.e. the standard deviation and the entropy, which were computed directly from the phase values since they describe the frequency and the intensity of the phase jumps.
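The pre-processing chain described above (zero-thresholding, tiling, zero-padding to a power of 2) and a few representative descriptors can be sketched as follows. This is an illustrative approximation only: the exact definitions of the 13 parameters are those of Ref. 49 and are not reproduced here. The sketch uses the textbook box-counting estimate of the fractal dimension and a fixed-grid lacunarity (second moment of the box mass divided by the squared mean), it assumes the wrapped phase lies in (-π, π], and all function names are hypothetical.

```python
import numpy as np

def binarize(wrapped_phase):
    # Zero-threshold; which side is treated as "full" is an assumption.
    return wrapped_phase > 0

def split_into_patches(image, size=500):
    rows, cols = image.shape[0] // size, image.shape[1] // size
    return [image[r * size:(r + 1) * size, c * size:(c + 1) * size]
            for r in range(rows) for c in range(cols)]

def zero_pad(patch, size=512):
    padded = np.zeros((size, size), dtype=patch.dtype)
    padded[:patch.shape[0], :patch.shape[1]] = patch
    return padded

def box_masses(binary, box):
    """Number of full pixels in each box of a regular grid (square input)."""
    n = binary.shape[0]
    return binary.reshape(n // box, box, n // box, box).sum(axis=(1, 3))

def box_counting_dimension(binary):
    """Slope of log N(s) versus log(1/s) over dyadic box sizes s."""
    n = binary.shape[0]
    boxes = [2 ** k for k in range(n.bit_length() - 1)]   # 1, 2, ..., n/2
    occupied = np.array([(box_masses(binary, b) > 0).sum() for b in boxes])
    if occupied[0] == 0:          # empty patch: no structure to measure
        return 0.0
    slope, _ = np.polyfit(np.log(1.0 / np.array(boxes)), np.log(occupied), 1)
    return float(slope)

def lacunarity(binary, box=16):
    """Fixed-grid lacunarity: variance / mean^2 + 1 of the box masses."""
    mass = box_masses(binary, box).astype(float).ravel()
    mean = mass.mean()
    return float("nan") if mean == 0 else float(mass.var() / mean ** 2 + 1.0)

def phase_statistics(phase, bins=256):
    """Standard deviation and Shannon entropy (bits) of the phase values."""
    hist, _ = np.histogram(phase, bins=bins, range=(-np.pi, np.pi))
    p = hist[hist > 0] / hist.sum()
    return float(phase.std()), float(-(p * np.log2(p)).sum())

# Synthetic stand-in for a wrapped phase map (random values, not real data).
rng = np.random.default_rng(0)
phase = rng.uniform(-np.pi, np.pi, size=(1000, 1500))
patches = split_into_patches(binarize(phase))
hole = zero_pad(patches[0])
print(len(patches), hole.shape, box_counting_dimension(hole), lacunarity(hole))
```

The dyadic box sizes are the reason for padding to 512×512: every box size from 1 to 256 pixels then tiles the patch exactly.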
Fig. 3. (a,c) Wrapped FPM maps. (b,d) Binary FPM maps obtained by zero-thresholding the wrapped FPM maps in (a,c), respectively, with the 14×19 grid of patches (500×500 square pixels each) that divides the overall 7000×9500 FoV overlaid in red.
Results
ML classification of breast tissue slides
The FPM experimental setup described in Material and Methods has been employed to image tissue biopsies taken from 6 patients, i.e. 3 patients with fibroadenoma and 3 patients with breast cancer. For each patient (i.e., for each tissue biopsy), the wrapped FPM maps of 13 different FoVs have been recorded, like those displayed in Figs. 3(a,c). According to the fractal analysis described in Material and Methods, each FoV has been divided into 266 non-overlapping patches, following the grid sketched in Figs. 3(b,d). Finally, for each patch, the 15 fractal features (the 13 fractal parameters plus the standard deviation and the entropy of the phase values) have been measured. In summary, the collected dataset is made of 15 fractal features for each of the 20748 FPM patches, which are taken from 78 wrapped FPM maps belonging to the tissue biopsies of 6 patients (3 fibroadenoma patients and 3 breast cancer patients).
In order to inspect the collected dataset in terms of fractal features, principal component analysis (PCA) has been implemented to reduce its dimensionality.52 The first three principal components are shown in Fig. 4(a), in which it can be seen that the 10374 fibroadenoma patches and the 10374 breast cancer patches form two well-defined clusters that are well separated from each other. Furthermore, in order to perform a more detailed data inspection within the two clusters, the t-distributed stochastic neighbor embedding (t-SNE) algorithm has been exploited.53 The results of the t-SNE analysis are reported in Fig. 4(b), in which it can be seen that the 10374 patches belonging to the 3 fibroadenoma tissue biopsies (i.e., patients) are grouped within one single cluster (blue points). Instead, the 10374 patches belonging to the 3 breast cancer tissue biopsies (i.e., patients) form two separate clusters (red points). In particular, the left-side cluster is mainly made of patches belonging to the first and third breast cancer patients, while the right-side cluster is mainly made of patches belonging to the second and third breast cancer patients. This means that the paraffined breast cancer patches, when characterized by a fractal feature set, exhibit a greater intra-class variability than the fibroadenoma patches. This is reasonable considering that the breast cancer phenotype is not homogeneously expressed over the entire FoV; rather, it is localized in certain image areas. Besides, there are patches belonging to breast cancer patients that do not exhibit that phenotype and cluster in a separate region of the t-SNE diagram. Nevertheless, like the PCA, the t-SNE analysis also confirms the good inter-class separation.
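For readers who wish to reproduce this kind of inspection on their own feature tables, the PCA step can be sketched with plain NumPy as below. This is a generic illustration rather than the implementation used here: the features are standardised before the decomposition (an assumption), the function name is hypothetical, and the data are synthetic Gaussian clusters, not the measured fractal features.

```python
import numpy as np

def pca_project(features, n_components=3):
    """Standardise the columns, then project onto the leading principal components."""
    centred = features - features.mean(axis=0)
    scale = centred.std(axis=0)
    scale[scale == 0] = 1.0                      # guard against constant features
    z = centred / scale
    _, singular_values, vt = np.linalg.svd(z, full_matrices=False)
    explained = singular_values ** 2 / np.sum(singular_values ** 2)
    return z @ vt[:n_components].T, explained[:n_components]

# Synthetic example: two classes, 15 features each (not real data).
rng = np.random.default_rng(1)
class_a = rng.normal(0.0, 1.0, size=(200, 15))
class_b = rng.normal(3.0, 1.0, size=(200, 15))
scores, explained = pca_project(np.vstack([class_a, class_b]))
print(scores.shape, explained)
```

Plotting the three columns of `scores` coloured by class gives a scatter plot of the same kind as Fig. 4(a). A t-SNE embedding would require a dedicated library and is not sketched here.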
Fig. 4. (a) PCA scatter plot. (b) t-SNE scatter plot.
The data inspection performed in Fig. 4 suggests that the fractal feature set could be suitable for solving the classification problem of distinguishing a breast cancer tissue biopsy from a fibroadenoma one. Moreover, the good separation between the two classes also suggests that a small training set could be enough to achieve good generalization at the inference step. For this reason, the training set has been created in the worst possible condition, i.e. by using all the FPM patches of just two tissue biopsies, namely one fibroadenoma patient and one breast cancer patient. Moreover, to avoid any bias that could be induced by a favorable splitting of the overall dataset into a specific training set and test set, all nine possible splits among the 6 patients have been considered. For each split, the training set is made of the 6916 FPM patches belonging to two patients (one fibroadenoma and one breast cancer patient) and the test set is made of the 13832 FPM patches belonging to the remaining 4 patients (two fibroadenoma and two breast cancer patients). For each of these nine classification problems, several ML models have been trained by using a 10-fold cross-validation.
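The nine patient-level splits and the corresponding set sizes follow directly from the study design (3 patients per class, 13 FoVs per patient, 266 patches per FoV). The short sketch below enumerates them. The patient identifiers and function names are hypothetical placeholders.

```python
from itertools import product

PATCHES_PER_FOV = 266
FOVS_PER_PATIENT = 13

def patient_level_splits(fibroadenoma, cancer):
    """All splits that train on one patient per class and test on the rest."""
    everyone = list(fibroadenoma) + list(cancer)
    splits = []
    for f, c in product(fibroadenoma, cancer):
        train = [f, c]
        test = [p for p in everyone if p not in train]
        splits.append((train, test))
    return splits

def n_patches(patients):
    return len(patients) * FOVS_PER_PATIENT * PATCHES_PER_FOV

# Hypothetical patient labels
splits = patient_level_splits(["F1", "F2", "F3"], ["C1", "C2", "C3"])
for train, test in splits:
    print(train, n_patches(train), test, n_patches(test))
```

Splitting by patient, rather than by patch, keeps all patches of a given biopsy on the same side of the split, so the test accuracy reflects generalization to unseen patients.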
Then, for each classification problem, the ML model providing the best classification accuracy over the corresponding test set has been selected, which turned out to be either the support vector machine (SVM) or the k-nearest neighbors (KNN).54 The average and standard deviation values of the resulting nine confusion matrices related to the 13832 FPM patches are summarized in Fig. 5(a), in which it can be seen that a 78.4 ± 5.0 % accuracy is reached, which is a satisfactory result considering that the training set is made of just two patients.
Fig. 5. (a) Average and standard deviation values of the nine confusion matrices obtained over the test sets made of 13832 patches belonging to 52 FoVs of 4 tissue biopsies. (b) Average and standard deviation values of the nine confusion matrices obtained over the test sets made of 52 FoVs belonging to the tissue biopsies of 4 patients, obtained after max-voting of the patch classes. (c) Average and standard deviation values of the nine confusion matrices obtained over the test sets made of 4 tissue biopsies belonging to 4 patients, obtained after max-voting of the FoV classes.
However, as discussed before, each imaged FoV is made of 266 non-overlapping FPM patches. Thus, for each FoV, 266 class predictions are returned by the ML classifier. This means that, in order to predict the class of a specific FoV, a max-voting strategy can be applied,55 i.e. the class the imaged FoV belongs to is given by the mode of the corresponding 266 patch classes. In this way, by exploiting the intrinsic correlation between patches that belong to the same imaged FoV, the accuracy in classifying the overall FoV instead of the single patches rises to a remarkable 92.3 ± 6.5 % within the test set made of 52 elements (see the average ± standard deviation confusion matrix in Fig. 5(b)).
In turn, 13 FoVs are imaged for each patient. Therefore, the max-voting strategy can be applied again to combine the predicted classes of all the 13 FoVs belonging to the same slide in order to predict whether the corresponding tissue biopsy is a fibroadenoma or a breast cancer. As highlighted in Fig. 5(c), a remarkable 100.0 ± 0.0 % accuracy is reached in this classification task within the test set made of 4 patients.
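The two-level max-voting scheme (patches to FoV, then FoVs to patient) can be sketched in a few lines. The example is a toy illustration with made-up patch predictions. The tie-breaking rule (the label encountered first wins) is an arbitrary assumption that matters only for the even number of patches per FoV, and the function names are hypothetical.

```python
from collections import Counter

def majority(labels):
    """Most frequent label; ties go to the label encountered first."""
    return Counter(labels).most_common(1)[0][0]

def classify_patient(patch_labels_per_fov):
    """Vote patch labels into FoV labels, then FoV labels into a patient label."""
    fov_labels = [majority(patches) for patches in patch_labels_per_fov]
    return majority(fov_labels), fov_labels

def toy_fov(n_cancer, n_patches=266):
    # Made-up predictions for one FoV: n_cancer patches labelled as cancer
    return ["cancer"] * n_cancer + ["fibroadenoma"] * (n_patches - n_cancer)

# Toy breast cancer patient: 10 FoVs mostly predicted as cancer, 3 mostly not.
fovs = [toy_fov(200)] * 10 + [toy_fov(100)] * 3
patient_label, fov_labels = classify_patient(fovs)
print(patient_label, fov_labels.count("cancer"))
```

In this toy case about 67% of the patches and 10 of the 13 FoVs are labelled correctly, yet the patient-level vote is correct. This illustrates how the error can shrink at each aggregation level when the patch errors are not concentrated in the same FoVs.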
Fractal FPM heat maps as a guide for pathologists
Fractal geometry allows natural objects to be described from an alternative point of view with respect to conventional Euclidean geometry. The power of fractal geometry lies in its ability to describe patterns that are intrinsic to a certain object, thus accessing its inner complex nature. In this way, a more distinctive characterization of the analyzed phenomenon can be extracted. This is possible since fractal geometry involves a multi-scale analysis of the imaged object, meaning that it tries to describe and quantify the replication of specific patterns at different scales within the same object. For this reason, the multi-scale analysis provided by fractal geometry matches extremely well with the multi-scale imaging provided by FPM. The main advantage of FPM with respect to other imaging techniques is the large space-bandwidth product, which optimizes both FoV size and lateral resolution. Hence, a large FoV can be imaged while preserving high-frequency information. In this way, a biological sample can be analyzed both in its global context and in its finest details. For this purpose, fractal geometry analysis can be considered the optimal solution. The fractal feature set computed from the wrapped FPM maps allows discriminating very well between fibroadenoma and breast cancer patches, as highlighted in the scatter plots of Fig. 4 and in the confusion matrix of Fig. 5(a). Moreover, accessing the wide mm2 FoV allows exploiting a max-voting strategy to further improve the classification performance (see Figs. 5(b,c)), since each FPM map can be divided into hundreds of patches without losing the possibility of performing a fractal characterization at the single-patch level.
In addition to the remarkable classification performance, the combination of fractal geometry and FPM also makes it possible to generate a further source of information that could be meaningful as a guide for more in-depth studies by pathologists, i.e. fractal heat maps. In particular, among the several fractal features, the lacunarity index has often been exploited due to its higher correlation with biological phenomena.42 For example, lacunarity has been employed in magnetic resonance imaging for distinguishing benign and malignant breast lesions56 or for differentiating the grades of glioma.57 It has also been used in other microscopy imaging techniques as a prognostic indicator of clinical outcome in early breast cancer,58 for the diagnosis59 and the identification of the severity level of prostate cancer,60,61 or for the detection of Alzheimer’s disease.62 In essence, the lacunarity index measures the distribution of the hole sizes within a certain structure.51
In the proposed study, the lacunarity index characterizes the hole maps obtained from the wrapped FPM maps, as shown in Fig. 3. In particular, for each patch in Figs. 3(b,d), a lacunarity index has been computed, as displayed in Figs. S1(a,b), respectively. In Fig. S1, each 500×500 patch takes a homogeneous value, namely the corresponding lacunarity index. Hence, the images in Fig. S1 can be regarded as lacunarity heat maps. Since the wrapped FPM maps have a high density of phase jumps due to the presence of paraffin within the tissue slides (see Figs. 3(a,c)), it is difficult to correlate them with a specific biological structure by visual inspection. Instead, for the sole purpose of a visual analysis performed by the pathologist, the low-resolution bright-field maps (1400×1900 square pixels) can be exploited, as displayed in Figs. 6(a,c), which correspond to the wrapped FPM maps in Figs. 3(a,c), respectively. Furthermore, to support the pathologist's visual analysis, the lacunarity heat maps computed through the quantitative fractal characterization of the wrapped FPM maps can be exploited. In particular, the 7000×9500 high-resolution lacunarity heat maps displayed in Figs. S1(a,b) can be resized to 1400×1900 square pixels to fit the size of the low-resolution bright-field maps shown in Figs. 6(a,c), and then overlaid on them, as shown in Figs. 6(b,d), respectively.
(a,c) Low-resolution bright-field maps. (b,d) Lacunarity heat maps of Figs. S1(a,b), down-sampled and overlaid on the low-resolution bright-field maps in (a,c), respectively. Boxes in (b,d) highlight glandular structures.
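The construction of the guide maps described above (one homogeneous value per patch, down-sampling to the bright-field size, overlay) can be sketched as follows. This is only an illustration under stated assumptions: the patches are taken as non-overlapping tiles, the down-sampling is a plain block average, the overlay is a simple alpha blend, and the function names `patch_heat_map`, `downsample_by_block_mean` and `blend` are hypothetical. With the sizes quoted in the text, non-overlapping 500×500 tiles of a 7000×9500 map would give 14×19 = 266 patches, and resizing to 1400×1900 corresponds to a factor of 5.

```python
import numpy as np


def patch_heat_map(image, patch_size, feature_fn):
    """Tile `image` into square patches and paint each tile with one value.

    `feature_fn` is any per-patch scalar feature (e.g. a lacunarity
    estimator); non-overlapping tiling is an assumption of this sketch.
    Returns the (rows, cols) grid of values and the full-size heat map.
    """
    h, w = image.shape
    rows, cols = h // patch_size, w // patch_size
    values = np.empty((rows, cols), dtype=np.float64)
    for i in range(rows):
        for j in range(cols):
            patch = image[i * patch_size:(i + 1) * patch_size,
                          j * patch_size:(j + 1) * patch_size]
            values[i, j] = feature_fn(patch)
    # Expand every grid value into a homogeneous patch_size x patch_size block.
    heat = np.kron(values, np.ones((patch_size, patch_size)))
    return values, heat


def downsample_by_block_mean(heat, factor):
    """Reduce a heat map by an integer factor using block averaging."""
    h, w = heat.shape
    if h % factor or w % factor:
        raise ValueError("map size must be a multiple of the factor")
    return heat.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))


def blend(bright_field, heat, alpha=0.5):
    """Alpha-blend a heat map over a bright-field image of the same shape."""
    def unit(x):
        x = np.asarray(x, dtype=np.float64)
        span = x.max() - x.min()
        return (x - x.min()) / span if span > 0 else np.zeros_like(x)
    return (1.0 - alpha) * unit(bright_field) + alpha * unit(heat)
```

For example, `patch_heat_map(hole_map, 500, feature_fn)` followed by `downsample_by_block_mean(heat, 5)` would reproduce the 7000×9500 to 1400×1900 reduction mentioned in the text, where `hole_map` and `feature_fn` stand for the user's own data and feature.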
It is worth noting that the fibroadenoma tissue slide exhibits lower lacunarity indices than the breast cancer tissue slide, as shown in Figs. 6(b,d), respectively. According to the definition of lacunarity index,49 this means that the fibroadenoma tissue slide is more lacunar than the breast cancer tissue slide. This property can be related to the different inner structures forming the two kinds of tissue, which can be seen in the low-resolution bright-field maps in Figs. 6(a,c). In Figs. 6(a,b), small and rather widely spaced ductal-derived glandular structures are observed in a fibrous stroma, and the glandular density appears reduced. The heat map in Fig. 6(b) highlights these spaced glandular structures (see the black box). In Figs. 6(c,d), numerous ductal-derived glandular elements are observed, tightly packed and without evident stroma, and the glandular density appears high. Remarkably, the heat map in Fig. 6(d) highlights numerous glandular structures bundled together (see the black box).
Discussion and Conclusions
Digital pathology analysis of breast tissue slides is widespread and can provide valuable help to pathologists called to judge heterogeneous morphologies. Extraction of features from stained histology slides obtained through WSI has emerged as a pivotal technique in pathomic studies. The primary objective is to quantitatively characterize cells and tissues derived from the examined samples. Notably, some pathomic studies use fractal dimension analysis to extract features from WSIs of various cancer types. For instance, Lee et al. developed a computer-aided technique for the automated grading of prostatic carcinoma that applies the fractal dimension to analyze the pathological image texture of prostatic carcinoma WSIs. The extracted fractal dimension-based features were able to classify pathological prostate images into four classes within the Gleason grading system.63 Furthermore, Da Silva et al. adopted fractal dimension analysis as a computationally accessible approach to enhance the histopathological diagnosis of breast cancer. Their investigation revealed that fractal dimension-based features extracted from stained WSIs were remarkably effective in distinguishing breast carcinomas from normal tissue and benign breast alterations. These findings underscore the significance of fractal dimension analysis as a powerful tool for advancing our understanding and diagnostic capabilities in histopathological studies.64
It is important to note that the histopathological analysis of breast cancer, which we used as the test case for our Digital Pathology approach, can be critical due to its intrinsic complexity. Indeed, cell density and morphology alone may not be sufficient for the differential diagnosis between a malignant and a benign lesion, since special cases such as florid adenosis or sclerosing adenosis could produce outputs that lead to false positives. In addition, there are cases such as tubular breast carcinoma in which the tissue is well differentiated and only a detailed diagnostic study can yield a correct diagnosis.65
However, independently of the specific test case in our research, we established a novel analysis framework that allowed us to analyze the paraffined unstained breast slides directly, thereby avoiding the ambiguities that can be introduced by the paraffin removal and staining processes of conventional breast cancer diagnosis. We used two multi-scale methodologies for image acquisition and analysis, respectively. Fractal biomarkers, in particular the lacunarity index, describe the FPM phase-contrast maps well and allow data to be classified from the level of the single image portion up to the patient level. In summary, the main contributions of this work are as follows:
We proposed an automatic method for the classification of breast cancer and fibroadenoma based on the novel FPM technique applied to unstained tissue slides;
We introduced the fractal pattern analysis of wrapped FPM phase-contrast maps;
Histopathological image recognition and patch and patient classification are demonstrated without removing the paraffin layer;
Max-voting among different portions of the same image is demonstrated to reinforce image classification. Besides, max-voting among different images from the same patient’s slide allowed very accurate classification. In particular, 100% accuracy was obtained on the test tissue slides of 4 patients after training an ML model on the tissue slides of the other 2 patients. The robustness of this method would allow a judgment to be made even with a reduced set of FPM images for the same patient;
The most important fractal parameter, i.e. the lacunarity index, can serve to create guide maps for pathologists.
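The two-level max-voting mentioned in the contributions above (patches voting for an image, images voting for a patient) can be sketched as follows. This is a minimal illustration, not the implementation used in this work: the label strings, the function names `max_vote` and `classify_patient`, and the tie-breaking rule (first label encountered wins) are all assumptions.

```python
from collections import Counter


def max_vote(labels):
    """Return the most frequent label in `labels`.

    Ties are resolved in favor of the label encountered first, which is
    an arbitrary choice made for this sketch.
    """
    if not labels:
        raise ValueError("cannot vote on an empty list of labels")
    return Counter(labels).most_common(1)[0][0]


def classify_patient(patch_labels_per_image):
    """Two-level max-voting: patches -> image label, images -> patient label.

    `patch_labels_per_image` holds one list of per-patch predictions for
    each FPM image of the same patient.
    """
    image_labels = [max_vote(patch_labels) for patch_labels in patch_labels_per_image]
    return max_vote(image_labels), image_labels
```

Because each image contributes a single vote at the patient level, a minority of misclassified patches, or even a misclassified image, need not change the final patient-level outcome.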
We believe that our approach is highly innovative and could be used in the future to support the pathologist’s activities. In principle, the proposed strategy could be extended to other types of tissue. Therefore, future studies will focus on the more borderline breast cancer cases mentioned above, and other types of tissues and pathologies will be tested.
Data Availability
All data produced in the present study are available upon reasonable request to the authors.
Declaration of Competing Interest
The authors have no conflicts of interest to declare.
Ethics Approval and Consent to Participate
The study was approved by the Ethics Committee of IRCCS Pascale (Naples, Italy) with reference number 3/19 on 29 May 2019. All methods were performed in compliance with standard operating procedures and in accordance with the Declaration of Helsinki. Each patient provided written informed consent to participate in the study.
Acknowledgments
This work was supported by project POR CIRO (Campania Imaging for Research in Oncology) funded by Regione Campania (Italy).
This work was partially supported by the Italian Ministry of Health (“Ricerca Corrente” project).