medRxiv

Multi-omics analysis of renal clear cell carcinoma progression

Anuj Guruacharya, James R Golden, Daniel Garrett, Deven Atnoor, Sujaya Srinivasan, Ujjwal Ratan, KT Pickard
doi: https://doi.org/10.1101/2022.11.21.22282533
Affiliation (all authors): Amazon Web Services, USA
For correspondence: Anuj Guruacharya, anuj2054{at}gmail.com

Abstract

Renal clear cell carcinoma (RCC), the most common type of kidney cancer, lacks a well-defined collection of biomarkers for tracking disease progression. Although complementary diagnostic and prognostic RCC biomarkers may be beneficial for guiding therapeutic selection and informing clinical outcomes, patients currently have a poor prognosis due to limited early detection. Without a priori biomarker knowledge or histopathology information, we used machine learning (ML) techniques to investigate how mRNA, microRNA, and protein expression levels change as a patient progresses to different stages of RCC. The novel combination of big data with ML enables researchers to generate hypothesis-free models in a fraction of the time used in traditional clinical trials. Ranked genes that are most predictive of survival and disease progression can be used for target discovery and downstream analysis in precision medicine. We extracted clinical information for normal and RCC patients along with their related expression profiles in RCC tissues from three publicly available datasets: The Cancer Genome Atlas (TCGA), the Genotype-Tissue Expression (GTEx) project, and the Clinical Proteomic Tumor Analysis Consortium (CPTAC). Among others, mRNA expression levels of GNG7 and BCR were potential predictors of RCC progression. For microRNA, we found hsa-mir-199a-2 and hsa-mir-129-1 to be potential predictors of RCC progression. Understanding how gene and protein expression levels change as RCC progresses will further guide the development of prognostic biomarkers and targets for RCC therapies.

Introduction

Kidney cancer is among the ten most common cancers. In the US, rates for renal cell cancer continue to rise but mainly for early-stage tumors, with kidney renal clear cell carcinoma (RCC) accounting for 75% of all kidney cancers [1]. In the US in 2020, there were 73,750 new cases of kidney cancer and 14,830 deaths [2]. Although biomarkers for RCC are emerging [3, 4, 5, 6], patients usually have a poor prognosis due to limited early detection. Early identification of metastatic potential may be beneficial for therapeutic selection, as well as informing clinical outcomes. Using integrated diagnostic techniques made possible by lower-cost sequencing and predictive modeling, a more precise treatment approach may emerge by combining multimodal data such as mRNA, microRNA, and protein expression with clinical information [7, 8, 9, 10, 11].

Cancer databases contribute greatly to research that explains the molecular mechanisms underlying tumorigenesis. For example, The Cancer Genome Atlas (TCGA) database [12] consists of over 20,000 primary cancer and matched normal samples spanning 33 cancer types. Similarly, the Genotype-Tissue Expression (GTEx) database [13] contains tissue-specific gene expression for 54 non-diseased sites across nearly 1,000 individuals. The National Cancer Institute’s Clinical Proteomic Tumor Analysis Consortium (CPTAC) consists of over 2,000 proteomes and facilitates the discovery of cancer-specific protein biomarkers that augment DNA and RNA sequencing [14]. Bioinformatics mining (computational methods that are complementary to clinical trials) enables researchers to quickly evaluate concepts and generate hypothesis-free models. As more datasets become publicly available, bioinformatics mining has led to a burst of research activity related to computational cancer biomarker and target discovery.

Understanding how gene and protein expression change as RCC progresses can further guide the development of prognostic biomarkers and treatment planning for RCC patients. In this study, we investigated how mRNA, microRNA, and protein expression levels were associated with RCC stage and with overall survival outcomes. Ranked genes that are most predictive of survival and disease progression can be used for target discovery for drugs and biomarker discovery for clinical decision support. In contemporary practice, VEGF and VEGF receptors are the most common therapy targets in the clinical treatment of RCC [15].

In this paper, we extracted clinical information of normal and RCC patients along with their related expression profiles in RCC tissues. All TCGA patients were clinically and pathologically diagnosed with RCC, and tumors were classified by TCGA into Stages I to IV based on the Fuhrman nuclear grading system [16]. Without a priori biomarker knowledge or histopathology information, we used ML techniques to investigate how clinical and omic data could be combined to identify the pathological stage of the tumor based on normal, early (Stage I+II), and late (Stage III+IV) classes.

Methods

Data collection and preparation

Clinical and genomic data were obtained from The Cancer Genome Atlas (TCGA) Kidney Renal Clear Cell Carcinoma (KIRC) project, Genotype-Tissue Expression (GTEx), and the Clinical Proteomic Tumor Analysis Consortium (CPTAC). The processed mRNA data were used following the approach of Wang et al., which combined data sources from TCGA and GTEx [17]. TCGA was the single source for microRNA data. Protein data and associated clinical data for those patients were obtained from CPTAC. Disease progression stages were defined by the American Joint Committee on Cancer (AJCC). Overall survival, overall survival months, and AJCC-defined pathologic tumor stage were extracted from clinical data and joined with genomic and protein data.

A combination of clinical and genomic data was provided as input to ML methods that included a gradient-boosted decision tree model (XGBoost) and an ensemble model (AutoGluon), which were trained to predict disease progression (Stages I through IV) for each of these datasets. We carried out a stratified 70/30 train-test split before training. Table 1 shows the distribution of patients across tumor stages for the mRNA, microRNA, and protein datasets. A large class imbalance exists in each dataset, as noted in the “Percentage of cohort” column.
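A stratified split keeps the class proportions of an imbalanced cohort roughly equal in the training and test sets. The following is a minimal, dependency-free sketch of that idea, not the authors' code; the function name `stratified_split`, the seed, and the toy cohort sizes are assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx) so each class keeps about the same share in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)  # randomize within the class before cutting
        n_test = round(len(members) * test_fraction)
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Hypothetical imbalanced cohort: 10 normal, 60 early-stage, 30 late-stage samples.
labels = [0] * 10 + [1] * 60 + [2] * 30
train_idx, test_idx = stratified_split(labels)
```

In practice a library routine with a stratification option would usually replace this helper; the sketch only makes the per-class 70/30 cut explicit.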

Table 1: For the different molecule types, the distribution of patients obtained from TCGA and GTEx for mRNA; TCGA for microRNA; and CPTAC for proteins.

Table 2: Comparison of evaluation metrics of the different models and methods used to predict stages.

Table 3: Top 20 genes ordered by importance in mRNA multi-class classification: AutoGluon.

Table 4: Top 20 microRNA ordered by importance in multi-class classification selected: AutoGluon.

Table 5: Top 20 genes ordered by importance in protein multi-class classification (AutoGluon) without survival analysis (from full set of available proteins).

To increase potential clinical insight, patients were stratified into three distinct groups per dataset. These classes are defined as follows:

  • Class 0: Normal Samples (includes matched normal samples)

  • Class 1: Stage I and Stage II

  • Class 2: Stage III and Stage IV

The stages of the samples were determined at the time of diagnosis by the TCGA consortium and are included in the public TCGA dataset. For this research, tumor stages were grouped together in classes based on biological similarities. For example, Stage I and II kidney tumors are concentrated within the kidney itself and are combined into Class 1. Stage III and IV kidney tumors have progressed and spread outside of the kidney and are combined into Class 2 [18]. This grouping allows for the identification of relevant potential biomarkers for tumor progression and the determination of patients with the highest risk of tumor progression. The distributions of patients over progression stages are shown in Table 1 by molecule type.
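The three-class grouping described above can be expressed as a small lookup. This is an illustrative sketch only: the stage strings ("Stage I", and so on), the `is_tumor` flag, and the function name `progression_class` are assumptions, since the exact field names and encodings in the clinical tables are not given here.

```python
# Assumed AJCC stage labels; the real clinical tables may encode stages differently.
STAGE_TO_CLASS = {
    "Stage I": 1, "Stage II": 1,    # tumor confined to the kidney
    "Stage III": 2, "Stage IV": 2,  # tumor has spread beyond the kidney
}

def progression_class(is_tumor, ajcc_stage=None):
    """Map a sample to class 0 (normal), 1 (Stage I+II), or 2 (Stage III+IV).

    Returns None for tumor samples with a missing or unrecognized stage,
    so that such samples can be dropped during preprocessing.
    """
    if not is_tumor:
        return 0
    return STAGE_TO_CLASS.get(ajcc_stage)

samples = [(False, None), (True, "Stage I"), (True, "Stage IV"), (True, None)]
classes = [progression_class(t, s) for t, s in samples]
```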

Preprocessing

Patients with missing ground-truth entries, and genomic features with a high fraction of missing data, were removed from the dataset. Gene expression levels were normalized to zero mean and unit standard deviation.
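The two preprocessing steps can be sketched as follows. This is a hedged illustration rather than the authors' pipeline: the 20% missingness threshold (`max_missing`) is a placeholder because the text does not state the cutoff, the population standard deviation is assumed, and the helper names are hypothetical. The sketch also assumes no missing values remain in the retained columns.

```python
from statistics import fmean, pstdev

def drop_sparse_features(rows, max_missing=0.2):
    """Keep only columns whose fraction of missing (None) entries is at most max_missing."""
    n = len(rows)
    keep = [j for j in range(len(rows[0]))
            if sum(row[j] is None for row in rows) / n <= max_missing]
    return [[row[j] for j in keep] for row in rows], keep

def zscore_columns(rows):
    """Standardize each column to zero mean and unit (population) standard deviation."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    sds = [pstdev(col) for col in columns]
    return [
        [(value - m) / sd if sd > 0 else 0.0 for value, m, sd in zip(row, means, sds)]
        for row in rows
    ]

# Hypothetical expression matrix: 4 patients x 3 genes; the second gene is mostly missing.
raw = [[1.0, None, 10.0], [2.0, None, 20.0], [3.0, 5.0, 30.0], [4.0, None, 40.0]]
filtered, kept = drop_sparse_features(raw)
scaled = zscore_columns(filtered)
```

To avoid leaking test-set information, the means and standard deviations would normally be estimated on the training split and then applied to the test split.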

Survival analysis

The correlation between patient survival and tumor stage is shown in Table 2, which is supported by studies on genetics and survival [18]. Cox’s Proportional Hazard Test (Cox PH) was carried out separately on the mRNA and microRNA training sets, based on the time to death measured in days after diagnosis, to determine which genes are significantly related to survival (p < 0.05) using the open-source Python package lifelines [19]. Downstream models for disease progression were trained on a subset of each dataset containing only those genes significant for survival. This method allowed us to perform informed feature selection, reducing the mRNA feature space from 20,235 genes to 5,569 significant genes, and the microRNA feature space from 532 microRNAs to 47 significant microRNAs.
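For readers less familiar with the test, the per-gene screen can be summarized with the standard Cox model. This is a sketch that assumes each gene was tested in its own univariate model, which the text does not state explicitly. The notation is introduced here for illustration: x_{ig} is the standardized expression of gene g in patient i, t_i is the number of days from diagnosis to death or censoring, and \delta_i is 1 if death was observed and 0 otherwise.

```latex
h(t \mid x_{ig}) = h_0(t)\,\exp(\beta_g x_{ig}), \qquad
L(\beta_g) = \prod_{i:\,\delta_i = 1}
  \frac{\exp(\beta_g x_{ig})}{\sum_{j:\,t_j \ge t_i} \exp(\beta_g x_{jg})}
```

Under this formulation, gene g is retained for downstream modeling when the test of H_0: \beta_g = 0 gives p < 0.05.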

XGBoost

The open-source Python package for XGBoost version 1.4.2 [20] was coupled with an open-source Bayesian optimization package for hyperparameter tuning [22] over 50 rounds per model. Optimal hyperparameters were selected by the Bayesian optimization procedure using 10-fold cross-validation on the training set with 1,000 boosting rounds. The final model with optimal parameter settings was trained on the full training set. Model performance was evaluated on the test set with accuracy and ROC-AUC-OVO (area under the receiver operating characteristic curve, using a one-versus-one evaluation for imbalanced multi-class data). We then used the XGBoost feature importance tool to create importance matrices.
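To make the one-versus-one metric concrete, the sketch below computes a macro-averaged pairwise AUC in the style of Hand and Till from per-class probabilities. It is a dependency-free illustration, not the evaluation code used in the study; the function names and the toy probabilities are assumptions, and class labels are assumed to be integers that index the probability columns.

```python
from itertools import combinations

def binary_auc(pos_scores, neg_scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(pos_scores) * len(neg_scores))

def roc_auc_ovo(y_true, proba):
    """Macro-averaged one-vs-one AUC over all unordered class pairs."""
    classes = sorted(set(y_true))
    pair_aucs = []
    for a, b in combinations(classes, 2):
        rows_a = [p for y, p in zip(y_true, proba) if y == a]
        rows_b = [p for y, p in zip(y_true, proba) if y == b]
        # Score the pair twice, once with each class treated as the positive class.
        auc_a = binary_auc([p[a] for p in rows_a], [p[a] for p in rows_b])
        auc_b = binary_auc([p[b] for p in rows_b], [p[b] for p in rows_a])
        pair_aucs.append((auc_a + auc_b) / 2)
    return sum(pair_aucs) / len(pair_aucs)

# Hypothetical predictions for classes 0 (normal), 1 (Stage I+II), 2 (Stage III+IV).
y_true = [0, 0, 1, 1, 2, 2]
proba = [
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.5, 0.4],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],  # a late-stage sample that looks like an early-stage one
]
score = roc_auc_ovo(y_true, proba)  # 0.9375 for this toy input
```

Because every class pair contributes equally to the average, the score is not dominated by the largest class, which is why a pairwise metric suits the imbalanced cohorts described above.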

AutoGluon

AutoGluon version 0.3.1 [21], an open-source AutoML package, was also used to train a large ensemble model (linear, k-nearest neighbor, random forest, XGBoost, and tabular neural network models, among others) to compare with the custom XGBoost model, and also to compare the internal AutoGluon models that comprise the ensemble. The AutoGluon ensemble was trained for 30 minutes on each dataset, and, like the XGBoost models, the AutoGluon ensemble predictor was trained only on a subset of features deduced from Cox’s Proportional Hazard Test.

Model Tuning

For hyperparameter tuning, we performed Bayesian optimization with 10-fold cross validation using the average ROC-AUC-OVO score as the evaluation metric. Optimization was accomplished using an open-source Python package, BayesianOptimization [22].

Compute Environment

Models were trained using an ml.r5.4xlarge AWS SageMaker notebook instance. The code, models, and results are available in the GitHub repository listed in the Data Usage section. Data cleaning, Cox’s Proportional Hazard Test, XGBoost, and AutoGluon varied widely in run time and computational resources due to the complexity of the data used (8,217 mRNA features as opposed to 11,710 protein features).

Results

As described previously, mRNA and microRNA data were obtained from TCGA’s KIRC/RCC dataset, mRNA data for non-RCC patients from GTEx, and protein data (and cohort follow-up) from CPTAC. From the survival analysis, we found 8,218 mRNA, 38 microRNA, and 10 of the top 20 proteins that were significant (p < 0.05). These findings were used as classifier inputs.

Using mRNA, microRNA, and protein input data, we trained a classifier on a 70/30 stratified train-test split to predict three classes:

  1. Class 0: Normal Patients

  2. Class 1: Stage I and Stage II

  3. Class 2: Stage III and Stage IV

The evaluation metrics on the held-out test set (30% of the data) are reported in Table 2. AutoGluon outperformed XGBoost in every scenario, which we attribute to AutoGluon’s ensemble of models.

The mRNA, microRNA, and proteins found to be highly predictive of RCC progression are shown in Tables 3 through 5.

Discussion and conclusion

mRNA prognostic biomarkers

For mRNA expression potentially predictive of RCC progression, we discuss the top 10 genes in descending order of importance (GNG7, BCR, B3GALT1, SPTSSB, GLS2, RP11-537L4, TMEM159, SHOX2, PTPRR, AC003006.7). GNG7, a modulator in various transmembrane signaling systems, had the highest ranking. GNG7 is part of the guanine nucleotide-binding protein (G protein) family, and its downregulation has been associated with pancreatic and esophageal cancer [23, 24], as well as squamous cell carcinoma of the head and neck [25]. A further study by Xu et al. found that loss of GNG7 contributes to the progression of clear cell renal cell carcinoma, which is in agreement with our findings [26]. BCR is a protein with two opposing regulatory activities toward small GTP-binding proteins, and has been associated with leukemia [27]. Studies by Kim et al. have shown a relationship between BCR gene expression and RCC [28, 29]. Gene expression of BCR was also upregulated in KIRP, a renal cancer closely related to RCC [30]. B3GALT1 is involved in the biosynthesis of the carbohydrate moieties of glycolipids and glycoproteins, and has been found to be involved in cervical adenosquamous carcinoma [31]. SPTSSB (serine palmitoyltransferase small subunit B) downregulation has been found to be detrimental to RCC patients [32]. GLS2 promotes mitochondrial respiration and increases ATP generation in cells. In a study by Shi et al., GLS2 was found to be upregulated in ccRCC cells; cells with GLS2 shRNA displayed lower survival, lower glutathione levels, and a high lipid peroxide level [33]. RP11-527L4 is a long intergenic non-protein coding RNA more commonly known as LINC01976. The relationship of this lncRNA to renal cancer has not been studied, and searches did not reveal studies regarding its function. TMEM159 plays an important role in the formation of lipid droplets, but we were unable to find literature associating TMEM159 expression with RCC progression.
SHOX2 is an important gene implicated in craniofacial, brain, heart, and limb development [34], and a methylation analysis of SHOX2 has been performed for early-stage lung cancer [35]. Further, Jung et al. found that SHOX2 gene body methylation was positively correlated with mRNA expression in RCC tissues [36]. PTPRR, a member of the protein tyrosine phosphatase (PTP) family, has been related to multiple types of cancer, but searches did not find an association with RCC. PTPs are known to be signaling molecules that regulate a variety of cellular processes including cell growth, differentiation, mitotic cycle, and oncogenic transformation. Lastly, AC003006.7 is uncharacterized, with one transcript encoding a KRAB-containing protein. We could not find literature regarding it, but include it for completeness.

microRNA prognostic markers

Comparing the top 10 microRNA sequences with search results from the miRBase database [37] showed that several candidates have been implicated in RCC. For example, hsa-mir-301a has been cited as a possible biomarker of metastatic RCC [38]. hsa-mir-129-1 has been found to be a biomarker for hepatocellular carcinoma [39, 40], but no known relationship with RCC has been reported in the literature. Hsa-mir-23b has been associated with various cancer types and is a promising therapeutic target [41]. Its downregulation has also been found to be a predictor of hepatocellular carcinoma progression. For miR-27b, Ishihara et al. found that it functions as a tumor suppressor of RCC [42]. Hsa-mir-200b downregulation has been found to suppress metastasis of RCC [43], and hsa-let-7i has been implicated in RCC due to its relationship with TRIM genes [44]. Hsa-mir-24-1 plays a role in several different types of cancer [45] and may be connected to lupus nephritis [46]. Although hsa-mir-142 and hsa-mir-3676 have not been implicated in renal cancer in the literature, our final candidate, hsa-mir-133b, has been associated with renal cancer through the JAK2/STAT3 pathway [47]. It has also been identified as a biomarker in other studies [48].

Protein prognostic markers

Considering the top 10 proteins identified from AutoGluon’s importance matrix, we first searched for them in the Human Protein Atlas [50]. The Human Protein Atlas performed a detailed analysis of each gene available in open-source TCGA datasets across several different cancer types. From this search, we found that all of the top 10 were considered cancer-related genes. Of these genes, 4 were considered prognostic markers for renal cancer (MFSD4A, CHL1, HAPLN1, UTP11), although 6 genes were not previously described as prognostic (TUBB2B, TRIM37, HLA-DRB1, CRTC3, HLA-A, MT1A). Of the 6 non-prognostic genes for RCC, 3 were considered prognostic for other cancers (TUBB2B, TRIM37, HLA-A), 1 was found to be cancer enriched in liver cancer (MT1A), and the final 2 were not found to be associated with high cancer specificity (HLA-DRB1, CRTC3) [49].

MFSD4A is a major facilitator superfamily domain (MFSD) protein involved in glucose transmembrane transport, and UTP11 is a protein that processes pre-18S ribosomal RNA [51]. MFSD4A and UTP11, while noted as prognostic markers for renal cancer survival, did not have additional supporting literature beyond citations found in the Human Protein Atlas.

Although both of these genes were among the highest-ranked by the AutoGluon model’s importance matrix—the first and tenth, respectively—more research is required to explore the connection between these genes and RCC patient survival rates. CHL1 is an ATP-dependent DNA helicase that promotes genomic stability through DNA replication, DNA repair, and ribosomal RNA synthesis [51]. CHL1 was found to be a prognostic marker for renal cancer by multiple sources [50, 52], suggesting that under-expression can lead to lower survival rates in renal cancer patients. HAPLN1 is a protein that binds hyaluronic acid with proteoglycan monomers [51]. Other studies have found that high expression of HAPLN1 leads to lower survival rates in RCC patients [53].

About CPTAC Survival Data for Clear Cell Renal Carcinoma Patients and Controls

The results from Cox’s Proportional Hazard test on the CPTAC dataset for feature selection found only 3 significant proteins out of 11,170 (a 99.97% reduction). Due to the small sample size of deceased patients (12/203) and low correlation between tumor stage and vital status in the CPTAC cohort, the performance of XGBoost and AutoGluon models was evaluated on the full feature set of proteins from CPTAC data. XGBoost and AutoGluon performance improved when using all available proteins compared to the small subset found from Cox’s Proportional Hazard Test.

Conclusion

Although our findings may provide new perspectives in understanding the pathogenesis of RCC, some limitations exist in our study. The dataset size was relatively small, and results could be improved with more survival data for CPTAC patients. We used the AutoGluon permutation shuffling approach to assess feature importance. One limitation of permutation shuffling is that it can produce unrealistic inputs as a result of the shuffling procedure [54, 55]. This study is an exploratory analysis using computational techniques and further bench experimental studies are required for validation before clinical use.
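The permutation-shuffling idea mentioned above can be illustrated with a small, self-contained sketch: shuffle one feature column at a time and record how much a performance metric drops. This is not AutoGluon's implementation; the stand-in classifier `toy_model`, the accuracy metric, the number of repeats, and the toy data are all assumptions chosen to keep the example short.

```python
import random

def accuracy(model, X, y):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=5, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the labels
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical stand-in classifier that only looks at feature 0; feature 1 is ignored.
def toy_model(row):
    return 1 if row[0] > 0 else 0

X = [[-2.0, 0.3], [-1.0, -0.7], [-0.5, 1.2], [0.5, -0.1], [1.0, 0.9], [2.0, -1.5]]
y = [0, 0, 0, 1, 1, 1]
importances = permutation_importance(toy_model, X, y)
```

Shuffling a column independently of the others is also what can create the unrealistic inputs noted above: a shuffled row may pair expression values that never co-occur in real samples, which matters when features are strongly correlated.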

In conclusion, the use of ML techniques enabled the evaluation of different models to predict RCC progression and identify potential prognostic markers. Predicting cancer stages from RNA expression data may be beneficial for therapeutic selection, as well as informing clinical outcomes. This work may help clinicians inform patient prognosis, and can be extended with other multimodal data such as imaging and DNA methylation datasets. Employing graph-based neural network models could further elucidate the interaction of genes identified in this paper.

Data Availability

All data produced in the present work are contained in the manuscript.

https://github.com/aws-samples/biomarker-discovery

Data Usage

TCGA and GTEx data are from https://doi.org/10.6084/m9.figshare.5330575 [17]

CPTAC data used in this publication were generated by the Clinical Proteomic Tumor Analysis Consortium (NCI/NIH) [14]. Data were accessed through the Python module cptac, PMID: 33560848 (https://github.com/PayneLab/cptac). Data from clear cell carcinoma (kidney) were originally published in PMID: 31675502.

The code for this study is in the GitHub repository: https://github.com/aws-samples/biomarker-discovery.

References

  1. [1].
    Cairns P. Renal cell carcinoma. Cancer Biomark. 2010;9(1-6):461–73. doi: 10.3233/CBM-2011-0176. PMID: 22112490; PMCID: PMC3308682.
  2. [2].
    Padala SA, Barsouk A, Thandra KC, Saginala K, Mohammed A, Vakiti A, Rawla P, Barsouk A. Epidemiology of Renal Cell Carcinoma. World J Oncol. 2020 Jun;11(3):79–87. doi: 10.14740/wjon1279. Epub 2020 May 14. PMID: 32494314; PMCID: PMC7239575.
  3. [3].
    Goossens, Nicolas et al. “Cancer biomarker discovery and validation.” Translational cancer research vol. 4,3 (2015): 256–269. doi:10.3978/j.issn.2218-676X.2015.06.04
  4. [4].
    Farber, Nicholas J., et al. “Renal Cell Carcinoma: The Search for a Reliable Biomarker.” Translational Cancer Research, vol. 6, no. 3, 2017, pp. 620–632.
  5. [5].
    Yan, Fangrong, et al. “Identify clear cell renal cell carcinoma related genes by gene network.” Oncotarget [Online], 8.66 (2017): 110358–110366.
  6. [6].
    Deng, Su-Ping et al. “Identifying Stages of Kidney Renal Cell Carcinoma by Combining Gene Expression and DNA Methylation Data.” IEEE/ACM transactions on computational biology and bioinformatics vol. 14,5 (2017): 1147–1153. doi:10.1109/TCBB.2016.2607717
  7. [7].
    Hu, Fuyan et al. “A Gene Signature of Survival Prediction for Kidney Renal Cell Carcinoma by Multi-Omic Data Analysis.” International journal of molecular sciences vol. 20,22 5720. 14 Nov. 2019, doi:10.3390/ijms20225720
  8. [8].
    Banlai Ruan, et al. “Identification of a Set of Genes Improving Survival Prediction in Kidney Renal Clear Cell Carcinoma through Integrative Reanalysis of Transcriptomic Data”, Disease Markers, vol. 2020, Article ID 8824717, 20 pages, 2020. https://doi.org/10.1155/2020/8824717
  9. [9].
    Bhalla, Sherry, et al. “Gene Expression-Based Biomarkers for Discriminating Early and Late Stage of Clear Cell Renal Cancer.” Scientific Reports, vol. 7, no. 1, 2017, pp. 44997–44997.
  10. [10].
    Chen Y, Gu D, Wen Y, Yang S, Duan X, Lai Y, Yang J, Yuan D, Khan A, Wu W, Zeng G. Identifying the novel key genes in renal cell carcinoma by bioinformatics analysis and cell experiments. Cancer Cell Int. 2020 Jul 21;20:331. doi: 10.1186/s12935-020-01405-6. PMID: 32699530; PMCID: PMC7372855.
  11. [11].
    Cui, H., Shan, H., Miao, M.Z. et al. Identification of the key genes and pathways involved in the tumorigenesis and prognosis of kidney renal clear cell carcinoma. Sci Rep 10, 4271 (2020). https://doi.org/10.1038/s41598-020-61162-4
  12. [12].
    The Cancer Genome Atlas Research Network. Comprehensive molecular characterization of clear cell renal cell carcinoma. Nature 499, 43–49 (2013).
  13. [13].
    GTEx Consortium. The Genotype-Tissue Expression (GTEx) project. Nat Genet. 2013 Jun;45(6):580–5. doi: 10.1038/ng.2653. PMID: 23715323; PMCID: PMC4010069.
  14. [14].
    Rudnick PA, Markey SP, Roth J, Mirokhin Y, Yan X, Tchekhovskoi DV, Edwards NJ, Thangudu RR, Ketchum KA, Kinsinger CR, Mesri M, Rodriguez H, Stein SE. A Description of the Clinical Proteomic Tumor Analysis Consortium (CPTAC) Common Data Analysis Pipeline. J Proteome Res. 2016 Mar 4;15(3):1023–32. doi: 10.1021/acs.jproteome.5b01091. Epub 2016 Feb 25. PMID: 26860878; PMCID: PMC5117628.
  15. [15].
    Iacovelli R, Sternberg CN, Porta C, Verzoni E, de Braud F, Escudier B, Procopio G. Inhibition of the VEGF/VEGFR pathway improves survival in advanced kidney cancer: a systematic review and meta-analysis. Curr Drug Targets. 2015;16(2):164–70. doi: 10.2174/1389450115666141120120145. PMID: 25410406.
  16. [16].
    Ficarra V, Martignoni G, Maffei N, Brunelli M, Novara G, Zanolla L, Pea M, Artibani W. Original and reviewed nuclear grading according to the Fuhrman system: a multivariate analysis of 388 patients with conventional renal cell carcinoma. Cancer. 2005 Jan 1;103(1):68–75. doi: 10.1002/cncr.20749. PMID: 15573369.
  17. [17].
    Wang, Q., Armenia, J., Zhang, C. et al. Unifying cancer and normal RNA sequencing data from different sources. Sci Data 5, 180061 (2018). https://doi.org/10.1038/sdata.2018.61
  18. [18].
    Hsieh JJ, Purdue MP, Signoretti S, Swanton C, Albiges L, Schmidinger M, Heng DY, Larkin J, Ficarra V. Renal cell carcinoma. Nat Rev Dis Primers. 2017 Mar 9;3:17009. doi: 10.1038/nrdp.2017.9. PMID: 28276433; PMCID: PMC5936048.
  19. [19].
    Davidson-Pilon, (2019). lifelines: survival analysis in Python. Journal of Open Source Software, 4(40), 1317, https://doi.org/10.21105/joss.01317
  20. [20].
    Chen, T., & Guestrin, C. (2016). XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 785–794). New York, NY, USA: ACM. https://doi.org/10.1145/2939672.2939785
  21. [21].
    Shi, Xingjian, et al. “Multimodal AutoML on Structured Tables with Text Fields.” 8th ICML Workshop on Automated Machine Learning (AutoML). 2021.
  22. [22].
    Fernando Nogueira, Bayesian Optimization: Open source constrained global optimization tool for Python, 2014, https://github.com/fmfn/BayesianOptimization
  23. [23].
    Shibata K, Mori M, Tanaka S, Kitano S, Akiyoshi T. Identification and cloning of human G-protein gamma 7, down-regulated in pancreatic cancer. Biochem Biophys Res Commun. 1998 May 8;246(1):205–9. doi: 10.1006/bbrc.1998.8581. PMID: 9600093.
  24. [24].
    Ohta M, Mimori K, Fukuyoshi Y, Kita Y, Motoyama K, Yamashita K, Ishii H, Inoue H, Mori M. Clinical significance of the reduced expression of G protein gamma 7 (GNG7) in oesophageal cancer. Br J Cancer. 2008 Jan 29;98(2):410–7. doi: 10.1038/sj.bjc.6604124. Epub 2008 Jan 22. PMID: 18219292; PMCID: PMC2361448.
  25. [25].
    Hartmann S, Szaumkessel M, Salaverria I, Simon R, Sauter G, Kiwerska K, Gawecki W, Bodnar M, Marszalek A, Richter J, Brauze D, Zemke N, Jarmuz M, Hansmann ML, Siebert R, Szyfter K, Giefing M. Loss of protein expression and recurrent DNA hypermethylation of the GNG7 gene in squamous cell carcinoma of the head and neck. J Appl Genet. 2012 May;53(2):167–74. doi: 10.1007/s13353-011-0079-4. Epub 2011 Dec 20. PMID: 22183866; PMCID: PMC3334494.
  26. [26].
    Xu S, Zhang H, Liu T, Chen Y, He D, Li L. G Protein γ subunit 7 loss contributes to progression of clear cell renal cell carcinoma. J Cell Physiol. 2019 Nov;234(11):20002–20012. doi: 10.1002/jcp.28597. Epub 2019 Apr 3. PMID: 30945310; PMCID: PMC6767067.
  27. [27].
    Ye YX, Zhou J, Zhou YH, Zhou Y, Song XB, Wang J, Lin L, Ying BW, Lu XJ. Clinical significance of BCR-ABL fusion gene subtypes in chronic myelogenous and acute lymphoblastic leukemias. Asian Pac J Cancer Prev. 2014;15(22):9961–6. doi: 10.7314/apjcp.2014.15.22.9961. PMID: 25520136.
  28. [28].
    Kim SH, Park WS, Chung J. SETD2, GIGYF2, FGFR3, BCR, KMT2C, and TSC2 as candidate genes for differentiating multilocular cystic renal neoplasm of low malignant potential from clear cell renal cell carcinoma with cystic change. Investig Clin Urol. 2019 May;60(3):148–155. doi: 10.4111/icu.2019.60.3.148. Epub 2019 Apr 1. PMID: 31098421; PMCID: PMC6495037.
  29. [29].
    Kim SH, Park WS, Chung J. Tumour heterogeneity in triplet-paired metastatic tumour tissues in metastatic renal cell carcinoma: concordance analysis of target gene sequencing data. J Clin Pathol. 2019 Feb;72(2):152–156. doi: 10.1136/jclinpath-2018-205456. Epub 2018 Nov 8. PMID: 30409839.
  30. [30].
    Wu Z, Huang X, Cai M, Huang P. Potential biomarkers for predicting the overall survival outcome of kidney renal papillary cell carcinoma: an analysis of ferroptosis-related LNCRNAs. BMC Urol. 2022 Sep 14;22(1):152. doi: 10.1186/s12894-022-01037-0. PMID: 36104680; PMCID: PMC9476343.
  31. [31].
    Martinez-Morales P, Morán Cruz I, Roa-de la Cruz L, Maycotte P, Reyes Salinas JS, Vazquez Zamora VJ, Gutierrez Quiroz CT, Montiel-Jarquin AJ, Vallejo-Ruiz V. Hallmarks of glycogene expression and glycosylation pathways in squamous and adenocarcinoma cervical cancer. PeerJ. 2021 Aug 31;9:e12081. doi: 10.7717/peerj.12081. PMID: 34540372; PMCID: PMC8415283.
  32. [32].
    Zhu WK, Xu WH, Wang J, Huang YQ, Abudurexiti M, Qu YY, Zhu YP, Zhang HL, Ye DW. Decreased SPTLC1 expression predicts worse outcomes in ccRCC patients. J Cell Biochem. 2020 Feb;121(2):1552–1562. doi: 10.1002/jcb.29390. Epub 2019 Sep 12. PMID: 31512789.
  33. [33].
    Shi Z, Zheng J, Liang Q, Liu Y, Yang Y, Wang R, Wang M, Zhang Q, Xuan Z, Sun H, Wang K, Shao C. Identification and Validation of a Novel Ferroptotic Prognostic Genes-Based Signature of Clear Cell Renal Cell Carcinoma. Cancers (Basel). 2022 Sep 27;14(19):4690. doi: 10.3390/cancers14194690. PMID: 36230613; PMCID: PMC9562262.
  34. [34].
    Blaschke RJ, Monaghan AP, Schiller S, Schechinger B, Rao E, Padilla-Nash H, Ried T, Rappold GA. SHOT, a SHOX-related homeobox gene, is implicated in craniofacial, brain, heart, and limb development. Proc Natl Acad Sci U S A. 1998 Mar 3;95(5):2406–11. doi: 10.1073/pnas.95.5.2406. PMID: 9482898; PMCID: PMC19357.
  35. [35].
    Huang W, Huang H, Zhang S, Wang X, Ouyang J, Lin Z, Chen P. A Novel Diagnosis Method Based on Methylation Analysis of SHOX2 and Serum Biomarker for Early Stage Lung Cancer. Cancer Control. 2020 Jan-Dec;27(1):1073274820969703. doi: 10.1177/1073274820969703. PMID: 33167712; PMCID: PMC7791477.
  36.
    Jung M, Ellinger J, Gevensleben H, Syring I, Lüders C, de Vos L, Pützer S, Bootz F, Landsberg J, Kristiansen G, Dietrich D. Cell-Free SHOX2 DNA Methylation in Blood as a Molecular Staging Parameter for Risk Stratification in Renal Cell Carcinoma Patients: A Prospective Observational Cohort Study. Clin Chem. 2019 Apr;65(4):559–568. doi: 10.1373/clinchem.2018.297549. Epub 2019 Jan 9. PMID: 30626634.
  37.
    Kozomara A, Birgaoanu M, Griffiths-Jones S. miRBase: from microRNA sequences to function. Nucleic Acids Res 2019 47:D155–D162. doi: 10.1093/nar/gky1141. PMID: 30423142.
  38.
    He H, Wang L, Zhou W, Zhang Z, Wang L, Xu S, Wang D, Dong J, Tang C, Tang H, Yi X, Ge J. MicroRNA Expression Profiling in Clear Cell Renal Cell Carcinoma: Identification and Functional Validation of Key miRNAs. PLoS One. 2015 May 4;10(5):e0125672. doi: 10.1371/journal.pone.0125672. PMID: 25938468; PMCID: PMC4418764.
  39.
    Li, Zhengzhao, Junyu Lu, Guang Zeng, Jielong Pang, Xiaowen Zheng, Jihua Feng, and Jianfeng Zhang. “MiR-129-5p inhibits liver cancer growth by targeting calcium calmodulin-dependent protein kinase IV (CAMK4).” Cell Death & Disease 10, no. 11 (2019): 1–14.
  40.
    Xu, Shan, Wei Li, Jing Wu, Yuru Lu, Ming Xie, Yanlan Li, Juan Zou, Tiebing Zeng, and Hui Ling. “The Role of miR-129-5p in Cancer: A Novel Therapeutic Target.” Current Molecular Pharmacology 15, no. 4 (2022): 647–657.
  41.
    Ding, Li, Jie Ni, Fan Yang, Lingli Huang, Heng Deng, Yang Wu, Xuansheng Ding, and Jinhai Tang. “Promising therapeutic role of miR-27b in tumor.” Tumor Biology 39, no. 3 (2017): 1010428317691657.
  42.
    Ishihara T, Seki N, Inoguchi S, Yoshino H, Tatarano S, Yamada Y, Itesako T, Goto Y, Nishikawa R, Nakagawa M, Enokida H. Expression of the tumor suppressive miRNA-23b/27b cluster is a good prognostic marker in clear cell renal cell carcinoma. J Urol. 2014 Dec;192(6):1822–30. doi: 10.1016/j.juro.2014.07.001. Epub 2014 Jul 9. PMID: 25014580.
  43.
    Li, Yifan, Bao Guan, Jingtao Liu, Zhongyuan Zhang, Shiming He, Yonghao Zhan, Boxing Su et al. “MicroRNA-200b is downregulated and suppresses metastasis by targeting LAMA4 in renal cell carcinoma.” EBioMedicine 44 (2019): 439–451.
  44.
    Shen, Junwen, Rongjiang Wang, Yu Chen, Zhihai Fang, Jianer Tang, Jianxiang Yao, Jianguo Gao, Wenxia Zhou, and Xiongnong Chen. “Comprehensive analysis of expression profiles and prognosis of TRIM genes in human kidney clear cell carcinoma.” Aging (Albany NY) 14, no. 10 (2022): 4606.
  45.
    Wang S, Liu N, Tang Q, Sheng H, Long S, Wu W. MicroRNA-24 in Cancer: A Double Side Medal With Opposite Properties. Front Oncol. 2020 Oct 2;10:553714. doi: 10.3389/fonc.2020.553714. PMID: 33123467; PMCID: PMC7566899.
  46.
    Wu L, Han X, Jiang X, Ding H, Qi C, Yin Z, Xiao J, Xiong L, Guo Q, Ye Z, Qu B, Shen N. Downregulation of Renal Hsa-miR-127-3p Contributes to the Overactivation of Type I Interferon Signaling Pathway in the Kidney of Lupus Nephritis. Front Immunol. 2021 Oct 21;12:747616. doi: 10.3389/fimmu.2021.747616. PMID: 34745118; PMCID: PMC8566726.
  47.
    Zhou, Wenbin, Xingjie Bi, Guojun Gao, and Lijiang Sun. “miRNA-133b and miRNA-135a induce apoptosis via the JAK2/STAT3 signaling pathway in human renal carcinoma cells.” Biomedicine & Pharmacotherapy 84 (2016): 722–729.
  48.
    Han, Miaoru, Haifeng Yan, Kang Yang, Boya Fan, Panying Liu, and Hongtao Yang. “Identification of biomarkers and construction of a microRNA-mRNA regulatory network for clear cell renal cell carcinoma using integrated bioinformatics analysis.” PLoS One 16, no. 1 (2021): e0244394.
  49.
    Uhlén M et al. Tissue-based map of the human proteome. Science. 2015. doi: 10.1126/science.1260419. PMID: 25613900.
  50.
    The UniProt Consortium. UniProt: the universal protein knowledgebase in 2021. Nucleic Acids Res. 2021;49(D1).
  51.
    Qin M, Gao X, Luo W, Ou K, Lu H, Liu H, Zhuang Q. Expression of CHL1 in Clear Cell Renal Cell Carcinoma and its Association With Prognosis. Appl Immunohistochem Mol Morphol. 2022 Mar 1;30(3):209–214. doi: 10.1097/PAI.0000000000000993. PMID: 35262525.
  52.
    Han M, Yan H, Yang K, Fan B, Liu P, Yang H. Identification of biomarkers and construction of a microRNA-mRNA regulatory network for clear cell renal cell carcinoma using integrated bioinformatics analysis. PLoS One. 2021 Jan 12;16(1):e0244394. doi: 10.1371/journal.pone.0244394. PMID: 33434215; PMCID: PMC7802940.
  53.
    Kozomara A, Birgaoanu M, Griffiths-Jones S. miRBase: from microRNA sequences to function. Nucleic Acids Res 2019 47:D155–D162. doi: 10.1093/nar/gky1141. PMID: 30423142.
  54.
    Lipton ZC. The mythos of model interpretability: In machine learning, the concept of interpretability is both important and slippery. Queue, 16(3) (2018): 31–57.
  55.
    Molnar C. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable (2nd ed.) (2022). https://christophm.github.io/interpretable-ml-book/
Posted November 22, 2022.

Subject Area

  • Oncology