Upregulation of Human Endogenous Retrovirus-K (HML-2) mRNAs in hepatoblastoma: Identification of potential new immunotherapeutic targets and biomarkers ======================================================================================================================================================= * David F Grabski * Aakrosh Ratan * Laurie R Gray * Stefan Bekiranov * David Rekosh * Marie-Louise Hammarskjold * Sara K Rasmussen ## Abstract **Purpose** Hepatoblastoma is the most common liver malignancy in children. In order to advance therapy against hepatoblastoma, novel immunologic targets and biomarkers are needed. Our purpose in this investigation is to examine hepatoblastoma transcriptomes for the expression of a class of genomic elements known as Human Endogenous Retrovirus (HERVs). HERVs are abundant in the human genome and are biologically active elements that have been associated with multiple malignancies and proposed as immunologic targets in a subset of tumors. A sub-family of HERVs, HERV-K (HML-2), have been shown to be tightly regulated in fetal development, making investigation of these elements in pediatric tumors paramount. **Methods** We first created a HERVK-FASTA file utilizing 91 previously described HML-2 proviruses. We then concatenated the file onto the GRCh38.95 cDNA library from Ensembl. We used this computational tool to evaluate existing RNA-seq data from 10 hepatoblastoma tumors and 3 normal liver controls (GEO accession ID: GSE89775). Quantification and differential proviral expression analysis between hepatoblastoma and normal liver controls was performed using the pseudo-alignment program Salmon and DESeq2, respectively. **Results** HERV-K mRNA was expressed in hepatoblastoma from multiple proviral loci. All HERV-K proviral loci were expressed at higher levels in hepatoblastoma compared to normal liver controls. Five HERV-K proviruses (1q21.3, 3q27.2, 7q22.2, 12q24.33 and 17p13.1) were significantly differentially expressed (p-adjusted value < 0.05, |log2 fold change| > 1.5) across conditions. The provirus at 17p13.1 had an approximately 300-fold increased expression in hepatoblastoma as compared to normal liver. This was in part due to the near absence of HERV-K mRNA at the 17p13.1 locus in fully differentiated liver samples. **Conclusions** Our investigation demonstrates that HERV-K is expressed from multiple loci in hepatoblastoma and that the expression is increased from several proviruses as compared to normal liver controls. Our results suggest that HERV-K mRNA expression may find use as a biomarker in hepatoblastoma, given the large differential expression profiles in hepatoblastoma, with very low mRNA levels in liver control samples. Key Words * Human Endogenous Retrovirus-K * Hepatoblastoma * Fetal Tumor * Immunotherapy * Tumor Biomarker ## 1. Introduction Hepatoblastoma is the most common pediatric liver malignancy, affecting approximately 500 children in the US each year [1, 2]. Similar to other fetal tumors, hepatoblastoma is thought to arise from embryonic liver progenitor cells that fail to differentiate into hepatocytes [3-5]. As hepatoblastoma precursor cells show different levels of differentiation prior to malignant transformation, this cancer is morphologically complex and histologically subcategorized as one of the following subtypes: fetal, embryonal, or mixed epithelial and mesenchymal undifferentiated small cell [6]. Treatment is multimodal, involving a combination of resection and chemotherapy or transplant [7]. Five-year survival in North America is between 70-80%, with the best outcomes in early stage disease [8, 9]. There is still a clear need to identify novel treatment strategies that can offer more children hope for a long term cure [10, 11]. Furthermore, a full understanding of the molecular drivers of these tumors will be advantageous in the search for new treatments [12]. This report focuses on the expression of Human Endogenous Retrovirus-K (HERV-K) (HML-2) mRNA in hepatoblastoma and identifies HERV-K expression as a potential disease marker in this cancer. HERVs have garnered increasing attention in translational investigative science in the last 30 years as evidence has accumulated that these genomic elements have significant effects on human biology, both in health and disease. Attention has recently turned to the potential role of HERVs as both targets for immunotherapy [13, 14] as well as tumor markers to stratify disease in different solid organ malignancies [15, 16]. Human endogenous retroviruses (HERVs) are transposable genomic elements that have integrated into the human germline over many millions of years. All together, HERVs comprise an estimated 8% of the human genome [17]. Several HERVs have been co-opted by the human cell for specific biologic functions. For example, the Syncytin protein, responsible for the formation of syncytiotrophoblasts and ultimately mammalian placentation, was exapted from a HERV envelope protein [18]. In addition, a HERV viral promoter activates and drives transcription of the amylase gene in the hominid parotid gland, which may have been key to the expansion of the hominid diet to include starches [19]. For in-depth reviews of HERV biology, see [20-22]. There have been numerous phylogenetically distinct integrations of HERVs over evolutionary time with approximately 40 independently identified sub-groups [23]. The HERV-K (HML-2) viruses represent the most recent retroviruses to integrate into the human germline and have thus undergone the least amount of genetic silencing. In addition, these viruses remain polymorphic in human populations [24]. When transcriptionally active, some of the HERV-K proviruses are still capable of producing viral proteins, which can exert varied functions, such as nucleocytoplasmic export of mRNA and cell-cell fusion. There are approximately 90 identified intact or partially intact HERV-K (HML-2) proviruses distributed throughout the human genome, as well as over 900 solo long terminal repeats (LTRs) that are remnants of inserted viruses [25]. In the majority of human cell types and cell states, HERV-Ks have been transcriptionally silenced through DNA methylation and chromatin remodeling [26]. However, growing evidence suggests that there are multiple diseases and developmental states where HERV-K mRNAs are more highly expressed than in their normal adult somatic cell counter-parts [27-29]. Important to this investigation, HERV-K has been shown to be over-expressed in multiple tumors [30, 31], as well as during fetal development [32, 33], leading to increasing research into the potential effects of HERV-K expression in both oncogenesis and during embryogenesis. In a previous study, Zhao et al. identified HERV-K envelope mRNA expression in breast cancer tissue, while expression was not detectable in healthy breast tissue controls [30]. Additional studies noted that HERV-K mRNA expression levels directly correlated with the eventual development of distant metastasis [15], while another investigation found that the overexpression of HERV-K was specifically associated with the basal cell subtype [34]. Similarly, in melanoma, HERV-K mRNA expression effectively distinguished tumor samples as well as metastatic lymph nodes from normal tissue and non-metastatic lymph nodes [31]. In an investigation of hepatocellular carcinoma and surrounding normal tissue it was noted that HERV-K mRNA levels were increased in hepatocellular carcinoma and that this provided an independent prognostic indicator for lower overall survival [35]. For the current understanding of HERV biology in cancer, see [36, 37]. Given the large differential expression profiles between disease and normal tissue and that HERV-Ks can express viral proteins capable of activating both innate, humoral and cell-mediated immune responses [38-42], attention has recently turned to the possibility of utilizing HERV-K neoantigens for targeted therapy. Chimeric antigen receptor (CAR) T cells that target HERV-K Env proteins have been developed and tested in *in vivo* murine models for both breast cancer and melanoma [43, 44]. In both models, the HERV-K Env CAR T-cells demonstrated tumor specific cytotoxicity, reduced the primary tumor mass and showed reduction of tumor metastases. Importantly, HERV-K expression has remained unexamined in pediatric tumors. This oversight is potentially impactful, given that HERV-K has also been shown to be regulated and transcribed during embryogenesis and progressively silenced as fetal development continues [32, 33]. Transcriptional activity of HERV-K proviruses in pediatric tumors are thus of specific interest as these cancers are thought to arise from embryonic precursor cells that fail to differentiate during organ development. Given the increased expression of HERV-K mRNA in hepatocellular carcinoma, as well as the fact that hepatoblastoma fits the paradigm of a fetal tumor which arises from a failure of cellular differentiation, we sought to investigate HERV-K expression in this tumor. We hypothesized that HERV-Ks may be more expressed in hepatoblastoma, as compared to normal, fully differentiated liver tissue. In this investigation, we utilized publicly available RNA sequencing data from hepatoblastoma and normal liver controls generated by the Children’s Hospital of Pittsburgh [45]. We utilized standard bioinformatics techniques to create a custom database that enabled us to examine RNA expression from HERV-K proviruses in these data sets. Here, we report on differences in HERV-K RNA expression in hepatoblastoma compared to normal liver controls. ## 2. Methods ### 2.1 Hepatoblastoma and Normal Liver RNA-seq Data The dataset used in this investigation includes RNA sequencing (RNA-seq) data from 10 hepatoblastoma samples and 3 normal liver controls. Tumor excision and RNA isolation was performed by the University of Pittsburg Children’s Hospital as part of a next-generation sequencing (NGS) study to identify activated cancer pathways in clinically aggressive hepatoblastoma [45]. The raw sequencing data are publicly available and were downloaded from the NCBI biorepository using the NCBI Sequence Read Archive (SRA) Toolkit (GEO accession ID GSE89775). According to the NCBI biorepository, total RNA (1 ug) was isolated from fresh frozen tissue (both hepatoblastoma and normal liver) and sequenced on an Illumina platform to generate 100 base-pair, strand-specific, paired-end reads to a sequencing depth of approximately 40M reads per sample. Prior to analysis in this study, all raw FASTQ files were pre-processed with Trimmomatic to remove adaptors and low-quality reads as well as to assure that only paired-ends reads with a minimum read length of 50 nucleotides were included [46]. The quality of the raw and trimmed reads from each sample was confirmed with the program FASTQC ([http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc](http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc)). ### 2.2 Analysis of HERV-K mRNA expression HERV-K proviruses are currently not well annotated in the human genome, meaning that standard RNA-seq analysis techniques cannot be used to determine the expression profile of HERV-Ks. To overcome this limitation, we created a HERV-K specific FASTA file using the genomic sequence of the 91 HERV-K proviruses deposited in NCBI (GenBank ID [JN675007](http://medrxiv.org/lookup/external-ref?link\_type=GEN&access\_num=JN675007&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom)-[JN675097](http://medrxiv.org/lookup/external-ref?link\_type=GEN&access\_num=JN675097&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom)). To determine the transcriptional profile and differential expression of HERV-K in all samples, we concatenated our working HERV-K FASTA file onto the GRCh38.95 cDNA fasta file downloaded from Ensembl. For RNA analysis, we did not annotate the individual potential spliced transcripts that would be expected to be expressed from an integrated provirus, but rather defined the entire proviral sequence as a single transcript. We then used the pseudo-aligner, Salmon [47] in mapping-based mode with the validateMappings flag to create a count matrix over the full human transcriptome including the concatenated HERV-K file (example code: salmon quant -i GRCh38\_HERVK.fa -l -1 FT6_1.fq -2 FT6_2.fq –validateMappings -o FT6_quant). We then sub-selected read counts assigned to HERV-K loci. We next indexed the HERV-K FASTA file with the alignment tool HISAT2 [48] and aligned our Hepatoblastoma and Normal Liver samples to the HISAT2-HERV-K index to create .BAM files. To control for multi-mapping of the repeat elements, we used SAMTOOLS to select for reads with a mapping quality (MAPQ) Score >= 50, which has been shown to be an effective way to eliminate multi-mapped reads from HERV-K analysis [49]. Uniquely aligned .BAM files for all samples were then imported into the software package Geneious (Biomatters, Auckland, New Zealand) for visualization of read position along an annotated HERV-K provirus which distinguishes reading frames for the different viral proteins. ### 2.3 HERV-K Expression Profiles and Differential Expression Transcript abundance read estimates from Salmon were imported into R (version 3.5.1) using tximport [50]. Transcript abundance estimates were normalized for sample sequencing depth using the R Bioconductor package DESeq2 [51]. This allowed us to determine normalized HERV-K expression across all proviral loci by sample, together with the number of loci responsible for total HERV-K expression and the range of reads across each locus. Differential expression of HERV-K in hepatoblastoma as compared to normal liver was also analyzed using DESeq2. HERV-K proviruses were considered differentially expressed if the p-adjusted values (calculated using the Benjamini-Hochberg False Discovery Rate implemented in DESEq2) were less than 0.05 and the absolute value of the log2 fold changes were greater than 1.5 [52]. Given the apparent heterogeneity in HERV-K expression across different hepatoblastoma samples, we also stratified hepatoblastoma samples by overall HERV-K expression (total number of normalized reads across all proviral loci). We selected the top 3 highest HERV-K expressing hepatoblastoma samples and the three lowest HERV-K expressing hepatoblastoma samples. We then performed a differential gene expression analysis again using DESeq2 to compare the high expressing to low expressing tumors. Genes with a p-adj value < 0.05 and |log2 fold change| > 1.5 were considered significant and included in a Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) over-representation functional analysis. Gene pathways were considered enriched if they had a p-adj value < 0.05. GO and KEGG analysis was performed using the clusterProfiler package in R [53]. Unless otherwise specified, all plots denoting the RNA expression profile and differential expression were generated using the ggplot2 package in R [54]. The scatterplot was created using the EnchancedVolcano visualization package in R [55]. ## 3. Results The RNA-seq that was downloaded from the NCBI biorepository included 10 hepatoblastoma samples (HB) from children undergoing liver transplantation for their disease. These samples represent aggressive hepatoblastomas that were not amenable to up-front resection. It also included 3 normal liver controls (NC) from orthotopic livers prior to transplant. Following pseudoalignment with Salmon and sample normalization with DESeq2 as described in the Methods, we found that the HERV-K RNA expression profile varied greatly across the hepatoblastoma (HB) samples and normal liver controls (NC). In HB, the median HERV-K read counts across all proviral locations for each sample was 342 (interquartile range (IQR) 235, 515). However, 2 samples had greater than 2,000 reads that aligned to several HERV-K proviruses, whereas 2 samples showed 150 read counts or less (Table 1). The general HERV-K expression profile across all samples is visualized in the HERV-K expression heatmap in Figure 1. Each cell in the heatmap represents the normalized expression Z-score calculated across samples (range -3 to 3) of a specific HERV-K provirus (y-axis) in individual HB or NC samples (x-axis). mRNA from numerous proviral loci were expressed at average levels across all the samples (Z-score = 0), which we confirmed were HERV-Ks expressed at low levels (average of less than 10 reads across a HERV-K provirus in individual samples). These HERV-K loci are represented by light blue in the heatmap. There was great heterogeneity in proviral expression in individual samples, including variation among the hepatoblastoma samples (9.5 expressed proviral loci, IQR 9.3). Variation was also was observed in the normal liver controls (3 expressed proviral loci, IQR 3) although the total number of expressed HERV-K loci was less than in HB. No proviral loci were common to all samples and many HERV-K loci were expressed in less than 5 total samples. View this table: [Table 1:](http://medrxiv.org/content/early/2020/03/03/2020.02.28.20026260/T1) Table 1: Normalized HERV-K read counts and number of transcribed HERV-K loci in hepatoblastoma and normal liver controls ![Figure 1:](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2020/03/03/2020.02.28.20026260/F1.medium.gif) [Figure 1:](http://medrxiv.org/content/early/2020/03/03/2020.02.28.20026260/F1) Figure 1: Heatmap of HERV-K expression in hepatoblastoma and normal liver controls. All tumor samples and normal liver controls are represented on the x-axis. All HERV-K proviral loci are represented on the y-axis. Color expression key is located in upper left of figure and is based on the calculated Z score (scale -3 to 3) across all samples in the study. Each individual cell represents the normalized HERV-K proviral expression (x-axis) in the respective sample (y-axis). We next calculated differential proviral expression across the HB and NC samples. All proviral loci were expressed at higher levels in HB compared to NC as demonstrated by the HERV-K expression profile scatter plot (Figure 2). Five proviruses (1q21.3, 3q27.2, 7q22.2, 12q24.33 and 17p13.1) were significantly differentially expressed (p-adjusted value < 0.05, |log2 fold change| > 1.5) across conditions (Table 2). Boxplots of the log10 normalized counts for the HB and NC samples across each differentially expressed provirus reveal much higher expression in HB compared to NC samples (Figure 3). For 3 proviruses (17p13.1, 3q27.2 and 7q22.2), the normalized expression across all 3 NC samples was less than 10 reads and was too low for graphical comparison. The absence of expression in NC led to large fold-changes between HB and NC. Thus 17p13.1 showed a 294-fold increase expression in HB as compared to NC (padj = 0.009). This provirus was expressed in all hepatoblastoma samples with one exception and was not expressed in 2 of the 3 normal controls. Similarly, in HB samples, 7q22.2 was expressed 93.1-fold above NC (padj = 0.027), while 3q27.2 was expressed 55-fold above NC (padj = 0.026). View this table: [Table 2:](http://medrxiv.org/content/early/2020/03/03/2020.02.28.20026260/T2) Table 2: Differential gene expression of HERV-K between hepatoblastoma and normal liver controls ![Figure 2:](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2020/03/03/2020.02.28.20026260/F2.medium.gif) [Figure 2:](http://medrxiv.org/content/early/2020/03/03/2020.02.28.20026260/F2) Figure 2: Scatter plot of HERV-K differential expression profile between hepatoblastoma and normal liver controls. Each point on the figure represents the log2 fold change (x-axis) between conditions of an individual HERV-K provirus plotted against the corresponding -log10 p-adjusted value (y-axis). Orange points represent HERV-K proviruses that were not significantly differentially expressed (p-adj < 0.05, |log2fold change| > 1.5) between conditions. Green points represent differentially expressed provirus, which are also labeled by genomic location (HK_1q21.3 represents the HERV-K provirus located at 1q21.3). ![Figure 3:](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2020/03/03/2020.02.28.20026260/F3.medium.gif) [Figure 3:](http://medrxiv.org/content/early/2020/03/03/2020.02.28.20026260/F3) Figure 3: Boxplot and overlaid dotplot of significantly differentially expressed HERV-K proviruses. Individual proviruses are represented on the x-axis. Log10 normalized count values are represented on the y-axis. The normalized count value for each provirus in each sample is represented by an individual colored point. Hepatoblastoma samples are represented in orange while normal liver controls are represented in green. The central line of the boxplot is determined by the median log10 normalized expression value across all grouped samples (hepatoblastoma or normal liver controls) for each significantly differentially expressed provirus. The ‘box’ represents the 25th (lower line) and 75th (upper line) percentile log10 normalized expression across grouped samples. Following alignment of all HB and NC samples with our HISAT2-HERV-K index and selection for uniquely mapped reads, we imported the alignment file (.BAM) into the bioinformatics platform Geneious. This allowed visualization of where the reads aligned across each provirus in the different samples. For provirus 17p13.1, the reads aligned across the entire provirus (Figure 4, panel A). Reads similarly mapped across the length of the provirus at 12q24.33 (Figure 4, panel B), with a small concentration of reads in the 5’ LTR (long terminal repeat). A similar pattern was seen in provirus 1q21.3, but with a concentration of reads in the 3’ LTR (Figure 4, panel C). Interestingly, despite 3q27.2 being a relatively complete provirus (9,100 bp), the majority of the reads aligned in the 3’ LTR for all samples (Figure 4, panel D). Conversely, all of the reads for 7q22.2 aligned uniquely to the 5’ LTR in all samples (Figure 4, panel E). Larger images for each individual provirus in Figure 4 are provided in Supplemental File 1. ![Figure 4:](http://medrxiv.org/http://medrxiv.stage.highwire.org/content/medrxiv/early/2020/03/03/2020.02.28.20026260/F4.medium.gif) [Figure 4:](http://medrxiv.org/content/early/2020/03/03/2020.02.28.20026260/F4) Figure 4: Graphical representation of uniquely aligned reads across HERV-K provirus (A)17p13.1 (B) 12q24.33 (C) 1q21.3 (D) 3q27.2 and (E) 7q22.2 created in bioinformatics platform Geneious. The x-axis represents the genomic position along the provirus. Major annotated regions of the proviral genome at each provirus are illustrated at the bottom of the panel. Coding regions for viral proteins Gag, Pro, Pol, Env, Rec or Np9 are represented by green bars, but does not necessarily infer an open-reading frame for the protein. Individual reads from each sample are represented on the y-axis. Abbreviations: FT-fetal tumor (hepatoblastoma), NC-normal control (liver). Hepatoblastoma samples FT8 (531 HERV-K reads), FT9 (2,778 HERV-K reads and FT11 (2,503 HERV-K reads) represented the tumor samples with the highest HERV-K expression. Conversely, FT7 (223 HERV-K reads), FT10 (148 HERV-K reads) and FT15 (152 HERV-K reads) represented the tumors with the lowest expression. A differential expression analysis comparing the high to low tumors revealed 775 differential expressed genes (Supplemental File 2). GO Biological Process enrichment analysis of the differentially expressed genes revealed over-representation of cellular processes involved in leukocyte activation and leukocyte mediated immunity (Table 3). Additional GO terms (top 20) for cellular localization and molecular function as well as enriched KEGG terms are provided in Supplemental File 3. View this table: [Table 3:](http://medrxiv.org/content/early/2020/03/03/2020.02.28.20026260/T3) Table 3: Gene Ontology Biological Process Enrichment Analysis following differential gene expression analysis of high HERV-K expressing HB vs low HERV-K expressing HB. ## 4. Discussion The data in this investigation establish that HERV-Ks are expressed in hepatoblastoma. Furthermore, the mRNA profile of HERV-K in hepatoblastoma is complex, with multiple proviruses transcribed from different loci in tumors from different individuals. The data also show that overall HERV-K expression is increased in hepatoblastoma compared to normal liver controls and that several proviruses show large fold increases in tumors compared to normal liver tissue. The significant increase makes HERV-Ks intriguing targets for immunotherapy. In addition, our data suggest that they could also serve as potential biomarkers for disease recurrence or progression, though further studies will be required to confirm this. This investigation is the first to demonstrate HERV-K expression in a pediatric solid organ malignancy. In addition, the bioinformatics pipeline described in this manuscript provides an effective tool to measure HERV-K RNA profiles in disease versus non-disease states that could be used to screen other fetal malignancies for HERV-K expression. ### 4.1 Expression of HERV-K in Hepatoblastoma and Potential Clinical Applications Multiple HERV-K proviruses showed increased expression in hepatoblastoma. The HERV-K provirus at 17p13.1 was the most dramatic example of a large differential expression value, with an almost 300-fold change in expression from the provirus compared to normal liver tissue. Similarly, large differential expression values were seen for proviral loci 1p21.3, 3q27.2, 7q22.2 and 12q24.33. The magnitude of the increased expression levels over normal liver controls was prominent in several instances. This is in part because mRNA levels from HERV-K proviruses in normal liver control were either not present at all, or present at very low levels (less than 10 read counts across the entire provirus). Our findings of low HERV-K expression in fully differentiated liver is consistent with previous investigations that have examined HERV-K expression in liver tissue [35, 56]. This finding is also consistent with the reported low levels of HERV-K expression in the majority of fully differentiated somatic tissues [35, 56]. Proviral expression profiles also differed across the hepatoblastomas themselves. Several hepatoblastoma samples had over 100 read counts aligned to the provirus at 17p13.1, while several other tumors had less than 10 counts (which was more similar to the expression profile in NC). This variation in proviral expression across HB samples was true for total HERV-K expression as well. Two hepatoblastoma samples had over 2,000 normalized read counts summed over all proviruses. In contrast, two HB samples had ∼150 normalized read counts across all proviruses with only 1 or 2 proviral locations with over 10 read counts. The HERV-K RNA expression in these tumors was thus more similar to the normal liver controls than to the other tumor samples. In follow-up investigations it will be important to determine whether HERV-K expression correlates with the molecular subtype of hepatoblastoma. Furthermore, a differential gene expression analysis and GO enrichment analysis comparing high and low HERV-K expressing tumors demonstrated a strong correlation with cellular pathways involving leukocyte activation as well as neutrophil and leukocyte mediated immunity. Cellular pathway enrichment analyses such as GO and KEGG are useful adjuncts to understand how a complex set of differentially expressed genes may regulate specific biological process or known phenomenon. Our data suggest that HERV-K mRNA levels may correlate with either direct tumor immunogenicity or an inflammatory microenvironment surrounding the tumors. The possibility that expressed HERV-K proteins in these tumors could be acting as cancer neoantigens is an intriguing possibility, but this will require additional experimental validation. In a recent investigation, Rooney et al. analyzed approximately 20 solid organ tumors as well as normal tissue controls from TCGA mRNA-seq datasets for expression of both endogenous retrovirus (ERV) families as well as cytolytic activity. The investigation found that several tumor specific ERVs existed across multiple tumors and that high gene expression of these tumor specific ERVs correlated with significantly enriched immune activation pathways [57]. In addition, experimental evidence has demonstrated that HERV-K viral proteins can directly activate both innate and adaptive immune response in multiple solid organ tumors including breast cancer, melanoma and colorectal cancer [38-42]. The read distribution across individual proviruses varied but was consistent for each provirus across all samples in this study. The proviruses located at 17p13.1, 12q24.33 and 1q21.3 had read counts distributed across the entire provirus. 17p13.1 and 1q21.3 do not have intact 5’ LTRs suggesting that transcription is driven by an upstream cellular promoter and proviral expression represents read-through transcription from a cellular promoter. This phenomenon has been previously described [58]. Interestingly, the provirus at 3q27.2 had the majority of reads located in the 3’ LTR region, suggesting that transcription was initiating at the 3’ LTR. In 7q22.2, reads in all samples aligned to the 5’ LTR but ceased before progressing through the viral genome suggesting that either the 5’ polyA site was being utilized or epigenetic modifications silenced expression. Another important possibility is that this expression phenomenon of all reads aligning in the LTR (either 5’ or 3’) is associated with multi-mapped reads to an unannotated HERV-K LTR that is not included in our annotation file. While we controlled for multi-mapped reads in our bioinformatic pipeline, our analysis is dependent on ‘known’ HERV-Ks included in NCBI. This particular aspect of the analysis highlights the importance of detailed alignment. For example, 3q27.2 has an open reading frame for viral proteins Gag and Pol. However, in the case of this provirus, all the reads align in the 3’ LTR, and thus no proteins will be produced. It is important to note that transcription from the 5’LTR or 3’LTR of HERV-K proviruses may still be of biological significance as proviral enhancers can affect transcription of cellular genes up to 100,000 base pairs upstream or downstream of the HERV-K provirus [59]. Similar to the findings of our current investigation in Hepatoblastoma, increased HERV-K expression has been associated with multiple solid organ malignancies when compared to normal tissue-matched samples-most notably with breast cancer [60], melanoma [61] and germ cell tumors [62]. More recent preliminary studies in hepatocellular carcinoma [63] and pancreatic adenocarcinoma [64] have also found increased HERV-K mRNA levels in cancer as compared to normal tissues. In all of these tumors, significant debate remains as to whether HERV-K expression acts as a disease driver or is simply a result of global epigenetic changes in the cancer cell [65]. Though causality remains unclear, there is clear evidence that HERV-K expression can have significant effects on the transcriptome of the cancer cell through insertional position near, for example, proto-oncogenes, or through acting as alternative promoters, polyadenylation signals or alternative splice sites within introns of human genes [66-68]. HERV-K-*env* expression has specifically been linked to perturbations in both the p53 and RAS signaling pathways in breast cancer [69], while in pancreatic adenocarcinoma, RNA-seq analysis following sh-RNA knock-down of HERV-K-*env* showed decreased expression of genes on the RAS-ERK-RSK pathway [64]. Equally important, HERV-K expression has been proposed as an important potential biomarkers for all the above diseases [15, 16], due in part to the very low expression levels in corresponding normal tissue. In breast cancer, HERV-K expression was found to effectively differentiate basal cell carcinoma from other breast cancer subtypes [34]; a clinically important finding given the aggressive nature of this tumor. Like many pediatric solid organ tumors, the next steps in hepatoblastoma therapy involves finding appropriate molecular targets for immunotherapy, as well as the ability to molecularly stratify clinically aggressive tumors. This has led multiple groups to explore the potential role of both microRNAs and long non-coding RNA (lncRNAs) in hepatoblastoma [70-72]. However, the field remains limited by a lack of preserved tissue, including normal patient-matched tissue controls that would enable a more extensive molecular analysis of these rare but important childhood tumors. ### 4.2 Importance of RNA-seq Analysis Pipelines for HERV-K Expression and its Limitations With the advancement in massive parallel sequencing technologies, which have now generated numerous RNA-seq datasets of multiple tumor types, bioinformatic techniques that allow analysis of these powerful datasets are becoming increasingly necessary [73]. HERV-K proviruses pose a difficult challenge for computational analysis of RNA-seq data as these regions are poorly annotated and constitute repeat elements making unique alignment of RNA-seq reads difficult [49]. As a result of these limitations, the vast majority of investigations into HERV-K RNA expression in cancer have not been able to localize expressed RNAs to individual HERV-K proviral loci. This has fundamentally limited the ability to discover how these proviruses may affect cell biology and identify potential translated HERV-K proteins that may also affect cell biology, as well as present unique targets for immunotherapy. Nevertheless, given the abundance of HERV sequences in the human genome, a complete understanding of human cancer biology will undoubtedly require elucidating the effects that these entities have on the cellular machinery. The methodology presented in this investigation allows for simple analysis and screening of RNA-seq datasets for HERV-K expression. Furthermore, methods used to quantify read counts across specific proviral loci allowed for a fuller description of the HERV-K expression, in addition to simple detection of the presence or absence of HERV-K mRNA. In our analysis, we used the Salmon alignment tool in transcriptome alignment mode (pseudoalignment). As Salmon identifies unique k-mers across the transcriptome, we were thus able to effectively distinguish unique k-mers across the different HERV-Ks and could assign uniquely aligned reads to individual proviruses. The alignment tool then allocated multi-map reads to proviral locations that had at least one unique alignment across the provirus. The limitation of the Salmon pseudoalignment analysis was that we could not determine where reads were aligning across the provirus. As the proviruses themselves have complex transcriptomes, some with large insertions or deletions of nucleotides (i.e. indels), this was necessary to infer the potential proviral mRNAs and proteins that could be produced. The use of a standard RNA-seq alignment tool HiSAT2 accomplished this task, though we still had to control for multi-mapped reads which was accomplished with Samtools. With this step, we lost ∼20-25% of the total reads that aligned to HERV-K loci. While this was acceptable for mapping reads across different regions of the proviruses, it confounded effective quantification. Furthermore, multi-map reads are more likely to occur with more recently integrated proviruses that are more complete [49]. This is unfortunate as these proviruses are also the most likely to produce viral proteins and therefore have the largest potential effect on the cell. ### 4.3 Overall Limitations of this Study There are several limitations in the dataset that we used in this analysis. The use of non-patient matched liver control tissue prevented a more thorough analysis of differential expression between tumors and normal liver, given that HERV-Ks remain polymorphic in the human population. A lack of aged-matched and patient-matched normal tissue controls is a common issue with current RNA-seq analysis studies of fetal solid organ malignancies. Furthermore, the small dataset did not allow the ability to correlate HERV-K transcription with clinical outcomes, including stage of disease, reoccurrence, disease resistance and over-all survival. It also limited the potential to correlate HERV-K expression with differentiation/histologic subtypes of hepatoblastoma including fetal, embryonal and undifferentiated disease. Lastly, from a bioinformatics perspective, a future annotation of all viral mRNAs-including Gag, Env, Rec and/or Np9 that can be produced from each individual provirus, will allow for direct screening of viral mRNAs that have the potential to encode for viral proteins. ## 5. Conclusion The current investigation demonstrates that several Human Endogenous Retrovirus-K proviruses are transcribed in hepatoblastoma, with increased RNA expression from several proviral loci in hepatoblastoma as compared to normal liver controls. The large difference in HERV-K expression profiles between hepatoblastoma and normal liver sheds light on several important questions that make HERV-K studies important. Future investigations are required both to explore HERV-K expression as a tool for molecular disease stratification, as well as for targeted immunotherapy. This study also highlighted the complex nature of HERV-K RNA expression in hepatoblastoma. Hepatoblastoma samples demonstrated large differences in overall expression profiles of HERV-K, both in total aligned reads to HERV-K proviruses and in the total number of expressed proviruses. Finally, our study highlights the important need to continue to develop tumor banks for pediatric solid organ tumors that include patient matched tissue controls for appropriate molecular comparison. ## Data Availability Data is publicly available. Supplemental File 2- Significant Differential Gene Expression Analysis (Pseudoalignment with Salmon, Differential Gene Expression with DESeq2) of High HERV-K Expression Hepatoblastoma Samples compared to Low HERV-K Expression Hepatoblastoma Samples (top 100 genes ranked by log2fc). View this table: [Table4](http://medrxiv.org/content/early/2020/03/03/2020.02.28.20026260/T4) Supplemental Figure 3. Gene Ontology (GO) terms for cellular localization (Supplemental Table 1) and molecular function (Supplemental Table 2) as well as enriched Kyoto Encyclopedia Genes and Genomes (KEGG) (Table 3) terms for significantly differentially expressed genes between high HERV-K expressing hepatoblastoma samples and low HERV-K expressing hepatoblastoma samples. View this table: [Supplemental Table 1:](http://medrxiv.org/content/early/2020/03/03/2020.02.28.20026260/T5) Supplemental Table 1: Gene Ontology (GO) molecular function analysis following differential gene expression analysis of high HERV-K expressing Hepatoblastoma vs low HERVK expressing Hepatoblastoma View this table: [Supplemental Table 2:](http://medrxiv.org/content/early/2020/03/03/2020.02.28.20026260/T6) Supplemental Table 2: Gene Ontology (GO) cellular localization analysis following differential gene expression analysis of high HERV-K expressing Hepatoblastoma vs low HERVK expressing Hepatoblastoma (Top 20 terms) View this table: [Supplemental Table 3:](http://medrxiv.org/content/early/2020/03/03/2020.02.28.20026260/T7) Supplemental Table 3: Kyoto Encyclopedia of Genes and Genomes Enrichment Analysis following differential gene expression analysis of high HERV-K expressing Hepatoblastoma vs low HERVK expressing Hepatoblastoma. ## Acknowledgements The authors kindly acknowledge Dr. Rakesh Sindhi of UPMC, who made the Hepatoblastoma RNA-seq dataset available for this analysis. Partial salary support for MLH and DR was provided by the Charles H. Ross Jr and Myles H. Thaler Professorship endowments at the University of Virginia. This work was supported by The National Cancer Institute of the National Institutes of Health (Grant numbers: T32 CA163177 and R01 CA206275). ## Footnotes * **Disclosures:** The authors report no proprietary or commercial interest in any product mentioned or concept discussed in this article. * **Funding:** This work was supported by The National Cancer Institute of the National Institutes of Health (Grant number: T32 CA163177 and R01 CA206275). ## Abbreviations (HERV-K) : Human Endogenous Retrovirus-K (HB) : Hepatoblastoma (FT) : Fetal Tumor (NC) : Normal Control * Received February 28, 2020. * Revision received February 28, 2020. * Accepted March 3, 2020. * © 2020, Posted by Cold Spring Harbor Laboratory This pre-print is available under a Creative Commons License (Attribution-NonCommercial-NoDerivs 4.0 International), CC BY-NC-ND 4.0, as described at [http://creativecommons.org/licenses/by-nc-nd/4.0/](http://creativecommons.org/licenses/by-nc-nd/4.0/) ## References 1. [1].Ranganathan S, Lopez-Terrada D, Alaggio R. Hepatoblastoma and Pediatric Hepatocellular Carcinoma: An Update. Pediatric and developmental pathology : the official journal of the Society for Pediatric Pathology and the Paediatric Pathology Society 2019:doi:1093526619875228. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=1093526619875228&link_type=DOI) 2. [2].Kremer N, Walther AE, Tiao GM. Management of hepatoblastoma: an update. Current opinion in pediatrics 2014;26(3):362–9. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1097/MOP.0000000000000081&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=24759227&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) 3. [3].Mavila N, Thundimadathil J. The Emerging Roles of Cancer Stem Cells and Wnt/Beta-Catenin Signaling in Hepatoblastoma. Cancers 2019;11(10). 4. [4].Wu JF, Chang HH, Lu MY, Jou ST, Chang KC, Ni YH, et al. Prognostic roles of pathology markers immunoexpression and clinical parameters in Hepatoblastoma. Journal of biomedical science 2017;24(1):62. 5. [5].Ruck P, Xiao JC, Pietsch T, Von Schweinitz D, Kaiserling E. Hepatic stem-like cells in hepatoblastoma: expression of cytokeratin 7, albumin and oval cell associated antigens detected by OV-1 and OV-6. Histopathology 1997;31(4):324–9. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1046/j.1365-2559.1997.2750870.x&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=9363447&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=A1997YB73400005&link_type=ISI) 6. [6].Czauderna P, Lopez-Terrada D, Hiyama E, Haberle B, Malogolowkin MH, Meyers RL. Hepatoblastoma state of the art: pathology, genetics, risk stratification, and chemotherapy. Current opinion in pediatrics 2014;26(1):19–28. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1097/MOP.0000000000000046&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=24322718&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) 7. [7].Meyers RL, Tiao G, de Ville de Goyet J, Superina R, Aronson DC. Hepatoblastoma state of the art: pre-treatment extent of disease, surgical resection guidelines and the role of liver transplantation. Current opinion in pediatrics 2014;26(1):29–36. [PubMed](http://medrxiv.org/lookup/external-ref?access_num=24362406&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) 8. [8].Feng J, He Y, Wei L, Chen D, Yang H, Tan R, et al. Assessment of Survival of Pediatric Patients With Hepatoblastoma Who Received Chemotherapy Following Liver Transplant or Liver Resection. JAMA network open 2019;2(10):e1912676. 9. [9].Carceller A, Blanchard H, Champagne J, St-Vil D, Bensoussan AL. Surgical resection and chemotherapy improve survival rate for patients with hepatoblastoma. Journal of pediatric surgery 2001;36(5):755–9. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1053/jpsu.2001.22953&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=11329582&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000168439500019&link_type=ISI) 10. [10].Ilmer M, Garnier A, Vykoukal J, Alt E, von Schweinitz D, Kappler R, et al. Targeting the Neurokinin-1 Receptor Compromises Canonical Wnt Signaling in Hepatoblastoma. Molecular cancer therapeutics 2015;14(12):2712–21. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6MTA6Im1vbGNhbnRoZXIiO3M6NToicmVzaWQiO3M6MTA6IjE0LzEyLzI3MTIiO3M6NDoiYXRvbSI7czo1MDoiL21lZHJ4aXYvZWFybHkvMjAyMC8wMy8wMy8yMDIwLjAyLjI4LjIwMDI2MjYwLmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 11. [11].Indersie E, Lesjean S, Hooks KB, Sagliocco F, Ernault T, Cairo S, et al. MicroRNA therapy inhibits hepatoblastoma growth in vivo by targeting beta-catenin and Wnt signaling. Hepatology communications 2017;1(2):168–83. 12. [12].Lee H, El Jabbour T, Ainechi S, Gay LM, Elvin JA, Vergilio JA, et al. General paucity of genomic alteration and low tumor mutation burden in refractory and metastatic hepatoblastoma: comprehensive genomic profiling study. Human pathology 2017;70:84–91. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.humpath.2017.10.007&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=29079173&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) 13. [13].Smith CC, Beckermann KE, Bortone DS, De Cubas AA, Bixby LM, Lee SJ, et al. Endogenous retroviral signatures predict immunotherapy response in clear cell renal cell carcinoma. The Journal of clinical investigation 2018;128(11):4804–20. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1172/jci121476&link_type=DOI) 14. [14].Wang-Johanning F, Rycaj K, Plummer JB, Li M, Yin B, Frerich K, et al. Immunotherapeutic potential of anti-human endogenous retrovirus-K envelope protein antibodies in targeting breast tumors. Journal of the National Cancer Institute 2012;104(3):189–210. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/jnci/djr540&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=22247020&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000300343000009&link_type=ISI) 15. [15].Wang-Johanning F, Li M, Esteva FJ, Hess KR, Yin B, Rycaj K, et al. Human endogenous retrovirus type K antibodies and mRNA as serum biomarkers of early-stage breast cancer. International journal of cancer 2014;134(3):587–95. 16. [16].Hahn S, Ugurel S, Hanschmann KM, Strobel H, Tondera C, Schadendorf D, et al. Serological response to human endogenous retrovirus K in melanoma patients correlates with survival probability. AIDS research and human retroviruses 2008;24(5):717–23. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1089/aid.2007.0286&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=18462078&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000256519300007&link_type=ISI) 17. [17].Barbulescu M, Turner G, Seaman MI, Deinard AS, Kidd KK, Lenz J. Many human endogenous retrovirus K (HERV-K) proviruses are unique to humans. Current biology : CB 1999;9(16):861–8. 18. [18].Rote NS, Chakrabarti S, Stetzer BP. The role of human endogenous retroviruses in trophoblast differentiation and placental development. Placenta 2004;25(8-9):673–83. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.placenta.2004.02.008&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=15450384&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000223898300001&link_type=ISI) 19. [19].Samuelson LC, Wiebauer K, Snow CM, Meisler MH. Retroviral and pseudogene insertion sites reveal the lineage of human salivary and pancreatic amylase genes from a single gene during primate evolution. Molecular and cellular biology 1990;10(6):2513–20. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6MzoibWNiIjtzOjU6InJlc2lkIjtzOjk6IjEwLzYvMjUxMyI7czo0OiJhdG9tIjtzOjUwOiIvbWVkcnhpdi9lYXJseS8yMDIwLzAzLzAzLzIwMjAuMDIuMjguMjAwMjYyNjAuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9) 20. [20].Meyer TJ, Rosenkrantz JL, Carbone L, Chavez SL. Endogenous Retroviruses: With Us and against Us. Frontiers in chemistry 2017;5:23. 21. [21].Garcia-Montojo M, Doucet-O’Hare T, Henderson L, Nath A. Human endogenous retrovirus-K (HML-2): a comprehensive review. Critical reviews in microbiology 2018;44(6):715–38. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1080/1040841X.2018.1501345&link_type=DOI) 22. [22].Grandi N, Tramontano E. HERV Envelope Proteins: Physiological Role and Pathogenic Potential in Cancer and Autoimmunity. Frontiers in microbiology 2018;9:462. 23. [23].Mayer J, Blomberg J, Seal RL. A revised nomenclature for transcribed human endogenous retroviral loci. Mobile DNA 2011;2(1):7. 24. [24].Sverdlov ED. Retroviruses and primate evolution. BioEssays : news and reviews in molecular, cellular and developmental biology 2000;22(2):161–71. 25. [25].Subramanian RP, Wildschutte JH, Russo C, Coffin JM. Identification, characterization, and comparative genomic distribution of the HERV-K (HML-2) group of human endogenous retroviruses. Retrovirology 2011;8:90. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1186/1742-4690-8-90&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=22067224&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) 26. [26].Schmitt K, Reichrath J, Roesch A, Meese E, Mayer J. Transcriptional profiling of human endogenous retrovirus group HERV-K(HML-2) loci in melanoma. Genome biology and evolution 2013;5(2):307–28. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/gbe/evt010&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=23338945&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) 27. [27].Kury P, Nath A, Creange A, Dolei A, Marche P, Gold J, et al. Human Endogenous Retroviruses in Neurological Diseases. Trends in molecular medicine 2018;24(4):379–94. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.molmed.2018.02.007&link_type=DOI) 28. [28].Nakagawa K, Harrison LC. The potential roles of endogenous retroviruses in autoimmunity. Immunological reviews 1996;152:193–236. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1111/j.1600-065X.1996.tb00917.x&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=8930674&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=A1996VR18000010&link_type=ISI) 29. [29].Bergallo M, Galliano I, Montanari P, Gambarino S, Mareschi K, Ferro F, et al. CMV induces HERV-K and HERV-W expression in kidney transplant recipients. Journal of clinical virology : the official publication of the Pan American Society for Clinical Virology 2015;68:28–31. 30. [30].Zhao J, Rycaj K, Geng S, Li M, Plummer JB, Yin B, et al. Expression of Human Endogenous Retrovirus Type K Envelope Protein is a Novel Candidate Prognostic Marker for Human Breast Cancer. Genes & cancer 2011;2(9):914–22. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1177/1947601911431841&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=22593804&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) 31. [31].Muster T, Waltenberger A, Grassauer A, Hirschl S, Caucig P, Romirer I, et al. An endogenous retrovirus derived from human melanoma cells. Cancer research 2003;63(24):8735–41. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NjoiY2FucmVzIjtzOjU6InJlc2lkIjtzOjEwOiI2My8yNC84NzM1IjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjAvMDMvMDMvMjAyMC4wMi4yOC4yMDAyNjI2MC5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 32. [32].Grow EJ, Flynn RA, Chavez SL, Bayless NL, Wossidlo M, Wesche DJ, et al. Intrinsic retroviral reactivation in human preimplantation embryos and pluripotent cells. Nature 2015;522(7555):221–5. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1038/nature14308&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=25896322&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) 33. [33].Bergallo M, Galliano I, Pirra A, Dapra V, Licciardi F, Montanari P, et al. Transcriptional activity of human endogenous retroviruses is higher at birth in inversed correlation with gestational age. Infection, genetics and evolution : journal of molecular epidemiology and evolutionary genetics in infectious diseases 2019;68:273–9. 34. [34].Johanning GL, Malouf GG, Zheng X, Esteva FJ, Weinstein JN, Wang-Johanning F, et al. Expression of human endogenous retrovirus-K is strongly associated with the basal-like breast cancer phenotype. Scientific reports 2017;7:41960. 35. [35].Ma W, Hong Z, Liu H, Chen X, Ding L, Liu Z, et al. Human Endogenous retroviruses-k (HML-2) expression is correlated with prognosis and progress of hepatocellular carcinoma. BioMed Research International 2016;2016((Ding L.; Zhou F.) Department of Clinical Hematology, Zhongnan Hospital, Wuhan University, Wuhan, China). 36. [36].Gonzalez-Cao M, Iduma P, Karachaliou N, Santarpia M, Blanco J, Rosell R. Human endogenous retroviruses and cancer. Cancer biology & medicine 2016;13(4):483–8. 37. [37].Grabski DF, Hu Y, Sharma M, Rasmussen SK. Close to the Bedside: A Systematic Review of Endogenous Retroviruses and Their Impact in Oncology. The Journal of surgical research 2019;240:145–55. 38. [38].Chiappinelli KB, Strissel PL, Desrichard A, Li H, Henke C, Akman B, et al. Inhibiting DNA Methylation Causes an Interferon Response in Cancer via dsRNA Including Endogenous Retroviruses. Cell 2015;162(5):974–86. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.cell.2015.07.011&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=26317466&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) 39. [39].Roulois D, Loo Yau H, Singhania R, Wang Y, Danesh A, Shen SY, et al. DNA-Demethylating Agents Target Colorectal Cancer Cells by Inducing Viral Mimicry by Endogenous Transcripts. Cell 2015;162(5):961–73. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.cell.2015.07.056&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=26317465&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) 40. [40].Kershaw MH, Hsu C, Mondesire W, Parker LL, Wang G, Overwijk WW, et al. Immunization against endogenous retroviral tumor-associated antigens. Cancer research 2001;61(21):7920–4. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NjoiY2FucmVzIjtzOjU6InJlc2lkIjtzOjEwOiI2MS8yMS83OTIwIjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjAvMDMvMDMvMjAyMC4wMi4yOC4yMDAyNjI2MC5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 41. [41].Sacha JB, Kim IJ, Chen L, Ullah JH, Goodwin DA, Simmons HA, et al. Vaccination with cancer- and HIV infection-associated endogenous retrotransposable elements is safe and immunogenic. Journal of immunology (Baltimore, Md : 1950) 2012;189(3):1467–79. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6ODoiamltbXVub2wiO3M6NToicmVzaWQiO3M6MTA6IjE4OS8zLzE0NjciO3M6NDoiYXRvbSI7czo1MDoiL21lZHJ4aXYvZWFybHkvMjAyMC8wMy8wMy8yMDIwLjAyLjI4LjIwMDI2MjYwLmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 42. [42].Wang-Johanning F, Radvanyi L, Rycaj K, Plummer JB, Yan P, Sastry KJ, et al. Human endogenous retrovirus K triggers an antigen-specific immune response in breast cancer patients. Cancer research 2008;68(14):5869–77. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NjoiY2FucmVzIjtzOjU6InJlc2lkIjtzOjEwOiI2OC8xNC81ODY5IjtzOjQ6ImF0b20iO3M6NTA6Ii9tZWRyeGl2L2Vhcmx5LzIwMjAvMDMvMDMvMjAyMC4wMi4yOC4yMDAyNjI2MC5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 43. [43].Zhou F, Krishnamurthy J, Wei Y, Li M, Hunt K, Johanning GL, et al. Chimeric antigen receptor T cells targeting HERV-K inhibit breast cancer and its metastasis through downregulation of Ras. Oncoimmunology 2015;4(11):e1047582–e. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1080/2162402X.2015.1047582&link_type=DOI) 44. [44].Krishnamurthy J, Rabinovich BA, Mi T, Switzer KC, Olivares S, Maiti SN, et al. Genetic engineering of T cells to target HERV-K, an ancient retrovirus on melanoma. Clinical Cancer Research 2015;21(14):3241–51. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6MTA6ImNsaW5jYW5yZXMiO3M6NToicmVzaWQiO3M6MTA6IjIxLzE0LzMyNDEiO3M6NDoiYXRvbSI7czo1MDoiL21lZHJ4aXYvZWFybHkvMjAyMC8wMy8wMy8yMDIwLjAyLjI4LjIwMDI2MjYwLmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 45. [45].Ranganathan S, Ningappa M, Ashokkumar C, Higgs BW, Min J, Sun Q, et al. Loss of EGFR-ASAP1 signaling in metastatic and unresectable hepatoblastoma. Scientific reports 2016;6:38347. 46. [46].Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics (Oxford, England) 2014;30(15):2114–20. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/bioinformatics/btu170&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=24695404&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000340049100004&link_type=ISI) 47. [47].Patro R, Duggal G, Love MI, Irizarry RA, Kingsford C. Salmon provides fast and bias-aware quantification of transcript expression. Nature methods 2017;14(4):417–9. 48. [48].Pertea M, Kim D, Pertea GM, Leek JT, Salzberg SL. Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Nature protocols 2016;11(9):1650–67. 49. [49].Bhardwaj N, Montesion M, Roy F, Coffin JM. Differential expression of HERV-K (HML-2) proviruses in cells and virions of the teratocarcinoma cell line Tera-1. Viruses 2015;7(3):939–68. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.3390/v7030939&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=25746218&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) 50. [50].Soneson C, Love MI, Robinson MD. Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences. F1000Research 2015;4:1521. 51. [51].Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome biology 2014;15(12):550. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1186/s13059-014-0550-8&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=25516281&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) 52. [52].Jornsten R, Wang HY, Welsh WJ, Ouyang M. DNA microarray data imputation and significance analysis of differential expression. Bioinformatics (Oxford, England) 2005;21(22):4155–61. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/bioinformatics/bti638&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=16118262&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000233218100012&link_type=ISI) 53. [53].Yu G, Wang LG, Han Y, He QY. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 2012;16(5):284–7. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1089/omi.2011.0118&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=22455463&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000303653300007&link_type=ISI) 54. [54].Wickham H. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York; 2016. 55. [55].Blighe K RS, Lewis M. EnhancedVolcano: Publication-ready volcano plots with enhanced colouring and labeling. R package version 1.4.0, [https://github.com/kevinblighe/EnhancedVolcano](https://github.com/kevinblighe/EnhancedVolcano); 2019. 56. [56].Flockerzi A, Ruggieri A, Frank O, Sauter M, Maldener E, Kopper B, et al. Expression patterns of transcribed human endogenous retrovirus HERV-K(HML-2) loci in human tissues and the need for a HERV Transcriptome Project. BMC genomics 2008;9:354. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1186/1471-2164-9-354&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=18664271&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) 57. [57].Rooney MS, Shukla SA, Wu CJ, Getz G, Hacohen N. Molecular and genetic properties of tumors associated with local immune cytolytic activity. Cell 2015;160(1-2):48–61. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.cell.2014.12.033&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=25594174&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000347923200008&link_type=ISI) 58. [58].Montesion M, Bhardwaj N, Williams ZH, Kuperwasser C, Coffin JM. Mechanisms of HERV-K (HML-2) Transcription during Human Mammary Epithelial Cell Transformation. Journal of virology 2018;92(1). 59. [59].Jern P, Coffin JM. Effects of retroviruses on host genome function. Annual review of genetics 2008;42:709–32. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1146/annurev.genet.42.110807.091501&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=18694346&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000261767000029&link_type=ISI) 60. [60].Wang-Johanning F, Frost AR, Johanning GL, Khazaeli MB, LoBuglio AF, Shaw DR, et al. Expression of human endogenous retrovirus k envelope transcripts in human breast cancer. Clinical cancer research : an official journal of the American Association for Cancer Research 2001;7(6):1553–60. 61. [61].Humer J, Waltenberger A, Grassauer A, Kurz M, Valencak J, Rapberger R, et al. Identification of a melanoma marker derived from melanoma-associated endogenous retroviruses. Cancer research 2006;66(3):1658–63. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NjoiY2FucmVzIjtzOjU6InJlc2lkIjtzOjk6IjY2LzMvMTY1OCI7czo0OiJhdG9tIjtzOjUwOiIvbWVkcnhpdi9lYXJseS8yMDIwLzAzLzAzLzIwMjAuMDIuMjguMjAwMjYyNjAuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9) 62. [62].Kleiman A, Senyuta N, Tryakin A, Sauter M, Karseladze A, Tjulandin S, et al. HERV-K(HML-2) GAG/ENV antibodies as indicator for therapy effect in patients with germ cell tumors. International journal of cancerJournal international du cancer 2004;110(3):459–61. 63. [63].Ma W, Hong Z, Liu H, Chen X, Ding L, Liu Z, et al. Human Endogenous Retroviruses-K (HML-2) Expression Is Correlated with Prognosis and Progress of Hepatocellular Carcinoma. Biomed Res Int 2016;2016:8201642. 64. [64].Li M, Radvanyi L, Yin B, Li J, Chivukula R, Lin K, et al. Downregulation of Human Endogenous Retrovirus Type K (HERV-K) Viral env RNA in Pancreatic Cancer Cells Decreases Cell Proliferation and Tumor Growth. Clin Cancer Res 2017;23(19):5892–911. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6MTA6ImNsaW5jYW5yZXMiO3M6NToicmVzaWQiO3M6MTA6IjIzLzE5LzU4OTIiO3M6NDoiYXRvbSI7czo1MDoiL21lZHJ4aXYvZWFybHkvMjAyMC8wMy8wMy8yMDIwLjAyLjI4LjIwMDI2MjYwLmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 65. [65].Kassiotis G, Stoye JP. Making a virtue of necessity: the pleiotropic role of human endogenous retroviruses in cancer. Philosophical transactions of the Royal Society of London Series B, Biological sciences 2017;372(1732). 66. [66].Rebollo R, Romanish MT, Mager DL. Transposable elements: an abundant and natural source of regulatory sequences for host genes. Annual review of genetics 2012;46:21–42. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1146/annurev-genet-110711-155621&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=22905872&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000311568300002&link_type=ISI) 67. [67].Schulte AM, Lai S, Kurtz A, Czubayko F, Riegel AT, Wellstein A. Human trophoblast and choriocarcinoma expression of the growth factor pleiotrophin attributable to germ-line insertion of an endogenous retrovirus. Proceedings of the National Academy of Sciences of the United States of America 1996;93(25):14759–64. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NDoicG5hcyI7czo1OiJyZXNpZCI7czoxMToiOTMvMjUvMTQ3NTkiO3M6NDoiYXRvbSI7czo1MDoiL21lZHJ4aXYvZWFybHkvMjAyMC8wMy8wMy8yMDIwLjAyLjI4LjIwMDI2MjYwLmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 68. [68].Cohen CJ, Lock WM, Mager DL. Endogenous retroviral LTRs as promoters for human genes: a critical assessment. Gene 2009;448(2):105–14. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.gene.2009.06.020&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=19577618&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000271972200002&link_type=ISI) 69. [69].Zhou F, Li M, Wei Y, Lin K, Lu Y, Shen J, et al. Activation of HERV-K Env protein is essential for tumorigenesis and metastasis of breast cancer cells. Oncotarget 2016;7(51):84093–117. 70. [70].Wu JF, Ho MC, Ni YH, Hsu HY, Lee PH, Chang MH. Dysregulation of liver developmental microRNA contribute to hepatic carcinogenesis. Journal of the Formosan Medical Association = Taiwan yi zhi 2019. 71. [71].Dong R, Jia D, Xue P, Cui X, Li K, Zheng S, et al. Genome-wide analysis of long noncoding RNA (lncRNA) expression in hepatoblastoma tissues. PloS one 2014;9(1):e85599. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1371/journal.pone.0085599&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=24465615&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F03%2F03%2F2020.02.28.20026260.atom) 72. [72].Luo Z, Cao P. Long noncoding RNA PVT1 promotes hepatoblastoma cell proliferation through activating STAT3. Cancer management and research 2019;11:8517–27. 73. [73].Parvizpour S, Razmara J, Omidi Y. Breast cancer vaccination comes to age: impacts of bioinformatics. BioImpacts : BI 2018;8(3):223–35.