Colorectal cancer progression to metastasis is associated with dynamic genome-wide biphasic 5-hydroxymethylcytosine accumulation
BMC Biology volume 23, Article number: 100 (2025)
Abstract
Background
Colorectal cancer (CRC) progression from adenoma to adenocarcinoma is associated with global reduction in 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC). DNA hypomethylation continues upon liver metastasis. Here we examine 5hmC changes upon progression to liver metastasis.
Results
5hmC is increased in metastatic liver tissue relative to the primary colon tumour and expression of TET2 and TET3 is negatively correlated with risk for metastasis in patients with CRC. Genes associated with increased 5-hydroxymethylcytosine show KEGG enrichment for adherens junctions, cytoskeleton and cell migration around a core cadherin (CDH2) network. Overall, the 5-hydroxymethylcytosine profile in the liver metastasis is similar to that of normal colon, appearing to recover at many loci where it was originally present in normal colon and then spreading to adjacent sites. The underlying sequences at the recover-and-spread regions are enriched for SALL4, ZNF770, ZNF121 and PAX5 transcription factor binding sites. Finally, we show in a zebrafish migration assay using SW480 CRISPR-engineered TET knockout and rescue cells that reduced TET expression leads to a reduced migration frequency.
Conclusions
Together these results suggest a biphasic trajectory for 5-hydroxymethylation dynamics that has bearing on potential therapeutic interventions aimed at manipulating 5-hydroxymethylcytosine levels.
Background
Colorectal cancer (CRC) is the fourth most common malignancy worldwide [1]. Most CRC deaths are due to metastatic disease, which occurs in 20% of CRC patients, with the liver being the most common site of metastasis (70% of all cases) [2]. DNA methylation changes have emerged as a key driver of metastasis and may explain the organ-specific tropism exhibited by many cancers [3,4,5]. In CRC, changes in DNA methylation occur alongside genetic alterations during tumorigenesis. These include genome-wide hypomethylation and hypermethylation at specific gene promoters (reviewed [6]). Patient stratification on the basis of epigenetic signatures such as CpG island methylator phenotype (CIMP) [7, 8] and other methylation biomarkers (VIM, SFRP2 and SEPT9 [9]) is becoming a promising therapeutic avenue for targeting CRC [10].
Methylated DNA can be oxidised to form 5-hydroxymethylcytosine (5hmC) that can be further oxidised to 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC) through the action of the ten-eleven-translocation (TET) dioxygenases known as TET1, TET2 and TET3 [11, 12]. These oxidised modifications of 5-methylcytosine are intermediates of active cytosine demethylation and targets for base excision repair. However, 5hmC and 5fC have been established as stable epigenetic marks that can be maintained during cell division [13, 14]. Genome-wide sequencing of 5hmC in various mammalian tissues supports its role as an active epigenetic marker at enhancers, gene bodies and promoters [15, 16]. Using deep neural network models incorporating gene expression and chromatin conformation as well as 5hmC profiles, predictive models of gene expression and putative regulatory regions have recently been developed based on 5hmC signals [17].
Reduced levels of 5hmC have been observed in many cancers in humans [18,19,20,21] and mouse cancer models [22, 23]. During normal development, 5hmC and the TETs have a role in cancer-relevant processes including differentiation and stem cell regulation [24, 25]. In a series of CRC patients, we observed that 5hmC levels are globally depleted in adenomas and adenocarcinomas, compared to normal colon, despite the presence of steady state TET mRNA transcripts at all cancer stages [21]. In normal colon, differentiated colonocytes had high levels of 5hmC, whereas stem cells in the crypts had low levels of 5hmC, similar to what we see in the colon tumours [21, 22]. hMeDIP sequencing of normal colon tissue showed that promoters enriched for 5hmC were less likely to become hypermethylated in cancer, despite the loss of 5hmC at these loci in colon tumours [21, 26, 27].
Here we report that the metastatic tumours have higher levels of 5hmC than the primary CRC tumours and that the genome-wide 5hmC profiles are similar to normal colon, suggesting that 5hmC is recovered at specific sites during metastasis. We note that at several sites where 5hmC has been recovered, the adjacent CpGs are also hydroxymethylated. The loci where 5hmC recovered and spread were enriched for binding sites of the zinc finger transcription factors SALL4, PAX5, ZNF770 and ZNF121. Integration of 5hmC data with published RNA-seq data [26] showed an enrichment for genes involved in adherens junction and cell migration. Finally, we established that the TETs are important for tumour cell migration using CRISPR-Cas9 mediated triple TET knockout of SW480 cells in a zebrafish xenograft assay.
Results
5hmC levels are increased in liver metastasis tissues compared to primary colon tumours in colon cancer patients
We previously reported reduced 5hmC levels in adenomas and adenocarcinomas [21]. This same cohort included patients (n = 10) who, at first diagnosis and prior to chemotherapy, had liver metastases (Table 1). We examined the global 5hmC levels in the liver metastases for comparison to primary tumours in these patients using mass spectrometry and observed that although the global 5hmC levels in liver metastasis samples were reduced compared to normal colon tissue, they were significantly increased compared to the primary carcinoma tissue (Fig. 1A).
5hmC levels are increased in liver metastasis tissue compared to primary colon tumours in CRC. A Mass spectrometry analysis for global 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) levels in DNA from 14 normal colon (NC) tissues from CRC patients; 15 tumours in colon (TC) and 10 liver metastasis tumours (LT), showing global demethylation with CRC progression for 5mC. By contrast, global 5hmC in metastasis increases compared to the primary cancers. Boxplots depict median and interquartile ranges and standard deviation. ***P < 0.005. Supporting data values are in Additional file 6. B RNA-seq data downloaded [28], showing normalised read counts for TET transcripts in normal colon (NC), tumour colon (TC) and liver metastasis tumours (LT). Error bars represent standard deviation around the mean. C Kaplan-Meier plots for time (years) to metastasis generated from the Loboda Yeatman study of colon cancer [29, 30], dividing patients into high or low risk for metastasis depending on TET2/TET3 expression. Concordance index = 57.87, log-rank equal curves P = 0.1093, R^2 = 0.166/0.994. Risk groups hazard ratio = 1.83 (conf. int 0.86–3.86). D TET2 and TET3 expression in these high and low metastasis risk groups. Patients with high risk of metastasis have higher expression of TET2 (P = 0.000333); no significant difference for TET3 expression (P = 0.965127)
In our patient cohort, the absolute levels of TET transcripts were previously measured by standard curve method in adenomas and adenocarcinomas. Despite the reduction in 5hmC at these stages, we found all three TETs to have the same mean expression levels as the matched normal tissue, with TET2 and TET3 being the most abundant, although there was considerable variation around the mean [21]. No mutations in isocitrate dehydrogenase 1/2 (IDH1/2) or other genes encoding enzymes known to generate metabolites that inhibit TET catalytic activity were found [21]. We do not have RNA from our metastasis samples, but a publicly available RNAseq dataset for colon cancer with matched liver metastasis [26, 28] confirmed TET2 and TET3 to be the more abundant TET transcripts in colon and liver metastasis. Unlike our samples, in this dataset, TET3 expression was significantly increased in colon and liver tumours compared to the normal colon tissue (Fig. 1B). Kaplan–Meier plots generated using the SurvExpress tool and colon cancer expression databases (GSE28722) [29, 30] indicated that ‘low-TET2/3’ patients have a better metastasis-free long-term survival (> 5 years) than ‘high-TET2/3’ patients (Fig. 1C, concordance index = 57.87, log-rank equal curves P = 0.1093, R^2 = 0.166/0.994). High risk of metastasis was associated with significantly higher TET2 expression (P = 0.0003), while the difference for TET3 expression between high and low risk patients was not statistically significant (Fig. 1D). The overall poor prognosis associated with higher TET expression and the increase in overall 5hmC levels seen in the metastatic samples are counterintuitive to the perception that reduced 5hmC is a hallmark of cancer. We therefore set out to identify the genes that contain 5hmC in the metastatic tumours. We had metastasis samples and matched non-tumour liver tissue for 5 patients in whom we had previously profiled 5hmC in normal and primary carcinoma tissues, using hMeDIP-seq.
The same hMeDIP-seq protocol was used to examine 5hmC in the liver metastasis for direct comparison to the primary tumours in these patients. Important questions that we aimed to explore were whether the 5hmC profile in liver metastasis reflects the colon tissue origin of the primary tumour or the metastatic niche and whether the gain of 5hmC marked metastatic progression.
5hmC profiles in liver metastasis match colon profiles
Total counts of hMeDIP-seq peaks (enriched over input) for normal colon (NC), primary tumour in the colon (TC) and metastatic tumours to the liver (LT) showed more 5hmC enrichment in liver metastasis compared to primary tumours (Additional file 1: Fig. S1). In normal colon, over 30,000 5hmC peak counts overlapped with about 1% of all cytosines, and 2% of all CpG sites. These were substantially decreased in primary colon tumours (< 800 peak counts, overlapping 0.01% cytosines and 0.03% CpG sites) and comparatively increased in liver metastasis (1794 peak counts, 0.05% cytosines, 0.1% CpG sites) (Additional file 1: Fig. S1A). Normal liver (NL) had a higher amount of 5hmC compared to the tumour samples (> 20,000 peak counts, 0.5% cytosines and 1% of CpG sites, Additional file 1: Fig. S1A). The distribution of the 5hmC peaks across genomic features showed most peaks to be present at promoters, and the 5′UTR for normal colon and the liver tumours. The colon and liver metastasis tumours followed the same trend with lower overall peak counts (Additional file 1: Fig. S1B). Within genomic regions, the 5hmC peaks followed a profile of highest concentration around the transcription start site (TSS), but an absence at the TSS itself (Additional file 1: Fig. S1C). This trend was the same for all tissues, but with more noise in the tumours compared to the normal tissue, due to lower peak numbers.
Given the low amount of 5hmC in tumours, even a small amount of normal liver cells within the metastatic tumours could increase the overall levels of 5hmC. We analysed our liver metastasis tumours for tumour purity (i.e. fraction of tumour cells), using the AITAC (Accurate Inference of Tumour purity and Absolute Copy number) approach [31], using the input sequencing samples of our normal liver and metastases and found the AITAC tumour purity values to be between 0.84 and 0.86 for the liver metastasis samples (Table 1), which is considered ‘high purity’.
Principal component analysis (PCA) for 5hmC peaks in metastasis, primary tumour and normal tissues (Fig. 2A) indicated distinct clusters of 5hmC profiles for each phenotype. Normal colon and colon tumours clustered in one dimension, while the normal liver samples mostly clustered away from these, except for one outlier. The liver metastasis samples did not form a tight cluster, with three out of the five samples tending towards the colon clusters and the other two samples tending towards the normal liver cluster (Fig. 2A). The intersection of peaks is shown in an upset plot in Fig. 2B. As expected, normal liver and normal colon have the most 5hmC loci (57.9% of these are unique to liver, and 28.2% unique to colon), and the largest intersection is the 7.1% shared peaks between normal liver and colon (Fig. 2B). Considering that the set size for normal liver is 1.7 times higher than for normal colon, we would expect random overlap between normal liver and liver metastasis to be higher than an overlap with normal colon. However, even though liver metastasis tumours shared 5hmC peaks with both normal colon and with normal liver (1.7%), the intersection with normal liver only was 1.3%, compared to normal colon which was 1.1%.
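The exclusive intersections that an upset plot reports can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in this study: it assumes peaks have already been reduced to shared locus identifiers (real hMeDIP-seq peaks are intervals and need an overlap rule), and the tissue sets are toy values rather than data from this cohort.

```python
def exclusive_intersections(peak_sets):
    """Group loci by the exact combination of tissues in which they occur.

    peak_sets maps a tissue label to a set of locus identifiers.
    Returns a dict keyed by frozenset of tissue labels.
    """
    membership = {}
    for tissue, loci in peak_sets.items():
        for locus in loci:
            membership.setdefault(locus, set()).add(tissue)

    groups = {}
    for locus, tissues in membership.items():
        groups.setdefault(frozenset(tissues), set()).add(locus)
    return groups


def as_percentages(groups):
    """Express each exclusive intersection as a percentage of all distinct loci."""
    total = sum(len(loci) for loci in groups.values())
    return {key: 100.0 * len(loci) / total for key, loci in groups.items()}


# Toy locus identifiers (hypothetical, for illustration only).
toy_peaks = {
    "NC": {"a", "b", "c", "d"},
    "NL": {"c", "d", "e", "f", "g", "h"},
    "LT": {"b", "d", "h"},
}
groups = exclusive_intersections(toy_peaks)
shares = as_percentages(groups)
```

Because each locus is counted in exactly one combination, the percentages sum to 100, which is the convention that makes "unique to liver" and "shared between liver and colon" directly comparable.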
5hmC profiles in liver metastasis and colon are similar. Colour scheme: yellow normal colon tissue from CRC patients (NC); light blue primary tumour in colon (TC); dark blue metastasised to liver tumours (LT); red normal liver (NL). A Principal component analysis plot showing distinct clustering between NC (n = 9), TC (n = 8), NL (n = 5) and LT (n = 5). B Upset plot showing the overlap of 5hmC peaks in normal and tumour tissues. C Heatmap of 5hmC profiles during CRC tumour progression; first four columns depict average 5hmC levels in each of NL, LM, CT and NC. Subsequent columns depict individual patient samples grouped by tissue type. Four distinct clusters are present: group 1 5hmC presence in all samples; group 2 5hmC present in NL only; group 3 liver metastasis with similarities to both NC and NL; and group 4 recovery of 5hmC in LT matching NC. Star * highlights patient sample LT51, which looks very similar to normal liver
A heatmap analysis to examine differential 5hmC levels between samples (Fig. 2C) highlighted distinct 5hmC profiles in normal liver and colon. Colon tumours had similar clustering to normal colon, albeit fewer loci with 5hmC. The liver metastasis samples had loci that fell within 4 hierarchical groups: ‘group 1’ having high 5hmC common to normal colon and liver; ‘group 2’ low 5hmC similar to normal colon and colon tumours; ‘group 3’ reduced levels of 5hmC compared to normal tissue, but more 5hmC than colon tumours; and ‘group 4’ showing a cluster of genes that appear to have gained 5hmC in the liver metastasis samples at the same loci as in the normal colon (Fig. 2C). One exception was a patient in which the liver metastasis had a similar 5hmC profile to normal liver. This metastasis sample is the one that clustered with the normal liver in the PCA, possibly suggesting contamination from adjacent normal liver at the time of the biopsy. We repeated the PCA and heatmap analyses without this potentially contaminated sample. This resulted in a tighter cluster of liver metastasis in the PCA (Additional file 1: Fig. S1D). With or without the inclusion of this sample (Additional file 1: Fig. S1E, F), the heatmaps still showed a clear cluster of loci that have 5hmC in the liver metastasis, shared with the normal colon and to a lesser extent the colon tumours, and mostly absent in the normal liver. Interestingly, this potentially contaminated sample had a high purity AITAC score, suggesting that in addition to CNV, other parameters (tumour aneuploidy, SNPs and heterogeneity) and possibly even 5hmC profiles should be considered in these pipelines to provide a more accurate estimation of tumour purity.
A 5hmC signature in liver metastasis is associated with the activation of genes associated with adherens junctions, regulation of cytoskeleton, calcium dependent and independent cell matrix adhesion and focal adhesion
Additional file 2: Table S1 lists the genomic coordinates for consensus peaks for the different CRC cancer stages. We examined the associated genes that had 5hmC peaks within the gene body and 5 kb upstream of the TSS. KEGG pathway analysis for 1548 genes associated with consensus 5hmC peaks uniquely present in the metastatic samples showed significant enrichment for genes involved in viral and bacterial infection indicative of characteristic CRC microbial dysbiosis, as well as metastasis-related functions such as adherens junction and focal adhesion (Fig. 3A). Repeating the KEGG analysis with the genes associated with 5hmC changes between primary and metastatic tumour returned enrichment for adherens junction, regulation of cytoskeleton and focal adhesion. To determine whether these 5hmC changes constitute a metastasis signature linked with gene expression, we used RNAseq data [26, 28] and integrated 5hmC changes with differential gene expression (Additional file 3: Table S2). Figure 3B illustrates differential gene expression between the liver metastasis and the primary colon tumours, with 5hmC marked genes superimposed. In liver metastasis, 2% (365) of the 20,470 transcripts identified (the whole transcriptome) were associated with genes that have 5hmC peaks overlapping the transcription unit or promoter. Most of these (324 5hmC marked loci) were not associated with the 2891 differentially expressed genes (\(\log_{2}\text{FC} < -0.95\) or \(\log_{2}\text{FC} > 0.95\)). Out of the 934 transcripts that were downregulated, 5 (1%) were marked by 5hmC, compared to 37 (2%) of the 1957 transcripts that were upregulated. Thus, out of the 5hmC marked genes, 42 (12%) were associated with differential expression, of which 37 (88%) were upregulated and 5 (12%) were downregulated in metastasis (Additional file 4: Table S3).
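The proportions quoted above follow from the stated counts, as the short check below shows. The helper names `pct` and `is_differential` are illustrative and not part of the original analysis; the counts and the 0.95 threshold on the log2 fold change are taken from the text.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to a whole number."""
    return round(100.0 * part / whole)


def is_differential(log2_fc, threshold=0.95):
    """True when the absolute log2 fold change exceeds the threshold."""
    return abs(log2_fc) > threshold


# Counts as stated in the text.
transcripts_total = 20470
marked_transcripts = 365
down_total, up_total = 934, 1957
down_marked, up_marked = 5, 37

de_total = down_total + up_total        # all differentially expressed transcripts
de_marked = down_marked + up_marked     # differentially expressed and 5hmC marked

summary = {
    "marked_of_transcriptome": pct(marked_transcripts, transcripts_total),
    "marked_of_downregulated": pct(down_marked, down_total),
    "marked_of_upregulated": pct(up_marked, up_total),
    "de_of_marked": pct(de_marked, marked_transcripts),
    "up_of_de_marked": pct(up_marked, de_marked),
    "down_of_de_marked": pct(down_marked, de_marked),
}
```

A threshold of 0.95 on the log2 scale corresponds to a fold change of roughly 1.9 in either direction.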
The most highly expressed gene on this list was PLG (plasmin heavy chain A), which had a 28-fold increase in expression in liver metastasis compared to the primary tumour. We submitted the 37 upregulated genes marked with 5hmC to STRING analysis with k-means clustering, which returned a higher-than-expected number of interactions for genes involved in metastasis pathways including calcium dependent and independent cell matrix adhesion, wound healing and migration (Additional file 1: Fig. S2). A cluster with the most high-confidence edges included cadherin 2 (CDH2) and fibronectin (FN1) (PPI enrichment P value 9.31e-10) (Fig. 3C). These two proteins are involved in cell adhesion and migration, hence their connection in the STRING network. While they operate through different mechanisms and serve distinct functions, it is likely that they indirectly influence each other during metastatic processes and that the genes in this cluster (Table 2), which also include PLG and ESR1, constitute a 5hmC metastatic signature for colon cancer.
5hmC peaks associated with differential gene expression. A KEGG analysis of 5hmC marked genes that gain 5hmC in liver metastasis. B Volcano plot for gene expression changes after metastatic transition (data from [28]). Darker dots identify genes with at least one overlapping and significantly enriched 5hmC peak in hMeDIP-seq. C A core cadherin 2 (CDH2) and fibronectin (FN1) network of associations for genes that have increased 5hmC and upregulated expression during metastasis
It has been demonstrated that the DNA hypermethylation profile in primary colon is maintained when cells migrate to the liver [27]. We previously reported that 5hmC-marked promoters from normal colon are protected from hypermethylation in primary tumours, but that gene promoters that had gained 5hmC in colon tumours had loss of DNA methylation [21]. Since methylated DNA is a substrate for TETs, we re-examined the colon tumour methylation array data from our previous study [21] to see whether the above panel of 5hmC metastasis signature genes were methylated. CDH2 and ESR1, two of the key genes in the 5hmC metastasis signature, were significantly hypermethylated at their promoters (P < 0.0001) in colon cancer compared to normal colon. In an MBD-seq dataset [27, 40] for colon tumour and liver metastasis tissue, we confirmed that CDH2 and ESR1 were hypermethylated in primary colon cancer, consistent with our findings. In this dataset, 88% of genes that gained 5hmC in the liver metastasis were methylated in the primary tumours, which is 10% more than the overall percentage of genes that had methylation in the primary tumours. However, the gain of 5hmC was not accompanied by a reduction in DNA methylation in the metastasis tumours (Additional file 3: Table S2). The overlap of 5mC and 5hmC in metastasis could suggest a dynamic turn-over between methylation and demethylation but more likely reflects the heterogeneity of tumours with different subsets of cells having either methylation or hydroxymethylation. A further consideration is that the hMeDIP-seq technique enriches for 5hmC using single stranded DNA as input, whereas MBD-seq uses double stranded DNA [41]. Although MBD is highly specific for 5mC and is inhibited by 5hmC [42], it is feasible that some loci could be asymmetrically modified with 5mC and 5hmC on different strands of the same DNA molecule.
5hmC ‘recovery’ spreads to adjacent CpGs at a subset of loci
A visual examination of the 5hmC data on an integrative genomics viewer (IGV) showed the expected pattern of 5hmC ‘recovering’ at sites where 5hmC was present in normal colon tissue, absent in the colon tumour and recovered in the metastatic tumours (Additional file 1: Fig. S3). Interestingly, we observed several regions where the 5hmC had not only recovered, but spread to form adjacent peaks that were not present in the normal tissue. Figure 4A is an example of the LIMS2 (LIM zinc finger domain containing 2) locus showing both recovery and spreading of 5hmC. To determine how frequently the ‘recovery and spreading’ occurred, we relaxed the stringency for consensus peaks to 50% and merged the 5hmC peaks that were close to each other (within 500 bp) in the liver metastasis data to obtain a list of merged peaks that had spread (n = 15,939, a minimum size of 1000 bp) and refined the list for overlap with normal colon (n = 3149, Fig. 4B, see Methods for detail). Since 5hmC is high in normal liver, we ran a similar merging of peaks in normal liver data and obtained 2328 merged and spread peaks that we compared to the ‘merged in metastasis’ peaks. This left 2405 unique ‘RS-peaks’ in the liver metastasis where 5hmC had recovered after an absence in the primary tumour and then spread to adjacent sites. We then randomly shuffled the NC, TC and LT merged peaks 1000 times and found that on average 522 recover and spread peaks were observed. A permutation test revealed that the 3149 recover and spread peaks we observed occur significantly more often than expected by chance (P < 0.001), indicating that the RS-peaks that we found are not occurring randomly.
In liver metastasis, 5hmC peaks that were initially present in normal colon are recovered. A IGV screenshot with uploaded tracks of 5hmC consensus peaks for normal colon, colon tumour and liver metastasis. Boxes above R highlight a 5hmC peak that has ‘recovered’ since it is present in normal colon, absent in colon tumours and reappears in liver metastasis. Boxes above RS highlight a 5hmC peak that has ‘recovered and spread’. Thus, the peak is present in normal colon, lost in colon tumours and reappears in metastasis, but also spreads to adjacent sites. B The strategy for identifying RS peaks and the consensus binding sites for SALL4, ZNF770, ZNF121 and PAX5 identified after SEA. Abbreviations: MLT and MNL are Merged liver tumour (LT) and normal liver (NL) peaks respectively. LTO and NLO are LT or NL peaks that Overlap normal colon. C Normalised RNAseq counts (data used [28]) for SALL4, ZNF770, ZNF121 and PAX5 in normal colon, colon tumours and liver metastasis (NC, CT and LM). Wald test, Benjamini-Hochberg adjusted. Error bars represent ± SEM. Supporting PCR data values in Additional file 6
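The merging and permutation steps described above can be sketched as follows. This is a hedged illustration rather than the authors' implementation (which is described in their Methods): `merge_peaks` and `empirical_p` are hypothetical helper names, intervals are assumed to be `(chrom, start, end)` tuples, and the sketch keeps any merged interval of at least 1000 bp without checking how many original peaks it contains. The genomic shuffling itself is not shown; the function only converts shuffled counts into an empirical P value.

```python
def merge_peaks(peaks, max_gap=500, min_size=1000):
    """Merge peaks on the same chromosome separated by at most `max_gap` bp,
    then keep merged intervals spanning at least `min_size` bp."""
    merged = []
    for chrom, start, end in sorted(peaks):
        if merged and merged[-1][0] == chrom and start - merged[-1][2] <= max_gap:
            last = merged[-1]
            merged[-1] = (chrom, last[1], max(last[2], end))
        else:
            merged.append((chrom, start, end))
    return [p for p in merged if p[2] - p[1] >= min_size]


def empirical_p(observed, null_counts):
    """One-sided empirical P value: the fraction of shuffles reaching the
    observed count, with the usual +1 correction so P is never zero."""
    at_least = sum(1 for count in null_counts if count >= observed)
    return (1 + at_least) / (1 + len(null_counts))


# Toy intervals (hypothetical coordinates).
toy = [("chr1", 100, 400), ("chr1", 700, 1300), ("chr1", 5000, 5200), ("chr2", 0, 1500)]
merged = merge_peaks(toy)

# Toy null distribution (hypothetical counts from nine shuffles).
p_toy = empirical_p(10, [3, 4, 5, 2, 6, 4, 3, 5, 4])
```

With 1000 shuffles and no shuffle reaching the observed count, this estimator returns 1/1001, which is consistent with the reported P < 0.001.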
The underlying sequences were then examined for enrichment of specific transcription factor binding sites. The most frequently enriched transcription factor binding sites included SALL4 (n = 6879, 90.4%), PAX5 (n = 5345, 70.2%), ZNF770 (n = 4363, 57.3%) and ZNF121 (n = 3843, 50.4%) (Fig. 4B). For these, the total numbers of binding sites were as follows: SALL4, 11,825 (1.71 per sequence); ZNF770, 8414 (1.92 per sequence); PAX5, 7665 (1.43 per sequence); and ZNF121, 3851 (1.00 per sequence). The coordinates for these ‘recovered and spread’ 5hmC peaks are listed in Additional file 5: Table S4.
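The per-sequence figures can be reproduced from the counts above. The sketch below assumes that n is the number of sequences containing at least one site for that factor; this reading is an inference from the numbers, not something the text states. Under that assumption the reported values match the ratio truncated, rather than rounded, to two decimals.

```python
# (sequences with at least one site, total sites), as given in the text.
motif_hits = {
    "SALL4": (6879, 11825),
    "ZNF770": (4363, 8414),
    "PAX5": (5345, 7665),
    "ZNF121": (3843, 3851),
}


def sites_per_sequence(n_sequences, n_sites):
    """Mean number of sites per sequence, truncated to two decimals."""
    return (100 * n_sites) // n_sequences / 100


density = {tf: sites_per_sequence(n_seq, n_sites)
           for tf, (n_seq, n_sites) in motif_hits.items()}
```

ZNF121 sits at almost exactly one site per sequence, whereas SALL4 and ZNF770 often occur more than once within a recovered-and-spread region.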
PAX5 transcripts are present at very low levels in normal tissue and are further reduced in tumours, whereas SALL4, ZNF770 and ZNF121 transcripts are upregulated in primary and metastatic tumours compared to normal colon (Fig. 4C). Liver metastasis tends to have lower expression of SALL4 and ZNF770 compared to the primary tumours (Fig. 4C). Thus, SALL4 and ZNF770 transcription was inversely correlated with 5hmC levels during progression from normal to primary colon cancer to liver metastasis.
TET expression in a colon cancer cell line SW480 increases the incidence of cell migration in zebrafish xenograft assays
The 5hmC signature included a core of genes associated with cell migration (CDH2, FN1, ESR1), suggesting that the TETs have a role in metastatic progression. We predicted that targeting 5hmC via the TETs in a colon cancer cell line would inhibit cell migration in a xenograft assay. Zebrafish embryos and larvae are increasingly being shown to be an excellent model for human cell transplantation and cell migration [43,44,45] and SW480 cells have been shown to successfully implant and migrate in zebrafish larvae [46, 47]. A CRISPR-Cas9 SW480 TET1-3 triple knockout cell line (SW480TKO, Additional file 1: Fig. S4A–E) had significantly depleted TET mRNA and protein levels (Additional file 1: Fig. S4B and C). A TET activity assay confirmed enzymatic activity in the wild type cells and showed substantial reduction in the SW480TKO cells (Additional file 1: Fig. S4D), consistent with reduced 5hmC levels measured by immunoblotting (Additional file 1: Fig. S4E). The low levels of persistent 5hmC in SW480TKO suggest a ‘knockdown’ of TET proteins rather than a complete knockout. The reduction of TETs and 5hmC had no effect on the global DNA methylation levels (Additional file 1: Fig. S4E). No significant differences in proliferation or colony forming rate were observed between the SW480 wild type and TKO cells (results not shown).
SW480 wild type and TKO cells were injected into the perivitelline space of zebrafish larvae. Those that developed a tumour mass in the yolk sac, with or without migration to distant sites, were scored after 48 h. We limited migration scoring to the tail since this has less auto-fluorescence than sites such as the eye or brain. Examples of tumour mass in the yolk sac and distant foci are shown in Fig. 5A. For the SW480 wild type cells (n = 60), a primary tumour mass in the yolk sac was observed in 37 larvae, of which 15 (40%) had evidence of migration with an average number of 14.75 distant foci per animal (Table 3). A total of 160 larvae were injected with the SW480 TKO cells (two clones with 92 and 68 respectively), and primary tumour mass was observed in 137 larvae. Out of these, only 15% of larvae had tumour cells that had migrated away from the primary tumour with an average number of 14 foci per animal (P < 0.05, Table 3). Thus, significantly fewer larvae that received the SW480-TKO cells had evidence of cells migrating from the yolk sac to the tail compared to larvae that received the unmodified SW480 cells (Fig. 5B). Aside from the significant difference in migration rate between the wild type and TKO cells in the xenograft assay, no differences were seen in the size of tumours or the number of metastatic foci per fish. To determine whether we could rescue the TKO migration effect, we transiently transfected a mouse Tet2 construct in the SW480TKO cells before injecting these into zebrafish embryos. The ectopic expression of Tet2 resulted in increased global 5hmC levels compared to the SW480TKO cells (Fig. 5C and D) and showed an increase in migration in the zebrafish assay, with 34% of the larvae having distant foci compared to 18% in untreated SW480TKO cells (Fig. 5B). These results support a role for the TET proteins in mesenchymal cell migration.
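A comparison of migration incidence between two groups of larvae can be sketched with a two-sided Fisher's exact test. Two assumptions need flagging: the text does not say which test produced the P value in Table 3, so Fisher's exact test is simply one common choice for a 2x2 table, and the TKO count of 21 migrating larvae is back-calculated from "15% of 137" rather than taken from the table. The wild type counts (15 of 37) are as stated.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so ties with the observed table are included.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


wt_migrated, wt_total = 15, 37        # stated in the text
tko_migrated, tko_total = 21, 137     # ASSUMPTION: 15% of 137, rounded

p_migration = fisher_exact_two_sided(
    wt_migrated, wt_total - wt_migrated,
    tko_migrated, tko_total - tko_migrated,
)
```

Under these assumed counts the test agrees with the reported direction and significance (P < 0.05); the exact value depends on the true counts in Table 3.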
Reducing TET expression and 5hmC levels inhibits migration in a zebrafish xenograft assay. A Bright field and fluorescence examples of two larvae with accumulation of DiI-stained SW480 cells within the perivitelline space and the yolk sac. The panels on the left show cells that have migrated towards the head and tail; the panels on the right show cells remaining at the site of injection with no migration. B Incidence of cell migration (number of embryos with foci in tail/all embryos with cells in yolk sac), after injecting SW480 cells (WT), CRISPR TET1-3 (TKO) and rescue with transiently expressed mouse Tet2 (mTet2). N = 37 (WT), 137 (TKO) and 49 (mTet2); also see Table 3 for statistics. C Representative immunoblots for 5hmC levels in WT, TKO and mTet2 rescue cells, confirming that 5hmC is reduced in TKO and rescued with transient mTet2 expression. Antibody controls include spots for nucleotide standards: 5hmC top, 5mC middle and unmodified C bottom. D Representative RT-qPCR assay confirming ectopic expression of mouse Tet2 (mTet2) in the TKO cells after transient transfection. As a control, we used a PCR assay for human TET2 (hTET2), which measures residual endogenous TET2 expression
Altogether, our data suggest that 5hmC levels follow a biphasic trajectory during colon cancer progression to metastasis, where many loci that fail to accumulate 5hmC in primary tumours restore 5hmC in metastasis. TET-mediated regulation of key genes with functions associated with cell migration may facilitate the transition from primary tumour to metastasis. A subset of loci that regain 5hmC also accumulate 5hmC at adjacent CpGs, potentially mediated by the KRAB zinc finger transcription factors ZNF770 and ZNF121, which themselves are regulated by TETs.
Discussion
Global 5hmC levels are distinctively cell- and tissue-specific, as a result of tissue-specific differences in mitotic cell cycling, TET expression and cofactors that influence the turn-over and accumulation of 5hmC within a cell (reviewed [48]). It has been suggested that the distribution of 5hmC could be an ‘imprint’ of cell identity in various normal tissues, acquired during adult progenitor cell differentiation [48]. Cancer-associated 5hmC signatures for colon cancer and other cancer types have been studied in cell free DNA [49]. It has further been shown that 5hmC is reduced compared to normal cells in almost all primary tumours and haematological malignancies (reviewed in [50]) suggesting that restoring 5hmC levels through the upregulation of TET activity could be a therapeutic avenue for cancer treatment [51,52,53,54]. The fact that several cofactors including vitamin C have a positive effect on upregulating TET activity and increasing 5hmC levels [55,56,57,58,59,60] provides further opportunities for holistic adjuvant treatments. Our study showing that 5hmC levels are increased in metastatic tumours of the liver raise a cautionary caveat for therapeutic approaches aimed at increasing TET activity in colon cancer patients. An independent data set [29, 30] demonstrated poor metastasis-free survival in patients with high TET2 expression levels and our zebrafish xenograft assays further support a role for TET2 in metastatic transformation in colon cancer. In our patient cohort, there was considerable variation in the absolute levels of TET transcripts in primary colon tumours and the matched normal tissue [21]. More patients would be required to stratify the risk according to high and low TET expression and to correlate this with genetic polymorphisms and splice variants of TET genes. 
While our earlier studies found no correlation with 5hmC levels or any mutations in IDH1/2 or other enzymes involved in generating metabolites that could inhibit TET catalytic activity [21], there is an extensive number of miRNAs that may affect TET2 function, including miR- 7, miR- 125b, miR- 29b/c, miR- 26, miR- 101, miR142 and Let- 7 [61], post transcriptional modifications and numerous interacting transcription factors, chromatin modifiers and signalling proteins (reviewed in [62]).
A literature search for examples of poor prognosis and TET expression showed that the TETs are predominantly tumour suppressors in most cancers, with low TET expression levels predicting a worse outcome. However, several examples of oncogenic, tumour-promoting activity have been reported in breast [63, 64], gastric [65], lung [66] and ovarian cancers [67], hepatocarcinoma [68] and glioma [69]. In breast cancer, hypoxia-induced upregulation of TET1/TET3 transcription and increased 5hmC levels were reported to drive metastatic transformation and poor prognosis [64]. TET1 and TET2 promoters contain consensus binding sites for hypoxia-inducible factor 1 alpha (HIF1-alpha) and, upon induction by hypoxia, the resultant changes in 5hmC were associated with increased tumour necrosis factor alpha (TNF-alpha) expression and activation of TNF-alpha-p38-MAPK signalling [64]. Another example is in hepatocarcinoma, where increased TET2 expression has been associated with poor prognosis [68]. In that study, TET2 was shown to mediate the canonical epithelial-mesenchymal transition E-cadherin–N-cadherin switch by repressing the E-cadherin promoter via recruitment of HDAC1, leading to the activation of β-catenin [68]. These examples indicate context-dependent oncogenic or tumour-suppressing activity for TETs and 5hmC levels. A precedent for such opposing tumour-promoting and tumour-suppressing functions is the well-described case of the TGF-beta gene, which acts as a tumour suppressor by inhibiting proliferation and inducing apoptosis in various cell types, and yet the overexpression of TGF-beta in tumour cells induces epithelial-mesenchymal transitions and promotes invasiveness and metastasis [70].
Our study illustrates the difficulty in ascribing a causative role in colon cancer metastasis to either 5hmC or the TET proteins. While a global increase in 5hmC in metastatic tumours may reflect increased TET activity in the nucleus, the non-catalytic functions of TET proteins cannot be excluded. Thus, the TETs may transcriptionally activate or repress target genes to drive metastasis through interaction with histone deacetylase 2 (HDAC2), O-GlcNAc transferase (OGT), the Sin3A complex and hypoxia-inducible factors (reviewed [62]). The genome-wide profile in this study showed that 5hmC patterns in the metastatic tumours were similar to those in their cells of origin (colon) and were mostly associated with active gene expression. A limitation of the hMeDIP methodology is that although it provides good overall genome coverage, it does not provide base-pair resolution of 5hmC. Newer single-cell sequencing technologies, together with a larger cohort of patients, will resolve the uncertainties of tumour purity and intertumoral heterogeneity and enable the identification of rare cell populations and clonal evolution within primary and metastatic tumours.
KEGG pathway analysis for the genes nearest and overlapping the 5hmC peaks in the liver metastasis highlighted enrichment for adherens junction, cytoskeleton and cell migration pathways, suggesting that 5hmC remodelling contributes to the metastatic mesenchymal transition. Filtering for the genes with the most significant changes in gene expression delivered a core cluster of genes enriched for metastasis pathways including calcium dependent and independent cell matrix adhesion and fibroblast migration. This cluster included cadherin 2 (CDH2) and fibronectin (FN1), both of which are involved in cell adhesion and migration through different mechanisms. Future analyses in a larger cohort of primary colon and liver metastasis samples examining the combined effects of these known biomarkers will determine whether 5hmC has a role in mediating any synergistic effects during metastasis.
Colon tumours are heterogeneous, displaying several subpopulations with differences in morphology, inflammatory infiltrates, mutational status and gene expression profiles. In our previous research and that of others, immunohistochemistry has shown that in humans and mice, 5hmC is abundant in all the differentiated colonocytes and present at very low levels in the colon stem cell compartments. In bulk sequencing, therefore, most of the 5hmC comes from the differentiated colonocytes in normal colon, and the global loss of 5hmC in colon tumours seems to occur regardless of the subtype of tumour. Indeed, global 5hmC levels do not stratify between different tumour types. It could be that within a heterogeneous colon tumour a subset of cells that have retained 5hmC may be more metastatic and be expanded during metastasis, and that this clonal outgrowth is what we detect in the liver metastasis.
Building upon our previous hypothesis that the distribution of 5hmC could be an ‘imprint’ of cell identity that is preprogrammed in progenitor cells and established during differentiation [48], we propose that 5hmC does not accumulate in tumour cells, as in progenitor cells, but that the preprogrammed 5hmC imprint is maintained and can be reestablished upon progression to metastasis. Our data showing the similarity between metastatic tumours and normal colon cells would fit this model, with the caveat that malignant cells are epigenetically aberrant. Thus, the model would need to include the additional de novo peaks and take into consideration tumour heterogeneity.
The detection of several loci where 5hmC was recovered at the same site as in the normal colon and then spreading in cis may either be a consequence of additional transcription factors being recruited to these sites, or a lack of factors that normally prevent the spread of 5hmC. The transcription factor binding sites identified at the ‘recover and spread’ regions include SALL4, PAX5, ZNF770 and ZNF121. PAX5 has previously been listed as a transcription factor that binds to unmodified CpG sites and recruits TET-histone modifying complexes [71]. However, it is expressed at very low levels in normal colon and reduced even further in tumours, so unlikely to have a 5hmC memory function in these cells. SALL4 has previously been shown to bind to 5hmC and to accelerate demethylation through interaction with TET2 [72]. But SALL4 transcript levels are low in colon and colon tumours, even after being increased in primary tumours. Not much is known about ZNF770 and it has not previously been associated with TET activity or 5hmC levels. In our study, ZNF770 transcript levels were inversely correlated with 5hmC levels in the primary tumour and metastasis. Both ZNF770 and ZNF121 (also known as ZHC32; ZNF20; D19S204) belong to the Krüppel C2H2-type zinc-finger protein family. ZNF121 has been shown to be a c-MYC-interacting protein with functional effects on MYC and cell proliferation in breast cancer [73]. C-MYC has also been listed as a transcription factor that preferably binds to its binding sites when the CpGs are unmethylated and recruits TET-histone modifying complexes to maintain the unmethylated state [71]. Both ZNF770 and ZNF121 have TET2 binding sites in their promoters and can be regulated by TET2. Future functional analyses will reveal whether ZNF770 and ZNF121 have binding affinities for 5hmC or 5mC and whether they interact with TET enzymes to regulate 5hmC accumulation in normal tissue or cancers.
Finally, we used a primary colon adenocarcinoma cell line, SW480, as a model system to study 5hmC dynamics in zebrafish migration assays. Despite an ongoing debate regarding the validity of cancer cell migration and invasion from the yolk sac and whether this is an active migration or passive diffusion process [74], zebrafish larvae continue to provide an excellent alternative to mouse xenograft experiments enabling larger throughput, a quicker timeframe and more replicate experiments. In our assays, depleted TET expression significantly reduced the incidence of cell migration which is the first essential epithelial to mesenchymal transition stage of metastasis. The levels of 5hmC in the triple knockout SW480 cells could be rescued by transient ectopic expression of mouse Tet2 constructs. The rescued cells showed increased migration rates in the zebrafish assay.
Conclusions
The 5hmC profiles from CRC patients together with Kaplan–Meier survival curves, integrative -OMICs analysis and experimental cell biology provide evidence of a biphasic profile of reduced 5hmC in primary cancer that recovers during metastasis. This supports a role for TET enzyme activity in reprogramming primary cancer cells towards mesenchymal migrating cells.
Methods
Patient samples
Research using patient tumour samples was conducted under the principles of the World Medical Association Declaration of Helsinki, with ethical approval obtained from the Cambridgeshire Local Research Ethics Committee (LREC references 04/Q0108/125 and 06/Q0108/307) as previously reported [21, 75]. From this previously reported cohort of 119 CRC patients, available DNA samples included normal colonic mucosa (n = 22), taken from tissue some distance away from the tumour, invasive primary carcinoma (n = 65), liver metastatic deposits (n = 32) and normal liver (n = 5), taken from a site adjacent to the tumour. For this study, 16 patients were selected from the CRC cohort and used for mass spectrometry analyses, and 5 of these were sequenced for 5hmC (Table 1). Under LREC, researchers were not provided with extensive patient data beyond Dukes' stage, age, sex and MSI status where known. Tumour purity for the liver metastasis samples used for hMeDIP sequencing was estimated from DNA sequence data using the AITAC pipeline [31], which estimates tumour purity based on CNV. AITAC values are given in Table 1.
Antibodies
Anti-5hmC (RRID:AB_10013602, Active Motif, 39769), anti-TET1 (RRID:AB_2537831, Invitrogen, MA5-16312), anti-TET2 (RRID:AB_2687506, Sigma-Aldrich, MABE462), anti-TET3 (RRID:AB_11150700, GeneTex, GTX121453), anti-beta tubulin (RRID:AB_1841238, Sigma-Aldrich, T8328), anti-mouse IgG (RRID:AB_258476, Sigma-Aldrich, A2304) and anti-rabbit IgG (RRID:AB_10709927, Abcam, ab205718).
PCR primer sequences
TET1: Fwd (forward) 5′CAGATTAGTCAGGAAGGAAGATGTAA3′, Rev (reverse) 5′ATTTTCCAGGGCTTAAAGTCTTGA3′; TET2: Fwd 5′GCAGCACACCCTCTCAAGATT3′, Rev 5′AATTCAGCAGCTCAGTCCCTTACT3′; TET3: Fwd 5′AGAACCAGGTGACCAACGAG3′, Rev 5′CGCAGCGATTGTCTTCCTTG3′; ZNF121: Fwd 5′TTCGCCTTTATCGTGGTG3′, Rev 5′AATGTTGTTGAGGTGCTGAC3′; ZNF770: Fwd 5′CCTCAATACCGCCAAGGTCTTTC3′, Rev 5′CCAATGTTGCCTCAAGGCTG3′; SALL4: Fwd 5′TCGTCTGCTAGCGCTCTTCAGATC3′, Rev 5′CGGCGGGCTGAGTTATTGTTCG3′; PAX5: Fwd 5′GCGCAAGAGAGACGAAGGT3′, Rev 5′CTGCTGCTGTGTGAACAAGTC3′.
Mass spectrometry
One microgram of genomic DNA was incubated with 5 U of DNA Degradase Plus (Zymo Research) at 37 °C for 3 h and filtered through Amicon 10 kDa centrifugal filter units (Millipore). The concentrations of 2′-deoxycytidine, 5-methyl-2′-deoxycytidine and 5-hydroxymethyl-2′-deoxycytidine in the filtrate were determined using an AB Sciex Triple Quad 6500 mass spectrometer fitted with an Agilent Infinity 1290 LC system and an Acquity UPLC HSS column. The global levels of 5mC and 5hmC were expressed as percentages over total 2′-deoxycytidines.
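As an illustration of the final normalisation step, the following minimal Python sketch expresses 5mC and 5hmC as percentages of total 2′-deoxycytidines. It assumes that the total is the sum of the three measured nucleosides (unmodified, 5-methyl and 5-hydroxymethyl); the function name and the example concentrations are hypothetical and are not taken from the study.

```python
def percent_of_total_dc(dc, mdc, hmdc):
    """Return (%5mC, %5hmC) relative to total 2'-deoxycytidines.

    dc, mdc, hmdc -- concentrations of 2'-deoxycytidine, 5-methyl-2'-deoxycytidine
    and 5-hydroxymethyl-2'-deoxycytidine, all in the same unit.
    Assumption: the total is the sum of these three measured species.
    """
    total = dc + mdc + hmdc
    if total <= 0:
        raise ValueError("total 2'-deoxycytidine concentration must be positive")
    return 100.0 * mdc / total, 100.0 * hmdc / total


# Hypothetical concentrations, for illustration only.
pct_5mc, pct_5hmc = percent_of_total_dc(dc=95.0, mdc=4.9, hmdc=0.1)
```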
Survival curves
The online SurvExpress tool, Biomarker Viewer (bioinformatics.mx) designed by [76], was used to generate Kaplan-Meier curves for time (years) to metastasis. Data from [29, 30], https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE28722, contained 129 colon cancer patient RNA profiles from a spectrum of clinical stages.
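For readers unfamiliar with how a Kaplan–Meier curve is built, the sketch below implements the standard product-limit estimator in plain Python. It is not the SurvExpress implementation; the function name and input layout are assumptions made for illustration, with `times` standing for years to metastasis or censoring and `events` flagging whether metastasis was observed.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up time per patient (event or censoring time)
    events -- 1 if the event (e.g. metastasis) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients sharing the same time point.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            # Product-limit step: multiply by the fraction surviving this time.
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set.
        at_risk -= n_removed
    return curve
```

Stratifying patients into high and low TET expression groups would then amount to calling the function once per group and comparing the resulting curves.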
Immunoblot (dot blot) for global 5hmC/5mC detection
Two micrograms of DNA per sample and standards (10 ng cytosine, 5 ng 5-methylcytosine, 0.125 ng 5-hydroxymethylcytosine, Zymo Research) were added to a total volume of 100 µL 0.1 M NaOH and denatured at 95 °C for 5 min. Then, an equal volume of ice-cold 2 M ammonium acetate was added before transfer to a nitrocellulose membrane using a 96-well vacuum dot blot apparatus (GE Healthcare). After crosslinking the DNA onto the membrane using an ultraviolet crosslinker (UVP), total DNA was visualised with methylene blue stain (0.05% methylene blue in 0.3 M sodium acetate). The membrane was washed twice in 75% ethanol, blocked in 5% w/v milk in PBST (0.1% v/v Tween-20 in PBS) and incubated with the primary 5hmC antibody (1:5000, Active Motif, 39769) at 4 °C overnight. After three 5-min washes in PBST, the secondary antibody (1:2000, Abcam, ab205718) was applied for detection by chemiluminescence (Thermo Scientific) according to the manufacturer’s instructions. A Fusion SL analyser (Vilber) was used to visualise the signal.
hMeDIP-seq
Illumina libraries were prepared before the pull-down using 1 to 3 µg of sonicated genomic DNA (Bioruptor). Libraries were prepared using the TruSeq DNA sample preparation kit (Illumina) following the manufacturer’s instructions. Adaptor-modified genomic DNA was then immunoprecipitated as described in [21]. Input and pull-down material was whole-genome amplified as previously described [21].
Read processing, peak calling and normalisation
hMeDIP-seq reads in FASTQ format were quality checked with FASTQC and trimmed using FASTP with default settings. Processed hMeDIP-seq reads were aligned to the GRCh38.p14 human genome with BOWTIE2 using default settings. GreyListChIP was then run on the available input files to filter out regions with a high signal in the input. Next, MACS2 callpeak was used to call peaks in narrowPeak mode with the following settings: -g 3.0e9 -B -q 0.01 --bw 300. Output from MACS2 was normalised with MAnorm2 [17]. Normalised peaks were retained only if they were present in greater than 75% of the samples in each category.
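The consensus filter in the last step can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function name and the presence/absence input structure are hypothetical, and the filter is applied to one tissue category at a time, which is one possible reading of "in each category".

```python
def retain_consensus_peaks(presence, min_fraction=0.75):
    """Keep peaks present in strictly more than min_fraction of the samples.

    presence -- dict mapping a peak identifier to a list of booleans,
                one per sample of a single tissue category (hypothetical layout).
    """
    kept = []
    for peak_id, calls in presence.items():
        if calls and sum(calls) / len(calls) > min_fraction:
            kept.append(peak_id)
    return kept


# Hypothetical example with five samples in one category.
example = {
    "peak_a": [True, True, True, True, False],   # 80% of samples: retained
    "peak_b": [True, True, True, False, False],  # 60% of samples: dropped
}
consensus = retain_consensus_peaks(example)
```

Because the threshold is "greater than 75%", a peak found in exactly three of four samples would not pass this filter.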
Principal component analysis
Principal component analysis was performed with the fviz_pca_ind function from the FACTOEXTRA R package.
5hmC peak enrichment and upset plots
MAnorm2 diffTest was used to compare two tissue types to each other and find differentially enriched peaks from the MACS2 peaks. Differentially enriched peak counts from MAnorm2 diffTest were extracted and plotted using upset plot.
Differential expression analysis
RNA-seq FASTQ reads were downloaded from the NCBI Sequence Read Archive (SRA), SRR2089755, deposited by Lee et al. [26, 28]. Patients in this data set ranged from 62 to 73 in age, were microsatellite stable (MSS), M1, and chemo naïve when sampled, similar to our patient cohort. FASTQ reads were quality checked with FASTQC and trimmed using TRIMMOMATIC. Processed RNA-seq reads were then aligned to the GRCh38.p14 human genome using STAR (v2.7.10). htseq-count was used to obtain read counts for each gene using the GRCh38.p14 human genome gene annotation. Differential expression analysis was performed with edgeR. Lowly expressed genes were first filtered out using the edgeR function filterByExpr(), followed by calcNormFactors() and estimateDisp(). The output from these steps was then used to fit a generalised linear model with glmFit().
DNA methylation analysis
We downloaded the MBD-seq peaks [27, 40] for colon tumour and liver metastasis tissue and aligned to the GRCh38.p14 human genome. The patient cohort was MSS, CIMP -ve, and had liver metastasis samples from chemo naïve patients. Metastasis samples were not matched to primary CRC donors and were a mix of M0 and M1.
Extracting 5hmC gain and recover marks
A custom script was created (adjacent_methylation_peaks.py) to extract 5hmC regions that have recovered and spread. This script first filtered out peaks that were absent in over 50% of the samples from the output.narrowPeak files from MACS2 (v2.2.6). It then merged peaks that were within 500 base pairs of each other using bedtools merge. The merged peaks for LT were retained if they were at least 1000 base pairs long; for NC and TC all merged peaks were retained. bedtools intersect was then used to uncover peaks that were present in NC and LT but not TC (recovered) and LT only (gained).
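The merge-and-intersect logic described above can be sketched in plain Python. This is not the authors' adjacent_methylation_peaks.py, which relies on bedtools; the function names are hypothetical, peaks are assumed to be (chromosome, start, end) tuples that have already passed the 50% consensus filter, and any overlap of at least one base pair is assumed to count as an intersection.

```python
def merge_peaks(peaks, max_gap):
    """Merge (chrom, start, end) intervals separated by at most max_gap bp."""
    merged = []
    for chrom, start, end in sorted(peaks):
        if merged and merged[-1][0] == chrom and start - merged[-1][2] <= max_gap:
            merged[-1][2] = max(merged[-1][2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(p) for p in merged]


def overlaps_any(peak, others):
    """True if peak shares at least one base with any interval in others."""
    chrom, start, end = peak
    return any(c == chrom and s < end and start < e for c, s, e in others)


def classify_lt_peaks(nc, tc, lt, max_gap=500, min_lt_len=1000):
    """Return (recovered, gained) liver tumour (LT) peaks.

    recovered -- LT peaks overlapping normal colon (NC) but not tumour colon (TC)
    gained    -- LT peaks overlapping neither NC nor TC
    """
    nc_m = merge_peaks(nc, max_gap)
    tc_m = merge_peaks(tc, max_gap)
    # Only LT merged peaks are subject to the minimum-length filter.
    lt_m = [p for p in merge_peaks(lt, max_gap) if p[2] - p[1] >= min_lt_len]
    recovered, gained = [], []
    for peak in lt_m:
        in_nc = overlaps_any(peak, nc_m)
        in_tc = overlaps_any(peak, tc_m)
        if in_nc and not in_tc:
            recovered.append(peak)
        elif not in_nc and not in_tc:
            gained.append(peak)
    return recovered, gained
```

The quadratic overlap search is kept for readability; for genome-scale peak sets, an interval-tree or sorted-sweep approach (as in bedtools) would be the practical choice.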
We also performed a similar peak filter for methylation marks to uncover LT 5hmC regions that were near to 5mC in LT. Here, 5mC peak region filtering was performed as above, using MACS2 output.narrowPeak files. Again, a 50% consensus filter was applied for a peak to be retained. LT 5mC peaks were merged if they were within 250 base pairs of each other and retained if they were at least 250 base pairs long. Finally, 5hmC recovered or gained peaks were tested for being within 750 base pairs of a 5mC peak region.
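The final proximity test can be illustrated with a short Python sketch. The function name is hypothetical, peaks are assumed to be (chromosome, start, end) tuples, and distance is assumed to be measured edge to edge, with overlapping regions counted as distance zero.

```python
def near_5mc(peak, mc_peaks, max_dist=750):
    """True if a 5hmC peak lies within max_dist bp of any 5mC peak region.

    peak     -- (chrom, start, end) for a recovered or gained 5hmC peak
    mc_peaks -- iterable of (chrom, start, end) merged 5mC peak regions
    """
    chrom, start, end = peak
    for c, s, e in mc_peaks:
        if c != chrom:
            continue
        # Edge-to-edge gap; 0 when the two intervals overlap or touch.
        gap = max(s - end, start - e, 0)
        if gap <= max_dist:
            return True
    return False
```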
To determine whether the recovery and spreading that we observed was not due to chance, we ran a permutation test [77] by randomly shuffling the NC and TC merged peaks within the genome, while leaving the original LT merged peaks. We then calculated the number of recovered and spread peaks (shuffled count), the same as described previously. This was repeated 1000 times to generate an empirical null distribution of overlap counts and from this we computed p = [number of permutations where the shuffled count ≥ the observed count (n = 3149)]/[total permutations + 1].
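A minimal sketch of this permutation scheme is given below. It follows the p-value formula exactly as stated in the text (a commonly used variant also adds 1 to the numerator). The function names are hypothetical, the counting function is passed in by the caller, and shuffled peaks are assumed to keep their length and chromosome, which is only one of several ways to randomise positions "within the genome".

```python
import random


def shuffle_peaks(peaks, chrom_sizes, rng):
    """Move each (chrom, start, end) peak to a random position on its chromosome."""
    shuffled = []
    for chrom, start, end in peaks:
        length = end - start
        new_start = rng.randint(0, chrom_sizes[chrom] - length)
        shuffled.append((chrom, new_start, new_start + length))
    return shuffled


def permutation_null(nc, tc, lt, chrom_sizes, count_fn, n_perm=1000, seed=0):
    """Shuffle NC and TC peaks n_perm times, leaving LT peaks fixed.

    count_fn(nc, tc, lt) must return the number of recovered and spread peaks.
    """
    rng = random.Random(seed)
    return [
        count_fn(shuffle_peaks(nc, chrom_sizes, rng),
                 shuffle_peaks(tc, chrom_sizes, rng),
                 lt)
        for _ in range(n_perm)
    ]


def empirical_p_value(observed, null_counts):
    """p = (# permutations with shuffled count >= observed) / (permutations + 1)."""
    exceed = sum(1 for c in null_counts if c >= observed)
    return exceed / (len(null_counts) + 1)
```

With 1000 permutations and no shuffled count reaching the observed value, this formula returns 0, whereas the add-one variant would return approximately 0.001.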
Motif enrichment
5hmC spread peaks were searched for transcription factor binding motifs using simple enrichment analysis (SEA) with the HOCOMOCO Human v11 Core motif database from the online MEME suite toolbox (v 5.3.3).
Cell culture
SW480 cells were obtained from the ATCC (SW480 [SW-480], CCL-228). Congenic triple TET knockout SW480 clones and wild type cells were cultured in Dulbecco’s modified eagle medium (DMEM) supplemented with 10% FBS and 1% penicillin–streptomycin solution. All cell lines were cultured in a humidified incubator at 37 °C and 5% CO2 and periodically sent for in-house mycoplasma testing.
TET activity assay
Nuclear extracts from SW480 cells and CRISPR knockouts were purified using the Abcam nuclear extraction kit (ab113474). Two micrograms of nuclear extract were used in an 8-well strip format colorimetric 5mC-hydroxylase TET activity kit (Abcam, ab156912) according to the manufacturer’s instructions and calibrated against TET standards (supplied with the kit, using dilutions in the range of 0.05–0.5 ng/µl).
CRISPR-Cas9-mediated triple TET knockout
SW480 cells (genotype (TET1(+/+);TET2(+/+);TET3(+/+/+))) were sent to Horizon Discovery for CRISPR-CAS9 gene editing. The parental SW480 cells were first targeted for the TET2 locus (Horizon gRNA1779) and a single clone with an out-of-frame deletion of TET2 was subsequently used to target the TET3 locus (Horizon gRNA1780). A single clone in which all TET3 alleles were confirmed to have out-of-frame deletions was used to target the TET1 locus (Horizon gRNA1778). Two clones were confirmed to have out-of-frame deletions for both TET1 alleles. Two triple knock-out (TKO) clones, SW480-TKO-217 and SW480-TKO-6B12, were used in this study. TET expression was compared to WT SW480 cells by western blotting and RTqPCR.
Transfection
To recover the expression of Tet2 in SW480 TET TKO lines, cells were cultured to 75% confluency in 12-well plates before being transfected with 200 μg of mTet2 plasmid (FH-Tet2-pEF was a gift from Anjana Rao; Addgene plasmid #41710; http://n2t.net/addgene:41710; RRID:Addgene_41710 [78]) using Lipofectamine 2000 according to the manufacturer’s instructions. Cells were then incubated for 24 h prior to use in downstream experiments.
Zebrafish tumour xenograft assays
Tumour cells were labelled with 1,1′-dioctadecyl-3,3,3′,3′-tetramethylindocarbocyanine perchlorate (DiI) stain. Briefly, cells were labelled with a 5 μM solution of Vybrant CM-DiI (Invitrogen V-22888) in DMEM and injected into the perivitelline space (PVS) of 48 h post fertilisation (hpf) Casper (roy−/−; nacre−/−) embryos. Xenografts were incubated at 33 °C for 48 h, followed by counting of metastatic foci under a fluorescent microscope. Prior to injection, the embryos were de-chorionated, anaesthetised with 164 mg/L tricaine (Merck A5040) and mounted onto glass slides with 100 mg/ml low-melting point agarose. A micromanipulator and gas injection system was used to inject the cells into the embryos. SW480 cells were seeded into 6-well culture plates and grown to 90% confluency. The media was then removed and the cells were washed three times in PBS before being labelled with the DiI stain for 15 min at 37 °C. After labelling, cells were washed for 5 min in PBS three times, before being trypsinised and resuspended to a concentration of 2 × 10^5 cells/μl. Labelled cells were stored on ice until injection. After injection, zebrafish embryos were released from the agarose, placed into petri dishes containing fresh embryo medium and incubated at 33 °C for 48 h. The xenografts were then fixed in 4% paraformaldehyde before being visualised using a confocal microscope. The presence or absence of migratory cells and the number of distant foci were recorded for each embryo. The scoring was done by three independent researchers, two of whom were blinded as to which embryos had been injected with control or test cells.
Data availability
All data generated or analysed during this study are included in this published article, its supplementary information files, and publicly available repositories. The datasets generated and analysed during the current study are available in the GEO database repository, BioProject accession PRJNA206436 and GEO Series GSE268934 [79]. URL: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE268934. Supporting data values for figures are included in Additional File 6.
Abbreviations
- 5caC: 5-Carboxylcytosine
- 5fC: 5-Formylcytosine
- 5hmC: 5-Hydroxymethylcytosine
- 5mC: 5-Methylcytosine
- AITAC: Accurate Inference of Tumour purity and Absolute Copy number
- ATCC: American Type Culture Collection
- Cas9: CRISPR associated protein 9
- CADM1: Cell adhesion molecule 1
- DAPK1: Death-associated protein kinase 1
- CDH2: Cadherin 2
- CIMP: CpG island methylator phenotype
- CNV: Copy number variation
- CRC: Colorectal cancer
- CRISPR: Clustered Regularly Interspaced Short Palindromic Repeats
- DiI: 1,1′-Dioctadecyl-3,3,3′,3′-tetramethylindocarbocyanine perchlorate
- DMEM: Dulbecco’s modified eagle medium
- ESR1: Oestrogen receptor alpha 1
- FBS: Foetal bovine serum
- FN1: Fibronectin
- Fwd: Forward
- HDAC2: Histone deacetylase 2
- HIF1-alpha: Hypoxia-inducible factor 1 alpha
- IGV: Integrative genomics viewer
- hMeDIP/hMeDIPseq: HydroxyMethylcytosine DNA Immunoprecipitation sequencing
- IDH1/2: Isocitrate dehydrogenase 1/2 gene
- KEGG: Kyoto Encyclopaedia of Genes and Genomes
- LIMS2: LIM zinc finger domain containing 2 gene
- LT: Liver tumours
- MBD-seq: Methyl binding domain sequencing
- NC: Normal colon
- NL: Normal liver
- OGT: O-GlcNAc transferase
- PAX5: Paired box protein 5
- PBS: Phosphate buffered saline
- PCA: Principal component analysis
- PLG: Plasmin heavy chain A
- PPI: Protein-protein interaction
- PVS: Perivitelline space
- Rev: Reverse
- RS: Recovered and spread
- SEA: Simple enrichment analysis
- SALL4: Spalt-like protein 4
- TC: Tumour colon
- TET: Ten-eleven-translocation dioxygenase gene
- TGF-beta: Transforming growth factor beta gene
- TNF: Tumour necrosis factor
- TSS: Transcription start site
- ZNF770: Zinc-finger protein 770
- ZNF121: Zinc-finger protein 121
References
Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2021;71(3):209–49.
Riihimäki M, Hemminki A, Sundquist J, Hemminki K. Patterns of metastasis in colon and rectal cancer. Sci Rep. 2016;6:29765.
Chatterjee A, Rodger EJ, Eccles MR. Epigenetic drivers of tumourigenesis and cancer metastasis. Semin Cancer Biol. 2018;51:149–59.
Pretzsch E, Bösch F, Neumann J, Ganschow P, Bazhin A, Guba M, et al. Mechanisms of metastasis in colorectal cancer and metastatic organotropism: hematogenous versus peritoneal spread. J Oncol. 2019;2019:7407190.
Teng S, Li YE, Yang M, Qi R, Huang Y, Wang Q, et al. Tissue-specific transcription reprogramming promotes liver metastasis of colorectal cancer. Cell Res. 2020;30(1):34–49.
Nishiyama A, Nakanishi M. Navigating the DNA methylation landscape of cancer. Trends Genet. 2021;37(11):1012–27.
Issa JP. CpG island methylator phenotype in cancer. Nat Rev Cancer. 2004;4(12):988–93.
Bae JM, Kim JH, Cho NY, Kim TY, Kang GH. Prognostic implication of the CpG island methylator phenotype in colorectal cancers depends on tumour location. Br J Cancer. 2013;109(4):1004–12.
Jung G, Hernandez-Illan E, Moreira L, Balaguer F, Goel A. Epigenetics of colorectal cancer: biomarker and therapeutic potential. Nat Rev Gastroenterol Hepatol. 2020;17(2):111–30.
Tse JWT, Jenkins LJ, Chionh F, Mariadason JM. Aberrant DNA methylation in colorectal cancer: what should we target? Trends Cancer. 2017;3(10):698–712.
Tahiliani M, Koh KP, Shen Y, Pastor WA, Bandukwala H, Brudno Y, et al. Conversion of 5-methylcytosine to 5-hydroxymethylcytosine in mammalian DNA by MLL partner TET1. Science. 2009;324(5929):930–5.
Ito S, D’Alessio AC, Taranova OV, Hong K, Sowers LC, Zhang Y. Role of Tet proteins in 5mC to 5hmC conversion, ES-cell self-renewal and inner cell mass specification. Nature. 2010;466(7310):1129–33.
Bachman M, Uribe-Lewis S, Yang X, Burgess HE, Iurlaro M, Reik W, et al. 5-Formylcytosine can be a stable DNA modification in mammals. Nat Chem Biol. 2015;11(8):555–7.
Bachman M, Uribe-Lewis S, Yang X, Williams M, Murrell A, Balasubramanian S. 5-Hydroxymethylcytosine is a predominantly stable DNA modification. Nat Chem. 2014;6(12):1049–55.
Han D, Lu X, Shih AH, Nie J, You Q, Xu MM, et al. A highly sensitive and robust method for genome-wide 5hmC profiling of rare cell populations. Mol Cell. 2016;63(4):711–9.
Konstandin N, Bultmann S, Szwagierczak A, Dufour A, Ksienzyk B, Schneider F, et al. Genomic 5-hydroxymethylcytosine levels correlate with TET2 mutations and a distinct global gene expression pattern in secondary acute myeloid leukemia. Leukemia. 2011;25(10):1649–52.
Gonzalez-Avalos E, Onodera A, Samaniego-Castruita D, Rao A, Ay F. Predicting gene expression state and prioritizing putative enhancers using 5hmC signal. Genome Biol. 2024;25(1):142.
Yang H, Liu Y, Bai F, Zhang JY, Ma SH, Liu J, et al. Tumor development is associated with decrease of TET gene expression and 5-methylcytosine hydroxylation. Oncogene. 2013;32(5):663–9.
Udali S, De Santis D, Ruzzenente A, Moruzzi S, Mazzi F, Beschin G, et al. DNA methylation and hydroxymethylation in primary colon cancer and synchronous hepatic metastasis. Front Genet. 2017;8:229.
Udali S, Guarini P, Moruzzi S, Ruzzenente A, Tammen SA, Guglielmi A, et al. Global DNA methylation and hydroxymethylation differ in hepatocellular carcinoma and cholangiocarcinoma and relate to survival rate. Hepatology. 2015;62(2):496–504.
Uribe-Lewis S, Stark R, Carroll T, Dunning MJ, Bachman M, Ito Y, et al. 5-Hydroxymethylcytosine marks promoters in colon that resist DNA hypermethylation in cancer. Genome Biol. 2015;16(1):69.
Uribe-Lewis S, Carroll T, Menon S, Nicholson A, Manasterski PJ, Winton DJ, et al. 5-Hydroxymethylcytosine and gene activity in mouse intestinal differentiation. Sci Rep. 2020;10(1):546.
Zhang M, Wang J, Zhang K, Lu G, Liu Y, Ren K, et al. Ten-eleven translocation 1 mediated-DNA hydroxymethylation is required for myelination and remyelination in the mouse brain. Nat Commun. 2021;12(1):5091.
Kim R, Sheaffer KL, Choi I, Won KJ, Kaestner KH. Epigenetic regulation of intestinal stem cells by Tet1-mediated DNA hydroxymethylation. Genes Dev. 2016;30(21):2433–42.
Pastor WA, Pape UJ, Huang Y, Henderson HR, Lister R, Ko M, et al. Genome-wide mapping of 5-hydroxymethylcytosine in embryonic stem cells. Nature. 2011;473(7347):394–7.
Lee JR, Kwon CH, Choi Y, Park HJ, Kim HS, Jo HJ, et al. Transcriptome analysis of paired primary colorectal carcinoma and liver metastases reveals fusion transcripts and similar gene expression profiles in primary carcinoma and liver metastases. BMC Cancer. 2016;16:539.
Orjuela S, Menigatti M, Schraml P, Kambakamba P, Robinson MD, Marra G. The DNA hypermethylation phenotype of colorectal cancer liver metastases resembles that of the primary colorectal cancers. BMC Cancer. 2020;20(1):290.
Lee JR, Kwon CH, Choi Y, Park HJ, Kim HS, Jo HJ, et al. Transcriptome analysis of paired primary colorectal carcinoma and liver metastases reveals fusion transcripts and similar gene expression profiles in primary carcinoma and liver metastases. Dataset Bioproject Accession PRJNA288518. 2016. https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA288518.
Loboda A, Nebozhyn MV, Watters JW, Buser CA, Shaw PM, Huang PS, et al. EMT is the dominant program in human colon cancer. BMC Med Genomics. 2011;4:9.
Loboda A, Nebozhyn MV, Watters JW, Buser CA, Shaw PM, Huang PS, et al. EMT is the dominant program in human colon cancer. NCBI GEO Dataset GSE28722. 2011. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE28722.
Yuan X, Li Z, Zhao H, Bai J, Zhang J. Accurate inference of tumor purity and absolute copy numbers from high-throughput sequencing data. Front Genet. 2020;11:458.
Rick JW, Chandra A, Dalle Ore C, Nguyen AT, Yagnik G, Aghi MK. Fibronectin in malignancy: cancer-specific alterations, protumoral effects, and therapeutic implications. Semin Oncol. 2019;46(3):284–90.
Kaszak I, Witkowska-Pilaszewicz O, Niewiadomska Z, Dworecka-Kaszak B, Ngosa Toka F, Jurka P. Role of cadherins in cancer-a review. Int J Mol Sci. 2020;21(20):7624.
Zhou Y, Jia Q, Meng X, Chen D, Zhu B. ERRalpha regulates OTUB1 expression to promote colorectal cancer cell migration. J Cancer. 2019;10(23):5812–9.
Bharadwaj AG, Holloway RW, Miller VA, Waisman DM. Plasmin and plasminogen system in the tumor microenvironment: implications for cancer diagnosis, prognosis, and therapy. Cancers (Basel). 2021;13(8):1838.
Li H, Gao J, Zhang S. Functional and clinical characteristics of cell adhesion molecule CADM1 in cancer. Front Cell Dev Biol. 2021;9:714298.
Jin YJ, Aycheh HM, Han S, Chamberlin J, Shin J, Byun S, et al. Differential alternative splicing between hepatocellular carcinoma with normal and elevated serum alpha-fetoprotein. BMC Med Genomics. 2020;13(Suppl 11):194.
Li H, Ray G, Yoo BH, Erdogan M, Rosen KV. Down-regulation of death-associated protein kinase-2 is required for beta-catenin-induced anoikis resistance of malignant epithelial cells. J Biol Chem. 2009;284(4):2012–22.
Steinmann S, Kunze P, Hampel C, Eckstein M, Bertram Bramsen J, Muenzner JK, et al. DAPK1 loss triggers tumor invasion in colorectal tumor cells. Cell Death Dis. 2019;10(12):895.
Orjuela S. Sequencing of DNA captured with a methyl-CpG binding domain (MBD) of primary colorectal cancers, with paired normal tissue, and liver metastases from different patients., biostudies-arrayexpress, V1; 1970. https://www.ebi.ac.uk/biostudies/studies/E-MTAB-8232.
Harris RA, Wang T, Coarfa C, Nagarajan RP, Hong C, Downey SL, et al. Comparison of sequencing-based methods to profile DNA methylation and identification of monoallelic epigenetic modifications. Nat Biotechnol. 2010;28(10):1097–105.
Hashimoto H, Liu Y, Upadhyay AK, Chang Y, Howerton SB, Vertino PM, et al. Recognition and potential mechanisms for replication and erasure of cytosine hydroxymethylation. Nucleic Acids Res. 2012;40(11):4841–9.
Lee LM, Seftor EA, Bonde G, Cornell RA, Hendrix MJ. The fate of human malignant melanoma cells transplanted into zebrafish embryos: assessment of migration and cell division in the absence of tumor formation. Dev Dyn. 2005;233(4):1560–70.
Marques IJ, Weiss FU, Vlecken DH, Nitsche C, Bakkers J, Lagendijk AK, et al. Metastatic behaviour of primary human tumours in a zebrafish xenotransplantation model. BMC Cancer. 2009;9:128.
Pruvot B, Jacquel A, Droin N, Auberger P, Bouscary D, Tamburini J, et al. Leukemic cell xenograft in zebrafish embryo for investigating drug efficacy. Haematologica. 2011;96(4):612–6.
Kopsida M, Liu N, Kotti A, Wang J, Jensen L, Jothimani G, et al. RhoB expression associated with chemotherapy response and prognosis in colorectal cancer. Cancer Cell Int. 2024;24(1):75.
Povoa V, Rebelo de Almeida C, Maia-Gil M, Sobral D, Domingues M, Martinez-Lopez M, et al. Innate immune evasion revealed in a colorectal zebrafish xenograft model. Nat Commun. 2021;12(1):1156.
Ecsedi S, Rodríguez-Aguilera JR, Hernandez-Vargas H. 5-Hydroxymethylcytosine (5hmC), or how to identify your favorite cell. Epigenomes. 2018;2(1):3.
Li W, Zhang X, Lu X, You L, Song Y, Luo Z, et al. 5-Hydroxymethylcytosine signatures in circulating cell-free DNA as diagnostic biomarkers for human cancers. Cell Res. 2017;27(10):1243–57.
Huang Y, Rao A. Connections between TET proteins and aberrant DNA modification in cancer. Trends Genet. 2014;30(10):464–74.
Xu T, Gao H. Hydroxymethylation and tumors: can 5-hydroxymethylation be used as a marker for tumor diagnosis and treatment? Hum Genomics. 2020;14(1):15.
Wang F, Zhang J, Qi J. Ten-eleven translocation-2 affects the fate of cells and has therapeutic potential in digestive tumors. Chronic Dis Transl Med. 2019;5(4):267–72.
Tucker DW, Getchell CR, McCarthy ET, Ohman AW, Sasamoto N, Xu S, et al. Epigenetic reprogramming strategies to reverse global loss of 5-hydroxymethylcytosine, a prognostic factor for poor survival in high-grade serous ovarian cancer. Clin Cancer Res. 2018;24(6):1389–401.
Sajadian SO, Ehnert S, Vakilian H, Koutsouraki E, Damm G, Seehofer D, et al. Induction of active demethylation and 5hmC formation by 5-azacytidine is TET2 dependent and suggests new treatment strategies against hepatocellular carcinoma. Clin Epigenetics. 2015;7(1):98.
Ge G, Peng D, Xu Z, Guan B, Xin Z, He Q, et al. Restoration of 5-hydroxymethylcytosine by ascorbate blocks kidney tumour growth. EMBO Rep. 2018;19(8):e45401.
Gillberg L, Ørskov AD, Nasif A, Ohtani H, Madaj Z, Hansen JW, et al. Oral vitamin C supplementation to patients with myeloid cancer on azacitidine treatment: normalization of plasma vitamin C induces epigenetic changes. Clin Epigenetics. 2019;11(1):143.
Gustafson CB, Yang C, Dickson KM, Shao H, Van Booven D, Harbour JW, et al. Epigenetic reprogramming of melanoma cells by vitamin C treatment. Clin Epigenetics. 2015;7(1):51.
Liu Y, Xu S, Zu T, Li F, Sang S, Liu C, et al. Reversal of TET-mediated 5-hmC loss in hypoxic fibroblasts by ascorbic acid. Lab Invest. 2019;99(8):1193–202.
Matuleviciute R, Cunha PP, Johnson RS, Foskolou IP. Oxygen regulation of TET enzymes. FEBS J. 2021;288(24):7143–61.
Shenoy N, Bhagat TD, Cheville J, Lohse C, Bhattacharyya S, Tischer A, et al. Ascorbic acid-induced TET activation mitigates adverse hydroxymethylcytosine loss in renal cell carcinoma. J Clin Invest. 2019;129(4):1612–25.
Cheng J, Guo S, Chen S, Mastriano SJ, Liu C, D’Alessio AC, et al. An extensive network of TET2-targeting MicroRNAs regulates malignant hematopoiesis. Cell Rep. 2013;5(2):471–81.
Joshi K, Liu S, Breslin SJP, Zhang J. Mechanisms that regulate the activities of TET proteins. Cell Mol Life Sci. 2022;79(7):363.
Tsai KW, Li GC, Chen CH, Yeh MH, Huang JS, Tseng HH, et al. Reduction of global 5-hydroxymethylcytosine is a poor prognostic factor in breast cancer patients, especially for an ER/PR-negative subtype. Breast Cancer Res Treat. 2015;153(1):219–34.
Wu MZ, Chen SF, Nieh S, Benner C, Ger LP, Jan CI, et al. Hypoxia drives breast tumor malignancy through a TET-TNFalpha-p38-MAPK signaling axis. Cancer Res. 2015;75(18):3912–24.
Deng M, Zhang R, He Z, Qiu Q, Lu X, Yin J, et al. TET-mediated sequestration of miR-26 drives EZH2 expression and gastric carcinogenesis. Cancer Res. 2017;77(22):6069–82.
Filipczak PT, Leng S, Tellez CS, Do KC, Grimes MJ, Thomas CL, et al. p53-suppressed oncogene TET1 prevents cellular aging in lung cancer. Cancer Res. 2019;79(8):1758–68.
Chen LY, Huang RL, Chan MW, Yan PS, Huang TS, Wu RC, et al. TET1 reprograms the epithelial ovarian cancer epigenome and reveals casein kinase 2alpha as a therapeutic target. J Pathol. 2019;248(3):363–76.
Yang G, Zeng X, Wang M, Wu A. The TET2/E-cadherin/β-catenin regulatory loop confers growth and invasion in hepatocellular carcinoma cells. Exp Cell Res. 2018;363(2):218–26.
Muller T, Gessi M, Waha A, Isselstein LJ, Luxen D, Freihoff D, et al. Nuclear exclusion of TET1 is associated with loss of 5-hydroxymethylcytosine in IDH1 wild-type gliomas. Am J Pathol. 2012;181(2):675–83.
Pardali K, Moustakas A. Actions of TGF-beta as tumor suppressor and pro-metastatic factor in human cancer. Biochim Biophys Acta. 2007;1775(1):21–62.
Domcke S, Bardet AF, Adrian Ginno P, Hartl D, Burger L, Schubeler D. Competition between DNA methylation and transcription factors determines binding of NRF1. Nature. 2015;528(7583):575–9.
Xiong J, Zhang Z, Chen J, Huang H, Xu Y, Ding X, et al. Cooperative action between SALL4A and TET proteins in stepwise oxidation of 5-methylcytosine. Mol Cell. 2016;64(5):913–25.
Luo A, Zhang X, Fu L, Zhu Z, Dong JT. Zinc finger factor ZNF121 is a MYC-interacting protein functionally affecting MYC and cell proliferation in epithelial cells. J Genet Genomics. 2016;43(12):677–85.
Hill D, Chen L, Snaar-Jagalska E, Chaudhry B. Embryonic zebrafish xenograft assay of human cancer metastasis. F1000Res. 2018;7:1682.
Ibrahim AE, Arends MJ, Silva AL, Wyllie AH, Greger L, Ito Y, et al. Sequential DNA methylation changes are associated with DNMT3B overexpression in colorectal neoplastic progression. Gut. 2011;60(4):499–508.
Aguirre-Gamboa R, Gomez-Rueda H, Martinez-Ledesma E, Martinez-Torteya A, Chacolla-Huaringa R, Rodriguez-Barrientos A, et al. SurvExpress: an online biomarker validation tool and database for cancer gene expression data using survival analysis. PLoS ONE. 2013;8(9):e74250.
Ernst M. Permutation methods: a basis for exact inference. Stat Sci. 2004;19.
Ko M, Huang Y, Jankowska AM, Pape UJ, Tahiliani M, Bandukwala HS, et al. Impaired hydroxylation of 5-methylcytosine in myeloid cancers with mutant TET2. Nature. 2010;468(7325):839–43.
Murrell A, Colorectal cancer progression to metastasis is associated with dynamic genome-wide biphasic 5-hydroxymethylcytosine accumulation, Data set NCBI GEO GSE268934 https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi.
Acknowledgements
We thank Dr Anna Caldwell at the mass spectrometry facility at King’s College London.
Funding
UKRI: Medical Research Council, London, GB, grant numbers MR/T000481/1 and MR/P000711/1; Engineering and Physical Sciences Research Council, grant number 259868; Biotechnology and Biological Sciences Research Council, grant number 2598658. Wellcome Trust Sir Henry Dale Fellowship, number 220188/A/20/Z. Royal Society RG|/R2/232121. The Academy of Medical Sciences Springboard Award SBF008/1073. Cancer Research at Bath CR@B network; The University of Cambridge, Cancer Research UK (CRUK SEB-Institute Group Award A ref10182).
Author information
Contributions
SU-L prepared sequencing libraries and analysed data. BM, DOH, YT and FH performed the bioinformatics analyses and prepared figures 1–4. FH, DLR, PM, JP, TD-P, JLR and SU-L produced and analysed genomic and molecular biology data. JLR, TD-P, JS, DG and NN designed, performed and analysed zebrafish experiments and prepared figure 5. AEKI consulted on clinical analysis and coordinated patient sample selection. AM conceived and designed the study and wrote the manuscript together with BM, DOH and FH. All authors have read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Research using patient tumour samples was conducted under the principles of the World Medical Association Helsinki agreement with ethical approval obtained from the Cambridgeshire Local Research Ethics Committee (LREC references 04/Q0108/125 and 06/Q0108/307).
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
12915_2025_2205_MOESM1_ESM.pdf
Additional file 1: Fig. S1 Total counts of hMeDIPseq peaks (enriched over input) show metastatic tumours to the liver have more 5hmC enrichment compared to primary tumours, and hMeDIPseq peak loci in metastatic liver tumours cluster towards the colon. Fig. S2 STRING and Gene Ontology analyses of 5hmC marked genes that have differential expression in metastasis. Fig. S3 A visual examination of the 5hmC data on an integrative genomics viewer (IGV), showing a pattern of 5hmC ‘recovering’ at sites where 5hmC was present in normal colon tissue, absent in the colon tumour and recovered in the metastatic tumours. Fig. S4 CRISPR-Cas9 strategy for TET1/TET2/TET3 triple knock out (TKO) in SW480 cells and effects on 5hmC accumulation.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Murcott, B., Honig, F., Halliwell, D.O. et al. Colorectal cancer progression to metastasis is associated with dynamic genome-wide biphasic 5-hydroxymethylcytosine accumulation. BMC Biol 23, 100 (2025). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12915-025-02205-y
DOI: https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12915-025-02205-y