Abstract
Pediatric glioblastoma (pedGBM) is a highly malignant primary brain tumor that is typically associated with a fatal outcome and carries recurrent mutations in the chromatin remodeler ATRX and the histone variant H3.3. ATRX acts as a suppressor of the alternative lengthening of telomeres (ALT) pathway, which is frequently activated in pedGBM. However, telomere features of pedGBMs have not been studied in detail, and ALT-positive model cell lines are lacking. Here, we systematically characterized a panel of pedGBM models that carry a representative set of recurrent genomic mutations for a variety of telomere features. These included the presence of ALT-associated promyelocytic leukemia nuclear bodies and C-circles, a specific type of extrachromosomal telomeric repeats, the telomere repeat content, and phosphorylation of histone H3.3 at serine 31. From an integrated analysis of seven pedGBM cell lines and 57 primary tumor samples, we identified cell lines and tumors that represent the different telomere maintenance mechanisms and conclude the following: (i) A positive signal in the C-circle assay is a reliable ALT marker. (ii) ALT features occur heterogeneously and one pedGBM subgroup uses a ‘non-canonical’ ALT mechanism in the presence of wild-type ATRX. (iii) The spreading of H3.3S31 phosphorylation during mitosis is associated with loss of ATRX but not with ALT per se. (iv) In contrast to a previous study in glioma stem cells, we did not find a hypersensitivity of ALT cells towards the ATR inhibitor VE-821. (v) ALT-positive pedGBMs can be reliably identified from a classification scheme developed here that evaluates various combinations of cytogenetic and/or genomic data. Thus, our findings elucidate further details of the ALT pathway in pedGBMs, provide valuable models for evaluating ALT-targeted therapies in a preclinical setting, and introduce an ALT classification scheme for primary tumor samples.
Introduction
Cancer cells acquire a telomere maintenance mechanism (TMM) to avoid cellular senescence and apoptosis induced by the replicative shortening of their chromosome ends. Frequently, telomerase is reactivated to extend the telomeres. However, in 5-25% of cases (depending on tumor entity), alternative lengthening of telomeres (ALT) pathways exist that operate via DNA repair and recombination processes [1-3]. In comparison to other tumor entities, the ALT phenotype has an unusually high prevalence in glioblastoma (GBM) [4-6]. With an estimate of 44% of ALT-positive cases, as determined by telomere FISH, pediatric GBMs (pedGBMs) are even more prone to develop ALT than adult GBMs (14%) [6]. The ALT phenotype shows a strong correlation with mutations in the DNA helicase and chromatin remodeler ATRX in pedGBMs (14-35%) as well as in adult GBMs (7%) [1, 5]. Interestingly, ATRX mutations often co-occur with mutations in TP53 [7, 8]. Based on these recurrent mutations and on a number of mechanistic cell line studies, it is emerging that ATRX acts as a tumor suppressor and inhibits the emergence of ALT, as discussed in several recent reviews [9-11]. This activity appears to be related to the proper deposition of the histone variant H3.3 at heterochromatic genomic regions such as telomeres, pericentromeres, endogenous retroviral repeats, and imprinted genes. Interestingly, pedGBMs frequently harbor mutations in H3.3, namely K27M or G34R/V, which are almost exclusively found in glial tumors of childhood, but it is unclear whether H3.3 mutations and the ALT phenotype are linked.
Identifying the TMM that is active in a tumor is important as it can provide valuable prognostic and potentially predictive information in some cancer types [4, 12-15]. Furthermore, deregulated TMM factors represent potential targets for anticancer therapies that are mostly unique to cancer cells. While therapies that target ALT are lacking, small molecule telomerase inhibitors are currently being tested in clinical trials [16-18]. However, treatment with telomerase inhibitors may select for the emergence of an ALT-positive tumor population [19-21]. The presence of ALT is typically inferred from a number of characteristics that include very heterogeneous telomere lengths [2, 22], ultra-bright telomere foci detected by telomere FISH in tumor tissues [1], increased telomeric recombination [23], the presence of a specific type of circular mostly single-stranded C-rich extrachromosomal telomeric repeats (C-circles) [2, 24-26] and complexes of PML nuclear bodies with telomeres, termed ALT-associated promyelocytic leukemia (PML) nuclear bodies (APBs) [23, 27, 28]. More recently, two additional features have been linked to the presence of ALT: High levels of the telomeric repeat-containing non-coding RNA TERRA, and the above-mentioned mutations in the chromatin remodeler ATRX [1, 5, 29, 30]. Furthermore, the phosphorylation of serine 31 on H3.3 has been reported to spread abnormally along mitotic chromosomes specifically in ALT-positive cells [31].
Despite the high prevalence of ALT in pedGBMs, a systematic characterization and evaluation of ALT is currently lacking. It is unclear which combination of markers is suitable for reliable identification of ALT in this entity, and what would be an applicable workflow to be used in a clinical routine procedure. Furthermore, ALT has not been characterized in cell lines derived from pedGBMs and only two adult glioblastoma cell lines with an ALT phenotype have been described to date [32, 33]. Thus, studies on the ALT mechanism in pedGBMs that require in vitro models cannot be conducted. To address these shortcomings, we here performed a comprehensive analysis of characteristic ALT features in pedGBM tumor samples from the ICGC PedBrain cohort as well as a panel of pedGBM cell lines. These cell lines can be used to evaluate potentially ALT-specific treatment approaches, such as the application of the ataxia telangiectasia- and RAD3-related (ATR) protein inhibitor, which has previously been reported to specifically target cells that have ALT [15]. Finally, we provide a classification scheme to predict the presence of ALT in a tumor sample from a subset of measured ALT features.
Results
Identification of five ALT-positive pedGBM cell lines reveals a ‘non-canonical’ ALT phenotype
We analyzed seven pedGBM cell lines including the three well-established cell lines SF188, SJ-G2 and KNS42 as well as cell lines derived from freshly resected H3.3-K27M mutant pediatric high-grade gliomas (NEM157, NEM165, NEM168) and a H3.3-G34R mutant tumor (MGBM1) that were described previously [45, 46]. Previous studies have reported various mutations in these cell lines. These included a C250T mutation in the promoter of the TERT gene encoding the protein component of telomerase and a hypomorphic ATRX point mutation in the KNS42 cell line as well as severe mutations in ATRX in SJ-G2, MGBM1 and NEM157 [43, 47]. We additionally performed whole-genome sequencing (WGS) of the NEM165 and NEM168 cell lines and found no mutation in ATRX but mutations in TP53 (Fig 1A). The panel of pedGBM cell lines covered different combinations of recurrent mutations known to occur in this entity, i.e. mutations in both ATRX and in the H3.3 encoding gene H3F3A (MGBM1, NEM157), or only in ATRX (SJ-G2) or only in H3F3A (KNS42, NEM165, NEM168). Additionally, ATRX expression was assessed by immunofluorescence. While the ATRX protein was undetectable in the severely ATRX-mutated cell lines, it was present in the other cells and localized in distinct nuclear foci (Fig 1A, S1 Fig). These foci largely colocalized with PML nuclear bodies as reported previously for HeLa cells and human fibroblasts [48, 49].
To classify the cell lines according to their TMM, we performed terminal restriction fragment (TRF) analysis to visualize the telomere length distribution in each cell line. C-circle assays were done to assess the presence of ALT-specific extrachromosomal telomeric repeat structures. As references, the ALT-positive U2OS cell line and the telomerase-positive HeLa cell line were included in the analyses. In the TRF blot, SF188 and KNS42 cells showed a homogeneous distribution of telomere lengths characteristic of telomerase-positive cell lines, as exemplified by the reference HeLa cell line (Fig 1B). In contrast, ALT-positive tumors or cell lines typically display a sustained smear on the TRF blot that ranges from less than 2 kb to more than 50 kb, indicating a heterogeneous distribution of telomere lengths, as exemplified by the ALT-positive U2OS cell line. Such a smear was observed for SJ-G2, MGBM1, NEM157, NEM165 and NEM168 (Fig 1B).
Those five cell lines were also found to be positive in the C-circle assay, albeit with strong variations regarding the levels of these extrachromosomal telomeric repeat structures, while the SF188 and KNS42 cells were negative in the C-circle assay (Fig 1C). Based on the TRF and C-circle analysis and the TERT expression levels, SJ-G2, MGBM1, NEM157, NEM165 and NEM168 were identified as ALT-positive (Table 1). Notably, two of these, namely NEM165 and NEM168, have wild-type ATRX and showed normal protein expression and localization (Fig 1A, S1 Fig). This finding is particularly interesting, since the vast majority of ALT cell lines harbor ATRX mutations and/or show an aberrant ATRX protein expression [30]. Additionally, ATRX has been shown to act as an ALT suppressor [50, 51]. Since mutations in the ATRX binding partner DAXX have also been associated with ALT [1], we further examined DAXX in the NEM165 and NEM168 cell lines but did not find any mutations. Thus, we propose that the NEM165 and NEM168 cell lines employ an ALT pathway in the presence of functional ATRX/DAXX, which is in line with a recent study that suggested a high fraction of tumors with neither TERT activation nor ATRX/DAXX mutations [52]. Since both of the ALT-positive, ATRX wild type cell lines harbored the H3.3-K27M mutation, we further investigated whether stable ectopic expression of H3.3 mutants in SF188, SJ-G2 and HeLa LT (long telomeres) cells is sufficient to induce or enhance the ALT phenotype. The HeLa LT cell line was selected because for these cells a successful induction of ALT has been reported previously [53]. Long-term ectopic expression of H3.3 mutants was not sufficient to induce or enhance an ALT phenotype as judged from the C-circle assay (S2 Fig).
We next analyzed a set of pedGBM samples from the ICGC PedBrain cohort by telomere FISH and C-circle assay to determine ALT activity, and compared the results with the matched sequencing information either generated in the present study or previously reported [43] (Fig 2, S3 Fig, S1 Table). The tumor samples were classified as ALT-positive based on the C-circle assay and the presence of ultra-bright telomere foci detectable by telomere FISH (Fig 2) as described in further detail below in the context of the ALT classifier scheme. Out of the 13 ALT-positive samples, 12 were found to harbor a mutation in ATRX, while one sample had a wild-type ATRX DNA sequence (S1 Table). However, this tumor (GBM33) was negative for ATRX expression when tested by immunohistochemistry, suggesting a DNA sequence independent mechanism for deregulated protein expression. Thus, the unusual ALT subgroup identified in the cell lines with functional ATRX protein was not represented in our primary tumor sample set.
ALT features are heterogeneous in ALT-positive pedGBM cell lines
A comprehensive study of ALT features in pedGBM has been lacking so far. Accordingly, we characterized additional ALT features in our set of pedGBM cell lines (Fig 3) that are summarized in Table 1 together with the TMM status. The heterogeneous telomere length of ALT-positive cells evaluated in the TRF blot (Fig 1B) is frequently also associated with a higher telomere content [29]. In line with this previous observation, the telomere content of the pedGBM cell lines as determined by quantitative PCR was found to be higher in ALT-positive cells (Fig 3A). A particularly high telomere content was measured for the NEM168 cell line. Next, the presence of APBs, defined as colocalization between telomeres and PML nuclear bodies, was analyzed by telomere FISH and immunofluorescence of PML. The telomere pattern differed between the cell lines in terms of signal intensity and detectable telomere foci, again indicative of varying telomere lengths (Fig 3B). Yet, the difference in telomere lengths was much less evident than in the TRF analysis (Fig 1B). We further note that ultra-bright foci frequently detected in tissue sections of ALT-positive tumor samples are absent in cell lines. Thus, visual inspection of telomere FISH signals in cell lines is not sufficient for reliable TMM classification. For quantification of APBs, a previously established automated confocal 3D-image acquisition and analysis was employed [40]. In the non-ALT cell lines, 0.5 ± 0.1 colocalizations of telomeres and PML bodies were detected per cell, which is similar to the reference telomerase-positive HeLa cell line (Fig 3C). The number of colocalizations per cell was consistently higher in the ALT-positive cell lines. These displayed a rather large variation, ranging from 0.8 to 8.5 APBs per cell (Fig 3C). The latter value was determined in the ATRX and H3.3-G34R mutated MGBM1 cell line, which thus had about twice the number of APBs as the ALT-positive U2OS reference cell line. Interestingly, the MGBM1 cell line also had a particularly high amount of C-circles (Fig 1C).
Since the telomeric non-coding RNA TERRA has been implicated in the ALT mechanism (reviewed in [11]), we next determined TERRA levels in each pedGBM cell line by quantitative reverse transcription PCR (qRT-PCR) in relation to HeLa and U2OS cells (Fig 3D). TERRA levels in the telomerase-positive cells were about 10-fold lower than in U2OS cells (Fig 3D), an ALT cell line with high TERRA levels [30]. TERRA levels in the ALT-positive SJ-G2 and MGBM1 cells were similar to U2OS. Interestingly, relative TERRA levels in the ALT-positive H3.3-K27M mutant NEM165 and NEM168 cell lines were determined to be 5-fold less than in U2OS cells, while still being 2-fold higher than in HeLa cells. This indicates that in general TERRA is a good marker for ALT. However, ALT determination solely based on TERRA levels could lead to an inaccurate classification. TERRA levels have been reported to correlate with hypomethylation of the subtelomeres, where the TERRA promoter is located [29, 54]. Accordingly, methylation at CpG sites located within one megabase (Mb) of the chromosome end was analyzed based on Illumina 450K methylation array data [39]. These regions were not found to be differentially methylated in the two H3.3-G34R/V mutant cell lines studied here (S4 Fig).
Aberrant H3.3 serine 31 phosphorylation is linked to ATRX loss rather than ALT per se
A recent study reported that ALT-positive cells specifically display high levels of histone H3.3 serine 31 phosphorylation (H3.3S31p) on the entire chromosome, due to an elevated activity of CHK1 serine/threonine kinase [31]. We therefore analyzed the distribution of H3.3S31p on metaphase chromosomes of the pedGBM cell lines. For telomerase-positive pedGBM cells, H3.3S31p was confined to a region close to the centromeres, in line with previous reports [31, 55]. In contrast, the phosphorylation was spread over the entire chromosomes in ATRX-mutated ALT-positive cells (Fig 4). Notably, this was not the case for ATRX wild type ALT-positive NEM165 and NEM168 cells. In the latter cell lines H3.3S31p was mainly restricted to the centromeres and pericentromeres, resembling the H3.3S31p staining of telomerase-positive cells. A similar observation has been reported previously for the ALT-positive, ATRX wild type SKLU1 cell line [31]. Hence, the aberrant H3.3S31p correlates with ATRX loss rather than ALT activity. The wild-type H3F3A containing SJ-G2 cell line displayed the same H3.3S31 phosphorylation pattern on mitotic chromosomes as the cell lines MGBM1 and NEM157, which harbor the G34R and the K27M mutations, respectively. Thus, the presence of H3F3A mutations did not show any obvious correlation with the aberrant H3.3S31p mark.
ALT-positive pedGBM cells are not hypersensitive to ATR inhibition
An intact shelterin complex protects telomeres from being erroneously recognized by the DNA damage response machinery, including ATR protein kinase signaling. In ALT-positive cells, ATR inhibition leads to a reduction of ALT markers [15, 53]. Inhibition of ATR in combination with other treatments has also been reported to have an effect on viability of cancer cells in general [56, 57]. In addition, a hypersensitivity of ALT-positive tumors has been proposed by Flynn et al. [15]. Accordingly, we tested ATR inhibitor sensitivity for the panel of 7 pedGBM cell lines introduced here. In line with our previous study on non-glioma cell lines [42], we found no correlation of ALT status and response to the VE-821 ATR inhibitor (Fig 5A). Instead, the sensitivity varied between cell lines irrespective of the telomere maintenance mechanism. For example, the ALT cell line SJ-G2 was highly sensitive to the ATR inhibitor, whereas another ALT-positive cell line, MGBM1, was resistant to the treatment. In addition, we quantified induced cell death by FACS analysis of annexin V and propidium iodide stained ALT and non-ALT cells treated with 3 μM VE-821 for 6 days (Fig 5B, S5A Fig). To better compare the results of the cell viability assays with FACS results, we tested the influence of cell density on VE-821 sensitivity and observed that a higher starting cell density results in a lower sensitivity (S5B Fig). This is consistent with our previous observation of a strong influence of the initial cell number in this assay [42]. In general, we did not observe a selective killing of ALT cells by ATR inhibition in these experiments.
In order to validate this finding in an isogenic setting, we generated ATRX knockout cells of one of the pedGBM cell lines, NEM168, using CRISPR/Cas9. ATRX has been shown to repress the ALT phenotype upon re-introduction into ATRX-deficient, ALT-positive cells [42, 50]. Accordingly, we detected a more pronounced ALT phenotype in ALT-positive NEM168 cells after ATRX knockout (Fig 5C) when evaluating C-circle levels. We next tested whether this enhanced ALT phenotype affects VE-821 sensitivity, which would be expected if ALT rendered cancer cells hypersensitive towards ATR inhibitors as proposed [15]. However, the loss of ATRX and concomitant enhancement of the ALT phenotype did not result in an increased VE-821 sensitivity (Fig 5D). In summary, characteristics of the cell lines other than the TMM determine the cellular response to ATR inhibition, which confirms our previous observations in non-glioma cells [42].
A selected set of features defines the TMM status in a pedGBM tumor cohort
We next analyzed pedGBM tumors from the ICGC cohort in more detail by integrating their previously published (epi)genetic features [43] in relation to their telomere maintenance mechanism. Based on the C-circle assay results introduced above, we classified 13 samples as ALT-positive (Fig 2B, S3 Fig, S1 Table). A positive C-circle signal is a reliable indicator for the presence of ALT [26]. However, the lack of signal in the C-circle reaction has to be interpreted with caution. The mostly single-stranded C-circles can rapidly degrade during repeated freeze-thaw cycles and may thus have been lost during storage of the tumor DNA. For a partly overlapping fraction of samples from the same cohort, telomere FISH was performed to detect ultra-bright telomere foci (Fig 2A, S1 Table). A high agreement between telomere FISH and C-circle assay was detected in our study. This finding corroborates previous observations of ultra-bright telomeric foci in tissue sections being a good ALT marker [1]. Based on these observations, we defined those samples with a positive C-circle result as ‘ALT’ and the ones that tested negative for both C-circles and ultra-bright foci as ‘non-ALT’. In addition, the C250T and C228T mutations in the TERT gene promoter were used as a reference feature for the telomerase-positive non-ALT phenotype. The C250T/C228T nucleotide exchanges create novel binding sites for ETS transcription factors and result in TERT expression [58, 59]. Of the three tumor samples that carry the C228T mutation, two were subjected to the C-circle assay and revealed no signal (S1 Table). This finding is in line with previous literature and the hypothesis that ALT and TERT promoter mutations are mutually exclusive [60].
For the pedGBM samples, (epi)genetic maps were determined previously [43]. These data were evaluated here in terms of the TMM status for our reference data set as defined above. Based on previously published literature, we selected the following features as informative for a TMM classification (S1 Table): (i) Loss of function of the chromatin remodeler ATRX, as inferred from sequencing analysis and/or the lack of protein expression seen in immunohistochemistry (IHC), strongly correlated with ALT. (ii) Among the 7 tumor samples that harbored mutations in the H3.3 encoding H3F3A gene and were analyzed with respect to their ALT status, 5 were found to be ALT-positive. (iii) In line with previous reports that associated TP53 mutations with ALT, 12 of the 13 C-circle-positive tumors also harbored TP53 mutations [61-63]. (iv) Chromothripsis, the mutational ‘shattering’ of large parts of a chromosome, has been functionally connected to dysfunctional telomeres and ALT activity [64, 65]. In our set of reference samples we found a higher number of chromothripsis-positive samples that were non-ALT compared to ALT-positive ones. Thus, we did not observe an association between ALT and chromothripsis in this cohort. (v) TERT expression was detected by RNA-seq. In line with previous reports, the number of TERT transcripts was low (S6A and S6B Figs), but their detection still correlated with the classification as ‘non-ALT’, further supporting the notion that in these samples telomerase is activated [66, 67]. (vi) The DNA methylation status of a region upstream of the TERT transcription start site was used as an additional surrogate marker for TERT expression [68]. The samples evaluated as ALT-positive showed consistently lower TERT promoter methylation compared to the ‘non-ALT’ samples. Thus, the latter group or part of it maintains its telomeres via a promoter DNA methylation-linked upregulation of telomerase (S1 Table, S6C and S6D Figs).
(vii) As described above for the pedGBM cell lines, telomere content was usually higher in ALT-positive cells compared to ALT-negative ones (Fig 3A). Using qPCR we measured the relative telomere content in a subset of pedGBM tumor samples where tumor and matched control blood samples were available. As expected, the ‘ALT’ tumor reference samples displayed a somewhat higher telomere content ratio than the ‘non-ALT’ group (S6E and S6F Figs).
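The telomere content ratio between a tumor and its matched control blood sample can be illustrated with a short calculation. The sketch below assumes a standard delta-Ct scheme in which the telomere signal is normalized to a single-copy reference gene and amplification efficiency is 100%; the exact qPCR quantification used in this study is not restated here, and all function names and Ct values are hypothetical.

```python
def relative_telomere_content(ct_telomere: float, ct_reference: float) -> float:
    """Telomere signal relative to a single-copy reference gene (2^-dCt).

    Assumption: perfect doubling per PCR cycle (100% efficiency).
    """
    return 2.0 ** -(ct_telomere - ct_reference)


def tumor_to_control_ratio(tumor_ct_tel: float, tumor_ct_ref: float,
                           control_ct_tel: float, control_ct_ref: float) -> float:
    """Ratio of relative telomere content in tumor versus matched control."""
    tumor = relative_telomere_content(tumor_ct_tel, tumor_ct_ref)
    control = relative_telomere_content(control_ct_tel, control_ct_ref)
    return tumor / control


# Hypothetical Ct values, for illustration only (not measured data).
ratio = tumor_to_control_ratio(14.0, 20.0, 15.0, 20.0)
print(ratio)  # a ratio above 1 means more telomere repeats in the tumor
```

In this scheme, a tumor whose telomere amplicon crosses the threshold one cycle earlier than the control, at equal reference Ct, has twice the relative telomere content.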
The analysis of differentially expressed genes in ALT and non-ALT tumors from RNA-seq data revealed 141 significant genes (adjusted p-value ≤ 0.05, 65 genes with adjusted p-value ≤ 0.01). However, no obvious TMM gene signature was identified that could be used for a classification (S7 Fig).
In summary, we defined a reference data set for TMM classification by evaluating samples with a positive C-circle signal as ‘ALT’ and those without C-circle signal in combination with the absence of ultra-bright foci in telomere FISH as well as samples with C250T/C228T TERT promoter mutations as ‘non-ALT’. Moreover, the additional features described above were selected to provide further information on the TMM status by integrating them into a classification scheme.
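The reference labeling rules summarized above can be written down as a small function. This is a minimal sketch rather than the implementation used for the study: the function name and the use of `None` for an assay that was not performed are assumptions, as is the choice to let a positive C-circle result take precedence and to leave all other combinations unlabeled.

```python
from typing import Optional


def reference_label(c_circle: Optional[bool],
                    ultra_bright_foci: Optional[bool],
                    tert_promoter_mutation: Optional[bool]) -> Optional[str]:
    """Return 'ALT', 'non-ALT', or None if the sample cannot serve as reference.

    Each argument is True/False for a positive/negative result and
    None when the assay was not performed (assumed encoding).
    """
    # A positive C-circle signal defines the 'ALT' reference group.
    if c_circle is True:
        return "ALT"
    # C250T/C228T TERT promoter mutations mark telomerase-positive 'non-ALT'.
    if tert_promoter_mutation is True:
        return "non-ALT"
    # Negative for both C-circles and ultra-bright foci: 'non-ALT'.
    if c_circle is False and ultra_bright_foci is False:
        return "non-ALT"
    # Anything else (e.g. C-circle negative without FISH data) stays unlabeled,
    # because C-circles may have degraded during storage.
    return None


print(reference_label(True, None, None))    # ALT
print(reference_label(False, False, None))  # non-ALT
print(reference_label(False, None, False))  # None
```

Leaving a C-circle-negative sample without FISH data unlabeled reflects the caution expressed above about false-negative C-circle results.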
The ALT status can be evaluated with a decision tree-based classifier
Next, we systematically evaluated the relation of the above-described features with ALT occurrence. Furthermore, we identified suitable combinations to predict if ALT is active in a given pedGBM sample. This analysis was conducted by developing a decision tree-based ALT classifier for pedGBM tumors. At http://www.cancertelsys.org/paint/index.html a web-based tool termed PAINT (Predicting Alt IN Tumors) is available. PAINT makes predictions from the available, frequently incomplete data set of a sample and allows the integration of results from very different techniques, i.e. DNA/RNA-sequencing, methylation array, IHC, qPCR, and FISH. For constructing such a classifier, the values of all features introduced above were translated into binary values and then used to construct decision trees as described in Materials and Methods (Fig 6A). The accuracy of the resulting decision tree is given as a performance value P where P = 1 represents 100% correct prediction with the samples from the reference data set. In addition, a p-value (p) was calculated for each tree based on the confusion matrix to test if it is better than a random one. Using each feature alone, ultra-bright telomeric foci (P = 0.95), ATRX protein expression (P = 0.90), and ATRX mutation (P = 0.89) were the best predictors for ALT in pedGBM (Fig 6A, S2 Table). On the other hand, information on chromothripsis alone is not able to predict the class reliably (P = 0.63, p > 0.05). The decision tree based only on telomere content is also not associated with a significant p-value (S2 Table). For the latter case, the low number of reference samples should be taken into account (n = 15), and the addition of further samples to the training set might improve predictions based on this feature in the future. 
The ATRX status has a strong predictive power in pedGBM and addition of further features such as TERT expression and telomere content improves the performance of the resulting decision tree only marginally from P = 0.90 to P = 0.91 (p = 3.12 × 10⁻⁵) (Fig 6B). On the other hand, merging features that on their own were only relatively weak predictors, e.g. TP53 mutation (P = 0.69) and TERT expression (P = 0.77), or were not significant such as telomere content (P = 0.73, p = 0.14), into one decision tree can clearly improve the performance of the prediction (P = 0.82, p = 4.29 × 10⁻⁴) (Fig 6C, S2 Table).
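To make the two reported numbers concrete, the following sketch evaluates a single binary feature as a one-level decision tree. It rests on several assumptions: P is taken to be the fraction of correctly predicted reference samples, the p-value is computed with a one-sided Fisher's exact test on the confusion matrix (the test actually used is described in Materials and Methods and may differ), and samples lacking the feature are skipped. All names and the sample data are hypothetical.

```python
from math import comb


def confusion_matrix(predicted, actual):
    """Return (tp, fp, fn, tn) with 'ALT' (True) as the positive class."""
    tp = fp = fn = tn = 0
    for p, a in zip(predicted, actual):
        if p and a:
            tp += 1
        elif p and not a:
            fp += 1
        elif not p and a:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def performance(tp, fp, fn, tn):
    """Fraction of correct predictions (assumed meaning of P)."""
    return (tp + tn) / (tp + fp + fn + tn)


def fisher_one_sided(tp, fp, fn, tn):
    """Probability of seeing at least `tp` true positives by chance
    (hypergeometric null with fixed row and column totals)."""
    n = tp + fp + fn + tn
    predicted_alt = tp + fp
    actual_alt = tp + fn
    upper = min(predicted_alt, actual_alt)
    favorable = sum(comb(actual_alt, k) * comb(n - actual_alt, predicted_alt - k)
                    for k in range(tp, upper + 1))
    return favorable / comb(n, predicted_alt)


def evaluate_feature(samples, feature):
    """Predict ALT whenever the binary feature is True; skip missing values."""
    usable = [s for s in samples if s.get(feature) is not None]
    if not usable:
        raise ValueError(f"no sample has a value for {feature!r}")
    cm = confusion_matrix([s[feature] for s in usable],
                          [s["alt"] for s in usable])
    return performance(*cm), fisher_one_sided(*cm)


# Synthetic samples for illustration only (not data from this study).
samples = (
    [{"atrx_loss": True, "alt": True}] * 4
    + [{"atrx_loss": False, "alt": True}]
    + [{"atrx_loss": False, "alt": False}] * 4
    + [{"atrx_loss": True, "alt": False}]
    + [{"atrx_loss": None, "alt": True}]  # feature not measured: ignored
)
perf, p_value = evaluate_feature(samples, "atrx_loss")
print(perf, p_value)
```

With only ten usable synthetic samples, eight correct predictions are not enough for a p-value below 0.05, which mirrors the remark above that small reference sets limit the significance of single-feature trees.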
Discussion
Patients suffering from glioblastoma have a very poor prognosis with treatment options being often limited and ineffective [4, 69]. Thus, it is important to gain a better understanding of the tumor biology underlying GBMs. Assessment of telomere maintenance mechanisms represents a novel promising approach towards this goal. The majority of (adult) GBMs employ telomerase to maintain their telomeres. However, the ALT mechanism has an unusually high prevalence in pediatric GBM, with up to 44% of ALT-positive cases, as assessed by ultra-bright foci in telomere FISH [4, 5, 70]. Yet, other ALT markers have not been studied in detail and ALT-positive pedGBM cell lines as suitable model systems were lacking. Here, five ALT-positive pedGBM cell lines were identified. These cell lines provide valuable models to gain a better understanding of the ALT pathway in pedGBMs and to test novel ALT-targeted therapies in a preclinical setting. For this purpose it is also advantageous that the identified ALT-positive cell lines have different genetic backgrounds (Table 1) representing distinct epigenetic and biological subgroups of pedGBMs [39]. The detailed analysis of characteristic ALT markers revealed that the extent to which these features are present varied considerably among the ALT-positive cell lines (Table 1). The NEM165 and NEM168 cell lines displayed all the hallmarks of ALT without carrying an ATRX mutation, which points to a ‘non-canonical’ ALT mechanism. It is noted that the diffuse intrinsic pontine glioma (DIPG) type of pedGBM represented by the NEM cell lines carries ATRX mutations less frequently than non-brainstem pedGBMs [71, 72]. Interestingly, we found that removing ATRX from the NEM168 cell line resulted in a pronounced accumulation of C-circles indicative of increased ALT activity (Fig 5C). Thus, ATRX mutations might be a secondary event that prolongs and/or enhances an attenuated ALT mechanism in place initially.
In summary, we conclude that some variation of the ALT phenotype exists in pedGBM with respect to the degree to which typical ALT features are present. Another main observation from our characterization of the pedGBM cell lines refers to the aberrant pattern of H3.3 serine 31 phosphorylation, which was previously linked to the presence of ALT [31]. We show here that this modification is not a direct ALT marker but in the cell lines tested here rather is associated with the loss of ATRX. The mechanism by which this leads to an extended mitotic H3.3S31p signal is unknown. One model would be that loss of ATRX results in an increase of stalled replication forks and accumulation of single-stranded DNA. This in turn could trigger an RPA- and ATR-mediated activation of CHK1, which was implied in establishing the H3.3S31p mark [31, 73].
In contrast to a previous report, ALT activity is not suited as a biomarker to predict sensitivity to ATR inhibition, which confirms our experiments with non-glioma cell lines [15, 42]. While drug sensitivity varied among the cell lines, no association with the active TMM was observed. Thus, an efficient drug that specifically targets ALT-positive tumors is still lacking. Nevertheless, the active tumor-specific TMM is an attractive therapeutic target that needs to be further exploited in the future. In this context, a reliable TMM classification of primary tumor samples is crucial to determine whether a patient could benefit from telomerase inhibiting agents or ALT targeting drugs. In addition, the stratification of GBM patients according to the active TMM provides valuable prognostic information as ALT is associated with longer survival [4, 61, 74]. However, as discussed above, typical ALT features display a significant degree of variation in pedGBM cell lines. Thus, it is essential to base the ALT identification on a suitable set of ALT markers.
Consistent with reports for other tumor entities we find the enrichment of C-circles in pedGBM to be a very good marker for ALT, with a high agreement between this assay and the detection of ultra-bright telomere foci visualized by FISH [61]. This is of particular interest as telomere FISH currently is the major technique to detect ALT. However, it requires tissue section material, expertise in FISH and a way to quantitatively measure the telomere-specific fluorescence for a proper interpretation as discussed previously [75]. In contrast, for executing a C-circle assay as little as 30 ng of DNA is sufficient, thus making it a reasonable method for clinical applications. However, a draw-back of this assay is the instability of the mostly single-stranded C-circles since their detection can be impaired by improper storage and frequent freeze-thaw cycles and lead to false-negative results [26]. Hence, combination with other ALT markers becomes essential. Accordingly, we introduce the PAINT decision tree-based classifier as a web tool that integrates results from up to 11 different features to predict the presence of ALT in a given pedGBM sample (Fig 6). Importantly PAINT integrates sequencing-based information with results from other assays. In addition, it allows taking parameters into account, which on their own are much less conclusive with regards to the active TMM (Fig 6C). For example, the mutational status of TP53 by itself does not provide a convincing marker for ALT. However, the TMM prediction is largely improved when combined with information about the telomere content and the RNA-seq derived TERT expression, which by itself is also difficult to interpret due to generally low TERT mRNA levels [66, 67]. It is noted that the prediction of the ALT status from sequencing data via PAINT can be used for a retrospective analysis of cohorts that have been sequenced previously to link the TMM with disease progression. 
This will largely improve patient stratification with respect to their activated TMM since much larger sample numbers are available if telomere FISH or C-circle data are not required.
In the current implementation, the decision tree is based on a binary feature discrimination, with continuous data such as those from DNA methylation arrays or RNA-seq analyses being converted into binary values. In the future, the performance of the current classifiers could be improved by integrating support vector machines. Extending the reference set of pedGBM samples with defined ALT status and using it as training data in the classification scheme will also increase the prediction accuracy.
The activation of a TMM is an important step in oncogenesis and an integrated analysis of (epi)genomic features in relation to the telomere length is relevant for the analysis of all cancer types [52]. Accordingly, it will be informative to implement an expanded PAINT classifier approach that combines sequencing- and imaging-based TMM analysis with molecular assays into a pan-cancer tool. With a growing number of samples and more information to train the classifier we will also be able to extend the scheme and distinguish more groups such as telomerase-positive, ALT with loss of ATRX, ALT in the presence of functional ATRX, and cases that do not display any signs of either ALT or telomerase activation. The latter group is particularly interesting as a recent study found 22% of cancers with no sign of telomerase activation or ATRX/DAXX aberrations [52]. Accordingly, we envision that the approach described in our present study will support the TMM analysis and subsequent patient stratification not only in pedGBM but also for other tumor entities that have a high prevalence of ALT such as neuroblastoma or soft tissue sarcoma [2].
Materials and Methods
Cell culture
U2OS, HeLa cells and HeLa LT cells were cultured in DMEM and RPMI1640 medium (Gibco), respectively, each supplemented with 10% FCS, 2 mM L-glutamine and 100 μg/ml penicillin/streptomycin. SF188, SJ-G2, KNS42 and MGBM1 were cultured in high-glucose DMEM supplemented as described above. NEM157, NEM165 and NEM168 were cultured in Amniomax C-100 Basal Medium with 10% Amniomax C-100 supplement (both Gibco). All cell lines were cultured at 37°C in 5% CO2.
Generation of NEM168 ATRX knockout cell lines
Two candidate guide RNA sequences directed against ATRX were cloned into pX459v2.0 as described in [34]. The construct was transfected into NEM168 using TransIT-LT1 transfection reagent (Mirus). Transfected cells were selected using 1.5 μg/ml puromycin, and single clones were assayed for indels in the ATRX gene using the Surveyor assay (IDT). Deletion of ATRX in positive clones was validated by western blot with anti-ATRX antibody (HPA001906, Sigma).
Terminal restriction fragment (TRF) analysis
Genomic DNA was purified using the Gentra Puregene Cell Kit (Qiagen) and DNA integrity was assessed on a 1% agarose gel. Genomic DNA (5 μg) was digested overnight with the restriction enzymes Hinf I and Rsa I (12.5 U each). The digested DNA was resolved on a 0.6% agarose gel (gold, Biozym) in 0.5X TAE buffer using the CHEF-DRII pulsed-field gel electrophoresis system (Biorad) with the following settings: 4 V/cm, initial switch time 1 s, final switch time 6 s, and 13 h duration. As a size reference, a DIG-labeled molecular weight marker from the TeloTAGGG telomere length assay kit (Roche) was loaded alongside the digested DNA. Southern blotting and chemiluminescent detection were performed using the TeloTAGGG kit according to the manufacturer’s instructions. The resulting chemiluminescent signals were measured with a Chemidoc MP imaging system (Biorad).
C-circle assay
The C-circle assay was essentially performed as described previously [26]. Briefly, genomic DNA was isolated from cell lines using the QIAamp DNA mini kit (Qiagen). 30 ng DNA (in 10 μl) was combined with 10 μl 2X Φ29 buffer supplemented with 7.5 U Φ29 DNA polymerase (New England Biolabs), 0.2 mg/ml BSA, 0.1% (v/v) Tween-20, 1 mM each of dATP, dGTP and dTTP and incubated at 30°C for 8 h, followed by an incubation at 65 °C for 20 min. Reactions without addition of the Φ29 polymerase were included as control (“–pol”). After addition of 40 μl 2x SSC, the amplified DNA was dot-blotted with a 96-well dot blotter onto a 2x-SSC-soaked nylon membrane. The membrane was baked for 20 min at 120°C and hybridized and developed using the TeloTAGGG kit according to manufacturer’s instructions. The result of the C-circle assay was evaluated as ALT-positive if the signal intensity of the reaction with polymerase (“+pol”) was at least 1.4-fold higher than the intensity of the C-circle signal of the reaction without polymerase (“-pol”) and additionally, at least fourfold higher than the background.
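The two-part scoring rule for the dot blot can be written down compactly. The following is a minimal sketch, not the authors' analysis code: the function name and argument names are hypothetical, and it assumes that the "+pol" signal is the quantity compared against the background.

```python
def call_c_circle(plus_pol, minus_pol, background,
                  min_fold_over_minus_pol=1.4, min_fold_over_background=4.0):
    """Return True if a dot-blot measurement meets both ALT-positive criteria.

    plus_pol, minus_pol and background are signal intensities in the same
    arbitrary units. Assumption: the "+pol" signal is the one compared with
    the background (the text does not spell this out).
    """
    if minus_pol <= 0 or background <= 0:
        raise ValueError("reference intensities must be positive")
    exceeds_minus_pol = plus_pol >= min_fold_over_minus_pol * minus_pol
    exceeds_background = plus_pol >= min_fold_over_background * background
    return exceeds_minus_pol and exceeds_background
```

With intensities of 1000 ("+pol"), 500 ("-pol") and 100 (background), both criteria are met; lowering the "+pol" signal to 600 fails the 1.4-fold criterion.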
TERRA qRT-PCR
Total RNA was extracted with the RNeasy kit (Qiagen) and subsequently digested with DNase I (Promega) for 30 min to ensure depletion of contaminating genomic DNA. Total RNA (1 μg) was reverse transcribed with gene-specific primers (Telo RT 5′-CCC TAA CCC TAA CCC TAA CCC TAA CCC TAA-3′, β-actin RT 5′-AGT CCG CCT AGA AGC ATT TG-3′, see ref. [35]) using Superscript III at 55 °C for 1 h, followed by RNase H treatment. Reactions without reverse transcriptase were performed as controls. qRT-PCR analysis was performed using SYBR green master mix (Roche). Reactions were set up in triplicate with 500 nM telomere-specific primers (forward 5′-CGG TTT GTT TGG GTT TGG GTT TGG GTT TGG GTT TGG GGT-3′, reverse 5′-GGC TTG CCT TAC CCT TAC CCT TAC CCT TAC CCT TAC CCT-3′) as described [36]. β-actin-specific primers were used as described previously [35]. The amplification program was as follows: 95°C for 10 min followed by 36 cycles at 95°C, 58°C and 72°C each for 10 s. TERRA levels were normalized against β-actin signals and a standard curve was used to obtain relative quantities of TERRA.
Telomere qPCR
Telomere-repeat quantitative PCR was conducted essentially as described previously [36, 37]. In short, 5 ng DNA, 1x Lightcycler 480 SYBR Green master mix and 500 nM each of forward and reverse primer were added per 10 μl reaction (for telomere repeats telo-fwd 5’-CGG TTT GTT TGG GTT TGG GTT TGG GTT TGG GTT TGG GTT-3’ and telo-rev 5’-GGC TTG CCT TAC CCT TAC CCT TAC CCT TAC CCT TAC CCT-3’ and for the single copy gene RPLP0/36B4 36B4-fwd 5’-CAG CAA GTG GGA AGG TGT AAT CC-3’ and 36B4-rev 5’-CCC ATT CTA TCA TCA ACG GGT ACA A-3’). Cycling conditions were 10 min at 95 °C, followed by 40 cycles of 95 °C for 15 s and 60 °C for 1 min. A standard curve was used to determine relative quantities of telomere repeats (T) to those of the single copy gene (S). The log2 ratio of telomere content was determined as the log2 of the T/S ratio of the tumor sample divided by the T/S ratio of the control sample. The calculated log2 ratio represents the increase or decrease in telomere content in tumor versus control samples.
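As an illustration of the telomere content calculation, the sketch below computes the log2 ratio from standard-curve-derived relative quantities. The function and argument names are hypothetical, and the input values in the usage note are invented for the example.

```python
import math


def telomere_log2_ratio(t_tumor, s_tumor, t_control, s_control):
    """Log2 of the tumor T/S ratio relative to the control T/S ratio.

    T is the relative quantity of telomere repeats and S that of the
    single copy gene, both read from a standard curve.
    """
    ts_tumor = t_tumor / s_tumor
    ts_control = t_control / s_control
    return math.log2(ts_tumor / ts_control)
```

For example, a tumor with T/S = 4 compared with a control with T/S = 2 gives a log2 ratio of 1, that is, a twofold higher telomere content; a value of 0 means no difference from the control.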
Immunofluorescence and FISH
Immunofluorescence and telomere FISH on cell lines were performed as described previously [38]. The following antibodies were used: mouse anti-PML (1:100, Santa Cruz, sc-966) and rabbit anti-ATRX (1:500, Bethyl Labs, A301-045). Immunohistochemistry for ATRX (Sigma, HPA001906; dilution 1:750) on FFPE samples was performed as previously described [39]. Interphase telomere FISH was performed using a PNA probe kit (Dako).
H3.3S31p staining on metaphase spreads
Cells were arrested in mitosis by adding Karyomax colcemid solution (Gibco) at a final concentration of 0.1 μg/ml to the culture medium and incubating for 2 h. Mitotic cells were harvested by shake-off and resuspended in a small volume of medium. Pre-warmed hypotonic solution (0.5% sodium citrate) was added dropwise to the cell suspension to obtain a final concentration of 2 × 10^4 cells/ml. After incubation at 37 °C for 10 min, 500 μl of the cell suspension was cytocentrifuged on a microscopic glass slide (1200 rpm, 5 min). Slides were air-dried briefly and transferred to KCM buffer (120 mM KCl, 20 mM NaCl, 10 mM Tris-HCl pH7.2, 0.5 mM EDTA, 0.1% (v/v) Triton X-100) for 15 min at RT. Next, slides were incubated in anti-H3.3S31p rabbit antibody (Active motif, 1:100 in 10% goat serum in KCM buffer) for 1 h at room temperature in a humid chamber. Slides were washed two times for 5 min in KCM buffer and incubated with anti-rabbit Alexa 488 secondary antibody (Invitrogen, 1:300 in 10% goat serum in KCM buffer). Slides were again washed two times for 5 min in KCM buffer, fixed for 10 min in 4% (v/v) paraformaldehyde in KCM buffer, washed for 5 min in H2O, air-dried and mounted in prolong gold antifade reagent with DAPI (Thermo Fisher Scientific).
Confocal image acquisition and analysis
Fluorescence microscopy images were acquired with a Leica TCS SP5 confocal laser scanning microscope (oil immersion objective lens, 63x, 1.4 NA) and are displayed as maximum intensity projections. For the quantification of APBs, z-stacks of cells stained for PML and telomeres were acquired by automated confocal image acquisition, as described previously [40, 41]. Colocalizations of PML and telomeres, representing APBs, were quantified using a 3D-model-based segmentation approach as described previously [40, 41].
ATR inhibitor sensitivity assays
For cell viability assays, 1,500 cells were seeded in triplicate in 96-well plates and incubated overnight as described previously [42]. The following day cells were either treated with DMSO (control) or with increasing concentrations of the ATR inhibitor VE-821 (Selleckchem) dissolved in DMSO. Cells were incubated for 6 days without medium change and cell viability was analyzed using Celltiter Glo (Promega) and a TECAN Infinite M200 plate reader according to the manufacturers’ instructions. For FACS analysis, cells were seeded in T25 flasks. Each cell line was either treated with 3 μM VE-821 or with the same volume of DMSO for the control samples. After incubation for 6 days without medium change, cells (including dead cells) were collected by trypsin and total cell numbers were determined using the LUNA cell counter (Biozym). Cells were resuspended in FACS binding buffer (10 mM HEPES, 2.5 mM CaCl2, 140 mM NaCl) at a final concentration of 2 × 10^6 cells/ml, stained with FITC annexin V (Biolegend) and propidium iodide (Invitrogen) according to the manufacturers’ instructions, and analyzed by flow cytometry on a FACS Canto II (BD Biosciences). Cells that were negative for both PI and Annexin V were identified as viable cells. Annexin V positive events were characterized as apoptotic cells, whereas PI positive events were labeled as necrotic. All fractions of viable, apoptotic and necrotic cells were quantified using the WEASEL software. The percentage of induced cell death d_ind was calculated as
Whole-genome sequencing (WGS) and data analysis
Whole-genome sequencing and data analysis were performed as part of the ICGC PedBrain Tumor Project, as previously described [43]. Additional new samples (ICGC_GBM84, ICGC_GBM95, ICGC_GBM96, ICGC_GBM98, ICGC_GBM100, and cell lines NEM168, NEM165) were processed in the same way.
Accession codes
Sequencing and methylation data are available at the European Genome-phenome Archive (http://www.ebi.ac.uk/ega/), hosted by the European Bioinformatics Institute under the accession number EGAS00001001139 (see also [43]).
Construction of decision tree-based classifier
To distinguish between different TMMs in pedGBM tumor samples, a classification scheme was developed. A sample with a positive C-circle signal was classified as ALT-positive, while a sample with an activating TERT promoter mutation was classified as ALT-negative (and bona fide telomerase-positive). For samples for which both features were negative or not available, a binary decision tree was constructed based on intelligent enumeration. TMM-specific features were extracted from imaging data (presence of ultra-bright telomere foci, ATRX loss of expression from immunohistochemistry staining) and combined with parameters derived from whole-genome sequencing (chromothripsis, loss of function mutations in ATRX and TP53, K27M and G34R/V mutations in H3F3A), RNA-sequencing (TERT expression), telomere qPCR (telomere content) and from methylation data from the Illumina 450K array (TERT promoter methylation). In total, feature information was available for seven cell lines (Table 1) and 57 pediatric glioblastoma patients (S1 Table). WGS information was mainly extracted from the data in ref. [43]. For DNA methylation as well as the telomere content, only data from primary tumor samples were used. The decision tree-based classifier was constructed from 35 samples with known TMM status. The training set consisted of the 7 cell lines (5 ALT and 2 non-ALT), 13 ALT-positive tumor samples and 15 non-ALT tumor samples, of which two were telomerase-positive as inferred from detection of a TERT promoter mutation. The C-circle signal determined the class ‘ALT’ (C-circle-positive) or ‘non-ALT’ (C-circle-negative and no ultra-bright FISH foci) and was not used for constructing the decision tree. The feature values from Table 1 and S1 Table were translated into binary values with 1 representing the presence of a feature.
For continuous values (TERT promoter methylation, TERT expression in RPKM and telomere content), optimal thresholds were determined for ALT and non-ALT samples from the training data set. The thresholds were 0.01 RPKM for TERT expression (S6B Fig), a methylated fraction of 0.22 for TERT promoter methylation (S6D Fig), and a telomere content log2 ratio of 0 relative to the healthy control sample (S6F Fig). Possible binary decision trees were compared to identify the tree that had the minimal number of misclassified samples from the training set and the minimal number of questions. A leave-one-out cross-validation approach was used to determine the performance of the tree, which is measured by the percentage of correct predictions in the training data and indicated by the performance value P. Finally, the optimal tree was derived using the complete set of samples (S2 Table). In cases where the number of samples that simultaneously had multiple features of interest was small, it was tested if the construction of the decision tree could be improved by including additional samples where a missing feature was arbitrarily selected as positive or negative. Since the number of samples with overlapping information on the features was different depending on the analyzed set of features, variations in the performance values can occur even when the same set of features was used. In cases where the performance of a tree did not improve with more features for the same number of samples, the tree with the best subset was used. In addition, a p-value was calculated for every tree based on the confusion matrix using Fisher’s exact test corrected according to Benjamini and Hochberg [44]. The p-value indicates if the calculated confusion matrix is better than a random one. We note that non-significant p-values in most cases are due to a small sample size for training the respective tree.
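The binarization of continuous features and the leave-one-out estimate of the performance value P can be illustrated as follows. This is a simplified sketch under stated assumptions and not the PAINT implementation: the feature names are hypothetical, the direction of each threshold comparison (a value above the threshold maps to 1) is assumed, and a single-feature stump (`fit_stump`) stands in for the enumerated decision trees. Only the three threshold values are taken from the text.

```python
# Thresholds from the text. Assumption: a value above the threshold maps to 1.
THRESHOLDS = {
    "tert_expression_rpkm": 0.01,
    "tert_promoter_methylation": 0.22,
    "telomere_content_log2": 0.0,
}


def binarize(sample):
    """Convert continuous features to 0/1; already binary features pass through."""
    binary = {}
    for name, value in sample.items():
        if name in THRESHOLDS:
            binary[name] = int(value > THRESHOLDS[name])
        else:
            binary[name] = int(value)
    return binary


def fit_stump(rows, labels):
    """Stand-in learner: choose the feature/value pair with the fewest errors.

    Assumes complete data (no missing features) in every row.
    """
    best = None
    for feature in rows[0]:
        for alt_value in (0, 1):
            errors = sum((row[feature] == alt_value) != label
                         for row, label in zip(rows, labels))
            if best is None or errors < best[0]:
                best = (errors, feature, alt_value)
    _, feature, alt_value = best
    return lambda row: row[feature] == alt_value


def leave_one_out_performance(rows, labels, fit=fit_stump):
    """Percentage of samples predicted correctly when each is held out once."""
    correct = 0
    for i in range(len(rows)):
        predict = fit(rows[:i] + rows[i + 1:], labels[:i] + labels[i + 1:])
        correct += predict(rows[i]) == labels[i]
    return 100.0 * correct / len(rows)
```

Here `labels` holds `True` for ALT and `False` for non-ALT samples. A more faithful version would replace `fit_stump` with a search over multi-question trees and would need an explicit policy for missing features.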
The decision trees constructed with this approach were applied to predict the class (‘ALT’ or ‘non-ALT’) of the 29 samples with unknown TMM status. A total of 12 samples were predicted as ALT and 17 as non-ALT (S1 Table). Note that for any (incomplete) combination of known features, the prediction is calculated from the corresponding decision tree.
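The overall two-stage scheme described in this section (direct calls from the C-circle assay or a TERT promoter mutation, followed by a decision tree for the remaining samples) might be sketched as below. The function name and the `tree_predict` callable are hypothetical, and checking the C-circle result before the TERT promoter mutation is an assumption based on the order in which the two rules are stated.

```python
def classify_tmm(c_circle_positive=None, tert_promoter_mutation=None,
                 tree_predict=None, features=None):
    """Return 'ALT', 'non-ALT', or None if no call can be made.

    None for the first two arguments means "not available".
    tree_predict is a hypothetical callable that takes a dict of binary
    features and returns True for ALT.
    """
    if c_circle_positive:
        return "ALT"
    if tert_promoter_mutation:
        return "non-ALT"  # bona fide telomerase-positive
    if tree_predict is None:
        return None
    return "ALT" if tree_predict(features or {}) else "non-ALT"
```

A sample with a negative or missing C-circle result and no TERT promoter mutation is thus passed to whichever tree matches its available features.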
Finally, the method was implemented as a web-based program termed PAINT for Predicting ALT IN Tumors. It is available at http://www.cancertelsys.org/paint/index.html to predict if a sample is ALT-positive or -negative. The user can select a combination of available information for a given pedGBM sample and the predicted ALT state is returned. To assess the prediction quality, PAINT also displays the performance P, the p-value as well as the sample size used to construct the corresponding tree (see S2 Table). In cases of non-significant performances (adjusted p-value > 0.05), PAINT replaces the feature set with the subset that has a significant p-value (adjusted p-value ≤ 0.05) and the best performance.
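The fallback to a significant feature subset can be illustrated with a small lookup. This is a sketch of the selection logic only and does not reflect the actual PAINT code: the table layout, the feature names, and the tie-breaking rule (prefer the larger subset at equal performance) are assumptions.

```python
def select_feature_set(selected, trees, alpha=0.05):
    """Pick the feature set whose tree is used for prediction.

    trees maps frozenset(feature names) to a dict with the keys
    'performance' (P, in percent) and 'adj_p' (adjusted p-value).
    Returns the selected set if its tree is significant, otherwise the
    significant proper subset with the best performance, otherwise None.
    """
    selected = frozenset(selected)
    entry = trees.get(selected)
    if entry is not None and entry["adj_p"] <= alpha:
        return selected
    significant = [
        (features, stats) for features, stats in trees.items()
        if features < selected and stats["adj_p"] <= alpha
    ]
    if not significant:
        return None
    best_features, _ = max(
        significant,
        key=lambda item: (item[1]["performance"], len(item[0])),
    )
    return best_features
```

For instance, if the tree for three selected features has an adjusted p-value above 0.05 but a two-feature subset is significant, the prediction would be based on the two-feature tree.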
Funding
The work was funded by the German Federal Ministry of Education and Research (BMBF) within e:Med projects CancerTelSys (grant number 01ZX1302) and SYS-GLIO (grant number 031A425A) and the International Cancer Genome Consortium (ICGC, grant number 01KU1201A), and by an IFB/CSCC grant (01EO1502). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Supplementary Methods
Viral transduction
Histone H3.3-overexpressing SF188 cells and H3.3 variant-containing constructs pLVX-Puro_H3.3-wt_HA, pLVX-Puro_H3.3-K27M_HA, pLVX-Puro_H3.3-G34R_HA have been described in ref. [2]. Lentiviral particles were produced in HEK293T cells, and the supernatant containing the lentiviral particles was used for transduction of SJ-G2 and HeLa LT cells. Transduced cells were selected by adding puromycin to the medium at a final concentration of 3 μg/ml.
RNA-Seq
Total RNA was purified and genomic DNA was removed as described above. Ribosomal RNA was removed with the Ribozero gold kit (Illumina) following the manufacturer’s instructions. RNA libraries were prepared using the NEBNext Ultra directional RNA library preparation kit according to the manufacturer’s protocol for use with ribosome-depleted RNA. Library quality was assessed on a Bioanalyzer using an Agilent High Sensitivity Chip. Two independent RNA libraries were prepared from each cell line and sequenced on a HiSeq 2000 (single-read, 50 bp). The reads from RNA-seq were mapped to the UCSC hg19 assembly allowing for no mismatches using TopHat. Expression levels of TERT were determined in reads per kilobase per million mapped reads (RPKM).
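For reference, the RPKM normalization follows directly from its definition. The sketch below uses hypothetical argument names and invented numbers in the usage note; it does not reproduce the pipeline used for the data in this study.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)
```

As a worked example, 40 reads on a 4,000 bp transcript in a library of 20 million mapped reads correspond to 0.5 RPKM.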
Analysis of differentially expressed genes
After prediction of the presence of ALT using the decision tree-based classifier, all samples from ref. [1] for which RNA-Seq data were available were used to derive differentially expressed genes between ALT and non-ALT samples. Differential expression was calculated with DESeq2 [5] based on the raw read counts of the samples. For calculating the significance, DESeq2 employs the Wald test and adjusts for multiple testing using the method by Benjamini and Hochberg [3].
DNA methylation profiling
For genome-wide assessment of DNA methylation, pedGBM cell lines and tumor samples were arrayed using the Illumina human methylation 450 bead chip according to the manufacturer’s instructions at the DKFZ. For addressing subtelomeric methylation of cell lines, the IDs of all CpG sites that were located within 1 Mb of the chromosome ends were determined using R and the length of each chromosome as specified in the UCSC hg19 assembly. The median methylation value of the selected CpG positions was determined for each cell line.
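The selection of subtelomeric CpG sites was carried out in R; the same logic is sketched here in Python for illustration only. The function name, the probe tuple layout, and the toy chromosome length in the usage note are hypothetical, and real chromosome lengths would come from the UCSC hg19 assembly.

```python
from statistics import median


def subtelomeric_median(probes, chrom_lengths, window=1_000_000):
    """Median methylation of CpG sites within `window` bp of a chromosome end.

    probes: iterable of (chromosome, position, methylation value) tuples.
    chrom_lengths: dict mapping chromosome name to its length in bp.
    Returns None if no probe falls into a subtelomeric window.
    """
    values = [
        beta for chrom, position, beta in probes
        if position <= window or position > chrom_lengths[chrom] - window
    ]
    return median(values) if values else None
```

With a toy chromosome of 5 Mb, probes at 0.5 Mb, 4.6 Mb and 4.999 Mb are included, while a probe at 2.5 Mb is excluded.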
Acknowledgments
We thank Caroline Bauer, Stefan Wörz, Karl Rohr, the DKFZ Genomics and Proteomics Core Facility, and the DKFZ Flow Cytometry Core Facility for help and support.