Abstract
MicroRNAs are potent post-transcriptional regulators involved in all hallmarks of cancer. MiR-196a is transcribed from two loci and has been implicated in a wide range of developmental and pathogenic processes, with targets including Hox, Fox, Cdk inhibitors and annexins. Genetic variants and altered expression of miR-196a are associated with risk and progression of multiple cancers, including breast cancer; however, little is known about the regulation of the genes encoding this miRNA, or the impact of variants therein. Here we demonstrate that MIR196A displays complex and dynamic expression patterns, in part controlled by long-range transcriptional regulation between promoter and enhancer elements bound by ERα. Expression of this miRNA is significantly increased in models of therapeutic resistance in hormone receptor positive disease. The expression of MIR196A also proves to be a robust prognostic factor for patients with advanced and post-menopausal ER+ disease. This work sheds light on the normal and abnormal regulation of MIR196A and provides a novel stratification method for therapeutically resistant breast cancer.
Introduction
MicroRNAs are short non-coding RNAs that post-transcriptionally regulate gene expression (1). MicroRNAs have been implicated in many diseases, ranging from rare inherited syndromes arising from germline mutations in miRNA genes through to cancers arising from an accumulation of germline and somatic mutations and epigenetic deregulation (2). Research into the biology and pathology of these molecules has led to the identification of clinically useful genetic and epigenetic biomarkers and novel therapeutic agents, often based on antagomiR technology, that have shown promise in the control of disease symptoms and progression (3).
MicroRNA-196A (miR-196a, MIR196A) is transcribed from two genomic locations, the HOXC (Chr12 in humans, gene MIR196A2) and HOXB (Chr17 in humans, gene MIR196A1) loci, downstream of HOXC10 and upstream of HOXB9 respectively. It has been strongly implicated in a range of cancers, primarily as an oncogene. For example, MIR196A is overexpressed in breast tumours (4), and a single nucleotide polymorphism (SNP, rs116149130) within the MIR196A2 gene is associated with a decreased risk of breast cancer (5). MIR196A has been shown to target the 3’ UTR of Annexin-1 (ANXA1), an important mediator of apoptosis in various pathways (6), in response to the pro-angiogenic vascular endothelial growth factor (VEGF), leading to alterations in angiogenesis. A separate study demonstrated that MIR196A could increase growth, migration and invasion of a non-small cell lung cancer cell line through direct targeting of HOXA5 (7). Two studies have recently shown that MIR196A can directly influence the cell cycle by targeting p27/Kip1, an inhibitor of cell cycle progression, to dramatically increase growth and pro-oncogenic features of cancer cell lines (8, 9). Despite the clear importance of miR-196a in cancer, its transcriptional regulation remains poorly understood.
Transcriptional regulation is a complex, multi-faceted biological process that is significantly altered in cancer. MicroRNA genes are regulated transcriptionally in a similar manner to protein coding and long non-coding RNA genes. Promoters mostly lie upstream (within 10 kb of the mature miRNA) and contain a CpG island; in the active state, when the miRNAs are transcribed by RNA Pol II, they are enriched for H3K4me3 and lack H3K27me3, similar to protein coding genes (10, 11). Taken together, these data indicate that potential promoters for miRNAs can be identified using the same approaches as for protein coding genes. Several instances of miRNA regulation by enhancers have been described, but this area is very much in its infancy (12, 13).
In this study, we aimed to characterise the expression landscape of MIR196A including factors regulating its expression and explore potential roles of regulatory elements and factors in breast cancer prognostication.
Results
MIR196A expression correlates with HOXC genes in breast cancer
Several HOXC protein coding and non-coding genes have shown associations with breast cancer progression. We first identified expression patterns of HOXC genes in breast cancers (Supp Figure 1). These data indicate that MIR196A expression correlates strongly with that of HOXC genes, particularly HOXC10.
Next we investigated whether these associations are also observed in normal cells of the human breast. Here the association between MIR196A expression and HOXC genes is more limited, and is observed most strongly with HOXC11 and HOXC10, the genes surrounding the MIR196A gene (Supp Figure 2A). MIR196A appears to be expressed mostly within the basal stem-cell (BSC) derived cells, with much lower expression in the more differentiated cell types (Supp Figure 2B).
MIR196A expression is regulated by oestrogen
We and others have previously demonstrated regulation of HOXC genes by oestrogen in breast cancer (14-18). Given that MIR196A expression strongly correlates with expression of HOXC protein coding genes in breast cancer (Supp Figure 1), we sought to determine if oestrogen also regulates MIR196A2. Chromatin immunoprecipitation sequencing (ChIP-Seq) for RNA polymerase II demonstrates that polymerase binding in the region surrounding the HOXC10 gene and MIR196A gene is dependent on oestrogen in MCF7 cells and is repressed by treatment with either tamoxifen or fulvestrant (Figure 1A). Global run-on sequencing (GRO-Seq) measures nascent RNA, allowing changes in transcription to be assessed with high sensitivity. Analysis of MCF7 GRO-Seq data clearly indicates a dramatic increase in RNA production in the genomic region surrounding MIR196A2, peaking at 40 min following addition of oestradiol (E2) (Figure 1B). This increase in RNA production from the HOXC locus was validated with qRT-PCR and RNA-Seq from MCF7 cells following addition of E2 (Figures 1C and D). These data clearly indicate an increase in precursor miRNA from MIR196A2 but not MIR196A1 in response to E2. Taken together, these data suggest that MIR196A2 is transcriptionally regulated by oestrogen.
Transcriptional regulation of MIR196A
To identify the structural elements associated with the transcriptional regulation of MIR196A, histone methylation patterns in the MCF7 breast cancer cell line were assessed. This analysis uncovered putative promoter elements upstream of MIR196A including a shared promoter with HOXC10 (Figure 1A).
Given that MIR196A expression is regulated by oestrogen, we hypothesised that its transcription may be controlled by the oestrogen receptor (ER). Using publicly available datasets, we established that oestrogen-mediated upregulation of MIR196A expression is accompanied by binding of ERα and its pioneer factor FOXA1 to two putative promoter regions, putative promoters 1 and 3 (PP1 and PP3), upstream of the MIR196A2 transcription start site (Figure 1B).
Upon cloning of these putative promoter sites into luciferase reporter vectors, PP1 and also PP2 modestly increased luciferase gene transcription (Figure 1E), with PP1 (the HOXC10 promoter) being the most active promoter in MCF7 cells.
Given that ERα often binds to distal enhancer elements to exert its function, we examined the hypothesis that MIR196A2 is controlled by long-range transcriptional regulation, mediated by ERα-tethered gene looping. Using ChIA-PET (Chromatin Interaction Analysis by Paired-End Tag sequencing) data on genome-wide chromatin interactions that immunoprecipitate with either ERα or RNA Polymerase II (which correlate with active promoters and enhancers), we identified two major sites of interaction with the MIR196A2 promoters (Figure 2A). One of these is a previously identified HOTAIR enhancer (HOTAIR distal enhancer, HDE (15)) and the other a novel interacting partner (MIR196A2-Enhancer, mE). The restriction digestion used for chromosome conformation capture (3C) cuts the MIR196A2 region of the HOXC genomic locus into two fragments. 3C-qPCR demonstrates that both enhancer elements physically interact with each of the two MIR196A2/HOXC10 promoter regions (Figure 2B). Cloning of these enhancer fragments downstream of the putative promoter luciferase reporters clearly demonstrates significant augmentation of transcription for both PP1 and PP2, with HDE appearing to be the most active in MCF7 cells (Figure 2C).
Interestingly, a previous study (5) identified a SNP and an upstream CpG island associated with a decrease in breast cancer risk. This SNP lies within the MIR196A2 gene and the CpG island (CpG_Hoffman) is immediately upstream, falling into the 3’ end of the PP3. Analysis of DNA methylation reveals that this CpG island is mostly methylated in non-malignant MCF10A and cancerous MCF7 cells, whilst unmethylated in human mammary epithelial cells (HMEC) (Figure 1A).
MIR196A is differentially expressed in breast cancer
Given that MIR196A is regulated by ERα, we investigated its expression patterns in relation to commonly utilised molecular markers of breast tumours (Figure 3A). This analysis identified four distinct clusters of MIR196A expression (Clusters 1-4). Interestingly, clusters 1 and 3 show a strong correlation with expression of hormone receptors (HR) (AR, ERα, PGR, HER2) and HR cofactors (Figure 3B). In contrast, clusters 2 and 4 have a significant negative correlation with expression of ERα, PGR, FOXA1 and GATA3, whilst associating with EGFR and HER2. This expression is further defined by the PAM50 intrinsic subtypes, where MIR196A is strongly expressed in the HER2 subtype, whilst in the luminal A and B subtypes expression is very dynamic (Figure 3C).
MIR196A is a biomarker of breast cancer progression
To further explore the expression of MIR196A in breast cancer, we utilised expression data from the METABRIC cohort of breast tumours. Expression analysis of this miRNA indicates that it is significantly over-expressed in breast tumours compared to normal adjacent tissue and that over-expression is associated with an increase in tumour stage (Figure 4A and 4B). Interestingly, high expression of MIR196A is associated with poor survival in oestrogen receptor positive (ER+) breast cancer, whilst high expression associates with a better outcome in triple-negative breast cancer (TNBC) (Figure 4C and D).
MIR196A is a prognostic biomarker in advanced ER+ disease
MIR196A expression was used to stratify overall survival of patients with ER+ tumours responding to both hormone therapy (HT) and chemotherapy (CT) (Figure 5A). Women with low MIR196A expression exhibited a high rate of survival (>95% at 10 years), whilst most women within the high expression group died within 17 years (61% survival at 10 years).
Given that MIR196A is regulated in part by oestrogen, and given the disparity in prognostication between ER+ disease and TNBC, we investigated the effects of menopause on the stratification of survival for women with ER+ disease. The effects of menopause on the human breast are largely unknown; however, serum levels of oestrogen and progesterone fall dramatically after menopause. In pre-menopausal women, high expression of MIR196A is associated with a good outcome in ER+ disease (Table 1). Multivariate analysis demonstrates that MIR196A is one of the few significant biomarkers for ER+ tumours arising before menopause. In post-menopausal women, all tested biomarkers were significant in ER+ disease, including MIR196A; however, high expression is now associated with a poor outcome.
Therapeutic resistance leads to increases in MIR196A expression
TNBC is resistant to hormone-based therapies and HR+ disease often becomes resistant to anti-oestrogen treatment. Using established models of HR+ disease resistance, we found that MIR196A expression is significantly increased in tamoxifen resistant MCF7 cells (TAMR), whilst it is almost depleted in fulvestrant resistant cells (FASR) (Figure 5B). These expression patterns match changes in DNA methylation at the HOXC10/MIR196A2 promoters in these same cells (Figure 5C). For HR+ resistant tumours the only remaining therapeutic options are radiotherapy and chemotherapy. In RNA-Seq data for cell line models of resistance to paclitaxel and Adriamycin, two common chemotherapeutics, MIR196A expression again increases in resistant cell lines compared to the treatment sensitive cell line (Figure 5D).
Using ERα ChIP-Seq data from patients with HR+ disease, binding sites for ERα were identified in the genomic region of MIR196A. This tumour cohort contains women who respond to HR therapy, those who do not, and metastases from resistant tumours. An increase in ERα occupancy is seen at both enhancer and promoter regions of MIR196A in non-responders and metastases (Figure 5E). The increased genome-wide ERα binding in the more resistant tumours was shown by the authors to associate with changes in expression patterns crucial for the tumour to survive therapy and become resistant.
Discussion
The expression of MIR196A in breast cancer is both dynamic and complex. In this paper, we have elucidated important elements, factors and mechanisms controlling the transcriptional regulation of MIR196A and shown that changes in this regulation are associated with breast cancer progression and therapeutic resistance.
Previous genetic association studies have shown that SNPs within MIR196A2 confer a reduced risk of breast cancer (5, 19, 20). Hoffman and colleagues (5) postulated that the polymorphism located within the MIR196A2 gene reduces microRNA maturation, thereby reducing expression of the gene. They also identified that an upstream CpG island is associated with reduced risk when hypermethylated. Here we show that this upstream CpG island lies within the transcriptionally active region of HOXC10 and MIR196A2 as observed through GRO-Seq. Interestingly, this CpG island is completely methylated in models of oestrogen deprivation and fulvestrant treatment, but not in tamoxifen resistant cells. DNA methylation is most commonly associated with repressed transcription (21), so hypermethylation of this otherwise transcriptionally active region may severely impair expression. Given that various transcription factors strongly influence transcription in endocrine resistant breast cancer, these data suggest that binding of ERα accompanied by cofactors may be needed to maintain low methylation levels and active transcription in breast cancer (22-26).
We have previously demonstrated that long-range regulation of HOXC genes occurs in breast cancer and is influenced by ERα and its associated cofactors (15). HOX gene expression is tightly controlled in a spatiotemporal manner to ensure proper axial formation along the anterior-posterior axis (27). Within the cell types of the human breast, HOX gene expression appears dynamic and the association between MIR196A and HOXC genes is not significant. The strong correlation in expression of all HOXC genes with MIR196A in breast tumours is in stark contrast to expression in normal tissues. Several instances have been described regarding the influence of multiple distal enhancers on gene expression, such as the well characterised locus-control-region (LCR) of the Beta-globin genes or the c-Myc enhancers active across multiple cancer types (28-31). Given the extensive interactions between this locus and its adjacent gene desert, we hypothesise that a concerted effort of multiple enhancers is responsible for the overexpression of these genes in cancer, possibly driven by extensive binding and activity of ERα. To explore this hypothesis, a high resolution chromatin interaction analysis of this region in breast cancer cells would be required, such as 5C (32) or NG Capture-C (33), coupled with ERα ChIP-Seq and ChIA-PET (34).
Whilst this manuscript was in preparation new data has come to light which corroborates our conclusions. Jiang et al (35) demonstrate that the mature MIR196A transcript positively responds to oestrogen stimulation in MCF7 cells, and this is mediated by upstream ERα binding. This binding peak falls within PP3. Whilst we show that PP3 is not able to increase luciferase expression in a luciferase reporter assay, the binding of ERα may be important for the activity of the HOXC10 and MIR196A2 promoters.
Using hierarchical clustering of breast tumour RNA-Seq data, we observed two distinct expression patterns associated with MIR196A expression. Data presented here suggest that the two loci encoding this miRNA contribute greatly to the complexity of its expression. It is currently unclear how dual-encoded miRNAs, of which many exist, are regulated (36).
Interestingly, DNA methylation at several sites within the HOXC locus negatively correlates with the expression of this miRNA, supporting the notion of DNA methylation as a repressive epigenetic modification in this context (21).
High expression of MIR196A is a biomarker of poor prognosis in ER+ tumours, especially in those patients resistant to therapy. Expression of MIR196A increases in response to tamoxifen and chemotherapeutic agents in oestrogen responsive MCF7 cells. This increase in expression is associated with loss of DNA methylation within the promoter regions of the miRNA. In poor responders with ER+ tumours, HOXC enhancer elements appear to bind the ER more readily. These data raise the possibility that the pathway to therapeutic resistance in ER+ tumours involves the de-repression and over-activation of promoter and enhancer elements. This is commonly seen throughout cancer (37-39), with suggestions that enhancer disruption can revert cells to a non-terminally-differentiated state, a common hallmark of tumourigenesis. Because HOX genes are essential in embryonic development, they would be a valuable asset for any tumour cell seeking to sustain a stem-cell like state (40, 41).
Breast cancer incidence and the relative frequency of subtypes change after menopause (42, 43). In women younger than 45, luminal breast tumours account for 33-44% of cases (44, 45). This increases to 70-72% in women older than 65. In contrast, basal-like tumours are more common in younger women, suggesting a switch or evolution in the factors driving cancer following menopause, most likely related to the decline in oestrogen production. It is therefore interesting to note that higher expression of MIR196A associates with a good outcome in pre-menopausal women with ER+ tumours, and with a poor outcome for ER+ tumours following menopause. Given the strong involvement of HOX genes in development, we hypothesise that there is a change in the regulation and expression of these genes through and following menopause, which in turn impacts their contribution to the development of certain breast cancer subtypes.
MIR196A is a dynamically expressed miRNA in both normal mammary cells and breast tumours. This miRNA is a possible biomarker for the progression of breast tumours towards therapeutic resistance. Future studies should aim to uncover the purpose of increased MIR196A expression and whether it is required for development of resistance alone or in combination with other HOXC genes.
Material and Methods
Cell Culture
MCF7 cells for the development of endocrine resistant sub-lines were obtained from AstraZeneca. MCF7, Tamoxifen-resistant (TAMR), Fulvestrant-resistant (FASR), and oestrogen-deprived (MCF7x) cells were cultured as described (46-48). All cell lines were cultured for less than 6 months after authentication by short-tandem repeat (STR) profiling (Cell Bank, Australia).
Cloning and reporter assays
All PCR products for luciferase reporter assays were ligated into Invitrogen’s pCR-Blunt plasmid using T4 DNA Ligase, at 4 °C overnight. MIR196A enhancers and promoters were digested from pCR-Blunt and cloned into the luciferase reporter plasmid pGL3-Basic. Enhancers were cloned into the BamHI/SalI site whilst promoters were cloned into the multiple cloning site immediately upstream of the luciferase gene. See Supplementary Table 1 for primers.
MCF7 cells were transfected in antibiotic free media with 500 ng of modified pGL3 reporter constructs, 20 ng of pRL-TK (Renilla transfection control) and 0.5 μL of Lipofectamine 3000 (Life Technologies, L3000-008). At 48 hours post-transfection, luciferase readings were measured using a DTX-880 luminometer and the Dual-Glo Stop and Glo luciferase reporter kit (Promega, E2920), following the manufacturer’s recommended protocol.
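As an illustration of how dual-reporter readings of this kind are typically processed, the sketch below normalises each firefly reading to its paired Renilla reading and expresses a construct relative to a control vector. The exact calculation used in this study is not stated, so the function names, the choice of empty pGL3-Basic as the reference, and all readings are hypothetical.

```python
from statistics import mean

def relative_luciferase(firefly, renilla):
    """Normalise each firefly reading to the Renilla reading from the same well."""
    if len(firefly) != len(renilla):
        raise ValueError("firefly and Renilla readings must be paired")
    return [f / r for f, r in zip(firefly, renilla)]

def fold_over_control(construct_ratios, control_ratios):
    """Mean normalised signal of a construct divided by that of the control vector."""
    return mean(construct_ratios) / mean(control_ratios)

# Hypothetical readings (arbitrary luminescence units), three replicate wells each.
empty_vector = relative_luciferase([1000.0, 1100.0, 900.0], [500.0, 550.0, 450.0])
pp1_construct = relative_luciferase([8000.0, 8800.0, 7200.0], [500.0, 550.0, 450.0])

fold = fold_over_control(pp1_construct, empty_vector)
print(f"PP1 construct vs empty vector: {fold:.1f}-fold")
```

Normalising within each well before averaging keeps well-to-well differences in transfection efficiency from inflating the variance of the final fold change.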
RNA extraction and Gene Expression
Cell lysates were prepared using Life Technologies TRIzol® reagent and RNA was chloroform extracted and isopropanol precipitated. RNA was DNaseI treated with the DNA-free kit from Ambion (Life Technologies, AM1906). RNA for miRNA analysis was reverse transcribed using the miScript RT II kit from Qiagen (218161), following the manufacturer’s instructions. Assays for all miRNAs were performed with Qiagen’s miScript SYBR Green PCR Kit (218073). Primers specific to each mature or precursor miRNA were used together with a universal primer; see Supplementary Table 2 for assay IDs. Expression data for miRNAs was normalised to the snoRNA RNU6b. All qRT-PCRs were performed using the protocols advised by the manufacturers on a Corbett Rotor-Gene 6000.
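Normalisation to a reference RNA such as RNU6b is often carried out with the comparative Ct (2^-ΔΔCt) calculation. The text does not state which calculation was used here, so the sketch below is an assumption: it presumes roughly 100% amplification efficiency for both assays, and all Ct values and function names are hypothetical.

```python
def delta_ct(ct_target, ct_reference):
    """Ct of the miRNA of interest minus Ct of the reference RNA (e.g. RNU6b)."""
    return ct_target - ct_reference

def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of treated vs control by the comparative Ct (2^-ddCt) method.

    Assumes both assays amplify with ~100% efficiency (a doubling per cycle).
    """
    ddct = (delta_ct(ct_target_treated, ct_ref_treated)
            - delta_ct(ct_target_control, ct_ref_control))
    return 2.0 ** (-ddct)

# Hypothetical Ct values: the target crosses threshold 2 cycles earlier after
# treatment, while the reference RNA is unchanged.
fold_change = relative_expression(24.0, 18.0, 26.0, 18.0)
print(f"Fold change after treatment: {fold_change:.1f}")
```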
RNA-Seq on MCF7 cells following oestradiol treatment was performed as described previously by K. Nephew (see author list) (49). RNA-Seq from Adriamycin (ADM) and paclitaxel (PTX) resistant MCF7 derived cells was sourced from GSE68815 (50). Expression of HOX genes in human breast cells was sourced from Gascard et al (51).
Genomic Data Analysis
Accession codes for publicly available data were: MCF7 ChIP-Seq (GSE14664, (52)), GRO-Seq (GSE27463, (53)), ChIA-PET (GSE39495, (34, 54)) and breast tumour ERα ChIP-Seq (GSE32222, (55)). MCF7 histone ChIP-Seq and breast cell 450K array data was sourced from ENCODE via http://genome.ucsc.edu/ENCODE/downloads.html. ChIP-Seq data was mapped with Bowtie (56), peaks were called by MACS (57) and results were viewed in the Integrative Genomics Viewer (IGV) (58) available through the Broad Institute servers. DNA methylation 450K array data for MCF7 and endocrine resistant sublines was previously published; see Stone et al (59). DNA methylation of breast tumours was sourced from The Cancer Genome Atlas (TCGA) (60) and correlated to the gene expression of MIR196A from the TCGA cohort.
Breast Tumour Expression Analysis
METABRIC expression and clinical information were sourced from EGAS00000083 (61, 62). Clustering of Illumina Array and miR-Seq data was performed using the Multiple Experiment Viewer (MeV, (63)). Data was mean-centred and hierarchically clustered via Manhattan average-linkage based clustering of both rows and columns. Genes were correlated within clusters using the CORREL function of Microsoft Excel.
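The clustering and correlation steps described above can be sketched in a few lines. This is a minimal illustration rather than the MeV or Excel implementation: it mean-centres each gene, clusters genes by average linkage on Manhattan distance, and computes a Pearson coefficient (the statistic that Excel's CORREL returns). The expression values are hypothetical and only rows are clustered.

```python
from itertools import combinations
from math import sqrt

def mean_centre(row):
    """Subtract the row mean from every value."""
    m = sum(row) / len(row)
    return [x - m for x in row]

def manhattan(a, b):
    """Sum of absolute differences between two profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length profiles."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb)

def cluster_distance(rows, a, b):
    """Average linkage: mean pairwise Manhattan distance between two clusters."""
    total = sum(manhattan(rows[i], rows[j]) for i in a for j in b)
    return total / (len(a) * len(b))

def average_linkage(rows):
    """Agglomerative clustering; returns the merges as (cluster_a, cluster_b, distance)."""
    clusters = [(name,) for name in rows]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(rows, *pair))
        merges.append((a, b, cluster_distance(rows, a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

# Hypothetical expression values for three genes across four tumours.
raw = {
    "MIR196A": [1.0, 2.0, 3.0, 4.0],
    "HOXC10": [2.0, 4.0, 6.0, 8.0],
    "HOXC4": [4.0, 3.0, 2.0, 1.0],
}
centred = {gene: mean_centre(values) for gene, values in raw.items()}
merges = average_linkage(centred)
r_hoxc10 = pearson(raw["MIR196A"], raw["HOXC10"])
r_hoxc4 = pearson(raw["MIR196A"], raw["HOXC4"])
for a, b, d in merges:
    print(a, b, d)
```

Clustering columns as well as rows, as described in the text, would amount to running the same procedure on the transposed matrix.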
Survival Analysis
Univariate and multivariate Cox proportional hazard regression analyses were performed using MedCalc for Windows, version 12.7 (MedCalc Software, Ostend, Belgium). Kaplan-Meier survival analysis and generation of survival curves was done using GraphPad Prism. Optimal cutoffs for low and high expression groups were determined using receiver operating characteristic (ROC) curves.
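To make the stratification procedure concrete, the sketch below picks an expression cutoff from a ROC curve and then computes a Kaplan-Meier estimate for the resulting high-expression group. The criterion used to select the optimal ROC point is not stated in the text; Youden's J (sensitivity + specificity - 1) is assumed here as one common choice. The ROC step also treats death as a simple binary outcome and ignores censoring time, which is a simplification. All data and names are hypothetical.

```python
def youden_cutoff(values, events):
    """Threshold maximising Youden's J; samples with value >= threshold are 'high'.

    events: 1 = death observed, 0 = no death observed.
    """
    positives = sum(events)
    negatives = len(events) - positives
    best_cut, best_j = None, -1.0
    for cut in sorted(set(values)):
        tp = sum(1 for v, e in zip(values, events) if v >= cut and e == 1)
        fp = sum(1 for v, e in zip(values, events) if v >= cut and e == 0)
        j = tp / positives - fp / negatives  # sensitivity - (1 - specificity)
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j

def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(time, survival)] at each time with a death."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical cohort: expression, follow-up in years, death indicator.
expression = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
years = [12, 15, 9, 3, 6, 8]
died = [0, 0, 1, 1, 0, 1]

cutoff, j_stat = youden_cutoff(expression, died)
high = [(t, e) for x, t, e in zip(expression, years, died) if x >= cutoff]
high_curve = kaplan_meier([t for t, _ in high], [e for _, e in high])
print(cutoff, round(j_stat, 3), high_curve)
```

Patients censored at a given time leave the risk set without lowering the survival estimate, which is why the censored patient at 6 years produces no step in the curve.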
3C and ChIA-PET
Chromosome conformation capture (3C) was adapted from Vakoc 2005 (29), Hagege 2007 (64) and Tan-Wong 2008 (65). Briefly, cells were grown to 60-80% confluence and fixed with 1% formaldehyde. Libraries were generated for each cell line using HindIII, with undigested and unligated control libraries representing native gDNA without chromosome conformation. GAPDH primers (amplified fragment contains no cut sites for these enzymes) were used to determine the digestion and ligation efficiency of each library by comparing 3C-qPCR values to those from primers that amplify a fragment containing a HindIII cut site. For each 3C-qPCR, primers were designed 100-250 bp upstream or downstream of each HindIII cut site, with the primer across the putative enhancer used as bait in each 3C-qPCR.
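The digestion-efficiency comparison described above can be illustrated with a formulation commonly used in 3C-qPCR protocols, in which the Ct difference between a primer pair spanning a cut site and a control pair lacking cut sites is compared between digested and undigested samples. The exact formula used in this study is not given, so the calculation below is an assumption, and all Ct values and names are hypothetical.

```python
def digestion_efficiency(ct_site_digested, ct_control_digested,
                         ct_site_undigested, ct_control_undigested):
    """Percentage of restriction sites cut, estimated from qPCR Ct values.

    'site' primers span a HindIII cut site, so their product is lost on digestion;
    'control' primers (e.g. GAPDH) amplify a fragment with no cut site.
    Assumes ~100% amplification efficiency for both primer pairs.
    """
    shift = ((ct_site_digested - ct_control_digested)
             - (ct_site_undigested - ct_control_undigested))
    return 100.0 - 100.0 / (2.0 ** shift)

# Hypothetical Ct values: digestion delays the site-spanning amplicon by 3 cycles
# relative to the control amplicon.
efficiency = digestion_efficiency(28.0, 24.5, 25.0, 24.5)
print(f"Estimated digestion efficiency: {efficiency:.1f}%")
```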
Supplementary Figure 1: MIR196A expression correlates with HOXC genes in breast cancer. Hierarchically clustered normalised expression for HOXC genes across breast tumours. Pearson correlation coefficients to MIR196A expression are indicated on the right-hand side.
Supplementary Figure 2: MIR196A is highly expressed in breast stem cells. A) Heatmap (Manhattan hierarchical clustering) of expression data for HOX genes in human breast cells. Data is reads per million (RPM) for miRNAs and reads per kilobase per million (RPKM) for mRNAs. Data is log2 normalised and mean centred by row. Right, Pearson correlation coefficients for the expression of each gene against MIR196A. B) RPM for MIR196A across the human breast cells. Error bars represent biological replicates when available. Data for A and B sourced from GSE16368 (51). BSC = breast stem cell, BF = breast fibroblast, BME = breast myoepithelium, BLEC = breast luminal epithelial cell and HMEC = human mammary epithelial cell.
Acknowledgements
This study makes use of data generated by the Molecular Taxonomy of Breast Cancer International Consortium. Funding for the project was provided by Cancer Research UK and the British Columbia Cancer Agency Branch (61, 62).
Footnotes
* shared first