Abstract
The mammalian circadian clock is a critical regulator of metabolism and cell division. Although multiple lines of evidence indicate that systemic disruption of the circadian clock can promote cancer, whether the clock is disrupted in primary human tumors is unknown. Here we used transcriptome data from mice to define a signature of the mammalian circadian clock based on the co-expression of 12 genes that form the core clock or are directly controlled by the clock. Our approach can be applied to samples that are not labeled with time of day and were not acquired over the entire circadian (24-h) cycle. We validated the clock signature in circadian transcriptome data from humans, then developed a metric we call the delta clock correlation distance (ΔCCD) to describe the extent to which the signature is perturbed in samples from one condition relative to another. We calculated the ΔCCD comparing human tumor and non-tumor samples from The Cancer Genome Atlas and eight independent datasets, discovering widespread dysregulation of clock gene co-expression in tumor samples. Subsequent analysis of gene expression in clock gene knockouts in mice suggested that clock dysregulation in human cancer is not caused solely by loss of activity of clock genes. Our findings suggest that dysregulation of the circadian clock is a common mechanism by which human cancers achieve unrestrained growth and division. In addition, our approach opens the door to using publicly available transcriptome data to quantify clock disruption in a multitude of human phenotypes. Our method is available as a web application at https://jakejh.shinyapps.io/deltaccd.
Background
Daily rhythms in mammalian physiology are guided by a system of oscillators called the circadian clock [1]. The core clock consists of feedback loops between several genes and proteins, and based on work in mice, is active in nearly every tissue in the body [2,3]. The clock aligns itself to environmental cues, particularly cycles of light-dark and food intake [4–6]. In turn, the clock regulates various aspects of metabolism [7–9] and is tightly linked to the cell cycle [10–15].
Consistent with the tight connections between the circadian clock, metabolism, and the cell cycle, multiple studies have found that systemic disruption of the circadian system can promote cancer. In humans, long-term rotating shift work and night shift work, which perturb sleep-wake and circadian rhythms, have been associated with breast, colon, and lung cancer [16–19]. In mice, environmental disruption of the circadian system (e.g., through severe and chronic jet lag) increases the risk of breast cancer and hepatocellular carcinoma [20,21]. Furthermore, both environmental and genetic disruption of the circadian system promote tumor growth and decrease survival in a mouse model of human lung adenocarcinoma [22]. Finally, pharmacological stimulation of circadian clock function slows tumor growth in a mouse model of melanoma [23]. While these studies support the link from the clock to cancer, complementary work has established a link in the other direction, namely that multiple components of a tumor, including the RAS and MYC oncogenes, can induce dysregulation of the circadian clock [24–26]. Despite this progress, however, whether the clock is disrupted in human tumors has remained unclear.
When the mammalian circadian clock is functioning normally, clock genes and clock-controlled genes show characteristic rhythms in expression throughout the body and in vitro [2,3,27]. Measurements of these rhythms through time-course experiments have revealed that the clock is altered or perturbed in some human breast cancer cell lines [28,29]. Existing computational methods for this type of analysis require that samples be labeled with time of day (or time since start of experiment) and acquired throughout the 24-h cycle [30–32]. Unfortunately, existing data from resected human tumors meet neither of these criteria.
A common approach to analyze cancer transcriptome data is to look for associations between levels of gene expression and other biological and clinical variables. For example, in human breast cancer, the expression levels of several clock genes have been associated with metastasis-free survival (with the direction of association depending on the gene) [33]. However, because a functional circadian clock is marked not by the actual levels of gene expression, but by periodic variation in gene expression, this type of analysis cannot necessarily be used to determine whether the clock is functional.
To account for this periodic variation, one approach to detect a functional clock might be to examine the correlations in expression between clock genes. Indeed, a previous study found different levels of co-expression between a few clock genes in different subtypes and grades of human breast cancer [33]. Although this finding was an important first step, its generalizability has been limited because the correlations in expression were not examined (1) for all clock genes, (2) in other human cancer types, or (3) in healthy tissues where the circadian clock is known to be functional. Thus, a definitive answer to whether the circadian clock is functional across the spectrum of human cancers is still lacking.
The goal of this study was to determine whether the circadian clock is functional in human cancer. Using transcriptome data from mice, we defined a robust signature of the mammalian circadian clock based on the co-expression of clock genes. We validated the signature in circadian transcriptome data from humans, then examined the extent to which the signature was perturbed in tumor compared to non-tumor samples from The Cancer Genome Atlas (TCGA) and from multiple independent datasets. Our findings suggest that the circadian clock is dysfunctional in a wide range of human cancers.
Results
Consistent correlations in expression of clock genes in mice
The progression of the mammalian circadian clock is marked by characteristic rhythms in gene expression throughout the body [3]. We hypothesized that the relative phasing of the rhythms of different genes would give rise to a characteristic pattern of correlations between genes. Such a pattern could be used to infer the activity of the clock, even in datasets in which samples are not labeled with time of day (Fig. 1A). To investigate this hypothesis, we first collected eight publicly available datasets of genome-wide, circadian gene expression from various mouse organs under both constant darkness and alternating light-dark cycles [3,12,34–38] (Table S1). We focused on 12 genes that are part of the core circadian clock or are directly controlled by the clock and that exhibit strong, consistently phased rhythms in expression across organs [32]. For the rest of the manuscript, we refer to these 12 genes as “clock genes.”
For each dataset, we calculated the Spearman correlation between expression values (over all samples) of each pair of genes. The pattern of correlations was highly similar across datasets and revealed two groups of genes, where the genes within a group tended to be positively correlated with each other and negatively correlated with genes in the other group (Fig. 1B). Genes in the first group (Arntl, Npas2, and Clock), which are known to form the positive arm of the clock [39], peaked in expression shortly before zeitgeber time 0 (ZT0, which corresponds to time of lights on or sunrise; Fig. S1). Genes in the second group (Cry2, Nr1d1, Nr1d2, Per1, Per2, Per3, Dbp, and Tef), which are known to form the negative arms of the clock, peaked in expression near ZT10. Cry1, which appeared to be part of the first group in some datasets and the second group in others, tended to peak in expression around ZT18. These results indicate that the progression of the circadian clock in mice produces a consistent pattern of correlations in expression between clock genes. The pattern does not depend on the absolute phasing of clock gene expression relative to time of day. Consequently, the pattern is not affected by phase shifts, such as those caused by temporally restricted feeding [40] (Fig. S2).
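The link between relative phase and correlation can be illustrated with an idealized worked case. This is a sketch under stated assumptions, not part of the analysis: each gene's expression is taken to be a pure cosine with a 24-h period, sampling times are uniformly distributed over the cycle, and the Pearson (rather than Spearman) correlation is used. The symbols m_i, A_i, and φ_i (mean level, amplitude, and peak time of gene i) are introduced here only for illustration.

```latex
\[
x_i(t) = m_i + A_i \cos\!\left(\tfrac{2\pi}{24}\,(t - \phi_i)\right),
\qquad A_i > 0, \qquad t \sim \mathrm{Uniform}[0, 24)
\]
\[
\operatorname{Var}(x_i) = \tfrac{1}{2} A_i^2,
\qquad
\operatorname{Cov}(x_i, x_j) = \tfrac{1}{2} A_i A_j
  \cos\!\left(\tfrac{2\pi}{24}\,(\phi_i - \phi_j)\right)
\]
\[
\operatorname{corr}(x_i, x_j) =
  \frac{\operatorname{Cov}(x_i, x_j)}{\sqrt{\operatorname{Var}(x_i)\operatorname{Var}(x_j)}}
  = \cos\!\left(\tfrac{2\pi}{24}\,(\phi_i - \phi_j)\right)
\]
```

Under these assumptions, two genes that peak at the same time have a correlation of +1, genes that peak 12 h apart have a correlation of -1, and genes that peak 6 h apart are uncorrelated. The expression depends only on the phase difference, so adding a common offset to every φ_i leaves the correlations unchanged, in line with the insensitivity to phase shifts described above. The derivation does not cover incomplete coverage of the 24-h cycle.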
Most computational methods for quantifying circadian rhythmicity and inferring the status of the clock require that samples be acquired over the entire 24-h cycle. Because our approach does not attempt to infer oscillations, we wondered how robust it would be to partial coverage of the 24-h cycle. We therefore examined clock gene expression in three of the previous datasets, in samples acquired during the first 8 h of the day (or subjective day) or the first 8 h of the night (or subjective night). In each dataset, the correlation pattern was preserved in both daytime and nighttime samples (Fig. S3). These results suggest that our approach can detect an active circadian clock in groups of samples without using time of day information, even if the samples’ coverage of the 24-h cycle is incomplete.
Validation of the correlation pattern in humans
We next applied our approach to nine publicly available datasets of circadian transcriptome data from human tissues: one from skin [41], two from brain [42,43], three from blood [44–46], and three from cells cultured in vitro [36,47,48] (Table S1). The dataset from human skin consisted of samples taken at only three time-points for each of 19 subjects (9:30am, 2:30pm, and 7:30pm). The datasets from human brain were based on postmortem tissue from multiple anatomical areas, and zeitgeber time for each sample was based on the respective donor’s date and time of death and geographic location. The datasets from human blood consisted of multiple samples taken throughout the 24-h cycle for each subject. The datasets from cells cultured in vitro were based on time-courses following synchronization by dexamethasone, serum, or alternating temperature cycles.
The patterns of clock gene co-expression in human tissues and cells were similar to the patterns in mice (Fig. 2 and Fig. S4), which is consistent with our previous findings of similar relative phasing of clock gene expression in mice and humans [49]. The pattern was less distinct in human blood (Fig. S5), likely because several clock genes show weak or no rhythmicity in expression in blood cells [49]. The strong pattern in human skin was due to clock gene co-expression both between the three time-points and between individuals at a given time-point (Fig. S6). Compared to co-expression patterns in mouse organs and human skin, those in human brain were somewhat weaker, which is consistent with the weaker circadian rhythmicity for clock genes in those two brain-specific datasets [49].
To confirm our findings in a broader range of human organs, we analyzed five transcriptome datasets from healthy human lung, liver, skin, and adipose tissue [50–54] (Table S1). Samples from these datasets were not collected for the purpose of studying circadian rhythms and therefore are not labeled with time of day. Nonetheless, we observed the expected pattern of clock gene co-expression in each dataset (Fig. 2C and Fig. S7). We conclude that our approach can detect the signature of a functional circadian clock in a variety of human tissues in vitro and in vivo, even in datasets not designed to study circadian rhythms.
Aberrant patterns of clock gene co-expression in human cancer
To examine patterns of clock gene co-expression in human cancer, we applied our approach to RNA-seq data collected by The Cancer Genome Atlas (TCGA) and reprocessed using the Rsubread package [55]. TCGA samples are from surgical resections performed prior to neoadjuvant treatment. The times of day of surgery are not available and the surgeries were likely only performed during part of the day. We analyzed data from the 12 cancer types that included at least 30 samples from adjacent non-tumor tissue (Table S1). For each cancer type, we calculated the Spearman correlations in expression between clock genes across all tumor samples and all non-tumor samples.
In non-tumor samples from most cancer types, we observed a similar pattern of clock gene co-expression as in the mouse and human circadian datasets (Fig. 3A-B and Fig. S8A). In contrast, in tumor samples from each cancer type, the pattern was weaker or absent. We observed the same trend when we restricted our analysis to only matched samples, i.e., samples from patients from whom both non-tumor and tumor samples were collected (Fig. S9). To confirm these findings, we analyzed eight additional datasets of gene expression in human cancer, four from liver and four from lung, each of which included matched tumor and adjacent non-tumor samples [56–63] (Table S1). As in the TCGA data, clock gene co-expression in tumor samples was perturbed relative to non-tumor samples (Fig. 3C and Fig. S8B).
To quantify the dysregulation of clock gene co-expression in human cancer, we first combined the eight mouse datasets in a fixed-effects meta-analysis (Fig. 4A and Methods) in order to construct a single “reference” correlation pattern (Fig. S10 and Table S2). For each of the 12 TCGA cancer types and each of the eight additional datasets of human cancer, we then calculated the Euclidean distances between the reference pattern and the non-tumor pattern and between the reference pattern and the tumor pattern. We refer to each of these distances as a clock correlation distance (CCD), and we refer to the difference between the tumor and non-tumor CCDs as the delta clock correlation distance (ΔCCD). A positive ΔCCD indicates that the correlation pattern of the non-tumor samples is more similar to the reference than is the correlation pattern of the tumor samples.
Consistent with the visualizations of clock gene co-expression, every TCGA cancer type and additional cancer dataset had a positive ΔCCD (Fig. 4B), as did the individual tumor grades in the TCGA data (Fig. S11). Among the three TCGA cancer types with the lowest ΔCCD, prostate adenocarcinoma had a relatively high non-tumor CCD (suggesting dysregulated clock gene co-expression even in non-tumor samples), whereas renal clear cell carcinoma and thyroid carcinoma each had a relatively low tumor CCD (Fig. S12). To evaluate the statistical significance of the ΔCCD, we permuted the sample labels (non-tumor or tumor) in each dataset and re-calculated the ΔCCD 1000 times. Based on this permutation testing, the observed ΔCCD for 11 of the 18 datasets had a one-sided P < 0.001 (Fig. 4B). Overall, these results suggest that the circadian clock is dysregulated in a wide range of human cancers.
Tumors are a complex mixture of cancer cells and various non-cancerous cell types. The proportion of cancer cells in a tumor sample is called the tumor purity and is an important factor to consider in genomic analyses of bulk tumors [64]. We therefore examined the relationship between ΔCCD and tumor purity in the TCGA data. With the exception of thyroid carcinoma and prostate adenocarcinoma, ΔCCD and median tumor purity in TCGA cancer types were positively correlated (Fig. S13; Spearman correlation = 0.67, P = 0.059 by exact test). These findings suggest that at least in some cancer types, dysregulation of the circadian clock is stronger in cancer cells than in non-cancerous cells.
Distinct patterns of clock gene expression in human cancer and mouse clock knockouts
Finally, we investigated whether the clock dysregulation in human cancer resembled that caused by genetic mutations to core clock genes. We assembled seven datasets of circadian gene expression that included samples from wild-type mice and from mice in which at least one core clock gene was knocked out, either in the entire animal or in a specific cell type [8,40,65–69] (Table S1). For each dataset, we calculated the correlations in expression between pairs of clock genes in wild-type and mutant samples and calculated the ΔCCD (Fig. S14 and Fig. S15).
The two datasets with the highest ΔCCD (>50% higher than any ΔCCD we observed in human cancer) were those in which the mutant mice had not one, but two components of the clock knocked out (Cry1 and Cry2 in GSE13093; Nr1d1 and Nr1d2 in GSE34018). The ΔCCDs for the other five mutants were similar to or somewhat lower than the ΔCCDs we observed in human cancer. Given the smaller sample sizes compared to the human cancer datasets, the ΔCCDs for those five mutants were not significantly greater than zero (one-sided P > 0.05 by permutation test).
To further compare clock dysregulation in human cancer and clock knockouts, we calculated differential expression of the clock genes between non-tumor and tumor samples and between wild-type and mutant samples (Fig. 5A). Differential expression in the clock knockouts was largely consistent with current understanding of the core clock. For example, knockout of Arntl (Bmal1, the primary transcriptional activator) tended to cause reduced expression (irrespective of time of day) of Nr1d1, Nr1d2, Per1, Per2, Per3, Dbp, and Tef, and increased or unchanged expression of the other clock genes. In the double knockout of Cry1 and Cry2 (two negative regulators of the clock), this pattern was reversed. Interestingly, neither of these patterns of differential expression was apparent in human cancer.
In the clock gene knockouts, rhythmic expression of the clock genes was reduced or lost (Fig. S16). Although it was not possible to quantify the rhythmicity of expression in the human cancer datasets directly, we reasoned that a proxy for rhythmicity could be the magnitude of variation in expression. Therefore, for each TCGA cancer type and each additional human cancer dataset, we calculated the median absolute deviation (MAD) in expression of the clock genes in non-tumor and tumor samples. We then compared the log2 ratios of MAD between tumor and non-tumor samples to the log2 ratios of MAD between mutant and wild-type samples from the clock gene knockout data (Fig. 5B). As expected, samples from clock gene knockouts showed widespread reductions in MAD compared to samples from wild-type mice. In contrast, human tumor samples tended to show similar or even somewhat higher MAD compared to non-tumor samples. Taken together with the differential expression analysis, these results suggest that the dysregulation of the clock in human cancer is not due solely to loss of activity of one or more core clock genes.
Discussion
Increasing evidence has suggested that systemic disruption of the circadian clock can promote tumor development and that components of a tumor can disrupt the circadian clock. Until now, however, the question of whether the clock is functional in primary human cancers has lacked a clear answer. Here we developed a simple method to probe clock function based on the co-expression of a small set of clock genes. By applying the method to cancer transcriptome data, we uncovered widespread dysregulation of the clock in human cancer tissue.
Our approach for detecting a functional circadian clock is based on three principles. First, we rely on prior knowledge of clock genes and clock-controlled genes. Second, we account for the fact that the clock is defined not by a static condition, but by a dynamic cycle. Our approach thus exploits the co-expression of clock genes that arises from (1) different genes having different circadian phases and (2) different samples being taken from different points in the cycle. Finally, our method does not attempt to infer an oscillatory pattern, but instead uses only the statistical correlations between pairs of genes. The assumption is that perturbations to the core clock will alter the relative phases of rhythms in clock gene expression and the correlations in expression between clock genes. Although the correlation matrix only partially captures the complex relationship between genes (to fully capture the relationship would require modeling the joint probability density of gene expression), it is intuitive and straightforward to calculate. Altogether, these principles enable our method to detect the signature of the circadian clock in groups of samples whose times of day of acquisition are unknown and whose coverage of the 24-h cycle is incomplete.
Despite these advantages, our method does have limitations. First, it is insensitive to the alignment of the circadian clock to the time of day, and so cannot detect phase differences between conditions. This limitation, however, allowed us to readily construct a reference pattern using data from mice and compare it to data from humans, despite the circadian phase difference between the two species [49]. Second, the ΔCCD is invariant to the relative levels of gene expression between conditions, which is why we complemented our analysis of clock gene co-expression with analysis of differential expression and differential variability. Third, transcription is only one facet of the core clock mechanism, and perturbations to post-translational modification or degradation of clock proteins (if unaccompanied by changes in clock gene expression) would not be detected by our approach. Finally, because the ΔCCD relies on co-expression across samples, it does not immediately lend itself to quantifying clock disruption in single samples. In the future, it may be possible to complement the ΔCCD and assess clock function in some datasets by directly comparing matched samples from the same patient.
In healthy tissues in vivo, the circadian clocks of individual cells are entrained and oscillating together, which is what allows bulk measurements to contain robust circadian signals. Consequently, the loss of a circadian signature in human tumor samples could result from dysfunction in entrainment, in the oscillator, or in both. Dysfunction in entrainment would imply that the clocks in at least some of the cancer cells are out of sync with each other and therefore free running, i.e., ignoring zeitgeber signals. Dysfunction in the oscillator would imply that the clocks in at least some of the cancer cells are no longer “ticking” (albeit in such a way that variation in clock gene expression, at least across patients, is not diminished). Given the current data, which are based on averaged clock gene expression from many cells, these scenarios cannot be distinguished. Furthermore, the moderate correlation between ΔCCD and tumor purity across cancer types leads us to speculate that the circadian clocks in stromal and/or infiltrating immune cells may be operating normally. In the future, these issues may be resolved through a combination of mathematical modeling [70,71] and single-cell measurements. A separate matter not addressed here is how the cancer influences circadian rhythms in the rest of the body [72], which may be relevant for optimizing the daily timing of anticancer treatments [73].
Based on the current data alone, which are observational, it is not possible to determine whether dysregulation of the circadian clock is a driver of the cancer or merely a passenger. However, given the clock’s established role in regulating metabolism and a recent finding that stimulation of the clock inhibits tumor growth in melanoma [23], our findings suggest that clock dysregulation may be a cancer driver in multiple solid tissues. On the other hand, a functional circadian clock seems to be required for growth of acute myeloid leukemia cells [74], so further work is necessary to clarify this issue.
Conclusions
Our findings suggest that dysregulation of the circadian clock is a common mechanism by which human cancers achieve unrestrained growth and division. Thus, restoring clock function could be a viable therapeutic strategy in a wide range of cancer types. In addition, given the practical challenges of studying circadian rhythms at the cellular level in humans, our method offers the possibility to quantify clock function in a wide range of human phenotypes using publicly available transcriptome data.
Methods
Selecting the datasets
We selected the datasets of circadian gene expression in mice (both for defining the reference pattern and for comparing clock gene knockouts to wild-type) to represent multiple organs, light-dark regimens, and microarray platforms. For circadian gene expression in humans, we included three datasets from blood, two from brain, and one from skin. The samples from blood and skin were obtained from living volunteers, whereas the samples from brain were obtained from postmortem donors who had died rapidly. For GSE45642 (human brain), we only included samples from control subjects (i.e., we excluded subjects with major depressive disorder).
Zeitgeber times for samples from GSE56931 (human blood) were calculated as described previously [75]. For the TCGA data, we analyzed all cancer types that had at least 30 non-tumor samples (all of which also had at least 291 tumor samples). When analyzing clock gene expression in human cancer, unless otherwise noted, we included all tumor and non-tumor samples, not just those from patients from whom both non-tumor and tumor samples were collected. For details of the datasets, all of which are publicly available, see Table S1.
Processing the gene expression data
For TCGA samples, we obtained the processed RNA-seq data (in units of transcripts per million, TPM, on a gene-level basis) and the corresponding metadata (cancer type, patient ID, etc.) from GSE62944 [55]. For E-MTAB-3428, we downloaded the RNA-seq read files from the European Nucleotide Archive, used Salmon to quantify transcript-level abundances in units of TPM [76], then used the mapping between Ensembl Transcript IDs and Entrez Gene IDs to calculate gene-level abundances.
For the remaining datasets, raw (in the case of Affymetrix) or processed microarray data were obtained from NCBI GEO and processed using MetaPredict, which maps probes to Entrez Gene IDs and performs intra-study normalization and log-transformation [77]. MetaPredict processes raw Affymetrix data using RMA and customCDFs [78,79]. As in our previous study, we used ComBat to reduce batch effects between anatomical areas in human brain and between subjects in human blood [49,80].
Analyzing the gene expression data
We focused our analysis on the expression of 12 genes that are considered part of the core clock or are directly controlled by the clock and that show strong, consistently phased rhythms in a wide range of mouse organs [3,32]. We calculated times of peak expression and strengths of circadian rhythmicity of expression in wild-type and mutant mice using ZeitZeiger [32], with three knots for the periodic smoothing splines [81].
We quantified the relationship between expression values of pairs of genes using the Spearman correlation (Spearman’s rho), which is rank-based and therefore invariant to monotonic transformations such as the logarithm and less sensitive to outliers than the Pearson correlation. Using the biweight midcorrelation, which is also robust to outliers, gave very similar results. All heatmaps of gene-gene correlations in this paper have the same mapping of correlation value to color, so they are directly visually comparable.
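As a minimal sketch of this calculation (not the code used in the study), the following Python snippet computes Spearman's rho for every pair of genes as the Pearson correlation of the ranks, with tied values sharing their average rank. The dictionary layout, the function names, and the toy expression values are assumptions made for illustration.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def spearman_pairs(expression):
    """Map each unordered gene pair to its Spearman correlation across samples.

    `expression` is a dict {gene: [one value per sample]} (assumed layout).
    """
    return {
        (g1, g2): spearman(expression[g1], expression[g2])
        for g1, g2 in combinations(sorted(expression), 2)
    }


# Toy values for three genes in five samples; these are not real data.
toy = {
    "Arntl": [1.0, 2.0, 3.0, 4.0, 5.0],
    "Nr1d1": [9.0, 7.0, 6.5, 3.0, 1.0],
    "Per2": [10.0, 8.0, 5.0, 4.0, 2.0],
}
rho = spearman_pairs(toy)
# The same correlations computed on log2-transformed values.
log_rho = spearman_pairs({g: [math.log2(v) for v in vals] for g, vals in toy.items()})
```

Because the logarithm is monotonic and leaves the ranks unchanged, `rho` and `log_rho` agree for every gene pair in this toy example.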
We calculated the reference Spearman correlation for each pair of genes (Table S2) using a fixed-effects meta-analysis of the eight mouse datasets shown in Fig. 1 [82]. First, we applied the Fisher z-transformation (arctanh) to the correlations from each dataset. Then we calculated a weighted average of the transformed correlations, where the weight for dataset i was n_i − 3 (corresponding to the inverse variance of the transformed correlation) and n_i is the number of samples in dataset i. Finally, we applied the inverse transformation (tanh) to the weighted average.
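The three steps can be sketched in a few lines of Python for a single gene pair. This is an illustrative sketch rather than the study's implementation; the function name, the input layout, and the example numbers are assumptions.

```python
import math


def meta_correlation(correlations, sample_sizes):
    """Combine per-dataset correlations for one gene pair.

    Steps: Fisher z-transform (atanh), average with weights n_i - 3,
    then back-transform (tanh). Correlations of exactly +1 or -1 are
    not handled, because atanh is undefined there.
    """
    if len(correlations) != len(sample_sizes):
        raise ValueError("need one sample size per correlation")
    weights = [n - 3 for n in sample_sizes]
    if any(w <= 0 for w in weights):
        raise ValueError("each dataset needs more than 3 samples")
    z = [math.atanh(r) for r in correlations]
    z_bar = sum(w * zi for w, zi in zip(weights, z)) / sum(weights)
    return math.tanh(z_bar)


# Hypothetical example: two datasets with 27 and 15 samples.
combined = meta_correlation([0.8, 0.6], [27, 15])
```

In this hypothetical example the weights are 24 and 12, so the combined value lies between the two inputs and closer to the correlation from the larger dataset.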
To quantify the similarity in clock gene expression between two groups of samples (e.g., between the mouse reference and human tumor samples), we calculated the Euclidean distance between the respective Spearman correlation vectors, each of which contains all values in the strictly lower (or strictly upper) triangular part of the corresponding correlation matrix. Given a reference and a dataset with samples from two conditions, we calculated the Euclidean distances between the reference and each condition, which we call the clock correlation distances (CCDs). We then calculated the difference between these two distances, which indicates how much more similar to the reference one condition is than the other and which we refer to as the delta clock correlation distance (ΔCCD). Although here we used Euclidean distance, other distance metrics could be used as well.
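A minimal sketch of the CCD and ΔCCD, assuming the correlation matrices have already been computed and are stored as nested lists with genes in the same order. The function names and the 3-gene toy matrices are illustrative assumptions; with 12 genes, each correlation vector would have 12 × 11 / 2 = 66 entries.

```python
import math


def lower_triangle(matrix):
    """Flatten the strictly lower triangle of a square matrix, row by row."""
    return [matrix[i][j] for i in range(len(matrix)) for j in range(i)]


def ccd(reference, observed):
    """Clock correlation distance: Euclidean distance between the two
    vectors of pairwise correlations."""
    return math.dist(lower_triangle(reference), lower_triangle(observed))


def delta_ccd(reference, control, test):
    """ΔCCD = CCD(reference, test) - CCD(reference, control).

    A positive value means the control condition (e.g., non-tumor) is
    closer to the reference than the test condition (e.g., tumor).
    """
    return ccd(reference, test) - ccd(reference, control)


# Toy 3-gene correlation matrices; these are not real data.
reference = [[1.0, 0.8, -0.7], [0.8, 1.0, -0.6], [-0.7, -0.6, 1.0]]
non_tumor = [[1.0, 0.7, -0.6], [0.7, 1.0, -0.5], [-0.6, -0.5, 1.0]]
tumor = [[1.0, 0.1, 0.0], [0.1, 1.0, 0.2], [0.0, 0.2, 1.0]]

delta = delta_ccd(reference, non_tumor, tumor)
```

Here the toy tumor matrix is farther from the reference than the toy non-tumor matrix, so `delta` is positive.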
To evaluate the statistical significance of the ΔCCD for a given dataset, we conducted a permutation test as follows. First, we permuted the relationship between the sample labels (e.g., non-tumor or tumor) and the gene expression values and recalculated the ΔCCD 1000 times, always keeping the reference fixed. We then calculated the one-sided p-value as the fraction of permutations that gave a ΔCCD greater than or equal to the observed ΔCCD. Since we used the one-sided p-value, the alternative hypothesis was that non-tumor (or wild-type) is more similar to the reference than is tumor (or mutant).
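The permutation procedure can be sketched as follows. This is an illustration under assumptions, not the study's code: samples are represented as dictionaries of gene to expression value, the data are simulated (three hypothetical genes A, B, and C), and the helper names, the reference vector, and the random seeds are invented for the example.

```python
import math
import random
from itertools import combinations


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            out[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return out


def spearman(x, y):
    """Spearman's rho (assumes neither input is constant)."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


def corr_vector(samples, genes):
    """Spearman correlations for every gene pair across `samples`."""
    return [
        spearman([s[g1] for s in samples], [s[g2] for s in samples])
        for g1, g2 in combinations(genes, 2)
    ]


def delta_ccd(ref_vector, control, test, genes):
    """CCD of the test group minus CCD of the control group."""
    return (math.dist(ref_vector, corr_vector(test, genes))
            - math.dist(ref_vector, corr_vector(control, genes)))


def permutation_p(ref_vector, control, test, genes, n_perm=1000, seed=0):
    """Return (observed ΔCCD, one-sided p-value).

    The p-value is the fraction of label permutations whose ΔCCD is
    greater than or equal to the observed ΔCCD; the reference is fixed.
    """
    rng = random.Random(seed)
    observed = delta_ccd(ref_vector, control, test, genes)
    pooled = control + test
    hits = 0
    for _ in range(n_perm):
        shuffled = pooled[:]
        rng.shuffle(shuffled)  # reassign samples to the two labels
        perm_control = shuffled[:len(control)]
        perm_test = shuffled[len(control):]
        if delta_ccd(ref_vector, perm_control, perm_test, genes) >= observed:
            hits += 1
    return observed, hits / n_perm


# Simulated data: A and C in phase, B in antiphase (control); pure noise (test).
data_rng = random.Random(1)
genes = ["A", "B", "C"]
ref_vector = [-1.0, 1.0, -1.0]  # pairs (A,B), (A,C), (B,C); hypothetical reference


def rhythmic_sample():
    t = data_rng.uniform(0, 2 * math.pi)  # unlabeled sampling time
    return {"A": math.cos(t) + data_rng.gauss(0, 0.1),
            "B": -math.cos(t) + data_rng.gauss(0, 0.1),
            "C": math.cos(t) + data_rng.gauss(0, 0.1)}


control = [rhythmic_sample() for _ in range(20)]
test = [{g: data_rng.gauss(0, 1) for g in genes} for _ in range(20)]
observed, p_value = permutation_p(ref_vector, control, test, genes, n_perm=200)
```

The sketch uses 200 permutations to keep the toy run short; the analysis described above used 1000. With a finite number of permutations, the smallest nonzero p-value this estimator can return is 1 / `n_perm`.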
To calculate the ΔCCD for individual tumor grades, we used the clinical metadata provided in GSE62944. We analyzed all combinations of TCGA cancer type and tumor grade that included at least 50 tumor samples. In each case, we calculated the ΔCCD using all non-tumor samples of the respective cancer type.
To compare ΔCCD and tumor purity, we used published consensus purity estimates for TCGA tumor samples [64]. The estimates are based on DNA methylation, somatic copy number variation, and the expression of immune genes and stromal genes (none of which are clock genes).
We quantified differential expression between tumor and non-tumor samples and between mutant and wild-type samples using limma and voom [83,84]. As these techniques are designed for transcriptome data, we calculated differential expression using all measured genes, then focused subsequent analysis on the clock genes. To ensure a fair comparison between human and mouse data, we ignored time of day information in the mouse samples. We quantified the variation in expression of clock genes in each dataset and condition using the median absolute deviation (MAD), which is less sensitive to outliers than the standard deviation.
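A minimal sketch of the MAD and of the log2 ratio of MAD between two conditions, for one gene. The function names and the toy values are assumptions for illustration. The MAD here is unscaled; any constant scale factor applied to both conditions would cancel in the ratio.

```python
import math
import statistics


def mad(values):
    """Median absolute deviation from the median (unscaled)."""
    med = statistics.median(values)
    return statistics.median(abs(v - med) for v in values)


def log2_mad_ratio(test_values, control_values):
    """log2 of MAD(test) / MAD(control) for one gene.

    Negative values mean less variation in the test condition.
    Undefined if either MAD is zero.
    """
    return math.log2(mad(test_values) / mad(control_values))


# Toy expression values for one gene; these are not real data.
wild_type = [2.0, 4.0, 6.0, 8.0, 10.0]   # MAD = 2
mutant = [5.0, 5.5, 6.0, 6.5, 7.0]       # MAD = 0.5
ratio = log2_mad_ratio(mutant, wild_type)  # log2(0.5 / 2) = -2

# Replacing one value with an extreme outlier leaves the MAD unchanged.
with_outlier = [2.0, 4.0, 6.0, 8.0, 1000.0]
```

In the toy example, the mutant values vary four-fold less than the wild-type values, giving a log2 ratio of -2, and `mad(with_outlier)` equals `mad(wild_type)`, which illustrates the robustness to outliers.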
Abbreviations
- ΔCCD: delta clock correlation distance
- MAD: median absolute deviation
- TCGA: The Cancer Genome Atlas
- ZT: zeitgeber time
- BRCA: breast invasive carcinoma
- COAD: colon adenocarcinoma
- HNSC: head and neck squamous cell carcinoma
- KIRC: kidney renal clear cell carcinoma
- KIRP: kidney renal papillary cell carcinoma
- LIHC: liver hepatocellular carcinoma
- LUAD: lung adenocarcinoma
- LUSC: lung squamous cell carcinoma
- PRAD: prostate adenocarcinoma
- STAD: stomach adenocarcinoma
- THCA: thyroid carcinoma
- UCEC: uterine corpus endometrial carcinoma
Declarations
Acknowledgments
We thank Dvir Aran and the VUMC Editor’s Club for helpful comments on the manuscript.
Funding
This work was supported by start-up funds from the Vanderbilt University School of Medicine (to JJH) and by NIH grants 1U2COD023196-01 and U01HG009086-01 (to GC).
Availability of data and materials
All data and code to reproduce this study are available at https://figshare.com/s/2eaf11e88642418f7e81. The original gene expression data and metadata for all datasets are available from NCBI GEO or ArrayExpress. A web application to calculate ΔCCD for one’s own gene expression data is available at https://jakejh.shinyapps.io/deltaccd.
Author contributions
JS performed the analysis and reviewed drafts of the paper. GC conceived and designed the analysis and reviewed drafts of the paper. JJH conceived and designed the analysis, performed the analysis, wrote the paper, and reviewed drafts of the paper. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Ethics approval and consent to participate
Not applicable.
Additional files
Additional file 1: Table S1
Additional file 2: Figures S1-S16
Additional file 3: Table S2