Abstract
Background Growing evidence suggests that chemicals in disparate structural classes activate specific subsets of PPARγ’s transcriptional programs to generate adipocytes with distinct phenotypes.
Objectives Our objectives were to 1) establish a novel classification method to predict PPARγ-interacting and modifying chemicals, and 2) create a taxonomy to group chemicals based on their effects on PPARγ’s transcriptome and adipocyte phenotype. We tested the hypothesis that environmental ligands highly ranked by the taxonomy, but that segregated from the therapeutic ligands, would induce white but not brite adipogenesis.
Methods 3T3-L1 cells were differentiated in the presence of 76 chemicals (negative controls, synthetic nuclear receptor ligands known to influence adipocyte biology, suspected environmental PPARγ ligands). Differentiation was assessed by measuring lipid accumulation. mRNA expression was determined by multiplexed RNA-Seq and validated by RT-qPCR. A novel classification model was developed using an amended random forest procedure tailored to the experimental design. A subset of environmental contaminants identified as strong PPARγ agonists was characterized for lipid handling, mitochondrial biogenesis and cellular respiration in mouse and human adipocyte models.
Results The 76 chemicals generated a spectrum of adipogenic differentiation. We used lipid accumulation and RNA sequencing data to develop a classification system that 1) identified PPARγ agonists, and 2) sorted agonists into likely white or brite adipogens. Expression of Cidec was the most efficacious indicator of strong PPARγ activation. Two known environmental PPARγ ligands, tetrabromobisphenol A and triphenyl phosphate, which sorted distinctly from therapeutic ligands, induced white but not brite adipogenesis. Moreover, two chemicals identified as highly ranked PPARγ agonists, tonalide and quinoxyfen, induced white adipogenesis without the concomitant health-promoting effects in 3T3-L1 cells and primary human preadipocytes.
Discussion A novel classification procedure accurately identified environmental chemicals as PPARγ-modifying chemicals distinct from known PPARγ-modifying therapeutics. The developed computational and experimental framework also has general applicability to the classification of as-yet uncharacterized chemicals.
Acknowledgements
Declaration of competing financial interests (CFI)
The authors declare they have no actual or potential competing financial interests.
Grant Information
Superfund Research Program [P42 ES007381]
BU-Joslin Pilot & Feasibility Program Award (2016)
Introduction
Since 1980, the prevalence of obesity has been increasing globally and has doubled in more than 70 countries. In 2015, it was estimated that a total of 108 million children and 604 million adults were obese worldwide (Collaborators et al. 2017). This poses a major public health threat since overweight and obesity increase the risk of metabolic syndrome, which, in turn, sets the stage for metabolic diseases, such as type 2 diabetes, cardiovascular disease, nonalcoholic fatty liver disease and stroke (Park et al. 2003). The Endocrine Society’s latest scientific statement on obesity pathogenesis states that obesity is a disorder of the energy homeostasis system, rather than just a passive accumulation of adipose, and that environmental factors, including chemicals, confer obesity risk (Schwartz et al. 2017). The rapid increases in obesity and metabolic diseases correlate with substantial increases in environmental chemical production and exposures over the last few decades, and experimental evidence in animal models demonstrates the ability of a broad spectrum of chemicals to induce adiposity and metabolic disruption (Heindel et al. 2017).
Adipocytes are crucial for maintaining metabolic homeostasis as they are repositories of free fatty acids and release hormones that can modulate body fat mass (Rosen and Spiegelman 2006). Adipogenesis is a highly regulated process that involves a network of transcription factors acting at different time points during differentiation (Farmer 2006). Peroxisome proliferator-activated receptor γ (PPARγ) is a ligand-activated nuclear receptor and an essential regulator of adipocyte formation and function (Tontonoz et al. 1994), as well as metabolic homeostasis, as all PPARγ haploinsufficient and KO models present with lack of adipocyte formation and metabolic disruption (Gumbilai et al. 2016; He et al. 2003; Jiang et al. 2014; O’Donnell et al. 2016; Zhang et al. 2004).
PPARγ activation regulates energy homeostasis by both stimulating storage of excess energy as lipids in white adipocytes and stimulating energy utilization by triggering mitochondrial biogenesis, fatty acid oxidation and thermogenesis in brite and brown adipocytes. The white adipogenic, brite/brown adipogenic and insulin sensitizing activities of PPARγ are regulated separately through post-translational modifications (Banks et al. 2015; Choi et al. 2010; Choi et al. 2011; Qiang et al. 2012) and differential co-regulator recruitment (Burgermeister et al. 2006; Feige et al. 2007; Ohno et al. 2012; Villanueva et al. 2013). Rapid expansion of white adipose depots and adipocyte hypertrophy that outpace vascularization generate hypoxic conditions that trigger the inflammation, fibrosis and lipotoxicity that contribute to the development of metabolic syndrome (Kusminski et al. 2016). Importantly, humans with minimal brite adipocyte populations are at higher risk for obesity and type 2 diabetes (Claussnitzer et al. 2015; Sidossis and Kajimura 2015; Timmons and Pedersen 2009).
Growing evidence supports the hypothesis that environmental PPARγ ligands induce phenotypically distinct adipocytes. Tributyltin (TBT) induces the formation of an adipocyte with reduced adiponectin expression and altered glucose homeostasis (Regnier et al. 2015). Furthermore, TBT fails to induce expression of genes associated with browning of adipocytes (e.g. Ppara, Pgc1a, Cidea, Elovl3, Ucp1) in differentiating 3T3-L1 adipocytes (Kim et al. 2018; Shoucri et al. 2018). As a result, TBT-induced adipocytes fail to upregulate mitochondrial biogenesis and have low levels of cellular respiration (Kim et al. 2018; Shoucri et al. 2018). The structurally similar environmental PPARγ ligand, triphenyl phosphate, also fails to induce brite adipogenesis, and this correlates with an inability to prevent PPARγ from being phosphorylated at S273 (Schlezinger 2018).
The EPA developed the Toxicity Forecaster (ToxCast™) program to use high-throughput screening assays to prioritize chemicals and inform regulatory decisions regarding thousands of environmental chemicals (Kavlock et al. 2012). Several ToxCast™ assays can measure the ability of chemicals to bind to or activate PPARγ, and these assays have been used to generate a toxicological priority index (ToxPi) that was expected to predict the adipogenic potential of chemicals in cell culture models (Auerbach et al. 2016). Yet, it has been shown that the results of ToxCast™ PPARγ assays do not always correlate well with activity measured in a laboratory setting and that the ToxPi designed for adipogenesis was prone to predicting false positives (Janesick et al. 2016). Furthermore, the ToxCast/ToxPi approach cannot distinguish between white and brite adipogens.
Here, we present phenotypic and transcriptomic data from adipocytes differentiated in the presence of 76 different chemicals. We combined the cost-effective generation of agnostic transcriptomic data by the novel highly multiplexed RNA-seq technology 3’Digital Gene Expression with a new classification method to predict PPARγ-interacting and modifying chemicals. Further, we investigated metabolism-related outcome pathways as effects of the chemical exposures. We created a data-driven taxonomy to specifically classify chemicals into distinct categories based on their various interactions with and effects on PPARγ. Based on the taxonomy-based predictions, we tested the phenotype (white vs. brite adipocyte functions) of environmental adipogens predicted to fail to induce brite adipogenesis in 3T3-L1 cells and primary human adipocytes.
Methods
Chemicals
DMSO was purchased from American Bioanalytical (Natick, MA). CAS numbers, sources and catalog numbers of experimental chemicals are provided in Table S1. Human insulin, dexamethasone, 3-isobutyl-1-methylxanthine (IBMX), and all other chemicals were from Sigma-Aldrich (St. Louis, MO) unless noted.
Cell Culture
NIH 3T3-L1 (ATCC: CL-173, RRID:CVCL_0123) pre-adipocytes were maintained in high-glucose DMEM with 10% calf serum, 100 U/ml penicillin, 100 μg/ml streptomycin, and 0.25 μg/ml amphotericin B. Cells were plated in maintenance medium for experiments and incubated for 4 days. “Naïve” cells were cultured in maintenance medium for the duration of an experiment. To induce adipogenesis, the medium was replaced with DMEM containing 10% fetal bovine serum (FBS, Sigma-Aldrich), 250 nM dexamethasone, 167 nM human insulin, 0.5 mM IBMX, 100 U/ml penicillin, and 100 μg/ml streptomycin. Experimental wells received induction medium and were treated with Vh (DMSO, 0.2% final concentration) or test chemicals at concentrations indicated in Table S1. On days 3 and 5 of differentiation, medium was replaced with adipocyte maintenance medium (DMEM, 10% FBS, 167 nM human insulin, 100 U/ml penicillin, 100 μg/ml streptomycin), and the cultures were re-dosed. On day 7 of differentiation, medium was replaced with adipocyte medium (DMEM, 10% FBS, 100 U/ml penicillin, 100 μg/ml streptomycin), and the cultures were re-dosed. On day 10, cytotoxicity was assessed by microscopic inspection, with cultures containing more than 10% rounded cells excluded from consideration. Healthy cells were harvested for analysis of gene expression, lipid accumulation, fatty acid uptake, mitochondrial biogenesis, mitochondrial membrane potential, and cellular respiration.
Primary human subcutaneous pre-adipocytes were obtained from the Boston Nutrition Obesity Research Center (Boston, MA). The pre-adipocytes were maintained in αMEM with 10% FBS, 100 U/ml penicillin, 100 μg/ml streptomycin, and 0.25 μg/ml amphotericin B. Pre-adipocytes were plated in maintenance medium for experiments and incubated for 3 days. “Naïve” cells were cultured in maintenance medium for the duration of an experiment. To induce adipogenesis, the medium was replaced with DMEM/F12, 25 mM NaHCO3, 100 U/ml penicillin, 100 μg/ml streptomycin, 33 μM d-Biotin, 17 μM pantothenate, 100 nM dexamethasone, 100 nM human insulin, 0.5 mM IBMX, 2 nM T3, and 10 μg/ml transferrin. Experimental wells received induction medium and were treated with vehicle (0.1% DMSO), tonalide, or quinoxyfen (4 μM). On day 3 of differentiation, medium was replaced with induction medium, and the cultures were re-dosed. On days 5, 7, 10, and 12 of differentiation, the medium was replaced with adipocyte medium (DMEM/F12, 25 mM NaHCO3, 100 U/ml penicillin, 100 μg/ml streptomycin, 33 μM d-Biotin, 17 μM pantothenate, 10 nM dexamethasone, and 10 nM insulin), and the cultures were re-dosed. Following 14 days of differentiation and dosing, cells were harvested for analysis of gene expression, lipid accumulation, fatty acid uptake, mitochondrial biogenesis, and cellular respiration.
Transcriptome Analysis
3T3-L1 cells were plated in 24 well plates at 50,000 cells per well in 0.5 ml maintenance medium at initiation of the experiment and then cultured as described above. Total RNA was extracted and genomic DNA was removed using the Direct-zol MagBead RNA Kit (Zymo Research, Orange, CA). A final concentration of 5 ng RNA/μl was used for each sample (n = 6 for naïve, n = 3-4 per Vh or chemical) across six 96 well plates. Sequencing and gene expression quantification were carried out by the Broad Institute (Cambridge, MA). RNA was sequenced using highly multiplexed 3’ Digital Gene Expression (3’ DGE; Xiong et al. 2017). Only uniquely aligned reads were quantified, i.e., reads that aligned to only one transcript. Furthermore, multiple reads with the same UMI that aligned to the same gene were quantified as a single count.
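The counting rule described above (unique alignments only, one count per UMI per gene) can be illustrated with a minimal sketch. The record layout, the function name `umi_counts`, and the example reads are hypothetical and are not taken from the Broad Institute pipeline.

```python
from collections import defaultdict


def umi_counts(reads):
    """Return per-gene counts of distinct UMIs among uniquely aligned reads.

    Each read is a dict with a UMI string and a list of (gene, transcript)
    alignments. This record layout is an assumption made for illustration.
    """
    umis_by_gene = defaultdict(set)
    for read in reads:
        hits = read["alignments"]
        if len(hits) != 1:
            continue  # skip unaligned reads and reads hitting several transcripts
        gene, _transcript = hits[0]
        umis_by_gene[gene].add(read["umi"])  # repeated UMIs collapse to one entry
    return {gene: len(umis) for gene, umis in umis_by_gene.items()}


# Hypothetical reads: one PCR duplicate and one multi-mapped read.
reads = [
    {"umi": "AACGT", "alignments": [("Cidec", "tx1")]},
    {"umi": "AACGT", "alignments": [("Cidec", "tx1")]},
    {"umi": "GGTCA", "alignments": [("Cidec", "tx1")]},
    {"umi": "TTAGC", "alignments": [("Rpl13", "tx2"), ("Rpl13a", "tx3")]},
    {"umi": "CCATG", "alignments": [("Rpl13", "tx2")]},
]
counts = umi_counts(reads)
```

In this toy input the duplicated UMI contributes a single Cidec count and the multi-mapped read is discarded.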
Gene Expression Data Preprocessing
All analyses of gene expression data were carried out in R v3.4.3 (Team 2013). The number of counted reads per sample varied widely with a range of 7.9E1 to 2.27E6 (Mean = 2.25E5, SD = 2.94E5). To determine the threshold of acceptable sample-level quantification, we performed an iterative clustering-based approach to determine sets of low expression outlier samples. Each iteration included four steps: removal of low count genes, normalization, plate-level batch correction, and hierarchical clustering. Low count genes were defined as genes with mean counts < 1 across all samples. Normalization was performed using Trimmed mean of M-values, the default method employed by limma v3.34.9 (Ritchie et al. 2015). Batch correction was performed by ComBat v3.26.0 (Leek et al. 2012). Hierarchical clustering was performed on the 3000 genes with the largest median absolute deviation (MAD) score, using Euclidean distance and 1-Pearson correlation as the distance metric for samples and genes, respectively, as well as Ward’s agglomerative method. Clusters of samples clearly representative of low expression quantification were removed. This process was repeated until no low expression outlier sample was present (four iterations). For the remaining samples, once again low count genes were removed and samples were normalized and batch corrected by the same procedure. The final data set includes 9,616 genes across 234 samples.
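Two of the steps above, low count gene removal and MAD-based gene ranking, can be sketched as follows. This is an illustration in Python, not the R code used for the analysis. The function names and toy counts are hypothetical, and the sketch ranks raw counts, whereas the procedure described above ranks normalized, batch-corrected expression.

```python
from statistics import mean, median


def filter_low_count(counts, min_mean=1.0):
    """Keep genes whose mean count across all samples is at least min_mean."""
    return {gene: vals for gene, vals in counts.items() if mean(vals) >= min_mean}


def mad(values):
    """Median absolute deviation (unscaled) of a list of numbers."""
    center = median(values)
    return median([abs(v - center) for v in values])


def top_mad_genes(expr, n_top):
    """Return the n_top gene names with the largest MAD across samples."""
    ranked = sorted(expr, key=lambda gene: mad(expr[gene]), reverse=True)
    return ranked[:n_top]


# Hypothetical count matrix: gene -> counts in four samples.
counts = {
    "geneA": [0, 1, 0, 0],        # mean 0.25, removed as a low count gene
    "geneB": [10, 12, 11, 50],
    "geneC": [5, 5, 6, 5],
    "geneD": [100, 20, 300, 40],
}
kept = filter_low_count(counts)
top2 = top_mad_genes(kept, n_top=2)
```

The selected genes would then feed the hierarchical clustering step, which is omitted here because it needs a numerical library.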
PPARγ Modifier Classification
A classification model was inferred from the training set consisting of 38 known PPARγ-modifying compounds and 22 known non-PPARγ-modifying compounds, including vehicle, to predict the label of the test set of 17 suspected PPARγ-modifying compounds. The model inference was based on an amended random forest procedure developed to better account for the presence of biological replicates in the data (manuscript in preparation). Specifically, for each classification tree, samples and genes were bagged, such that samples were sampled with replacement and the square root of the total number of genes were randomly selected. Within these “bags”, replicates of the same chemical exposure were then collapsed to their mean expression. The random forest classification vote, a number between 0 and 1, was then computed as the proportion of trees in the forest that assign the sample to the positive class. Prior to running the random forest procedure, genes were filtered based on within versus between exposure variance, using ANOVA. Genes with an F-statistic associated with an FDR-corrected p-value < 0.05 were used in the classification procedure. The predictive performance of the classification approach was estimated using 10-fold cross validation over the training set. For each fold, samples were stratified at the chemical exposure level, such that each fold included 6 distinct compounds and a different number of samples, and all replicates of the same compound were only included in either the training or the test folds. Thresholds for determining class membership based on voting were determined by running the training folds through the random forest and selecting the threshold producing the highest F1-score, i.e., the harmonic mean of precision and sensitivity. Performance was assessed in terms of area under the ROC curve (AUC), as well as precision, sensitivity, specificity, F1-score, and balanced accuracy, i.e., the mean of specificity and sensitivity.
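The bagging and replicate-merging step can be sketched as below. This is only one reading of the description above, since the full procedure is unpublished. The function names, data layout, and the use of the floor of the square root are assumptions, and fitting the classification tree on each merged bag is omitted.

```python
import math
import random
from collections import defaultdict


def draw_merged_bag(expr, chemical_of, genes, rng):
    """Draw one bag and collapse replicates to per-chemical mean profiles.

    expr: {sample_id: {gene: value}}; chemical_of: {sample_id: chemical}.
    Both layouts are hypothetical.
    """
    samples = list(expr)
    boot = [rng.choice(samples) for _ in samples]       # samples, with replacement
    n_genes = max(1, int(math.sqrt(len(genes))))        # floor(sqrt(p)) is an assumption
    bag_genes = rng.sample(genes, n_genes)              # genes, without replacement
    grouped = defaultdict(list)
    for sample in boot:
        grouped[chemical_of[sample]].append(sample)
    merged = {
        chem: {g: sum(expr[s][g] for s in members) / len(members) for g in bag_genes}
        for chem, members in grouped.items()
    }
    return merged, bag_genes


def forest_vote(tree_predictions):
    """Proportion of trees assigning a sample to the positive class (1)."""
    return sum(tree_predictions) / len(tree_predictions)


# Hypothetical data: two exposures with two replicates each, nine genes.
rng = random.Random(0)
genes = [f"g{i}" for i in range(9)]
chemical_of = {"rosig_1": "Rosig", "rosig_2": "Rosig", "vh_1": "Vh", "vh_2": "Vh"}
expr = {s: {g: rng.random() for g in genes} for s in chemical_of}
merged, bag_genes = draw_merged_bag(expr, chemical_of, genes, rng)
vote = forest_vote([1, 1, 0, 1])
```

Each merged bag holds one mean profile per chemical present in the bootstrap sample, so a tree trained on it sees chemicals rather than individual replicates.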
All random forests were generated using 2000 decision trees. The final classification model used to predict the unlabeled chemicals was built using the full training set of 60 labelled chemicals and 1,199 genes after filtering. The performance of this procedure was compared to three alternative random forest strategies. In the first, denoted as pre-merge, the mean gene expression across replicates is computed, and a classic random forest is applied to the classification of each merged chemical profile. In the second, denoted as classic, replicate samples are treated as independent perturbations and classified based on a classic random forest. Finally, in the third, denoted as pooled, the mean of votes across replicates from the previous strategy is used to estimate class membership per compound. To compare the performance of each strategy, the 10-fold CV procedure applied to the training set was repeated 10 times to generate a distribution of performance statistics. The importance of each gene in each random forest model was measured using the Gini importance measure (Breiman 2001).
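The vote threshold and the performance statistics used to compare strategies follow standard definitions, sketched below. The function names and toy votes are hypothetical. Treating a vote strictly greater than the threshold as positive, and searching only the observed vote values, are assumptions.

```python
def metrics(labels, votes, threshold):
    """Precision, sensitivity, specificity, F1 and balanced accuracy at a threshold."""
    tp = fp = tn = fn = 0
    for label, vote in zip(labels, votes):
        predicted = vote > threshold  # assumption: strict inequality
        if predicted and label:
            tp += 1
        elif predicted:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


def best_f1_threshold(labels, votes):
    """Return the observed vote value that maximizes F1 when used as threshold."""
    candidates = sorted(set(votes))
    return max(candidates, key=lambda t: metrics(labels, votes, t)["f1"])


# Hypothetical votes for three modifiers (1) and three non-modifiers (0).
labels = [1, 1, 1, 0, 0, 0]
votes = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
threshold = best_f1_threshold(labels, votes)
result = metrics(labels, votes, threshold)
```

On this toy input the selected threshold is 0.3, which captures all three positives at the cost of one false positive.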
PPARγ Activity Modifier Clustering
Known and suspected PPARγ modifiers were clustered based on their test statistics from univariate analysis comparing each chemical or naive exposure to vehicle using limma v3.34.9. In order to assess taxonomic differences between different exposure outcomes, a recursive clustering procedure, which we refer to as “K2 clustering”, was developed, whereby a set of chemicals is iteratively split into two subgroups. At each iteration of the procedure, the genes with the top 10% of the sum of squared test statistics across all samples within the current set are selected. Samples are then clustered using Euclidean distance and Ward’s agglomerative method, and are split into two clusters using the cutree R function. The procedure is then recursively applied to each of the two clusters, until the two-cluster split would result in a single chemical in the terminal subgroup. To obtain and measure the most stable clusters, each iteration was bootstrapped 200 times by resampling gene-level statistics with replacement. The most common clusters were used, and the proportion of total bootstrapping iterations that included these identical clustering assignments was reported. At each step, all clusters must include at least non-vehicle exposures.
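Two pieces of one K2 iteration can be sketched in plain Python: selecting the top 10% of genes by sum of squared test statistics, and scoring the stability of a split across bootstrap replicates. The function names and toy inputs are hypothetical. The Ward clustering and `cutree` step that produces each split is not reproduced here.

```python
import math
import random
from collections import Counter


def top_genes_by_sum_squares(stats, fraction=0.10):
    """Select the genes with the largest sum of squared test statistics.

    stats: {gene: [test statistic for each exposure in the current set]}.
    """
    scores = {gene: sum(t * t for t in values) for gene, values in stats.items()}
    n_keep = max(1, math.ceil(fraction * len(scores)))
    return sorted(scores, key=scores.get, reverse=True)[:n_keep]


def bootstrap_genes(genes, rng):
    """Resample gene names with replacement, as done for each bootstrap replicate."""
    return [rng.choice(genes) for _ in genes]


def most_stable_split(splits):
    """Return the most frequent two-way split and the proportion of replicates producing it."""
    canonical = [frozenset(frozenset(cluster) for cluster in split) for split in splits]
    best, n_hits = Counter(canonical).most_common(1)[0]
    return best, n_hits / len(canonical)


# Hypothetical test statistics for ten genes across two exposures.
stats = {f"g{i}": [i, -i] for i in range(10)}
selected = top_genes_by_sum_squares(stats)
resampled = bootstrap_genes(list(stats), random.Random(1))

# Hypothetical splits returned by four bootstrap replicates.
splits = [
    [{"A", "B"}, {"C", "D"}],
    [{"C", "D"}, {"A", "B"}],
    [{"A", "B"}, {"C", "D"}],
    [{"A", "C"}, {"B", "D"}],
]
best_split, stability = most_stable_split(splits)
```

Splits are compared as unordered sets, so the same partition listed in a different cluster order still counts as identical.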
In order to derive gene signatures of each split, differential analysis was performed between samples from compounds of either cluster at a split. In these models, biological replicate status was accounted for using the duplicate correlation procedure in the limma package. From these models, signatures of genes assigned to either of the two subgroups were generated based on two criteria. First, for a particular gene, the difference in mean expression between the two groups must have |log2(Fold-Change)| > 1 and an FDR Q-value < 0.1. Second, each gene is assigned to one of the two subgroups based on the mean of its test statistics from the comparison of each chemical to vehicle, i.e., a gene is assigned to the subgroup with the maximum absolute value of the mean of these test statistics. This yielded four gene sets per split, pertaining to both subgroup assignment and direction. Functional enrichment, comparing these gene sets to independently annotated gene sets, was carried out via Fisher’s Exact Test. These gene sets include those of the Gene Ontology Biological Processes gene set compendia downloaded from MSigDB (c5.bp.v6.2.symbols.gmt), as well as two gene sets derived from publicly available microarray expression data from an experiment using mouse embryonic fibroblasts to compare wild-type samples with mutant samples that do not undergo phosphorylation of PPARγ at Ser273, GEO accession number GSE22033 (Choi et al. 2010). These additional gene sets comprised genes measured to be significantly up- or down-regulated (FDR Q-Value < 0.05) in mutant samples, based on differential analysis of RMA normalized expression with limma.
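The enrichment test can be illustrated with a short sketch. It assumes the one-sided (over-representation) form of Fisher's Exact Test, which equals the upper tail of the hypergeometric distribution. The function name and the counts in the example are hypothetical.

```python
from math import comb


def hyper_enrichment_p(n_universe, n_set, n_signature, n_overlap):
    """Probability of at least n_overlap shared genes by chance.

    n_universe: genes tested; n_set: genes in the annotated gene set;
    n_signature: genes in the split signature; n_overlap: genes in both.
    """
    upper = min(n_set, n_signature)
    total = comb(n_universe, n_signature)
    tail = sum(
        comb(n_set, k) * comb(n_universe - n_set, n_signature - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 genes tested, a 5-gene set, a 4-gene signature, 3 shared.
p_value = hyper_enrichment_p(n_universe=20, n_set=5, n_signature=4, n_overlap=3)
```

With these counts the p-value is 155/4845, or about 0.032.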
Reverse Transcriptase (RT)-qPCR
Cells were plated in 24 well plates at 50,000 cells per well in 0.5 ml maintenance medium at initiation of the experiment and then cultured as described above. Total RNA was extracted and genomic DNA was removed using the 96-well Direct-zol MagBead RNA Kit (Zymo Research). cDNA was synthesized from total RNA using the iScript™ Reverse Transcription System (BioRad, Hercules, CA). All qPCR reactions were performed using the PowerUp™ SYBR Green Master Mix (Thermo Fisher Scientific, Waltham, MA). The qPCR reactions were performed using a 7500 Fast Real-Time PCR System (Applied Biosystems, Carlsbad, CA): UDG activation (50°C for 2 min), polymerase activation (95°C for 2 min), and 40 cycles of denaturation (95°C for 15 sec), annealing (various temperatures for 15 sec), and extension (72°C for 60 sec). The primer sequences and annealing temperatures are provided in Table S2. Relative gene expression was determined using the Pfaffl method to account for differential primer efficiencies (Pfaffl 2001), using the geometric mean of the Cq values for beta-2-microglobulin (B2m) and 18S ribosomal RNA (Rn18s) for mouse gene normalization and of ribosomal protein L27 (RPL27) and B2M for human gene normalization. The Cq value from naïve, undifferentiated cultures was used as the reference point. Data are reported as “Relative Expression.”
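As an illustration of the efficiency-corrected calculation, the sketch below applies the Pfaffl ratio with two reference genes. It combines the references through the geometric mean of their efficiency-corrected quantities, a common generalization that may differ in detail from the computation used here. The function name, efficiencies, and Cq values are hypothetical.

```python
from math import prod


def pfaffl_ratio(target, references):
    """Efficiency-corrected relative expression of a target gene.

    Each gene is a dict with its amplification efficiency (2.0 means perfect
    doubling per cycle), its Cq in the calibrator (naive cells), and its Cq
    in the sample of interest.
    """
    def relative_quantity(gene):
        return gene["efficiency"] ** (gene["cq_calibrator"] - gene["cq_sample"])

    ref_quantities = [relative_quantity(ref) for ref in references]
    ref_geo_mean = prod(ref_quantities) ** (1 / len(ref_quantities))
    return relative_quantity(target) / ref_geo_mean


# Hypothetical values for a target gene and two reference genes.
cidec = {"efficiency": 2.0, "cq_calibrator": 30.0, "cq_sample": 25.0}
b2m = {"efficiency": 2.0, "cq_calibrator": 18.0, "cq_sample": 18.0}
rn18s = {"efficiency": 2.0, "cq_calibrator": 20.0, "cq_sample": 19.0}
ratio = pfaffl_ratio(cidec, [b2m, rn18s])
```

Here the target rises 32-fold and the reference geometric mean rises by a factor of the square root of 2, giving a relative expression of about 22.6.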
Lipid Accumulation
Cells were plated in 24 well plates at 50,000 cells per well in 0.5 ml maintenance medium at initiation of the experiment and then cultured as described above. Medium was removed from the differentiated cells, and they were rinsed with PBS. The cells were then incubated with Nile Red (1 µg/ml in PBS) for 15 min in the dark. Fluorescence (λex= 485 nm, λem= 530 nm) was measured using a Synergy2 plate reader (BioTek Inc., Winooski, VT). The fluorescence in experimental wells was normalized by subtracting the fluorescence measured in naïve (undifferentiated) cells and reported as relative fluorescence units (“RFUs”).
Mitochondrial Membrane Potential
Cells were plated in 96 well, black-sided plates at 10,000 cells per well in 0.2 ml maintenance medium at initiation of the experiment and then cultured as described above. Mitochondrial membrane potential was measured by treating differentiated cells with MitoOrange Dye according to the manufacturer’s protocol (Abcam, Cambridge, MA). Measurement of fluorescence intensity (λex= 485 nm, λem= 530 nm) was performed using a Synergy2 plate reader. The fluorescence in experimental wells was normalized by subtracting the fluorescence measured in naïve (undifferentiated) cells and reported as “RFUs.”
Fatty Acid Uptake
Cells were plated in 96 well, black-sided plates at 10,000 cells per well in 0.2 ml maintenance medium at initiation of the experiment and then cultured as described above. Fatty acid uptake was measured by treating differentiated cells with 100 μL of Fatty Acid Dye Loading Solution (Sigma-Aldrich, MAK156). Following a 1 hr incubation, measurement of fluorescence intensity (λex= 485nm, λem= 530nm) was performed using a Synergy2 plate reader. The fluorescence in experimental wells was normalized by subtracting the fluorescence measured in naïve (undifferentiated) cells and reported as fold difference from vehicle “RFUs.”
Mitochondrial Biogenesis
Cells were plated in 24 well plates at 50,000 cells per well in 0.5 ml maintenance medium at initiation of the experiment and then cultured as described above. Mitochondrial biogenesis was measured in differentiated cells using the MitoBiogenesis In-Cell Elisa Colorimetric Kit, following the manufacturer’s protocol (Abcam). The expression of two mitochondrial proteins (COX1 and SDH) was measured simultaneously and normalized to the total protein content via JANUS staining. Absorbance (OD 600 nm for COX1, OD 405 nm for SDH, and OD 595 nm for JANUS) was measured using a BioTek Synergy2 plate reader. The absorbance ratios of COX1/SDH in experimental wells were normalized to the naïve (undifferentiated) cells.
Oxygen Consumption
Cells were plated in Agilent Seahorse plates at a density of 50,000 cells per well in 0.5 ml maintenance medium at initiation of the experiment and then cultured as described above. Prior to all assays, the cell medium was changed to Seahorse XF Assay Medium without glucose (1 mM sodium pyruvate, 1 mM GlutaMax, pH 7.4), and the cells were incubated at 37°C in a non-CO2 incubator for 30 min. To measure mitochondrial respiration, the Agilent Seahorse XF96 Cell Mito Stress Test Analyzer (available at BUMC Analytical Instrumentation Core) was used, following the manufacturer’s standard protocol. The compounds and their concentrations used to determine oxygen consumption rate (OCR) included 1) 0.5 μM oligomycin, 1.0 μM carbonyl cyanide-p-trifluoromethoxyphenylhydrazone (FCCP) and 2 μM rotenone for 3T3-L1s; and 2) 5 μM oligomycin, 2.5 μM FCCP, and 10 μM rotenone for the primary human adipocytes.
Statistical Analyses
All statistical analyses were performed in R (v 3.4.3) and Prism 7 (GraphPad Software, Inc., La Jolla, CA). Data are presented as means ± standard error (SE). For 3T3-L1 experiments the biological replicates correspond to independently plated experiments. For human primary preadipocyte experiments the biological replicates correspond to distinct individuals’ preadipocytes (3 individuals in all). The qPCR data were log-transformed before statistical analyses. One-factor ANOVAs (Dunnett’s) were performed to analyze the qPCR and phenotypic data. Sequencing data from 3’DGE have been deposited into GEO (Accession: GSE124564).
Results
Development of novel taxonomic subgroups of PPARγ modifiers
Potential adipogens (chemicals that change the differentiation and/or function of adipocytes including endogenous, natural, therapeutic, synthetic and environmental chemicals) were identified by review of the literature and based on reports of PPARγ agonism or modulation of adipocyte differentiation. We also identified chemicals to act as negative controls. Our classification groups were based on “yes”, “no”, or “-” for the chemical’s potential ability to interact with or modify PPARγ (i.e., to alter its post-translational modifications), as noted in the “PPARγ Modifier” column in Table S1.
The classic mouse pre-adipocyte model, 3T3-L1 cells, was either maintained in an undifferentiated state (naive), or differentiated and treated with Vh (0.1% DMSO, final concentration) or with each of the chemicals (concentrations are reported in Table S1). Lipid accumulation, indicative of adipocyte differentiation, was determined after 10 days. A spectrum of adipocyte differentiation was induced (Figure 1). Of the 27 chemicals that significantly increased adipocyte differentiation, 18 were known PPARγ modifiers and 9 were suspected PPARγ modifiers. Mono(2-ethylhexyl) phthalate (MEHP), SR1664, and 15-deoxy-Δ12,14-prostaglandin J2 (15dPGJ2) are PPARγ agonists that were expected to increase adipocyte differentiation, but did not. LG268 and TBT are RXR agonists that were also expected to significantly increase adipocyte differentiation, but did not. The 3 chemicals that significantly downregulated adipocyte differentiation are all known to interact with the retinoic acid receptor. T007 is a PPARγ antagonist that was expected to decrease adipocyte differentiation, but did not. The negative controls did not significantly influence adipocyte differentiation. Many of the suspected PPARγ modifiers did not significantly increase adipocyte differentiation. We hypothesize that this likely resulted from the fact that we did not apply any chemical above 20 μM (with the exception of fenthion), while higher concentrations were used in previous studies.
When predicting PPARγ modifying status (“yes” vs. “no”), the mean AUC, precision, sensitivity, specificity, F1-score, and balanced accuracy from repeated 10-fold cross validation (over the training set) of the random forest with bag merging procedure were 0.89, 0.90, 0.80, 0.85, 0.85, and 0.82, respectively (Figure 2A). We observed the most drastic improvement of measured balanced accuracy, precision, and specificity by the bag merging procedure compared to other assessed strategies (Figure S1). The first two metrics in particular reflect expectation of relatively few false positive results compared to the other strategies. In the final model, the voting threshold that produced the highest F1-score was 0.53. Of the 17 chemicals of unknown interaction with PPARγ, 13 had a random forest vote greater than this value (Table 1). Of these 13 compounds, four had a vote > 0.88. These chemicals included quinoxyfen, tonalide, allethrin, and fenthion. Of the 1,199 genes used to train the final classification model, ribosomal protein L13 (Rpl13) and cell death-inducing DFFA-like effector C (Cidec) had the highest measured Gini importance (Figure 2B), with Rpl13 mostly down-regulated and Cidec mostly up-regulated by known PPARγ-modifying compounds (Figure S2). This is consistent with known relationships between cellular processes and adipogenesis. Specifically, ribosomal machinery is down-regulated during human adipogenesis (Marcon et al. 2017). Cidec is a lipid droplet structural gene, the expression of which is positively correlated with adipocyte lipid droplet size, insulin levels, and glycerol release (Ito et al. 2010).
The taxonomy derived by the K2 clustering procedure recapitulates many known characteristics shared by PPARγ-modifying compounds included in this study (Figure 3). For example, three terminal subgroups are labelled in Figure 3 based on their shared characteristics. These include: flame retardants (tetrabromobisphenol A (TBBPA) and triphenyl phosphate (TPHP)), phthalates (MBUP, MEHP, MBZP, and BBZP), and RXR agonists (TBT and LG268). Interestingly, we observe two subgroups containing all of the four thiazolidinediones, with rosiglitazone (Rosig) segregating with the non-thiazolidinedione S26948 and pioglitazone, MCC 555, and troglitazone segregating together.
All of these terminal subgroups fall within a larger module containing 26 chemicals, highlighted by expression patterns consistent with increased adipogenic activity including up-regulation of genes significantly enriched in pathways involved in adipogenesis and lipid metabolism (Soukas et al. 2001). In addition, these chemicals also demonstrated consistent down-regulation of extracellular component genes. Up-regulation of extracellular matrix (ECM) genes is known to be associated with obesity (Huber et al. 2007), though to our knowledge down-regulation of extracellular matrix genes has not been reported as a direct result of exposure to PPARγ agonists in adipocytes. This effect was strongest in cells exposed to thiazolidinediones and flame retardants, two classes of chemicals well-described to be strong PPARγ agonists (Berger et al. 1996; Fang et al. 2015; Riu et al. 2011). The subgroup of thiazolidinediones, which also includes S26948, is highlighted by up-regulation of genes involved in beta-oxidation, the process by which fatty acids are metabolized. This metabolic process has been previously observed with Rosig exposure (Benton et al. 2008).
The gene expression profiles of the remaining 17 chemicals, including naïve controls, demonstrate markedly less up-regulation of genes regulated by PPARγ. Of these 17 chemicals, a subgroup of 8 (BADGE, PrPar, 15dPGJ2, SR1664, METBP, DINP, BuPA, and fenthion) includes the reference vehicle signature. Compared to the next closest subgroup, expression profiles of these compounds are characterized by up-regulation of adipogenesis related pathways indicative of modest PPARγ agonism. Additionally, a subgroup comprised of 9CRA, DBT, LG754, ATRA, and the naïve exposure signatures is characterized by down-regulation of genes involved in adipogenesis and lipid metabolism, indicating repression of PPARγ activity. Interestingly, both protectin D1 (Prote) and resolvin E1 (Resol) cluster closely in a subgroup with the CDK inhibitor, roscovitine (Rosco), which is known to induce insulin sensitivity and brite adipogenesis (Wang et al. 2016).
In summary, our top-down clustering approach elucidates subgroups of PPARγ activity modifying compounds, characterized by differential transcriptomic activity at each split. Annotation of these transcriptomic signatures reveals clear differences in the set and magnitude of perturbations to known adipocyte biological processes by subgroups of chemicals. Membership of these subgroups confirms many expectations, such as subgroups composed solely of phthalates, thiazolidinediones, or flame retardants. The novel observation that the transcriptomic patterns induced by Resol and Prote segregate with the CDK5 inhibitor Rosco suggests that Resol and Prote may modify PPARγ phosphorylation and activation distinctly from synthetic PPARγ ligands.
Adipogen portal
Given the breadth of results generated by this analysis, this description is far from exhaustive. As such, we have created an interactive website (https://montilab.bu.edu/Adipogen/) to support exploration of these results at both the gene and pathway levels. The portal is built around a point-and-click dendrogram of the clustering results, as in Figure 3. Selecting a node of this dendrogram populates the rest of the portal with the chemical lists, differential analysis, and pathway-level hyper-enrichment results for each subgroup defined by a split. For instance, selecting node “H” shows the chemicals in each subgroup to the right (Group 1 = Honokiol, T007907; Group 2 = Prote, Resol, and Rosco), as well as the differential gene signature for each group below. Selecting Cidec, the top gene in the Group 2 signature, displays hyper-enrichment results for gene sets that include Cidec and have a nominal p-value < 0.50. The hyper-enrichment results for all genes can be found below this table. Finally, selecting a gene set name displays the gene set members in the bottom frame of the portal, with gene hits in bold. All tables are queryable and downloadable.
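For readers unfamiliar with hyper-enrichment, the sketch below computes the nominal p-value of a gene-set overlap, assuming the test is a one-sided hypergeometric test (the usual meaning of the term; the portal's exact implementation is not described here). The function name, argument names, and the toy counts are illustrative assumptions rather than values from this study.

```python
from math import comb


def hyper_enrichment_pvalue(universe_size, set_size, signature_size, overlap):
    """P(X >= overlap) when signature_size genes are drawn without replacement
    from a universe containing set_size gene-set members."""
    total = comb(universe_size, signature_size)
    max_overlap = min(set_size, signature_size)
    tail = sum(
        comb(set_size, i) * comb(universe_size - set_size, signature_size - i)
        for i in range(overlap, max_overlap + 1)
    )
    return tail / total


# Toy example: 10 genes measured, a gene set of 3, a signature of 2 genes,
# both of which fall in the gene set.
p = hyper_enrichment_pvalue(universe_size=10, set_size=3, signature_size=2, overlap=2)
print(round(p, 4))
```

A small p-value indicates that the signature contains more members of the gene set than expected by chance, which is the basis for the pathway-level annotation of each split.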
Investigation of the white and brite adipocyte taxonomy
We aimed to better assess how the distinction between gene expression patterns translated into functional differences in the induced adipocytes. Therefore, we selected chemicals from representative groups related to PPARγ modification for genotypic and phenotypic characterization. We compared a strong PPARγ therapeutic agonist that also modifies PPARγ phosphorylation (Rosig), a chemical that modifies only PPARγ phosphorylation (Rosco), a weak PPARγ agonist and endogenous molecule (15dPGJ2), and two known environmental PPARγ ligands (TBBPA and TPhP). 3T3-L1 cells were either maintained in an undifferentiated state (naive) or differentiated and treated with Vh (0.1% DMSO, final concentration), Rosig (1 μM), Rosco (4 μM), 15dPGJ2 (1 μM), TBBPA (20 μM), or TPhP (10 μM). Gene expression and phenotype were determined after 10 days. Analysis of mitochondrial membrane potential confirmed that the concentrations used were not toxic (Figure S3A).
The balance of white and brite adipogenesis is controlled by PPARγ, and the balance is skewed toward brite adipogenesis by recruitment of specific coactivators to PPARγ (e.g., PGC1α and PRDM16) (Chrisman et al. 2018; Puigserver et al. 1998; Qiang et al. 2012; Seale et al. 2007). As expected, all of the PPARγ agonists (Rosig, 15dPGJ2, TBBPA, TPhP) significantly increased Pparg expression, while Rosco did not (Figure 4A). Similarly, the PPARγ agonists induced expression of genes common to all adipocytes (Plin, Fabp4, Cidec), while Rosco did not (Figure 4B). In contrast, only the chemicals known to prevent phosphorylation of PPARγ at S273 (i.e., Rosig and Rosco) induced expression of Pgc1a (Figure 4A) and of brite adipocyte genes (Cidea, Elovl3) (Figure 4C). Rosig, Rosco, and 15dPGJ2 induced the expression of Adipoq (Figure 4C). For brite adipocytes to catabolize fatty acids and expend excess energy, they must upregulate β-oxidation genes and mitochondrial biogenesis. In line with their browning capacity, Rosig and Rosco upregulated expression of Ppara and the mitochondrial marker gene Acaa2 (Figure 4D). Furthermore, only Rosig and Rosco strongly upregulated Ucp1, whose protein product uncouples the H+ gradient generated by the mitochondrial electron transport chain from ATP synthesis (Figure 4D).
Next, we determined if changes in gene expression correlated with changes in adipocyte function. Fatty acid uptake by adipocytes is necessary for lipid droplet formation and for removal of free fatty acids from circulation. Compared to vehicle-treated cells, all of the adipogens significantly induced fatty acid uptake (Figure 5). In order to increase the utilization of fatty acids, mitochondrial number and/or function must increase. Only Rosig and Rosco significantly induced mitochondrial biogenesis, while 15dPGJ2 and the environmental PPARγ agonists had no effect (Figure 6). Interestingly, Rosig significantly reduced the pH of the culture medium, suggesting that the rosiglitazone-induced adipocytes were highly energetic (Figure S4).
Rosig and Rosco, a therapeutic PPARγ ligand and a PPARγ modifier, respectively, induced gene expression and metabolic phenotypes related to upregulation of mitochondrial processes and energy expenditure. In comparison, the environmental PPARγ ligands (TBBPA and TPhP) did not induce the gene and phenotypic markers of brite adipocytes.
Identification of novel adipogens that favor white adipogenesis
Quinoxyfen (Quino) and tonalide (Tonal) were two of the environmental chemicals that received the highest PPARγ modifier votes and segregated distinctly from the therapeutic ligands (Table 1). Thus, we tested the hypothesis that Quino and Tonal are adipogens that do not induce gene expression or metabolic phenotypes indicative of healthy energy expenditure or brite adipogenesis. We tested this hypothesis in the 3T3-L1 model and primary human preadipocytes. In 3T3-L1 cells, Quino and Tonal significantly induced lipid accumulation (Figure 7A). They significantly increased expression of the white adipocyte marker gene, Cidec. However, Quino failed to significantly increase expression of Cidea, the brite adipocyte marker gene, while Tonal significantly suppressed Cidea expression (Figure 7B). Accordingly, Quino and Tonal increased fatty acid uptake (Figure 7C) but not mitochondrial biogenesis (Figure 7D). Quino increased maximal cellular respiration, but did not change spare capacity (Figure 7E). Consistent with the 3T3-L1 results, in human preadipocytes Quino and Tonal significantly induced lipid accumulation (Figure 8A) and expression of CIDEC (Figure 8B). Furthermore, Quino failed to induce CIDEA expression, while Tonal suppressed CIDEA expression (Figure 8B). In contrast to 3T3-L1 cells, Quino and Tonal did not increase fatty acid uptake over that induced by the hormonal cocktail (Figure 8C). However, the reduction in mitochondrial biogenesis and cellular respiration (Figure 8E) can still explain the ability of these chemicals to increase lipid accumulation.
In summary, the combination of random forest classification voting and gene expression clustering identified two environmental contaminants likely to favor the induction of white adipocytes. Hypothesis testing carried out with functional analyses confirmed that Quino and Tonal induce white, but not brite, adipogenesis in both mouse and human preadipocyte models. Importantly, hypothesis testing can be conducted with readily available cell lines and analytical reagents.
Discussion
The chemical environment has changed dramatically in the past 40 years, and an epidemic increase in the prevalence of obesity has occurred over the same time period. Yet, it is still unclear how chemical exposures may be contributing to adverse metabolic health effects. New tools are needed not just to identify potential adipogens, but to provide information on the type of adipocyte that is formed. Here, we have both developed a new analytical framework for adipogen identification and characterization and tested its utility in hypothesis generation. We show that adipogens segregate based on distinct patterns of gene expression, which we used to identify two environmental contaminants for hypothesis testing. Our results support the conclusion that quinoxyfen and tonalide have a limited capacity to induce the health-promoting effects of mitochondrial biogenesis and brite adipocyte differentiation.
Adipogen taxonomy identifies environmental chemicals that favor white adipogenesis
Of the four compounds predicted with high confidence to modify PPARγ activity, quinoxyfen and tonalide are of particular public health concern. Quinoxyfen is among a panel of pesticides with different chemical structures and modes of action (i.e., zoxamide, spirodiclofen, fludioxonil, tebupirimfos, forchlorfenuron, flusilazole, acetamiprid, and pymetrozine) that induce adipogenesis and adipogenic gene expression in 3T3-L1 cells (Janesick et al. 2016). Quinoxyfen is a fungicide widely used to prevent the growth of powdery mildew on grapes (Duncan et al. 2018). We chose to test tonalide because it was reported to strongly increase adipogenesis in 3T3-L1 cells, although it was concluded that this response was not due to direct PPARγ activation (Pereira-Fernandes et al. 2013). Our results differ in this regard. Tonalide bioaccumulates in adipose tissue of many organisms including humans, and exposure is widespread because of its common use in cosmetics and cleaning agents (Kannan et al. 2005). Combined, tonalide and galaxolide constitute 95% of the polycyclic musks used in the EU market and 90% of that of the US market (HERA 2004).
Our results support the conclusion that quinoxyfen and tonalide are adipogenic chemicals that likely act through PPARγ. In the clustering analysis, quinoxyfen and tonalide fell within the largest subgroup, which comprises eight potential strong PPARγ agonists (Figure 3). Notably, this cluster includes both synthetic/therapeutic compounds (nTZDpa, tesaglitazar, telmisartan) and environmental compounds (allethrin, tributyl phosphate, and TPhP) and is characterized by general up-regulation of pathways of adipogenic activity. However, quinoxyfen and tonalide generate adipocytes that are phenotypically distinct from adipocytes induced by therapeutics such as rosiglitazone. Quinoxyfen and tonalide induced white adipocyte functions such as increased lipid accumulation but, in contrast to rosiglitazone, did not induce mitochondrial biogenesis, energy expenditure, or brite adipocyte gene expression.
We hypothesize that the differences in adipocyte phenotype that are induced by environmental PPARγ ligands (e.g., TBBPA, TPhP, quinoxyfen, tonalide) result from the conformation that PPARγ assumes when liganded with these chemicals rather than with therapeutic agents. These differences in conformation determine not only the efficacy with which PPARγ is activated but also its transcriptional repertoire (Chrisman et al. 2018). Access to post-translational modification sites and coregulator binding surfaces depends upon the structure that PPARγ assumes. Furthermore, the white adipogenic, brite/brown adipogenic, and insulin-sensitizing activities of PPARγ are regulated separately through differential coregulator recruitment (Villanueva et al. 2013) and post-translational modifications (Choi et al. 2010; Choi et al. 2011), with ligands having distinct abilities to activate each of PPARγ’s functions. Suites of genes have been shown to be specifically regulated by the acetylation status of PPARγ (SirT1-mediated) (Qiang et al. 2012), by the phosphorylation status of PPARγ (ERK/MEK/CDK5-mediated) (Choi et al. 2010; Wang et al. 2016), and/or by the recruitment of Prdm16 to PPARγ (Seale et al. 2007). Future work will investigate the connections between the phosphorylation status of PPARγ liganded with environmental PPARγ ligands such as quinoxyfen and tonalide, the recruitment and release of coregulators, and the ability of PPARγ to recruit transcriptional machinery to specific DNA-binding sites.
Analytical approaches for adipogen characterization
In this study, we performed high-throughput, cost-effective transcriptomic screening to profile adipocytes formed from 3T3-L1 preadipocytes exposed to a panel of compounds of known and unknown adipogenic impact. Common to toxicogenomic projects, this panel-based study design allows for characterization of the extent to which each chemical modifies differentiation (in this case, adipogenesis as related to the change in lipid accumulation). It also supports the exploration of how subsets of chemicals influence multiple biological processes that determine the functional status of a cell (in this case, processes that determine white vs. brite adipogenesis). Exploration of these biological processes allows for the prediction of the phenotypic impact of previously unclassified compounds, as well as for the characterization of the heterogeneity of the cellular activity of compounds with similar known phenotypic impact. Here we have performed both types of analyses: first through the implementation and application of random forest classification models to identify potential PPARγ activity-modifying compounds, and second via the recursive clustering of the data to identify and characterize taxonomic subgroups of known and predicted PPARγ activity-modifying compounds.
For both analyses, we introduced amendments to commonly used machine learning procedures, to improve accuracy and resolution of the acquired result. For the classification task, we amended the random forest algorithm to tailor it to study designs typically adopted in toxicogenomic projects (see Methods). With the addition of an extra step to average the expression across replicates of the bootstrapped samples, we observe consistently higher performance across conventional metrics than with the standard algorithm (Figure S1). For the clustering task, we employ a procedure where we recursively divide sets of chemicals into two subgroups and assess the robustness of each division, as well as annotate transcriptional drivers of each division. As a result, we are not limited to interpreting the clustering results as mutually exclusive groups, but rather as a taxonomy of subgroups where sets of compounds share some transcriptional impact and differ in others, as is expected given the dynamic nature of the modifications by which compounds directly and indirectly affect PPARγ activity.
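The replicate-averaging step can be pictured with the following minimal sketch, which draws one bootstrap sample of replicate profiles and then averages the drawn replicates of each chemical before any tree would be fit. It illustrates only this single step under stated assumptions: the function name, the data layout (one list of expression values per replicate), and the toy values for the hypothetical chemicals `chem_A` and `chem_B` are invented for illustration, and tree construction, voting, and the exact details given in Methods are not reproduced.

```python
import random
from collections import defaultdict


def bootstrap_replicate_means(profiles, chemicals, rng):
    """Bootstrap replicate profiles, then average them within each chemical.

    profiles  : list of per-replicate expression vectors (lists of floats)
    chemicals : chemical label for each replicate, same order as profiles
    rng       : random.Random instance, passed in for reproducibility
    """
    n = len(profiles)
    drawn = [rng.randrange(n) for _ in range(n)]  # sample with replacement
    by_chemical = defaultdict(list)
    for i in drawn:
        by_chemical[chemicals[i]].append(profiles[i])
    # One averaged profile per chemical represented in this bootstrap sample.
    return {
        chem: [sum(values) / len(rows) for values in zip(*rows)]
        for chem, rows in by_chemical.items()
    }


# Hypothetical data: two replicates each of two chemicals, two genes.
toy_profiles = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [30.0, 40.0]]
toy_labels = ["chem_A", "chem_A", "chem_B", "chem_B"]
averaged = bootstrap_replicate_means(toy_profiles, toy_labels, random.Random(0))
print(averaged)
```

In this sketch each tree would be trained on one averaged profile per chemical rather than on individual replicates, so replicate-level noise is smoothed while the bootstrap still varies which replicates contribute to each average.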
Future work will generalize the random forest method to incorporate more complex study designs. To this end, the classification approach adopted in this project is being developed as a random forest software tool, soon to be made available as an R package, that allows independent functions to be interchanged at different steps of the algorithm. The strength and utility of this approach extend beyond toxicogenomic studies: it can be used in a variety of high-throughput screening applications, including drug discovery, such as the Connectivity Map (CMAP) (Subramanian et al. 2017), and longitudinal molecular epidemiology studies, such as the Framingham Heart Study (Mahmood et al. 2014).
Conclusions
Emerging data implicate contributions of environmental metabolism-disrupting chemicals to perturbations of pathways related to metabolic disease pathogenesis, such as disruptions in insulin signaling and mitochondrial activity. There is still a gap in identifying and examining how environmental chemicals can act as obesity-inducing and metabolism-disrupting chemicals. Our implementation of novel strategies for classification and taxonomy development can help identify environmental chemicals that are acting on PPARγ. Further, our approach provides a basis from which to investigate effects of adipogens on not just the generation of adipocytes, but potentially pathological changes in their function. To this end, we have shown how two environmental contaminants, quinoxyfen and tonalide, are inducers of white adipogenesis.