Abstract
Cells respond in complex ways to topographies, making it challenging to identify a direct relationship between surface topography and cell response. A key problem is the lack of informative representations of topographical parameters that translate directly into biological properties. Here, we present a platform to relate the effects of nanotopography on morphology to function. This platform utilizes the ‘morphome’, a multivariate dataset containing single cell measures of focal adhesions, the cytoskeleton, and chromatin. We demonstrate that nanotopography-induced changes in cell phenotype are uniquely encoded by the morphome. The morphome was used to create a Bayesian linear regression model that robustly predicted changes in bone, cartilage, muscle and fibrous tissue gene expression induced by nanotopography. Furthermore, the morphome effectively predicted nanotopography-induced phenotype within a complex co-culture microenvironment. Thus, the morphome enables the cell function-oriented exploration of new topographies, with potential applications in the development of novel surface-patterned biomaterials for tissue implants.
Introduction
Biomedical implants continue to be developed to improve patient outcomes. One way to enhance implant efficacy and tissue regeneration is to vary substrate texture with nanotopographies. Topographies at the cell-material interface are widely shown to direct cell behaviour: nanopillars change cell morphology1; nanogratings drastically alter lipid metabolism2, and pluripotent3–5 and multipotent cell differentiation6; nanogradients improve stem cell cardiomyogenic differentiation; and subtle changes to nanopit geometric arrangement switch human mesenchymal stem cells from multipotent to osteogenic fate7–10. Morphological responses to nanotopography are manifested through varying focal adhesion size17–19, orientation19–21, and composition22, 23, and through changes in actin contractility and nuclear deformation24.
A quantitative relationship exists between a material’s physicochemical structure and its biological activity. Rational drug design has long relied on molecule solubility, ionisation and lipophilicity to predict activity11. Protein engineering has similarly modelled protein-peptide interactions from protein structure12. Cell metabolic activity correlates with synthetic polymer composition, glass transition temperature, and water contact angle13. Meanwhile, bacterial attachment can be predicted from descriptors of secondary ionic hydrocarbon chains14. In contrast to active biomolecules, the mechanotransductive effects of topography on cell response do not intuitively relate to topography length scale, isotropy, geometry, and polarity. This limits the discovery of functional topography to the screening of libraries for hits using a single, representative cell type15–18. Among its limitations (which include cost, inefficiency and the sampling of a small topography space), this screening approach disregards the cell specificity of response to nanotopography. Thus, it is vitally important to develop a systematic method to capture cell phenotypes induced by topography.
Here, we demonstrate a platform that encompasses cell phenotype and morphological response induced by nanotopography using image-based cellular features. We obtained measurements of focal adhesions, actin cytoskeleton and chromatin, referred to here as the ‘morphome’, from single cells on nanotopographies. We observed a unique morphological signature based on cell identity and nanotopography in the morphome. We also identified changes in the transcription of myogenic, osteogenic, chondrogenic and fibroblastic genes across all combinations of cells and nanotopographies. We used a Bayesian linear regression model to directly relate the morphological and phenotypic changes induced by nanotopography. Using the morphome as predictors and without prior knowledge about the nanotopography, the model robustly predicted gene expression induced by nanotopography. A new morphome dataset from a co-culture of osteoblastic and fibroblastic cells verified that the model positively predicts bone tissue formation induced by nanotopography, even within a complex microenvironment. The lack of topography-specific parameters in the model and the presence of the mechanosensing machinery in all cell types highlight the broad applicability of this platform to many biomaterial and cell systems, thereby supporting the function-oriented exploration and rational design of new topographies.
Results
Measuring morphological cell responses to topography: the morphome
We used nanopit topographies consisting of pits 120 nm in diameter and 100 nm deep, arranged with a 300 nm centre-to-centre distance in a square array (SQ)7 or a hexagonal array (HEX)25–27, or with the centre-to-centre distance offset from 300 nm by 50 nm in both x and y directions (NSQ array)7, 8. An unpatterned (‘FLAT’) surface was used as a control (Figure 1A).
We employed cells of the musculoskeletal system due to the diverse responses of muscle, bone, cartilage and fibrous cell types to nanotopographies27–29. Mouse myoblasts, osteoblasts, chondrocytes, fibroblasts, and pre-osteoblast30 and pre-myoblast31 progenitors were grown on nanotopographies. Responses of each cell type on all nanotopographies were measured, providing 24 unique combinations of cell type and nanotopography (Figure S1). Effects of nanotopography on morphological characteristics such as cell and nuclear area, pFAK activation levels, actin intensity and focal adhesion area were evident (Figure S2-S4).
Morphology (shape and geometry of different compartments), texture (spatial patterns of fluorescence), intensity (total fluorescence value), and radial distribution of intensity (measuring radial arrangement of fluorescence), were analysed for chromatin, actin, focal adhesion kinase (FAK) and phosphorylated FAK (pFAK) signals at day 2 of culture (Supplementary data). The morphome consisted of 624 single cell measurements (“features”), with 75 chromatin and nuclear features, 211 actin and whole-cell features, 168 FAK features, and 170 pFAK features (Figure 1B). Quantitative polymerase chain reaction (qPCR) was then used to assess changes in lineage marker expression induced by nanotopography by day 7. Machine learning was then applied on the morphome: (i) hierarchical clustering was used to uncover unique patterns of morphological features that distinguish cell type-specific responses on nanotopographies, and nanotopography-specific morphological changes induced within the cell; (ii) Bayesian linear regression was used to predict myogenic, osteogenic, chondrogenic and fibrogenic gene expression induced by nanotopography using the morphome as predictors (Figure 1D).
The morphome depicts unique cell signatures induced by nanotopographies
Patterns of nanotopography-induced morphological changes were visible from the morphome (Figure 2A). Immediately apparent were large blocks of actin, FAK and pFAK measurements with similar values within a cell type on a specific nanotopography. These features correspond to increasingly complex measures of texture, granularity and radial intensity distribution for chromatin, actin, FAK and pFAK (Table S1). Texture is derived from the frequency of pixel gray levels and thus reflects homogeneity, with high texture values indicating coarseness. Granularity measures an object’s coarseness, with higher values indicating heterogeneity of pixel intensities and coarser texture. The Zernike coefficient measures the spatial arrangement of intensity as it resembles the increasingly complex Zernike polynomials (Figure 2A). Interestingly, higher Zernike polynomials resemble the punctate shape and spatial distribution of focal adhesions32.
Nanotopography induces cell type-specific gene expression changes
Gene expression was used to determine the effect of nanotopographies on cell phenotype (Figure S5). We discuss here the changes induced by nanotopography on lineage-specific gene expression relevant to the cell type. Pre-myoblasts showed significantly higher expression of the early lineage marker MYOD1, and of the late markers MYOG and MYH7 when cultured on SQ surfaces relative to FLAT surfaces (Figure 2B-2D). Both pre-osteoblasts and osteoblasts showed increased expression of early (RUNX2, SP7) and late (BGLAP, SPP1) osteogenic markers when cultured on NSQ relative to FLAT (Figure 2E-2H), in line with previous studies7,8,10,33. Chondrocytes cultured on HEX showed increased expression of COL2A1 (early marker) and ACAN (late marker) compared to those cultured on FLAT (Figure 2I-2K). Meanwhile, fibroblasts showed increased expression of pathogenic fibrosis markers, TGFB1I1, COL3A1 and ELN34, on all nanotopographies compared with FLAT (Figure 2M-2O). In general, a cell type-specific response to nanotopography was observed. Each lineage was notably enhanced in a specific cell type and nanotopography combination: SQ stimulated the myoblast phenotype, NSQ enhanced the osteoblast phenotype, HEX stimulated the chondrocyte phenotype, while all surfaces except FLAT stimulated the fibrotic phenotype.
Distinct nanotopographical responses of single cells are reflected in the morphome
A subset of the morphome, consisting of 185 features, varied significantly across nanotopographies (Figure 3, Supplementary Data). Hierarchical clustering revealed that measurements of actin and pFAK radial distribution were most dissimilar to each other (Figure S6). Notably, the morphome of cells on FLAT already shows that unique cell type signatures are manifested in the morphome.
Hierarchical clustering within each cell type reflected the cell type specificity of nanotopography response. Here, we highlight the morphome for the cell type and nanotopography combinations that induced the highest lineage-specific gene expression. When compared to FLAT, pre-myoblasts on SQ showed high average values of: focal adhesion textures (Figure 3A, cluster 2); pFAK radial distribution (cluster 4); nuclear morphometry; and chromatin textures (cluster 4). The morphome of pre-myoblasts cultured on SQ reflects the need for FAK phosphorylation and for its preferential localization at stress fiber edges, which is necessary for myotube differentiation36, 37. In contrast, myoblasts on SQ showed a particularly high average value for chromatin granularity and nuclear morphometry (cluster 4), and near-zero values for radial distribution of actin and of focal adhesions (cluster 1). High chromatin granularity observed for both pre-myoblasts and myoblasts on SQ denotes chromatin heterogeneity, condensation and transcriptional activity, which is reportedly higher prior to myotube formation38, 39.
Pre-osteoblasts on SQ and NSQ had high average values for pFAK radial distribution, intensity, granularity and texture, and high average values for granularity of chromatin and actin (clusters 1-5). However, pre-osteoblasts on SQ had higher order pFAK and FAK radial distribution than on NSQ, which induced the highest expression of osteogenic markers. The morphome of pre-osteoblasts grown on NSQ featured radially variable actin that resembles that of bone cells, which have high contractility and actin stress fibers40.
For osteoblasts, the differences between the SQ and NSQ morphome were more prominent: NSQ induced lower average values of focal adhesion granularity, chromatin texture and nuclear morphometry (cluster 1), and higher average values for focal adhesion radial distribution (clusters 4-7) compared to SQ (Figure 3D). The osteoblast morphome on NSQ indicates that focal adhesions localize at regular intervals along the periphery, which is associated with osteogenesis41. Furthermore, changes in nuclear morphometry attributed to spreading after growth on stiff surfaces are also associated with osteogenic differentiation42.
Chondrocytes on HEX, which significantly increased chondrogenic marker gene expression relative to FLAT, showed high average values of radial distribution, texture and granularity of actin and FAK, high average nuclear morphometry, and low average values of pFAK and chromatin measurements (Figure 3E, clusters 1-3). These characteristics reflect the morphological changes (including reduced contractility and stress fiber formation, increased cell circularity, and decreased cell spreading40, low FAK phosphorylation43 and poor focal adhesion formation44) of stem cells undergoing chondrogenesis.
The morphome of fibroblasts cultured on FLAT had high average values of both actin and focal adhesion measurements (Figure 3F, clusters 3 and 5). The highly uniform radial arrangement of focal adhesions and actin of cells on FLAT indicates reduced polarization and contractile morphology of fibroblasts activated to a fibrotic state45. Inflammation pathways are reportedly increased in fibroblasts on HEX46, inducing low adhesion that is reflected in low actin and focal adhesion radial distribution (clusters 3-9). Fibroblasts grown on NSQ and HEX showed low average values of focal adhesion and actin radial distribution but high values when grown on SQ (clusters 2-5).
Overall, the morphome reflected cell type-specific responses to nanotopography. The morphome also exhibited distinct patterns for different cell types (e.g. pre-osteoblasts vs osteoblasts) despite similarity in origin (Table S2). A multi-class logistic regression classifier was able to accurately distinguish 6 different cell types using the morphome (Figure S7). The radial arrangement of actin and focal adhesions was a critical distinction of the musculoskeletal cell types induced by nanotopography, while the arrangement of actin fibers into stress fibers or into cortical, circular bundles provided information on various cell states.
We also clustered the morphome based on cell type (Figure S8-S9, Supplementary data). Patterns emerge in the morphome in direct response to nanotopography: NSQ induced high average values of pFAK radial distribution, texture and granularity; and HEX induced high average values of actin radial distribution. Correlation between the dendrograms confirms that the morphome clusters of different nanotopographies were dissimilar to each other (Table S3). The morphome enabled discrimination between 4 nanotopographies with 68% overall accuracy (87% classification rate for HEX) using multi-class logistic regression (Figure S10).
The morphome predicts nanotopography induced gene expression changes
The Spearman rank correlation revealed that varying degrees of correlation exist between morphome features and gene expression (Figure S11). We hypothesized that the features of the morphome would sufficiently encompass cell phenotypes induced by nanotopography, thus allowing prediction. We utilized Bayesian linear regression to predict gene expression using the morphome features as predictors (for the explicit model definition, see Materials and Methods). A Bayesian linear regression model reflects uncertainty in the estimation of regression weights, in contrast to the point estimates obtained from maximum likelihood regression. The expression of each gene was modelled independently, thereby creating 14 different equations with variable weighting of the morphome features. Importantly, the model was trained without any prior knowledge of topography type, instead relying on the morphome to encode this information.
The morphome clearly captured gene expression changes induced by nanotopography (Figure 4A). The heterogeneity inherent in single cells, usually uncaptured by population measurements of gene expression, is apparent in the variance of the predictions using the morphome and the model. The mean absolute error (MAE) for prediction of all genes was between 10% (for prediction of MYOD1, MYOG and MYH7) and 21% (for prediction of COL3A1, Table S4). Features with the highest contribution in predicting the 14 different genes were predominantly measurements of FAK texture and radial distribution, actin texture, and chromatin granularity (Figure S12, Supplementary data). pFAK activation, as indicated by pFAK/FAK intensity ratio, consistently contributed to the prediction of all 14 genes. pFAK was particularly important to the model due to its relevance in contractility induced by nanotopography19, fibrosis and scar tissue formation48, in vitro osteogenesis49, 50, and chondrogenic maintenance43. Through the variability in the model, we obtain realistic expectations of single cell predictions from population level measurements.
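As a minimal sketch of the error metric quoted above, the snippet below computes a mean absolute error between observed and predicted expression. It assumes expression values have been rescaled to the 0-1 range (as described in Materials and methods), so the result reads directly as a fraction of maximum expression. The function name and the sample values are illustrative and are not taken from the study.

```python
def mean_absolute_error(observed, predicted):
    """Average absolute difference between paired observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Illustrative values only: rescaled expression of one gene for three cells.
observed = [1.0, 0.5, 0.0]
predicted = [0.9, 0.5, 0.3]
mae = mean_absolute_error(observed, predicted)
print(f"MAE = {mae:.1%} of maximum expression")
```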
The sensitivity and predictive power of the morphome was verified by iteratively training a model without one combination of cell type and topography (Figure 4E). Drastic increases in MAE were observed when lineage-specific genes were predicted from models that excluded the particular cell type lineage being tested, regardless of nanotopography.
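The hold-out scheme described here can be sketched as a generator over cell type and topography combinations. This is only an illustration under stated assumptions: each single cell is labelled with a `(cell_type, topography)` tuple, and the held-out combination is used as the evaluation set, which the text does not state explicitly. The function name is hypothetical.

```python
def leave_one_combination_out(combos):
    """Yield (held_out, train_idx, test_idx) for each unique combination.

    combos: one (cell_type, topography) tuple per single cell.
    Assumption: the excluded combination serves as the evaluation set.
    """
    for held_out in sorted(set(combos)):
        train_idx = [i for i, c in enumerate(combos) if c != held_out]
        test_idx = [i for i, c in enumerate(combos) if c == held_out]
        yield held_out, train_idx, test_idx


# Illustrative labels for three cells.
cells = [("fibroblast", "FLAT"), ("fibroblast", "SQ"), ("fibroblast", "FLAT")]
for held_out, train_idx, test_idx in leave_one_combination_out(cells):
    print(held_out, "train:", train_idx, "test:", test_idx)
```

With 6 cell types and 4 surfaces, this loop would produce 24 folds, one model per excluded combination.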
The morphome encompasses cell response to topography within a complex environment
We demonstrate the application of the linear regression model by predicting the outcome of pre-osteoblasts and fibroblasts co-cultured on nanotopographies. A new morphome was obtained from all cells on the entire nanotopography (Figure S13). This co-culture morphome was then used in the linear regression model to predict gene expression (Figure S14).
For visualization, the sum of predicted osteogenic (RUNX2, SP7, BGLAP and SPP1) and fibrotic (TGFB1I1, COL3A1 and ELN) genes was plotted against the spatial coordinates of the pre-osteoblasts and fibroblasts. Osteogenic gene expression was highest on NSQ, which also induced concentrated areas of highly expressing cells (Figure 5A). These areas might represent hotspots of osteogenic paracrine signaling induced by the NSQ nanotopography9. In contrast, osteogenic gene expression was low and homogeneous on the FLAT, SQ and HEX topographies.
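The per-cell summary used for these plots can be sketched as follows. The gene groupings come from the text; the dictionary layout, the function name and the numeric values are assumptions made for illustration.

```python
OSTEOGENIC = ("RUNX2", "SP7", "BGLAP", "SPP1")
FIBROTIC = ("TGFB1I1", "COL3A1", "ELN")


def lineage_scores(predicted):
    """Return (osteogenic_sum, fibrotic_sum) for one cell.

    predicted: dict mapping gene name to predicted (rescaled) expression.
    """
    osteogenic = sum(predicted[gene] for gene in OSTEOGENIC)
    fibrotic = sum(predicted[gene] for gene in FIBROTIC)
    return osteogenic, fibrotic


# Illustrative predictions for one cell; each cell would also carry its x, y position.
cell = {"RUNX2": 0.5, "SP7": 0.25, "BGLAP": 0.25, "SPP1": 0.5,
        "TGFB1I1": 0.125, "COL3A1": 0.25, "ELN": 0.125}
print(lineage_scores(cell))
```

Each pair of sums can then be plotted at the spatial coordinates of the corresponding cell to give maps of the kind described for Figure 5A and 5B.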
Fibrotic gene expression showed more spatial variability across the nanotopographies but was also maximized on the NSQ nanotopography, and largely overlaps with the spatial pattern of osteogenic gene expression (Figure 5B). This is attributable to the synergistic interaction of osteoblasts and fibroblasts on osteogenic differentiation and mineralization51. The predicted effect of high osteogenic gene expression induced by NSQ was verified in the increased mineralization and bone tissue formation compared to FLAT at 28 days (Figure S15). Our results demonstrate the capability of the morphome-based platform to capture topography induced responses within complex microenvironments reminiscent of in vivo settings.
The morphome and its application in the linear regression model enable the simultaneous assessment of morphological and functional effects of nanotopography at the single cell level. For instance, nanotopographies show no discernible effects on nuclear perimeter or on the ratio of pFAK/FAK intensity (Figure 5C). Yet there was a clear gradient in predicted osteogenic gene expression: as the nucleus reduced in size and pFAK activation rose, osteogenic gene expression increased. This gradient was less clear with fibrotic gene expression. However, the nanotopographies have discernibly different effects on actin and pFAK radial distribution, with cells on SQ showing the lowest values (Figure 5D). These nanotopography-specific cell clusters also revealed low osteogenic gene expression yet high fibrotic gene expression. The morphome clearly enables new insight into single cell responses induced by nanotopography.
Discussion
In this study, we present a system that robustly encodes nanotopography parameters into functional, cell-specific output. The morphome, which is the collective morphological measurements of chromatin, actin and focal adhesions within single cells, was found to manifest nanotopography-induced changes in cell morphology and cell function. The information encoded in the morphome underpinned the performance of a Bayesian linear regression model for predicting gene expression.
Focal adhesion complexes are the primary mechanosensing machinery of the cell that respond to the biophysical microenvironment. Focal adhesion assembly or confinement induced by nanotopography allows spontaneous actin cytoskeleton assembly, nucleus deformation, and cell fate determination24,52,53. Cells also change and control their microenvironment, thereby dictating focal adhesion and cellular characteristics to varying degrees54, 55. Focal adhesion growth is particularly significant in sensing nanoscale changes in ligand arrangement, as this supports force redistribution throughout the cell56. This reciprocity between the nanotopographical microenvironment, focal adhesions, cytoskeleton and chromatin, and cell function underlies the platform presented in this study.
Our results demonstrate that quantitative measures of high-level cell response to nanotopography emerge from the morphome. The morphome-based platform reported here offers two unique advantages. First, the design parameters of the nanotopographies were explicitly excluded in predicting cell function. Using structural information from topography (e.g. depth, diameter, pitch, geometry, isotropy, roughness) would limit the dataset to only a handful of descriptors. As opposed to polymer-based biomaterials that contain easily interpretable physicochemical properties derived from chemical structure, the effects of topography parameters on hydrophobicity, serum adsorption and cell attachment are less well understood and quantified. Instead, the information contained in the morphome encompasses both topography and cell function. Furthermore, by creating a general model that is independent of topographical parameters, this approach can be easily applied to predict cell functions induced by new nanotopographies.
Second, in this approach, gene expression-based measures of cell phenotype determined the specialized function of cells. In contrast to our system, many quantitative structure and function relationship studies of polymeric biomaterials have focused only on unspecialized cellular behaviours, such as adhesion and metabolic activity13, 14. Since gene expression is highly scalable, the linear regression model can be adapted to predict other cell behaviours or phenotypes. This also ensures that the morphome supports a function-focused exploration of the topography space and the rational design of topographies, as opposed to the currently used trial-and-error screening approach.
Our data reveal that the morphome can also manifest information on the stage within the cellular lineage commitment timeline. The morphome easily captured the higher levels of mature gene expression in cells farther along lineage commitment compared to precursor cells. Additionally, the model robustly predicted the osteogenic properties induced by NSQ in our co-culture experiments. The elevated level of predicted osteogenic gene expression was supported by high mineralization on NSQ after 28 days of culture. Our results suggest that the morphome can manifest cellular changes induced by nanotopography, as well as changes driven by chemical or paracrine cues. This property of the morphome can be exploited to predict cell behavior in settings that recapitulate in vivo-like environments with numerous microenvironmental signals.
The co-culture experiment also demonstrated that the morphome dataset encompasses, at high resolution, structural, functional and spatial information. Structural information is crucial for determining how cells look, a highly valuable feature that can also be used to study drug- and disease-induced cell perturbation. The functional information – represented by changes in the expression of 14 different genes – can be observed at single-cell level, reflecting spatial resolution that is invaluable in high-throughput formats. When developed as an automated platform that executes topography production, cell seeding and staining, image acquisition, morphome extraction and gene expression prediction, this approach could increase screening efficiency and make it more cell function-oriented.
Clearly, morphome capture is crucial to the ability of the linear regression model to predict nanotopography-induced cell function. While population-level measures of gene expression strongly indicate cell function, they introduce a measure of uncertainty and biological variability into the linear regression model. Thus, it is essential to develop a one-to-one relationship between the morphome and cell function. Non-destructive microscopic and molecular tools57 that combine spatial and structural information from the morphome with single-cell functional assays are vitally important for establishing quantitative topography structure- and cell-function relationships using the morphome. However, the use of routine methods, such as high-content imaging and qPCR, permits any lab to measure the morphome and to model it against the cell function in question.
By generating a multivariate morphome dataset and combining it with machine learning, we have created a powerful platform for relating topography structure to cell function. The predictive power of the Bayesian linear regression model we have developed lends itself readily to sequential experimental design by exploiting the uncertainty and variability within the model1, 2. This platform can ultimately be used to guide the testing and exploration of new topographies.
Materials and methods
Polycarbonate surfaces with nanotopography
Surfaces patterned with nanopits 120 nm in diameter and 100 nm in depth were fabricated on polycarbonate using injection molding60. The following nanotopographies were used: surfaces without nanopits (FLAT); nanopits in a square array with 300 nm center-to-center spacing (SQ); nanopits in a square array with approximately 300 nm center-to-center spacing distorted by 50 nm in both x and y directions (NSQ); nanopits in a hexagonal array with 300 nm center-to-center spacing (HEX). Samples were cleaned in 70% ethanol and dried before treating with O2 plasma at 120 W for 1.5 min. Samples were sterilized using UV light in a biological safety cabinet for at least 20 min before cell seeding.
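To make the three pit arrangements concrete, the sketch below generates pit centre coordinates (in nm) for small SQ, NSQ and HEX patches. The 300 nm pitch and 50 nm offset come from the text. The assumption that NSQ offsets are drawn uniformly within ±50 nm, the fixed random seed and all function names are illustrative choices, not details reported for the fabricated surfaces.

```python
import math
import random

PITCH_NM = 300.0       # centre-to-centre spacing from the text
MAX_OFFSET_NM = 50.0   # NSQ displacement bound from the text


def square_array(n_rows, n_cols, pitch=PITCH_NM):
    """Pit centres on a regular square lattice (SQ)."""
    return [(col * pitch, row * pitch)
            for row in range(n_rows) for col in range(n_cols)]


def near_square_array(n_rows, n_cols, pitch=PITCH_NM,
                      max_offset=MAX_OFFSET_NM, seed=0):
    """Square lattice with each pit displaced in x and y (NSQ).

    Assumption: offsets are uniform in [-max_offset, +max_offset].
    """
    rng = random.Random(seed)
    return [(x + rng.uniform(-max_offset, max_offset),
             y + rng.uniform(-max_offset, max_offset))
            for x, y in square_array(n_rows, n_cols, pitch)]


def hexagonal_array(n_rows, n_cols, pitch=PITCH_NM):
    """Pit centres on a hexagonal lattice (HEX): odd rows shifted by pitch / 2."""
    row_height = pitch * math.sqrt(3) / 2
    return [(col * pitch + (pitch / 2 if row % 2 else 0.0), row * row_height)
            for row in range(n_rows) for col in range(n_cols)]


print(square_array(2, 2))
print(hexagonal_array(2, 2))
```

In the hexagonal patch every pit keeps a 300 nm distance to its nearest neighbours, whereas in the NSQ patch neighbour distances vary around 300 nm.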
Cell culture
Mouse fibroblast cell line NIH3T3 (ATCC) was cultured in Dulbecco’s modified Eagle’s medium (DMEM) with reduced sodium bicarbonate content (1.5 g/L), supplemented with L-glutamate (2 mM), 10% bovine calf serum, and 1% penicillin-streptomycin. Mouse C2C12 myoblasts (ATCC) were cultured in DMEM with 20% FBS and 1% penicillin-streptomycin, and differentiated into skeletal muscle using DMEM supplemented with 2% horse serum and 1% penicillin-streptomycin. Mouse primary chondrocytes were cultured in minimum essential medium alpha (MEMa) with nucleosides, ascorbic acid, glutamate and sodium pyruvate, supplemented with 10% FBS and 1% penicillin-streptomycin. Mouse MC3T3 cells (ATCC) were cultured in MEMa with nucleosides and L-glutamine without ascorbic acid and supplemented with 10% FBS and 1% penicillin-streptomycin. To differentiate MC3T3 into osteoblasts, MC3T3 media was supplemented with 10 nM dexamethasone, 50 μg/ml ascorbic acid and 10 mM beta-glycerophosphate61. Lineage-committed progenitor cells, referred to here as pre-osteoblasts and pre-myoblasts, were also included in the study to mimic the osteogenic and myogenic regeneration in the adult tissue30, 31.
Cell seeding
Cells were harvested from flasks using trypsin in versene buffer and spun down at 400 x g for 5 minutes. NIH3T3 and MC3T3 cells were resuspended in complete media and seeded at 4000 cells/cm2. Chondrocytes and C2C12 were seeded at 2500 cells/cm2. Cells were seeded at different densities to ensure single cells at approximately 30% confluency on each surface after 2 days culture. To ensure homogeneity of seeding, cells were seeded using a cell seeder that controlled fluid flow. For co-culture studies, MC3T3 and NIH3T3 cells were simultaneously seeded at 2000 cells/cm2 per cell type in MC3T3 growth media. All cells were grown for 2 days on nanotopographies before fixation for immunofluorescence staining.
Quantitative polymerase chain reaction (qPCR)
After 6 days, total RNA was obtained from lysed cells according to manufacturer’s instructions (Promega ReliaPrep Cell Miniprep kit). Gene expression was measured directly from 5 ng RNA using a one-step RT PCR kit with SYBR dye (PrimerDesign). qPCR was run on the BioRad CFX96 platform. Relative gene expression was normalized to the 18S ribosomal RNA reference gene. A list of the forward and reverse primers used to study different mouse genes is given in Table S5. Bar charts for qPCR data were obtained using GraphPad Prism (v7.0). One-way ANOVA with Tukey’s post hoc test for multiple comparisons was performed to determine the effect of nanotopography on gene expression compared with FLAT. Statistical significance was considered at p < 0.05.
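The text states that expression was normalized to 18S rRNA but does not name the quantification formula. As one common possibility, the sketch below uses the 2^-ΔΔCt calculation relative to a control condition such as FLAT. This choice, the assumption of 100% amplification efficiency, the function name and the Ct values are all illustrative assumptions rather than details confirmed by the source.

```python
def relative_expression(ct_target, ct_reference,
                        ct_target_control, ct_reference_control):
    """Fold change by the 2^-ddCt calculation (assumes 100% PCR efficiency).

    ct_target / ct_reference: Ct of the gene of interest and of the reference
    gene (e.g. 18S) in the sample; the *_control values are the same
    measurements in the control condition (e.g. FLAT).
    """
    delta_sample = ct_target - ct_reference
    delta_control = ct_target_control - ct_reference_control
    return 2.0 ** -(delta_sample - delta_control)


# Illustrative Ct values: the target crosses threshold two cycles earlier than on the control.
print(relative_expression(24.0, 10.0, 26.0, 10.0))
```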
Immunofluorescence staining
After 2 days, cells on surfaces were fixed with 4% paraformaldehyde solution in phosphate buffered saline (PBS) at 4°C for 15 minutes. Fixed cells were then permeabilized and blocked with 10% goat serum and 2% bovine serum albumin in PBS for 1 hour at room temperature. Surfaces were stained with the following primary antibodies overnight at 4°C: pFAK Y397 (Abcam 39967, 1:400) and FAK (ThermoScientific 396500, 1:400). Afterwards, Alexa Fluor-conjugated secondary antibodies (ThermoScientific, 1:500) against the host species of the primary antibody were used. Alexa Fluor 549 conjugated phalloidin (ThermoScientific, 1:200) was used to visualize the actin cytoskeleton. Cells were also stained with DAPI (ThermoScientific) to visualize nuclei. All surfaces were mounted on 0.17 μm thick glass coverslips with ProLong mounting medium (ThermoScientific) and dried overnight at 4°C before imaging.
Image acquisition and feature extraction
For single-cell population studies, monochrome images of each fluorophore were obtained at 40X magnification (numerical aperture 1.3) using the EVOS FL1 System (ThermoScientific). For co-culture studies, the entire field per nanotopography was imaged and stitched using an automated microscope (EVOS FL2 Auto) at 40X magnification. All images from one cell type were obtained using the same camera and light settings. Afterwards, images were analysed using CellProfiler (v2.4.0, The Broad Institute). Image processing, including illumination correction and channel alignment, was performed across each independent experiment and each cell type62. Nucleus and cell body were segmented from each cell in each image. Shape or morphometric measurements, total and local intensities and textures of chromatin, actin, pFAK and FAK were obtained using built-in object measurement modules in CellProfiler63.
Machine learning and multivariate analysis
The morphome initially consisted of a total of 1050 measurements obtained from single cells. Morphome measurements from single-cell populations were combined and were scaled by subtracting the mean and normalizing by the standard deviation of the entire data set. Morphome data from co-culture studies were similarly scaled and normalized using the mean and standard deviation from the initial dataset consisting of homogeneous cell populations. Prior to machine learning, features with zero variance within each batch (e.g. Zernike Phase measurements) were removed from the data set. A Pearson correlation method at significance level 90% was used to remove features with correlation higher than 0.9 without significantly reducing total data variance (Figure S16) using the KMDA package64. After preprocessing, 624 morphome features were used further in the study. To determine the features that varied significantly across either nanotopography or cell type, a one-way ANOVA was performed (KNIME v3.3.0). The intersection of nanotopography-specific and cell type-specific features was obtained. Hierarchical clustering was performed using Spearman correlation as a distance metric and average linkage using the gplots package65. Cluster membership and stability were obtained from silhouette analysis using the cluster package66. Dendrogram correlation was performed using the corrplot package67.
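The preprocessing steps above (standardisation, reuse of the reference mean and standard deviation for co-culture data, removal of zero-variance features, and a correlation filter at 0.9) can be sketched in plain Python. This is a simplified illustration and not the KMDA-based procedure used in the study: scaling is assumed to be per feature, variance is checked over the whole table rather than per batch, and the greedy filter keeps the first feature of each highly correlated group. All function names are hypothetical.

```python
import math
import statistics


def zscore_columns(rows, means=None, sds=None):
    """Standardise each feature column; pass means/sds to scale new data
    (e.g. co-culture cells) with the statistics of the reference dataset."""
    if means is None or sds is None:
        cols = list(zip(*rows))
        means = [statistics.fmean(c) for c in cols]
        sds = [statistics.pstdev(c) for c in cols]
    scaled = [[(v - m) / s if s > 0 else 0.0
               for v, m, s in zip(row, means, sds)] for row in rows]
    return scaled, means, sds


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def select_features(rows, names, max_corr=0.9):
    """Drop zero-variance columns, then drop any column whose |r| with an
    already-kept column exceeds max_corr (greedy, order-dependent)."""
    cols = list(zip(*rows))
    kept = []
    for j, col in enumerate(cols):
        if statistics.pvariance(col) == 0:
            continue
        if any(abs(pearson(col, cols[k])) > max_corr for k in kept):
            continue
        kept.append(j)
    return [names[j] for j in kept]


# Illustrative table: rows are cells, columns are features a, b, c, d.
table = [[1, 2, 5, 7], [2, 4, 5, 1], [3, 6, 5, 4]]
print(select_features(table, ["a", "b", "c", "d"]))
```

In this toy table, feature b duplicates a and feature c is constant, so only a and d survive the filter.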
Bayesian linear log-Normal regression
Only morphome features with an absolute Spearman correlation coefficient ≥ 0.7 against all examined expression markers were used in the linear regression model (Supplementary data). The linear regression model used 243 morphome features, containing 22 nuclear morphometry and chromatin, 71 actin, 75 FAK and 75 pFAK measurements, as predictors of the model. Each replicate of the qPCR data was propagated across each replicate of the single-cell morphome data. For each gene analysed, data were rescaled from 0 to 1 by normalizing to the maximum gene expression.
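As a rough illustration of the feature selection and rescaling described above, the sketch below keeps a feature only if its absolute Spearman correlation reaches 0.7 against every marker, and then divides each marker by its maximum. The feature names, marker names and values are hypothetical, and Spearman correlation is computed here as the Pearson correlation of average ranks.

```python
from statistics import mean

def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    ra, rb = ranks(a), ranks(b)
    ma, mb = mean(ra), mean(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra)
    vb = sum((y - mb) ** 2 for y in rb)
    return cov / (va * vb) ** 0.5

def select_predictors(features, markers, threshold=0.7):
    """Keep features with |rho| >= threshold against all markers."""
    return [name for name, values in features.items()
            if all(abs(spearman(values, expr)) >= threshold
                   for expr in markers.values())]

def rescale_to_max(expression):
    """Rescale expression values to (0, 1] by dividing by the maximum."""
    top = max(expression)
    return [v / top for v in expression]

# Hypothetical per-condition summaries (names and numbers are illustrative only).
features = {
    "nucleus_area":  [1.0, 2.0, 3.0, 4.0, 5.0],
    "actin_texture": [5.0, 3.0, 4.0, 1.0, 2.0],
    "noise_feature": [2.0, 5.0, 1.0, 4.0, 3.0],
}
markers = {
    "gene_A": [0.2, 0.4, 0.5, 0.9, 1.6],
    "gene_B": [3.0, 2.5, 2.6, 1.0, 0.5],
}

selected = select_predictors(features, markers)
gene_a_scaled = rescale_to_max(markers["gene_A"])
```

Because the threshold is applied to the absolute coefficient, features that decrease monotonically with expression are retained alongside those that increase.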
Linear regression was performed as a simple approximation of the relationship between the morphome and myogenic, osteogenic, chondrogenic and fibrotic gene expression. Established Bayesian inference methods were used to determine the probabilities of observing gene expression with a given morphome set. We considered a linear model in which expression of one gene (response y) was predicted through a linear combination of the morphome features (predictors X) transformed by the inverse identity link function. We assume that y follows a log-Normal distribution parametrized by the mean μ and variance σ²:
$$y \sim \mathrm{LogNormal}(\mu, \sigma^{2})$$
and that μ is a linear function of X parametrized by β:
$$\mu = X\beta$$
All model parameters β were assumed a priori to come from a normal distribution, parametrized by mean μ_β and standard deviation σ_β:
$$\beta \sim \mathrm{Normal}(\mu_{\beta}, \sigma_{\beta})$$
A model was trained independently for each gene, resulting in 14 different linear regression equations. A 60%-40% training and test split for Bayesian linear regression was performed randomly and with stratification using the caret package for R68. The Bayesian linear model was created using the brms package for R69, which uses a Hamiltonian Monte Carlo sampler to estimate the posterior distribution of β. Bayesian linear modelling was carried out with 1000 warm-up iterations and 1000 sampling iterations within each chain for 3 independent chains. All models were confirmed to converge to the equilibrium distribution by checking the potential scale reduction statistic (split-R̂), confirming that the effective sample size was smaller than the total sample size, and verifying low autocorrelation. Gene expression was predicted by using the test set or the morphome obtained from the co-culture study as input to the linear model. Predicted values were averaged across 50 draws from the posterior distribution.
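For readers unfamiliar with the potential scale reduction statistic, the sketch below computes the classic split-R̂ for a single parameter from several chains. It is an illustration only: the chains are simulated with a fixed seed rather than drawn from a fitted model, and the Stan backend used by brms may report a refined (for example rank-normalized) variant of the statistic.

```python
import random
from statistics import mean, variance

def split_rhat(chains):
    """Classic split-R-hat for one parameter.

    `chains` is a list of equal-length lists of post-warm-up draws.
    """
    half = len(chains[0]) // 2
    # Split every chain in two so that drift within a chain is also detected.
    halves = [c[:half] for c in chains] + [c[half:2 * half] for c in chains]
    n = half
    between = n * variance([mean(h) for h in halves])   # between-chain variance B
    within = mean(variance(h) for h in halves)          # within-chain variance W
    var_plus = (n - 1) / n * within + between / n       # pooled variance estimate
    return (var_plus / within) ** 0.5

# Simulated draws: 3 chains of 1000 sampling iterations each, as in the text.
rng = random.Random(0)
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(3)]
stuck = [[rng.gauss(shift, 1.0) for _ in range(1000)] for shift in (0.0, 3.0, 6.0)]

rhat_mixed = split_rhat(mixed)   # chains share one distribution, so the value is close to 1
rhat_stuck = split_rhat(stuck)   # chains disagree, so the value is well above 1
```

Values close to 1 indicate that the chains are sampling the same distribution; chains that explore different regions inflate the between-chain variance and push the statistic upwards.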
To determine the predictive power of the morphome, each specific combination of cell type, topography and replicate was iteratively omitted and the remaining dataset was used to refit new models. Thus, 576 additional models were created to test 24 different cell type and topography combinations across 12 genes and 2 replicates. The predictive quality of the models was assessed by predicting the expression of all 14 genes from the omitted cell type, topography and replicate dataset. We report the mean absolute error (MAE) of qPCR prediction for each cell type and topography combination, averaged across 2 replicates. MAE was calculated as the average of all absolute differences between predicted and actual gene expression.
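The hold-out procedure and the MAE calculation can be sketched as follows. This is a simplified illustration under stated assumptions: the records, cell type labels and topography labels are hypothetical, and the Bayesian refit is replaced by a placeholder that predicts the mean expression of the training data.

```python
from statistics import mean

def mae(predicted, actual):
    """Mean absolute error between predicted and actual expression."""
    return mean(abs(p - a) for p, a in zip(predicted, actual))

def leave_one_group_out(records, key):
    """Yield (held_out_group, train, test) for every distinct group."""
    for held_out in sorted({key(r) for r in records}):
        train = [r for r in records if key(r) != held_out]
        test = [r for r in records if key(r) == held_out]
        yield held_out, train, test

# Hypothetical toy data: one gene, 2 cell types x 2 topographies x 2 replicates.
records = [
    {"cell": c, "topo": t, "rep": r, "expr": e}
    for (c, t, r, e) in [
        ("A", "flat", 1, 0.20), ("A", "flat", 2, 0.30),
        ("A", "pits", 1, 0.80), ("A", "pits", 2, 0.70),
        ("B", "flat", 1, 0.40), ("B", "flat", 2, 0.50),
        ("B", "pits", 1, 0.90), ("B", "pits", 2, 1.00),
    ]
]

def group_key(record):
    return (record["cell"], record["topo"], record["rep"])

fold_errors = {}
for group, train, test in leave_one_group_out(records, group_key):
    # Placeholder for refitting the regression model on `train`.
    baseline = mean(r["expr"] for r in train)
    predictions = [baseline] * len(test)
    fold_errors[group] = mae(predictions, [r["expr"] for r in test])

# Average the error over replicates for each cell type and topography.
grouped = {}
for (cell, topo, _rep), err in fold_errors.items():
    grouped.setdefault((cell, topo), []).append(err)
mae_by_condition = {cond: mean(errs) for cond, errs in grouped.items()}
```

Each fold withholds every record of one cell type, topography and replicate combination, so the reported error reflects predictions for data the refitted model never saw.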
Statistics, visualization and software
Statistical analysis and machine learning were performed using the statistical software R (v3.4.3) and its graphical interface RStudio (v1.0)70. Scatterplots, boxplots and histograms were generated using ggplot2 in R71. Pearson correlation, Spearman rank correlation and Bayesian linear regression were performed in R (v3.4). Interpolation of x and y coordinates and the sum of predicted osteogenic or fibrotic gene expression was performed using bivariate interpolation of a regularly gridded dataset with the akima package72. Afterwards, the contour plots were created using the filled.contour function in base R, with the nuclear centroid position used as the spatial coordinates of the cell. Bar charts and one-way ANOVA analysis of qPCR values were obtained using GraphPad Prism (v7.0a).
Author contributions
MFAC, PMR and NG designed the biological experiments. MFAC and BSJ designed the machine learning analysis. MFAC carried out imaging, image analysis, qPCR and machine learning. PMR fabricated and characterized the nanotopographical surfaces. MFAC, BSJ, PMR and NG wrote the manuscript. All authors have read and approved the manuscript before submission.
Disclosure of competing financial interests
The authors have no competing financial interests to disclose.
Acknowledgements
We acknowledge ERC funding through FAKIR 648892 Consolidator Award. MFAC is financially supported by the University of Glasgow MG Dunlop Bequest, College of Science and Engineering Scholarship, and FAKIR consolidator award. We acknowledge the James Watt Nanofabrication Centre for fabrication work, and Steen Lillelund for initiating the machine learning work. We thank Julie Russell for her contribution to the qPCR, Carmen Huesa for providing the primary cartilage cells and Rachel Love for the injection moulding of nanopatterned polycarbonate surfaces.