Abstract
Introduction Schizophrenia (SCZ) is a complex, heterogeneous, psychotic disorder with variable phenotypic expressions, variable course patterns and a complex etiology. It affects individuals, families, and the society at large, with most of the affected population having severe symptomatic and functional outcome.
Aims The objectives of this systematic review were to summarize the number of clusters and trajectories of schizophrenia symptoms course patterns discovered until now, to identify predictors of clusters and trajectories, to highlight knowledge gaps and to point out a way forward to optimize future cluster- and trajectory-based studies.
Methods PsycINFO, PubMed, PsycTESTS, PsycARTICLES, SCOPUS, EMBASE, and Web of Science databases were searched using a comprehensive search strategy adapted to each database. Cross-sectional and longitudinal studies published from 2009 to 2019 that reported at least two clusters or trajectory groups of patients, siblings and controls using a statistical method across positive, negative, and cognitive impairment symptom dimensions, or a combination of symptom dimensions, were included. Two reviewers independently screened and extracted the data from included studies. A narrative synthesis was performed, and data were summarized using tables.
Results Of 2,282 studies identified, 47 fulfilled the inclusion criteria and were included in the qualitative synthesis. These studies were conducted globally in more than 17 countries (15 studies in the USA) and published from 2009 to 2019. Sixteen of the included studies had a longitudinal design, involving 11,475 patients with schizophrenia-spectrum disorders, 1,059 siblings and 653 controls, whereas 31 studies had a cross-sectional design involving 5,271 patients with schizophrenia-spectrum disorders, 7,423 siblings, and 2,346 controls. The longitudinal studies discovered two to five patient trajectory groups based on positive and negative symptoms, and four to five patient and sibling trajectory groups based on cognitive deficits. Regarding cross-sectional studies, three clusters of patients were found based on positive and negative symptoms while four clusters of siblings were identified based on positive and negative schizotypy. Regarding cognitive deficits, three to five clusters were reported in patients and their unaffected siblings. Age, gender, ethnicity, educational status, age of illness onset, diagnosis of schizophrenia, depressive symptoms, general psychopathology, severity of positive and negative symptoms, cognitive performance, premorbid functioning, quality of life and global functioning were important predictors among patients and their unaffected siblings.
Conclusions The evidence from cluster- and trajectory-based studies in the past decade showed that clinical symptoms of schizophrenia are clearly heterogeneous across patients, siblings and controls. However, the extent of this heterogeneity is yet to be investigated. To fully understand it, further research should prioritize longitudinal study designs, include unaffected siblings, and use genetic markers as predictors.
Introduction
Schizophrenia is a complex, heterogeneous psychotic disorder with variable phenotypic expression, variable course patterns and a complex etiology. It affects individuals, families and society at large, and most of the affected population has a severe course of symptomatic and functional outcome.1 The prevalence of schizophrenia is 4.6 per 1,000 individuals, with a lifetime morbidity risk of 0.7%.2 The incidence in men and women is 15 and 10 per 100,000 individuals, respectively. The first episode of schizophrenia usually occurs in late adolescence or early adulthood.2 Schizophrenia has been associated with various environmental and genetic factors.2 It is also known that siblings of patients with schizophrenia may develop schizophrenia over time due to shared genetic and environmental factors.3,4 The concordance rate of schizophrenia is 33% in monozygotic twins and 7% in dizygotic twins.4
Schizophrenia has three groups of clinical symptoms: positive symptoms, negative symptoms and cognitive deficits, which are important outcomes often used in psychiatric research. These symptoms are assessed by standard psychometric assessment tools, with or without validation in the local population. Positive symptoms include hallucinations, delusions, and disorganized thinking.5 Negative symptoms include emotional expressive deficit, social amotivation, social withdrawal and difficulty in experiencing pleasure; their prevalence is 50-90% in first-episode psychosis (FEP), and they persist in 20-40% of patients with SCZ.6–8 Negative and positive symptoms are assessed by the positive and negative syndrome scales.9–12 Cognitive deficits affect 75-80% of patients with schizophrenia.13 The most common deficits occur in executive function, processing speed, memory (e.g. episodic, verbal and working), attention, verbal fluency, problem-solving and social cognition.14–22 Cognitive dysfunction can be evaluated by various standard neuropsychological test batteries.23,24
From the conception of the term schizophrenia until now, various attempts have been made to identify sociodemographic, clinical, neurocognitive and other factors that can influence the heterogeneity of clinical and functional outcomes. Existing evidence shows that the course of schizophrenia follows four different trajectories over time: progressive deterioration, relapsing, progressive amelioration, and stability.25 These divergent views of the course of schizophrenia have recently been investigated by subtyping using imaging, biological and symptom data.26 Other methods of subtyping use statistical approaches, such as cluster analysis, latent class analysis, and growth mixture models, based on clinical symptom (sub)scale scores at baseline and subsequent assessments.25–27 These models identify groups of individuals who have a similar profile or course of symptoms over time and estimate the effects of predictors on trajectory shape and group membership.28 A trajectory or cluster is a group of people that has a homogeneous symptom profile within that group and a significantly dissimilar (i.e., heterogeneous) profile from other groups.29 In this review, we used ‘trajectory’ for groups identified by longitudinal studies and ‘cluster’ for groups identified by cross-sectional studies. Subgrouping approaches are useful for categorizing patients, understanding etiologies and pathophysiology, and predicting treatment response.27
Despite a century of efforts, attempts to understand the heterogeneity and course of schizophrenia have been unsuccessful due to the nature of its clinical symptoms, variation in response to treatment, and the lack of valid, stable, and meaningful subtyping methods.27,30,31 Heterogeneity in clinical outcomes can be manifested between groups or subjects, within subjects over time, and within and between diagnostic groupings, and can be caused by several intrinsic and extrinsic factors.30,32 The classification of clinical outcomes into dichotomous categories, such as recovered or not, and symptom remission or not, is a common practice within schizophrenia research. However, this can be problematic if dichotomization is undertaken without evidence about the distribution of the outcome in the population. Cut-off scores used for categorization are often arbitrary. All these pitfalls may lead to the loss of information, inefficient analysis of continuous data and difficulties in translating results into clinically meaningful information.29
Cluster- and trajectory-based studies of clinical symptoms of schizophrenia show inconsistent findings due to high symptomatic variability between patients and within patients over time, and they also have several limitations.6,15 Previous studies are often hampered by heterogeneity in the age, sex, and diagnosis of patients, severity of symptoms, use of various assessment tools, use of different clustering algorithms, use of different scoring and standardization techniques, small sample sizes and short duration of follow-up.33 Furthermore, not all prior studies included sibling and control participants.33 All these factors blur our understanding of the current state of the art regarding heterogeneity of schizophrenia symptoms. Therefore, there is a pressing need to synthesize the contemporary evidence, evaluate the extent and origin of heterogeneity, and develop a consensus outline for clinical practice.
To our knowledge, there is no comprehensive review of cluster- and trajectory-based studies of positive symptoms, negative symptoms and cognitive deficits in patients with schizophrenia-spectrum disorders, their unaffected siblings and healthy controls. To date, reviews have been conducted on various aspects of cognitive dysfunction13,34–43, negative symptoms8,44,45, and positive symptoms.46 These past reviews have largely followed the traditional approach of determining the average change in the course of symptoms over time, and variation between subjects (patients vs relatives, relatives vs controls, patients vs controls) and diagnoses. They are also based on correlation analysis, which is not considered a strong measure of association. In addition, none of these reviews fully addressed symptomatic clusters or trajectories in patients with SCZ, their unaffected siblings and healthy controls. In this review, we summarized the number of clusters and trajectories in patients with schizophrenia-spectrum disorders, their unaffected siblings and healthy controls that have been discovered by longitudinal and cross-sectional studies until now. The predictors of clusters or trajectories were also identified and discussed. We further highlight gaps in current knowledge and point out a way forward to optimize evidence from future cluster- and trajectory-based studies.
Methods
Registration and reporting
This systematic review was conducted according to a registered protocol (http://www.crd.york.ac.uk/PROSPERO/display_record.php?ID=CRD42018093566) and reported according to the Preferred Reporting Items for Systematic Review and Meta-Analysis (PRISMA) statement guideline.47,48 The screening and selection process of the reviewed articles is further illustrated using a PRISMA flow diagram.
Databases and search terms
For all available publications, a systematic search of PubMed, PsycINFO, PsycTESTS, PsycARTICLES, SCOPUS, EMBASE and Web of Science electronic databases was performed. A comprehensive search strategy was developed for PubMed and adapted to each database in consultation with a medical information specialist (Supplementary file 1). The following search terms were used in their singular or plural form in their title, abstract, keywords and text: “schizophrenia”, “psychosis”, “non-affective psychosis”, “cognitive deficit”, “cognitive dysfunction”, “cognitive alteration”, “negative symptoms”, “deficit syndrome”, “positive symptoms”, “psychopathology”, “cognit*”, “neuropsycholog*”, “neurocognition”, “longitudinal”, “follow-up”, “course”, “heterogeneity”, “endophenotype”, “profile”, “cluster analysis”, “siblings”, “healthy controls”, “latent class analyses”, “Symptom trajectories”, “traject*”, “group modeling” and “trajectory”. Cross-references of included articles and grey literature were also hand-searched. Furthermore, we searched the table of contents of the journals of Schizophrenia Research, Schizophrenia Bulletin, Acta Psychiatrica Scandinavica and British Journal of Psychiatry to explore relevant studies. The final search was conducted in March 2019.
Inclusion and exclusion criteria
Studies meeting the following criteria were included: (1) cross-sectional and longitudinal studies; (2) studies that reported at least two clusters or trajectory groups of individuals using a statistical method based on distinct positive symptom, negative symptom, and cognitive (neurocognitive or social cognitive) impairment dimensions or a combination of these symptom dimensions; (3) studies conducted in patients with schizophrenia-spectrum disorders, their unaffected siblings, and healthy controls irrespective of any clinical (e.g. medication status, severity of illness) and sociodemographic characteristics; and (4) studies published in English from 2009 to 2019. The publication year was limited to the last decade to capture the latest available evidence. In addition, the number of large-sample cohorts has been increasing in the last decade, and we believe such studies can provide statistically powerful, precise estimates and successful subtyping of schizophrenia symptoms. To maximize the number of retrieved articles, the follow-up period in longitudinal studies was not restricted. Trajectory-based studies based on mean score change over time were excluded because they did not cross-sectionally or longitudinally reveal the actual heterogeneity of groups.13,34–43 In addition, studies based on non-statistical methods of clustering (e.g. family-based clustering) were excluded. Review papers, commentaries, duplicate studies, editorials, and qualitative studies were excluded as well. Furthermore, we excluded studies where the trajectory groups or clusters were generated from scores constructed using a combination of schizophrenia symptoms and other unspecified psychotic symptoms.
Data retrieval and synthesis
Studies retrieved from all databases were exported to RefWorks version 2.0 for Windows web-based citation manager. Close and exact duplicates were deleted. All independent studies were exported to a Microsoft Excel spreadsheet to screen against the remaining inclusion criteria. TD and LR independently screened the titles and abstracts, followed by a test of agreement using Cohen’s Kappa coefficient. The two reviewers had substantial agreement, with a Kappa coefficient of 0.62. Inconsistent decisions on title and abstract inclusion were resolved by discussion. Finally, full-text review was performed and the following data were independently extracted by TD and LR: first author name, publication year, country, cohort/research center, study population, sample size, symptom dimension(s), assessment tool, study design, duration of follow-up (only for longitudinal studies), frequency of assessment, method of calculating the composite test score, method of clustering/trajectory analysis, number of identified clusters/trajectory groups and significant predictors of clusters or trajectories.49 The corresponding author(s) were contacted by email when the full text of an included article was not accessible. Whenever the cohort or research center was not clearly reported, we extracted the institutional affiliation of the first/corresponding author. Disagreements were resolved by consensus and in consultation with BZ and RB. Due to substantial heterogeneity of studies, a narrative synthesis was done and summarized using tables.
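As an illustration of the agreement statistic used at the screening stage, the following minimal Python sketch computes Cohen's Kappa from a two-by-two table of include/exclude decisions. The function name and the counts are hypothetical and chosen only for illustration; they are not the screening data of this review.

```python
def cohens_kappa(both_include, only_a, only_b, both_exclude):
    """Cohen's kappa for two reviewers making include/exclude decisions.

    both_include / both_exclude: records on which the reviewers agreed.
    only_a / only_b: records included by reviewer A only or reviewer B only.
    """
    n = both_include + only_a + only_b + both_exclude
    if n == 0:
        raise ValueError("no screening decisions supplied")
    # Observed proportion of agreement
    observed = (both_include + both_exclude) / n
    # Agreement expected by chance, from each reviewer's marginal include rate
    a_include = (both_include + only_a) / n
    b_include = (both_include + only_b) / n
    expected = a_include * b_include + (1 - a_include) * (1 - b_include)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical counts (NOT the data of this review)
kappa = cohens_kappa(both_include=40, only_a=10, only_b=10, both_exclude=140)
print(round(kappa, 2))
```

Under the commonly used Landis and Koch benchmarks, values between 0.61 and 0.80 are labelled substantial agreement, which is the band the coefficient reported above falls into.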
Results
Search results
In total, 2,262 studies were identified through database searching and an additional 20 studies through manual searching of cross-references and tables of contents of relevant journals. After removing duplicate articles and applying the inclusion and exclusion criteria, the titles and abstracts of 1,291 articles were screened, which resulted in the exclusion of 1,236 articles. As a result, 55 articles were selected for full-text review, and eight articles50–57 were excluded due to unclear outcome, mixed diagnosis of the study population, use of a non-statistical method of clustering and clustering based on different phenotypes of schizophrenia. Finally, data were extracted from 47 cluster- and trajectory-based studies. The PRISMA flow diagram of the screening and selection process is shown in Figure 1.
Overview of included studies
The 47 included studies were conducted globally in more than 17 countries (15 studies in the USA) and published from 2009 to 2019. Of these, 16 studies were longitudinal studies involving 11,475 patients, 1,059 siblings and 653 controls, whereas 31 studies were cross-sectional studies involving 5,271 patients, 7,423 siblings, and 2,346 controls. Only one longitudinal study58 and six cross-sectional studies59–63 included siblings. Almost all longitudinal studies reported trajectories of positive and negative symptoms whereas most cross-sectional studies reported clusters based on cognitive function.
Evidence from longitudinal studies
Of the 16 longitudinal studies, conducted in more than eight countries, 14 studies25,29,31,64–74 investigated the trajectory of positive and negative symptoms in patients, and only two studies30,58 examined the trajectory of cognitive impairment in patients and siblings. The duration of follow-up ranged from six weeks to 10 years, and the studies included all population age groups. The total sample size in each study ranged from 138 to 1,990 subjects, though it varied across symptom dimensions. One study58 investigated the association between patients’ and siblings’ cognitive trajectories whereas another study73 examined the association between positive and negative symptom trajectories. Moreover, five studies reported the influence of trajectories on long-term social, occupational and global functioning, and health-related or general quality of life.68,69,71,72,74 Even though all studies had similar aims, they used different names for the trajectory analysis methods, such as growth mixture modelling (GMM)31,65,73, latent class growth analysis (LCGA)29,30,66,69,71,72,74, mixed mode latent class regression modelling25,64,70 and group-based trajectory modelling (GBTM).58,67,68 The reported model selection indices were Akaike’s Information Criterion (AIC), Bayesian information criterion (BIC), logged Bayes factor, sample-size-adjusted BIC (aBIC), bootstrap likelihood ratio test (BLRT), Lo–Mendell–Rubin Likelihood Ratio Test (LMR-LRT) and entropy. Of these indices, BIC was reported by all studies except one30, which reported the deviance information criterion (DIC).
Among studies with less than two years of follow-up (Table 1), three studies25,65,67 discovered five trajectories, and the other three studies31,64,73 identified three trajectories of positive symptoms. These trajectories were predicted by age, gender, ethnicity, cannabis use, age of illness onset, diagnosis, duration of untreated psychosis, extrapyramidal symptoms, depressive symptoms, quality of life, general psychopathology, types of antipsychotic medication, cognitive performance, premorbid functioning, and severity of positive and negative symptoms.25,31,64,65,67,73 Similarly, for the negative symptom dimension, three studies25,65,67 discovered five trajectories and the other three studies31,72,73 reported four trajectories. The identified predictors were age, gender, ethnicity, family history of non-affective psychosis, age of onset of illness, extrapyramidal symptoms, quality of life, general psychopathology, diagnosis of schizophrenia, cognitive performance, premorbid functioning, premorbid adjustment, depressive symptoms, types of antipsychotic medication, and severity of positive and negative symptoms.25,31,64,65,67,72,73
Combining both positive and negative symptom dimensions, two studies25,66 discovered five trajectories, one study31 found four trajectories and one study73 identified three trajectories. The predictors of these trajectories were ethnicity, age of illness onset, duration of illness, previous hospitalizations, extrapyramidal and depressive symptoms, quality of life, severity of positive and negative symptoms, general psychopathology, diagnosis of schizophrenia, cognitive performance and premorbid functioning.25,31,66,73
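For readers less familiar with the likelihood-based indices listed above, the sketch below shows how AIC, BIC, sample-size-adjusted BIC and relative entropy are commonly computed from a fitted model's log-likelihood and posterior class probabilities. It is a minimal illustration using textbook definitions; the function names, log-likelihoods, parameter counts and sample size are hypothetical and do not come from any reviewed study.

```python
import math


def information_criteria(log_likelihood, n_params, n_obs):
    """Return AIC, BIC and sample-size-adjusted BIC (lower is better)."""
    deviance = -2.0 * log_likelihood
    return {
        "AIC": deviance + 2.0 * n_params,
        "BIC": deviance + n_params * math.log(n_obs),
        # Sample-size-adjusted BIC replaces n with (n + 2) / 24
        "aBIC": deviance + n_params * math.log((n_obs + 2) / 24.0),
    }


def relative_entropy(posteriors):
    """Classification quality in [0, 1]; 1 means perfectly separated classes.

    posteriors: one row per subject holding class-membership probabilities.
    """
    n_subjects = len(posteriors)
    n_classes = len(posteriors[0])
    if n_classes < 2:
        raise ValueError("relative entropy needs at least two classes")
    uncertainty = sum(-p * math.log(p) for row in posteriors for p in row if p > 0)
    return 1.0 - uncertainty / (n_subjects * math.log(n_classes))


# Hypothetical fits: number of classes -> (log-likelihood, free parameters)
fits = {2: (-1250.0, 7), 3: (-1210.0, 10), 4: (-1205.0, 13)}
n_subjects = 300  # hypothetical sample size
bic_by_k = {k: information_criteria(ll, p, n_subjects)["BIC"]
            for k, (ll, p) in fits.items()}
best_k = min(bic_by_k, key=bic_by_k.get)
print(best_k)
```

Choosing the class count with the lowest BIC is one common convention; in practice it is usually weighed against entropy, class sizes and clinical interpretability.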
In studies with two years or longer duration of follow-up (Table 2), two studies29,70 discovered five trajectories and two studies71,74 found two trajectories of positive symptoms. The predictors were gender, educational status, duration of untreated psychosis, global functioning, living situation, involuntary admission, premorbid functioning, cognitive performance, severity of positive and negative symptoms, substance abuse and diagnosis.29,70,71,74 Regarding the negative symptom dimension, one study70 identified five trajectories, two studies29,74 discovered four trajectories, one study69 depicted three trajectories and one study71 found two trajectories. In addition, a study68 from our research group identified four trajectories of negative symptom subdomains of social amotivation and expressive deficits. The predictors were age, gender, employment status, ethnicity, marital status, educational status, duration of untreated psychosis, diagnosis, global functioning, living situation, involuntary admission, quality of life, premorbid functioning, cognitive performance, severity of positive and negative symptoms, premorbid adjustment, social functioning, and disorganized and depressive symptoms.29,68–71,74 Combining both positive and negative symptom dimensions, one study70 identified five trajectories. The predictors were diagnosis, premorbid functioning, cognitive performance, severity of positive and negative symptoms.70
Furthermore, a six-year longitudinal study58 from our research group discovered five trajectories of cognitive impairment in patients and four trajectories in healthy siblings. Another study30 reported three trajectories of global cognitive function in a combined sample of patients and controls. The predictors identified by both studies were educational status, IQ, premorbid functioning, age, gender, ethnicity, severity of positive and negative symptoms, frequency of psychotic experiences, cognitive performance, and living situation.30,58
Evidence from cross-sectional studies
Of the 47 included studies, 31 were cross-sectional studies. The total sample size per study ranged from 62 to 6,137 individuals irrespective of participants’ diagnostic status. The reported clustering methods were K-means clustering or non-hierarchical analysis26,59,61–63,75–80, Ward’s method or hierarchical analysis81–86, K-means clustering and Ward’s method32,33,60,87–92, latent class analysis27,93 and two-step cluster analysis.94,95 One study96 identified clusters using a combination of a clinical/empirical method, K-means clustering and Ward’s method. The model selection criteria or similarity metrics were visual inspection of the dendrogram, Pearson correlation, squared Euclidean distance, agglomeration coefficients, Dunn index, Silhouette width, Duda and Hart index, elbow test, variance explained, inverse scree plot, average proportion of non-overlap, Akaike information criterion (AIC), Bayesian information criterion (BIC), sample size adjusted Bayesian (ABIC), Schwarz’s Bayesian information criterion (BIC), Lo–Mendell–Rubin (LMR) test, adjusted LMR and the bootstrap likelihood ratio test (BLRT). Squared Euclidean distance was the most common index used to determine the number of clusters.
Twenty-one studies32,33,59,60,76,77,79–85,87,89–91,93–96 reported clusters in patients and their siblings based on neurocognitive and/or social cognitive function. Of these 21 studies, 15 studies33,59,60,76,77,80–85,87,94–96 found three clusters, five studies32,79,89,90,93 reported four clusters and one study91 discovered five clusters of patients and siblings. Clusters were predicted by multidimensional factors including age, gender, socioeconomic status, educational status or years of education, employment status, ethnicity, risky drinking, obstetric complications, family history of mental disorders, premorbid adjustment, premorbid and current IQ, age of illness onset, duration of illness, clinical diagnosis, cortical thickness, neural activity, general psychopathology, severity of positive schizotypy, severity of negative and positive symptoms, neurocognitive and social cognitive performance, anxiety, disorganization, depression, stigma, state mania, neurological soft signs, antipsychotics dosage, adherence to treatment, global functioning, community functioning, socio-occupational functioning and quality of life.32,33,59,60,76,77,79–85,87,89–91,93–96 (Table 3)
Likewise, two studies27,88 reported three clusters of patients based on the negative symptom dimension. The predictors were gender, season of birth, ethnicity, years of education, age of illness onset, hospitalization, treatment history, social anhedonia, severity of positive and negative symptoms, general psychopathology, attitude, neurocognitive and social cognitive performance, global functioning, premorbid adjustment, and psychosocial functioning.27,88 Regarding positive symptoms, only one study86 identified three clusters of patients and two clusters in the general population based on hallucination symptoms. (Table 3)
One study92 found three clusters of patients based on social cognition and negative symptoms that were predicted by marital status, hospitalization, quality of life, severity of negative symptoms and social cognitive performance, whereas another study78 found four clusters of patients based on neurocognition (attention domain) and negative symptoms, which were predicted by self-esteem, attention performance, acceptance of stigma, severity of positive and negative symptoms and social functioning. In addition, one study26 reported three clusters while another study75 found four clusters based on positive and negative symptoms that were predicted by IQ, meta-cognition, age of illness onset, global functioning, comorbid diseases, and severity of positive and negative symptoms. (Table 3)
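To make one of the listed cluster-validity indices concrete, the sketch below computes the average silhouette width for a given cluster assignment. It is a minimal illustration that assumes Euclidean distance and uses hypothetical two-dimensional scores; the function name and data are not taken from the reviewed studies.

```python
import math


def mean_silhouette(points, labels):
    """Average silhouette width of a clustering, using Euclidean distance."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette width needs at least two clusters")
    widths = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            widths.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(math.dist(point, q) for q in other) / len(other)
                for other_label, other in clusters.items()
                if other_label != label)
        denom = max(a, b)
        widths.append(0.0 if denom == 0 else (b - a) / denom)
    return sum(widths) / len(widths)


# Hypothetical standardized scores for four participants
scores = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = mean_silhouette(scores, [0, 0, 1, 1])  # compact, well-separated groups
poor = mean_silhouette(scores, [0, 1, 0, 1])  # groups cut across the gap
print(good > poor)
```

Values close to 1 indicate compact, well-separated clusters, whereas values near or below 0 suggest that participants sit closer to another cluster than to their own.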
Moreover, three studies61–63 consistently reported four clusters of unaffected siblings or general population based on positive and negative schizotypy dimensions. The predictors of cluster membership were gender, severity of positive and negative schizotypy, pleasure experiences, somatic symptoms, substance use and abuse, neurocognitive functioning, social functioning, psychotic-like experiences, depression, schizoid and negative symptoms, personality, proneness to positive and negative symptoms, social adjustment and emotional expressivity.61–63 (Table 3)
Summary of clusters/trajectories and predictors
As illustrated in Table 4, two to five clusters/trajectories and 57 predictors were identified by longitudinal and cross-sectional studies in all groups of participants across the three symptom dimensions. In patients, siblings and healthy subjects, three to four subgroups based on the three symptom dimensions were identified irrespective of study design and duration of follow-up. In addition, both longitudinal and cross-sectional studies identified five subgroups only in patients based on all symptom dimensions. In patient clusters or trajectories based on the three symptom dimensions, age, gender, ethnicity, educational status, age of illness onset, diagnosis of schizophrenia, depressive symptoms, general psychopathology, severity of positive and negative symptoms, cognitive performance, premorbid functioning, quality of life and global functioning were important predictors identified by both longitudinal and cross-sectional studies. Likewise, in sibling clusters or trajectories, age, gender, ethnicity, educational status, severity of positive and negative schizotypy, cognitive performance and premorbid functioning were relevant predictors.
Discussion
To our knowledge, this is the first comprehensive systematic review based on recent cluster- and trajectory-based studies of positive symptoms, negative symptoms and cognitive deficits. The reviewed studies involved various groups of study population including patients with first-episode psychosis or chronic schizophrenia, antipsychotics naïve patients or patients who were on antipsychotic treatment for a month or longer, patients from different age groups and ethnicities, and healthy siblings and controls. In this review, we summarized the number of clusters or trajectories, predictors clusters or trajectories and statistical methods based on the evidence in the last decade.
Longitudinal trajectory-based studies discovered two to five patient trajectory groups based on positive and negative symptoms, and four to five patient and sibling trajectory groups identified based on cognitive deficit. Based on cross-sectional cluster-based studies, three clusters of patients were identified for positive and negative symptoms while four clusters of siblings were identified based on positive and negative schizotypy. Regarding cognitive deficits, three to five clusters were reported in patients and their unaffected siblings. Overall, three to four subgroups were discovered across all the three symptom dimensions and study population irrespective of study design and duration of follow-up. This implicates schizophrenia symptoms are inherently heterogeneous and clinicians should treat clients based on group-level or severity of illness instead of using the broad hallmark symptoms. Age, gender, ethnicity, educational status, age of illness onset, diagnosis of schizophrenia, depressive symptoms, general psychopathology, severity of positive and negative symptoms, cognitive performance, premorbid functioning, quality of life and global functioning were important predictors among patients and their unaffected siblings. These factors could be used for developing clinical risk prediction model as well as machine learning.
In this review, we showed that longitudinal studies have used slightly different latent growth modelling methods, such as growth mixture modelling (GMM)31,65,73, latent class growth analysis (LCGA)29,30,66,69,71,72,74, mixed mode latent class regression modelling25,64,70 and group-based trajectory modelling (GBTM).58,67,68 Latent growth mixture models (LGMMs) are a generalization of linear mixed-effects models (LMEs) to identify categories based on temporal patterns of change by assuming the existence of latent classes or subgroups of subjects exhibiting similarity with regard to unobserved (latent) variables.29,97 LGMMs often providing more realistic estimates of heterogeneity in longitudinal trajectories. With LGMMs, latent classes are defined as unobserved groups within which the random effects and error terms are normally distributed with constant mean and variance. GMMs have four potential advantages for modelling longitudinal data compared to other methods, such as LMEs. First, it enables flexible, data-driven estimates of the random effect and error distributions separately that can more accurately reflect observed heterogeneity. Second, it allows for classification of individual subjects into latent classes based on the largest probability of class membership. Third, it is sensitive to the pattern of change over time and robust in the presence of missing data. Fourth, subject-level factors can be directly assessed for association with class membership and hence with different trajectory subtypes.29,30,97
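The class-specific growth model described in the preceding paragraph can be written out as follows. This is a sketch of one common linear formulation; the symbols are notation assumed here for illustration and are not taken from the reviewed studies, which may have used other time functions or covariance structures.

```latex
% Assumed notation: y_{ij} is the symptom score of subject i at time t_{ij},
% c_i \in \{1,\dots,K\} is the latent class, x_i are subject-level predictors.
\begin{align}
y_{ij} \mid (c_i = k) &= (\beta_{0k} + b_{0i}) + (\beta_{1k} + b_{1i})\, t_{ij} + \varepsilon_{ij}, \\
(b_{0i}, b_{1i})^{\top} \mid (c_i = k) &\sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}_k), \qquad
\varepsilon_{ij} \mid (c_i = k) \sim \mathcal{N}(0, \sigma_k^{2}), \\
\Pr(c_i = k \mid x_i) &= \frac{\exp(\gamma_{0k} + \gamma_k^{\top} x_i)}
                              {\sum_{l=1}^{K} \exp(\gamma_{0l} + \gamma_l^{\top} x_i)} .
\end{align}
```

Each subject is then assigned to the class with the largest posterior membership probability. Under this notation, fixing $\boldsymbol{\Sigma}_k = \mathbf{0}$ for every class removes within-class variation in trajectories, which is how LCGA- and GBTM-type models are usually distinguished from the more general GMM.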
Among the reviewed 31 cross-sectional studies, we observed that 26 studies identified meaningful clusters using either K-means clustering analysis26,59,61–63,75–80 or Ward’s method clustering analysis81–86 or both K-means and Ward’s method32,33,60,87–92 clustering analysis. Cluster analysis, applied to investigate symptoms in schizophrenia for several decades, is an atheoretical, data-driven approach to classify people into homogeneous groups by determining clusters of participants that displays less within-cluster variation relative to the between-cluster variation.84 K-mean cluster analysis is a non-hierarchical form of cluster analysis appropriate when previous evidence or hypotheses exist regarding the number of clusters contained in a sample. It produces the number of clusters initially called for by minimizing variability within clusters and maximizing variability between clusters.78 In K-means clustering analysis, participants pass through several iterative processes, assigned to a cluster and moved from one cluster to another until terminating conditions are met. Ward’s method is a hierarchical cluster analysis aiming to determine group assignment without knowing ahead of time or prior hypothesis.78 K-means iterative cluster analyses handle larger data sets better than hierarchical agglomerative methods.62 Longitudinal studies are scarce as shown by our review with only one-third of the included studies being longitudinal. In this circumstance, researchers can conduct cross-sectional studies can be an option and homogenous subgroups can be identified using either K-means or Ward’s method clustering analysis methods or both.
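The two clustering approaches discussed above can be sketched in a few lines of plain Python. The example below is a minimal illustration, not the procedure of any particular reviewed study: it runs a naive Ward agglomeration to obtain a starting partition and then refines it with K-means, which is one plausible way of combining the two methods. All function names and the standardized scores are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two score vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of score vectors."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def ward(points, n_clusters):
    """Naive agglomerative clustering with Ward's minimum-variance criterion."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ca = centroid([points[i] for i in clusters[a]])
                cb = centroid([points[i] for i in clusters[b]])
                na, nb = len(clusters[a]), len(clusters[b])
                # Increase in within-cluster sum of squares if a and b merge
                cost = na * nb / (na + nb) * sq_dist(ca, cb)
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * len(points)
    for k, members in enumerate(clusters):
        for i in members:
            labels[i] = k
    return labels


def kmeans(points, init_centroids, max_iter=100):
    """Lloyd's algorithm: reassign and update until assignments stop changing."""
    centroids = [tuple(c) for c in init_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centroids)),
                          key=lambda k: sq_dist(p, centroids[k])) for p in points]
        if new_labels == labels:
            break  # terminating condition: no participant changed cluster
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = centroid(members)
    return labels, centroids


# Hypothetical z-scores on two measures for nine participants
scores = [(-2.0, -1.8), (-1.9, -2.1), (-2.2, -2.0),
          (0.0, 0.1), (0.1, -0.1), (-0.1, 0.0),
          (1.9, 2.0), (2.1, 2.2), (2.0, 1.8)]
ward_labels = ward(scores, 3)
start = [centroid([p for p, lab in zip(scores, ward_labels) if lab == k])
         for k in range(3)]
kmeans_labels, _ = kmeans(scores, start)
print(ward_labels, kmeans_labels)
```

Seeding K-means with the Ward centroids avoids the sensitivity of K-means to random starting points; the pairwise search in this `ward` function is cubic in the number of participants, which is in line with the remark above that K-means scales better to large data sets.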
Following this rigorous review, we identified several gaps that future neuroscience and behavioral science researchers could address. First, only two longitudinal studies30,58 investigated the trajectory of cognitive impairment in patients and siblings, which limits our knowledge of how cognitive function changes over time. This may be because neuropsychological assessment is resource intensive and time-consuming, and requires specialized training for data collection as well as commitment from study participants. Therefore, additional longitudinal studies are warranted to examine the long-term trajectories of cognitive deficits in schizophrenia. Second, we observed limited use of data from siblings and healthy controls to unravel clusters or trajectories, which would be relevant to validate the heterogeneity of clinical symptoms detected in patients. For example, fewer than ten studies (mostly cross-sectional) examined clusters in siblings. In addition, most studies used healthy controls to standardize patients’ test scores, and a few other studies used them to compare the distribution of controls across patient cluster or trajectory groups. Comparing patient clusters or trajectories with healthy siblings and controls could provide an accurate means of disentangling the causes of heterogeneity of schizophrenia symptoms. Third, even though all reviewed studies identified cluster or trajectory groups based on their own models, subtle differences between studies were noted in the construction of composite scores, the use of model selection criteria and the method of parameter estimation. Fourth, only five longitudinal studies evaluated the long-term effect of trajectories on functioning and quality of life. Given the relevance of functioning and quality of life as outcome measures of treatment effect and prognosis, it is worthwhile to explore their heterogeneity and trajectories.
Fifth, we noted several approaches to subtyping and several nomenclatures for subgroups, which may make it difficult for clinicians to use the evidence. Therefore, even though statistical subtyping of SCZ based on its symptoms is important, the output must be translatable to clinical practice. Statisticians must also work together with clinicians to create a common understanding. Finally, none of the reviewed studies used a single or an aggregated measure of genetic susceptibility, which could help to predict clusters or trajectories accurately. Genetic markers are believed to be specific and sensitive biomarkers that show the inherent heterogeneity in the course of illness. Genetic and epigenetic susceptibility may also influence differences between groups, between subjects, or within subjects over time in patients, their siblings and healthy controls. Once subgroups are identified, gene fine-mapping and enrichment analysis can be performed for the groups with severe impairment.55
Even though the interpretability and validity of findings are debatable, our comprehensive review synthesized the up-to-date evidence from cross-sectional and longitudinal studies that identified data-driven clusters or trajectories. The results of statistical subtyping approaches, such as cluster or trajectory analysis, depend on mathematical assumptions that do not have a direct relationship to clinical reality, as well as on the type of data, the number of variables or tests, the sample size and the sampling characteristics. Therefore, the results can be unstable and clinical symptoms may not converge on a consistent set of subgroups.71,90,98 For example, intermediate clusters or trajectories vary substantially from study to study.90 To ensure comparability and interpretability of identified clusters or trajectories, we therefore recommend that future psychiatric researchers validate their model using an additional, comparable statistical method. In our review, nine cross-sectional studies32,33,60,87–92 cross-validated their model using both K-means and Ward’s clustering methods, whereas none of the reviewed longitudinal studies compared their model against an alternative method. It is also relevant to combine statistical methods of subtyping with empirical methods. To this end, only one study96 used a combination of clustering and clinical experience to identify homogeneous subgroups. Additionally, replication of cluster or trajectory groups using separate samples, different assessment tools that measure the same construct and different linkage methods (e.g. cluster analysis) is highly relevant for establishing the validity and generalizability of identified subgroups.33,99
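One way to quantify agreement between two clustering solutions for the same participants, for instance one from K-means and one from Ward’s method, is the adjusted Rand index (ARI). The reviewed studies do not necessarily use this statistic. It is swapped in here purely as an illustration, and the label vectors below are made up. An ARI of 1 indicates identical partitions, regardless of how the clusters are numbered, and values near 0 indicate chance-level agreement.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same subjects."""
    n = len(labels_a)
    if n != len(labels_b) or n < 2:
        raise ValueError("need two label lists of equal length >= 2")
    # Pairs of subjects placed together in both partitions.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together within each partition separately.
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / comb(n, 2)  # agreement expected by chance
    maximum = (together_a + together_b) / 2
    if maximum == expected:  # degenerate case, e.g. both put everyone in one cluster
        return 1.0
    return (together_both - expected) / (maximum - expected)

# Hypothetical cluster labels for eight participants from two methods.
kmeans_labels = [0, 0, 0, 1, 1, 1, 2, 2]
ward_labels = [1, 1, 1, 0, 0, 2, 2, 2]  # numbering differs; one participant moves

ari = adjusted_rand_index(kmeans_labels, ward_labels)
print(round(ari, 3))
```

Because the index ignores label numbering, it can be applied directly to the raw output of two different programs without first matching cluster names.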
Identification of meaningful clusters or trajectories with greater homogeneity based on clinical features or endophenotypes, such as neuropsychological markers, neural substrates, and other neurological soft signs among patients, siblings, and healthy subjects, requires the use of advanced statistical modeling techniques, including machine learning approaches. This could facilitate efforts to identify common etiology, examine the patterns of clinical symptoms, understand the inherent course of the disease and develop new treatment strategies specific to each subgroup to improve recovery and functional outcomes.29,30,74 In addition, statistical methods that accurately identify clusters or trajectories and describe within- and between-group variation can help clinicians and statisticians to characterize the relationship of schizophrenia with various clinical and functional outcomes, treatment history, treatment response, and imaging patterns that inform neuropathology. To accomplish these goals, numerous efforts have been undertaken to carefully design cross-sectional and longitudinal studies and to develop statistical programming languages and software.30 As a result, the identification of clinically and statistically relevant clusters or trajectories has become increasingly promising.
In general, researchers claim that the clinical symptoms of schizophrenia are heterogeneous, though the existing evidence is still divergent. Further work is expected from psychiatric researchers that adopts a longitudinal study design, includes unaffected siblings and uses genetic markers as predictors. Nevertheless, our review may help clinicians to optimize the efficacy of evidence-based personalized medicine by providing personalized assessment, initiating early intervention strategies, and selecting treatments relevant for subgroups of patients with similar characteristics. Our review may also help the translation of statistical findings into clinical practice and the use of clustering and trajectory analysis methods in precision medicine to treat subgroups of patients with poor outcomes and to prevent prodromal symptoms in their relatives and the general population.
Funding
Tesfa Dejenie is supported by the University of Groningen Scholarship.