Molecular Medicine Israel

Using brain structural neuroimaging measures to predict psychosis onset for individuals at clinical high-risk

Abstract

Machine learning approaches using structural magnetic resonance imaging (sMRI) can be informative for disease classification, although their ability to predict psychosis is largely unknown. We created a model to differentiate individuals at clinical high risk (CHR) who later developed psychosis (CHR-PS+) from healthy controls (HCs). We also evaluated whether we could distinguish CHR-PS+ individuals from those who did not later develop psychosis (CHR-PS-) and those with uncertain follow-up status (CHR-UNK). T1-weighted structural brain MRI scans from 1165 individuals at CHR (CHR-PS+, n = 144; CHR-PS-, n = 793; and CHR-UNK, n = 228) and 1029 HCs were obtained from 21 sites. We used ComBat to harmonize measures of subcortical volume, cortical thickness, and surface area, and corrected for non-linear effects of age and sex using a generalized additive model. CHR-PS+ (n = 120) and HC (n = 799) data from 20 sites served as a training dataset, which we used to build a classifier. The remaining samples were used as external validation datasets to evaluate classifier performance (test, independent confirmatory, and independent group [CHR-PS- and CHR-UNK] datasets). The accuracy of the classifier on the training and independent confirmatory datasets was 85% and 73%, respectively. Regional cortical surface area measures, including those from the right superior frontal, right superior temporal, and bilateral insular cortices, strongly contributed to classifying CHR-PS+ from HC. CHR-PS- and CHR-UNK individuals were more likely to be classified as HC compared to CHR-PS+ (classification rate to HC: CHR-PS+, 30%; CHR-PS-, 73%; CHR-UNK, 80%). We used multisite sMRI to train a classifier to predict psychosis onset in CHR individuals, and it showed promise in predicting CHR-PS+ in an independent sample. The results suggest that, when adolescent brain development is taken into account, baseline MRI scans of CHR individuals may help to identify their prognosis. Future prospective studies are required to determine whether the classifier could actually be helpful in clinical settings.

Introduction

The clinical high risk (CHR) paradigm is widely used with the goal of improving early detection and prevention of psychotic disorders [1]. Individuals are considered at CHR for psychosis if they meet criteria for attenuated positive symptom syndrome (APSS), brief intermittent (limited) psychotic syndrome (BLIPS), and/or genetic risk and deterioration syndrome (GRDS) based on semistructured interviews [2,3,4,5]. The CHR state is present in 1.7% of the general population and 19.2% of clinical samples [6]. CHR individuals have a higher risk of developing psychosis (0.15 at 1 year) compared to healthy controls; the transition risk increases from 0.09 at 6 months to 0.27 at 4 years [7]. However, most CHR subjects who do not transition to psychosis will continue to meet CHR criteria or experience attenuated psychosis symptoms at follow-up, and only 33% will eventually remit [7,8].

The CHR state is also associated with alterations in proxy measures of brain structure [9,10,11,12,13,14,15]. Previous structural magnetic resonance imaging (MRI) studies reported a progressive decrease in gray matter volume in the medial and superior temporal and medial frontal cortex during the transition period among CHR individuals [14,15,16,17]. Gray matter volume continued to decrease several years after disease onset [15,16,18]. Cortical surface area (SA) and cortical thickness (CT), which can be extracted using FreeSurfer software [19,20,21], are also crucial predictors of important life outcomes [22] and are associated with neurological, psychological, and behavioral traits [23]. SA is more strongly correlated with gray matter volume than CT is, suggesting that SA and CT are distinct structural features of the cortical gray matter [24,25]. A recent study indicated that the multivariate genetic architectures of cortical surface area and thickness are distinct [22]. This is in line with the radial unit hypothesis [26], according to which the expansion of cortical surface area is driven by the proliferation of neural progenitor cells, whereas thickness is determined by the number of neurogenic divisions of these cells [23]. Widespread lower CT has also been identified in cross-sectional MRI data in individuals at CHR in a large-scale pooled analysis of the Enhancing Neuro Imaging Genetics through Meta-Analysis (ENIGMA) CHR Working Group [27]. Among these widespread alterations, frontal cortical and temporal regions (e.g., fusiform, superior temporal, and paracentral) have been relatively consistently associated with CHR status [9,10,11,28,29,30], with these regions also exhibiting lower CT in individuals with established schizophrenia [29]. In addition to regional changes, individuals at CHR have shown greater neuroanatomical variability in global SA, CT, and subcortical volume compared to HCs [31]. Furthermore, longitudinal studies have reported that reductions in cortical thickness in the paracentral, superior temporal, and fusiform gyri are associated with psychosis conversion in those at CHR [13,14,32]. Recent work has indicated that whole-brain sMRI patterns of schizophrenia forecasted 2-year psychosocial impairments in individuals at CHR [33], suggesting that alterations in brain structure may predict real-life outcomes.

Adolescence is a crucial developmental window that is associated with brain-wide changes, including reductions in cortical thickness and volume [34,35]. Cortical characteristics such as gray matter volume, cortical surface area, and cortical thickness decline by about 10% during adolescence [36]. In contrast, white matter volume has been reported to peak in young adulthood [36]. Since the period from adolescence to early adulthood is a high-risk time window for psychosis onset [32], age-related anatomical deviations from typically occurring declines may hold valuable information for predicting later psychosis conversion, especially in frontal and temporal regions that have been implicated in CHR [27,32,37,38,39] and schizophrenia [40,41,42,43,44,45]. Further, greater brain age deviations were found to be associated with a higher risk for psychosis over time [11,38]. Importantly, these results suggest that the adolescent brain development pattern of CHR individuals may differ from that of HCs. Indeed, the ENIGMA CHR Working Group has reported that CHR participants, compared to HC participants, exhibit altered non-linear age associations with cortical thickness [27], suggesting that cross-sectional between-group differences in sMRI metrics may involve altered adolescent development, trait characteristics associated with psychosis liability, and/or progressive brain pathology around the onset of psychosis [32,39,46].

An increasing number of studies have attempted to use (cross-sectional) sMRI data to predict outcome or case-control status. These prior studies show that machine learning approaches are informative for differentiating individuals with schizophrenia from HCs [47,48,49,50,51,52]. Similar findings were observed in different clinical stages of psychosis, including first-episode schizophrenia and CHR individuals [48,49]. A major limitation, however, is the need for large and diverse samples to establish a well-tuned classifier that also provides generalizable predictive performance [12,53]. Since single sites cannot typically provide the necessary sample sizes [49,54,55], multisite consortia data may be advantageous if site effects are adequately accounted for (e.g., via cross-site harmonization procedures) [49,54,56]. For example, without harmonization, a prior study failed to build a useful model with multisite data [38]. In the current study, we aimed to investigate whether cross-sectional sMRI data can be used to build a classifier that differentiates the neuroanatomical developmental patterns of HCs from those of participants who later developed a psychotic disorder (CHR-PS+), as biomarkers for future psychosis conversion. As altered developmental processes are implicated in psychosis risk, we considered the potential non-linear effects of age and sex to optimize the predictive accuracy of the trained classifiers.
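To make the idea of cross-site harmonization concrete, the following is a minimal sketch of a per-site location and scale adjustment on synthetic data. It is not ComBat: it omits the empirical Bayes shrinkage and the preservation of biological covariates, and the function name `harmonize_location_scale`, the site labels, and the simulated offsets are all assumptions made for illustration.

```python
import numpy as np

def harmonize_location_scale(features, site):
    """Simplified per-site location/scale adjustment (NOT full ComBat).

    features: (n_subjects, n_features) array of sMRI measures.
    site:     (n_subjects,) array of site labels.
    Each site's features are standardized and then mapped back onto the
    pooled mean and standard deviation, so all sites share one location
    and scale. Covariates such as age, sex, or diagnosis are ignored here.
    """
    features = np.asarray(features, dtype=float)
    site = np.asarray(site)
    grand_mean = features.mean(axis=0)
    pooled_sd = features.std(axis=0, ddof=1)
    out = np.empty_like(features)
    for s in np.unique(site):
        idx = site == s
        site_mean = features[idx].mean(axis=0)
        site_sd = features[idx].std(axis=0, ddof=1)
        out[idx] = (features[idx] - site_mean) / site_sd * pooled_sd + grand_mean
    return out

# Synthetic example: three hypothetical sites with different additive offsets.
rng = np.random.default_rng(0)
site = np.repeat(np.array(["A", "B", "C"]), 50)
offsets = {"A": 0.0, "B": 2.0, "C": -1.5}
shift = np.array([offsets[s] for s in site])[:, None]
raw = rng.normal(size=(150, 4)) + shift
harmonized = harmonize_location_scale(raw, site)
```

Because this simplified version removes every between-site difference in mean and variance, it would also remove genuine biological differences if sites differ in age or group composition, which is one reason covariate-aware methods are preferred in practice.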

Here, we first combined data from 21 sites and harmonized them through the ENIGMA CHR Working Group using ComBat [57], an empirical Bayes method, to minimize differences related to site, scanner, and scanning protocol. Second, to model non-linear age effects, we fitted generalized additive models (GAMs) [58,59] to the HC data, and then applied the fitted GAMs to obtain non-linear age- and sex-corrected features for the entire sample [60]. More specifically, we estimated the model in HCs and applied it to individuals at CHR to capture deviations from the expected patterns of physiological aging. As patients with early-onset psychosis [61] and schizophrenia [41] have been reported to have abnormally low estimated intracranial volume (ICV), all procedures were performed after adjusting the MRI features for effects of ICV. Third, we developed an XGBoost [62] classifier using only HCs and CHR-PS+ to identify deviations in neuroanatomical developmental patterns as potential predictors of future psychosis conversion. Finally, we tested the predictive performance of the classifier with the left-out site data, to avoid the potential for information leakage between the training and test data.
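The step of estimating age and sex effects in HCs only and then applying the fitted model to everyone can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: a cubic polynomial in age fitted by ordinary least squares stands in for the GAM smooth, the ICV adjustment is omitted, and the function names and the synthetic data are hypothetical.

```python
import numpy as np

def design_matrix(age, sex, age_mean, age_sd, degree=3):
    """Intercept, polynomial terms of standardized age, and a sex indicator."""
    z = (np.asarray(age, dtype=float) - age_mean) / age_sd
    cols = [np.ones_like(z)] + [z ** d for d in range(1, degree + 1)]
    cols.append(np.asarray(sex, dtype=float))
    return np.column_stack(cols)

def fit_normative(age_hc, sex_hc, y_hc, degree=3):
    """Fit the age/sex model on healthy controls only."""
    age_mean, age_sd = np.mean(age_hc), np.std(age_hc)
    X = design_matrix(age_hc, sex_hc, age_mean, age_sd, degree)
    coef, *_ = np.linalg.lstsq(X, np.asarray(y_hc, dtype=float), rcond=None)
    return {"coef": coef, "age_mean": age_mean, "age_sd": age_sd, "degree": degree}

def deviation_scores(model, age, sex, y):
    """Observed feature minus the value expected for a control of that age and sex."""
    X = design_matrix(age, sex, model["age_mean"], model["age_sd"], model["degree"])
    return np.asarray(y, dtype=float) - X @ model["coef"]

# Synthetic example: one cortical-thickness-like feature with a curved age trend.
rng = np.random.default_rng(1)
n = 300
age = rng.uniform(12, 35, size=n)
sex = rng.integers(0, 2, size=n)
is_hc = rng.random(n) < 0.5          # hypothetical HC / CHR labels
thickness = (3.0 - 0.02 * age + 0.0003 * (age - 20) ** 2
             + 0.05 * sex + rng.normal(scale=0.1, size=n))
features = thickness[:, None]        # shape (n_subjects, n_features)

model = fit_normative(age[is_hc], sex[is_hc], features[is_hc])
deviations = deviation_scores(model, age, sex, features)
```

Fitting on controls only means the corrected features of CHR individuals express how far each person departs from the normative trajectory, rather than absorbing group differences into the age and sex terms.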

First, we hypothesized that CHR-PS+ individuals would be distinguishable from HCs based on features derived from structural MRI, on the assumption that those CHR individuals who are most likely to convert to psychosis would show the greatest baseline anatomical alterations. Second, we expected our classifier to label as HCs both individuals at CHR who had not developed a psychotic disorder at follow-up (CHR-PS-) and individuals at CHR who did not complete follow-up visits and therefore had missing transition status (CHR-UNK). Third, we expected the classifiers to perform similarly in independent confirmatory datasets, and expected to find associations between classifications and symptom severity.

Methods

Participants

We included data from a total of 1165 CHR individuals (144 CHR-PS+, 793 CHR-PS−, and 228 CHR-UNK individuals) and 1029 healthy controls (HCs) from 21 ENIGMA Clinical High Risk for Psychosis Working Group sites (Table 1). As a previous study showed that using CHR psychometric instruments to assess the CHR state in clinical samples is associated with excellent overall prognostic performance [63], we pooled the two assessments directly, as in previous studies [27,31,64]. CHR status was assessed using the full version of the Comprehensive Assessment of At-Risk Mental States (CAARMS [65]; n = 650) or the Structured Interview for Prodromal Syndromes (SIPS [66,67]; n = 799). Site-specific inclusion and exclusion criteria, as well as the available scale scores for premorbid IQ, symptom severity, global functioning, and antipsychotic use at scan, are the same as in a prior publication (Supplementary Table S1) [27]. All sites obtained local institutional review board approval prior to data collection. Written informed consent was obtained from every participant, or from the participant’s guardian for participants younger than 18 years. All studies were conducted in accordance with the Declaration of Helsinki [68].

We applied a two-step approach [49] to evaluate the performance of the models by dividing the data into four datasets: training, test, independent confirmatory, and independent group datasets (Fig. 1). Test and independent confirmatory datasets were used as external validation datasets. First, the training and test datasets comprised the data from CHR-PS+ and HC individuals at the 20 sites other than Toyama, which was used as the independent confirmatory dataset. We chose this dataset because the Toyama site contributed the largest HC sample, and excluding it reduced the sample imbalance between groups when building a machine learning classifier. Ninety percent of the data were randomly assigned to the training dataset, and the remaining 10% to the test dataset. A Kolmogorov–Smirnov test did not show any significant differences between the training and test datasets in any structural features. The independent confirmatory dataset comprised the data from HCs and CHRs at the Toyama site; these data were completely excluded from the training partition and were used to perform an independent first-step evaluation without site information leakage. To evaluate the classifier on unseen new data, we defined the CHR-PS− and CHR-UNK individuals in all sites as the independent group dataset to perform the second step….
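The four-way partition described above can be sketched on synthetic labels as follows. This is a minimal illustration, not the authors' code: the function name `split_datasets`, the ASCII group labels, and the placeholder site names are assumptions, and the sketch further assumes that the independent confirmatory dataset holds only the HC and CHR-PS+ individuals of the held-out site, so that the four partitions do not overlap.

```python
import numpy as np

def split_datasets(group, site, held_out_site, train_frac=0.9, seed=0):
    """Return index arrays for the four datasets.

    group: labels in {"HC", "CHR-PS+", "CHR-PS-", "CHR-UNK"}.
    site:  site label per subject.
    """
    group = np.asarray(group)
    site = np.asarray(site)
    idx = np.arange(group.size)

    is_model_group = np.isin(group, ["HC", "CHR-PS+"])
    is_held_out = site == held_out_site

    # HC and CHR-PS+ from all remaining sites: random 90/10 split.
    pool = idx[is_model_group & ~is_held_out]
    shuffled = np.random.default_rng(seed).permutation(pool)
    n_train = int(round(train_frac * shuffled.size))

    return {
        "training": np.sort(shuffled[:n_train]),
        "test": np.sort(shuffled[n_train:]),
        # Assumption: only HC and CHR-PS+ of the held-out site go here.
        "independent_confirmatory": idx[is_model_group & is_held_out],
        # CHR-PS- and CHR-UNK from every site, including the held-out one.
        "independent_group": idx[~is_model_group],
    }

# Synthetic example with placeholder sites; "Toyama" is the held-out site.
rng = np.random.default_rng(2)
n = 400
group = rng.choice(["HC", "CHR-PS+", "CHR-PS-", "CHR-UNK"], size=n)
site = rng.choice(["site01", "site02", "Toyama"], size=n)
parts = split_datasets(group, site, held_out_site="Toyama")
```

Holding out an entire site, rather than a random subset of subjects, is what keeps site-specific scanner characteristics from leaking into the first-step evaluation.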
