Introduction

Bipolar disorder (BD) is a severe psychiatric condition that is characterized by fundamental and distinctive distortions of emotion regulation and perception. BD has an equal sex incidence, affects all age groups and has a worldwide lifetime prevalence of approximately 1%. Family and twin studies of BD have provided unequivocal evidence that inherited genetic variation contributes substantially to disease risk,1, 2, 3 and genome-wide association studies (GWAS) have identified several risk variants.4, 5, 6, 7, 8 Further analyses of these data have suggested that the risk of BD has a substantial polygenic component, involving a large number of common risk alleles of small effect.9

Previous studies of genetic risk for BD have been based on a categorical diagnosis. However, genes that act through specific biological mechanisms are unlikely to have a similar influence on all BD symptoms, and thus association signals for more specific BD subtypes may have been missed. This hypothesis is supported by findings from previous BD studies. In linkage studies, subphenotyping BD patients according to relevant selected features led to the identification of potential susceptibility loci specific to psychotic BD,10, 11, 12, 13 BD with comorbid anxiety10 and BD with attention deficit hyperactivity disorder symptoms.14 In candidate gene studies, NRG1,15 5-HTTLPR16 and COMT17 have been implicated in BD with psychotic symptoms, whereas DAOA has been implicated in BD with persecutory delusions.18

The aim of the present study was to test two hypotheses: that symptom dimensions derived through a factor analysis approach that takes into account variations in core and associated symptoms would enable the formation of genetically more homogenous BD subgroups, and that analysis of these subgroups would identify associations missed in GWAS of the broad BD phenotype, despite a reduction in sample size. The GWAS step was performed in BD patients of German ancestry, and the top findings were followed up in a European American BD sample.

Materials and methods

Sample ascertainment and genotyping

All participants provided written informed consent. The study protocols were approved by the respective institutional review boards or ethics committees.

German sample

The German sample was used in a previous GWAS,8 and in a study of copy number variation in categorically diagnosed BD.19 These references provide a detailed description of the sampling and genotyping procedures. In brief, the present study included 927 in-patients with a DSM-IV diagnosis of BD and 2168 control subjects, all of German ancestry (for sample description see Table 1). The BD diagnoses were assigned on the basis of multiple sources of information, including the German version of the Structured Clinical Interview for DSM-IV axis I disorders (SCID-I),20 the Operational Criteria Checklist for Psychotic Illness (OPCRIT v3.32),21 medical records and family history. All study participants were individually genotyped using Illumina HumanHap550v3 BeadChips, Illumina Human610Quad BeadChips or Illumina Human660Quad BeadChips (Illumina, Inc., San Diego, CA, USA). Following stringent quality control, the final GWAS data set comprised 378 570 single-nucleotide polymorphisms (SNPs) with a minor allele frequency of at least 10%.

Table 1 Descriptive data for the German bipolar disorder patients

European American sample

A detailed description of the sample and the genotyping procedure is provided elsewhere.22 The GAIN/TGEN sample included 1247 patients with a best-estimate DSM-IV diagnosis of either bipolar I disorder or schizoaffective disorder bipolar subtype based on the Diagnostic Interview for Genetic Studies (DIGS 4.0)23 and 1434 controls. All the study participants were of European American ancestry and were genotyped using the Affymetrix Genome-Wide Human SNP Array 6.0 (Santa Clara, CA, USA) (Figure 1).

Figure 1

Quantile–quantile (QQ) plot of the genome-wide association data. QQ plot of allelic χ² test P values from autosomal SNPs following the application of all quality control filters. Good adherence of data points to the line of expectation was observed. This implies that spurious associations, characterized by an increase in the number of potential highly significant P values, had been systematically removed. All remaining slight deviations from the line of expectation in the extreme tail are presumed to reflect true-positive genetic effects.
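The coordinates behind such a QQ plot can be sketched as follows. This is a minimal illustration rather than the procedure used for Figure 1: the function name `qq_points`, the plotting position (rank − 0.5)/n and the example P values are all assumptions made here.

```python
import math

def qq_points(p_values):
    """Return (expected, observed) -log10(P) pairs for a QQ plot."""
    n = len(p_values)
    observed = sorted(p_values)  # smallest P value first
    points = []
    for rank, p in enumerate(observed, start=1):
        # Assumed plotting position for the rank-th smallest P value
        # under the null hypothesis of uniformly distributed P values.
        expected_p = (rank - 0.5) / n
        points.append((-math.log10(expected_p), -math.log10(p)))
    return points

# Hypothetical P values, for illustration only.
example = qq_points([0.9, 0.5, 0.1, 0.7, 0.3])
```

Points that lie on the diagonal (expected equal to observed) correspond to the adherence described in the legend; an upward departure confined to the extreme tail is the pattern interpreted there as true-positive signal.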

Factor analysis and the use of factors as binary traits for the association study

To refine the phenotypic characterization of the German BD sample (n=970), we performed a principal component analysis of 48 clinical OPCRIT items (Supplementary Table 1). From this analysis, we derived 12 factor dimensions. In contrast to other rating scales, the OPCRIT has neither a positive nor a negative symptom subscale, either of which might bias symptom rating. Factors derived from OPCRIT ratings are therefore less likely to be statistical artifacts of scale development. Genotypic data were available for 927 of these patients, and these patients were therefore included in the present study. The OPCRIT items referred to appearance and behavior, speech, form of thought, affect, and abnormal beliefs. Missing data varied from approximately 0.8% (excessive activity, suicidal ideation) to 9% (increased sociability). Missing values were replaced by the median for the specific item, taking into account the ordinal characteristic of the OPCRIT data.
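Item-wise median imputation of this kind can be sketched as below. The use of `statistics.median_low`, which always returns an observed rating category rather than a midpoint between two categories, is an assumption made here to respect the ordinal scale; the text states only that the item median was used. The function name and the example ratings are hypothetical.

```python
from statistics import median_low

def impute_item_median(ratings):
    """Replace None entries with the median of the observed ratings."""
    observed = [r for r in ratings if r is not None]
    # Assumption: median_low keeps the fill value on the ordinal scale.
    fill = median_low(observed)
    return [fill if r is None else r for r in ratings]

# Hypothetical ordinal ratings for one item; None marks a missing rating.
item = [0, 1, None, 2, 1, None, 0, 1]
imputed = impute_item_median(item)
```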

We inspected our data to determine whether it would be reasonable to conduct an orthogonal principal component analysis. The Kaiser–Meyer–Olkin (KMO) criterion measures sampling adequacy, that is, the extent to which the variables share common variance.24 Values above 0.6 indicate variables that are suitable for factor analysis. In our BD data, the KMO sampling adequacy was 0.855. Factor dimensions were extracted according to the commonly used Kaiser–Guttman criterion (eigenvalue greater than 1),25 resulting in a 12-factor solution. Fulfilment of the Kaiser–Guttman criterion indicates that a factor analysis-derived dimension explains more of the variance in clinical symptoms than any single clinical symptom alone. This 12-factor solution explained 54.2% of the total variance in the BD sample. For each patient, a personal regression factor score was calculated for each dimension. The 12 factor dimensions derived in the BD sample were termed: ‘depression’, ‘mania’, ‘delusions’, ‘grandiose delusions’, ‘depersonalization’, ‘voices’, ‘agitation’, ‘disorganization’, ‘other hallucinations’, ‘negative mood delusions’, ‘catatonia’ and ‘negative symptoms’ (see Supplementary Tables 2 and 3).
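The retention rule can be illustrated with a short sketch. It assumes that the eigenvalues of the item correlation matrix have already been computed; the function name `kaiser_guttman` and the six eigenvalues are hypothetical and are not taken from the study data.

```python
def kaiser_guttman(eigenvalues):
    """Return (number of retained components, share of variance explained)."""
    # Each standardized item contributes a variance of 1, so a component
    # is retained only if its eigenvalue exceeds that of a single item.
    retained = [ev for ev in eigenvalues if ev > 1.0]
    # For a correlation matrix, the eigenvalues sum to the number of items.
    explained = sum(retained) / sum(eigenvalues)
    return len(retained), explained

# Hypothetical eigenvalues for six standardized items (they sum to 6).
n_factors, explained = kaiser_guttman([2.4, 1.3, 0.9, 0.7, 0.4, 0.3])
```

In this toy case two components are retained and together account for about 62% of the total variance, which is the same kind of summary as the 12 factors and 54.2% reported for the 48 OPCRIT items.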

As the factor dimensions failed to display a normal distribution, the factor scores were transformed to binary format. Factor scores of ≥1.0 were rated as ‘high scores’, and factor scores of <1.0 as ‘low scores’. A low score indicates that almost none of the items loading on that particular factor were observed in the patient. Each factor dimension was present in around 15% of the BD patients. As all SNPs with a minor allele frequency of ≥0.1 were included, only factor dimensions present in at least 25 patients from the German sample were analyzed, as expected cell frequencies of <5 would violate one of the premises of the χ² test. Eleven factor dimensions fulfilled this criterion (‘mania’, ‘delusions’, ‘grandiose delusions’, ‘depersonalization’, ‘voices’, ‘agitation’, ‘disorganization’, ‘other hallucinations’, ‘negative mood delusions’, ‘catatonia’ and ‘negative symptoms’), and were therefore included in the dimensional GWAS.
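The binary coding and the 25-patient minimum can be made concrete with a small sketch. Two assumptions are made here: that scores at or above a cutoff of 1.0 count as high, and that the minimum follows from requiring an expected minor-allele count of at least 5 among the 2 × n alleles of the factor-present group at a minor allele frequency of 0.1. The function names and example scores are hypothetical.

```python
def binarize(scores, cutoff=1.0):
    """Code a factor score as 1 (high) if it reaches the cutoff, else 0 (low)."""
    # Assumption: the cutoff itself counts as a high score.
    return [1 if s >= cutoff else 0 for s in scores]

def expected_minor_alleles(n_patients, maf):
    """Approximate expected minor-allele count among the 2 * n alleles of a group."""
    return 2 * n_patients * maf

# Hypothetical regression factor scores for five patients.
flags = binarize([1.8, -0.4, 0.2, 1.0, -1.1])

# Smallest expected cell for 25 factor-present patients at MAF 0.1.
smallest_cell = expected_minor_alleles(25, 0.1)
```

Under these assumptions, 25 patients carry 50 alleles, of which about 5 are expected to be the minor allele, which matches the usual minimum expected cell count for the χ² test.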

Statistical analyses for the GWAS and follow-up study

All association analyses were performed using PLINK26 (v1.07). In the single-marker analysis, all of the autosomal SNPs that passed quality control checks were tested for association with the 11 binary factor dimensions using the allelic χ² model and the Armitage trend test. The P values were corrected using the genomic inflation factor. A P value of <5 × 10⁻⁸ per trait was selected as the threshold for genome-wide significance,27, 28 under the assumption of the presence of one million non-correlated common SNPs in the genome. Adjustment for the number of traits tested appeared too conservative, as factor dimensions were intercorrelated (Supplementary Table 4).
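The quantities named above can be sketched for a single SNP as follows. This is a textbook illustration, not a reproduction of PLINK's implementation: the allele counts and the inflation factor of 1.05 are hypothetical, the function names are introduced here, and the correction shown (dividing the χ² statistic by the inflation factor) is one common form of genomic control.

```python
import math
from statistics import median

def allelic_chi2(a, b, c, d):
    """Pearson chi-square (1 df) and odds ratio for a 2x2 allele-count table.

    a, b: counts of the tested and the other allele in cases;
    c, d: the same counts in the comparison group.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    odds_ratio = (a * d) / (b * c)
    return chi2, odds_ratio

def p_from_chi2_1df(chi2):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

def inflation_factor(chi2_values):
    """Median observed statistic divided by the null median (about 0.4549)."""
    return median(chi2_values) / 0.4549

# Hypothetical allele counts, for illustration only.
chi2, odds_ratio = allelic_chi2(30, 70, 10, 90)
lam = 1.05  # assumed value; in practice estimated from all tested SNPs
p_corrected = p_from_chi2_1df(chi2 / lam)

# Bonferroni-style threshold for one million independent tests.
threshold = 0.05 / 1_000_000
```

The last line shows where the 5 × 10⁻⁸ threshold comes from: a family-wise error rate of 0.05 divided by one million assumed independent common SNPs.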

We then explored whether genome-wide significant variants were associated with the categorical diagnosis of BD per se or primarily with a specific BD subphenotype. For this purpose, we tested for association in the entire sample, in subgroups of patients with and without symptoms from the ‘negative mood delusions’ dimension and in a control cohort. In addition, these findings were followed up in an independent European American sample of BD patients. Given that a different diagnostic instrument had been employed in the European American sample, we attempted to find support for our findings on the basis of symptoms rather than factor dimensions. Symptoms with a factor loading of at least 0.32 (that is, roughly 10% of shared variance between variable and factor, since 0.32² ≈ 0.10) were considered to contribute to the specific factor dimension.29 The P values of the follow-up sample were corrected using the genomic inflation factor.

Results

Factor analysis resulted in a 12-factor solution. Eleven factor dimensions fulfilled the statistical premises for inclusion in the GWAS (‘mania’, ‘delusions’, ‘grandiose delusions’, ‘depersonalization’, ‘voices’, ‘agitation’, ‘disorganization’, ‘other hallucinations’, ‘negative mood delusions’, ‘catatonia’ and ‘negative symptoms’; see Supplementary Tables 2 and 3).

The association between the rs9875793 G allele and the factor dimension ‘negative mood delusions’ (delusions of poverty, delusions of guilt and nihilistic delusions, see Supplementary Table 5) surpassed the threshold for genome-wide significance of P<5 × 10⁻⁸ under the assumption of an allelic χ² model (PG=4.65 × 10⁻⁸, odds ratio (OR)=2.66; factor present, n=88). One additional SNP, rs1499821, showed a trend towards genome-wide significance (PG=5.8 × 10⁻⁸, OR=2.65; factor present, n=88). Rs9875793 is located in the vicinity of the solute carrier family 2 (facilitated glucose transporter), member 2 gene SLC2A2. Rs1499821, which is in complete linkage disequilibrium (D′=1.0, r²=0.925) with rs9875793, is located within this gene. Analysis of the seven intragenic SLC2A2 SNPs represented on the array revealed that four (rs5398, rs1499821, rs8192675, rs11924032) were significantly associated with the factor dimension ‘negative mood delusions’ (P<0.05, with Bonferroni correction for seven SNPs) in the combined sample (see Table 2, Figure 2). Association with rs9875793 was only found for ‘negative mood delusions’. No nominally significant association was found with any other factor dimension.
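The two linkage disequilibrium measures quoted for rs1499821 and rs9875793 can differ because D′ reaches 1 whenever one of the four possible haplotypes is absent, whereas r² reaches 1 only if the allele frequencies at both loci also match. The sketch below uses the standard definitions; the function name and the haplotype and allele frequencies are hypothetical and are not estimates from the study sample.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B;
    p_a, p_b: frequencies of allele A and allele B.
    """
    d = p_ab - p_a * p_b
    # D' scales |D| by the largest value it could take given the
    # allele frequencies, which depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2

# Hypothetical frequencies: allele A only ever occurs together with allele B,
# but allele B is slightly more common than allele A.
d_prime, r2 = ld_measures(p_ab=0.10, p_a=0.10, p_b=0.108)
```

With these illustrative inputs D′ equals 1 while r² is about 0.92, the same qualitative pattern as the reported D′=1.0 and r²=0.925.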

Table 2 Associations between the factor dimension ‘negative mood delusions’ and rs9875793 and the seven SLC2A2 SNPs in the German GWAS sample (see Figure 2)
Figure 2

Regional-association plots displaying rs9875793 and the SLC2A2 SNPs. Allelic χ² test P values from SNPs are plotted against positions from the March 2006 human reference sequence, annotated by RefSeq genes. The most highly associated marker from the combined analysis (P) is indicated by an enlarged red diamond, which is in the center of a genomic window of around 300 kb. The strength of linkage disequilibrium (in r²) between the top SNP and its adjacent markers is demonstrated by the red (high) to white (low) color bar (top right corner).

Exploration of the association in the entire sample and in subgroups of patients with and without symptoms from the ‘negative mood delusions’ dimension, each compared with a control cohort (n=2168), revealed that a significant association with the G allele of rs9875793 was only present in the subgroup of patients with ‘negative mood delusions’ symptoms (n=89) (allelic χ² model: PG=0.0001, OR=1.92).

The only ‘negative mood delusions’ symptom rated in the European American follow-up sample (GAIN/TGEN) was delusions of guilt or sin. This corresponds to delusions of guilt in the German sample (item present, n=83; mapping 93% of the patients with the ‘negative mood delusions’ symptom). In the German sample, this symptom also showed significant association with the G allele in a separate single-item analysis (allelic χ² model: PG=0.0008, OR=1.87; see Supplementary Table 6). In the European American sample, the association was in the same direction as that identified in the German sample (allelic χ² model: PEA=0.028, OR=1.27). In the European American BD sample, no such association was observed in either the total sample or in patients without this symptom, in comparison to controls (see Supplementary Table 7).

Discussion

To our knowledge, the present GWAS of BD is the first to be based on factor analysis-derived symptom dimensions. We hypothesized that subsampling of patients according to selected clinical features would enable identification of novel genetic associations that would be missed in analyses of a broad BD phenotype. Using this new approach, we were able to identify a genome-wide significant association between the factor dimension ‘negative mood delusions’ (delusions of poverty, delusions of guilt and nihilistic delusions) and the chromosome 3 variant rs9875793 (PG=4.65 × 10⁻⁸, OR=2.66).

Rs9875793 is located in an intergenic region on chromosome 3, approximately 28 kb downstream of the glucose transporter gene SLC2A2. The cortically expressed SLC2A2 gene30 is a promising candidate gene for BD, as it is involved in the lithium-sensitive phosphatidyl inositol pathway,31 and its expression is modulated by psychological stress.32 Additionally, our top SNP rs9875793 was recently reported to be associated with differing activity of the right dorsolateral prefrontal cortex during a working memory paradigm in patients with schizophrenia in comparison to healthy controls.33 The dorsolateral prefrontal cortex is a critical interface between emotion regulation and cognition, and is particularly involved in the processing of negative emotions.34, 35 Structural and functional abnormalities in this region have been reported in patients with BD, schizophrenia and major depression.33, 36, 37

Case–control analyses revealed a unique association between the G allele of rs9875793 and BD patients with ‘negative mood delusions’ compared with controls, and this association was also observed in an independent European American BD sample. The fact that not even a trend toward association with the G allele of rs9875793 was observed for the categorical diagnosis of BD may suggest that patients with ‘negative mood delusions’ symptoms constitute a biologically more homogenous BD subgroup.

The present results indicate that BD patients characterized by the factor dimension ‘negative mood delusions’ may represent a genetically more homogenous subgroup. A limitation of the present study is that no stringent adjustment was made for the number of factor dimensions included. However, as our finding gained further support in the follow-up study, we consider our finding to be robust. The present study also underlines the feasibility of the factor dimensional approach, as it allows subphenotyping in clinical practice. Screening for the presence of particular symptoms might allow the identification of genetically more homogenous BD subgroups, which may in turn facilitate the development of individual treatment strategies.