Introduction

The increasing prevalence of type 2 diabetes seems to follow that of obesity. It has been estimated that arresting weight increase would prevent type 2 diabetes in six out of ten cases [1]. However, the extent to which this kind of association is dependent upon genes influencing BMI and type 2 diabetes is not clear. What is known is that inter-individual variability in the magnitude of weight gain is accounted for by genes [2]. On the other hand, it is not clear how the increase in BMI under positive energy balance translates into increased incidence of type 2 diabetes mellitus. In theory, type 2 diabetes could result either: (1) solely from obesity, i.e. through action of obesity genes, under other plausible non-genetic influences; or (2) from the simultaneous but independent action of diabetes and obesity genes. In other words, obesity and type 2 diabetes may share 100% (in the first instance) or 0% (in the second instance) of their genes. The reality probably lies between these extremes.

The results from several recent genome-wide association studies have identified at least 18 genes consistently associated with type 2 diabetes [3]. After the initial discovery of FTO and MC4R genes, 15 new loci have so far been found for BMI [4, 5]. However, the compiled contribution of the numerous known allelic regions to the risk of type 2 diabetes is only in the region 10% to 15% [6] and that for risk of obesity only a few per cent [4, 5]. FTO is one of the few genes that have been found to influence BMI and risk of type 2 diabetes [7, 8]. Thus, contrary to earlier expectations, confirmation of the existence of major genetic factors behind type 2 diabetes and obesity, or shared by them has proven to be a challenging task. In addition to allelic variations, copy number variation, epigenetic changes and gene–gene interaction may contribute to susceptibility to obesity and type 2 diabetes [3]. It is therefore unlikely that molecular genetic methods will be able to uncover the complete genetic make-up in the foreseeable future. To this end, quantification of the impact of common genes and environmental factors using genetic epidemiology models helps understand the nature of complex disorders such as obesity and diabetes [9].

According to cross-sectional twin and family studies, 20% to 30% of the covariance between BMI and type 2 diabetes could be due to shared genes [10, 11]. In such studies, however, the nature of the observed association between obesity and type 2 diabetes may have been biased due to the increased mortality rates associated with both conditions. The inability to control for the influence of excess mortality may lead to underestimation of the presence of shared genes. A longitudinal study of population representative samples, on the other hand, helps to reduce the influence of such bias.

The aim of the current study was to use a large population-based cohort of adult Finnish twins with an almost 30 year follow-up to test whether BMI predicts the onset of type 2 diabetes and, especially, to what extent shared genetic influences explain this association.

Methods

Participants

The Finnish Twin Cohort Study is a programme for the study of genetic and environmental determinants of chronic diseases and their risk factors [9, 12]. A baseline questionnaire was distributed in 1975 to same-sex twin pairs born before 1958, with an individual response rate of 89%. Twin zygosity was determined in 1975 by examining the responses of both members of each pair to two questions on similarity of appearance at primary school age. A set of decision rules was used to classify 93.1% of the twin pairs as monozygotic or dizygotic; 6.9% of twin pairs remained undetermined. The questionnaire method has been shown to be highly accurate in a validation study using genetic markers [13].

The participants were informed about the purposes of the overall cohort study at the baseline questionnaire in 1975 and in conjunction with the subsequent questionnaires received by cohort members. In responding to the questionnaire, participants also gave informed consent. The record-linkages were also approved by the appropriate authorities responsible for the registers, the data-protection ombudsman and the Ethics Committee of the Department of Public Health, University of Helsinki.

The twins were asked to provide their current weight (kg) and height (cm), which were used to assess their BMI (kg/m2) at baseline. The validity of the self-reported BMI was tested in a subsample of twins (100 men, 125 women) from the same cohort by first asking them to provide weight and height in a questionnaire in 1990 and then weighing and measuring them between 1992 and 1993. The correlation coefficients between self-reported and measured BMI in men and women were 0.89 and 0.90, respectively [14].

For the present study, data on incidence of type 2 diabetes in twins between 1976 and 1987 were first collected through computerised record linkage based on the unique personal ID assigned to all Finns, from death certificates, the National Hospital Discharge Register and the Medication Register of the Social Insurance Institution. The results were published in 1992 [15]. The same information resources were used to gather information on new incident cases of diabetes between 1988 and 1995 in the same cohort of twins. These data were examined by two research physicians, using the same classification of diabetes as in the earlier report, to determine a date for the diagnosis. Finally, a computerised linkage to the Medication Register was performed in 2006 to update the information on all medical certificates for diabetes in the twin cohort between 1996 and 2004. The data thus collected included only a general diagnosis of diabetes, birth date and the date of certificate admission. As all study participants were 38 years or older in 1996, it was assumed that all of them had type 2 diabetes and that the date of diagnosis was the date of certification. Individuals with diabetes in 1975, based upon questionnaire or retrospective register-linkage, were excluded from the analysis. Data on vital status during follow-up was obtained from the Population Register Centre of Finland, which holds information on deaths, migration or current address in Finland.

The possibility of classifying patients with type 1 diabetes as type 2 diabetes patients in our database is small. From the medical certificates available until 1995 for the current twin cohort, we assessed that the proportion of type 2 diabetes was 90.0% in the years 1976 to 1987 and 97.9% in the years 1988 to 1995, respectively. After 1995, the proportion is likely to be even higher. Similar conclusions could be made on the basis of population indices of diabetes in Finland in 2005, where 96.5% of 45- to 64-year-old patients had type 2 diabetes [14]. To assess possible misclassification error, background population incidences of type 1 and 2 diabetes mellitus in Finland in 2005 were determined as follows. In the 30- to 44-year age group, the incidence of type 1 and type 2 diabetes were 17 and 83/100,000; in the 45- to 64-year age group, incidence was 18 and 491/100,000, respectively. Taken together, it can be said that although a fraction of the cases after 1995 were cases other than type 2 diabetes, the admixture was trivial with regard to analysis of the effect of BMI.

The compilation of the actual dataset used to estimate HRs, twin correlations and variance components is shown in Fig. 1. After removing participants with diabetes at baseline or those who had undefined zygosity (mainly due to non-response to the questionnaire), or had moved abroad or died before 1 January 1976, self-reported BMI data were available for 23,585 of the 32,860 twins originally identified. The number of complete pairs was 10,400. In addition, there were 2,785 twins (401 monozygotic men, 377 monozygotic women, 1,115 dizygotic men and 892 dizygotic women) without a co-twin.

Fig. 1
figure 1

Compilation of the study population. DZF, dizygotic female; DZM, dizygotic male; MZF, monozygotic female; MZM, monozygotic male

Data analysis and statistical methods

All individuals were used in the analyses of HR and all pairs in the analyses of concordances, twin correlations and variance components of type 2 diabetes.

Cox proportional hazard regression was used to estimate the relative risk of incident type 2 diabetes by categories of BMI (underweight <18.5 kg/m2; normal weight 18.5 to 24.9 kg/m2; overweight 25 to 29.9 kg/m2; obesity 30 to 34.9 kg/m2; and morbid obesity ≥35 kg/m2 and over), by integer values of BMI, as well as per unit BMI and per SD of BMI. For type 2 diabetes cases, follow-up ended at the date of diagnosis, while other participants were censored at the time of death, migration from Finland or end of follow-up (31 December 2004). Regression models were adjusted for age and sex only, as well as for additional covariates. Standard errors and p values were corrected for clustering of twins within twin pairs in analyses of individuals. The proportionality assumption was evaluated by plotting of log–log of survival probability against log (time). The parallelism of the curves was good, although it diminished somewhat at the very highest ages. Pairwise (the number of pairs concordant for type 2 diabetes divided by the number of pairs either concordant or discordant for type 2 diabetes) and probandwise (the proportion of twins with diabetes among the co-twins of twins with diabetes) concordance rates were estimated as described in our previous report [15].

Sex-specific genetic univariate models were first tested for BMI and type 2 diabetes. Models were fitted to the data, assuming that variation in the phenotypes were attributable to a combination of a non-shared environmental component (E), a shared environmental component (C), an additive genetic component (A) and genetic effects due to dominance (D). The estimates were calculated as the proportion of variance divided by the total variance. The superiority of alternative, hierarchically nested models was tested by minus two times log likelihood test (−2LL) with df equal to the differences in df of the models to be compared. The relative proportion of A in relation to the total trait variance was also the estimate of heritability, h 2. Age-adjusted bivariate genetic factor models were then estimated separately for male and female twin pairs, with mean and variance weighted least squares method (WLSMV) implemented in MPlus software (MPlus version 5; www.statmodel.com, accessed 18 December 2008). Bivariate models included age- and Box–Cox-transformed BMI as a continuous trait and type 2 diabetes as a dichotomous trait, and yielded, in addition to the variance components, the coefficients for phenotypic (r), intraclass (IC) correlation between genetic variance components (r g) and correlation between environmental variance components (r e). The bias-corrected 95% CIs were estimated using a bootstrap method with 1000 draws.

Results

BMI and the risk of developing type 2 diabetes

The phenotypic correlation between BMI and type 2 diabetes was 0.40 in men and 0.38 in women. The overall HR of type 2 diabetes was 1.22 (95% CI 1.20–1.24) per BMI unit and 1.97 (95% CI 1.87–2.08) per 1 SD of BMI, respectively (Fig. 2). Altogether 1332 twin individuals (6.3% of men, 5.1% of women) developed type 2 diabetes by the end of the follow-up, with no significant difference in zygosity (HR for dizygotic twin individuals 0.93, 95% CI 0.82–1.06). The HR of developing type 2 diabetes increased monotonically from 1.00 (normal weight) to 13.6 (morbid obesity, BMI ≥ 35 kg/m2), when age, sex, physical activity, alcohol consumption, educational status and social class, respectively, were all used as covariates (Table 1).

Fig. 2
figure 2

The risk (Cox proportional hazards model) of developing type 2 diabetes (T2DM) (logarithmic scale) plotted against self-reported BMI recorded in 1975 during follow-up of 21,933 twins between 1975 and 2005, with respective 95% CIs (whiskers). Due to small numbers at the extremes of BMI distribution, participants with BMI ≤ 18 kg/m2, or BMI ≥ 37 kg/m2 more were pooled in two groups respectively. Age, sex, educational level, occupation, social class, physical activity and alcohol consumption were used as covariates

Table 1 The number of individuals categorised to BMI (kg/m2) classes together with the number of incident cases of type 2 diabetes in each class

The single twins (those without a co-twin) had higher BMI than the twins belonging to complete pairs (23.9 vs 23.1, p < 0.0001). Accordingly, the HR for type 2 diabetes was 1.31 among single twins, as compared with complete pairs. Therefore, we calculated the HR per BMI unit separately for single twins (those without a co-twin) and complete pairs. The HR was 1.22 (95% CI 1.18–1.26) among single twins and 1.23 (95% CI 1.21–1.25) among complete pairs, respectively. In other words, no difference in the BMI-associated risk of type 2 diabetes was observed between twins from incomplete vs those from complete pairs. Participants who were excluded due to insufficient information being available (n = 9275) were older than (39.3 ± 0.2 vs 35.6 ± 0.1 years, p < 0.0001), but their BMI did not differ from that of (23.7 ± 0.1 vs 23.2 ± 0.0, p = 0.50) the twins included in the analysis.

Concordance rates for type 2 diabetes during follow-up

At 0.41 proband-wise and 0.34 pairwise, the monozygotic concordance rates were higher than those in dizygotic twins (0.19 and 0.12, respectively). Over the follow-up period, approximately half of the monozygotic twins and a third of the dizygotic twins with an affected co-twin developed type 2 diabetes (Fig. 3). The mean age at start of follow-up, sex or the age at which twin 1 was diagnosed with type 2 diabetes were not significant predictors in the Cox regression model of type 2 diabetes in the co-twins of a twin in whom type 2 diabetes was already present.

Fig. 3
figure 3

The relative risk (%) of the non-diabetic co-twin within pairs discordant for type 2 diabetes as a function of follow-up time. n values for MZ and DZ indicate number at risk (number of events) during each interval. Black line, monozygotic; grey line, dizygotic. The respective mean age on 1 January 1976 at start of follow-up was 42.1 years for monozygotic and 42.9 years for dizygotic twins, respectively. Twins who died or moved abroad or were lost to follow-up were taken into account in the calculation of risk

Correlations and variance components

The intraclass correlations for monozygotic were roughly twice those for dizygotic twins (Table 2) for BMI (0.71 in monozygotic men and women, 0.41 in dizygotic men, 0.36 in dizygotic women) and for type 2 diabetes (0.74 in monozygotic men, 0.63 in monozygotic women, 0.34 in dizygotic men and women, respectively). In univariate analyses of BMI and type 2 diabetes in women, and of type 2 diabetes in men, the AE model fitted the data best. For BMI in men, the ACE model fitted the data slightly better than the AE model (p = 0.005, difference −2LL = 8.06 with df = 1), but the proportion of the C component was no more than 0.10 (95% CI 0.00–0.25). Therefore, the bivariate AE model was finally selected for both sexes. According to the model, 75% and 71% of the variance in BMI in men and women respectively could be attributed to additive genetic influences. Accordingly for type 2 diabetes, additive genetic factors accounted for 73% of the total variation in disease susceptibility in men and for 64% in women. The model-based r g of 0.40 in men and 0.45 in women implied that genetic covariation with BMI explained 16% (men) and 21% (women) of the genetic covariance for type 2 diabetes. The coefficients for r e (men 0.37, women 0.22) were slightly lower than those for r g.

Table 2 Results from the sex-specific bivariate genetic AE model including age and Box–Cox-transformed BMI as continuous, and type 2 diabetes as dichotomous variables

Discussion

To our knowledge, this is the first population-based follow-up study on the cumulative incidence of type 2 diabetes among twin pairs. With more than 10,000 adult monozygotic and dizygotic twin pairs from the Finnish Twin Cohort Study we show how well even a single estimate for BMI predicted future type 2 diabetes. Diabetes risk was almost 14-fold greater among morbidly obese (BMI ≥ 35 kg/m2) than among normal-weight participants. Our study also illustrates that, although BMI and type 2 diabetes had high heritability, only a minor proportion of their covariance was explained by shared genetic factors.

As a confirmatory outcome, we observed a monotonous increase in HR for type 2 diabetes between BMI 20 and 38 kg/m2. Similar findings were seen in 1995 in the US Nurses Study [16]. In general, the HR for type 2 diabetes was in line with previous reports. The age and sex adjusted HR values in our study were well in line with the respective values in a population-based follow-up study from Glasgow [1]. The increase in risk per BMI unit of 1.97 was in line with the pooled relative risk of 1.87 in a large meta-analysis published in 2007 [17]. In our study the HR for type 2 diabetes in severely obese participants was as much as 14 times that for all normal-weight participants, while it increased from 1 to 30 between BMI 20 and 35 when analysed per BMI unit with the low end of normal weight as reference category. Our study therefore supports the view that, in the prevention of type 2 diabetes, weight control is essential and also beneficial within BMI ranges considered to be normal.

Although earlier cross-sectional studies differed from ours with regard to recruitment of study participants, absence of follow-up, and ascertainment and prevalence of type 2 diabetes, they all resulted in surprisingly similar r g-estimates between BMI and diabetes, e.g. 0.55 in the Strong Heart Family Study [10] and 0.46 in a recent Australian twin study [11]. In the Strong Heart Family Study with 950 adults from 32 extended families of American-Indian ancestry, the prevalence of diabetes was substantial, namely 30%. In the Australian study, the prevalence of type 2 diabetes was only 0.34%, with only 113 of the nearly 20,000 pairs screened having type 2 diabetes in one or both twins. The similarity in r g can be taken to indicate that only a portion of the genes that contribute to BMI and type 2 diabetes mellitus are the same.

While the ratio of the risk of type 2 diabetes in siblings of patients with the disease compared with disease risk in the general population has varied between 3 and 4 [18], genome-wide association studies have estimated the λ attributable to all loci for type 2 diabetes to be only little above one, i.e. these loci have been estimated to explain 5 to 10% of the inherited predisposition [19]. As the FTO gene appears to increase diabetes risk mainly via obesity [20], no gene accounting for the observed r g between BMI and type 2 diabetes exists at present. Of the genes found for BMI, a significant proportion act through the central nervous system [4, 5], whereas most diabetes genes implicate pancreatic beta cell function or insulin resistance [3]. In other words, despite the strong association between diabetes risk and obesity, only part of this association is explained by genetic similarity. Thus although new genetic influences between adiposity and type 2 diabetes will emerge, the odds for finding a major gene accounting for obesity and type 2 diabetes have decreased.

Our heritability estimates for BMI were in line with previous studies [14, 16]. Although the respective estimates for type 2 diabetes appeared high, they were, under an AE model, similar to our earlier report [15]. That is to say, due to longer follow-up time, new cases of diabetes had emerged, but at parallel rates among monozygotic and dizygotic twins. The extension of follow-up time enabled us to view the longitudinal development of concordance for type 2 diabetes among monozygotic twins. Interestingly, it took more than a decade for type 2 diabetes concordance in monozygotic twins to reach 0.5, although the members of each pair share all their genes and childhood environments. In other words, for the risk of developing type 2 diabetes, current individual lifestyle-related behaviour, which is embedded in the variance component E, is more important than the past childhood family environment (variance component C). At the onset of the current study, less than 5% of the twins reported having lived with their co-twin after the age of 25 years (data not shown). Lack of influence of shared environment after adolescence has been observed, among other things, for food use [21], BMI [22] and physical activity [23].

The main virtue of our study is that it is a large, population-based study cohort with long follow-up of all available twin pairs. As pointed out by Poulsen in 1999, clinic-based studies tend to be biased due to disproportionate sampling of concordant and monozygotic pairs, as well as the tendency to include more severe cases of diabetes, thus increasing the possibility of reporting higher concordance rates among monozygotic twins [24]. We aimed to avoid this bias by using information from all available pairs. The long follow-up of a population-based twin cohort in our study also enabled coverage of the near-life-time risk of type 2 diabetes, hence improving the possibility of detecting diabetes cases that would have remained unnoticed in a cross sectional set-up. The prospective study design, moreover, enabled us to steer clear of problems of information loss due to study participants not attending follow-up visits, as has been the case for most cohort studies of type 2 diabetes incidence [18]. Thus, the combination of a large population-based twin sample, together with the nationwide diabetes medication register, provides at least as reliable tools for the study of type 2 diabetes epidemiology as any of the smaller scale studies using direct measurement of blood glucose values.

The limitations of our study include self-reported BMI values and variability in the diagnosis of type 2 diabetes. As mentioned above, we had previously shown that self-reported BMI values were well in line with measured ones within the same twin cohort [14]. The prospective study design also reduced the bias of reporting weight and avoided the problem of weight change after type 2 diabetes diagnosis. Moreover, in the current twin cohort the genetic influences on weight change have already been shown to be virtually uncorrelated with the genetic effects on BMI [25]. The diagnostic criteria for type 2 diabetes changed between 1975 and 2005 with regard to name (adult-onset diabetes mellitus, non-insulin-dependent diabetes mellitus, type 2 diabetes mellitus) and clinical definition [26]. However, all medical records were reviewed in detail and recoded up to 1995. After that, misdiagnosis between type 1 and type 2 diabetes played a minor role as even the youngest twins were at least 38 years old when the latest compilation of diabetes was begun. Finally, although the twins without a participating co-twin were more obese than those from complete pairs, the association between BMI and risk of type 2 diabetes remained unchanged in both groups.

Summary

Four salient findings emerged from this 28-year follow-up of 23,585 twins from the Finnish Twin Cohort Study. First, the risk of developing type 2 diabetes increased monotonically at levels already within the normal BMI range. Second, there was huge variability in the age at onset of type 2 diabetes, even within genetically identical twins. Third, the most parsimonious bivariate model suggested high additive genetic variance components for BMI and type 2 diabetes. Finally, shared genetic influences explained about one fifth of the covariation of BMI and type 2 diabetes.

Conclusions

Our data add to evidence that BMI is a strong risk factor for future type 2 diabetes, even in individuals with normal weight. While there are some shared genetic factors between BMI and type 2 diabetes, many genetic influences are specific either to BMI or to type 2 diabetes.