Introduction

Prevention of diabetes and its associated complications has become a major public health priority worldwide. Recent clinical trials demonstrated that lifestyle interventions in individuals with impaired glucose tolerance can substantially delay the development of diabetes [1, 2], providing a rationale for identifying high-risk individuals so that early lifestyle intervention strategies can be implemented to prevent diabetes. Prediction models for the risk of diabetes can help to guide screening and interventions and to predict diabetes occurrence [3–5]. Routinely available and easily collected clinical and lifestyle-related information has been found to be effective for identifying diabetes cases [6–12]. In addition, a prediction model has been developed in Mexican Americans [13] and further tested in Japanese Americans [14]. This San Antonio model used a variety of diabetes risk factors, including age, sex, ethnicity, fasting glucose, systolic blood pressure, HDL-cholesterol, body mass index and family history, to generate a prediction points system and appeared to have a predictive power similar to that of the diagnostic criteria for metabolic syndrome [13, 15]. Several other prediction models have also been developed, primarily in white populations [3, 5, 16, 17]. However, risk scores developed in white people may not apply to other ethnic groups [18]. Therefore, we specifically developed a prediction model for type 2 diabetes using a community-based cohort of middle-aged and elderly Chinese in Taiwan.

Methods

Study design and participants

The National Taiwan University Hospital Committee Review Board approved the study protocol, details of which have been published previously [19, 20]. Briefly, the Chin-Shan Community Cardiovascular Cohort Study began in 1990 by recruiting 1,703 men and 1,899 women of Chinese ethnicity aged 35 years or older from the Chin-Shan township, 30 km north of metropolitan Taipei, Taiwan. Information about anthropometry, lifestyle and medical conditions was assessed by interview questionnaires in 2 year cycles for the initial 6 years; the validity and reproducibility of the collected data and measurements have been reported in detail elsewhere [21]. Individuals with incomplete baseline blood data (n = 41), a diagnosis of diabetes (fasting glucose ≥7.0 mmol/l or a history of hypoglycaemic medication, n = 473), or a history of cardiovascular disease or cancer (n = 170) were excluded from this investigation. After these exclusions, we included 2,960 individuals in this study. During the follow-up from 1990 to 2000, 548 individuals developed diabetes, defined by fasting glucose levels ≥7.0 mmol/l or by the use of oral hypoglycaemic or insulin medication. The response rate of the cohort participants was 85.7% at the end of the study. All participants gave informed consent.

Body mass index was calculated as weight (kg) divided by height squared (m²). Blood pressure was measured twice in the right arm using a mercury sphygmomanometer with the individual seated comfortably, arms supported and positioned at the level of the heart after resting for 10 min. Hypertension was defined according to the criteria established by the Seventh Joint National Committee [22]: systolic blood pressure ≥140 mmHg; diastolic blood pressure ≥90 mmHg; or history of hypertension medication. Family history was defined by any parental or first-degree sibling history of diabetes.
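The definitions used in this subsection can be written as simple rules. The following is a minimal sketch, assuming values are already in the stated units; the function and parameter names are illustrative and do not come from the study's analysis code.

```python
def body_mass_index(weight_kg, height_m):
    """BMI as weight (kg) divided by height squared (m^2)."""
    return weight_kg / height_m ** 2


def has_hypertension(systolic_mmhg, diastolic_mmhg, on_antihypertensive_medication):
    """Hypertension: SBP >= 140 mmHg, DBP >= 90 mmHg, or medication history."""
    return bool(
        systolic_mmhg >= 140
        or diastolic_mmhg >= 90
        or on_antihypertensive_medication
    )


def has_diabetes(fasting_glucose_mmol_l, on_glucose_lowering_medication):
    """Diabetes: fasting glucose >= 7.0 mmol/l, or oral hypoglycaemic/insulin use."""
    return bool(fasting_glucose_mmol_l >= 7.0 or on_glucose_lowering_medication)
```

The same `has_diabetes` rule applies both to the baseline exclusion and to the definition of incident cases during follow-up.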

Measurement of biochemical markers

The procedure for blood collection has been reported elsewhere [23, 24]. Briefly, all venous blood samples drawn after a 12 h overnight fast were immediately refrigerated and transported within 6 h to the National Taiwan University Hospital. Serum samples were then stored at −70°C before batch assay for levels of total cholesterol, triacylglycerol, and HDL-cholesterol. Standard enzymatic tests for serum cholesterol and triacylglycerol were used (Merck 14354 and 14366, Merck, Darmstadt, Germany). HDL-cholesterol levels were measured in supernatant fractions after the precipitation of specimens with magnesium chloride phosphotungstate reagents (Merck 14993). LDL-cholesterol concentrations were calculated as total cholesterol minus cholesterol in the supernatant fraction by the precipitation method (Merck 14992) [25]. Blood samples for glucose analysis were drawn into glass test tubes each containing 80 μmol/l fluoride/oxalate reagent. After centrifugation at 1,500×g for 10 min, the glucose level in the supernatant fraction was measured by enzymatic assay (Merck 3389) using an Eppendorf 5060 autoanalyzer. The peripheral-blood-cell analysis was carried out using a blood-cell counter (Sysmex Cell Counter NE-8000, TOA Medical Electronics, Kobe, Japan).

Statistical analysis

We used a multivariate Cox proportional hazards model to establish a parsimonious model for predicting risk of diabetes. This model included six significant predictors: age, body mass index, white blood cell count, triacylglycerol, HDL-cholesterol and fasting glucose. Because sex, family history of diabetes, hypertension and systolic blood pressure are biologically important predictors of the risk of diabetes [26], we examined the incremental predictive value of adding these variables to the above model. In addition, lifestyle factors such as physical activity, smoking and drinking alcohol were also tested to derive the parsimonious model. However, the likelihood ratio test suggested that adding these variables into the model did not improve prediction beyond the concise model. Thus, our final model did not include sex, hypertension or lifestyle factors.
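The likelihood ratio comparison of nested models described above can be sketched as follows. This is an illustration only, not the SAS or Stata code used in the study: the two log-likelihood values and the four degrees of freedom are hypothetical, and `chi2_sf` is a helper written here for integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    h = x / 2.0
    if df % 2 == 0:
        # Closed form for even degrees of freedom.
        return math.exp(-h) * sum(h ** i / math.factorial(i) for i in range(df // 2))
    # Closed form for odd degrees of freedom.
    tail = sum(h ** (j - 0.5) / math.gamma(j + 0.5) for j in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(h)) + math.exp(-h) * tail


def likelihood_ratio_test(loglik_reduced, loglik_full, extra_parameters):
    """Twice the gain in (partial) log-likelihood, referred to chi-square."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    return statistic, chi2_sf(statistic, extra_parameters)


# Hypothetical log-likelihoods: concise model vs concise model plus four variables.
statistic, p_value = likelihood_ratio_test(-4000.0, -3998.5, 4)
print(f"LR statistic = {statistic:.2f}, p = {p_value:.3f}")
```

A p value above 0.05, as in this made-up example, would correspond to the conclusion that the added variables do not improve on the concise model.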

We constructed the categorisation point system according to the concise model using the methods suggested by Sullivan and colleagues [27]. First, we organised the continuous variables into meaningful categories and determined reference values for each variable. Second, we determined the referent risk-factor profile by assigning the median value in each category and estimated how far each category was from the referent in regression units. Then we set a constant to reflect the increase in risk associated with a 10 year increase in age and decided points associated with each of the risk-factor categories. The point totals ranged from –15 to 32. Finally, we constructed the individual’s risk from the formula:
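The point-assignment steps above can be sketched as follows. This is a minimal illustration of the general approach of Sullivan and colleagues, assuming hypothetical regression coefficients, category midpoints and referent values; none of the numbers are the study's estimates.

```python
def category_points(beta, category_values, reference_value, constant):
    """Points per category: distance from the referent in regression units,
    divided by the constant and rounded to the nearest integer."""
    return [round(beta * (value - reference_value) / constant) for value in category_values]


# Hypothetical coefficients (log hazard ratio per unit); NOT the study's values.
beta_age = 0.02   # per year of age
beta_bmi = 0.08   # per kg/m^2

# One point corresponds to the risk increase of a 10 year increase in age.
constant = beta_age * 10

# Hypothetical representative values for each category; the first is the referent.
age_points = category_points(beta_age, [40, 50, 60, 70], reference_value=40, constant=constant)
bmi_points = category_points(beta_bmi, [20, 23, 26, 29], reference_value=20, constant=constant)

print(age_points, bmi_points)
```

An individual's point total is the sum of the points for the categories into which that individual falls across all risk factors.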

$$\text{Risk} = 1 - S_0(t)^{\exp\left(\sum \beta X - \sum \beta \overline{X}\right)}$$

where \(S_0(t)\) was the average survival at time t (e.g. t = 10 years), i.e. the survival rate at the mean values of the risk factors. The \(\sum \beta X\) values were approximated from the sum of the baseline risk and the product of the point total and the constant. The \(\sum \beta \overline X \) values were the sums of the products of the regression coefficients and the means (or proportions) of the variables.
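The risk formula can be evaluated directly once its three ingredients are known. The sketch below assumes hypothetical values for the baseline survival, the constant, the baseline term and the mean linear predictor; the function names are illustrative.

```python
import math


def linear_predictor_from_points(point_total, constant, baseline_term):
    """Approximate sum(beta * X) as baseline term + constant * point total."""
    return baseline_term + constant * point_total


def absolute_risk(baseline_survival, linear_predictor, mean_linear_predictor):
    """Risk = 1 - S0(t) ** exp(sum(beta * X) - sum(beta * mean X))."""
    return 1.0 - baseline_survival ** math.exp(linear_predictor - mean_linear_predictor)


# Hypothetical inputs; NOT the study's estimates.
S0_10_YEARS = 0.80     # average 10 year diabetes-free survival
CONSTANT = 0.2         # regression units per point
BASELINE_TERM = 0.0    # linear predictor at zero points
MEAN_LP = 1.0          # sum of coefficients times covariate means

for points in (0, 5, 10):
    lp = linear_predictor_from_points(points, CONSTANT, BASELINE_TERM)
    print(points, round(absolute_risk(S0_10_YEARS, lp, MEAN_LP), 3))
```

When an individual's linear predictor equals the mean linear predictor, the exponent is 1 and the estimated risk reduces to 1 − S0(t).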

We also constructed the regression coefficient-based model by assigning β values as estimated regression coefficients.

We conducted the internal validation of the simple points model and obtained a bias-corrected estimate of the AUC using a fivefold cross-validation procedure [28]. We randomly split the data into five equal parts. For k = 1, ..., 5, we used the kth part as the validation dataset and the remaining four parts as the training dataset. For each partition of the validation and training sets, we obtained the coefficients from the training set and assigned the points from the coefficients in the validation set to evaluate performance by estimating the AUC for the corresponding simple points model. Then we evaluated the overall performance of the points model by averaging the AUC estimates obtained from the five different partitions. The ranges of AUC across the five partitions indicated the stability of the prediction model [29]. To account for the variability in estimating the model variables and the AUC, we used the bootstrap method to construct 95% confidence intervals for the AUC. The standard-error estimates and the confidence intervals were obtained based on 1,000 bootstrap samples.
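The validation procedure can be sketched as follows. This is a simplified illustration under stated assumptions: `fit` is a hypothetical user-supplied function that derives a scoring rule from the training data (in the study this would involve refitting the Cox model and rebuilding the points), the AUC is computed by pairwise comparison of cases and non-cases, and the bootstrap shown here resamples fixed scores with percentile limits rather than refitting the model in each resample.

```python
import random


def auc(scores, labels):
    """Probability that a random case outscores a random non-case (ties count 1/2)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in controls)
    return wins / (len(cases) * len(controls))


def kfold_indices(n, k=5, seed=0):
    """Randomly split indices 0..n-1 into k parts of (nearly) equal size."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_auc(rows, labels, fit, k=5, seed=0):
    """Mean, minimum and maximum AUC over k validation parts.
    fit(train_rows, train_labels) must return a function row -> score."""
    aucs = []
    for fold in kfold_indices(len(rows), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(rows)) if i not in held_out]
        score = fit([rows[i] for i in train], [labels[i] for i in train])
        aucs.append(auc([score(rows[i]) for i in fold], [labels[i] for i in fold]))
    return sum(aucs) / k, min(aucs), max(aucs)


def bootstrap_auc_ci(scores, labels, n_boot=1000, seed=0):
    """Percentile 95% confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(scores)
    estimates = []
    while len(estimates) < n_boot:
        sample = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in sample]
        if 0 < sum(ys) < n:  # a resample needs both cases and non-cases
            estimates.append(auc([scores[i] for i in sample], ys))
    estimates.sort()
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot) - 1]
```

The minimum and maximum returned by `cross_validated_auc` correspond to the range of AUCs across partitions that is used above as an indicator of stability.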

Finally, we compared the performance of the proposed prediction model with that of various prediction models derived from other populations, including Cambridge [8–10], Prospective Cardiovascular Münster (PROCAM) [30], San Antonio [13, 14] and Framingham [5]. The AUC was used to compare the discriminatory capabilities of these models and our simple points model. The AUC is the area under the receiver-operating characteristic (ROC) curve, a graph of sensitivity vs 1–specificity (the false-positive rate) for various cut-off definitions of a positive diagnostic test result [31]. We listed the sensitivity and specificity for the best cut-off values from various models. Differences in the AUCs were tested using the method of DeLong et al. [32]. In addition, we assessed the goodness of fit for all models based on the Hosmer–Lemeshow test [33]. The global summary statistics, including Yates slope [34, 35], Brier score [34] and discrimination C statistics [36], were calculated for the various models. Moreover, we compared the simple points model with other models by using the net reclassification improvement (NRI) and integrated discrimination improvement (IDI) statistics [37]. The NRI statistic was based on the reclassification tables and was calculated as the sum of two differences: the proportion of event participants moving ‘upward’ in category minus the proportion moving ‘downward’, and the proportion of non-event participants moving ‘downward’ minus the proportion moving ‘upward’ [37]. We presented the NRI according to the a priori risk categories of diabetes (0–15%, 15–20%, 20–25%, and ≥25%). The IDI can be interpreted as the difference between the improvement in average sensitivity and any potential increase in average ‘1–specificity’, and the statistic was the difference in Yates discrimination slopes between the new and old models [34, 35].
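The NRI and IDI calculations described above can be sketched as follows. This is a minimal illustration using the a priori risk categories stated in the text; the predicted risks in the example are made up, and the function names are illustrative.

```python
from bisect import bisect_right

# A priori categories: 0-15%, 15-20%, 20-25% and >=25% (lower bounds inclusive).
CUT_POINTS = [0.15, 0.20, 0.25]


def risk_category(predicted_risk):
    """Index (0 to 3) of the risk category containing the predicted risk."""
    return bisect_right(CUT_POINTS, predicted_risk)


def net_reclassification_improvement(risk_old, risk_new, events):
    """(P(up|event) - P(down|event)) + (P(down|non-event) - P(up|non-event))."""
    up_e = down_e = up_n = down_n = 0
    for old, new, event in zip(risk_old, risk_new, events):
        move = risk_category(new) - risk_category(old)
        if event:
            up_e += move > 0
            down_e += move < 0
        else:
            up_n += move > 0
            down_n += move < 0
    n_events = sum(events)
    n_non_events = len(events) - n_events
    return (up_e - down_e) / n_events + (down_n - up_n) / n_non_events


def yates_slope(risks, events):
    """Mean predicted risk in events minus mean predicted risk in non-events."""
    in_events = [r for r, e in zip(risks, events) if e]
    in_non_events = [r for r, e in zip(risks, events) if not e]
    return sum(in_events) / len(in_events) - sum(in_non_events) / len(in_non_events)


def integrated_discrimination_improvement(risk_old, risk_new, events):
    """Difference in Yates slopes between the new and old models."""
    return yates_slope(risk_new, events) - yates_slope(risk_old, events)


# Made-up predicted risks for four participants (two events, two non-events).
risk_old = [0.10, 0.22, 0.18, 0.30]
risk_new = [0.17, 0.27, 0.12, 0.30]
events = [1, 1, 0, 0]
nri = net_reclassification_improvement(risk_old, risk_new, events)
idi = integrated_discrimination_improvement(risk_old, risk_new, events)
```

Positive values of either statistic favour the model labelled "new"; swapping the two models reverses the sign.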

We also ran a clinical model that included age, sex, BMI, family history and antihypertensive medication, but did not require blood test results, and used the AUC to compare the predictive ability of this model with that of the model that included laboratory-based measures.

All statistical tests were two-sided with a type I error of 0.05, and p values < 0.05 were considered statistically significant. Analyses were performed with SAS version 9.1 (SAS Institute, Cary, NC, USA) and Stata version 9.1 (Stata Corporation, College Station, TX, USA).

Results

Of the 2,960 participants without diabetes at the baseline examination, 548 developed type 2 diabetes during a 10 year follow-up period. Among the 548 participants with incident diabetes, 396 were not receiving pharmacological treatment for diabetes and were given a diagnosis exclusively on the basis of plasma glucose levels that met the American Diabetes Association criteria (≥7.0 mmol/l). Of the 136 participants with confirmed pharmacological treatment for diabetes, 78 also met plasma glucose criteria for diabetes. The baseline characteristics of study participants and the results of a Cox multivariate analysis that included age, body mass index, white cell count, triacylglycerol, HDL-cholesterol and fasting glucose level are shown in Table 1. Participants who developed incident diabetes also tended to have increased systolic blood pressure and a higher prevalence of family history of diabetes, but these two variables were not significantly associated with risk of diabetes after adjustment for other covariates. Therefore, they were not included in the final model.

Table 1 Diabetes risk factors at baseline according to disease status in the study participants and the estimated coefficients and relative risk (95% CI) from the multivariate Cox model

We developed a simple points system to estimate the diabetes risk using the baseline survival function at 10 years and the coefficients of the concise model (Table 2): age (four points), elevated fasting glucose (11 points), body mass index (eight points), triacylglycerol (five points), white blood cell count (four points) and a higher HDL-cholesterol (negative four points). This approach allowed manual estimation of 10 year risk of developing diabetes for each individual, as shown in Table 3.

Table 2 Simple points system according to the concise model
Table 3 The total points and absolute risk function for the categorical model in the simple points model

By using the simple points system to estimate risk of incident diabetes during a 10 year follow-up interval, we determined that 42% of the sample had a risk below 20%, 28% had a 20–30% risk, and 30% had a risk of 30% or higher. This simple points model had good discriminatory ability, with an AUC of 0.702 (95% CI 0.676–0.727). Adding sex, hypertension and family history of diabetes did not improve the predictive power (AUC 0.700, 95% CI 0.675–0.725). In addition, the regression-coefficient-based model had a similar AUC value (0.701, 95% CI 0.675–0.726). The optimal cut-off value for the simple points model was set as 13, with a sensitivity of 0.52 and a specificity of 0.78 (Table 4). We found that the simple points model had the highest proportion of correctly classified participants and the best likelihood ratio values among all models, and had a Youden index value similar to those of the coefficient-based models. The clinical model based on anthropometric measures and medication had a lower predictive ability (AUC 0.646, 95% CI 0.621–0.672) than the simple points model that included blood test results (p < 0.001).
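The relationship between the reported sensitivity and specificity and the other cut-off statistics can be illustrated with a short calculation. The sketch below uses only the rounded values quoted above (sensitivity 0.52, specificity 0.78), so its results may differ slightly from those in Table 4; the function name is illustrative.

```python
def cutoff_summary(sensitivity, specificity):
    """Youden index and likelihood ratios implied by sensitivity and specificity."""
    return {
        "youden_index": sensitivity + specificity - 1,
        "positive_likelihood_ratio": sensitivity / (1 - specificity),
        "negative_likelihood_ratio": (1 - sensitivity) / specificity,
    }


summary = cutoff_summary(0.52, 0.78)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With these inputs the Youden index is about 0.30, the positive likelihood ratio about 2.4 and the negative likelihood ratio about 0.62.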

Table 4 Sensitivity, specificity, best Youden index and likelihood ratios of the optimal cut-off value for the risk of diabetes in each model

The within-study model validation was assessed by the aforementioned fivefold cross-validation procedure. The AUCs for the five partitions ranged from 0.664 to 0.711, indicating a moderately high reliability of discrimination for the model in repeated random-sample subsets. The AUCs from the PROCAM, Cambridge, San Antonio and Framingham models were significantly lower than that from our simple points model (Fig. 1).

Fig. 1 Receiver-operating characteristic curves for various models applied to the study population. Blue, simple (AUC 0.701); grey, San Antonio (AUC 0.675); green, Framingham (AUC 0.662); orange, PROCAM (AUC 0.631); dark blue, Cambridge (AUC 0.581); black, reference (AUC 0.500)

Table 5 presents summary statistics for the performance of the models in predicting diabetes risk in the cohort dataset. The simple points model had the highest Yates’ slope and C statistics, indicating good discriminatory ability. When comparing the predicted and observed risks, we observed a non-significant p value for the Hosmer–Lemeshow statistic for the simple points model, indicating good calibration. The San Antonio model was most similar to the simple points model, with the smallest NRI and IDI values and the highest correlation coefficient. In addition, we calculated the NRI using different risk categories; the values were similar for quartile-based categories and the 15–20–25% risk categories.

Table 5 Summary statistics comparing risk prediction algorithms with prediction based on covariates in the PROCAM, Cambridge, Framingham, San Antonio and the simple points algorithms for the study cohort (N = 2,960)

Discussion

Statement of principal findings

Using a community-based cohort study, we developed a simple points model to predict 10 year risk of type 2 diabetes in a Chinese population based on six variables: age, body mass index, white blood cell count, triacylglycerol, HDL-cholesterol and fasting glucose levels. These values can be obtained relatively easily in clinical practice and the points system we developed is simple to use. The availability of a simple clinical tool to predict future risk of disease, as has been the case for prediction of coronary heart disease, should improve the prediction of diabetes risk, help to identify high-risk populations and enhance preventive strategies.

Existing diabetes risk functions

Several diabetes-prediction models have previously been developed in various populations. In cross-sectional studies conducted in the USA and Europe, prediction models based on clinical information and lifestyle-related factors have appeared to be useful for identifying undiagnosed diabetes cases and high HbA1c levels in screening populations [7, 8, 38, 39]. For example, the Cambridge model has been applied successfully to identify individuals with high HbA1c levels [9, 10]. In addition, a recent cohort study showed that the Cambridge model was useful for identifying individuals with a higher risk of type 2 diabetes during follow-up [40]. Other cohort-based risk models, such as the FINDRISC (Finnish Diabetes Risk Score) [3], PROCAM [30], San Antonio [13] and Framingham models [5], were developed to predict incident diabetes in different populations. We did not compare our model with FINDRISC or the German risk score [17, 41] because dietary information, which is needed for risk scoring, was not collected from our cohort.

Using age, body mass index, fasting glucose, HDL-cholesterol, family history and hypertension status, von Eckardstein and colleagues developed a model (PROCAM) to predict the incidence of diabetes during 6.3 years of follow-up in a German population [30]. Stern and colleagues also constructed the San Antonio model to predict diabetes risk in Mexican Americans and non-Hispanic whites during a 7.5 year follow-up period [13] and this model was validated among Japanese Americans during a 10 year follow-up period [14]. The San Antonio model included biomarkers such as fasting glucose, blood pressure and HDL-cholesterol, in addition to age, sex, obesity and family history of diabetes. Recently, Wilson and colleagues developed a simple points diabetes-prediction model for the Framingham Offspring Study [5]. This model included fasting glucose, body mass index, HDL-cholesterol, family history of diabetes, triacylglycerol and hypertension. Among these models, only the San Antonio model resembled our simple points model in terms of predictive power (see Electronic supplementary material [ESM] Table 1).

Most of the variables included in our model were similar to those in previous risk functions. In addition, white cell count, a marker of chronic inflammation, was incorporated into our prediction model. Besides fasting glucose and BMI, metabolic variables such as high triacylglycerol and low HDL-cholesterol were found to be strong predictors of type 2 diabetes in our study. These variables have also been included in prediction models developed for other populations [39, 42]. Moreover, we found that BMI had a discriminatory power similar to that of waist circumference in our model. Although family history of diabetes and hypertension were important predictors of diabetes in other studies, adding these variables to our risk scores did not improve risk prediction. In addition, we did not include lifestyle factors such as physical activity and smoking because they were not significant predictors of diabetes after adjusting for other risk factors.

Strengths and weaknesses of the study

To our knowledge, this is the first diabetes prediction model specifically developed for a Chinese population. Because of the large sample size, the estimates from our prediction models were found to be stable, as demonstrated by the internal validation study. Also, the use of a community-based population could reduce the possibility of selection bias. However, several potential limitations of this study should be mentioned.

First, the discriminatory capability of our simple points model was only moderately high (the AUCs were ∼0.70), and somewhat lower than that reported for prediction models developed in other populations.

Second, we did not include extensive biomarker data in the model. Adding other biomarkers such as insulin resistance may improve the discriminatory ability. However, these variables are more difficult to measure and interpret in clinical practice.

Third, our cohort has a higher diabetes incidence rate than Chinese women in Shanghai, China [43]. The high risk of diabetes among our study participants may be explained by older age, higher body mass index and a single measurement for case diagnosis [44].

Fourth, fasting plasma glucose but not post-challenge glucose was used to define the incidence of diabetes and to exclude individuals with undiagnosed diabetes at baseline. This is likely to have led to some error in estimation of the risk of diabetes and hence the performance of our model. Although fasting glucose is widely measured in routine health checkups in Taiwan [45], our model may be less practical in other populations in which blood test results for white cell count and fasting glucose, triacylglycerol and cholesterol are not routinely available or easily obtained.

In conclusion, we have constructed a simple points model for predicting the 10 year incidence of diabetes, and this model performed significantly better than other existing diabetes prediction models within the study population of ethnic Chinese people. This simple clinical tool should help identify high-risk populations and improve preventive and treatment strategies for the Chinese population.