Andrew N. Freedman, Daniela Seminara, Mitchell H. Gail, Patricia Hartge, Graham A. Colditz, Rachel Ballard-Barbash, Ruth M. Pfeiffer, Cancer Risk Prediction Models: A Workshop on Development, Evaluation, and Application, JNCI: Journal of the National Cancer Institute, Volume 97, Issue 10, 18 May 2005, Pages 715–723, https://doi.org/10.1093/jnci/dji128
Abstract
Cancer researchers, clinicians, and the public are increasingly interested in statistical models designed to predict the occurrence of cancer. As the number and sophistication of cancer risk prediction models have grown, so too has interest in ensuring that they are appropriately applied, correctly developed, and rigorously evaluated. On May 20–21, 2004, the National Cancer Institute sponsored a workshop in which experts identified strengths and limitations of cancer and genetic susceptibility prediction models that were currently in use and under development and explored methodologic issues related to their development, evaluation, and validation. Participants also identified research priorities and resources in the areas of 1) revising existing breast cancer risk assessment models and developing new models, 2) encouraging the development of new risk models, 3) obtaining data to develop more accurate risk models, 4) supporting validation mechanisms and resources, 5) strengthening model development efforts and encouraging coordination, and 6) promoting effective cancer risk communication and decision-making.
Cancer researchers, clinicians, and the general public are devoting increased attention to statistical models designed to predict the occurrence of cancer. The increasing numbers of websites ( 1 – 5 ) , handbooks, and information resources from professional societies ( 6 – 8 ) attest to the growing interest. A number of companies in the United States and the United Kingdom now offer genetic risk profiling ( 9 – 11 ) , and the National Cancer Institute (NCI) has identified risk prediction as an area of extraordinary opportunity in “The Nation's Investment in Cancer Research” ( 12 ) .
The number of models has grown steadily ( Table 1 ) since the first risk prediction model for a chronic disease was published in 1976 ( 47 ) . This model, the Framingham Coronary Risk Prediction Model, used several clinical and biologic factors to predict an individual's risk of developing heart disease. Modified versions of this early model are now widely used by physicians to make decisions on prevention and treatment strategies ( 48 ) . In the late 1980s and early 1990s, investigators began to publish models that predicted the absolute risk of breast cancer, that is, the probability that an individual would develop breast cancer over a defined period of time. These models incorporated known risk factors, such as age, age at menarche, age at first live birth, and family history of breast cancer. After the discovery of the breast cancer susceptibility genes BRCA1 and BRCA2 in 1994 and 1995, a number of genetic susceptibility risk prediction models were developed to predict, from a woman's family history, the likelihood that she carried a BRCA1 or BRCA2 mutation.
Breast cancer risk prediction models |
Absolute risk prediction |
Ottman et al. ( 13 ) |
Anderson et al. ( 14 ) |
Gail et al. ( 15 ) |
Taplin et al. ( 16 ) |
Claus et al. ( 17 ) ; Claus et al. ( 18 ) |
Rosner et al. ( 19 ) ; Colditz et al. ( 20 ) |
Ueda et al. ( 21 ) |
Tyrer et al. ( 22 ) |
Risk prediction of gene carrier status |
Couch et al. ( 23 ) |
Shattuck-Eidens et al. ( 24 ) |
Parmigiani et al. ( 25 ) ; Berry et al. ( 26 ) |
Frank et al. ( 27 ) ; Frank et al. ( 28 ) |
Antoniou et al. ( 29 ) |
de la Hoya et al. ( 30 ) |
Vahteristo et al. ( 31 ) |
Hartge et al. ( 32 ) |
Apicella et al. ( 33 ) |
Jonker et al. ( 34 ) |
Risk prediction of women at high risk |
Gilpin et al. ( 35 ) |
Fisher et al. ( 36 ) |
Colorectal cancer risk prediction models |
Absolute risk prediction |
Selvachandran et al. ( 37 ) |
Imperiale et al. ( 38 ) |
Risk prediction of gene carrier status |
Wijnen et al. ( 39 ) |
Prostate cancer risk prediction models |
Absolute risk prediction |
Ohori et al. ( 40 ) |
Bruner et al. ( 41 ) |
Eastham et al. ( 42 ) |
Optenberg et al. ( 43 ) |
Lung cancer risk prediction models |
Absolute risk prediction |
Bach et al. ( 44 ) |
Ovarian cancer risk prediction models |
Absolute risk prediction |
Hartge et al. ( 45 ) |
Risk prediction models for other cancers |
Absolute risk prediction |
Colditz et al. ( 46 ) |
In recent years, cancer risk prediction models published in the scientific literature have included refinements of older breast cancer risk models and new models that estimate the risk of melanoma, lung, prostate, colorectal, breast, and other cancers. Many of the new models combine clinical and epidemiologic risk factors with new biologic and genetic data to more accurately assess cancer risk.
As the number and sophistication of cancer risk prediction models have grown, so too has interest in ensuring that the models are appropriately applied, correctly developed, and rigorously evaluated. On May 20–21, 2004, the NCI sponsored “Cancer Risk Prediction Models: a Workshop on Development, Evaluation, and Application” in Washington, DC. Experts currently developing, evaluating, or using risk prediction models met to identify strengths and limitations of cancer and genetic susceptibility prediction models currently in use and under development; to explore methodologic issues related to their development, evaluation, and validation; and to identify research priorities and resources needed to advance the field.
This report summarizes the presenters' major points by topic area, provides additional highlights from specific presentations, briefly describes several risk prediction models presented at the meeting, and summarizes participants' recommendations for future research and activity.
ISSUES IN APPLICATIONS OF CANCER RISK PREDICTION MODELS
Workshop participants identified a number of research and clinical applications for cancer risk prediction models ( Table 2 ). The first of these applications was designing, planning, and establishing eligibility criteria for cancer intervention and screening trials. For example, the Gail Breast Cancer Risk Assessment Model ( 15 ) was adapted ( 49 ) to design the Breast Cancer Prevention Trial, a randomized, placebo-controlled study of the chemopreventive effects of tamoxifen in a population of women with an elevated risk of breast cancer ( 50 ) .
Planning intervention trials |
Assisting in creating benefit–risk indices |
Estimating the cost of the population burden of disease |
Identifying individuals at high risk |
Designing population prevention strategies |
Improving clinical decision-making (genetic counseling) |
Cancer risk prediction models also have been used to identify individuals at high risk of cancer who may benefit from targeted screening or other interventions, such as tamoxifen chemoprevention. The U.S. Food and Drug Administration uses a 5-year breast cancer risk cutoff of 1.67% or higher that was based on the Gail model for chemopreventive use of tamoxifen among women aged 35 years or older.
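As a minimal sketch of how such a cutoff might be applied in software, the function below checks the two criteria stated above: age of 35 years or older and a projected 5-year risk of at least 1.67%. The function and parameter names are hypothetical, and the risk value is assumed to come from a separately computed Gail model projection expressed in percent.

```python
def meets_tamoxifen_risk_cutoff(age_years: int, five_year_risk_pct: float) -> bool:
    """Return True when both criteria described in the text are met.

    age_years: age at assessment (criterion: 35 years or older).
    five_year_risk_pct: projected 5-year breast cancer risk, in percent
        (criterion: 1.67% or higher). Assumed to be computed elsewhere.
    """
    return age_years >= 35 and five_year_risk_pct >= 1.67


# Hypothetical inputs, for illustration only.
print(meets_tamoxifen_risk_cutoff(60, 1.70))  # both criteria met
print(meets_tamoxifen_risk_cutoff(42, 1.00))  # risk below the cutoff
print(meets_tamoxifen_risk_cutoff(30, 2.00))  # age below 35
```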
Cancer risk prediction models also have been used to develop benefit–risk indices. For example, although the Breast Cancer Prevention Trial demonstrated that tamoxifen treatment produced a 49% reduction in invasive breast cancer in women at elevated disease risk, adverse events such as stroke, pulmonary embolism, and endometrial cancer also occurred more often in women taking tamoxifen than in women who did not take tamoxifen. Using the Gail model, Gail et al. ( 51 ) created a benefit–risk index that weighs the benefit of taking tamoxifen, a reduced breast cancer risk, against the risks of adverse events. Rather than supporting a single 5-year level of breast cancer risk, such as the figure of 1.67%, their analysis found that the level of risk needed to justify the use of tamoxifen for breast cancer prevention was much higher in older women, who had higher risks of adverse events.
Another application of risk prediction models is estimating the population burden, the cost of cancer, and the impact of a specific intervention. For example, using the Gail model and a benefit–risk index, Freedman et al. ( 52 ) were able to estimate the numbers of women who would be eligible for and would benefit from taking tamoxifen for breast cancer chemoprevention in the United States.
Perhaps the best known use of cancer risk prediction models is in clinical decision-making to help physicians and patients determine appropriate screening regimens and/or interventions. Genetic susceptibility risk models also are used by physicians for their patients with a strong family history to estimate their cancer risk and to help them decide whether to pursue genetic testing.
At the meeting, participants stressed the need to consider the appropriate use of risk prediction models in different contexts and the related implications for development, application, and validation. For example, those who use models should determine the extent to which models that were developed and validated at the population level are useful in aiding decision-making for an individual.
Intervention Trials
Dr. Joseph Costantino (Graduate School of Public Health, University of Pittsburgh) stressed the need for good population data on baseline rates for non-cancer events, so that model developers can incorporate competing causes of death into benefit–risk indices. He also noted that most cancer risk prediction models were developed predominantly from Caucasian populations. Although the models work well for these populations, race-specific estimates of relative and attributable risks are needed to refine these models for use in non-Caucasian populations. Specifically, there is a need to develop new risk prediction models for African Americans for both breast cancer and prostate cancer, two common cancers with a high mortality in this population.
Population Burden of Disease and Impact of Changing Risk Factors
Dr. Karen Kuntz (Harvard School of Public Health) described how population-based simulation models could aid in evaluating the impact of changes over time in risk factors, in screening and chemoprevention patterns, and in the cost-effectiveness of interventions. She mentioned that this topic was the focus of the Cancer Intervention and Surveillance Modeling Network, a consortium of NCI-sponsored investigators who use modeling to assess the impact of cancer control interventions (e.g., prevention strategies, screening, and treatment) on population trends in incidence and mortality. These models are also used to project future trends and to help determine optimal cancer control strategies ( 53 ) .
Clinical Decision-Making (Breast Cancer Risk in General)
Dr. Laura Esserman (University of California, San Francisco, Carol Franc Buck Breast Care Center) spoke about the need to integrate all currently available information on risk to ensure that models were useful for clinical decision-making. She stressed that women could be helped through the use of decision aids that provide simple graphical information on their breast cancer risk and on the risks and benefits of potential prevention strategies, in the context of their overall health and the average breast cancer risk for the population. She also mentioned the critical need for models and decision aids to incorporate biomarker data, including imaging and genetic studies. These technologies can help clinicians and patients decide which intervention to pursue and can help clinicians assess the impact of these interventions on a patient. Decision aids also can facilitate a dialogue between patients and clinicians, motivate behavior change, and increase a woman's willingness to accept interventions or reassure her that her risk is not elevated compared with that of the average woman.
Clinical Decision-Making (Genetic Susceptibility and Breast Cancer Risk)
Dr. Susan Domchek (Abramson Cancer Center, University of Pennsylvania) began her talk by stressing the limitations of assessing risk from family history alone because factors such as adoption, small family size, and inaccurate family history may lead to erroneous conclusions about risk. She emphasized that the goal of breast cancer genetic susceptibility risk models was to identify candidates for screening for BRCA1 and/or BRCA2 mutations. Furthermore, she stressed that high sensitivity was needed for these models if they were to identify all mutation carriers. High specificity also is necessary, both clinically and economically, to avoid genetic testing of women who are less likely to be mutation carriers. She mentioned the limitations of, and the tremendous variation among, different prediction models and conceded that the current medical-legal environment encouraged clinicians in the United States to use the models that give the highest sensitivity, but at the cost of decreased specificity.
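The trade-off between sensitivity and specificity described above can be illustrated with a short sketch. It assumes a model that outputs a carrier probability for each woman and a referral rule that tests everyone at or above a chosen threshold. The function name, the probabilities, and the carrier labels are all hypothetical and are not taken from any of the models discussed at the workshop.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    carrier_probs: Sequence[float],
    is_carrier: Sequence[bool],
    threshold: float,
) -> Tuple[float, float]:
    """Sensitivity and specificity of the rule 'refer if probability >= threshold'."""
    tp = fn = tn = fp = 0
    for prob, carrier in zip(carrier_probs, is_carrier):
        referred = prob >= threshold
        if carrier and referred:
            tp += 1  # carrier correctly referred for testing
        elif carrier:
            fn += 1  # carrier missed
        elif referred:
            fp += 1  # non-carrier referred for testing
        else:
            tn += 1  # non-carrier correctly not referred
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical predicted probabilities and true carrier status.
probs = [0.02, 0.05, 0.08, 0.12, 0.20, 0.35, 0.60, 0.80]
carriers = [False, False, True, False, False, True, True, True]

for threshold in (0.05, 0.10, 0.30):
    sens, spec = sensitivity_specificity(probs, carriers, threshold)
    print(f"threshold={threshold:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

In this toy data set, lowering the threshold from 0.30 to 0.05 raises sensitivity from 0.75 to 1.00 while specificity falls from 1.00 to 0.25, which is the pattern the paragraph above describes.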
ISSUES IN DEVELOPING CANCER RISK PREDICTION MODELS
An important part of risk modeling is to obtain accurate relative risk and attributable risk estimates for etiologic factors, such as demographics, reproductive history, smoking, dietary patterns, medications, genetic factors (e.g., family history and susceptibility genes), and clinical and biologic markers (e.g., blood pressure, cholesterol, enzyme levels, and histologic markers). How these factors act jointly on risk also is important. These relative risk and attributable risk estimates, as well as data on risk of competing diseases, can be obtained from a number of different study designs, including cohort, case–control, family, and clinical studies; from SEER data; and from cross-sectional population surveys. Statistical techniques used to calculate risk include empirical analysis, logistic regression, proportional hazards models, Bayesian analyses, log incidence, Markov models, and decision theory. Table 3 illustrates the components used in the development of an absolute cancer risk prediction model, with the Gail Breast Cancer Risk Assessment Model as an example.
1. Selection of risk factors and estimation of relative risks for risk factor combinations |
Gail model: Using data from a case–control study and unconditional logistic regression, several risk factors and corresponding risk estimates were determined to be predictors of breast cancer risk ( Table 4 ). Relative risks for combinations of these risk factors are obtained by multiplying the component relative risks corresponding to each of the four categories A, B, C, and D as shown in Table 4 . |
2. Determine the population attributable risk fraction (AR) |
Gail model: AR estimates were obtained from the covariate compositions for the case patients in the case–control study and the relative risks for each covariate combination, obtained by multiplying the component relative risks in Table 4 . The AR is the difference between the disease rate in the population and the rate that would be observed if all individuals were at the lowest possible risk level, divided by the rate in the population. In the Gail model, the AR was 0.4212 for white women of all ages ( 15 ) . |
3. Estimate the baseline age-specific breast cancer hazard rate [see ( 49 ) ] |
Gail model: The baseline hazard rates were obtained by multiplying (1 − AR) = 0.5788 times the age-specific SEER breast cancer incidence rates as shown in Table 5 . |
4. Incorporate data on age-specific competing causes of death |
Gail model: Data on mortality rates were obtained from National Center for Health Statistics vital statistics for all causes except breast cancer. Formulas [found in ( 4 ) ] can be used to calculate absolute risk, taking competing risks into account. These calculations can be found at http://cancer.gov/bcrisktool . |
5. Approximate calculation of absolute risk |
Over short intervals, such as 5 years, the effects of competing risks are small. To approximate absolute risk of invasive breast cancer over a 5-year period, multiply four component relative risks from categories A, B, C, and D (in Table 4 ) to obtain an overall relative risk and multiply this value by the 5-year baseline risk of invasive breast cancer. For example, a 42-year-old white nulliparous woman who began menstruating at the age of 12, who has no affected first-degree relatives, and who has had one previous breast biopsy with specimens interpreted as benign and no evidence of atypical hyperplasia has an overall relative risk of 1.10 × 1.70 × 1.55 × 0.93 = 2.70. From the data on 5-year baseline risk, her projected 5-year risk of invasive breast cancer is 2.70 × 0.366 = 1.0%. |
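The arithmetic in steps 2, 3, and 5 can be sketched in a few lines of Python. This is an illustration of the approximate calculation only, not the full Gail model: it ignores competing risks (step 4), the function names are hypothetical, and the value 0.366 is read here as a 5-year baseline risk expressed in percent, which is what the worked example's result of 1.0% implies.

```python
import math
from typing import Sequence

AR_GAIL = 0.4212  # attributable risk fraction reported in step 2


def attributable_risk(population_rate: float, lowest_risk_rate: float) -> float:
    """Step 2: (population rate - rate at the lowest risk level) / population rate."""
    return (population_rate - lowest_risk_rate) / population_rate


def baseline_hazard(seer_rate: float, ar: float = AR_GAIL) -> float:
    """Step 3: scale an age-specific SEER incidence rate by (1 - AR)."""
    return (1.0 - ar) * seer_rate


def approx_five_year_risk_pct(
    relative_risks: Sequence[float], baseline_five_year_risk_pct: float
) -> float:
    """Step 5: overall relative risk times the 5-year baseline risk (percent)."""
    return math.prod(relative_risks) * baseline_five_year_risk_pct


# Worked example from step 5: component relative risks for the 42-year-old woman.
example_rrs = [1.10, 1.70, 1.55, 0.93]
overall_rr = math.prod(example_rrs)
risk_pct = approx_five_year_risk_pct(example_rrs, 0.366)

print(f"1 - AR      = {1.0 - AR_GAIL:.4f}")  # 0.5788
print(f"overall RR  = {overall_rr:.2f}")     # 2.70
print(f"5-year risk = {risk_pct:.1f}%")      # 1.0%
```

The code multiplies the unrounded product (about 2.696) by the baseline risk, whereas the text rounds to 2.70 first; both give 1.0% at one decimal place.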
Risk factor category | No. of first-degree relatives with breast cancer | RR (95% CI) |
---|---|---|
Age at menarche | | |
≥14 y | | 1.00 (referent) |
12–13 y | | 1.10 (1.02 to 1.19) |
<12 y | | 1.21 (1.03 to 1.41) |
No. of breast biopsies | | |
Age at counseling <50 y | | |
0 | | 1.00 (referent) |
1 | | 1.70 (1.40 to 2.06) |
≥2 | | 2.88 (1.97 to 4.23) |
Age at counseling ≥50 y | | |
0 | | 1.00 (referent) |
1 | | 1.29 (1.11 to 1.49) |
≥2 | | 1.64 (1.32 to 2.04) |
Age at first live birth | | |
<20 y | 0 | 1.00 (referent) |
<20 y | 1 | 2.61 (1.99 to 3.42) |
<20 y | ≥2 | 6.80 (3.96 to 11.68) |
20–24 y | 0 | 1.24 (1.16 to 1.34) |
20–24 y | 1 | 2.68 (2.23 to 3.23) |
20–24 y | ≥2 | 5.78 (4.14 to 8.06) |
25–29 y or nulliparous | 0 | 1.55 (1.35 to 1.78) |
25–29 y or nulliparous | 1 | 2.76 (2.32 to 6.41) |
25–29 y or nulliparous | ≥2 | 4.91 (3.76 to 6.41) |
≥30 y | 0 | 1.93 (1.56 to 2.38) |
≥30 y | 1 | 2.83 (2.22 to 3.62) |
≥30 y | ≥2 | 4.17 (2.75 to 6.31) |
Atypical hyperplasia * | | |
No biopsies | | 1.00 * |
At least one biopsy and no atypical hyperplasia found in any biopsy specimen | | 0.93 * |
No atypical hyperplasia found and hyperplasia status unknown for at least one biopsy specimen | | 1.00 * |
Atypical hyperplasia found in at least one biopsy specimen | | 1.82 * |
* These values were obtained from the literature and are regarded as constant.
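One way to make the table's point estimates usable in a calculation is to encode them as lookup tables, as in the sketch below. The dictionary keys, the function name, and the choice to cap counts at 2 (to match the "≥2" rows) are assumptions made for illustration. Only the relative risk point estimates from the table are used, and confidence intervals are omitted.

```python
# Relative risk point estimates transcribed from the table above.
MENARCHE_RR = {">=14": 1.00, "12-13": 1.10, "<12": 1.21}

# Number of breast biopsies (0, 1, >=2), by age at counseling.
BIOPSY_RR = {
    "<50": {0: 1.00, 1: 1.70, 2: 2.88},
    ">=50": {0: 1.00, 1: 1.29, 2: 1.64},
}

# Age at first live birth, by number of affected first-degree relatives (0, 1, >=2).
BIRTH_RELATIVES_RR = {
    "<20": {0: 1.00, 1: 2.61, 2: 6.80},
    "20-24": {0: 1.24, 1: 2.68, 2: 5.78},
    "25-29 or nulliparous": {0: 1.55, 1: 2.76, 2: 4.91},
    ">=30": {0: 1.93, 1: 2.83, 2: 4.17},
}

HYPERPLASIA_RR = {
    "no biopsies": 1.00,
    "none found": 0.93,
    "status unknown": 1.00,
    "found": 1.82,
}


def composite_relative_risk(
    menarche: str,
    counseling_age: str,
    n_biopsies: int,
    first_birth: str,
    n_relatives: int,
    hyperplasia: str,
) -> float:
    """Multiply the four component relative risks; counts above 2 use the '>=2' row."""
    return (
        MENARCHE_RR[menarche]
        * BIOPSY_RR[counseling_age][min(n_biopsies, 2)]
        * BIRTH_RELATIVES_RR[first_birth][min(n_relatives, 2)]
        * HYPERPLASIA_RR[hyperplasia]
    )


# The 42-year-old nulliparous woman from the worked example: menarche at 12,
# one benign biopsy without atypical hyperplasia, no affected relatives.
rr = composite_relative_risk("12-13", "<50", 1, "25-29 or nulliparous", 0, "none found")
print(f"composite RR = {rr:.2f}")  # 2.70
```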
Age, y | Baseline 5-y risk, % |
---|---|
20–24 | 0.003 |
25–29 | 0.022 |
30–34 | 0.077 |
35–39 | 0.191 |
40–44 | 0.366 |
45–49 | 0.540 |
50–54 | 0.640 |
55–59 | 0.788 |
60–64 | 0.969 |
65–69 | 1.135 |
70–74 | 1.209 |
75–79 | 1.285 |
80–84 | 1.280 |
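As a rough illustration of how the relative risk table and the baseline risk table above could be combined, the following sketch multiplies category-specific relative risks and scales a baseline 5-year risk. It rests on two assumptions that are not stated in the tables themselves: that the relative risks combine multiplicatively, and that for small risks the absolute risk is approximately the baseline risk times the combined relative risk. It also ignores competing causes of death, so it should not be read as a full absolute risk calculation. The function name and the example woman are hypothetical.

```python
# Hypothetical sketch: combine tabulated relative risks with a baseline 5-year risk.
# Assumes multiplicative relative risks and ignores competing causes of death.

BASELINE_5Y_RISK_PCT = {  # selected values copied from the baseline-risk table
    "40-44": 0.366,
    "45-49": 0.540,
    "50-54": 0.640,
}


def approx_five_year_risk_pct(age_group, relative_risks):
    """Approximate 5-year risk (%) as baseline risk times the product of RRs."""
    combined_rr = 1.0
    for rr in relative_risks:
        combined_rr *= rr
    return BASELINE_5Y_RISK_PCT[age_group] * combined_rr


# Example woman (hypothetical): aged 40-44, menarche at 12-13 y (RR 1.10),
# one biopsy before age 50 (RR 1.70), first live birth at 20-24 y with one
# affected first-degree relative (RR 2.68), biopsy without atypical
# hyperplasia (RR 0.93).
example_risk = approx_five_year_risk_pct("40-44", [1.10, 1.70, 2.68, 0.93])
print(f"Approximate 5-year risk: {example_risk:.2f}%")
```

The multiplication makes it easy to see how individual factors with modest relative risks can still add up to a several-fold increase over the baseline risk.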
Several participants highlighted the need for investigators to understand and consider the fundamental meaning of cancer predictability when developing risk models. The term risk can be thought of as the inherent risk among healthy individuals of developing cancer at some time in the future. A different way to view risk involves detecting a cancer in an individual at an early pre-neoplastic stage, which puts the individual at higher risk of continued cancer development. These two types of risk are often confused in the literature, and it is important to distinguish between them.
Design Issues in Developing Risk Prediction Models
Dr. Mitchell Gail (NCI) spoke about different study designs that could be used to develop and evaluate models of absolute risk. Cohort studies allow one to obtain baseline hazard rates of incidence, hazard of mortality from competing risks, and relative risk estimates. However, cohort studies often focus on special populations, lack covariate data, require long follow-up times, and collect only imprecise data on competing causes of death. Sampling from a cohort to estimate relative risks and cumulative hazards with case–cohort or nested case–control designs can compensate for some of these limitations.
Another strategy for developing risk prediction models is to combine case–control data with national registry data. This strategy can provide detailed information on covariates in a relatively short time. Several of these case–control studies can be combined to obtain a relative risk model. Drawbacks of this approach are the potential recall bias from the case–control study and the lack of national registry data for many non-cancer diseases.
Absolute risk associated with a mutation in a genetic susceptibility gene is commonly calculated by use of pedigrees of families with many affected members. Geneticists often correct for ascertainment by controlling for the family phenotypes or disease history. Dr. Gail commented on reasons why ascertainment correction may be suspect and noted that it was difficult to obtain accurate information on covariates from all members of a pedigree.
Incorporating Conceptual Issues in Risk Prediction Into Models
Dr. Colin Begg (Memorial Sloan-Kettering Cancer Center) used Lorenz curves to demonstrate the extent to which the inherently stochastic aspects of carcinogenesis limit the ability to predict a future breast cancer in healthy individuals and the extent to which unknown risk factors might improve upon the predictive accuracy of the Gail model. In contrast, there is no theoretical limit to the accuracy of identifying an existing pre-neoplastic lesion or early cancer. He explained that the more predictable the risk, the greater the rationale for focusing prevention strategies on high-risk individuals; broad population-based strategies are more appropriate for less predictable risks. He concluded by noting that, as new risk factors were identified, investigators were unlikely to be able to rely on single, large data sources to devise improved risk prediction models. Information will need to be assembled from different sources, and validation will be an especially pivotal concern for these models.
Incorporating Risk Factor Changes Over Time Into Models
Dr. Bernard Rosner (Harvard Medical School) stressed that breast cancer was a complex disease with multiple risk factors and that the nature of the risk factors and magnitude of their effect changed over time. For example, as a woman ages, her body mass index may increase, decrease, or remain stable. Breast cancer risks may be different in each of these cases, depending on the age of the woman. However, virtually all risk prediction models for breast cancer assume that it is a homogeneous disease, even though evidence is accumulating that risk profiles for breast cancer may vary according to both estrogen receptor and progesterone receptor status for some, but not all, risk factors. He discussed the need for different risk models for breast cancer–specific subtypes, noting that each of these cancer subtypes required different treatment decisions. Evaluating subtypes also may improve the discriminatory power of risk prediction models.
Incorporating Epidemiologic With Genetic Factors in Risk Model Development
Dr. Timothy Rebbeck (University of Pennsylvania) discussed characteristics of BRCA1 and/or BRCA2 mutation carriers and their families (e.g., age at diagnosis, cancer occurrence, tumor site, and prognosis) that may contribute to the heterogeneity of the disease, and he described what predictors may be required for personalized cancer risk assessment for these women. Factors such as smoking, reproductive history, other genotypes unlinked to BRCA gene status, and interactions among these factors may modify cancer risk. However, currently available data are insufficient to be useful in clinical risk prediction. Studies attempting to identify these factors face a number of methodologic issues, including the choice of an appropriate sampling design.
Issues in Evaluating and Validating Cancer Risk Prediction Models
The most important characteristics of risk model performance are calibration, discrimination, and accuracy. Calibration (or reliability) assesses the ability of a model to predict the number of events in subgroups of the population. Calibration is most commonly evaluated by use of the goodness-of-fit or chi-square statistic, which compares the observed number of events with the expected number of events. Good calibration is important in all models, particularly in those used to estimate population disease burden and to plan population-level interventions. Recalibration of a model can be performed when risk is systematically overestimated or underestimated.
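The comparison of observed with expected events can be sketched in a few lines. The example below computes an expected-to-observed ratio and a simple Pearson-type chi-square statistic over risk subgroups; the subgroup counts are hypothetical, and the statistic shown is one of several possible goodness-of-fit measures rather than the specific test used by any model discussed at the workshop.

```python
# Sketch of a calibration check: expected/observed ratio and a
# Pearson-type chi-square statistic summed across risk subgroups.


def calibration_summary(groups):
    """groups: list of (n_subjects, mean_predicted_risk, observed_events)."""
    total_expected = 0.0
    total_observed = 0
    chi_square = 0.0
    for n_subjects, mean_risk, observed in groups:
        expected = n_subjects * mean_risk
        total_expected += expected
        total_observed += observed
        # Contribution of this subgroup: (O - E)^2 / E
        chi_square += (observed - expected) ** 2 / expected
    return total_expected / total_observed, chi_square


# Hypothetical subgroups: (size, mean predicted risk, observed events)
groups = [(1000, 0.010, 12), (1000, 0.020, 18), (1000, 0.040, 45)]
eo_ratio, chi_square = calibration_summary(groups)
print(f"E/O ratio: {eo_ratio:.3f}, chi-square: {chi_square:.3f}")
```

An expected-to-observed ratio below 1 indicates that the model underestimates the number of events overall, which is the kind of systematic error that recalibration is meant to address.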
Discrimination measures a model's ability to distinguish at the individual level between those who will develop disease and those who will not develop disease. Discrimination is commonly quantified by calculating the concordance statistic, which corresponds to the area under a receiver operating characteristic curve. Good discrimination in a model is important for decisions made at the individual level (i.e., clinical decision-making and screening).
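A minimal sketch of the concordance statistic for a binary outcome follows. It counts, over all pairs of one case and one non-case, how often the case received the higher predicted risk, with ties counted as one half. The predicted risks and outcomes are hypothetical.

```python
# Sketch of the concordance statistic (area under the ROC curve)
# computed directly from case/non-case pairs.


def concordance_statistic(risks, outcomes):
    """Fraction of case/non-case pairs in which the case has the higher
    predicted risk; tied predictions count as one half."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    noncases = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not noncases:
        raise ValueError("need at least one case and one non-case")
    score = 0.0
    for risk_case in cases:
        for risk_noncase in noncases:
            if risk_case > risk_noncase:
                score += 1.0
            elif risk_case == risk_noncase:
                score += 0.5
    return score / (len(cases) * len(noncases))


# Hypothetical predicted risks and outcomes (1 = developed disease)
risks = [0.02, 0.05, 0.03, 0.10, 0.03]
outcomes = [0, 1, 0, 1, 1]
c_stat = concordance_statistic(risks, outcomes)
print(f"Concordance statistic: {c_stat:.3f}")
```

A value of 0.5 corresponds to predictions no better than chance, and 1.0 to perfect separation of cases from non-cases.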
Accuracy scores—including positive and negative predictive values—can be used to evaluate how well a model categorizes specific individuals. This type of measure can be especially helpful in evaluating models used for clinical decision-making. However, even with good sensitivity and specificity, the positive predictive value may be low, especially for rare diseases.
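The effect of disease rarity on the positive predictive value follows directly from Bayes' theorem. The sketch below uses hypothetical values (90% sensitivity, 90% specificity, 1% prevalence) to show how a test with good sensitivity and specificity can still yield a low positive predictive value.

```python
# Sketch: positive and negative predictive values from sensitivity,
# specificity, and prevalence (Bayes' theorem). Inputs are hypothetical.


def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) as proportions."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


ppv, npv = predictive_values(sensitivity=0.90, specificity=0.90, prevalence=0.01)
print(f"PPV: {ppv:.3f}, NPV: {npv:.4f}")
```

With these inputs, fewer than one in ten individuals classified as positive would actually develop the disease, because false positives from the large disease-free group outnumber true positives.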
Considerations of Evaluation of Risk Models by Application
Dr. Ruth Pfeiffer (NCI) discussed general criteria for assessing the performance of risk prediction models, and she proposed that criteria that were based on a specific loss function be used for screening and intervention applications. Using these particular criteria can help investigators evaluate the beneficial as well as adverse effects of these applications. She found that for some applications, such as screening, discriminatory power was much more important than for other applications, such as preventive interventions. She pointed out that the usefulness of general criteria, such as concordance, depended on the application and that the use of specific loss functions could lead to more appropriate model assessments.
Validation and Use of Risk Models
Dr. Dan McGee (Florida State University) emphasized that a model should be validated on the basis of its use. These uses include creating clinical risk groups for stratification, informing patients and their families about the state of the patient's health, and helping patients make treatment and other decisions. Model “usefulness” is determined by how well a model works in practice, so that a model should be validated for its specific use.
Comparing New and Established Risk Models
Dr. Michael Kattan (Memorial Sloan-Kettering Cancer Center) discussed the need to determine whether a new model being developed was “better” at risk prediction than were clinically established models. He explained that a metric was required to compare the predictive accuracy of existing models with that of new models and listed several desirable characteristics of such an error measure. The measure should be understandable and interpretable to persuade physicians that the new model is superior to current risk prediction methods. It also should be amenable to improvement, be parameter free, and be unaffected by censoring. As an error measure, he favors the concordance index, defined as the probability that, given two randomly drawn patients, the patient for whom treatment fails first had a higher probability of treatment failure.
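The concordance index for censored follow-up data can be sketched as follows. This is a Harrell-type estimate, chosen here as an assumption because the text does not name a specific estimator: a pair of patients is usable only when the patient with the shorter follow-up time had an observed failure, and the pair is concordant when that patient also had the higher predicted risk. The follow-up times, event indicators, and predicted risks are hypothetical.

```python
# Sketch of a Harrell-type concordance index for right-censored data.


def concordance_index(times, events, predicted_risk):
    """Proportion of usable pairs in which the patient who failed first
    had the higher predicted risk; tied predictions count one half.

    times: follow-up times; events: 1 = failure observed, 0 = censored.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Pair is usable only if patient i failed before time j.
            if events[i] == 1 and times[i] < times[j]:
                usable += 1
                if predicted_risk[i] > predicted_risk[j]:
                    concordant += 1.0
                elif predicted_risk[i] == predicted_risk[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Hypothetical data for four patients
times = [2, 4, 5, 7]
events = [1, 0, 1, 0]
predicted_risk = [0.8, 0.3, 0.5, 0.6]
c_index = concordance_index(times, events, predicted_risk)
print(f"Concordance index: {c_index:.2f}")
```

Pairs in which the earlier time is censored are skipped, because it is unknown which patient would have failed first.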
Statistical Measures for Determining Risk Model Error
Dr. Martin Schumacher (Freiburg University, Freiburg, Germany) focused on the Brier score (the mean-squared error of prediction when predictions are made in terms of event or event-free probabilities). The Brier score can be adapted for competing risks and for updating dynamic predictions. He showed how to estimate the Brier score nonparametrically for survival outcomes in the presence of right censoring by a weighted residual sum of squares. These prediction error methods are valuable for detecting over-fitting, and they yield R² measures (i.e., measures of explained variation) for checking the explanatory power of prediction models. He illustrated the method for survival outcomes by applying it to data from the German Breast Cancer Study Group (54). In addition, he used published aggregate data to show that the prediction error of current breast cancer prediction models was only marginally smaller than that of a constant prediction that ignores the information on risk factors.
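A minimal sketch of the Brier score and the corresponding explained-variation measure is given below. It covers only the simplest case of fully observed binary outcomes; the weighting needed to handle right censoring is deliberately omitted. The outcomes and predictions are hypothetical, and the reference model is a constant prediction equal to the overall event rate.

```python
# Sketch: Brier score for binary outcomes and an R^2-type measure
# relative to a constant prediction. Censoring is not handled here.


def brier_score(predicted, outcomes):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(predicted, outcomes)) / len(outcomes)


# Hypothetical outcomes (1 = event) and model predictions
outcomes = [0, 0, 0, 1, 0, 1, 0, 0, 0, 0]
model_pred = [0.1, 0.1, 0.2, 0.4, 0.1, 0.5, 0.2, 0.1, 0.2, 0.1]

# Constant prediction that ignores risk factors: the overall event rate
event_rate = sum(outcomes) / len(outcomes)
constant_pred = [event_rate] * len(outcomes)

bs_model = brier_score(model_pred, outcomes)
bs_constant = brier_score(constant_pred, outcomes)
r_squared = 1 - bs_model / bs_constant
print(f"Brier (model): {bs_model:.3f}, Brier (constant): {bs_constant:.3f}, "
      f"R^2: {r_squared:.4f}")
```

An R² value near zero would indicate that the model's prediction error is barely smaller than that of the constant prediction.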
Implications for Policy and Prevention
Workshop participants repeatedly discussed the use of cancer risk prediction models for “high-risk” versus “population” approaches to cancer prevention. In her presentation, Dr. Beverly Rockhill (University of North Carolina at Chapel Hill) explained that, unless the relative risks for certain risk factors are high (i.e., 20 or more), the probability that a person with a certain risk factor profile will develop even a relatively common cancer (positive predictive value) was low because the lifetime risk of developing common cancers was low (e.g., 12% for breast cancer and 5% for colorectal cancer). Therefore, most individuals will remain cancer free over a considerable period of time; most cancers will arise among individuals from the population with close to an average individual risk. She concluded that, until it was possible to use models with high discriminatory power to accurately identify small groups of individuals who will develop a disease, a population prevention strategy of reducing risk factor prevalence in the whole population would yield maximum benefits. The alternative strategy of targeting high-risk individuals on the basis of a specific risk factor profile could miss a substantial number of individuals with disease.
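The arithmetic behind this argument can be sketched with a single binary risk factor. Given the prevalence of the factor and its relative risk, the share of all cases that arise in the exposed group is prevalence × RR divided by (prevalence × RR + 1 − prevalence). The prevalence and relative risk values below are hypothetical.

```python
# Sketch: share of all cases that arise in a high-risk group defined by
# one binary risk factor. Input values are hypothetical.


def share_of_cases_in_high_risk_group(prevalence, relative_risk):
    """Proportion of cases occurring among individuals with the risk factor."""
    exposed_weight = prevalence * relative_risk
    unexposed_weight = 1 - prevalence
    return exposed_weight / (exposed_weight + unexposed_weight)


# A risk factor carried by 5% of the population with a relative risk of 3
share = share_of_cases_in_high_risk_group(prevalence=0.05, relative_risk=3.0)
print(f"Share of cases in the high-risk group: {share:.1%}")
```

Under these assumptions, a strategy aimed only at the high-risk group would leave most cases, which occur in the much larger unexposed group, untouched.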
The importance of this issue was highlighted by many participants, who commented during the course of the workshop that, despite current knowledge of risk and protective factors for chronic disease as well as for certain cancers (e.g., smoking cessation, diet, exercise, and screening), a large proportion of the population did not adhere to current cancer control recommendations. Some participants noted that more intensive approaches to cancer prevention may be appropriate in certain circumstances, such as for individuals who are at high risk of cancer. Moreover, certain interventions (e.g., tamoxifen for breast cancer prevention) do not lend themselves to population-level strategies and should be reserved for selected groups. Many workshop participants thought that both high-risk individual and population-level approaches were needed to achieve cancer prevention goals.
Highlights of Selected Absolute and Genetic Susceptibility Risk Assessment Models
Several speakers described established or newly developed risk prediction models during their presentations. These models illustrate many of the issues discussed in the presentations.
The Experience of Coronary Heart Disease Risk Prediction Models
Dr. Lisa Sullivan (Boston University Statistics and Consulting Unit, Framingham Heart Study) outlined the experience of cardiovascular risk prediction, a field that is more developed than cancer risk prediction, and discussed the wide clinical use of these risk prediction models. She explained that the Framingham Study used readily available risk factors, including blood pressure, serum lipids, age, sex, smoking status, and diabetes status, and had developed several models by use of different risk populations and outcomes. Although the Framingham Study primarily follows Caucasian middle-class participants, study investigators recalibrated its coronary heart disease risk prediction model, and this recalibrated model is now valid in other ethnic cohorts and has reasonably good discriminatory power (55).
The Framingham models can be used by physicians and their patients to make decisions about lifestyle changes or pharmacologic interventions that may reduce coronary heart disease risk and to assess changes in risk over time (48). This risk assessment system has been translated into a simple scoring system that assigns integer points to each risk factor category. This scoring system allows physicians and patients to easily compute risk estimates and to then match the nature and intensity of treatment (e.g., lowering cholesterol) to the absolute risk of coronary heart disease. The Framingham models do not generally discriminate between coronary heart disease and coronary death and do not correct for deaths from competing diseases. Framingham investigators are currently adding confidence intervals and risk factors (such as nutrition, exercise, and family history) to the model.
Breast Cancer Risk Assessment Model Incorporating Genetic and Epidemiologic Factors
Dr. Jack Cuzick (Imperial Cancer Research Fund) explained that, because no single data set combined hormonal, reproductive, and genetic risk characteristics, his research team had used various published data sets to develop one comprehensive breast cancer risk prediction model (22). The model first calculates the likelihood of a woman carrying mutations in the highly penetrant genetic susceptibility genes BRCA1 and BRCA2, as well as hypothetical low-penetrance genes that are so far unidentified. Logistic regression analysis is then performed to modify risk for environmental, reproductive, and hormonal factors, such as weight, height, ages at menopause and menarche, parity, and hormone use. These factors are incorporated into the model to estimate a woman's individual absolute risk of developing breast cancer over a 10-year period, as well as over her lifetime. Work is under way to incorporate mammographic density. Estrogen receptor status, which has implications for use of chemopreventive agents such as tamoxifen, is another important future component of the model, although the factors that specifically affect estrogen receptor-positive breast cancer risk are still unclear.
Breast and Ovarian Cancer Genetic Susceptibility Risk Model
Dr. Antonis Antoniou (Cambridge University) discussed the latest developments of the Breast and Ovarian Analysis of Disease Incidence and Carrier Estimation (BOADICEA) susceptibility model (29, 56). This model was initially developed by use of segregation analysis of a population-based series of 1484 breast cancer patients and 156 multiple-case families from the United Kingdom. The model has now been updated with additional data from two UK population–based studies of breast cancer (57, 58) and family data from the BRCA1 and/or BRCA2 carriers identified in 22 population-based studies of breast and ovarian cancer (59). The combined data set included 2785 families, of which 301 segregated BRCA1 mutations and 236 segregated BRCA2 mutations. According to the model, susceptibility to breast cancer is explained by mutations in BRCA1 and BRCA2 plus a polygenic effect. The latest version includes a birth cohort effect on cancer risks and smoothed cancer incidence rates.
Dr. Antoniou noted that this model was intended to serve as the basis of a risk assessment package that could predict gene carrier status and absolute breast cancer risk. Future work will focus on incorporating new genetic and environmental factors, and validating the model in external data sets.
Hereditary Nonpolyposis Colorectal Cancer (HNPCC) Genetic Susceptibility Risk Model
Dr. Chris Amos (M.D. Anderson Cancer Center) described the genetic basis and epidemiology of HNPCC and the difficulty of studying this syndrome because it encompassed a broad spectrum of different cancers in addition to colorectal cancer. He discussed the substantial variation in the age of onset of colorectal cancer in HNPCC families and noted that this variation could not be explained by the genotype alone, suggesting that additional unknown genetic and environmental factors must be involved in determining the age of onset. He stressed the importance of, and the need for, developing genetic susceptibility risk models for HNPCC.
Dr. Giovanni Parmigiani (Johns Hopkins University) described how family history could be used to develop cancer genetic susceptibility risk models for high-risk individuals. These models can provide valuable information about the presence of a mutation in a breast cancer susceptibility gene and can be used in counseling patients and estimating the likelihood of developing cancer. He also discussed empirical and Mendelian modeling. The former evaluates the probability of a positive genetic test and determines associations between genetic testing results and features of family history. The latter evaluates the probability of a deleterious mutation at a cancer susceptibility gene and derives carrier probabilities from genetic parameters. He also described the open-source Bayes–Mendel software used in his group for Mendelian risk prediction.
In an extension of their work on breast cancer susceptibility, Dr. Parmigiani noted that he and his colleagues were currently developing a statistical model with associated software named CRCAPRO. The model uses family history of colorectal and endometrial cancer to assess the probability that an individual carries a mutation of the MLH1 and MSH2 genes (60). Like the breast cancer model, CRCAPRO uses a Mendelian approach that assumes autosomal dominant inheritance. Age-dependent penetrance and prevalences are based on a systematic review of the literature.
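The core idea of deriving a carrier probability from genetic parameters can be illustrated for a single individual. The sketch below is a deliberate simplification and is not the Bayes–Mendel or CRCAPRO implementation: it ignores the pedigree entirely and applies Bayes' theorem to one affected person, using a prior from the mutant allele frequency under autosomal dominant inheritance and the penetrance in carriers and non-carriers. All numeric values and function names are hypothetical.

```python
# Simplified, hypothetical sketch of a Mendelian carrier probability for a
# single affected individual (no pedigree information is used).


def carrier_prior(allele_frequency):
    """Probability of carrying at least one mutant allele, assuming
    Hardy-Weinberg proportions."""
    return 1 - (1 - allele_frequency) ** 2


def posterior_carrier_probability(prior, penetrance_carrier, penetrance_noncarrier):
    """Bayes' theorem: P(carrier | affected by a given age)."""
    carrier_and_affected = prior * penetrance_carrier
    noncarrier_and_affected = (1 - prior) * penetrance_noncarrier
    return carrier_and_affected / (carrier_and_affected + noncarrier_and_affected)


# Hypothetical inputs: allele frequency 0.001; probability of disease by a
# given age of 0.40 in carriers and 0.02 in non-carriers.
prior = carrier_prior(0.001)
posterior = posterior_carrier_probability(prior, 0.40, 0.02)
print(f"Prior: {prior:.6f}, posterior: {posterior:.4f}")
```

A full Mendelian model would extend this calculation across all relatives, using the inheritance pattern to link their genotypes and age-dependent penetrance to weigh each relative's cancer history.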
Future Research
Workshop participants were divided into four breakout sessions to identify research issues, gaps, priorities, and resources needed to advance the field of cancer risk prediction and to make specific recommendations for implementation. These recommendations fell into six broad areas:
Revise existing breast cancer risk assessment models and develop new models to improve predictive power
Encourage the development of new types of risk models
Obtain data to develop more accurate risk models
Support mechanisms and resources to validate risk models
Strengthen model development efforts and encourage coordination within large research and clinical centers
Promote effective cancer risk communication and decision-making
Details of these recommendations are summarized in Table 6.
Revise existing breast cancer risk assessment models and develop new models to improve predictive power |
Develop new breast cancer absolute and genetic susceptibility risk models that incorporate modifiable risk factors (e.g., alcohol, obesity, breast density), age-dependent and temporal exposures (e.g., body mass index), subtypes of breast cancer (e.g., estrogen receptor–negative, estrogen receptor–positive, HER2-positive), biologic markers of risk (e.g., mammographic density, atypia, ductal carcinoma in situ), and somatic and inherited biomarkers (e.g., single-nucleotide polymorphisms, proteomics). Incorporating these factors will allow models to more accurately estimate risk, predict effectiveness of chemopreventive agents or lifestyle changes, and provide intermediate markers of the effectiveness of interventions. |
Encourage the development of new types of risk models |
Develop new risk models to stratify risk of cancers other than breast cancer. Current likely candidates for tailoring screening and surveillance efforts and planning chemoprevention trials include colorectal, lung, melanoma, esophageal, bladder, pancreatic, and prostate cancer. |
Develop risk models with multiple cancer and non-cancer outcomes to enhance benefit–risk indices for various interventions and in decision-making. |
Create simple user-friendly models for primary care providers to facilitate the referral of high-risk subjects. |
Extend existing models by developing them with data sources that include diverse racial and ethnic groups and representation of a broad range of factors that influence risk, such as age, income, and geographic region. |
Obtain data to develop more accurate risk models |
Expand collection of high-quality data on relative and attributable risks for cancer in various racial and ethnic groups to develop accurate risk prediction models in these populations. |
When developing cancer risk prediction models and benefit–risk indices, obtain accurate data on baseline rates for non-cancer events from diverse representative populations so as to understand competing diseases and how prevention intervention may affect these diseases. |
Support mechanisms and resources to validate risk models |
Develop innovative new statistical methods for validating and evaluating absolute and genetic susceptibility risk prediction models for various applications. |
Develop general criteria for appropriate validation and evaluation of risk models. |
Ensure the availability of population-based biospecimens so that genetic profiles can be validated in large population studies. |
Strengthen model development efforts and encourage coordination within large research and clinical centers |
Integrate and strengthen programs dedicated to risk assessment, prevention, and screening. |
Encourage collaboration by large research centers to combine different components of risk models, including data on epidemiology, screening and imaging, serum/blood biomarkers, serum banking, and genetic polymorphisms. Screening and intervention programs are valuable sources of data and should be used to refine risk models (e.g., serial serum measurements) and to provide risk assessment evaluation and information on risk modification. |
Promote effective cancer risk communication and decision-making |
Develop and evaluate communication tools for cancer risk assessment and shared decision-making by clinicians and patients. |
Given the limited time available for patient education in physician-patient interactions, give greater attention to how clinicians communicate risk and patients make decisions, how risk models affect those decisions, how those decisions affect patient behavior, and how cancer risk information can be effectively communicated outside the doctor-patient relationship. |
Incorporate patient preferences, utilities, and other critical individual factors into efforts to build models for use in clinical decision-making. |
Additional information on the workshop's agenda, breakout sessions, oral and poster presentations, and participant list is available at http://www.cancermeetings.org/RiskPrediction/index.cfm .
We thank the presenters and moderators for their time and efforts on behalf of the workshop and for their helpful comments and suggestions in preparing the workshop summary. We thank the NCI's Division of Cancer Control and Population Sciences, Division of Cancer Epidemiology and Genetics, Office of Women's Health, and the Department of Health and Human Services, Office of Women's Health, for providing the funding and support for this workshop. We also thank Ms. Anne Rodgers for her editorial assistance, as well as Mary Jane Kissel of Nova Research Company and Geoffery Tobias of NCI for providing logistical support for the workshop.