Introduction

Mantle cell lymphoma (MCL) is a distinct B cell non-Hodgkin lymphoma with a poor clinical long-term outcome [1]. The currently used intensive treatment protocols including anti-CD20 antibody and autologous stem cell transplantation increased the response rate and remission duration but have failed to improve long-term overall survival so far [12]. The poor clinical outcome and the high number of relapses have stimulated research for alternative chemotherapeutic approaches for patients after a relapse and for high-risk patients as an alternative frontline therapy [12].

MCL is characterized by the translocation t(11;14)(q13;q32), which juxtaposes the immunoglobulin heavy-chain gene to the gene for cyclin D1 [8]. As a consequence of the translocation, cases of MCL show constitutive upregulation of cyclin D1 expression. Since cyclin D1 promotes the transition from the G1 to S phase of the cell cycle, MCLs are characterized by cell cycle deregulation [8]. The cell proliferation rate, as measured by the Ki-67 index (the percentage of Ki-67-positive tumor cells), is correlated with the level of cyclin D1 expression and other cell cycle regulators [16].

The Ki-67 index has been confirmed as a very powerful single prognostic factor for overall survival, with highly proliferative cases showing a much poorer outcome than tumors with low proliferation [21]. Recently, we demonstrated that the Ki-67 index retains its prognostic relevance in randomized prospective trials employing immunochemotherapy with the anti-CD20 antibody rituximab [6]. Moreover, using the Ki-67 index in combination with clinical parameters such as age, performance status, lactate dehydrogenase, and leukocyte count (Mantle Cell Lymphoma International Prognostic Index), we were able to design a combined clinicobiological prognostic index for MCL [9]. Defining clinical risk scores might help to design patient-specific and lymphoma-specific therapy.

So far, in most of the published studies of other tumors, the stainings for Ki-67 were evaluated in a single center or by very few observers, thus avoiding the problem of interobserver variability. However, in order to allow the application of the Ki-67 index in clinical routine management of MCL with cut-off points defining low-, intermediate-, or high-risk groups [5, 6] or incorporation of the Ki-67 index as a continuous variable in a prognostic index [9], it is of the highest importance to unify methods and establish guidelines for Ki-67 assessment. For this purpose, the pathology panel of the European MCL Network compared different methods of assessing the Ki-67 index on samples from patients with advanced-stage MCL treated within the randomized trials of European MCL Network and German Low Grade Lymphoma Study Group (GLSG). We provide recommendations and guidelines for the future use of the Ki-67 index in MCL.

Materials and methods

Lymphoma samples

Thirty-two primary diagnostic lymph node biopsy specimens from different patients with advanced-stage MCL treated within the clinical trials of the European MCL Network and GLSG [7, 13, 14] were randomly chosen. The stainings for Ki-67 were performed in different laboratories using diverse techniques and antibody clones, namely, Berlin (n = 10, Mib-1 antibody, alkaline phosphatase–antialkaline phosphatase method; Dako, Glostrup, Denmark), Kiel (n = 7, Ki-S5 antibody, alkaline phosphatase–antialkaline phosphatase method, home-brewed reagents), Lübeck (n = 2, Mib-1 antibody, Envision method; Dako, Glostrup, Denmark), and Würzburg (n = 13, Mib-1 antibody, Envision method; Dako, Glostrup, Denmark). Only slides which showed unambiguous strong nuclear staining were analyzed. Therefore, three cases had to be excluded, resulting in 29 evaluable cases.

Ki-67 index

The Ki-67 index was defined as the percentage of Ki-67-positive tumor cells in representative areas (see below) of the lymphoma and was evaluated by counting, eyeballing, and digital image analysis.

Counting

Counting was performed by one observer (O.D.). To count the number of Ki-67-positive cells, two representative areas were chosen. A representative area was defined as one that does not contain residual germinal centers, hot spots of proliferation, or proliferating T cells. Hot spots of proliferation are areas of tumor cells (not germinal center residues) of less than two high-power fields in size (HPF, field of vision at ×400 magnification) that show higher proliferation than the rest of the tumor. Usually, hot spots are already visible at low-power magnification. In each area, the positive cells among 500 cells were counted using an eyepiece with a grid at ×400 magnification. The Ki-67 index was calculated as the percentage of positive cells by averaging the values obtained for the two areas (count–Ki-67 index).
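The arithmetic behind the count–Ki-67 index can be written out as a minimal sketch. The function name, parameter names, and example counts below are hypothetical and serve only to illustrate the averaging over the two areas; they are not data from this study.

```python
def count_ki67_index(positive_counts, cells_per_area=500):
    """Mean percentage of Ki-67-positive cells over the counted areas.

    positive_counts: number of positive cells found in each representative
    area (hypothetical input format); cells_per_area: cells counted per area.
    """
    if not positive_counts:
        raise ValueError("at least one counted area is required")
    percentages = []
    for positive in positive_counts:
        if not 0 <= positive <= cells_per_area:
            raise ValueError("positive count must lie between 0 and cells_per_area")
        percentages.append(100.0 * positive / cells_per_area)
    # Average the per-area percentages, as described for the two areas.
    return sum(percentages) / len(percentages)


# Hypothetical example: 60 and 90 positive cells among 500 cells each.
example_index = count_ki67_index([60, 90])
```

With an equal number of cells per area, averaging the two percentages gives the same result as pooling all counted cells.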

Recently, we demonstrated that the Ki-67 index assessed by counting 2 × 500 cells in representative areas is a reliable method to predict overall survival of MCL based on patients treated within prospective randomized trials of the European MCL Network [6]. Since this method has proven its clinical significance, the Ki-67 index obtained by counting (count–Ki-67 index) of 2 × 500 cells was considered the gold standard.

In order to reduce the number of counted cells in a second independent experiment, ten cases were randomly selected from the series. These cases were evaluated by four observers (OD, IO, WK, and HHW) without knowledge of the counted value. All observers counted the Ki-67-positive cells among 100 lymphoma cells in five consecutive HPF that had been selected as representative by each observer. The values for the first to the fifth count of 100 cells were registered separately.
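Because the five counts of 100 cells were registered separately, they can be pooled stepwise into indices for 1 × 100 up to 5 × 100 cells. The following is a minimal sketch of that pooling; the function name and the example counts are hypothetical and do not come from the study data.

```python
def cumulative_ki67_indices(positives_per_field, cells_per_field=100):
    """Ki-67 index (%) after pooling the first 1, 2, ..., k counted fields.

    positives_per_field: positive cells counted in each consecutive HPF
    (hypothetical input format); cells_per_field: cells counted per HPF.
    """
    indices = []
    total_positive = 0
    for k, positive in enumerate(positives_per_field, start=1):
        if not 0 <= positive <= cells_per_field:
            raise ValueError("positive count must lie between 0 and cells_per_field")
        total_positive += positive
        # Pooled percentage over the first k fields.
        indices.append(100.0 * total_positive / (k * cells_per_field))
    return indices


# Hypothetical counts of positive cells in five consecutive HPF.
example_indices = cumulative_ki67_indices([10, 20, 15, 15, 20])
```

Since every field contributes the same number of cells, the pooled percentage equals the average of the per-field percentages.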

Eyeballing

Eleven experienced hematopathologists evaluated all cases at an onsite pathology panel meeting. Each pathologist estimated the Ki-67 index independently in representative areas of the lymphoma chosen by that pathologist and was blinded to the results of the other investigators.

Digital image analysis

Pictures of representative areas were obtained by the same observers who performed the counting. The images were analyzed blind (without knowledge of the results of counting) by an independent observer using a KS400 system (Carl Zeiss, Jena, Germany). Automatic white balance was performed and, subsequently, RGB color thresholds were applied to identify positive and negative nuclei. Fixed threshold values were used. Two values were calculated: the percentage of positive nuclei and the percentage of the area covered by the positive nuclei.
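The difference between the two reported values can be illustrated with a minimal sketch. It assumes that the nuclei have already been segmented and classified as positive or negative, and it assumes that the percentage of area is expressed relative to the whole image; the denominator is not stated above, so this is an assumption. The function and parameter names are hypothetical, and the sketch does not reproduce the KS400 workflow.

```python
def image_ki67_measures(nuclei, image_area_px):
    """Return (percentage of positive nuclei, percentage of image area
    covered by positive nuclei).

    nuclei: list of (area_px, is_positive) tuples, one per segmented nucleus
    (hypothetical input format).
    image_area_px: total image area in pixels (assumed denominator).
    """
    if not nuclei or image_area_px <= 0:
        raise ValueError("need at least one nucleus and a positive image area")
    positive_count = sum(1 for _, is_positive in nuclei if is_positive)
    positive_area = sum(area for area, is_positive in nuclei if is_positive)
    pct_nuclei = 100.0 * positive_count / len(nuclei)
    pct_area = 100.0 * positive_area / image_area_px
    return pct_nuclei, pct_area


# Hypothetical image: four nuclei of 50 px each, two positive, in 1000 px.
example_measures = image_ki67_measures(
    [(50, True), (50, False), (50, False), (50, True)], image_area_px=1000
)
```

Under these assumptions, the two measures diverge whenever nuclei do not fill the image or positive and negative nuclei differ in size.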

Statistical methods

To quantify the degree of agreement between the quantitative Ki-67 values generated by counting, eyeballing, and digital image analysis, concordance correlation coefficients (CCC) were estimated (Lin 1989; Barnhart 2002). A CCC of 1 indicates complete agreement, whereas a CCC of 0 indicates no correlation. To obtain 95% confidence intervals, asymptotic confidence limits were estimated for comparisons of two methods and bootstrap confidence limits were estimated with 2,000 bootstrap samples for comparisons of more than two methods. The calculations used the SAS-macro CCC provided online by the Department of Biostatistics and Bioinformatics of the Duke University Medical Center, Durham, NC, USA. Quantitative values were described with median and range, and group comparisons for paired samples were performed using the nonparametric signed rank test. Single extreme values not representative for the range of values were excluded. The working significance level was 5%. All analyses were performed using SAS Version 9.1 (SAS Institute, Cary, NC, USA).
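For a comparison of two methods, the CCC of Lin (1989) can be computed as in the following minimal sketch. It is written in plain Python for illustration only and is not the SAS macro used in this study; the function name and the example values are hypothetical, and variances and covariance use the 1/n divisor.

```python
def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient for paired measurements."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # The squared mean difference penalizes a systematic shift between methods.
    denominator = var_x + var_y + (mean_x - mean_y) ** 2
    if denominator == 0:
        raise ValueError("CCC is undefined when both series are constant and equal")
    return 2.0 * cov_xy / denominator


# Hypothetical paired values: the second method reads one point higher.
shifted_ccc = concordance_correlation([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

In this hypothetical example the two series are perfectly correlated, yet the CCC is 4/7 (about 0.57) because of the constant offset. This is why the CCC, rather than a correlation coefficient alone, is suited to judging agreement between methods.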

Results

Gold standard

The Ki-67 index assessed by counting 2 × 500 cells showed a median of 14.7% (minimum 2.1%, maximum 91%) in the analyzed series. Two cases of highly proliferative lymphomas (Ki-67 index >80% by counting; see Fig. 1) had to be excluded from further analysis in order to avoid the detection of pseudocorrelations, resulting in a median of 14.6% and a maximum of 50.9%.

Fig. 1 Variability of Ki-67 eyeballing. Plot of Ki-67 values estimated by eyeballing by 11 experienced hematopathologists vs. Ki-67 values obtained via counting of 2 × 500 cells (gold standard). Each pathologist is represented by one color. In order to visualize identical estimation values corresponding to the same counting values, counting values are minimally scattered around the true values

Estimation by eyeballing

The Ki-67 index estimated by eyeballing by 11 experienced hematopathologists deviated considerably between individual observers (deviation up to 85%) and from the count–Ki-67 index (deviation up to 72%; Fig. 1). The concordance of the values obtained by eyeballing with those obtained by counting was poor (CCC between 0.29 and 0.61) and did not improve after averaging the values of all 11 pathologists (CCC = 0.53; Table 1). The concordance between all 11 individual pathologists was also poor (CCC = 0.56).

Table 1 Values for the 11 pathologists

Digital image analysis

Ki-67 values obtained by digital image analysis deviated largely from the values obtained by counting. Values as percentage of nuclei were 4.8% higher than the count–Ki-67 index (median, p = 0.034, signed rank test) and values as percentage of area were 8.5% lower than the count–Ki-67 index (median, p = 0.0011, signed rank test). The concordance of digital image analysis with the count–Ki-67 index was poor for both methods of analysis used (CCC 0.24 for percentage of nuclei and 0.37 for percentage of positive area; Table 2). All values as percentage of nuclei were higher than the percentage of area values (p < 0.0001, signed rank test) and concordance between percentage of nuclei and percentage of area was low (CCC = 0.46).

Table 2 Values of percentage of nuclei and percentage of area

Counting 100 cells per representative area

In order to evaluate the effect of reducing the number of counted cells on the interobserver concordance, 100 cells were counted in five consecutive HPF by four observers in ten cases. Again, to avoid the detection of pseudocorrelations, two cases with high Ki-67 values had to be excluded from analysis. As expected, the intraobserver concordance (CCC = 0.83–0.96) was higher than the interobserver concordance (CCC = 0.68) if 100 cells in one HPF were counted. The concordance between the observers increased with each additional 100 cells analyzed, with the strongest increase between 1 × 100 and 2 × 100 counted cells (CCC = 0.68 and 0.74, respectively; Table 3, Fig. 2) and only minor improvement with a further increase in the number of cells counted. The concordance with the gold standard of 2 × 500 counted cells increased on average with each additional 100 cells counted. The average CCC for 2 × 100 compared to the gold standard was 0.79. By counting more than 2 × 100 cells, a further relevant increase of the CCC with the gold standard could be achieved in only two of four observers (data not shown).

Table 3 Counting 100 cells per representative area

Fig. 2 Interobserver agreement for Ki-67 assessment by counting 1 × 100 to 5 × 100 cells and using the average. The concordance correlation coefficient (CCC) was estimated with four raters on eight samples

Discussion

Which specimens should be assessed for Ki-67?

Several studies have demonstrated the prognostic significance of the Ki-67 index in MCL [6, 9, 11, 17–19, 21]. All of these studies, including our own of the European MCL Network, evaluated the Ki-67 index in primary biopsy specimens. To the best of our knowledge, no studies analyzing the prognostic role of the Ki-67 index in relapsed disease have been published so far. Thus, to date, the use of the Ki-67 index as a prognostic marker has only been proven in specimens obtained at primary diagnosis. Of note, if tissues are fixed with Bouin’s solution, immunohistochemistry for Ki-67 is not possible. Therefore, fixation in buffered formalin solution is mandatory.

In our study, we used lymph node specimens to evaluate different methods of calculating the Ki-67 index. However, other studies have also included extranodal biopsy specimens. We generally recommend analysis of nodal and extranodal specimens but not bone marrow (see below) to evaluate the Ki-67 index. In any specimen, but especially in extranodal specimens, only samples with dense lymphoma areas should be analyzed because the Ki-67 index is based on the percentage of Ki-67-positive lymphoma cells, ignoring reactive background cells. Thus, minimally infiltrated biopsy specimens or specimens with a prominent reactive background should be excluded. The sample size should allow a minimum of five independent HPF (at ×400 magnification) to be selected for the counting. Thus, a considerable percentage of punch needle biopsy specimens is too small, especially since the selection of a representative area is not possible in some cases (see below).

All above-mentioned studies evaluated the Ki-67 index on histopathology slides. However, flow cytometry is increasingly being used for nuclear antigens in the diagnosis of lymphomas from blood and bone marrow aspirates and lymph node fine needle aspirates [2, 3]. The advantages of flow cytometry are the reliable standardization of the method and the possibility of restricting the measurement of the proliferative index to the tumor cells, excluding reactive T cells from the analysis. Since the majority of MCLs are leukemic [2], peripheral blood might represent an easily accessible source for Ki-67 analysis by flow cytometry. However, in chronic lymphocytic leukemia, Ki-67 values of the peripheral blood and lymph node compartments differ significantly, as the proliferative compartment probably resides predominantly in solid tissues [15]. Therefore, future studies have to evaluate whether Ki-67 staining of peripheral blood cells and bone marrow is appropriate and representative for the proliferation in solid tissue and whether flow cytometry of blood, bone marrow, or lymph node biopsies may substitute for the Ki-67 index obtained by histology on tissue sections. To date, the use of flow cytometry of blood, bone marrow, and lymph nodes cannot be regarded as a standard approach to assess the Ki-67 index in MCL.

How should the Ki-67 index be assessed in MCL?

The Ki-67 index varies between different MCL subtypes defined by cytology, with blastic and pleomorphic MCL showing the highest proliferation on average [21]. However, there is great variability in the Ki-67 index within the subgroup of classical MCL, with values reaching up to the range found in blastic or pleomorphic MCL [21]. Thus, the blastic or pleomorphic subtype of MCL should only be diagnosed on the basis of cytology, not of the Ki-67 index. On the other hand, in some cases of classical MCL, a high proliferation rate may also be expected.

Digital image analysis is increasingly being used to analyze immunohistochemical stainings in lymphoma research and has been successfully applied to assess the Ki-67 index in several solid tumors and lymphomas [4, 20, 22]. However, we did not find a convincing concordance of the digital image analysis with our gold standard, the count–Ki-67 index. MCLs are composed of relatively small cells with a narrow rim of cytoplasm [1, 10, 21]. Since the lymphoma cells are usually densely packed, currently available software for digital image analysis often fails to recognize single nuclei. Nevertheless, we believe that future development may prove digital image analysis to be a valuable tool for the quantification of immunohistochemistry. Moreover, it is theoretically possible that the Ki-67 values obtained by digital image analysis might correlate with the clinical outcome despite the poor concordance with the counted value. Therefore, image analysis on a larger set of lymphomas and a clinical correlation will be necessary to resolve this question.

As shown above, eyeballing and digital image analysis (with the method used in this study) did not show an acceptable concordance with the count–Ki-67 index and the interobserver correlation was poor. We cannot rule out completely that estimation of the Ki-67 index by eyeballing might still be of relevance in the future, since guidelines as reported herein and tools like software for training of estimation (Y. Krivolapov, personal communication) might improve the estimation skills of pathologists. Whether the staining method/color or the selection of the biopsy specimen influenced the estimation performance of the pathologist has to be determined in future studies. Although previous analysis showed that the mitotic count did not reach the predictive value of the Ki-67 index (data not shown), future studies comparing these two methods are also needed. To date, our results confirmed that quantitative count of lymphoma cells at a high magnification represents the method of choice to obtain the Ki-67 index and yields the best intraobserver and interobserver reproducibility.

Counting 2 × 500 cells is very time-consuming and will probably not be applicable in clinical routine. Thus, various methods for the determination of the Ki-67 index were evaluated and compared to define the optimal approach which provides reliable and reproducible results and can be used in multiobserver studies with a minimum of workload. Accordingly, we evaluated the effect of reducing the number of counted cells on the interobserver concordance. Increasing the number of counted cells improved both interobserver and intraobserver concordance, with the strongest increase in concordance between 1 × 100 counted cells and 2 × 100 counted cells. The values obtained by counting 2 × 100 cells show a high concordance with the gold standard. We thus recommend assessing the Ki-67 index in MCL by counting the percentage of positive cells among 200 lymphoma cells in two independent, representative lymphoma areas at a high magnification (HPF). This method allows a reliable assessment of the Ki-67 index in <2 min per case. The Ki-67 index in MCL can be used as a continuous parameter or with cut-off values to define risk groups [6, 9]. However, in any case, a reliable Ki-67 value provided by the pathologist will be necessary.

Since Ki-67-positive cells are not homogeneously distributed, the selection of representative lymphoma areas for the counting of the Ki-67 index is of crucial importance. Several areas of high proliferation should be excluded: (1) “hot spots” of proliferation (see the “Materials and methods” section for the definition), (2) T cells in the periphery of lymphoma nodules (Fig. 3), and (3) residual germinal centers. The pitfalls in the selection of representative areas in MCL are outlined in Fig. 4. “Hot spots” of proliferation are not suitable to assess a prognostic Ki-67 value in MCL [21]. However, even if our guidelines are followed, the selection of the representative area might still have a subjective element.

Fig. 3 Areas of proliferating T cells in the case of a classical MCL. The MCL cells are characterized by CD20 expression and infiltrating T cells by staining for CD3. The Ki-67 index is higher in the T cell-rich area (corresponding areas are marked by an arrow). Double staining for CD20/Ki-67 of the T cell-rich area at a higher magnification shows numerous Ki-67 positive T cells (red) which are negative for the B cell marker CD20 (brown)

Fig. 4 Pitfalls in the selection of representative areas for the Ki-67 index in MCL. The upper panel shows two examples of Ki-67 staining. In the middle panel, residual germinal centers are marked by an arrow, hot spots of proliferation by an asterisk, and proliferating T cells by an arrowhead. In the lower panel, suggested representative areas for assessing the Ki-67 index are indicated by circles

We have to stress that the prognostic relevance of the Ki-67 index in MCL so far has been proven only in retrospective analysis. To date, the Ki-67 index has never been applied as a stratifying marker for clinical decision making. Therefore, the use of the Ki-67 index as a factor for treatment stratification has to be evaluated prospectively before general use of this marker outside of studies can be recommended. Nevertheless, the recommendations and guidelines presented in this study will be of great importance for the design of such future trials.

In summary, we recommend assessing the Ki-67 index as a prognostic tool in MCL specimens:

  • obtained at primary diagnosis before any treatment.

  • from nodal sites but not from bone marrow. The eligibility of biopsies from extranodal sites has to be analyzed in future studies.

  • of a minimum size that allows at least five independent representative HPF to be selected.

  • in 200 lymphoma cells, by counting the positive cells among 100 lymphoma cells in each of two HPF.

  • in representative areas which do not include residual germinal centers, hot spots of proliferation, or proliferating T cells.