Automated segmentation of tissue images for computerized IHC analysis

https://doi.org/10.1016/j.cmpb.2010.02.002

Abstract

This paper presents two automated methods for the segmentation of immunohistochemical tissue images that overcome the limitations of the manual approach as well as of existing computerized techniques. The first method, based on unsupervised color clustering, automatically recognizes the target cancerous areas in the specimen and disregards the stroma; the second method, based on color separation and morphological processing, performs automated segmentation of the nuclear membranes of the cancerous cells. Extensive experimental results on real tissue images demonstrate the accuracy of our techniques compared to manual segmentations; additional experiments show that our techniques are more effective on immunohistochemical images than popular approaches based on supervised learning or active contours. The proposed procedure can be used in any application that requires tissue and cell exploration, and to perform reliable and standardized measurements of the activity of specific proteins involved in multi-factorial genetic pathologies.

Introduction

In this paper we address the problem of immunohistochemical tissue image segmentation.

In the last few years biological imaging has undergone a revolution, and new techniques have been shown to be effective in extracting clinical and functional information from images of molecules and tissues. In particular, the quantification of the activity of specific proteins through the analysis of tissue images provides critical information about important multi-factorial genetic pathologies, such as cancer. For example, the EGFR/erb-B family of receptors plays an important role in the development of non-small cell lung carcinoma (NSCLC). Quantifying and classifying EGFR expression and activity in NSCLC, with special regard to the assessment of the prevalence of EGFR mutations as well as to ligand–receptor interactions, leads to new insights into the modulation of EGFR in individual lung carcinomas [1].

In addition to its well-established role in the early diagnosis of cancerous diseases, protein expression analysis has recently acquired an important new role in the design of novel targeted therapies: correlating standardized measurements of protein expression from pathological tissue images with genetic expression data extracted from the same tissues makes it possible to define a group of potential candidates for protein family-inhibiting therapies, as well as to classify the specific pathology more accurately through its specific genetic alterations [1], [2], [3].

A widespread technique for protein activity quantification in this field is immunohistochemistry (IHC), that is, the analysis of cancer tissue images where marked antibodies are used to bind specific proteins in situ, as well as their ligands; the evaluation of the colored stains at the specific subcellular regions where the markers are localized (i.e. nucleus, cellular membrane, cytoplasm) provides important information for the assessment of cancer [1]. Fig. 1 shows three images taken from different tissue specimens, where brown and blue stained regions indicate that the corresponding marker is present.

In the last few years this technique has acquired a central role in pathology thanks to its several advantages over alternative bioimaging techniques (e.g. fluorescence in situ hybridization, FISH); among them are its wide availability, its low cost and the easy, long-term preservation of the stained slides [4].

Currently, pathologists usually resort to slow, manual analysis to extract information from IHC images. This is time-consuming, does not scale to very large datasets and is strongly affected by inter- as well as intra-operator variability [5]. Moreover, IHC's new role in targeted therapy response prediction and in the correlation of protein and genetic expression data [1], [2], [3] is placing new demands on the reproducibility of the obtained information [6]. This calls for new automated tools able to assist pathologists in their examinations and to provide fast, accurate and standardized measures of protein activity in IHC images.

Protein expression has to be computed within the specific subcellular location of interest of the analyzed receptors (nuclei, cellular membranes or cytoplasm, depending on the receptor) in order to obtain meaningful biological information [1]; in fact, the subcellular location in which the receptors are expressed may represent a very small percentage of the whole imaged area. This translates into a cell-by-cell evaluation of the subcellular stain, which is much more specific than simple quantification of the global amount of stain in the whole image [7], [8] and which requires prior recognition of the cancer cells in the specimen and accurate segmentation of the subcellular compartments where the target receptors are localized.
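As a minimal illustration of why restricting the measurement to the compartment of interest matters, the sketch below compares the stained fraction computed over a whole toy image with the fraction computed inside a compartment mask. The function name `stained_fraction`, the binary toy arrays and the mask are hypothetical and are not taken from the paper; a real pipeline would derive both maps from the segmentation steps.

```python
def stained_fraction(stain, mask=None):
    """Return the fraction of stained pixels, optionally restricted to a mask.

    `stain` and `mask` are 2D lists of 0/1 values with the same shape.
    When `mask` is given, only pixels where the mask is 1 are counted.
    """
    stained = 0
    total = 0
    for y, row in enumerate(stain):
        for x, value in enumerate(row):
            if mask is not None and not mask[y][x]:
                continue  # pixel lies outside the compartment of interest
            total += 1
            if value:
                stained += 1
    return stained / total if total else 0.0


# Hypothetical 4x4 example: the compartment covers only 4 of the 16 pixels.
stain = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
compartment = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

global_fraction = stained_fraction(stain)              # whole image
local_fraction = stained_fraction(stain, compartment)  # compartment only
```

In this toy case the global measure reports that fewer than a fifth of the pixels are stained, while three quarters of the compartment is stained, which is the kind of difference a whole-image measurement hides.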

In this paper we present fully automated techniques that perform: (i) recognition of the cancerous cells and their separation from the non-cancerous tissue areas in the specimen (e.g. stroma) and (ii) segmentation of cancerous nuclei in the areas identified at point (i). The proposed procedure can be used to perform reliable and standardized measures of protein activity as well as for any other application that requires tissue and cell exploration.
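Step (i) relies on unsupervised color clustering, but this excerpt does not say which clustering algorithm is used. Purely as an illustration, the sketch below groups RGB pixels with plain k-means (Lloyd's algorithm), chosen here as an assumption; the function name `kmeans_colors`, the deterministic initialization and the toy pixel values are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two RGB triples."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def kmeans_colors(pixels, k, iterations=20):
    """Cluster RGB triples with plain k-means.

    Centroids are initialized from evenly spaced pixels, so the result is
    deterministic. Returns (labels, centroids).
    """
    step = max(1, len(pixels) // k)
    centroids = [pixels[min(i * step, len(pixels) - 1)] for i in range(k)]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: each pixel goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(pixel, centroids[c]))
            for pixel in pixels
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(pixels, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(channel) / len(members) for channel in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


# Hypothetical pixels: two brownish, two bluish, two pale background values.
pixels = [
    (150, 90, 50), (145, 95, 55),
    (60, 70, 160), (65, 75, 150),
    (240, 235, 238), (245, 240, 242),
]
labels, centroids = kmeans_colors(pixels, k=3)
```

Being unsupervised, the clustering only groups similar colors; deciding which cluster corresponds to cancerous tissue and which to stroma is a separate step that this sketch does not cover.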

To this day, the literature does not provide effective fully automated procedures to perform the tasks addressed by this paper. Commercial microscopy software usually allows the user to select the subcellular location of interest for protein activity quantification in a semiautomated or automated way; nevertheless, these tools are based on simple approaches that suffer from overgeneralization and most of the time require extensive user interaction, so that the objectivity of the result tends to be lost [4], [6], [9], [10]. In particular, commercial products generally require the user to manually select the areas that are richest in tumoral cells [11], [12], to set intensity thresholds or levels to distinguish the cellular patterns from the background [13], [14] or to outline a set of cells that are representative of the tumor or of the specific targeted cellular regions [15], [16]. Moreover, most of these commercial tools are integrated into whole-slide scanning instruments [11], [16] or are hardware-based, in that they require multispectral imaging technologies attached to microscope platforms that may not be readily available to a regular pathologist or oncologist [15], [17].

The recent literature reports many attempts to develop automated methods for nuclear segmentation. The approaches used span the most widely known image processing techniques, including intensity thresholding [18], [19], active contours [20], [21], [22], [23], [24], watersheds [24], [25], [26], multiscale analysis [27], a priori geometric models [28], [29], graph-cuts [30], Markov random fields [31], etc. Despite active research in this field, reliable cancerous nuclei segmentation remains extremely challenging.

The major contribution of this paper is to provide a fully automated and flexible procedure that overcomes the limitations of the existing computer-based techniques.

The paper is structured as follows: in Section 2 we discuss the main challenges related to IHC tissue image segmentation and limitations of the existing techniques and we describe the main contributions of our work; in Section 3 we present our case study related to the IHC analysis of lung cancer tissue images and we describe in detail our proposed methods; in Section 4 we discuss parameters and implementation; in Section 5 we present experimental results; in Section 6 we conclude the paper.

Section snippets

IHC analysis and image characteristics

The images targeted by our methods are high-resolution microscopy pictures of tissue samples stained with marked antibodies [1] (see Fig. 1 for examples).

In general, IHC techniques use different stains to detect the activations of specific proteins and distinguish them from regions without activations. The purpose of IHC image analysis is to separate the activated image regions from the rest of the image. Literature reports several methods and protocols for the quantification of IHC specimens
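One common way to separate stains in H-DAB images is color deconvolution, which unmixes the optical density of each pixel along per-stain color vectors. The sketch below is an illustration only and is not claimed to be the color separation used in this paper: the stain vectors are the values commonly quoted for H-DAB and should be treated as approximate assumptions, and the helper names (`separate_stains`, `optical_density`, and so on) are hypothetical.

```python
import math

# Assumed H-DAB stain vectors (optical density per R, G, B channel), as
# commonly quoted for color deconvolution; approximate defaults only.
HEMATOXYLIN = (0.650, 0.704, 0.286)
DAB = (0.268, 0.570, 0.776)


def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def cross(a, b):
    """Cross product, used to build a residual vector orthogonal to both stains."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


H_VEC = normalize(HEMATOXYLIN)
DAB_VEC = normalize(DAB)
RESIDUAL_VEC = normalize(cross(H_VEC, DAB_VEC))


def optical_density(rgb):
    """Convert 0-255 intensities to optical density (Beer-Lambert law)."""
    return tuple(-math.log10(max(c, 1.0) / 255.0) for c in rgb)


def separate_stains(rgb):
    """Return (hematoxylin, DAB, residual) amounts for one RGB pixel.

    Solves od = h * H_VEC + d * DAB_VEC + r * RESIDUAL_VEC by Cramer's rule.
    """
    od = optical_density(rgb)
    stains = (H_VEC, DAB_VEC, RESIDUAL_VEC)
    # Row i is a color channel, column j is a stain.
    matrix = [[stains[j][i] for j in range(3)] for i in range(3)]
    d = det3(matrix)
    amounts = []
    for j in range(3):
        replaced = [[od[i] if c == j else matrix[i][c] for c in range(3)]
                    for i in range(3)]
        amounts.append(det3(replaced) / d)
    return tuple(amounts)


# Synthetic brown pixel built from exactly one unit of the assumed DAB vector.
brown_pixel = tuple(255.0 * 10 ** (-c) for c in DAB_VEC)
hematoxylin_amount, dab_amount, residual_amount = separate_stains(brown_pixel)
```

Thresholding the resulting DAB channel gives a first map of activated regions; in practice the stain vectors depend on the staining protocol and scanner, so they are usually re-estimated per dataset.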

Case study: segmentation of lung cancer IHC tissue images

In this paper we applied our proposed techniques to images of lung cancer tissue (see Fig. 1 for examples). This is an application of great importance in biomedicine: lung cancer is one of the leading causes of cancer death and, although extensive preclinical and clinical research has been carried out for decades, its prognosis is still very poor, with only 5–15% of patients surviving 5 years after the first diagnosis. The characteristics of these images entail all the challenges

Implementation

We implemented our procedure in Java as a plugin for ImageJ [45], a powerful public domain image analysis and processing software that runs on all standard operating systems (Windows, Mac OS, Mac OS X, Linux); it is therefore totally hardware-independent, flexible and upgradeable. We built on the whole class hierarchy of the ImageJ 1.38 API and on the open-source plugins and macros in [41], [46], [47], and we implemented our own functions and classes.

Parameters’ values tuned on real-life lung cancer

Experimental results

We tested the performance of our automated pipeline on real IHC images of lung cancer tissue. H-DAB staining was performed on the tissue beforehand to highlight positive activations at the EGF-R or TGF-alpha receptors, which are localized in the cellular membrane and in the cytoplasm of epithelial cells, respectively (see Section 3.1 for details). Besides the cancerous epithelial cells, which are the ones targeted by IHC analysis, the tissue samples contained portions of non-cancerous

Conclusions

In this paper we presented an automated method for immunohistochemical tissue image segmentation that recognizes cancerous areas while disregarding non-pathological connective tissue and performs accurate and precise segmentation of nuclear membranes within pathological areas. These two tasks are critical in order to obtain a reliable and standardized measure of the activity of specific proteins involved in the genesis and development of multi-factorial genetic pathologies. Moreover, the

Acknowledgements

The authors acknowledge Dr. Marco Volante and Dr. Ida Rapa of S. Luigi Hospital in Orbassano (Torino) for the IHC images used in this paper as well as for the helpful suggestions.

References (49)

  • E. Ficarra et al., Joint co-clustering: co-clustering of genomic and clinical bioimaging data, Comput. Math. Appl. (2008)
  • M. Lacroix-Triki et al., High inter-observer agreement in immunohistochemical evaluation of HER-2/neu expression in breast cancer: a multicentre GEFPICS study, EJC (2006)
  • T.K. Taneja et al., Markers of small cell lung cancer, World J. Surg. Oncol. (2004)
  • M.J. Borad et al., Molecular profiling using immunohistochemistry (IHC) and DNA microarray (DMA) as a tool to determine potential therapeutic targets in patients who have progressed on multiple prior therapies
  • Z. Theodosiou et al., Automated analysis of FISH and immunohistochemistry images: a review, Cytometry A (2007)
  • M. Cregger et al., Immunohistochemistry and quantitative analysis of protein expression, Arch. Pathol. Lab. Med. (2006)
  • E.M. Brey et al., Automated selection of DAB-labeled tissue for immunohistochemical quantification, J. Histochem. Cytochem. (2003)
  • A.C. Ruifrok et al., Comparison of quantification of histochemical staining by Hue-Saturation-Intensity (HSI) transformation and color deconvolution, Appl. Immunohistochem. (2004)
  • K.A. Divito et al., Tissue microarrays-automated analysis and future directions, Breast Cancer Online (2005)
  • J. Cheng et al., Segmentation of clustered nuclei with shape markers and marking function, IEEE Trans. Biomed. Eng. (2009)
  • BioImagene Innovative Digital Pathology,...
  • H.D. Cualing et al., Virtual flow cytometry of immunostained lymphocytes on microscopic tissue slides: iHCFlow tissue cytometry, Cytometry B (2007)
  • Tissuegnostics Image Analysis System,...
  • Q. Huang et al., DNA index determination with automated cellular imaging system (ACIS) in Barrett’s esophagus: comparison with CAS 200, BMC Clin. Pathol. (2005)
  • Cambridge Research Inc.,...
  • Aperio Scanscope,...
  • AQUA Automated Quantitative Analysis,...
  • Y.J. Kim et al., Automated nuclear segmentation in the determination of the Ki-67 labeling index in meningiomas, Clin. Neuropathol. (2006)
  • F. Long et al., Automatic segmentation of nuclei in 3D microscopy images of C. elegans
  • L. Yang et al., Unsupervised segmentation based on robust estimation and color active contour models, IEEE Trans. Inform. Technol. B (2005)
  • D.P. Mukherjee et al., Level set analysis for leukocyte detection and tracking, IEEE Trans. Image Process (2004)
  • M. Jacob et al., Efficient energies and algorithms for parametric snakes, IEEE Trans. Image Process (2004)
  • B. Zhang et al., Tracking fluorescent cells with coupled geometric active contours
