New modifications and applications of fuzzy C-means methodology

https://doi.org/10.1016/j.csda.2007.10.020

Abstract

The fuzzy C-means (FCM) algorithm and various modifications of it, with focus on practical applications in both industry and science, are discussed. The general methodology is presented, together with some well-known and some less well-known modifications. It is demonstrated that the simple structure of the FCM algorithm allows for cluster analysis with non-typical and implicitly defined distance measures. Examples are residual distance for regression purposes, prediction sorting and penalised clustering criteria. Specialised applications of fuzzy clustering for a sequential clustering strategy and for semi-supervised clustering are also discussed.

Introduction

Cluster analysis is one of several important tools in modern data analysis. The underlying assumption is that there is a natural cluster or group structure in the data, and the goal is to uncover this structure. A number of methods have been put forward and several successful applications have been reported. The methods can broadly be split into two main categories: hierarchical methods and criterion-based methods.

The focus of the present paper will be on criterion-based methods, more specifically on the class of methods based on the fuzzy C-means (FCM) algorithm (Bezdek, 1981). The reason is that we believe the original FCM algorithm, with its modifications and possible generalisations, offers great and not yet fully explored potential in modern science. First of all, the method provides membership values, which can be useful for assessing the validity of the cluster structure obtained. Second, the method has a simple and efficient algorithm, which makes it applicable in a broad class of situations. The algorithm is also less prone to local minima than crisp clustering algorithms such as K-means clustering (Rousseeuw, 1995). A third aspect is the flexibility of FCM with respect to the distance measures that can be used and the ease with which various types of penalties can be incorporated in the distance measure. A large number of papers regarding both theoretical and applied aspects of FCM have been published (see the reference list and Volume 51(1), 2006, of this journal). We refer to Döring et al. (2006) for an excellent overview.

The present paper is a review of some of the recent developments within the area of FCM clustering. Focus will be on the following five topics: (1) flexibility with respect to the types of applications it can be used for (Section 3); in particular, its power in handling implicitly defined distances, which cannot be used in regular hierarchical clustering, will be stressed. (2) How to use penalty functions to restrict the FCM solution (Section 4). (3) The use of FCM for classifying variables, not only objects (Section 4). (4) Sequential identification of clusters as a tool for determining the appropriate number of clusters (Section 5). (5) Semi-supervised clustering, i.e. the use of prior information about the membership values (Section 6). Examples, illustrations and a large number of relevant references will also be given.

Section snippets

The FCM criterion and algorithm

In contrast to crisp partitioning/clustering methods, which allocate each object to a unique cluster, fuzzy clustering algorithms produce membership values between 0 and 1 that indicate the degree of membership of each object to each of the clusters. This principle is illustrated in Fig. 1. The sum of the membership values of each object over all clusters is constrained to equal 1. Note that, although the term fuzzy clustering is used for this method, it has little to do with fuzzy sets
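As an illustration of the simple structure of the algorithm, the following is a minimal sketch of the alternating FCM updates (centroid step, then membership step) with squared Euclidean distance. The function name, parameter defaults and the small numerical guard are ours, for illustration only; they are not part of any reference implementation.

```python
import numpy as np

def fcm(X, C=2, m=2.0, n_iter=50, seed=0):
    """Minimal fuzzy C-means sketch: returns (centroids, memberships).

    Memberships U[i, j] lie in [0, 1] and each row sums to 1 over the
    C clusters, as required by the FCM formulation.
    """
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    U = rng.random((N, C))
    U /= U.sum(axis=1, keepdims=True)          # rows sum to 1
    for _ in range(n_iter):
        W = U ** m
        # centroid update: weighted means with weights u_ij^m
        centroids = (W.T @ X) / W.sum(axis=0)[:, None]
        # squared Euclidean distances of every object to every centroid
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2) + 1e-12
        # membership update: u_ij proportional to d_ij^(-2/(m-1))
        U = 1.0 / (d2 ** (1.0 / (m - 1.0)))
        U /= U.sum(axis=1, keepdims=True)      # re-normalise rows
    return centroids, U
```

On well-separated data the memberships approach 0/1, while overlapping groups yield intermediate values, which is exactly the diagnostic information discussed above.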

Flexibility with respect to distance used and situation considered

In this section we will show how FCM can be modified to accommodate a number of situations and distance measures that differ from the classical ones. The two best known among these modifications are probably the Gustafson–Kessel modification (Gustafson and Kessel, 1979; see also Rousseeuw et al., 1996) and the modification based on distances relative to the first few principal components (Bezdek, 1981).

In the former, the Euclidean distance is replaced by a Mahalanobis distance with
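A rough sketch of such a cluster-specific Mahalanobis metric, in the spirit of the Gustafson–Kessel modification: a fuzzy covariance matrix is built from one cluster's memberships, its inverse is scaled to unit determinant, and distances are measured in the resulting metric. The function itself is our illustration, not the published algorithm in full.

```python
import numpy as np

def gk_distances(X, v, U_col, m=2.0):
    """Squared Mahalanobis-type distances of all rows of X to centre v.

    U_col holds the membership values of one cluster; the covariance
    is the membership-weighted scatter about v, and the metric matrix
    is scaled so that det(A) = 1 (volume constraint, as in
    Gustafson-Kessel with rho = 1).
    """
    w = U_col ** m
    diff = X - v
    # fuzzy covariance: weighted sum of outer products, normalised
    S = (w[:, None, None] * (diff[:, :, None] * diff[:, None, :])).sum(axis=0) / w.sum()
    n = X.shape[1]
    A = (np.linalg.det(S) ** (1.0 / n)) * np.linalg.inv(S)   # det(A) = 1
    # quadratic form (x - v)' A (x - v) for each row
    return np.einsum('ij,jk,ik->i', diff, A, diff)
```

Because each cluster gets its own metric, elongated (ellipsoidal) clusters can be captured, which the plain Euclidean distance cannot do.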

Restricting FCM solutions by the use of penalties

In some cases, regardless of which distance measure is used, it may be natural to restrict or penalise the FCM solution based on, for instance, economic requirements or prior information about possible structures in the data. Common to all the methods to be considered here is that they optimise a criterion based on a weighted sum of J and a penalty function:

J* = (1 − γ)J + γ Σ_{j=1}^{C} Σ_{i=1}^{N} u_{ij}^m p_{ij}(D, W).

Here, p_{ij} is the penalty function, D is the distance matrix, W is a general term for other
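The weighted-sum criterion above is straightforward to compute once the memberships, distances and penalty values are available; the following small function (ours, for illustration) makes the trade-off explicit.

```python
import numpy as np

def penalised_criterion(U, D2, P, m=2.0, gamma=0.3):
    """J* = (1 - gamma) * J + gamma * sum_j sum_i u_ij^m p_ij.

    U:  membership matrix (N x C), rows summing to 1.
    D2: squared distances d_ij^2 (N x C), giving the usual FCM
        criterion J = sum_i sum_j u_ij^m d_ij^2.
    P:  penalty values p_ij (N x C), here taken as precomputed numbers.
    gamma in [0, 1] trades data fit against the penalty; gamma = 0
    recovers the plain FCM criterion.
    """
    W = U ** m
    J = (W * D2).sum()
    return (1.0 - gamma) * J + gamma * (W * P).sum()
```

Minimising J* then balances compactness of the clusters against whatever structure the penalty encodes, with γ controlling the balance.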

Sequential versus joint clustering and selecting the number of clusters

The choice of C is important for regular use of FCM. One possibility is to solve the FCM problem for different choices of C and look at various measures of cluster validity (Bezdek, 1981, Halkidi et al., 2001). Inspecting plots of the u-values (Næs and Isaksson, 1991), possibly after data compression by the use of principal components analysis (Rousseeuw, 1987), is another possibility. A third strategy is to use a sequential procedure (Berget et al., 2005) based on the concept of noise cluster
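One of the classical validity measures referred to above is Bezdek's partition coefficient, which summarises how crisp a fuzzy partition is; comparing its values across runs with different C is a simple aid in choosing the number of clusters. A minimal version:

```python
import numpy as np

def partition_coefficient(U):
    """Bezdek's partition coefficient: mean of the squared memberships.

    For an N x C membership matrix with rows summing to 1, the value
    ranges from 1/C (maximally fuzzy, all memberships equal) to 1
    (completely crisp); larger values indicate a clearer partition.
    """
    return float((U ** 2).sum() / U.shape[0])
```

In practice one would run FCM for a range of C values and inspect how the coefficient (and other validity indices) changes, rather than rely on a single number.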

Semi-supervised classification (the use of prior information about the membership values)

The term semi-supervised classification refers to situations which lie between fully supervised (discriminant analysis) and fully unsupervised (cluster analysis) classification with respect to prior information available and used during clustering.
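One simple way to use such prior information is to pull the distance-based memberships of the labelled objects towards their known labels. The following is a sketch of one membership update in that spirit; it is our illustration of the idea, not the exact update of any particular published scheme.

```python
import numpy as np

def semisupervised_memberships(D2, F, b, alpha=1.0):
    """Blend distance-based memberships with prior label information.

    D2[i, j]: squared distance of object i to cluster j.
    F[i, j]:  prior membership for labelled objects (rows sum to 1);
              ignored for unlabelled objects.
    b[i]:     1 if object i is labelled, 0 otherwise.
    alpha:    weight on the prior; alpha = 0 recovers plain FCM
              memberships (with fuzzifier m = 2).
    """
    U = 1.0 / (D2 + 1e-12)
    U /= U.sum(axis=1, keepdims=True)          # plain FCM memberships
    lam = alpha * b[:, None] / (1.0 + alpha * b[:, None])
    return (1.0 - lam) * U + lam * F           # convex pull towards the prior
```

Because the result is a convex combination of two row-stochastic matrices, the rows still sum to 1, so the output is again a valid fuzzy partition; unlabelled objects (b = 0) are left entirely to the data.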

Conclusions

In this paper it has been demonstrated that the original FCM method can be modified in various ways to solve problems within a wide range of applications. Strong focus has been given to its flexibility with respect to the type of distance measure that can be used, as well as to distance measures developed by penalising the original criterion. Another aspect that has been given attention is how the introduction of sequential clustering by the use of noise clusters can be used to complement already

References (45)

  • M. Wedel et al., A fuzzy cluster-wise regression approach to benefit segmentation, Internat. J. Res. and Marketing (1989)
  • I. Berget et al., Optimal sorting of raw materials based on predicted end product quality, Qual. Eng. (2002)
  • I. Berget et al., Sorting of raw materials with focus on multiple end product properties, J. Chemometrics (2002)
  • I. Berget et al., Properties of prediction sorting, J. Chemometrics (2004)
  • I. Berget et al., Using unclassified observations for improving classifiers, J. Chemometrics (2004)
  • I. Berget et al., A strategy for finding biological relevant clusters in microarray data, J. Chemometrics (2005)
  • J.C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms (1981)
  • J.C. Bezdek et al., Detection and characterization of cluster substructure. 1. Linear structure: fuzzy C-lines, SIAM J. Appl. Math. (1981)
  • J.C. Bezdek et al., Detection and characterization of cluster substructure. 2. Fuzzy C-varieties and convex combinations thereof, SIAM J. Appl. Math. (1981)
  • W.M. Carlyle et al., Optimization problems and methods in quality control and improvement, J. Qual. Tech. (2000)
  • R.N. Dave et al., Robust clustering methods: a unified view, IEEE Trans. Fuzzy Systems (1997)
  • D. Dembele et al., Fuzzy C-means method for clustering microarray data, Bioinformatics (2003)