Clustering gene expression data using a novel model of self-organizing map
Clustering is an important technique for analyzing gene expression data. The self-organizing map (SOM) is one of the most useful clustering algorithms. However, its applicability is limited because some knowledge about the data is required prior to clustering. This paper introduces a novel SOM model, the growing hierarchical self-organizing map (GHSOM), to cluster gene expression data. The training and growth processes of GHSOM are entirely data driven and require no prior knowledge or estimates for parameter specification, which helps find not only the appropriate number of clusters but also the hierarchical relations in the data set. Compared with other clustering algorithms, GHSOM has better accuracy. The results are validated with a novel technique known as figure of merit (FOM).
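
As an illustration of the validation idea, the following is a minimal sketch of a figure of merit computation, assuming the leave-one-condition-out form in which FOM is commonly defined for gene expression clustering: the data are clustered with one condition withheld, and the withheld condition is then scored by its root-mean-square deviation from the cluster means. The abstract does not state the exact formula used, so this form is an assumption. The names `fom_for_condition`, `aggregate_fom`, and `threshold_clusters` are hypothetical, and the toy threshold rule merely stands in for a real clustering algorithm such as GHSOM.

```python
import math
from typing import Callable, Dict, List

Matrix = List[List[float]]  # rows are genes, columns are conditions


def fom_for_condition(data: Matrix, labels: List[int], e: int) -> float:
    """RMS deviation of condition e from the mean of each gene's cluster."""
    # Group the withheld condition's values by cluster label.
    clusters: Dict[int, List[float]] = {}
    for row, label in zip(data, labels):
        clusters.setdefault(label, []).append(row[e])
    # Sum squared deviations from each cluster's mean.
    squared = 0.0
    for values in clusters.values():
        mean = sum(values) / len(values)
        squared += sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared / len(data))


def aggregate_fom(data: Matrix, cluster_fn: Callable[[Matrix], List[int]]) -> float:
    """Sum the per-condition FOM, withholding each condition in turn."""
    total = 0.0
    for e in range(len(data[0])):
        # Cluster on every condition except e.
        reduced = [row[:e] + row[e + 1:] for row in data]
        labels = cluster_fn(reduced)
        total += fom_for_condition(data, labels, e)
    return total


def threshold_clusters(rows: Matrix) -> List[int]:
    """Hypothetical stand-in for a clustering algorithm: split on the first column."""
    return [0 if row[0] < 3.0 else 1 for row in rows]


# Toy data (not from the paper): two well-separated groups of genes.
toy = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [5.0, 5.0, 5.0], [5.0, 5.0, 5.0]]
good = aggregate_fom(toy, threshold_clusters)           # 0.0
poor = aggregate_fom(toy, lambda rows: [0] * len(rows))  # 6.0
```

A lower aggregate value indicates that the clusters found without a condition still predict that condition well, which is why the perfect split scores 0.0 here while lumping all genes into one cluster scores 6.0.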