Initial points selection for clustering gene expression data: A spatial contiguity analysis-based approach

Hui Yi, Cuimei Bo, Xiaofeng Song, Yuhao Yuan

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Clustering is considered one of the most powerful tools for analyzing gene expression data. Although clustering has been studied extensively, one significant problem remains: iterative techniques such as k-means clustering are especially sensitive to their initial starting conditions. A poor selection of initial points leads to problems including convergence to local minima and excessive computation. In this paper, a spatial contiguity analysis-based approach is proposed to address this problem. It employs principal component analysis (PCA) to identify data points that are likely drawn from different clusters and uses them as initial points. This helps to avoid local minima and accelerates the computation. The effectiveness of the proposed approach was validated on several benchmark datasets.
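
The abstract does not spell out the selection procedure, so the following is only a minimal sketch of the general idea of PCA-guided initialization for k-means, not the authors' spatial contiguity analysis. It assumes a simple stand-in rule: project the samples onto the first principal component, split the sorted projections into k contiguous groups, and take the median sample of each group as an initial point. The function names `pca_initial_points` and `kmeans` are hypothetical, and the rule assumes k does not exceed the number of samples.

```python
import numpy as np


def pca_initial_points(X, k):
    """Pick k rows of X as initial centers, spread along the first principal axis.

    Illustrative stand-in only; assumes k <= number of rows in X.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # The first right-singular vector of the centered data is the first principal axis.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[0]
    order = np.argsort(scores)
    # Split the sorted samples into k contiguous groups; take each group's median sample.
    groups = np.array_split(order, k)
    idx = [g[len(g) // 2] for g in groups]
    return X[idx]


def kmeans(X, centers, n_iter=100):
    """Plain Lloyd iterations starting from the given centers."""
    X = np.asarray(X, dtype=float)
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Recompute each center; keep the old one if its cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```

For an expression matrix `X` with one row per gene, `kmeans(X, pca_initial_points(X, k))` would run k-means from the selected rows instead of from random starting points. A single principal component is used here purely for brevity; a rule that separates clusters overlapping along that axis would need more components.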

Original language: English
Pages (from-to): 3709-3717
Number of pages: 9
Journal: Bio-Medical Materials and Engineering
Volume: 24
Issue number: 6
State: Published - 2014

Keywords

  • Gene expression data
  • Initial points
  • K-means
  • Spatial contiguity analysis
