Authors

Diego Liberati

Abstract

This paper aims to cluster leukemia patients described by gene expression data and to identify the most discriminating genes underlying the clustering. A combined approach of Principal Direction Divisive Partitioning (PDDP) and bisecting K-means is applied to the leukemia dataset under investigation. Both unsupervised and supervised methods are considered in order to obtain the best results. The combination of PDDP and bisecting K-means successfully clusters the patients and efficiently identifies salient genes able to discriminate between the clusters. The combined approach works well for automatically clustering leukemia patients from gene expression information alone, and it has great potential for similar problems, such as classifying pancreatic tumors. The salient genes identified may thus provide relevant information for discriminating among leukemias. Our previous paper, cited in the references and based on the same technique, outperformed a seminal paper published in Science on the same data. In this paper, the bisection is iterated on more complex data in order to build a tree of leukemias discriminated through their salient genes.
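
The combination described above can be illustrated with a minimal Python sketch. It is not the implementation used in the paper. It assumes that a PDDP split assigns each patient by the sign of its projection onto the first principal direction of the centered expression matrix, that this split then initializes a two-cluster K-means refinement, and that genes can be ranked by the absolute value of their loading on the principal direction. The function names, the ranking heuristic, and the stopping rules (`depth`, `min_size`) are hypothetical choices made only for illustration.

```python
import numpy as np


def pddp_split(X):
    """Split the rows of X by the sign of their projection on the principal direction."""
    centered = X - X.mean(axis=0)
    # The first right singular vector of the centered data is the principal direction.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    direction = vt[0]
    labels = (centered @ direction > 0).astype(int)
    return labels, direction


def two_means_refine(X, labels, n_iter=100):
    """Refine a two-cluster partition with K-means (K = 2), starting from `labels`."""
    labels = labels.copy()
    for _ in range(n_iter):
        if labels.min() == labels.max():
            break  # one cluster is empty, nothing left to refine
        centroids = np.stack([X[labels == k].mean(axis=0) for k in (0, 1)])
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
    return labels


def salient_features(direction, top=5):
    """Assumed heuristic: rank features (genes) by absolute loading on the direction."""
    return np.argsort(-np.abs(direction))[:top]


def bisect_tree(X, indices=None, depth=2, min_size=4):
    """Iterate the bisection; returns nested tuples whose leaves are row-index arrays."""
    if indices is None:
        indices = np.arange(X.shape[0])
    if depth == 0 or len(indices) < min_size:
        return indices
    sub = X[indices]
    labels, _ = pddp_split(sub)
    labels = two_means_refine(sub, labels)
    if labels.min() == labels.max():
        return indices  # no meaningful split was found
    left = bisect_tree(X, indices[labels == 0], depth - 1, min_size)
    right = bisect_tree(X, indices[labels == 1], depth - 1, min_size)
    return (left, right)
```

Here each row of `X` would be a patient and each column a gene. Calling `bisect_tree(X, depth=1)` yields a single bisection, while larger values of `depth` iterate it into a tree.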

Article Details

Section
Research