ACTA VETERINARIA ET ZOOTECHNICA SINICA, 2009, Vol. 40, Issue (2): 180-184.

• Genetics and Breeding •

Effects of Data Preprocessing and Measuring Metrics for Different Gene Expression Data

LIU Tian-fei, TANG Guo-qing, LI Xue-wei*

  1. College of Animal Science and Technology, Sichuan Agricultural University, Ya’an 625014, China
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2009-02-24 Published: 2009-02-24

Abstract: The effects of different measuring metrics and data preprocessing methods on K-means clustering of different gene expression datasets were studied. The results showed that the choice of data preprocessing method made significant differences to the clustering results under different measuring metrics. For the time-course gene expression dataset, the best combination in K-means clustering was log transformation as the preprocessing method and covariance as the measuring metric. For the other datasets, log transformation was also the best preprocessing method, but three measuring metrics (Euclidean distance, squared Euclidean distance and Manhattan distance) led to better results.
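
As an illustration of the kind of comparison the abstract describes, the following is a minimal sketch of log-transforming expression values and then running a Lloyd-style K-means under the three named distance metrics. It is not the authors' implementation. The function names, the base-2 logarithm, the pseudocount of 1, the mean-based centroid update and the toy data are all assumptions made for this example. The covariance metric is left out because the abstract does not define it.

```python
import math
import random


def log_transform(rows, pseudocount=1.0):
    """Apply log2(x + pseudocount) to every value (base and pseudocount are assumptions)."""
    return [[math.log2(x + pseudocount) for x in row] for row in rows]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def kmeans(rows, k, metric, n_iter=100, seed=0):
    """Lloyd-style K-means with a pluggable distance for the assignment step.

    Centroids are always updated as the arithmetic mean of their members,
    which is a simplifying assumption for the non-Euclidean metric.
    """
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each gene goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: metric(row, centroids[j]))
            for row in rows
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: recompute each centroid; an empty cluster keeps its old one.
        for j in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical toy data: three low-expression and three high-expression genes.
toy = [
    [1, 2, 1], [2, 1, 2], [1, 1, 2],
    [1000, 900, 1100], [950, 1000, 1050], [1100, 1000, 900],
]
toy_logged = log_transform(toy)
for metric in (euclidean, squared_euclidean, manhattan):
    labels, _ = kmeans(toy_logged, k=2, metric=metric)
    print(metric.__name__, labels)
```

Passing the metric as a function keeps the preprocessing and clustering steps fixed while only the distance changes, which mirrors the comparison in the abstract. A real evaluation would also need a cluster-quality measure, which the abstract does not specify.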