Chem-Bio Informatics Journal
Online ISSN : 1347-0442
Print ISSN : 1347-6297
ISSN-L : 1347-0442
Original
Detecting outlying samples in microarray data: A critical assessment of the effect of outliers on sample classification
Koji Kadota, Daisuke Tominaga, Yutaka Akiyama, Katsutoshi Takahashi

2003 Volume 3 Issue 1 Pages 30-45

Abstract

Among samples analyzed for gene expression, those that are incorrectly labeled or likely contaminated show expression patterns markedly different from the rest. Such samples should be designated outliers, since they can adversely affect the selection of informative genes for sample classification. We developed a method based on Akaike's Information Criterion (AIC) to detect such outliers. The method requires no significance level, which facilitates objective decision-making. We applied it to the public microarray data of Alon et al. (1999) and found that some of the detected outlying samples coincided with samples considered likely contaminated. Applying our method produced a higher discrimination level for informative genes in tumor and normal tissues and, upon exclusion of the outliers, yielded higher classification accuracy. Detecting outlying samples prior to sample classification is essential, and the method described here serves as a valuable check.
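
The abstract does not give the formula behind the AIC-based detection, so the following is only a minimal sketch of how a criterion of this kind can flag outliers without a significance level. It assumes a Ueda-style statistic, U = n log(sigma) + sqrt(2) * s * log(n!) / n, where n is the number of retained values, s the number of candidate outliers, and sigma the standard deviation of the retained values. The statistic, the function name `detect_outliers`, and the `max_fraction` parameter are assumptions made for illustration and are not taken from the paper.

```python
import math


def detect_outliers(values, max_fraction=0.5):
    """Return indices of values flagged as outliers (illustrative sketch only).

    Assumed criterion (not stated in the abstract):
        U = n * log(sigma) + sqrt(2) * s * log(n!) / n
    evaluated for every split of the sorted data into `low` smallest values,
    `high` largest values (the candidate outliers) and the retained middle.
    The split with the smallest U wins; no significance level is involved.
    """
    total = len(values)
    mean_all = sum(values) / total
    sd_all = math.sqrt(sum((v - mean_all) ** 2 for v in values) / total)
    if sd_all == 0:
        return []  # all values identical: nothing can be an outlier

    # Standardize first, because n * log(sigma) depends on the data scale.
    order = sorted(range(total), key=lambda i: values[i])
    z = [(values[i] - mean_all) / sd_all for i in order]

    max_out = int(total * max_fraction)  # hypothetical cap on outlier count
    best_u, best_split = math.inf, (0, 0)
    for low in range(max_out + 1):
        for high in range(max_out + 1 - low):
            kept = z[low:total - high]
            n, s = len(kept), low + high
            if n < 2:
                continue
            mean = sum(kept) / n
            var = sum((v - mean) ** 2 for v in kept) / n
            if var == 0:
                continue  # log(sigma) undefined for a zero-variance subset
            # math.lgamma(n + 1) equals log(n!)
            u = n * math.log(math.sqrt(var)) + math.sqrt(2) * s * math.lgamma(n + 1) / n
            if u < best_u:
                best_u, best_split = u, (low, high)

    low, high = best_split
    return sorted(order[:low] + order[total - high:])


# Hypothetical values, e.g. one summary score per sample.
scores = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 1.02, 0.98, 5.0]
flagged = detect_outliers(scores)
```

Because the choice among candidate outlier sets is made by minimizing a single criterion, no threshold has to be tuned by the analyst, which is the property the abstract highlights. How the authors derive the per-sample values fed to their criterion is not described here, so the one-dimensional input above is purely illustrative.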

2003 Chem-Bio Informatics Society