Accurate and robust algorithms for microarray data classification

PhD Thesis


Hu, Hong. 2008. Accurate and robust algorithms for microarray data classification. PhD Thesis Doctor of Philosophy. University of Southern Queensland.
Title

Accurate and robust algorithms for microarray data classification

Type: PhD Thesis
Authors
Author: Hu, Hong
Supervisor: Wang, Hua
Institution of Origin: University of Southern Queensland
Qualification Name: Doctor of Philosophy
Number of Pages: 168
Year: 2008
Abstract

Microarray data classification is used primarily to predict unseen data using a model built on existing, categorized microarray data. One major challenge is that microarray data contain a large number of genes but only a small number of samples. This high dimensionality prevents many existing classification methods from handling such data directly. Moreover, the small number of samples aggravates overfitting, which lowers classification accuracy. Another major challenge is the uncertain quality of microarray data. Microarray data contain various, and often high, levels of noise, which, together with the high dimensionality, lead to unreliable and inaccurate analysis. Most current classification methods are not robust enough to handle this type of data properly.

Our research focuses on accuracy and on noise resistance (robustness). Our approach is to design a robust classification method for microarray data.

We propose an algorithm called diversified multiple decision trees (DMDT), which builds its decision committee from a set of unique trees. By avoiding overlapping genes among alternative trees, DMDT increases the diversity of the ensemble committee and thereby improves accuracy.

We also examine strategies for eliminating noisy data. Because our method ensures that no gene is shared among the alternative trees in an ensemble committee, a noisy gene included in the committee can affect only one tree; the other trees are not affected at all. This design makes microarray classification more robust to noisy data and reduces the instability caused by overlapping genes in current ensemble methods.
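
The no-overlapping-genes idea can be illustrated with a minimal sketch. This is not the thesis implementation: the abstract does not specify the base tree learner, the splitting criterion, or the voting scheme, so the sketch assumes a small Gini-based tree, a fixed depth limit, and simple majority voting. All function names (build_committee, genes_used, and so on) are hypothetical. The only property taken from the abstract is that genes used by one tree are removed from the pool before the next tree is built.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, genes):
    """Return the (gene, threshold) with the lowest weighted Gini, or None."""
    best, best_score = None, gini(y)
    for g in genes:
        values = sorted({row[g] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [lab for row, lab in zip(X, y) if row[g] <= t]
            right = [lab for row, lab in zip(X, y) if row[g] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score:
                best, best_score = (g, t), score
    return best

def build_tree(X, y, genes, depth=3):
    """Grow a small tree; a leaf is a class label (assumed not to be a tuple)."""
    split = best_split(X, y, genes) if depth > 0 else None
    if split is None:
        return Counter(y).most_common(1)[0][0]
    g, t = split
    left = [i for i, row in enumerate(X) if row[g] <= t]
    right = [i for i, row in enumerate(X) if row[g] > t]
    return (g, t,
            build_tree([X[i] for i in left], [y[i] for i in left], genes, depth - 1),
            build_tree([X[i] for i in right], [y[i] for i in right], genes, depth - 1))

def genes_used(tree):
    """Set of gene indices tested anywhere in the tree."""
    if not isinstance(tree, tuple):
        return set()
    g, _, left, right = tree
    return {g} | genes_used(left) | genes_used(right)

def predict_tree(tree, row):
    """Follow the splits down to a leaf label."""
    while isinstance(tree, tuple):
        g, t, left, right = tree
        tree = left if row[g] <= t else right
    return tree

def build_committee(X, y, n_trees):
    """Build up to n_trees trees so that no gene appears in more than one tree."""
    pool = set(range(len(X[0])))
    committee = []
    for _ in range(n_trees):
        tree = build_tree(X, y, sorted(pool))
        used = genes_used(tree)
        if not used:          # no remaining gene gives a useful split
            break
        committee.append(tree)
        pool -= used          # later trees can never reuse these genes
    return committee

def predict(committee, row):
    """Majority vote over the committee (an assumed voting scheme)."""
    return Counter(predict_tree(t, row) for t in committee).most_common(1)[0][0]

if __name__ == "__main__":
    # Toy data: 6 samples x 4 genes; values are made up for illustration.
    X = [[1.0, 5.0, 0.2, 3.0], [1.2, 5.5, 0.1, 1.0], [0.9, 5.2, 0.3, 2.0],
         [3.0, 8.0, 0.9, 2.5], [3.1, 8.3, 0.8, 1.5], [2.9, 7.9, 0.7, 3.5]]
    y = ["normal", "normal", "normal", "tumour", "tumour", "tumour"]
    committee = build_committee(X, y, n_trees=3)
    print([sorted(genes_used(t)) for t in committee])
    print(predict(committee, [3.0, 5.1, 0.2, 2.0]))  # gene 0 corrupted by noise
```

In the toy run, the last sample has a corrupted value for one gene. Because that gene appears in only one tree, at most one vote in the committee can be affected by it, which is the robustness argument made above.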

The effectiveness of gene selection methods in improving the performance of microarray classification methods is also discussed.

We conclude that the proposed method, DMDT, substantially outperforms other well-known ensemble methods, such as Bagging, Boosting and Random Forests, in both accuracy and robustness. DMDT is more tolerant of noise than Cascading-and-Sharing trees (CS4), particularly as the level of noise in the data increases. The results also indicate that some classification methods are insensitive to gene selection, while others depend on particular gene selection methods to improve their classification performance.

Keywords: microarray data classification; accuracy; robustness; algorithms
ANZSRC Field of Research 2020: 469999. Other information and computing sciences not elsewhere classified
Byline Affiliations: Department of Mathematics and Computing
Permalink: https://research.usq.edu.au/item/9z54q/accurate-and-robust-algorithms-for-microarray-data-classification

Download files


Published Version
Hu_2008_whole.pdf
File access level: Anyone

Related outputs

Routing Protocol for Healthcare Applications Data Over the 6LoWPAN-based Wireless Sensor Networks
Zhang, Zhongwei, Hu, Hong and Hu, Xiaohua. 2023. "Routing Protocol for Healthcare Applications Data Over the 6LoWPAN-based Wireless Sensor Networks." Howlett, R. (ed.) 27th International Conference on Knowledge Based and Intelligent Information and Engineering Systems (KES 2023). Athens, Greece 04 - 08 Sep 2023 https://doi.org/10.1016/j.procs.2023.10.206
Robustness analysis of diversified ensemble decision tree algorithms for microarray data classification
Hu, Hong, Li, Jiuyong, Wang, Hua, Daggard, Grant and Wang, Li-Zhen. 2008. "Robustness analysis of diversified ensemble decision tree algorithms for microarray data classification." ICMLC 2008: 7th International Conference on Machine Learning and Cybernetics. Kunming, China 12 - 15 Jul 2008 United States. https://doi.org/10.1109/ICMLC.2008.4620389
Combined gene selection methods for microarray data analysis
Hu, Hong, Li, Jiuyong, Wang, Hua and Daggard, Grant. 2006. "Combined gene selection methods for microarray data analysis." Gabrys, Bogdan, Howlett, Robert J. and Jain, Lakhmi C. (ed.) 10th International Conference on Knowledge-Based Intelligent Information and Engineering Systems (KES 2006). Bournemouth, United Kingdom 09 - 11 Oct 2006 Germany. Springer. https://doi.org/10.1007/11892960_117
A robust ensemble classification method analysis
Zhang, Zhongwei, Li, Jiuyong, Hu, Hong and Zhou, Hong. 2010. "A robust ensemble classification method analysis." Arabnia, Hamid R. (ed.) Advances in computational biology. New York, NY. United States. Springer. pp. 149-155
On the effectiveness of gene selection for microarray classification methods
Zhang, Zhongwei, Li, Jiuyong, Hu, Hong and Zhou, Hong. 2010. "On the effectiveness of gene selection for microarray classification methods." Nguyen, Ngoc Thanh, Le, Manh Thanh and Swiatek, Jerzy (ed.) 2nd Asian Conference on Intelligent Information and Database Systems (ACIIDS 2010). Hue City, Vietnam 24 - 26 Mar 2010 Heidelberg, Germany. https://doi.org/10.1007/978-3-642-12101-2_31
Using association rules to make rule-based classifiers robust
Hu, Hong and Li, Jiuyong. 2005. "Using association rules to make rule-based classifiers robust." Williams, Hugh E. and Dobbie, Gillian (ed.) ADC 2005: 16th Australasian Database Conference. Newcastle, Australia 31 Jan - 03 Feb 2005 Sydney, Australia.
A maximally diversified multiple decision tree algorithm for microarray data classification
Hu, Hong, Li, Jiuyong, Wang, Hua, Daggard, Grant and Shi, Mingren. 2006. "A maximally diversified multiple decision tree algorithm for microarray data classification." Boden, Mikael and Bailey, Timothy (ed.) Workshop on Intelligent Systems for Bioinformatics (2006). Hobart, Australia 04 Dec 2006 Sydney, Australia.
A comparative study of classification methods for microarray data analysis
Hu, Hong, Li, Jiuyong, Plank, Ashley, Wang, Hua and Daggard, Grant. 2006. "A comparative study of classification methods for microarray data analysis." Christen, Peter, Kennedy, Paul J., Li, Jiuyong, Simoff, Simeon J. and Williams, Graham J. (ed.) 5th Australasian Conference on Data Mining and Analytics (AusDM 2006). Sydney, Australia 29 - 30 Nov 2006 Canberra, Australia.