Abstract
Classification is a central problem in the fields of data mining and machine learning. Using a training set of labeled instances, the task is to build a model (classifier) that can be used to predict the class of new unlabeled instances. Data preparation is crucial to the data mining process, and its focus is to improve the fitness of the training data so that the learning algorithms produce more effective classifiers. Two widely applied data preparation methods are feature selection and instance selection, which fall under the umbrella of data reduction. For my research I propose ADR-Miner, a novel data reduction algorithm that utilizes ant colony optimization (ACO). ADR-Miner is designed to perform instance selection to improve the predictive effectiveness of the constructed classification models. Two versions of ADR-Miner are developed: a base version that uses a single classification algorithm during both training and testing, and an extended version that uses separate classification algorithms for each phase. The base version of the ADR-Miner algorithm is evaluated on 20 data sets using three classification algorithms, and the results are compared to a benchmark data reduction algorithm. The non-parametric Wilcoxon signed-ranks test is employed to gauge the statistical significance of the results obtained. The extended version of ADR-Miner is evaluated on 37 data sets using pairings from five classification algorithms, and these results are benchmarked against the performance of the same classification algorithms without data reduction applied as pre-processing.
Keywords: Ant Colony Optimization (ACO), Data Mining, Classification, Data Reduction.
Department
Computer Science & Engineering Department
Degree Name
MS in Computer Science
Graduation Date
6-1-2015
Submission Date
May 2015
First Advisor
Abdelbar, Ashraf
Committee Member 1
Goneid, Amr
Committee Member 2
Ismail, Ismail Amr
Extent
129 p.
Document Type
Master's Thesis
Library of Congress Subject Heading 1
Artificial intelligence.
Library of Congress Subject Heading 2
Data mining.
Rights
The author retains all rights with regard to copyright. The author certifies that written permission from the owner(s) of third-party copyrighted matter included in the thesis, dissertation, paper, or record of study has been obtained. The author further certifies that IRB approval has been obtained for this thesis, or that IRB approval is not necessary for this thesis. Insofar as this thesis, dissertation, paper, or record of study is an educational record as defined in the Family Educational Rights and Privacy Act (FERPA) (20 USC 1232g), the author has granted consent to disclosure of it to anyone who requests a copy.
Institutional Review Board (IRB) Approval
Approval has been obtained for this item
Recommended Citation
APA Citation
Abdel Salam, I. (2015). ADR-Miner: An Ant-based data reduction algorithm for classification [Master's Thesis, the American University in Cairo]. AUC Knowledge Fountain.
https://fount.aucegypt.edu/etds/130
MLA Citation
Abdel Salam, Ismail Mohamed Anwar. ADR-Miner: An Ant-based data reduction algorithm for classification. 2015. American University in Cairo, Master's Thesis. AUC Knowledge Fountain.
https://fount.aucegypt.edu/etds/130