Abstract
Ensemble encoding is a biologically motivated, distributed data representation scheme for multilayer perceptron (MLP) networks. Multiple overlapping receptive fields are used to enhance locality of representation. The number, form, and placement of the receptive fields have a great impact on performance. This thesis presents four heuristics for optimizing receptive field configuration, two based on descriptive statistics and two based on clustering, and compares their performance on four benchmark data sets. The two statistical approaches are based on the mean and median of the data set. The two clustering methods are c-means and fuzzy c-means clustering. The four data sets are well-known machine learning benchmarks: breast cancer diagnosis; predicting the contraceptive method used by women in Indonesia from their social and economic status; predicting whether a hepatitis patient will live or die from symptoms and clinical observations; and predicting protein localization sites in E. coli bacteria. Performance varies among the benchmarks, but on one benchmark the fuzzy clustering heuristic yields a 56.6% improvement in test set classification over unencoded data and a 48.98% improvement over symmetrical-placement three-receptor ensemble encoding. The thesis also extends the original symmetrical placement approach proposed by Narayan, which used only three receptive fields, with some experiments on allocating more receptors by symmetrical placement. Furthermore, the thesis explores extending backpropagation to incorporate the parameters of the receptive fields into the learning process; results show that this extension is outperformed by the proposed clustering heuristics. Previous work introduced the idea of using the standard deviation to dilate receptive fields. Experiments with different fractions of the standard deviation on the available data sets show that this approach does not yield a significant improvement in performance. All experiments were run using leave-one-out cross-validation to ensure a fair evaluation of the trained networks, and analysis of variance is used to confirm that the results are statistically significant. In total, 182,844 networks were trained during the experiments.
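As a rough illustration of the idea described in the abstract, the sketch below encodes one scalar feature as the activations of several overlapping receptive fields whose centers are placed by plain (hard) c-means. It is not the thesis's implementation: the Gaussian field shape, the fixed `width`, the even-spread initialization, and the names `cmeans_1d` and `encode` are all assumptions made for this example, since the abstract does not specify them.

```python
import math

def cmeans_1d(values, k, iters=50):
    """Hard c-means on scalar values; returns k centers in ascending order.
    Hypothetical helper: initialization and stopping rule are assumptions."""
    lo, hi = min(values), max(values)
    # Deterministic start: spread the centers evenly over the observed range.
    centers = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # An empty cluster keeps its previous center.
        updated = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)

def encode(x, centers, width):
    """Activation of each receptive field for input x.
    Gaussian fields are an assumed form, not taken from the thesis."""
    return [math.exp(-((x - c) ** 2) / (2 * width ** 2)) for c in centers]

# Toy feature column (made-up numbers, for illustration only).
values = [0.1, 0.2, 0.15, 0.9, 1.0, 0.95, 0.5]
centers = cmeans_1d(values, k=3)
code = encode(0.5, centers, width=0.2)
print(centers)
print(code)
```

Each input value becomes a vector with one entry per receptive field, so the MLP sees a distributed code in which nearby values activate overlapping sets of fields. A fuzzy c-means variant would replace the hard nearest-center assignment with graded memberships.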
School
School of Sciences and Engineering
Department
Computer Science & Engineering Department
Degree Name
MS in Computer Science
Date of Award
6-1-2003
Online Submission Date
1-1-2003
First Advisor
Ashraf Abdelbar
Document Type
Thesis
Extent
244 leaves
Library of Congress Subject Heading 1
Computer networks.
Library of Congress Subject Heading 2
Cluster set theory
Rights
The American University in Cairo grants authors of theses and dissertations a maximum embargo period of two years from the date of submission, upon request. After the embargo elapses, these documents are made available publicly. If you are the author of this thesis or dissertation, and would like to request an exceptional extension of the embargo period, please write to thesisadmin@aucegypt.edu
Recommended Citation
APA Citation
Hassan, D.
(2003). Optimization of receptive fields for MLP networks with ensemble encoding [Thesis, the American University in Cairo]. AUC Knowledge Fountain.
https://fount.aucegypt.edu/retro_etds/1714
MLA Citation
Hassan, Deena Osama. Optimization of receptive fields for MLP networks with ensemble encoding. 2003. American University in Cairo, Thesis. AUC Knowledge Fountain.
https://fount.aucegypt.edu/retro_etds/1714
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Call Number
Thesis 2003/43
Location
mgfth;mrs2