
5. Experimental results

Ioannis G. Tsoulos edited this page Feb 3, 2017 · 3 revisions

To measure the efficiency of the proposed method, a series of experiments was conducted on a number of common classification problems. All experiments used 10-fold cross-validation and were repeated 30 times, each time with a different seed for the random number generator; the reported values are averages over these runs. The following parameters were used:

  1. Number of chromosomes: 200

  2. Number of generations: 500

  3. Selection rate: 90%

  4. Mutation rate: 5%
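The evaluation protocol described above (10-fold, 30 repetitions with a different seed each time, then averaging) can be sketched roughly as follows. This is only an illustration, assuming that "10-fold" refers to 10-fold cross-validation and that the test error is the percentage of misclassified test examples. The names `k_fold_indices`, `repeated_cv_error`, `train_and_predict` and `majority_class` are hypothetical and are not part of the GenClass code; the placeholder classifier merely stands in for the real method.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, rng):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


def repeated_cv_error(X, y, train_and_predict, k=10, runs=30, base_seed=0):
    """Average test error (%) over `runs` repetitions of k-fold cross-validation."""
    run_errors = []
    for run in range(runs):
        rng = random.Random(base_seed + run)  # a different seed for every run
        fold_errors = []
        for test_idx in k_fold_indices(len(X), k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(X)) if i not in held_out]
            predictions = train_and_predict(
                [X[i] for i in train_idx],
                [y[i] for i in train_idx],
                [X[i] for i in test_idx],
                rng,
            )
            wrong = sum(p != y[i] for p, i in zip(predictions, test_idx))
            fold_errors.append(100.0 * wrong / len(test_idx))
        run_errors.append(mean(fold_errors))
    return mean(run_errors)


def majority_class(train_X, train_y, test_X, rng):
    """Placeholder classifier (assumption): predicts the most frequent training label."""
    most_common = max(set(train_y), key=train_y.count)
    return [most_common for _ in test_X]
```

Any classifier can be plugged in through `train_and_predict`, as long as it accepts the training patterns, the training labels, the test patterns and a random generator, and returns one predicted label per test pattern.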

The following datasets were used:

  1. Wine dataset. The wine recognition dataset contains data from wine chemical analysis. It contains 178 examples of 13 features each.

  2. Glass dataset. The dataset contains glass component analysis for glass pieces that belong to 6 classes.

  3. Pima dataset. The Pima Indians Diabetes dataset contains 768 examples of 8 attributes with two categories.

  4. Ionosphere dataset. The ionosphere dataset contains data from the Johns Hopkins Ionosphere database.

  5. EEG dataset. The EEG dataset described in [1] is used here. The dataset consists of five sets (denoted as Z, O, N, F and S), each containing 100 single-channel EEG segments of 23.6 sec duration.

  6. Spiral artificial data: This dataset contains 1000 two-dimensional examples that belong to two classes (500 examples each).

  7. Wisconsin diagnostic breast cancer: The Wisconsin diagnostic breast cancer dataset (WDBC) contains data for breast tumours. It contains 569 training examples of 30 features each that are classified into two categories.

  8. Fertility Data Set (FERT): Semen samples provided by 100 volunteers were analysed according to the WHO 2010 criteria. The dataset contains 100 examples of 10 features each.

  9. Regions Data Set: The Regions dataset was created from liver biopsy images of patients with hepatitis C [2]. The dataset includes 600 samples belonging to 6 classes.

  10. Thyroid Data Set: Thyroid disease records [3] with 7200 patterns of 21 features each.

  11. Parkinsons Data Set: This dataset [4] is composed of a range of biomedical voice measurements from 31 people, 23 with Parkinson's disease.

  12. Abalone Data Set: A dataset to predict the age of abalone from physical measurements [5].

  13. Satellite image Data Set (Satimage): The database consists of the multi-spectral values of pixels in 3x3 neighbourhoods in a satellite image, and the classification associated with the central pixel in each neighbourhood. The dataset contains 6635 patterns.

  14. Dermatology Data Set: The aim is to determine the type of erythemato-squamous disease. The dataset contains 366 patterns with 33 features each.
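The page does not say how the artificial spiral data (item 6) were produced. As a rough illustration only, a two-class dataset with the stated shape (1000 two-dimensional examples, 500 per class) could be generated along the following lines. The function `make_two_spirals` and its parameters `turns` and `noise` are assumptions made for this sketch and do not come from the original experiments.

```python
import math
import random


def make_two_spirals(n_per_class=500, turns=2.0, noise=0.0, seed=0):
    """Return (X, y): two interleaved spiral arms, one arm per class.

    Hypothetical generator; the real dataset may have been built differently.
    """
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for i in range(n_per_class):
            t = (i + 1) / n_per_class  # position along the arm, in (0, 1]
            # The second arm is the first one rotated by 180 degrees.
            angle = 2.0 * math.pi * turns * t + label * math.pi
            radius = t
            px = radius * math.cos(angle) + rng.gauss(0.0, noise)
            py = radius * math.sin(angle) + rng.gauss(0.0, noise)
            X.append((px, py))
            y.append(label)
    return X, y
```

With the default arguments, `make_two_spirals()` yields 1000 points with 2 features each, split evenly between labels 0 and 1.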

The results from the experiments are displayed in the following table:

Table of results

The column DATASET denotes the name of the dataset. The column NEURAL stands for the average test error obtained by applying a neural network to the corresponding dataset. The number of hidden nodes for the neural network was set to 10, and a BFGS variant due to Powell [6] was used to train the network. The column RBF denotes the average test error obtained by applying a Radial Basis Function network to the dataset. The number of hidden nodes for this network was also set to 10. Finally, the column GENCLASS denotes the average test error obtained by applying the proposed method to the dataset. As can be deduced from the results, the proposed method improves classification accuracy on the majority of the datasets used.

References

[1] R.G. Andrzejak, K. Lehnertz, F. Mormann, C. Rieke, P. David, and C. E. Elger, Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: Dependence on recording region and brain state, Phys. Rev. E 64, pp. 1-8, 2001.

[2] Giannakeas, N., Tsipouras, M.G., Tzallas, A.T., Kyriakidi, K., Tsianou, Z.E., Manousou, P., Hall, A., Karvounis, E.C., Tsianos, V., Tsianos, E., A clustering based method for collagen proportional area extraction in liver biopsy images, Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS, 2015-November, art. no. 7319047, pp. 3097-3100, 2015.

[3] Quinlan, J.R., Compton, P.J., Horn, K.A., and Lazurus, L., Inductive knowledge acquisition: A case study, in Proceedings of the Second Australian Conference on Applications of Expert Systems, Sydney, Australia, 1986.

[4] Max A. Little, Patrick E. McSharry, Eric J. Hunter, Lorraine O. Ramig, Suitability of dysphonia measurements for telemonitoring of Parkinson's disease, IEEE Transactions on Biomedical Engineering 56, pp. 1015-1022, 2009.

[5] Warwick J Nash, Tracy L Sellers, Simon R Talbot, Andrew J Cawthorn and Wes B Ford (1994) The Population Biology of Abalone (Haliotis species) in Tasmania. I. Blacklip Abalone (H. rubra) from the North Coast and Islands of Bass Strait, Sea Fisheries Division, Technical Report No. 48 (ISSN 1034-3288)

[6] M.J.D. Powell, A Tolerant Algorithm for Linearly Constrained Optimization Calculations, Mathematical Programming 45, pp. 547, 1989.
