Face recognition experiment using custom PCA and LDA implementations together with the scikit-learn K Nearest Neighbors classifier. The experiment is conducted on the CMU PIE data set, which consists of 67 subjects with 21 samples each. Each sample is a 30 x 30 image. The images vary in lighting and angle.
The first results are a visualization of the five leading "eigenfaces". These faces represent the "principal faces" of the data set and are analogous to the axes of "face space". In Principal Component Analysis, the principal components are the right singular vectors from the singular value decomposition of the (mean-centered) data matrix, with one sample per row. These components are used to reduce the dimensionality of the data set from 900 (30 x 30 pixels) to, in this experiment, 100. That is, the data is projected from 900 dimensions, i.e. axes, onto the first 100 singular vectors.
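The projection described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the contents of PCA.py: the function names pca_fit and pca_transform are hypothetical, and random data stands in for the flattened 30 x 30 images.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the mean and the leading principal directions of X (samples in rows)."""
    mean = X.mean(axis=0)
    # Rows of Vt are the right singular vectors of the centered data,
    # ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_transform(X, mean, components):
    """Project X onto the given principal directions."""
    return (X - mean) @ components.T

# Synthetic stand-in for the data set (assumption): 200 flattened 30 x 30 images.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 900))

mean, components = pca_fit(X, n_components=100)
Z = pca_transform(X, mean, components)           # 900 -> 100 dimensions
eigenfaces = components[:5].reshape(5, 30, 30)   # five leading "eigenfaces" as images
```

Each row of components has 900 entries, so reshaping it to 30 x 30 gives an image that can be displayed like any other face in the data set.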
The next results come from a classification system using the K Nearest Neighbors classifier. The data set is split such that the model is trained on 5, 10, or 15 samples from each subject. The model is first trained and tested with PCA alone (results in 'results.png').
Next, LDA is applied to maximize the spread between classes relative to the spread within each class, which in theory improves the classifier's ability to identify subjects. LDA uses the product of the inverse of the within-class scatter matrix and the between-class scatter matrix. The eigenvalue decomposition of this matrix gives the best linear discriminants, i.e. the directions along which the classes are best separated. PCA helps LDA: PCA first reduces the dimensionality of the data, and LDA then chooses the best "angle" to maximize the separation between the classes.
The data transformed by PCA followed by LDA is trained and tested using the 5, 10, or 15 split defined previously. The results for this method show a marked improvement, reaching a 100% classification rate on the test data with the largest training split (15 samples per subject). As of now, no methods are used to validate the fitness of this model. It is possible that the model is overfitting to the data, but the purpose of this experiment is mainly to compare the performance of PCA on its own with that of the combination of PCA and LDA.
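As a rough sketch of the per-subject split and the LDA step, the NumPy code below builds the within-class and between-class scatter matrices and takes the leading eigenvectors of their product. It is not the contents of LDA.py: the names split_per_subject and lda_fit are hypothetical, the pseudo-inverse is an assumption made here to cope with a singular within-class scatter matrix, and small synthetic data stands in for the PCA-reduced images. Note that with C classes the between-class scatter has rank at most C - 1, so at most C - 1 discriminants are meaningful.

```python
import numpy as np

def split_per_subject(y, n_train, rng):
    """Pick n_train random samples per subject for training; the rest are test."""
    train_idx = []
    for label in np.unique(y):
        idx = np.flatnonzero(y == label)
        train_idx.extend(rng.permutation(idx)[:n_train])
    train_idx = np.sort(np.array(train_idx))
    test_idx = np.setdiff1d(np.arange(len(y)), train_idx)
    return train_idx, test_idx

def lda_fit(X, y, n_components):
    """Return the leading discriminant directions as columns of a (d, k) matrix."""
    d = X.shape[1]
    overall_mean = X.mean(axis=0)
    S_w = np.zeros((d, d))  # within-class scatter
    S_b = np.zeros((d, d))  # between-class scatter
    for label in np.unique(y):
        Xc = X[y == label]
        mean_c = Xc.mean(axis=0)
        centered = Xc - mean_c
        S_w += centered.T @ centered
        diff = (mean_c - overall_mean)[:, None]
        S_b += len(Xc) * (diff @ diff.T)
    # Eigenvectors of inv(S_w) @ S_b; pinv guards against a singular S_w.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(S_w) @ S_b)
    order = np.argsort(eigvals.real)[::-1][:n_components]
    return eigvecs[:, order].real

# Synthetic stand-in for PCA-reduced features (assumption): 4 subjects, 21 samples each.
rng = np.random.default_rng(0)
n_subjects, n_samples, n_features = 4, 21, 10
y = np.repeat(np.arange(n_subjects), n_samples)
class_means = rng.normal(scale=3.0, size=(n_subjects, n_features))
X = class_means[y] + rng.normal(size=(len(y), n_features))

train_idx, test_idx = split_per_subject(y, n_train=15, rng=rng)
W = lda_fit(X[train_idx], y[train_idx], n_components=n_subjects - 1)
Z_train = X[train_idx] @ W  # features to fit the classifier on
Z_test = X[test_idx] @ W    # features to evaluate the classifier on
```

Z_train and Z_test can then be passed, together with the matching labels, to the K Nearest Neighbors classifier in place of the PCA-only features.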
To run this code, LDA.py, PCA.py, PIE.mat, and test.py must be in the same working directory. The settings for the model are at the top of 'test.py', which is also the file used to run the experiment.