[WIP] Add model selection example with LFW dataset and KNN task #344
+115
−0
I created a model selection example for supervised Mahalanobis learners, to show the effectiveness of the linear transformation.
I use a "large" dataset from sklearn: the Labeled Faces in the Wild (LFW) people dataset (classification). It is a bit more complex than iris, which is also why I use PCA to reduce dimensionality.
The usual pipeline would be PCA -> Classifier, but in this case we try PCA -> Metric learner -> Classifier, and we compare the precision, recall and f1 scores against the first scenario, which I call the baseline.
To compare models, I fixed the final classifier to be a `KNeighborsClassifier`. In general, all supervised learners are able to outperform the baseline.
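To give a rough idea of the comparison described above, here is a minimal sketch. It is not the code from this PR. So that it runs offline and with scikit-learn alone, it makes two substitutions, both of which are assumptions: the small `load_digits` dataset stands in for LFW, and scikit-learn's `NeighborhoodComponentsAnalysis` stands in for a supervised Mahalanobis learner. The number of PCA components, the iteration cap, the weighted averaging of scores and the `evaluate` helper are also illustrative choices.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.metrics import precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier, NeighborhoodComponentsAnalysis
from sklearn.pipeline import make_pipeline

# Assumption: digits replaces LFW so the sketch needs no download.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)


def evaluate(pipeline):
    """Fit on the training split and return weighted test scores (hypothetical helper)."""
    pipeline.fit(X_train, y_train)
    y_pred = pipeline.predict(X_test)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="weighted", zero_division=0
    )
    return {"precision": precision, "recall": recall, "f1": f1}


# Baseline: PCA -> KNN.
baseline = make_pipeline(
    PCA(n_components=20, random_state=0),
    KNeighborsClassifier(),
)

# Candidate: PCA -> metric learner -> KNN.
# Assumption: NCA is used here as one example of a supervised learner
# that fits a linear transformation before the classifier.
with_metric_learner = make_pipeline(
    PCA(n_components=20, random_state=0),
    NeighborhoodComponentsAnalysis(max_iter=30, random_state=0),
    KNeighborsClassifier(),
)

scores = {
    "baseline": evaluate(baseline),
    "pca+nca": evaluate(with_metric_learner),
}

for name, result in scores.items():
    print(name, {metric: round(value, 3) for metric, value in result.items()})
```

Keeping the final `KNeighborsClassifier` identical in both pipelines means any difference in the scores comes from the learned transformation rather than from the classifier.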
I think this example can be useful to users, because it's hard to know beforehand which model will perform best on a given dataset.
Note: The models' parameters are not tuned; this example acts as a "final" comparison between models.