Python package for plug-and-play feature selection techniques, cross-validation and performance evaluation of machine learning models. If you like the idea or find this repo useful in your work, please leave a ⭐ to support this personal project.
- Feature selection techniques (to be tested)
To accompany the feature selection methods, this package also provides:
- Cross-validation methods with performance metrics:
  - K-fold;
  - Leave One Out (LOO);
  - Leave One Subject Out (LOSO).
- Performance metrics for binary and multi-class tasks:
  - Confusion matrix plot (binary and multi-class tasks);
  - Precision (binary tasks);
  - Sensitivity (binary tasks);
  - Specificity (binary tasks);
  - F1 score (binary tasks);
  - sklearn classification report (binary and multi-class tasks).
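As a rough illustration of two items in the list above, the sketch below builds Leave One Subject Out splits and computes the four binary metrics from confusion-matrix counts. It is plain Python written for this README, not the package's own implementation: the helper names `loso_splits` and `binary_metrics` are hypothetical, and returning 0.0 when a denominator is zero is an assumption. scikit-learn's `LeaveOneGroupOut` offers comparable splitting.

```python
from collections import defaultdict


def loso_splits(subjects):
    """Yield (held_out_subject, train_idx, test_idx), one split per subject."""
    by_subject = defaultdict(list)
    for idx, subject in enumerate(subjects):
        by_subject[subject].append(idx)
    for held_out, test_idx in by_subject.items():
        # Train on every sample that does not belong to the held-out subject.
        train_idx = [i for s, idxs in by_subject.items() if s != held_out for i in idxs]
        yield held_out, train_idx, test_idx


def binary_metrics(y_true, y_pred, positive=1):
    """Compute precision, sensitivity, specificity and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # Assumption: report 0.0 instead of failing when a denominator is zero.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    sensitivity = ratio(tp, tp + fn)  # also called recall
    specificity = ratio(tn, tn + fp)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Five samples recorded from three subjects: each subject is held out once.
subjects = ["s1", "s1", "s2", "s2", "s3"]
splits = list(loso_splits(subjects))

# Toy predictions with 3 TP, 1 FN, 1 FP and 3 TN.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

Holding out all samples of one subject at a time keeps data from the same subject out of both the training and the test set, which is the point of LOSO compared with plain K-fold.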
Each method returns three outputs:
- conf_matrix: confusion matrix of the 5-fold cross-validation using the input model and the selected features;
- fs_perf: dataframe with the baseline and the feature selection classification performance, to show whether the feature selection method works for your classification task;
- feat_selected: dataframe with the selected features, i.e. the input X dataframe restricted to the selected columns.
At the moment the package cannot be installed with pip install <PACKAGE-NAME>.
To install from the source code, see the commands at the end of this README.
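The usage snippets in this README call each method with `clf`, `X` and `y` without defining them. One possible setup is sketched here, assuming the package accepts a scikit-learn classifier, a pandas DataFrame of features and a label vector; the synthetic dataset, the `feat_<i>` column names and the choice of random forest are illustrative assumptions, not requirements stated by the package.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic binary task: 200 samples, 50 features, 10 of them informative.
X_arr, y_arr = make_classification(
    n_samples=200, n_features=50, n_informative=10, random_state=0
)

# Assumption: X is a DataFrame, since the selected features come back as a DataFrame.
X = pd.DataFrame(X_arr, columns=[f"feat_{i}" for i in range(X_arr.shape[1])])
y = pd.Series(y_arr, name="target")

# Any scikit-learn style classifier should be usable here; a random forest is one choice.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
```

With 50 input columns, a call such as n_feat=30 actually discards features, which makes the baseline comparison in fs_perf meaningful.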
from src.feature_selection.feature_selection import FeatureSelection
conf_matrix, fs_perf, feat_selected = FeatureSelection().variance_threshold(clf, X, y, thr=0.5, baseline=True)
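For intuition about what a variance threshold does, here is a minimal plain-Python sketch. It is not the package's code: the helper `variance_filter` is hypothetical, and using the population variance with a strict `>` comparison against `thr` is an assumption about the selection rule.

```python
from statistics import pvariance


def variance_filter(columns, thr):
    """Return the names of columns whose population variance exceeds thr."""
    return [name for name, values in columns.items() if pvariance(values) > thr]


columns = {
    "constant": [1.0, 1.0, 1.0, 1.0],     # variance 0.0
    "low_spread": [0.0, 0.5, 0.0, 0.5],   # variance 0.0625
    "high_spread": [0.0, 2.0, 0.0, 2.0],  # variance 1.0
}

kept = variance_filter(columns, thr=0.5)
print(kept)  # ['high_spread']
```

Because variance depends on the scale of each feature, the same thr can be strict for one column and lenient for another unless the features share comparable units.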
from src.feature_selection.feature_selection import FeatureSelection
conf_matrix, fs_perf, feat_selected = FeatureSelection().anova(clf, X, y, n_feat=30, baseline=True)
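ANOVA-based selection typically scores each feature with a one-way F statistic (variance between classes divided by variance within classes) and keeps the top-scoring features. The sketch below shows that idea in plain Python; the helper `anova_f`, the toy features and the ranking step are illustrative assumptions, and the package may compute or rank scores differently.

```python
from collections import defaultdict
from statistics import fmean


def anova_f(values, labels):
    """One-way ANOVA F statistic of a single feature grouped by class label."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    n, k = len(values), len(groups)
    grand_mean = fmean(values)
    # Mean square between classes: spread of the class means around the grand mean.
    between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups.values()) / (k - 1)
    # Mean square within classes: spread of the samples around their own class mean.
    within = sum((v - fmean(g)) ** 2 for g in groups.values() for v in g) / (n - k)
    if within == 0:
        # Assumption: a feature that is constant within every class gets an infinite
        # score if the class means differ, and 0.0 otherwise.
        return float("inf") if between > 0 else 0.0
    return between / within


labels = [0, 0, 0, 1, 1, 1]
features = {
    "separating": [1.0, 2.0, 3.0, 7.0, 8.0, 9.0],     # class means 2 and 8
    "overlapping": [1.0, 5.0, 9.0, 2.0, 5.0, 8.0],    # class means 5 and 5
}

scores = {name: anova_f(values, labels) for name, values in features.items()}
n_feat = 1  # hypothetical value for this toy example
top = sorted(scores, key=scores.get, reverse=True)[:n_feat]
```

Here the first feature scores 54.0 and the second 0.0, so only the first is kept when n_feat is 1.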
from src.feature_selection.feature_selection import FeatureSelection
conf_matrix, fs_perf, feat_selected = FeatureSelection().mutual_info(clf, X, y, n_feat=30, baseline=True)
from src.feature_selection.feature_selection import FeatureSelection
conf_matrix, fs_perf, feat_selected = FeatureSelection().recursive_feature_elimination(clf, X, y, n_feat=30, baseline=True)
from src.feature_selection.feature_selection import FeatureSelection
conf_matrix, fs_perf, feat_selected = FeatureSelection().random_forest_importance(clf, X, y, threshold=0.8, baseline=True, verbose=True)
from src.feature_selection.feature_selection import FeatureSelection
conf_matrix, fs_perf, feat_selected = FeatureSelection().relieff(clf, X, y, n_feat=30, baseline=True)
from src.feature_selection.feature_selection import FeatureSelection
conf_matrix, fs_perf, feat_selected = FeatureSelection().cluster_quality(clf, X, y, n_feat=30, baseline=True, verbose=True)
To install from the source code, run one of the following commands in your terminal:
pip install git+<repository-link>
or
python -m pip install git+<repository-link>
or
python3 -m pip install git+<repository-link>