A machine learning model that classifies a website as a phishing website or not, using several classification algorithms and comparing their results.
The following libraries are used throughout this project:
- numpy
- pandas
- seaborn
- scikit-learn
- matplotlib
- pickle
The project proceeds in the following steps:
- pandas is used to load the dataset.
- The features are separated from the target variable.
- seaborn is used to display a heatmap of the features.
- The dataset is split 80-20 into training and testing sets.
- The KNN algorithm is applied, with hyperparameter tuning for the best results.
- GridSearchCV is used for the hyperparameter tuning.
- The accuracy and the confusion matrix are then calculated.
- Naive Bayes, support vector machine (SVM), decision tree, and random forest are applied in the same way.
- Finally, pickle.dump is used to store the accuracy in a file.
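
The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual notebook code: the data is a synthetic stand-in generated with `make_classification` (the real project loads its own dataset with pandas), and the column name `target`, the `n_neighbors` grid, the default settings for the other four models, and the output file name `accuracies.pkl` are all hypothetical. The seaborn heatmap step is omitted so the sketch runs without a display.

```python
import pickle
import tempfile
from pathlib import Path

import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the phishing dataset (assumption: the real
# project would load its own file with pandas instead).
X_arr, y_arr = make_classification(n_samples=300, n_features=8, random_state=0)
df = pd.DataFrame(X_arr, columns=[f"f{i}" for i in range(8)])
df["target"] = y_arr  # hypothetical name for the target column

# Separate the features from the target variable.
X = df.drop(columns="target")
y = df["target"]

# 80-20 split into training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Tune KNN with GridSearchCV (the grid here is only an example).
search = GridSearchCV(KNeighborsClassifier(), {"n_neighbors": [3, 5, 7]}, cv=5)
search.fit(X_train, y_train)

# Accuracy and confusion matrix of the tuned KNN on the test set.
knn_pred = search.predict(X_test)
knn_cm = confusion_matrix(y_test, knn_pred, labels=[0, 1])
accuracies = {"KNN": accuracy_score(y_test, knn_pred)}

# The other classifiers, left at default settings here for brevity.
other_models = {
    "Naive Bayes": GaussianNB(),
    "SVM": SVC(),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "Random forest": RandomForestClassifier(random_state=0),
}
for name, model in other_models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

# Store the accuracies in a file with pickle.dump.
out_path = Path(tempfile.gettempdir()) / "accuracies.pkl"
with open(out_path, "wb") as f:
    pickle.dump(accuracies, f)
```

Fixing `random_state` keeps the split and the tree-based models reproducible between runs, which makes the accuracy comparison between algorithms easier to interpret.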
The following are the results of hyperparameter tuning and of the algorithms applied.
To clear the outputs of all notebooks in place, run:
jupyter nbconvert --ClearOutputPreprocessor.enabled=True --inplace notebooks/*.ipynb