GitHub - cecilyal/Project_MadelonDatasets: Looking at important features in a Madelon Dataset

This project looked at two Madelon Datasets.

The first one contained 500 features and the second one contained over 6,000 features.

The purpose of this project was to find how those features correlate with the target, which also required building a model to do so.

The first step was to create benchmark models and see how they performed.

I used these models for my benchmarking: logistic regression, decision tree, k nearest neighbors, and support vector classifier.
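The benchmarking step could look roughly like the sketch below. It is not the notebook's code: it assumes scikit-learn, uses default hyperparameters, and substitutes a small synthetic dataset from `make_classification` (whose generator is adapted from the one used to create Madelon) for the project's data. The names `benchmarks` and `benchmark_scores` are made up for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in data, much smaller than either Madelon dataset.
X, y = make_classification(
    n_samples=600, n_features=50, n_informative=5, n_redundant=5,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The four benchmark model types named in the README, with default settings.
benchmarks = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "k nearest neighbors": KNeighborsClassifier(),
    "support vector classifier": SVC(),
}

# Fit each model behind a scaler and record its held-out accuracy.
benchmark_scores = {}
for name, model in benchmarks.items():
    pipeline = make_pipeline(StandardScaler(), model)
    pipeline.fit(X_train, y_train)
    benchmark_scores[name] = pipeline.score(X_test, y_test)

# Print the models from best to worst test accuracy.
for name, score in sorted(benchmark_scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Scaling is included because logistic regression, k nearest neighbors, and the support vector classifier are sensitive to feature scale; the decision tree is unaffected by it.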

The second step was to take the best-performing model and use it to identify important features. I then tried adjusting the pipelines to improve the model further.
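One possible shape for this second step is sketched below. The README does not say which model performed best or how features were ranked, so the choices here are assumptions: k nearest neighbors as the model, univariate selection with `SelectKBest` and `f_classif` as the feature filter, and a small `GridSearchCV` grid as the pipeline adjustment. The data is again a synthetic stand-in, and the step names `scale`, `select`, and `model` are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic stand-in data. With shuffle=False the informative
# features are placed in the first n_informative columns (0 to 4 here).
X, y = make_classification(
    n_samples=400, n_features=40, n_informative=5, n_redundant=0,
    shuffle=False, random_state=0,
)

# Scale, keep the k highest-scoring features, then classify.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(score_func=f_classif)),
    ("model", KNeighborsClassifier()),
])

# Hypothetical grid: how many features to keep and how many neighbors to use.
param_grid = {
    "select__k": [5, 10, 20],
    "model__n_neighbors": [3, 5, 9],
}
search = GridSearchCV(pipeline, param_grid, cv=3)
search.fit(X, y)

# Column indices kept by the selector in the best pipeline.
selected_features = (
    search.best_estimator_.named_steps["select"].get_support(indices=True)
)
print("best parameters:", search.best_params_)
print("best cross-validated accuracy:", round(search.best_score_, 3))
print("selected feature indices:", selected_features.tolist())
```

Putting the selector inside the pipeline means it is refit on each cross-validation training fold, so the feature choice does not leak information from the validation fold.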

Repository files (history: 7 commits):
.gitignore
Dataset Conclusion.docx
Main_Large_Dataset_Work.ipynb
Project 3 Cleaning Notebook.ipynb
README.md
