The project aims to develop a prediction model for estimating the likelihood of diabetes in patients.
The following steps were taken to achieve the objective:
- Conducted exploratory data analysis using Pandas and Seaborn, and visualized the data using Seaborn and Matplotlib.
- Tested three distinct classification algorithms, including logistic regression and decision trees, to develop a prediction model using the Scikit-learn toolkit.
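
The steps above can be illustrated with a minimal sketch. It is not the code from the project's notebooks: the data is a synthetic stand-in generated with `make_classification`, the column names (`feature_0`, ...) are hypothetical, and only the two algorithms named above are compared, since the third is not identified here. Plotting is left out to keep the example short.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): the project's real dataset is not used here.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=5, random_state=0
)
features = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(X.shape[1])])

# Exploratory step: per-column summary statistics.
summary = features.describe()
print(summary.loc[["mean", "std"]].round(2))

# Candidate classifiers. Logistic regression is scaled first; trees do not need it.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=0),
}

# Compare the models with 5-fold cross-validated accuracy.
scores = {
    name: cross_val_score(model, features, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: mean CV accuracy = {score:.3f}")
```

Cross-validation is used here instead of a single train/test split so that the comparison depends less on one particular partition of the data; the notebooks may evaluate the models differently.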
Please note that the Jupyter notebooks included in this project are written in Spanish.
This is a final project developed during the Bootcamp de ciencia de datos by Código Facilito.
You'll need a few things installed on your computer to be able to run the content in this repository:
- A working Python installation (Anaconda or Miniconda).
- The project's conda environment installed.
- A web browser that works with Jupyter.
To run these notebooks, we use the Python distribution from Anaconda along with the conda package manager.
If you already have Anaconda or Miniconda installed, you can skip this step. Otherwise, follow the instructions on the Anaconda website.
If you need more help installing Anaconda, you can watch this video tutorial from Software Carpentry.
All necessary dependencies can be installed through the conda package manager.
The file called environment.yml contains all the dependencies that the manager needs to install.
To do this, run:
conda env create -f environment.yml
Then activate the environment:
conda activate diabetes
Once the environment is activated, launch the JupyterLab server:
jupyter lab
Jupyter should open in your default web browser. If you need more help running JupyterLab, you can watch this lesson from Software Carpentry.
All Python source code is available under the BSD 3-clause license. You are free to use and modify the code, without warranty, as long as you provide attribution to the authors.
Unless otherwise specified, all figures and Jupyter notebooks are available under the Creative Commons Attribution 4.0 (CC-BY) license.
The full text of these licenses is provided in the LICENSE.txt file.