About data

Dataset 1

  • Create a model for predicting mortality caused by heart failure. The labels are imbalanced, so consider using the F1-score instead of accuracy. https://www.kaggle.com/andrewmvd/heart-failure-clinical-data

Dataset 2

  • Create a model for predicting complications during surgery. The labels are imbalanced, so consider using the F1-score instead of accuracy. https://www.kaggle.com/omnamahshivai/surgical-dataset-binary-classification
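To illustrate why both dataset notes recommend the F1-score, here is a minimal pure-Python sketch. The labels are made up for illustration and do not come from either dataset, and the helper functions are hypothetical, not part of this project. A "model" that always predicts the majority class scores high on accuracy but gets an F1-score of zero.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # no true positives, so report the score as zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels: 9 negatives and 1 positive (illustrative only).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
# A trivial "model" that always predicts the majority class.
y_majority = [0] * 10

print(accuracy(y_true, y_majority))  # 0.9
print(f1_score(y_true, y_majority))  # 0.0
```

In a real project the same metric is available as `sklearn.metrics.f1_score`; the hand-written version is used here only to keep the example dependency-free.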

Setup

  • Create a virtual environment using virtualenv venv
  • Activate the virtual environment by running source venv/bin/activate
  • On Windows, use venv\Scripts\activate.bat instead
  • Install the dependencies using pip install -r requirements.txt
  • List the available options with python main.py -h, or run with the default settings using python main.py

Example

  1. Run in a terminal: python main.py -m DecisionTree -d data/dataset1.csv
  2. The output should look like this: Screenshot1
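For readers curious how a command line like the one above might be parsed, here is a small sketch using the standard argparse module. Only the short flags -m and -d and the values DecisionTree and data/dataset1.csv come from this README. The long option names, the other model names, and the defaults are assumptions, so the real main.py may differ.

```python
import argparse


def build_parser():
    """Build a parser with the two flags used in the example command."""
    parser = argparse.ArgumentParser(
        description="Train and evaluate a classifier on a CSV dataset."
    )
    parser.add_argument(
        "-m", "--model",  # long name is an assumption
        choices=["DecisionTree", "RandomForest", "KNN"],  # only DecisionTree is confirmed
        default="DecisionTree",  # assumed default
        help="model to train",
    )
    parser.add_argument(
        "-d", "--dataset",  # long name is an assumption
        default="data/dataset1.csv",  # assumed default
        help="path to a preprocessed CSV file",
    )
    return parser


# Parse the arguments from the example command instead of sys.argv.
args = build_parser().parse_args(["-m", "DecisionTree", "-d", "data/dataset1.csv"])
print(args.model, args.dataset)
```

Running python main.py -h against the actual script remains the authoritative way to see which options it accepts.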

Limitations

  1. The dataset must be preprocessed and cleaned (no missing values, no outliers, no duplicates).
  2. The dataset must be in CSV format.
  3. The models solve classification problems only.
  4. Available models: Decision Tree, Random Forest, and K-Nearest Neighbors.
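Since the tool expects a clean CSV file, a quick check before running it can save time. The sketch below is a hypothetical helper, not part of this project. It uses only the standard csv module to count rows with empty cells and exact duplicate rows. It does not detect outliers, and the column names in the sample are made up.

```python
import csv
import io


def check_csv(lines):
    """Count rows with empty cells and exact duplicate rows in CSV text."""
    reader = csv.reader(lines)
    header = next(reader)
    seen = set()
    missing = duplicates = 0
    for row in reader:
        # A row is flagged if it is shorter or longer than the header or has a blank cell.
        if len(row) != len(header) or any(cell.strip() == "" for cell in row):
            missing += 1
        key = tuple(row)
        if key in seen:
            duplicates += 1
        seen.add(key)
    return {"rows_with_missing": missing, "duplicate_rows": duplicates}


# Made-up sample; in practice pass an open file, e.g. open(path, newline="").
sample = io.StringIO("age,label\n60,1\n,0\n60,1\n")
print(check_csv(sample))  # {'rows_with_missing': 1, 'duplicate_rows': 1}
```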

Future work

  1. Add models: Logistic Regression, Support Vector Machine, Naive Bayes, K-Means, Linear SVM, and Gradient Boosting Classifier.
  2. Add model manipulation: Grid Search, Cross-Validation, and Hyperparameter Tuning.
  3. Add visualization.
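As a rough idea of what the planned cross-validation could involve, the sketch below generates train/test index splits for plain k-fold. It is a hypothetical illustration, not the project's planned implementation. It does not shuffle or stratify, which would likely matter for the imbalanced labels in both datasets.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # Spread the remainder over the first `extra` folds.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


for train, test in k_fold_indices(5, 2):
    print(train, test)
# [3, 4] [0, 1, 2]
# [0, 1, 2] [3, 4]
```

Each sample lands in exactly one test fold, so averaging a metric such as the F1-score over the folds uses every row for evaluation once.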