Discussion_NextSteps.md

Summary of results and next steps

Characteristics of dataset

The dataset is built in Make_dataset.ipynb, which runs SQL queries against the MIMIC-III database to retrieve creatinine measurements together with other features of interest: static information such as the patient's age and diagnosis, and dynamic information such as arterial pressure. For dynamic variables, which vary with time, we retrieve the last measurement taken before the creatinine measurement, as well as the delay between the two measurements.
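As an illustration of the "last measurement before the creatinine measurement, plus delay" logic, here is a minimal pandas sketch. It is not the code from Make_dataset.ipynb: the tables, the column names (`patient_id`, `charttime`) and the helper `attach_last_measurement` are hypothetical, and the join is done in pandas rather than in SQL. Note that `merge_asof` with its default settings also accepts a measurement taken at exactly the same time.

```python
import pandas as pd

# Hypothetical toy tables; column names are illustrative, not the MIMIC-III schema.
creatinine = pd.DataFrame({
    "patient_id": [1, 1, 2],
    "charttime": pd.to_datetime(
        ["2130-01-01 08:00", "2130-01-02 08:00", "2130-01-01 09:00"]),
    "creatinine": [1.1, 1.4, 0.9],
})
heart_rate = pd.DataFrame({
    "patient_id": [1, 1, 2],
    "charttime": pd.to_datetime(
        ["2130-01-01 07:30", "2130-01-02 06:00", "2130-01-01 10:00"]),
    "heart_rate": [80.0, 95.0, 70.0],
})


def attach_last_measurement(target, feature, name):
    """Attach to each target row the latest `name` value charted at or
    before it (same patient), plus the delay between the two times."""
    time_col = f"{name}_time"
    feature = feature.rename(columns={"charttime": time_col})
    merged = pd.merge_asof(
        target.sort_values("charttime"),   # both sides must be sorted on the time key
        feature.sort_values(time_col),
        left_on="charttime",
        right_on=time_col,
        by="patient_id",                   # never match across patients
        direction="backward",              # only look at earlier measurements
    )
    merged[f"{name}_delay"] = merged["charttime"] - merged[time_col]
    return merged.drop(columns=time_col)


dataset = attach_last_measurement(creatinine, heart_rate, "heart_rate")
print(dataset)
```

In this toy data, patient 2 has no heart rate recorded before the creatinine measurement, so both the value and the delay are missing for that row.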

Important remark

During the Hackathon, we used the labevents table from MIMIC-III to retrieve all the lab results (including creatinine rates). However, we learned from a discussion with other MIMIC users that the dates stored in labevents correspond to the time when a sample was taken, not to the time when the results were known. For this reason, we decided to use only the values stored in chartevents, where all the lab measurements are supposed to be reported. Indeed, chartevents contains another time variable (storetime) that is closer to the time at which the staff get the results from the lab.

Drawback: the dataset obtained from chartevents contains far fewer creatinine measurements.

Description of the dataset:

Basic statistics about the dataset can be found in Explore_dataset.ipynb.

Number of 24h-variations in creatinine rates: 36251

Number of features (including delays): 61

List of features:

['creatinine' 'age' 'arterial_pressure_systolic' 'arterial_pressure_systolic_delay' 'arterial_pressure_diastolic' 'arterial_pressure_diastolic_delay' 'heart_rate' 'heart_rate_delay' 'temperature' 'temperature_delay' 'ph_blood' 'ph_blood_delay' 'ethnicity' 'diagnosis' 'gender' 'weight_daily' 'weight_daily_delay' 'urine_output' 'urine_output_delay' 'day_urine_output' 'day_urine_output_delay' 'scr' 'scr_delay' 'sodium' 'sodium_delay' 'potassium' 'potassium_delay' 'calcium' 'calcium_delay' 'phosphor' 'phosphor_delay' 'hemoglobine' 'hemoglobine_delay' 'uric_acid' 'uric_acid_delay' 'chloride' 'chloride_delay' 'platelet_count' 'platelet_count_delay' 'fibrinogen' 'fibrinogen_delay' 'urinary_sodium' 'urinary_sodium_delay' 'urinary_potassium' 'urinary_potassium_delay' 'urine_creatinin' 'urine_creatinin_delay' 'alkaline_phospatase' 'alkaline_phospatase_delay' 'total_protein_blood' 'total_protein_blood_delay' 'albumin' 'albumin_delay' 'total_protein_urine' 'total_protein_urine_delay' 'bilirubin' 'bilirubin_delay' 'c_reactive_protein' 'c_reactive_protein_delay' 'creatinine_yesterday' 'creatinine_before_yesterday']

The dataset is quite sparse (please refer to Explore_dataset.ipynb). For now, we have not used any imputation method to replace missing values. Instead, a threshold t = 0.3 was fixed, and each feature whose rate of missing values exceeds t is dropped. After that, each example that still has missing values is also dropped.
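The two-step filtering described above can be sketched as follows. This is a minimal pandas illustration under assumed names (`drop_sparse` and the toy columns are hypothetical), not the code used in the notebooks.

```python
import numpy as np
import pandas as pd


def drop_sparse(df, threshold=0.3):
    """Drop features whose missing rate exceeds `threshold`,
    then drop the examples that still contain missing values."""
    missing_rate = df.isna().mean()               # fraction of NaN per column
    kept = df.loc[:, missing_rate <= threshold]   # step 1: drop sparse features
    return kept.dropna(axis=0)                    # step 2: drop incomplete examples


# Hypothetical toy data.
toy = pd.DataFrame({
    "creatinine": [1.0, 1.2, 0.8, 1.5],
    "heart_rate": [80, np.nan, 75, 90],            # 25% missing: feature kept
    "fibrinogen": [np.nan, np.nan, np.nan, 300],   # 75% missing: feature dropped
})
clean = drop_sparse(toy)
print(clean)
```

The order matters: dropping the sparse features first avoids discarding examples only because of a feature that is removed anyway.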

Number of examples after dropping NAs and outliers: 27313

Number of features after dropping NAs: 15

List of features after dropping NAs:

['creatinine' 'age' 'arterial_pressure_systolic' 'arterial_pressure_systolic_delay' 'arterial_pressure_diastolic' 'arterial_pressure_diastolic_delay' 'heart_rate' 'heart_rate_delay' 'temperature' 'temperature_delay' 'ph_blood' 'ph_blood_delay' 'ethnicity' 'diagnosis' 'gender']

Characteristics of models

| Model | Description | Number of features (after dummy encoding) | Number of examples in training set |
| --- | --- | --- | --- |
| linearSVM | linear Support Vector Machine | 35 | 21850 |
| logreg | logistic regression with l2 regularizer | 35 | 21850 |
| logreg_elasticnet | logistic regression with elasticnet regularizer | 35 | 21850 |
| XGBoost | Gradient boosted CARTs (Classification And Regression Trees) | 35 | 21850 |
| perceptron_1hl | Multilayer perceptron with 1 hidden layer | 35 | 21850 |
| perceptron_2hl | Multilayer perceptron with 2 hidden layers | 35 | 21850 |
| XGBoost_NAs | Gradient boosted CARTs with internal handling of missing values. This last model makes it possible to keep missing values in the dataset and hence to keep all features and examples. | 66 | 35081 |

Please refer to Train_models.ipynb for further details.
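For readers unfamiliar with dummy encoding, the sketch below shows how categorical columns expand into one indicator column per category, which is why the feature count grows after encoding. The category values are made up for illustration, and treating `gender` and `ethnicity` as the categorical columns is an assumption; Train_models.ipynb remains the reference for what was actually done.

```python
import pandas as pd

# Hypothetical toy rows; category values are illustrative only.
toy = pd.DataFrame({
    "age": [64, 71, 55],
    "gender": ["F", "M", "F"],
    "ethnicity": ["WHITE", "ASIAN", "WHITE"],
})

# Each categorical column is replaced by one 0/1 column per observed category.
encoded = pd.get_dummies(toy, columns=["gender", "ethnicity"])
print(list(encoded.columns))
```

Here 3 original columns become 5: `age` is left untouched, `gender` yields 2 indicator columns and `ethnicity` yields 2.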

Performances

The performance of each model on the test set is listed below. The performance that a random classifier would obtain is also reported for comparison. The metrics sensitivity(inc) and specificity(dec) refer, respectively, to the sensitivity when predicting an increase in creatinine rate and to the specificity when predicting a decrease.
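To make the two metrics concrete, here is a small sketch of how they can be computed in a one-vs-rest fashion. The class labels (`"inc"`, `"stable"`, `"dec"`) and the toy predictions are hypothetical, and the one-vs-rest reading of the metrics is an assumption about how they were defined in the notebooks.

```python
import numpy as np


def sensitivity(y_true, y_pred, label):
    """Among examples whose true class is `label`, fraction predicted as `label`."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    positives = y_true == label
    return np.mean(y_pred[positives] == label)


def specificity(y_true, y_pred, label):
    """Among examples whose true class is NOT `label`, fraction not predicted as `label`."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    negatives = y_true != label
    return np.mean(y_pred[negatives] != label)


# Hypothetical labels and predictions.
y_true = ["inc", "inc", "stable", "dec", "dec", "stable"]
y_pred = ["inc", "stable", "stable", "dec", "stable", "dec"]

sens_inc = sensitivity(y_true, y_pred, "inc")   # 1 of the 2 true increases is found
spec_dec = specificity(y_true, y_pred, "dec")   # 3 of the 4 non-decreases are not flagged
print(sens_inc, spec_dec)
```

A low sensitivity(inc) therefore means that most true increases are missed, whatever the specificity(dec) may be.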

No cross-validation was performed to assess performance on the test set. However, to assess how much the performance metrics typically vary with the test set, we manually changed the test set and re-trained the models. Each metric varied by about 0.03 (that is, 3 points on the percentage scale used below) when the test set changed.

| Model | sensitivity(inc) (%) | specificity(dec) (%) |
| --- | --- | --- |
| linearSVM | 15 | 88 |
| logreg | 19 | 85 |
| logreg_elasticnet | 16 | 81 |
| XGBoost | 5 | 87 |
| perceptron_1hl | 1 | 85 |
| perceptron_2hl | 2 | 86 |
| XGBoost_NAs | 16 | 87 |
| Random classifier | 18 | 78 |

Discussion

Taking into consideration the 3% variation in performance when changing the test set, we can conclude that:

  • The specificity when predicting a decrease is hardly better than that of a random classifier (81 to 88% versus 78%).
  • The sensitivity when predicting an increase is never significantly better than that of a random classifier (at best 19% for logreg versus 18%). This is really bad, as the increase in creatinine rate is precisely what we aim to predict well.

Possible reasons for such poor performance:

  • There are too few creatinine measurements available in chartevents
  • The features we use are not predictive of variations in creatinine rates
  • The class we are most interested in ("increase") is the rarest in the dataset. We may have too few examples in this class, or the metric we used to train the models (the mean accuracy over classes) may not be suited to predicting the increase well.

Next steps

Some ideas to improve our results are listed below:

  • Check the building of the dataset:
    • Check the SQL queries
    • Understand why we get far fewer creatinine measurements in the chartevents table than in the labevents table
    • Understand why there are so many missing values, and why some features are empty
  • Make sure that the features containing the creatinine values of the previous days are not dropped
  • Retrieve more features
  • Use imputation methods to replace missing values
  • Use data augmentation to get a balanced dataset
  • Try alternatives to mean accuracy as the metric used to optimize the models' parameters
  • Meet, code and have fun :)
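
As a starting point for the balancing idea in the list above, here is a minimal sketch of random oversampling, which is one simple way to obtain a balanced training set (it duplicates existing examples and does not generate new ones). The function `oversample`, the column name `variation` and the class labels are hypothetical. It should only be applied to the training set, never to the test set, so that the reported metrics still reflect the real class distribution.

```python
import pandas as pd


def oversample(df, label_col, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest one."""
    target = df[label_col].value_counts().max()
    parts = []
    for _, group in df.groupby(label_col):
        parts.append(group)                      # keep all original rows
        missing = target - len(group)
        if missing > 0:                          # top up the smaller classes
            parts.append(group.sample(n=missing, replace=True, random_state=seed))
    balanced = pd.concat(parts)
    # Shuffle so that the classes are not grouped together.
    return balanced.sample(frac=1, random_state=seed).reset_index(drop=True)


# Hypothetical toy training set in which "inc" is the rare class.
train = pd.DataFrame({
    "heart_rate": [80, 75, 90, 85, 70, 95, 88, 72],
    "variation": ["dec", "dec", "dec", "dec", "stable", "stable", "stable", "inc"],
})
balanced = oversample(train, "variation")
print(balanced["variation"].value_counts())
```

Because the duplicated rows carry no new information, this mainly changes how much weight the rare class receives during training; class weights in the model's loss would be an alternative with a similar effect.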