In this project we perform exploratory data analysis and build machine learning models using the following:
- Linear regression
- Random Forest
- Neural network (Keras)
to identify which attributes influence the quality of red wine and to predict red wine quality.
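As a rough illustration of the modelling idea, the sketch below fits a linear regression and a random forest with scikit-learn and ranks the attributes by their influence on quality. It is a minimal sketch, not the notebooks' actual code: the data is synthetic, and the relationship between `alcohol`, `volatile acidity` and quality is an assumption made up for the example. The Keras network is left out to keep the example short.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

FEATURES = [
    "fixed acidity", "volatile acidity", "citric acid", "residual sugar",
    "chlorides", "free sulfur dioxide", "total sulfur dioxide", "density",
    "pH", "sulphates", "alcohol",
]

# Synthetic stand-in for the wine data (NOT the real dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, len(FEATURES)))

# Assumed, made-up relationship: quality rises with alcohol and
# falls with volatile acidity; every other attribute is pure noise.
alcohol = FEATURES.index("alcohol")
volatile = FEATURES.index("volatile acidity")
y = 5.5 + 0.8 * X[:, alcohol] - 0.6 * X[:, volatile] + rng.normal(scale=0.3, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

linear = LinearRegression().fit(X_train, y_train)
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)

# Compare the two models on held-out data.
linear_mse = mean_squared_error(y_test, linear.predict(X_test))
forest_mse = mean_squared_error(y_test, forest.predict(X_test))
print(f"linear regression MSE: {linear_mse:.3f}")
print(f"random forest MSE:     {forest_mse:.3f}")

# Rank attributes: absolute coefficient for the linear model,
# impurity-based importance for the random forest.
coef_ranking = sorted(
    zip(FEATURES, linear.coef_), key=lambda pair: abs(pair[1]), reverse=True
)
importance_ranking = sorted(
    zip(FEATURES, forest.feature_importances_), key=lambda pair: pair[1], reverse=True
)
print("top attributes (linear):", [name for name, _ in coef_ranking[:2]])
print("top attributes (forest):", [name for name, _ in importance_ranking[:2]])
```

Coefficients are only comparable across attributes when the attributes share a scale. The synthetic columns here are all standard normal, but the real measurements would need standardising first.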
Predicting red wine quality requires physicochemical information about wines of different quality, so that the features of a high-quality wine can be identified. The physicochemical information includes:
- fixed acidity
- volatile acidity
- citric acid
- residual sugar
- chlorides
- free sulfur dioxide
- total sulfur dioxide
- density
- pH
- sulphates
- alcohol
Output variable (based on sensory data):
- quality (score between 0 and 10)
The dataset is taken from here.
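A first exploratory step is to check how each physicochemical attribute correlates with quality. The sketch below does this with the Python standard library only, so it runs without the notebook environment. The six rows in `SAMPLE_CSV` are made-up illustrative values, not records from the dataset, and the comma-separated layout with these column names is an assumption about the file format.

```python
import csv
import io
import math

# Made-up illustrative rows (NOT real records); the column names follow
# the attribute list above, and the comma delimiter is an assumption.
SAMPLE_CSV = """fixed acidity,volatile acidity,citric acid,residual sugar,chlorides,free sulfur dioxide,total sulfur dioxide,density,pH,sulphates,alcohol,quality
7.4,0.70,0.00,1.9,0.076,11,34,0.9978,3.51,0.56,9.4,5
7.8,0.88,0.02,2.6,0.098,25,67,0.9968,3.20,0.68,9.8,5
11.2,0.28,0.56,1.8,0.075,17,60,0.9980,3.16,0.58,10.8,6
7.3,0.65,0.05,1.2,0.065,15,21,0.9946,3.39,0.47,10.0,7
8.5,0.30,0.50,2.1,0.070,9,18,0.9950,3.30,0.75,12.5,7
6.9,0.95,0.01,2.4,0.110,20,80,0.9975,3.45,0.50,9.2,4
"""


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Read the rows and pivot them into one list of floats per column.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
columns = {name: [float(row[name]) for row in rows] for name in rows[0]}

# Correlate every input attribute with the output variable.
correlations = {
    name: pearson(values, columns["quality"])
    for name, values in columns.items()
    if name != "quality"
}

# Print attributes from strongest to weakest absolute correlation.
for name, r in sorted(correlations.items(), key=lambda item: abs(item[1]), reverse=True):
    print(f"{name:22s} {r:+.2f}")
```

To run this against the real file, replace `io.StringIO(SAMPLE_CSV)` with an opened file handle. With so few rows the coefficients here mean nothing statistically; the example only shows the mechanics.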
This project was developed with Jupyter Notebook and Anaconda. You can install Anaconda from here. Then open the files with either JupyterLab or the Jupyter Notebook extension. You can also view the project on Kaggle here.
Paulo Cortez, University of Minho, Guimarães, Portugal, http://www3.dsi.uminho.pt/pcortez
A. Cerdeira, F. Almeida, T. Matos and J. Reis, Viticulture Commission of the Vinho Verde Region (CVRVV), Porto, Portugal, @2009
CC-0 license