As a Data Scientist, you work for Hass Consulting Company, a real estate leader with over 25 years of experience. You have been tasked with studying the factors that affect housing prices, using information on real estate properties collected over the past few months. You will then create a model that allows the company to accurately predict sale prices from the predictor variables.
Define the question, the metric for success, the context, and the experimental design taken.
Read and explore the given dataset.
Define the appropriateness of the available data to answer the given question.
Find and deal with outliers, anomalies, and missing data within the dataset.
Perform univariate, bivariate and multivariate analysis recording your observations.
Perform regression analysis.
Incorporate categorical independent variables into your models.
Check for multicollinearity.
Provide a recommendation based on your analysis.
Create residual plots for your models, and assess heteroskedasticity using Bartlett's test.
Challenge your solution by providing insights on how the model can be improved.
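As an illustration of the multicollinearity and heteroskedasticity checks in the steps above, here is a minimal Python sketch. It assumes NumPy and SciPy are available and runs on synthetic data, not on the project dataset. The helper names vif and bartlett_on_residuals are hypothetical, and splitting the residuals into three groups ordered by fitted value is only one possible way to apply Bartlett's test.

```python
import numpy as np
from scipy import stats


def vif(X):
    """Variance inflation factor for each column of X (n_samples, n_features)."""
    X = np.asarray(X, dtype=float)
    factors = []
    for j in range(X.shape[1]):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress column j on the remaining columns, with an intercept.
        design = np.column_stack([np.ones(len(target)), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        r_squared = 1.0 - resid.var() / target.var()
        factors.append(1.0 / (1.0 - r_squared))
    return np.array(factors)


def bartlett_on_residuals(fitted, residuals, n_groups=3):
    """Bartlett's test for equal residual variance across fitted-value groups."""
    order = np.argsort(fitted)
    groups = np.array_split(np.asarray(residuals)[order], n_groups)
    return stats.bartlett(*groups)


# Synthetic predictors: x2 is nearly a copy of x1, x3 is independent.
rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = x1 + 0.1 * rng.normal(size=500)
x3 = rng.normal(size=500)
vifs = vif(np.column_stack([x1, x2, x3]))

# Synthetic residuals whose spread grows with the fitted value.
fitted = rng.uniform(1, 10, size=600)
residuals = rng.normal(scale=fitted)
statistic, p_value = bartlett_on_residuals(fitted, residuals)

print("VIF per column:", vifs.round(1))
print("Bartlett statistic:", round(float(statistic), 2), "p-value:", p_value)
```

A large VIF for a predictor suggests it is nearly a linear combination of the others, and a small Bartlett p-value suggests the residual variance differs between groups. Bartlett's test is sensitive to non-normal residuals, so it is worth reading the result alongside the residual plots.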
While performing your regression analysis, you will be required to model the data using the regression techniques listed below and then evaluate their performance. You will then be required to provide your observations and recommendations on the suitability of each tested model for solving the given problem.
Multiple Linear Regression
Quantile Regression
Ridge Regression
Lasso Regression
Elastic Net Regression
Remember to go through the rubric so that you can see how you will be assessed on the above regression techniques.
The dataset to use for this project can be found by following this link: [http://bit.ly/IndependentProjectWeek7Dataset].
Id
price - Price of the house
bedrooms - Number of Bedrooms
bathrooms - Number of Bathrooms
sqft_living - Square footage of the living area
sqft_lot - Square footage of the lot
floors - Number of Floors
waterfront - Whether the property has a waterfront
view - Number of Views
grade - Grades
sqft_above
sqft_basement - Square footage of the basement
yr_built - Year the house was built
yr_renovated - Year the house was renovated
zipcode - Zip code of the house
lat - Latitude of the house
lon - Longitude of the house
sqft_living15
sqft_lot15
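To show how the columns above might feed into the outlier and categorical-variable steps, here is a small Python sketch using pandas. The rows in the toy frame are made-up placeholders, not values from the project dataset. The helper names prepare_features and iqr_outlier_mask are hypothetical, and treating zipcode as a categorical variable and using the 1.5 x IQR rule are assumptions rather than requirements of the brief.

```python
import pandas as pd


def prepare_features(df: pd.DataFrame) -> pd.DataFrame:
    """Drop the identifier column and one-hot encode zipcode."""
    # The identifier may be spelled "Id" or "id"; ignore whichever is absent.
    out = df.drop(columns=["Id", "id"], errors="ignore")
    # zipcode is a label, not a quantity, so encode it as dummy columns.
    # drop_first avoids perfect collinearity among the dummies.
    return pd.get_dummies(out, columns=["zipcode"], prefix="zip", drop_first=True)


def iqr_outlier_mask(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Flag values more than k * IQR outside the first and third quartiles."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


# Made-up placeholder rows, only to exercise the helpers.
toy = pd.DataFrame(
    {
        "id": [1, 2, 3, 4, 5, 6],
        "price": [200000, 220000, 210000, 230000, 215000, 5000000],
        "bedrooms": [3, 3, 2, 4, 3, 5],
        "waterfront": [0, 0, 0, 0, 0, 1],
        "zipcode": [98001, 98002, 98001, 98003, 98002, 98003],
    }
)

features = prepare_features(toy)
outliers = iqr_outlier_mask(toy["price"])

print(features.columns.tolist())
print(toy.loc[outliers, ["id", "price"]])
```

Flagged rows are candidates for inspection rather than automatic removal, since a very high price can be a genuine sale.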