This project builds a machine learning model that predicts the net Black Friday sales for a specific customer from that customer's characteristics.
Link: https://black-friday-sales-forecast.herokuapp.com/
A retail company, “ABC Private Limited”, wants to understand customer purchase behaviour (specifically, purchase amount) across products of different categories. They have shared a purchase summary of various customers for selected high-volume products from the last month. The data set also contains customer demographics (age, gender, marital status, city_type, stay_in_current_city), product details (product_id and product category), and the total purchase_amount from the last month.
Now they want to build a model that predicts the purchase amount of a customer for each product, which will help them create personalized offers for customers on different products.
Variable | Definition |
---|---|
User_ID | User ID |
Product_ID | Product ID |
Gender | Sex of User |
Age | Age in bins |
Occupation | Occupation(Masked) |
City_Category | Category of the City (A,B,C) |
Stay_In_Current_City_Years | Number of years of stay in the current city |
Marital_Status | Marital Status |
Product_Category_1 | Product Category (Masked) |
Product_Category_2 | Product may belong to other category also (Masked) |
Product_Category_3 | Product may belong to other category as well (Masked) |
Purchase | Purchase Amount (Target Variable) |
All regression models were evaluated on their predictions of the purchase amount for the test data (test.csv), which has the same fields as the training data except for the purchase amount.
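Models are evaluated using the root mean squared error (RMSE), a common general-purpose error metric. Compared to the Mean Absolute Error (MAE), RMSE penalizes large errors more heavily, because each error is squared before averaging:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}$$

where $\hat{y}_i$ is the predicted value, $y_i$ is the original value, and $n$ is the number of data points.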
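To make the difference between RMSE and MAE concrete, here is a minimal sketch in plain Python. The helper names `rmse` and `mae` and the purchase amounts are made up for illustration; they are not taken from this project's code or data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - t) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical purchase amounts, for illustration only.
actual = [1000.0, 2000.0, 3000.0, 4000.0]
even_errors = [1100.0, 2100.0, 3100.0, 4100.0]    # every prediction off by 100
one_big_error = [1000.0, 2000.0, 3000.0, 4400.0]  # a single miss of 400

print(mae(actual, even_errors), rmse(actual, even_errors))
print(mae(actual, one_big_error), rmse(actual, one_big_error))
```

Both prediction sets have the same MAE, but the set with one large miss has the higher RMSE, which is the behaviour described above.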