This data science project aims to gather insights into various factors affecting restaurants

Nanduvamshi/Restaurant-Rating-Prediction-Analysis

Restaurant - Exploratory Data Analysis

This analysis comprises three levels of tasks. It examines restaurants and the factors affecting their ratings. The main objective was to gather meaningful insights through exploratory data analysis on a large restaurant dataset, and to build an ML model to predict ratings. This GitHub repo contains all the files needed for the data exploration, preprocessing, and visualization steps.

Dataset

The dataset is uploaded in this repo.

Platform used

Jupyter Notebook

Libraries and Tools

pandas, numpy, matplotlib, seaborn, scikit-learn, folium, geopandas

Data Preprocessing & Feature Engineering

  • The Cuisines column had 9 null values, so those rows were dropped.
  • Removed features that would hurt model performance.
  • Split the data into training and test sets in an 80:20 ratio.
  • Label-encoded the features/columns that required it.
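The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration on a tiny made-up frame, not the notebook's actual code: the column names (`Cuisines`, `Has Table booking`, `Price range`, `Aggregate rating`) and the `random_state` value are assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Tiny made-up frame standing in for the restaurant dataset.
# Column names are assumptions, not taken from the repo.
df = pd.DataFrame({
    "Cuisines": ["North Indian", None, "Chinese", "Fast Food", "Chinese",
                 "North Indian", "Cafe", None, "Fast Food", "Cafe",
                 "Chinese", "North Indian"],
    "Has Table booking": ["Yes", "No", "No", "Yes", "No", "No",
                          "Yes", "No", "No", "Yes", "No", "Yes"],
    "Price range": [3, 1, 2, 1, 2, 4, 3, 1, 1, 2, 2, 4],
    "Aggregate rating": [4.1, 0.0, 3.2, 3.0, 3.4, 4.5,
                         3.9, 0.0, 2.8, 3.6, 3.3, 4.4],
})

# 1. Drop the rows where Cuisines is null.
df = df.dropna(subset=["Cuisines"]).reset_index(drop=True)

# 2. Label-encode the text columns so the models receive integers.
for col in ["Cuisines", "Has Table booking"]:
    df[col] = LabelEncoder().fit_transform(df[col])

# 3. Separate features from the target and split 80:20.
X = df.drop(columns=["Aggregate rating"])
y = df["Aggregate rating"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```

Fixing `random_state` only makes the split reproducible; any seed would do.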

Model Training and Performance

  • Used Random Forest, Decision Tree, and Logistic Regression algorithms to build the models.
  • The restaurant rating prediction models (Random Forest and Decision Tree) obtained an aggregate R² score of 0.93.
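For illustration, here is a hedged sketch of how tree-based regressors can be trained and scored with R² in scikit-learn. It runs on synthetic data generated inside the snippet, so the scores it prints say nothing about the 0.93 reported above. The feature set, hyperparameters, and seeds are all assumptions. Logistic Regression is left out because it is a classifier and would need the ratings binned into classes first.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-ins for encoded features (assumed, not the real data).
price = rng.integers(1, 5, size=n)      # price range 1..4
votes = rng.integers(0, 1000, size=n)   # number of votes
booking = rng.integers(0, 2, size=n)    # table booking 0/1
X = np.column_stack([price, votes, booking])

# Synthetic rating loosely tied to the features, plus noise, clipped to 0..5.
noise = rng.normal(0, 0.2, size=n)
y = np.clip(2.0 + 0.3 * price + 0.001 * votes + 0.2 * booking + noise, 0, 5)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "Decision Tree": DecisionTreeRegressor(random_state=42),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=42),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # R² compares the model's squared error with that of predicting the mean.
    scores[name] = r2_score(y_test, model.predict(X_test))
    print(f"{name}: R2 = {scores[name]:.2f}")
```

Scoring on the held-out 20% rather than on the training rows matters here, because an unpruned decision tree can fit its training data almost perfectly.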

EDA : Insights

(Analysis and task wise conclusions are given in repo folders in detail.)

  • Many restaurants have a rating of 0, probably because they are less popular.
  • Visualized the geospatial distribution of restaurants on map coordinates using folium and geopandas.
  • Most popular restaurants fall in the rating range of 3 to 3.5.
  • Expensive restaurants (higher price range) tend to have higher ratings.
  • New Delhi has the highest number of restaurants.
  • By country, country code "1" has the most restaurants. Since New Delhi is the top city, this code most likely corresponds to India rather than North America.
  • "North Indian" is the most popular cuisine overall, followed by "Chinese" and "Fast Food".
  • Restaurants with a table booking facility have a noticeably higher average rating.
  • "Sunda" is the highest-rated cuisine and also has the most votes.
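Several of the insights above come down to simple counts and group-wise averages. The sketch below shows the kind of pandas aggregation involved, using a tiny made-up frame. The column names and values are assumptions for illustration only and do not reproduce the repo's results.

```python
import pandas as pd

# Made-up sample rows; column names are assumed, not taken from the repo.
df = pd.DataFrame({
    "City": ["New Delhi", "New Delhi", "Gurgaon",
             "New Delhi", "Noida", "Gurgaon"],
    "Price range": [1, 2, 3, 4, 1, 2],
    "Has Table booking": ["No", "No", "Yes", "Yes", "No", "Yes"],
    "Aggregate rating": [0.0, 3.2, 3.9, 4.4, 3.1, 3.6],
})

# Number of restaurants per city, largest first.
restaurants_per_city = df["City"].value_counts()

# Average rating per price range and per table-booking status.
rating_by_price = df.groupby("Price range")["Aggregate rating"].mean()
rating_by_booking = df.groupby("Has Table booking")["Aggregate rating"].mean()

# Count of restaurants with a rating of exactly 0.
unrated = int((df["Aggregate rating"] == 0).sum())

print(restaurants_per_city)
print(rating_by_price)
print(rating_by_booking)
print("Restaurants with rating 0:", unrated)
```

Zero ratings pull the group averages down, so it can be worth computing these means both with and without the zero-rated rows.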

Some visualizations for reference

[Nine visualization images; see the repo for the charts.]