ahmedatia456123/Restaurant-Market-Analysis-Predictive-Pricing-Model

# Restaurant Price Prediction Project

Introduction to Analyzing the Zomato Dataset

Analyzing the Zomato dataset offers valuable insights into the restaurant scene in Bengaluru, a city bustling with over 12,000 eateries that cater to a diverse range of culinary tastes from around the world. As new restaurants continue to open daily, the industry remains dynamic, with growing demand that presents both opportunities and challenges. For newcomers, competing with well-established establishments can be tough, especially when many restaurants offer similar fare.

Bengaluru, known as the IT hub of India, has a large population that relies heavily on dining out due to busy lifestyles, which makes the study of restaurant demographics crucial. This analysis aims to:

  • Explore the food and restaurant industry in the city.
  • Analyze trends and identify opportunities for market entry or improvement.
  • Uncover market characteristics and classify restaurants into five distinct groups.
  • Conduct a detailed analysis of each restaurant group.
  • Examine customer behavior to understand preferences and dislikes by analyzing reviews from various restaurants.
  • Build a predictive model to assist in the pricing process for restaurants based on market pricing trends.

By studying these aspects, we can gain a deeper understanding of the restaurant landscape in Bengaluru, helping new and existing restaurants better align with local tastes and demands.

Objectives

The primary objective of this data analysis project is to identify the most promising investment opportunities in the restaurant and cafe sector in Bangalore. This involves analyzing various factors that influence the success and customer appeal of these establishments and developing machine learning models to support pricing strategies and enhance customer experience.

  • Investment Analysis:
    • Identify High-Performing Establishments: Analyze the data to pinpoint restaurants and cafes with high ratings, significant customer engagement, and strong financial performance indicators. Focus on key attributes such as location, type, and customer reviews to assess which establishments are likely to offer the best returns on investment.
    • Evaluate Pricing Strategies: Develop and implement machine learning models to predict optimal pricing for menu items based on factors such as location, type, and customer feedback. This will help establish competitive pricing that aligns with market expectations and maximizes profitability.
  • Customer Experience Enhancement:
    • Analyze Customer Preferences: Utilize the data to understand customer preferences regarding dish likes, cuisines, and other attributes. This will inform strategies to improve the dining experience by focusing on popular dishes, preferred cuisines, and services that enhance overall satisfaction.
    • Improve Engagement and Accessibility: Examine the impact of online ordering and table booking options on customer engagement and satisfaction. Determine how these features contribute to higher ratings and increased customer interactions.
  • Classify Restaurants: Group restaurants into categories based on the characteristics of their customers, ranging from lower-class to upper-class clientele.
  • Machine Learning Model Development:
    • Predictive Pricing Model: Build and refine machine learning models to forecast prices for menu items based on historical data, restaurant type, location, and customer reviews. This model will provide insights into setting competitive prices that attract customers while ensuring profitability.
    • Enhancement Recommendations: Generate actionable recommendations for improving customer experience based on predictive analytics and historical trends. This will include suggestions for menu adjustments, service enhancements, and strategic changes to attract and retain customers.

Scope

This analysis covers restaurants listed on the Zomato website in Bengaluru, focusing on over 51,000 entries to identify trends and patterns that impact investment and customer experience. The project encompasses:

  • Data extraction and preprocessing to ensure accurate and relevant information.
  • Exploratory data analysis (EDA) to uncover underlying patterns and insights.
  • Classification of restaurants based on customer characteristics and satisfaction levels.
  • Development of machine learning models to predict pricing and enhance customer engagement.

Data Features

The dataset contains various features that provide detailed information about the restaurants. Below is a summary of each feature along with its description:

| Feature | Description |
| --- | --- |
| url | The URL of the restaurant's listing. |
| address | The physical address of the restaurant. |
| name | The name of the restaurant. |
| online_order | Indicates if online ordering is available (Yes/No). |
| book_table | Indicates if table booking is available (Yes/No). |
| rate | The rating of the restaurant. |
| votes | The number of votes or reviews the restaurant has received. |
| phone | The contact phone number of the restaurant. |
| location | The locality or area where the restaurant is located. |
| rest_type | The type of restaurant (e.g., Casual Dining, Fine Dining). |
| dish_liked | Dish recommendations or items liked by customers. |
| cuisines | The types of cuisine offered by the restaurant. |
| approx_cost(for two people) | The approximate cost of a meal for two people. |
| reviews_list | The list of customer reviews for the restaurant. |
| menu_item | The items available on the restaurant's menu. |
| listed_in(type) | The type of listing category (e.g., Dine-out, Drinks & Nightlife). |
| listed_in(city) | The city where the restaurant is listed. |

Data Limitations

Several limitations affect the quality and accuracy of the data:

  • Insufficient Reviews for Some Classes: Not all restaurant classes have a sufficient number of reviews to cover all aspects in sentiment analysis. This limits the comprehensiveness of the analysis for some categories.
  • Lack of Menu Pricing Details: Menu items do not contain specific prices, which can hinder accurate pricing predictions. Currently, the data only provides approximate costs for a meal for two people, which may not reflect the true cost of individual menu items.
  • Unorganized Address Data: The address field is not well-organized or clean, requiring human revision for accuracy. Address accuracy is crucial for effective clustering and machine learning model performance, and discrepancies in address data may affect the quality of location-based insights.

Stakeholders

The stakeholders in this analysis include:

  • Investors and Restaurant Owners: These stakeholders are interested in identifying high-performing establishments and optimizing pricing strategies. Insights from this analysis will help them make informed decisions on investments and operational adjustments to maximize profitability.
  • Customers: Restaurant patrons benefit from improved dining experiences and more accurate information about restaurant quality and pricing. Understanding customer preferences and trends will help restaurants better cater to their needs and enhance overall satisfaction.
  • Data Analysts and Machine Learning Engineers: These professionals are involved in building and refining models based on the analysis. The insights generated will support their efforts in developing predictive analytics tools and recommendations for enhancing customer experiences and pricing strategies.
  • Marketing and Business Development Teams: These teams use the analysis results to devise targeted marketing strategies and business development plans. By understanding market trends and customer preferences, they can create effective campaigns and promotional activities to attract and retain customers.

By addressing the needs and interests of these stakeholders, this analysis aims to provide actionable insights that drive success in the competitive restaurant market in Bengaluru.

Data Cleaning and Preparation

Missing Data

To ensure data quality, we first need to address the missing data in the dataset. The following table summarizes the count of missing values for each feature:

| Feature | Missing Values |
| --- | --- |
| url | 0 |
| address | 0 |
| name | 0 |
| online_order | 0 |
| book_table | 0 |
| rate | 7,775 |
| votes | 0 |
| phone | 1,208 |
| location | 21 |
| rest_type | 227 |
| dish_liked | 28,078 |
| cuisines | 45 |
| approx_cost(for two people) | 346 |
| reviews_list | 0 |
| menu_item | 0 |
| listed_in(type) | 0 |
| listed_in(city) | 0 |

Handling Missing Ratings, Rating Distribution, and Weighted Rating

Handling the missing values in the rate column involves several steps:

  1. Calculate Ratings from Reviews: Derive ratings from the reviews_list column where available. Convert ratings from string format (e.g., 'Rated 3.0') to numeric float format (e.g., 3.0).
  2. Handle Missing Ratings: For restaurants with no reviews, estimate their ratings using the average rating of restaurants in the same location.
  3. Preserve 'NEW' Information: Create a new column named is_new with values 'yes' or 'no' to retain information about new establishments. This column will be converted to binary values (1 and 0) for modeling purposes.
  4. Convert Ratings to Numeric Format: Convert ratings from string format (e.g., '4.6/5') to numeric float format (e.g., 4.6).
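
The four steps above could be sketched roughly as follows. This is a minimal illustration in plain Python, not the project's actual code: the helper names (`parse_rate`, `rating_from_reviews`, `fill_ratings`), the row layout (dicts with `rate`, `location`, and `reviews` keys), and the precedence (existing rate first, then review scores, then the location average) are all assumptions. For brevity, `is_new` is stored directly as 1/0 rather than as 'yes'/'no'.

```python
import re
from statistics import mean

def parse_rate(raw):
    """Return (rating, is_new) for a raw rate value such as '4.6/5', 'NEW', '-' or None."""
    if raw is None:
        return None, 0
    text = str(raw).strip()
    if text.upper() == "NEW":
        return None, 1  # keep the 'NEW' information as a binary flag
    match = re.match(r"^(\d+(?:\.\d+)?)\s*/\s*5$", text)
    return (float(match.group(1)), 0) if match else (None, 0)

def rating_from_reviews(reviews):
    """Average the 'Rated x.y' scores found in a list of (score, text) tuples."""
    scores = []
    for score, _text in reviews:
        found = re.search(r"Rated\s+(\d+(?:\.\d+)?)", score)
        if found:
            scores.append(float(found.group(1)))
    return mean(scores) if scores else None

def fill_ratings(rows):
    """Add 'rating' and 'is_new' to each row, filling gaps step by step."""
    for row in rows:
        rating, row["is_new"] = parse_rate(row["rate"])
        if rating is None:  # fall back to the scores inside the reviews
            rating = rating_from_reviews(row["reviews"])
        row["rating"] = rating

    # Average of the known ratings per location, used as the last fallback.
    by_location = {}
    for row in rows:
        if row["rating"] is not None:
            by_location.setdefault(row["location"], []).append(row["rating"])
    location_mean = {loc: mean(vals) for loc, vals in by_location.items()}

    for row in rows:
        if row["rating"] is None:
            row["rating"] = location_mean.get(row["location"])
    return rows
```

A row whose location has no rated restaurants at all keeps `rating` as `None` in this sketch; how to treat that case is left open.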

After checking the distribution of ratings, we observed that many ratings are exactly 1 or 5, which is unrealistic. Some restaurants have a rating of 5 from a single vote, which can mislead the model.

Rate Distribution

The figure above shows this skew toward the extremes. To correct for it, we use a Weighted Rating, which pulls ratings backed by few votes toward the overall mean.

Weighted Rating Formula:

Weighted Rating = (v × r + m × c) / (v + m)

where:

  • r = average rating of the item
  • v = number of votes for the item
  • m = minimum number of votes required to be listed (threshold)
  • c = mean rating across all items
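
As a small worked example of the formula, the sketch below uses the same symbols r, v, m, and c. The threshold m = 50 and the global mean c = 3.7 are made-up illustration values, not figures from the dataset.

```python
def weighted_rating(r, v, m, c):
    """Weighted Rating = (v * r + m * c) / (v + m)."""
    if v + m == 0:
        return c  # no votes and no threshold: fall back to the global mean
    return (v * r + m * c) / (v + m)

# Hypothetical values: threshold of 50 votes, global mean rating of 3.7.
single_vote = weighted_rating(r=5.0, v=1, m=50, c=3.7)      # stays close to 3.7
many_votes = weighted_rating(r=4.5, v=1000, m=50, c=3.7)    # stays close to 4.5
```

With these inputs, a 5.0 rating from a single vote is pulled almost all the way back to the global mean, while a 4.5 rating backed by 1,000 votes barely moves.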

Feature Engineering

  1. Handling Duplicate Rows: The dataset contains duplicate rows for the same restaurant that differ in the number of reviews. Clean each URL to its base form (for example, https://www.zomato.com/bangalore/jalsa-banashankari) and keep only the row with the highest number of reviews for each unique URL.
  2. Dealing with Restaurant Types and Cuisines:
    • Separate Elements: Split elements in the rest_type, cuisines, and dish_liked columns into individual columns (e.g., rest_type_0, rest_type_1, etc.).
    • New Columns: Create new columns to represent the number of specializations:
      • no_spec: Number of restaurant types per restaurant.
      • no_cuisines: Number of different cuisines per restaurant.
      • no_liked_dishes: Number of liked dishes per restaurant.
  3. Handling menu_item and reviews_list:
    • menu_item: Create a new column to count the number of items in each restaurant's menu.
    • reviews_list: Convert the reviews to a Python list of tuples and create a new column num_reviews to count the number of reviews per restaurant.
  4. Handling location: Use the Geopy module to get latitude and longitude based on location information. Prefix location names with 'Bangalore' to avoid matching issues and use these for future spatial analysis.
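
Steps 1 and 2 above could look roughly like the sketch below. It assumes that duplicate listings differ only in the query string of the URL and that multi-valued cells are comma-separated; the helper names and the `num_reviews` key on each row are hypothetical. Geocoding (step 4) is left out because it needs network access.

```python
from urllib.parse import urlsplit

def clean_url(url):
    """Drop the query string and fragment so duplicate listings share one key."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}{parts.path}"

def dedupe_by_url(rows):
    """Keep, for each cleaned URL, the row with the highest number of reviews."""
    best = {}
    for row in rows:
        key = clean_url(row["url"])
        if key not in best or row["num_reviews"] > best[key]["num_reviews"]:
            best[key] = {**row, "url": key}
    return list(best.values())

def count_items(cell):
    """Count comma-separated entries in a cell such as 'Cafe, Bar'; empty or None gives 0."""
    if not cell:
        return 0
    return len([part for part in cell.split(",") if part.strip()])
```

`count_items` could then feed columns such as no_spec, no_cuisines, and no_liked_dishes by applying it to rest_type, cuisines, and dish_liked respectively.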

Final Steps for Encoding

  1. Create a Cleaned Copy for Encoding: Make a copy of the cleaned dataset for encoding and further analysis.
  2. Delete Irrelevant Columns: Remove columns that are not needed for analysis or machine learning:
    • url
    • address
    • name
    • rate
    • location
    • rest_type
    • dish_liked
    • cuisines
    • reviews_list
    • listed_in(city)
    • menu_item
    • count
  3. Apply Binary Encoding: Apply binary encoding to:
    • online_order
    • book_table
    • is_new
    • is_road
  4. Apply Target Encoding with Smoothing: Use Target Encoding with Smoothing for categorical columns:
    • rest_type_
    • cuisine_
    • dish_liked_
    Sum each group into a single column for ease of use in models.
  5. Remove Separated Columns: After summarizing encoded columns, remove the original separated columns to keep the dataset streamlined.
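
Target encoding with smoothing, as named in step 4, replaces each category with a blend of that category's mean target and the global mean, weighted by how often the category occurs. The sketch below is a plain-Python illustration under assumptions: the function name, the `smoothing` default, and the use of cost as the target are hypothetical, and the project may have used a library encoder instead.

```python
from collections import defaultdict

def target_encode(categories, targets, smoothing=10.0):
    """Map each category to a smoothed mean of the target.

    encoded = (n * category_mean + smoothing * global_mean) / (n + smoothing)
    where n is the number of rows in the category.
    """
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target   # n * category_mean, accumulated row by row
        counts[category] += 1
    return {
        category: (sums[category] + smoothing * global_mean) / (counts[category] + smoothing)
        for category in counts
    }

# Hypothetical example: restaurant type encoded by cost for two people.
encoding = target_encode(["Cafe", "Cafe", "Bar"], [400, 600, 1500], smoothing=1.0)
```

Rare categories end up close to the global mean, which limits overfitting to a handful of rows. To avoid target leakage, the mapping should be computed on the training split only and then applied to validation and test data.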

Exploratory Data Analysis

Top Restaurants in Terms of Number of Outlets

The image below illustrates the top restaurants based on the number of outlets in the city:

Top Restaurants by Number of Outlets

Key Findings:

  • Café Coffee Day: Leading with the highest number of outlets.
  • Domino's Pizza: A close second in terms of outlet numbers.
  • Five Star Chicken: Ranking third in the number of outlets.

Top Restaurants in Terms of Votes (Engagements)

The image below shows the top restaurants based on votes, reflecting customer engagement:

Top Restaurants by Votes

Key Insights:

  • Café Coffee Day: Besides having the highest number of outlets, it also shows strong customer engagement.
  • Byg Brewski Brewing Company: Stands out with significant customer engagement despite fewer outlets.
  • Toit: Known for high engagement with a focused strategy.
  • The Black Pearl: Successful in attracting customers with its niche offerings.

Market Implications:

  • Diverse Customer Preferences: Bangalore's market shows a range of customer preferences. Chains like Café Coffee Day cater to high-volume needs, while establishments like Byg Brewski and Toit offer specialized experiences with high engagement.
  • Strategic Focus and Engagement: Focused establishments with unique experiences achieve higher engagement. For example, Byg Brewski’s brewery setting and Toit’s brewery and dining experience contribute to their high votes.
  • Opportunities for Growth: Investors can explore expanding popular chains and investing in niche concepts with high engagement. Chains with many outlets should enhance customer experience, while specialized establishments might consider scaling up while maintaining their unique appeal.

In summary, the Bangalore market features a dynamic blend of high-volume chains and niche establishments. Leveraging insights on customer preferences and engagement can guide strategic growth and investment decisions.

Restaurant Types Analysis

Type Distribution

The pie chart below shows the distribution of different restaurant types in Bangalore:

Type Distribution

Top Performing Categories in Terms of Votes and Engagement

The chart below highlights the top-performing restaurant categories based on votes and customer engagement:

Top Performing Categories

Characteristics Comparison by Category

The radar chart below compares the characteristics of each restaurant type:

Characteristics Comparison by Category

Key Findings:

  • Type Distribution: Delivery (48%), Dine-out (39%), Desserts (8%).
  • Top Performing Categories:
    • Drinks & Nightlife: Highest in engagement and ratings.
    • Buffet and Pubs and Bars: Slightly lower but still notable in engagement and ratings.
  • Lower Performing Categories:
    • Delivery, Dine-out, and Desserts: Show lower votes and ratings compared to other categories.

Market Implications:

  • Customer Preferences: Drinks & Nightlife venues are highly favored, indicating a preference for vibrant social experiences with entertainment and a lively atmosphere. High engagement suggests customers are willing to invest time and money in these experiences.
  • Delivery, Dine-out, and Desserts: These categories, despite being popular, show lower engagement and ratings. Improvements in service quality, food variety, or dining experiences could boost performance.
  • Opportunities:
    • Improving Delivery and Dine-out services through quality and uniqueness can enhance performance.
    • Diversifying offerings by integrating successful elements from high-performing categories could improve overall appeal.

Conclusion:

The Bangalore market exhibits diverse preferences, with Drinks & Nightlife experiences being highly valued. While there are opportunities to enhance Delivery, Dine-out, and Desserts offerings, focusing on quality and unique experiences can drive higher engagement and satisfaction. Investors should consider these insights for strategic growth and investment opportunities.

Cost Analysis: The Impact of Booking Tables, Online Orders, and Location on Restaurant Pricing

Introduction

The Indian restaurant and cafe market is characterized by diverse consumer preferences and business models. This analysis examines the influence of booking tables, online ordering, location, and customer feedback (ratings and votes) on pricing strategies within this sector.

Key Insights

Booking Tables and Pricing

Booking tables at restaurants shows a strong correlation with pricing. Establishments that offer table reservations tend to have higher price points. This is likely due to the premium experience associated with dine-in services, which includes not only the food but also the ambiance and personalized service. Consumers are often willing to pay more for the assurance of a reserved spot, especially in popular or high-demand venues.

Online Orders and High-Cost Restaurants

High-cost restaurants often do not offer online ordering services. This is primarily because the experience of dining in such establishments includes being physically present to enjoy the environment and service, which cannot be replicated through home delivery. Additionally, the logistical challenges and potential compromise on food quality during delivery deter high-end restaurants from offering online orders.

Impact of Location

Restaurants located on main roads generally exhibit higher pricing compared to those within residential areas. Being in a prime location allows these establishments to command higher prices due to increased visibility and accessibility. Moreover, such locations often attract a broader customer base, enabling them to cover a wider price range to cater to diverse economic segments.

Votes and Ratings Influence

While customer votes (the number of reviews) have a limited impact on pricing, ratings (the quality of reviews) significantly influence price levels. An increase in ratings from 3.5 to 4.3 is typically associated with higher prices, as it reflects consumer satisfaction and perceived value. However, beyond a rating of 4.3, prices tend to decrease. This trend suggests that to achieve exceptionally high ratings, restaurants might lower prices to enhance value perception and attract more customers, creating a balance between cost and quality.

Conclusion

The dynamics of the Indian restaurant market reveal that consumer preferences and business strategies are intricately linked. High-rated establishments often find themselves adjusting pricing to maintain quality and customer satisfaction. For investors, understanding these nuances can guide strategic decisions in the food service industry, emphasizing the importance of location, service offerings, and consumer engagement in determining pricing strategies.

Recommendations for Investors

  • Focus on Location: Invest in restaurants with strategic locations that naturally attract more foot traffic and can justify higher pricing.
  • Enhance Customer Experience: Encourage businesses to offer table bookings to capitalize on consumers' willingness to pay for a guaranteed dining experience.
  • Balance Pricing and Quality: For high-rating targets, focus on maintaining quality and adjusting prices to stay competitive without compromising the customer experience.
  • Leverage Customer Feedback: Use ratings and reviews as critical data points for continuous improvement and strategic pricing adjustments.

Visualizations

Scatter Plot Analysis:

The scatter plot below illustrates the relationship between cost, rating, table booking, main-road location, and online ordering:

Scatter Plot Analysis

Box Plots Comparison:

The box plots below compare costs by table booking, online ordering, and main-road location:

Box Plots Comparison

Location Analysis: Understanding Bangalore's Culinary Hotspots

Which is the Foodie Area?

The map below highlights the concentration of dining establishments across Bangalore, showing the most popular foodie areas:

Foodie Areas in Bangalore

Characteristics of Location on Other Variables

Location vs Cost Chart (Sorted by Cost)

The chart below illustrates the relationship between location and cost, sorted by cost:

Location vs Cost Chart

Location vs Votes Chart (Sorted by Votes)

The chart below shows the relationship between location and votes, sorted by votes:

Location vs Votes Chart

Top Locations Characteristics

The chart below highlights the characteristics of top locations in Bangalore:

Top Locations Characteristics

Market Analysis

Bangalore's dynamic culinary landscape offers various opportunities for strategic investment. By examining the distribution and characteristics of dining establishments across key neighborhoods, investors can make informed decisions.

Centralized Nightlife Venues

Drinks & Nightlife: Concentrated in the heart of Bangalore, these establishments cater to the city's vibrant, tech-savvy young professionals and expats seeking entertainment and social experiences. The central locations offer high visibility and access to a diverse clientele. Investment in these areas should focus on innovative concepts that combine local culture with international trends to attract a wide audience.

Western Buffet Offerings

Buffet: Predominantly located on the western side of the city center, buffets appeal to families and groups. These venues should emphasize diverse culinary options and value for money to attract the surrounding residential communities. Expanding in these areas can capitalize on the demand for family-friendly dining experiences.

Emerging Restaurant Hubs

Whitefield, Electronic City, BTM Layout, HSR Layout, Marathahalli: These neighborhoods account for nearly 30% of the city's restaurants, with Whitefield alone comprising 10%. Known for their vibrant youth culture and burgeoning tech industry, these areas are ideal for casual dining and quick-service restaurants. Investors should focus on creating hip, affordable venues that cater to students, young professionals, and tech workers.

High-Spending Customer Zones

Sankey Road, Lavelle Road, Race Course Road, Infantry Road: These affluent areas attract customers with higher spending power, making them suitable for upscale dining establishments. Restaurants here should offer gourmet cuisine, exceptional service, and a premium ambiance to meet the expectations of discerning diners. Innovative and exclusive dining concepts will thrive in these high-value zones.

Engagement-Driven Destinations

Rajarajeshwari Nagar, Lavelle Road, Church Street: Known for high engagement and vibrant atmospheres, these areas attract patrons seeking unique culinary experiences. Establishments should focus on creating interactive and memorable dining experiences, such as themed decor, live performances, or fusion menus that highlight both global and local flavors.

Conclusion

Bangalore's diverse neighborhoods offer varied opportunities for restaurant investments. By aligning restaurant concepts with the unique characteristics and customer profiles of each area, investors can optimize market reach and profitability. Understanding local consumer behavior, leveraging the city's tech-driven innovation, and maintaining cultural relevance will be key to successful ventures in this bustling metropolis.

Restaurant Types

Chart: Most Common Restaurant Types

Most Common Restaurant Types

Bar Chart: Relation Between Type, Ratings, Votes, and Scaled Number of Outlets

Relation Between Type, Ratings, Votes, and Scaled Number of Outlets

Radar Charts: Characteristics of Each Type

Characteristics of Each Restaurant Type

Specializations in the Food Sector

Bangalore's food scene is vibrant and diverse, encompassing a wide range of dining options. The market is primarily divided into three main categories:

  • Quick Bites (40%): This category includes fast-food outlets and quick-service restaurants. While it constitutes a significant portion of the market, it lacks strong customer engagement.
  • Casual Dining and Cafes: These segments together account for about two-thirds of the food market. Casual Dining and Cafes have higher customer engagement, suggesting that diners prefer a mix of convenience and a pleasant dining atmosphere.
  • Irani Cafes: These are currently trending due to their engaging atmosphere, reasonable pricing, and high ratings (up to 4.4). Irani Cafes offer unique dining experiences, filling a niche in the market with limited competition.

Investment Opportunities

  • Fine Dining: Although this sector is the most expensive and currently has limited customer interest, it presents an opportunity for experienced investors to develop exceptional dining experiences for high-end customers.
  • Drinks and Nightlife: Pubs, microbreweries, and clubs are popular among younger demographics and show strong demand. Investing in these venues can be lucrative due to their popularity and relatively good pricing.

Customer Types and Best Locations

  • Young Professionals and Millennials: This group favors microbreweries, pubs, and clubs. Ideal locations for these venues are busy areas with vibrant nightlife.
  • Families and Casual Diners: Families prefer Casual Dining and Cafes, which are best situated in suburban areas with a community-oriented vibe.
  • Wealthy and Special Occasion Diners: Fine Dining establishments cater to high-income individuals and special events. These should be located in upscale neighborhoods or near cultural landmarks.
  • Culture Lovers: Irani Cafes appeal to those interested in traditional and cultural dining experiences. These cafes perform well in historical areas that complement their cultural theme.

Restaurant Types Distribution

Below is a table showing the distribution of various restaurant types across different locations in Bangalore:

| Location | Restaurant Type | Count |
| --- | --- | --- |
| Yeshwantpur | Quick Bites | 385 |
| Yeshwantpur | Casual Dining | 177 |
| Yeshwantpur | Delivery | 117 |
| Yeshwantpur | Cafe | 53 |
| Yeshwantpur | Dessert Parlor | 45 |
| Yeshwantpur | Food Court | 31 |
| Yeshwantpur | Bakery | 27 |
| Yeshwantpur | Bar | 26 |
| Yeshwantpur | Sweet Shop | 15 |
| Wilson Garden | Takeaway | 61 |
| Wilson Garden | Kiosk | 9 |
| Wilson Garden | Mess | 8 |
| Whitefield | Beverage Shop | 43 |
| Whitefield | Pub | 13 |
| Whitefield | Lounge | 8 |
| Whitefield | Microbrewery | 7 |
| Whitefield | Fine Dining | 5 |
| Whitefield | Confectionery | 3 |
| Whitefield | Dhaba | 2 |
| Whitefield | Pop Up | 1 |
| West Bangalore | Food Truck | 15 |
| Sankey Road | Club | 1 |
| Lavelle Road | Irani Cafe | 1 |
| Kalyan Nagar | Meat Shop | 1 |
| ITPL Main Road, Whitefield | Bhojanalya | 1 |

Location Types Analysis

  • Yeshwantpur: Diverse and affordable, appealing to the middle class.
  • Wilson Garden: Budget-friendly and fast food, suited for economical diners.
  • Whitefield: Mid-to-high budget, attracting upper-middle-class and affluent customers, with a focus on nightlife.
  • West Bangalore: Food trucks offering diverse options, drawing a broad demographic.
  • Sankey Road, Lavelle Road, Kalyan Nagar, ITPL Main Road: Specialized dining experiences catering to niche markets.

Dishes Analysis

Most Preferred Dishes by Restaurant Type

Below is a table showcasing the most preferred dishes for each restaurant type, ranked from most to least popular:

| Restaurant Type | 1st Preference | 2nd Preference | 3rd Preference |
| --- | --- | --- | --- |
| Quick Bites | Paratha | Burgers | Rolls |
| Casual Dining | Biryani | Pasta | Cocktails |
| Delivery | Paratha | Biryani | Chicken Biryani |
| Cafe | Pasta | Burgers | Coffee |
| Dessert Parlor | Waffles | Coffee | Hot Chocolate |
| Food Court | Burgers | Noodles | Pasta |
| Bakery | Sandwiches | Coffee | Chocolate Cake |
| Bar | Cocktails | Beer | Pizza |
| Sweet Shop | Chaat | Samosa | Rasmalai |
| Takeaway | Paratha | Biryani | Salad |
| Kiosk | Rolls | Pasta | Pav Bhaji |
| Mess | Chicken Biryani | Chicken Fry | Masala Prawn |
| Beverage Shop | Sandwiches | Thick Shakes | Faluda |
| Pub | Cocktails | Beer | Pizza |
| Lounge | Cocktails | Nachos | Beer |
| Microbrewery | Cocktails | Craft Beer | Pizza |
| Fine Dining | Salads | Pasta | Cocktails |
| Dhaba | Naan | Rumali Roti | - |
| Food Truck | Pizza | Biryani | Momos |
| Club | Cocktails | Salads | Biryani |
| Irani Cafe | Okra | Pancakes | Cocktails |
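
A ranking like the one in the table above could be derived by counting how often each dish appears in dish_liked for every restaurant type. The sketch below is one possible way to do that in plain Python; the function name and the assumption that both rest_type and dish_liked are comma-separated strings are illustrative, not taken from the project's notebook.

```python
from collections import Counter, defaultdict

def split_cell(cell):
    """Split a comma-separated cell into trimmed, non-empty entries."""
    return [part.strip() for part in (cell or "").split(",") if part.strip()]

def top_dishes_by_type(rows, k=3):
    """Return the k most frequently liked dishes for each restaurant type."""
    counts = defaultdict(Counter)
    for row in rows:
        dishes = split_cell(row.get("dish_liked"))
        for rest_type in split_cell(row.get("rest_type")):
            counts[rest_type].update(dishes)  # credit every dish to each listed type
    return {
        rest_type: [dish for dish, _ in counter.most_common(k)]
        for rest_type, counter in counts.items()
    }
```

A restaurant listed under two types (for example "Cafe, Bar") contributes its dishes to both, which is a design choice of this sketch rather than something the source specifies.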

Cuisine Analysis

The following bar charts and radar charts provide insights into the market dynamics of various cuisines in Bangalore:

1. Cumulative Share of Each Cuisine

Cumulative Share of Each Cuisine

2. Relationship Between Cuisines and Rating, Votes, and Cost (Sorted by Cost)

Relationship Between Cuisines and Rating, Votes, and Cost (Sorted by Cost)

3. Relationship Between Cuisines and Rating, Votes, and Cost (Sorted by Votes)

Relationship Between Cuisines and Rating, Votes, and Cost (Sorted by Votes)

4. Characteristics of Each Cuisine (Radar Chart)

Characteristics of Each Cuisine (Radar Chart)

This analysis examines the market dynamics of various cuisines in Bangalore, focusing on engagement levels, pricing strategies, and investment potential. Bangalore's cosmopolitan nature creates a diverse culinary landscape, offering opportunities for both local and foreign cuisines to thrive.

Cantonese Cuisine

Analysis:

  • Engagement: High
  • Pricing: Premium
  • Target Audience: Affluent individuals seeking exclusive dining experiences.
  • Profitability: Significant due to high pricing, but the customer base is niche.

Recommendation:

  • Investment Strategy: Invest in Cantonese cuisine by emphasizing targeted marketing and exclusive dining experiences. The high price point limits the customer base but ensures high returns per customer.
  • Explanation: High engagement despite premium pricing indicates strong demand among affluent consumers who value authentic experiences. This niche market can be highly profitable but requires targeted strategies to attract and retain customers.

German Cuisine

Analysis:

  • Engagement: High
  • Spreading: Low
  • Pricing: Lower than Cantonese
  • Target Audience: Middle-income groups looking for authentic yet affordable experiences.
  • Profitability: Balanced between exclusivity and mass appeal.

Recommendation:

  • Investment Strategy: Focus on providing authentic experiences at competitive prices to attract a broad demographic.
  • Explanation: German cuisine’s lower price point and high engagement suggest it is accessible to a larger audience compared to high-end options. This balance can attract middle-income groups while maintaining profitability.

Sri Lankan, Parsi, and Russian Cuisines

Analysis:

  • Engagement: High
  • Spreading: Low
  • Pricing: Medium
  • Target Audience: Diners interested in cultural diversity and unique flavors.
  • Profitability: Steady returns with a focus on authenticity and distinctive experiences.

Recommendation:

  • Investment Strategy: Emphasize cultural authenticity and unique offerings to maintain high engagement.
  • Explanation: Medium pricing combined with high engagement indicates a strong interest in diverse culinary experiences. By highlighting authenticity and unique flavors, these cuisines can sustain their appeal and provide steady returns.

Singaporean Cuisine

Analysis:

  • Engagement: Growing
  • Spreading: Low
  • Pricing: Moderate
  • Target Audience: Indian audiences interested in diverse culinary experiences.
  • Profitability: Promising, with increasing appeal due to unique flavor profiles and fusion influences.

Recommendation:

  • Investment Strategy: Leverage strategic marketing, including culinary festivals and pop-up events, to enhance visibility and engagement.
  • Explanation: Singaporean cuisine's emerging popularity aligns with growing interest in diverse food options. Strategic marketing and events can capitalize on this trend and boost engagement.

Foreign vs. Local Cuisines

Analysis:

  • Foreign Cuisines: Generally attract high engagement and can command premium pricing. There is strong market openness to international flavors.
  • Local Cuisines: Despite comprising 30% of the restaurant market, face saturation and reduced engagement. Consumers seek novelty.

Recommendation:

  • Investment Strategy: For local cuisines, focus on innovative approaches such as new regional specialties or fusion dishes.
  • Explanation: The saturation of local cuisines like Northern and Southern Indian reduces consumer engagement. Introducing novel options can rejuvenate interest and offer a competitive edge.

Chinese Cuisine

Analysis:

  • Engagement: Low
  • Spreading: High
  • Pricing: Variable, often affordable
  • Target Audience: Wide-ranging, with a taste for fusion flavors.
  • Popularity: Ranks second to Northern Indian cuisine.

Recommendation:

  • Investment Strategy: Continue investing in Chinese cuisine by leveraging its established popularity and introducing innovative dishes.
  • Explanation: The strong market presence and adaptability of Chinese cuisine, coupled with its affordability, contribute to its sustained popularity. Innovative offerings can further enhance its market position.

African Cuisine

Analysis:

  • Engagement: Low but with potential for growth
  • Spreading: Low
  • Pricing: Variable
  • Target Audience: Health-conscious and adventurous diners.
  • Profitability: High potential due to low competition and alignment with current health trends.

Recommendation:

  • Investment Strategy: Develop a robust marketing strategy focusing on cultural festivals and events to increase engagement and build a loyal customer base.
  • Explanation: Despite currently low engagement, African cuisine's rich flavors and health-oriented offerings align with consumer trends towards diverse and healthy eating. Effective marketing can tap into this potential.

Local Cuisines: Northern and Southern Indian

Analysis:

  • Market Share: 30% of restaurants
  • Competition: High
  • Engagement: Reduced due to saturation

Recommendation:

  • Investment Strategy: Innovate within local cuisines by introducing new regional specialties or fusion dishes.
  • Explanation: The high level of competition and saturation in local cuisines necessitates differentiation. Innovation is key to capturing consumer interest and staying relevant in the market.

Journal Article Insights

Study:

  • Title: "Restaurants in Little India, Singapore: A Study of Spatial Organization and Pragmatic Cultural Change"
  • Findings: Offers insights into how restaurants adapt to cultural changes and spatial organization.

Application:

  • Strategy: Apply insights to organize and position restaurants in Bangalore effectively. Understanding spatial and cultural adaptations will enhance the effectiveness of foreign cuisine offerings.
  • Explanation: Adapting restaurant setups based on cultural and spatial insights can improve market positioning and customer appeal.

Conclusion

Bangalore's diverse food scene presents substantial investment opportunities in both foreign and local cuisines. While foreign cuisines like Cantonese, German, Singaporean, and African offer promising prospects due to their unique appeal and engagement, local cuisines require innovative approaches to capture consumer interest in a saturated market. Strategic investments in marketing and unique culinary experiences are crucial for success.

Classification Analysis of Restaurants

This analysis aims to classify restaurants into distinct categories: Low, Mid, Upper Mid, Lower High, and High Class, using unsupervised machine learning techniques. K-Means clustering was applied, and Principal Component Analysis (PCA) was utilized for 2D visualization of the clusters. The features used for clustering include: 'book_table', 'dish_liked', 'rest_type', 'cuisines', 'votes', and 'approx_cost(for two people)'.

K-Means Clustering

K-Means clustering was employed to group restaurants into five distinct clusters. The cluster configuration was tested to confirm that it produced the intended grouping. The following line plot illustrates the progression of the clustering process and the final results:

K-Means Clustering Line Plot
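
As a rough illustration of the clustering step, here is a minimal sketch of Lloyd's K-Means algorithm in plain Python. It is not the project's code: the deterministic farthest-first seeding, the helper names, and the toy points are all assumptions made for this example, and a library implementation would normally be used on the real features. The `inertia_by_k` dictionary shows the kind of quantity (within-cluster sum of squares per cluster count) that a line plot like the one above could track, if that is what it displays.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first_init(points, k):
    """Deterministic seeding (an assumption for this sketch): start at
    points[0], then repeatedly add the point farthest from its nearest
    already-chosen centroid."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # no change means convergence
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


def inertia(points, labels, centroids):
    """Within-cluster sum of squared distances."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))


# Made-up, already-scaled feature pairs forming two obvious groups.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(toy_points, k=2)
inertia_by_k = {k: inertia(toy_points, *kmeans(toy_points, k)) for k in (1, 2, 3)}
print(labels)
print(inertia_by_k)
```

On the real data the categorical features would first need to be encoded and scaled, since K-Means relies on Euclidean distance.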

Cluster Validation

To confirm the clustering results, a scatter plot was used to visualize the proximity of clusters in a 2D space:

K-Means Clustering Scatter Plot

Principal Component Analysis (PCA) was utilized to reduce the dimensionality of the data and visualize the clusters in a 2D plane. The following plot shows the clusters in the PCA-reduced space:

PCA Visualization of Clusters

Cluster Quality

The silhouette score test was performed to assess the quality of the clusters. The results indicate the cohesiveness and separation of the clusters:

Silhouette Score Plot
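
For readers unfamiliar with the metric, the sketch below computes a mean silhouette score from scratch. For each point, `a` is the mean distance to the other members of its own cluster and `b` is the lowest mean distance to any other cluster; the point's score is `(b - a) / max(a, b)`. The function name and toy data are assumptions for illustration, not the project's code.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette over all points; needs at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # Cohesion: the distance from p to itself is 0, so divide by n - 1.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # Separation: mean distance to the closest other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up points: two tight pairs far apart.
toy_points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good_score = silhouette_score(toy_points, [0, 0, 1, 1])  # pairs kept together
bad_score = silhouette_score(toy_points, [0, 1, 0, 1])   # pairs split apart
print(good_score, bad_score)
```

Scores near 1 indicate compact, well-separated clusters, while negative scores suggest points assigned to the wrong cluster.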

Characteristics of Clusters

Radar charts were used to visualize the characteristics of each cluster, providing insights into their distinct attributes:

Radar Charts of Cluster Characteristics

Summary of Findings:

  • Low Class: Characterized by high traffic and spreading across the city. These restaurants offer balanced online order features, very low variety in specialization and dishes (cuisines), low pricing, and low ratings. They do not offer table booking.
  • Mid Class: Exhibits balanced traffic, high online order capabilities, above-average variety in dishes, but low variety in specializations. Pricing is higher than the Low Class but still low. These restaurants have low ratings and do not offer table booking.
  • Upper Mid Class: Shows high traffic and engagement but very low variety in specialization, high variety in dishes, average pricing, and rating. These restaurants offer table booking.
  • Lower High Class: Characterized by low traffic but very high engagement. They have average variety in specializations, high variety in dishes, above-average pricing, high ratings, and average table booking with low online orders.
  • High Class: Features very low traffic and low engagement, no online order capabilities, above-average table booking, average rating, very expensive pricing, high variety in dishes, and high variety in specialization.

Overall, the analysis provides a comprehensive classification of restaurants, allowing for targeted marketing strategies and investment decisions based on the restaurant's class.

Customer Behavior Analysis with Review Aspect Sentiment Analysis

Overview

In analyzing customer behavior in Bengaluru's restaurant industry, our findings reveal that the number of reviews is skewed toward higher-end establishments, which is expected since budget restaurants typically prioritize affordability over experience and quality. To accurately compare customer sentiments across different restaurant categories, it's essential to set a clear price range that defines each class.

Interestingly, there's a noticeable drop in the number of reviews in the price range of approximately 1,500 to 2,300 INR. This unusual pattern warrants further investigation to understand its underlying causes.

Additionally, word cloud analysis shows that the overall experience is as crucial as the food itself, sometimes even more so, particularly in higher-end establishments. However, it's important to note that these conclusions are primarily relevant to higher-class restaurants due to the bias in the data toward these types of establishments.

Tools and Techniques

For this analysis, I used SpaCy and NLTK to balance speed with accuracy, enabling a swift but reliable sentiment analysis of the review data.
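
To make the idea of aspect sentiment concrete, here is a deliberately simplified, lexicon-based sketch in plain Python. It is a stand-in for illustration only and does not reproduce the SpaCy/NLTK pipeline used in the project: the keyword sets, polarity word lists, and function name are all hypothetical, and negation ("not good") is not handled.

```python
import re
from collections import defaultdict

# Hypothetical, tiny lexicons; a real analysis would need far richer ones.
ASPECT_KEYWORDS = {
    "food": {"food", "taste", "dish"},
    "service": {"service", "staff", "waiter"},
    "ambiance": {"ambiance", "ambience", "decor", "music"},
    "price": {"price", "cost", "expensive", "cheap"},
    "cleanliness": {"clean", "dirty", "hygiene"},
}
POSITIVE = {"good", "great", "amazing", "friendly", "clean", "cheap", "tasty"}
NEGATIVE = {"bad", "slow", "rude", "dirty", "expensive", "bland"}


def aspect_sentiment(review):
    """Score each aspect by the polarity of the sentences that mention it."""
    scores = defaultdict(int)
    for sentence in re.split(r"[.!?]+", review.lower()):
        words = re.findall(r"[a-z']+", sentence)
        # Sentence polarity = positive word count minus negative word count.
        polarity = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
        for aspect, keywords in ASPECT_KEYWORDS.items():
            if keywords.intersection(words):
                scores[aspect] += polarity
    return dict(scores)


example = "The food was great. Service was slow and the staff was rude! Very clean place."
result = aspect_sentiment(example)
print(result)
```

Averaging such per-review scores by restaurant class and type gives one value per aspect, which is the shape of data a radar chart needs.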

Price Range and Review Distribution

I visualized the relationship between the number of reviews and the number of restaurants for each price point using a line chart with a confidence interval.

Price Range and Review Distribution

Word Cloud Analysis: Most Repeated Nouns in Reviews

An analysis of the most repeated nouns in reviews was conducted and visualized using a word cloud. This analysis helps to understand what aspects customers focus on the most in their reviews.

Word Cloud of Most Repeated Nouns in Reviews

Price Range Analysis: 1,700–2,300 INR

Over 90% of the restaurants in this price range belong to the Mid, Upper-Mid, and High-Class groups. However, engagement and ratings are lower because the area is equally divided between High-class and Mid-class residents, which makes it challenging to meet both groups' expectations in terms of quality and budget.

What classes fall in this price range? The following pie chart illustrates the distribution of different classes within this price range.

Class Distribution in Price Range 1,700-2,300 INR

Analysis of Restaurant Types and Pricing for the 1,700–2,300 INR Price Range

The following analysis explores the frequency of each type of restaurant within each class and the average price for High and Mid-Class establishments. This analysis helps to identify which types of establishments are most prevalent in this price range and how their pricing strategies differ.

Frequency and Pricing of Restaurant Types in 1,700-2,300 INR Range

The High class has the largest share in this price range, with Dine-out being the most common type among all categories.

Average Price Analysis for All Class Types

Now, let’s look at the average price for each class type in this range.

Average Price Comparison for High and Mid-Class Types

Review Analysis by Class in the 1,700–2,300 INR Range

Next, we analyze the reviews to understand customer opinions on ambiance, service, food quality, price, special features, and cleanliness for each restaurant type in each class.

For High-Class

Radar Chart for High-Class

For Lower-High Class

Radar Chart for Lower-High Class

For Mid-Class

Radar Chart for Mid-Class

For Lower-Mid Class

Radar Chart for Lower-Mid Class

The review analysis presented here should be viewed with caution due to inherent biases. The data is skewed towards specific customer segments, as not everyone chooses to leave reviews. Additionally, the analysis prioritized speed over accuracy, making it a useful starting point but not a comprehensive assessment of all review data.

Price Range Analysis (1500-2300 INR):

In the 1500-2300 INR price range, we've identified that this segment spans multiple customer classes, creating challenges in meeting diverse expectations. If restaurants focus on enhancing food quality and the overall experience to satisfy higher-end customers, prices might become unsatisfactory for lower-end customers, and vice versa.

However, establishments within this price range that cater to the lower-high class, particularly in categories like Dine-out, Drinks & Nightlife, and Pubs and Bars, tend to perform better. Despite this, overall review sentiment remains less positive compared to other segments, highlighting the difficulty of catering to a diverse customer base within this range.

Recommendations:

For investors targeting this price range, it is essential to carefully select the location and decor to align with the desired customer class. Additionally, the marketing team should intensify efforts to effectively target the right customer segment.

What Do Reviews Reveal About Each Class Across All Price Ranges?

The radar chart below shows the aspect sentiment analysis for each class:

Higher Class

Higher Class Radar Chart

Lower-High Class

Lower-High Class Radar Chart

Upper-Mid Class

Upper-Mid Class Radar Chart

Middle Class

Middle Class Radar Chart

Lower Class

Lower Class Radar Chart

Analysis of Customer Satisfaction by Class and Type

High Class:

For the high class, there is a noticeable similarity in customer satisfaction across Delivery, Cafes, and Drinks & Nightlife categories. This similarity might be due to the convenience and premium experience these options offer, catering well to the preferences of this segment. However, Buffet services stand out with higher satisfaction levels compared to other types, possibly because they offer a variety of options that appeal to this class. On the other hand, Dine-out experiences show poor satisfaction, particularly regarding pricing, likely because the high prices may not align with the perceived value for this class.

Recommendations for Investors:

For investors targeting the high class, focusing on enhancing the buffet experience could be promising. It’s also advisable to reconsider pricing strategies for Dine-out services to better meet the expectations of this segment.

Lower-High Class:

In the lower-high class, there is a similarity in satisfaction levels across Dine-out, Pubs & Bars, and Drinks & Nightlife categories, but all of these show lower satisfaction regarding pricing. This may be because the pricing in these categories doesn't match the perceived value for this group. Desserts also show poor satisfaction, indicating a potential area for improvement.

Recommendations for Investors:

Investors could focus on improving pricing strategies, food quality, special features, and cleanliness in Dessert offerings. Additionally, enhancing the Buffet experience with better pricing and cleanliness could attract this class.

Upper-Mid Class:

The upper-mid class shows similarities in satisfaction across Delivery, Cafes, and Dine-out categories, with generally acceptable satisfaction except for pricing. However, Desserts and Drinks & Nightlife categories show poor satisfaction, particularly regarding pricing and cleanliness. The best satisfaction levels are seen in Pubs & Bars and Buffets.

Recommendations for Investors:

For this segment, maintaining the high standards in Pubs & Bars and Buffets is key. Meanwhile, improving pricing and cleanliness in Desserts and Drinks & Nightlife could yield better satisfaction.

Mid Class:

The mid class displays poor satisfaction in Drinks & Nightlife and medium satisfaction in Pubs & Bars. In contrast, Buffets receive high satisfaction, but cleanliness is a concern. For Drinks & Nightlife and Pubs & Bars, satisfaction with food quality, services, and special features is moderate, suggesting areas for potential improvement.

Recommendations for Investors:

Investors should focus on improving cleanliness in Buffets and enhancing the overall experience in Drinks & Nightlife and Pubs & Bars, particularly in food quality and special features.

Low Class:

The low class has the fewest reviews, but the analysis shows significant satisfaction in Delivery, Desserts, Dine-out, Cafes, and Buffets. However, Buffet pricing shows very poor satisfaction, and Dessert pricing has only medium satisfaction. Overall, there is low satisfaction with Desserts across all aspects, and drinks-oriented venues show medium to poor satisfaction with pricing across all classes.

Recommendations for Investors:

For the low class, investors should focus on addressing pricing concerns in Buffets and across all drinks-oriented venue types, and on improving the overall Dessert experience, including pricing and quality. These could be potential areas for gaining a competitive advantage.

Prediction of Price Using Machine Learning

In this section, we explore different machine learning models to predict restaurant prices. The process involves several steps, including data preparation, model selection, and performance evaluation.

Data Preparation

  1. Data Cleaning:
    • Removed columns: url, address, name, rest_type, dish_liked, cuisines, reviews_list, review_sentiment_list, menu_item.
    • Applied binary encoding to columns: online_order, book_table, is_new, is_road.
    • Applied ordinal encoding to the classes column.
    • Removed the most expensive restaurant, an outlier that negatively affected the model.
    • Split the data into training and test sets to prevent target encoding leakage.
  2. Target Encoding:
    • Applied target encoding to categorical features: rest_type_0, rest_type_1, cuisines_0, cuisines_1, cuisines_2, cuisines_3, cuisines_4, cuisines_5, cuisines_6, cuisines_7, dish_liked_0, dish_liked_1, dish_liked_2, dish_liked_3, dish_liked_4, dish_liked_5, dish_liked_6, listed_in(type), listed_in(city), location.
    • Summarized each group of columns into single columns and created a feature map to replace values in the test set.
  3. Correlation Analysis:
    • Reviewed heatmap for correlation between features.
    Heatmap
  4. Variance Inflation Factor (VIF):
    • Reviewed VIF scores to address multicollinearity. Key features with high VIF values were considered for feature selection.
    Feature                       VIF
    online_order                  2.580422
    book_table                    3.697239
    rate                          100.615671
    votes                         1.767582
    location                      31.621818
    approx_cost(for two people)   12.031427
    listed_in(type)               15.694212
    listed_in(city)               58.392958
    is_new                        1.153140
    weighted_rating               394.177513
    is_rate_valid                 20.250555
    is_road                       1.334013
    count                         4.735270
    lat                           100721.311388
    lon                           101671.964833
    num_spec                      23.435287
    num_dish_liked                40.287972
    num_reviews                   1.244345
    num_menu_item                 1.360053
    num_cuisines                  34.303792
    cluster                       6.643466
    classes                       29.319548
    rest_type                     15.245586
    cuisines                      25.992990
    dish_liked                    35.868954
  5. Selected Features:
    • Based on VIF and heatmap analysis, the final features selected were: rest_type, cuisines, approx_cost(for two people), classes, num_spec, num_cuisines, num_dish_liked, listed_in(type), listed_in(city), num_reviews, book_table, online_order, votes, is_rate_valid.
  6. Standardization:
    • Applied StandardScaler to standardize each feature to zero mean and unit variance, so that features measured on larger scales do not dominate those measured on smaller scales.
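
The leakage point in the steps above can be sketched as follows: the category-to-mean-price map and the scaling statistics are learned from the training rows only and then merely applied to the test rows. This is a minimal plain-Python illustration, not the project's code. The function names, the fallback to the global training mean for unseen categories, and the sample prices are assumptions. The population standard deviation is used because that is what StandardScaler computes.

```python
import statistics
from collections import defaultdict


def fit_target_encoding(categories, targets):
    """Learn category -> mean target from TRAINING rows only."""
    sums, counts = defaultdict(float), defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    mapping = {cat: sums[cat] / counts[cat] for cat in sums}
    global_mean = sum(targets) / len(targets)
    return mapping, global_mean


def apply_target_encoding(categories, mapping, fallback):
    """Replace categories using the fitted map; unseen ones get the fallback."""
    return [mapping.get(cat, fallback) for cat in categories]


def fit_standardizer(values):
    """Mean and population standard deviation of the training values."""
    return statistics.fmean(values), statistics.pstdev(values)


def standardize(values, mean, std):
    return [(v - mean) / std for v in values]


# Made-up training rows: a location column and cost-for-two targets.
train_location = ["BTM", "BTM", "Indiranagar", "Indiranagar", "Koramangala"]
train_cost = [300, 500, 800, 1200, 600]
test_location = ["BTM", "Whitefield"]  # "Whitefield" never appears in training

mapping, global_mean = fit_target_encoding(train_location, train_cost)
train_encoded = apply_target_encoding(train_location, mapping, global_mean)
test_encoded = apply_target_encoding(test_location, mapping, global_mean)

mean, std = fit_standardizer(train_encoded)  # statistics come from train only
train_scaled = standardize(train_encoded, mean, std)
test_scaled = standardize(test_encoded, mean, std)
print(test_encoded, test_scaled)
```

Because the test rows never contribute to the map or to the scaling statistics, the test-set metrics are not inflated by information from the target.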

Model Performance

  1. Random Forest:
    • Configuration: 600 estimators.
    • Results:
      • Training Score: 0.9685
      • R² Score: 0.7961
      • MAE: 111.75
      • MSE: 26100.80
      • RMSE: 161.56
    Random Forest Results
  2. XGBoost:
    • Configuration:
      • objective: 'reg:squarederror'
      • learning_rate: 0.1
      • max_depth: 6
      • alpha: 10
      • n_estimators: 100
      • eval_metric: 'rmse'
    • Results:
      • MAE: 110.13
      • MSE: 25320.74
      • RMSE: 159.12
      • R² Score: 0.8022
    XGBoost Results
  3. Neural Network:
    • Model Structure:
      • Input Layer: Dense layer with 13 units, activation function: tanh
      • Hidden Layer 1: Dense layer with 32 units, activation function: tanh, BatchNormalization, Dropout (0.18)
      • Hidden Layer 2: Dense layer with 64 units, activation function: tanh, BatchNormalization, Dropout (0.18)
      • Hidden Layer 3: Dense layer with 32 units, activation function: tanh, BatchNormalization, Dropout (0.18)
      • Output Layer: Dense layer with 1 unit
    • Model Configuration:
      • Optimizer: AdamW, learning rate: 0.1
      • Loss Function: Mean Squared Error
      • Epochs: 100000
      • Batch Size: 1024
    • Results:
      • MAE: 112.15
      • MSE: 27070.94
      • RMSE: 164.53
      • R² Score: 0.7885
    Neural Network Results
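
For reference, the four reported metrics can be computed as in the sketch below, which also checks that each reported RMSE matches the square root of the reported MSE. The function name and the toy prices are assumptions for illustration. The `reported` figures are copied from the results above.

```python
import math


def regression_metrics(y_true, y_pred):
    """MAE, MSE, RMSE and R² for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1 - (mse * n) / ss_tot                         # 1 - SS_res / SS_tot
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse), "R2": r2}


# Made-up prices (INR for two people), used only to exercise the function.
toy = regression_metrics([400, 600, 800], [450, 600, 700])
print(toy)

# Reported (MSE, RMSE) pairs from the results above.
reported = {
    "Random Forest": (26100.80, 161.56),
    "XGBoost": (25320.74, 159.12),
    "Neural Network": (27070.94, 164.53),
}
rmse_gap = {name: abs(math.sqrt(mse) - rmse) for name, (mse, rmse) in reported.items()}
print(rmse_gap)
```

All three gaps fall within rounding error, so the reported MSE and RMSE values are mutually consistent. Since RMSE is in the same unit as the target, an RMSE of about 160 can be read directly as a typical error in INR.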

Future Improvements

  1. Increase Review Volume and Enhance Aspect Analysis: Gather more reviews and apply more advanced aspect analysis using a paid LLM. Richer aspect features would give the model more signal and could lead to more accurate predictions.
  2. Improve Address Accuracy: Obtain more precise addresses or correct existing ones. This will help reveal detailed location information, which can significantly impact price predictions.
  3. Incorporate Detailed Menu Pricing: Instead of predicting an approximation for the entire menu, obtain detailed menus with item-specific prices. This approach will enhance the accuracy of price predictions.
  4. Expand Data Collection: Collect additional data to improve the overall model performance and prediction accuracy.
