The data dive continues!
Now, it's time to take what you've learned about Python Pandas and apply it to new situations. For this assignment, you'll need to complete one of two (not both) Data Challenges. Once again, which challenge you take on is your choice. Just be sure to give it your all -- as the skills you hone will become powerful tools in your data analytics tool belt.
-
Create a new repository for this project called
pandas-challenge
. Do not add this homework to an existing repository. -
Clone the new repository to your computer.
-
Inside your local git repository, create a directory for the Pandas Challenge you choose. Use folder names corresponding to the challenges: HeroesOfPymoli or PyCitySchools.
-
Add your Jupyter notebook to this folder. This will be the main script to run for analysis.
-
Push the above changes to GitHub or GitLab.
Congratulations! After a lot of hard work in the data munging mines, you've landed a job as Lead Analyst for an independent gaming company. You've been assigned the task of analyzing the data for their most recent fantasy game Heroes of Pymoli.
Like many others in its genre, the game is free-to-play, but players are encouraged to purchase optional items that enhance their playing experience. As a first task, the company would like you to generate a report that breaks down the game's purchasing data into meaningful insights.
Your final report should include each of the following:
- Total Number of Players
- Number of Unique Items
- Average Purchase Price
- Total Number of Purchases
- Total Revenue
- Percentage and Count of Male Players
- Percentage and Count of Female Players
- Percentage and Count of Other / Non-Disclosed
- The below each broken by gender
- Purchase Count
- Average Purchase Price
- Total Purchase Value
- Average Purchase Total per Person by Gender
- The below each broken into bins of 4 years (i.e. <10, 10-14, 15-19, etc.)
- Purchase Count
- Average Purchase Price
- Total Purchase Value
- Average Purchase Total per Person by Age Group
- Identify the the top 5 spenders in the game by total purchase value, then list (in a table):
- SN
- Purchase Count
- Average Purchase Price
- Total Purchase Value
- Identify the 5 most popular items by purchase count, then list (in a table):
- Item ID
- Item Name
- Purchase Count
- Item Price
- Total Purchase Value
- Identify the 5 most profitable items by total purchase value, then list (in a table):
- Item ID
- Item Name
- Purchase Count
- Item Price
- Total Purchase Value
As final considerations:
- You must use the Pandas Library and the Jupyter Notebook.
- You must submit a link to your Jupyter Notebook with the viewable Data Frames.
- You must include a written description of three observable trends based on the data.
- See Example Solution for a reference on expected format.