---
title: "P8105 Final Project Proposal"
authors: "Melvin Coleman (mbc2178)"
date: "2022-11-12"
output: github_document
---
### Project Team
- Hilina Giday (hg2596)
- Taylor Bays (tab2187)
- Jiawen Zhao (jz3570)
- Sarah Forrest (sef2183)
- Melvin Coleman (mbc2178)
### Tentative Project Title
The 2018 Fédération Internationale de Football Association (FIFA) Men's World Cup:
Factors that predict the total number of goals scored by each participating
nation during the tournament
### Motivation
The Fédération Internationale de Football Association (FIFA) Men's World Cup
is an international soccer competition that takes place every 4 years
and is contested by the national teams of 32 member nations. The World Cup is
one of the most watched sporting events in the world, and its global popularity
draws large sums of money into team sponsorship, hosting, and betting. This
year, the tournament will take place in Qatar from November 20 to December 18.
The tournament features some of the world's best soccer players, and we are interested
in understanding the factors that predict a team's performance on the pitch, namely the
number of goals scored and/or wins during the tournament. The more games a team wins,
the better its chances of winning the entire tournament. With the World Cup taking place
in a few weeks, our project could offer some insight into predicting which nation
might take home perhaps the biggest trophy in sports.
### Intended Final Products
The intended final products will include:<br>
- A report with visualizations, exploratory analyses, and methods <br>
- A webpage and screencast
### Anticipated Data Source(s)
- [2018 World Cup Data](https://www.kaggle.com/datasets/ahmedelnaggar/fifa-worldcup-2018-dataset)
- [2018 GDP data](http://data.un.org/Data.aspx?q=GdP&d=SNAAMA&f=grID%3A101%3BcurrID%3AUSD%3BpcFlag%3A1)
- [The Fjelstul World Cup Database](https://github.com/jfjelstul/worldcup)
### Planned Analyses / Visualizations / Coding Challenges
Our planned analyses include:<br>
- Regression modeling <br>
  - Outcome variable: total number of goals scored by each team <br>
  - Predictors: GDP, population, FIFA ranking, etc. <br>

Our planned visualizations include:<br>
- Bar plot of GDP for each country, or scatter plot of GDP vs. total goals scored <br>
- Line chart of each country's winning trend in the World Cup from 1930 to 2018 <br>
- Box plot of the number of goals in all past games for each country <br>

Some coding challenges we may encounter include a small sample size, because the
World Cup has only 32 competing countries, and creating interactive plots
for our website. We also anticipate scheduling challenges in finding time to meet
and/or code as a group.
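
As a rough illustration of the regression described above, the R sketch below fits a linear model with `lm()` and, as one possible alternative for a count outcome, a Poisson model with `glm()`. The proposal does not specify the model type, so both choices are assumptions. The data frame is simulated placeholder data, not real World Cup or GDP figures, and the column names (`gdp_per_capita`, `population_millions`, `fifa_ranking`, `total_goals`) are hypothetical.

```r
# Simulated placeholder data: every value below is randomly generated
# and does NOT come from the World Cup or GDP data sources.
set.seed(8105)
n_teams <- 32  # number of competing nations, as stated in the proposal

teams <- data.frame(
  team                = paste0("team_", seq_len(n_teams)),
  gdp_per_capita      = round(runif(n_teams, 1000, 60000)),   # hypothetical units: USD
  population_millions = round(runif(n_teams, 0.3, 210), 1),
  fifa_ranking        = sample(1:70, n_teams)                 # hypothetical ranking values
)
teams$total_goals <- rpois(n_teams, lambda = 4)  # simulated outcome (counts)

# Assumed option 1: ordinary linear regression on the goal total.
fit_lm <- lm(
  total_goals ~ log(gdp_per_capita) + log(population_millions) + fifa_ranking,
  data = teams
)

# Assumed option 2: Poisson regression, often used when the outcome is a count.
fit_pois <- glm(
  total_goals ~ log(gdp_per_capita) + log(population_millions) + fifa_ranking,
  family = poisson(),
  data = teams
)

summary(fit_lm)
summary(fit_pois)
```

With only 32 teams and three predictors, the linear model has 28 residual degrees of freedom, which is one concrete way the small-sample concern noted above limits how many predictors can be included.
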
### Timeline
- November 11, 2022: Proposal Meeting
- November 12, 2022: Turn in Proposal to Professor
- November 14-18, 2022: Review the data and finalize the steps needed
to clean it adequately
- November 28 - December 2, 2022: Start the project and begin writing the report
with the detailed project description.
- December 5-9, 2022: Create a web page overview of the project and record a
video walking through the webpage.
- December 15, 2022: In-class discussion of project