Please add alt text to your posts

Please add alt text (alternative text) to all of your posted graphics for #TidyTuesday.

Twitter provides guidelines for how to add alt text to your images.

The DataViz Society/Nightingale by way of Amy Cesal has an article on writing good alt text for plots/graphs.

Here's a simple formula for writing alt text for data visualization:

Chart type

It's helpful for people with partial sight to know what chart type it is and gives context for understanding the rest of the visual. Example: Line graph

Type of data

What data is included in the chart? The x and y axis labels may help you figure this out. Example: number of bananas sold per day in the last year

Reason for including the chart

Think about why you're including this visual. What does it show that's meaningful. There should be a point to every visual and you should tell people what to look for. Example: the winter months have more banana sales

Link to data or source

Don't include this in your alt text, but it should be included somewhere in the surrounding text. People should be able to click on a link to view the source data or dig further into the visual. This provides transparency about your source and lets people explore the data. Example: Data from the USDA

Penn State has an article on writing alt text descriptions for charts and tables.

Charts, graphs and maps use visuals to convey complex images to users. But since they are images, these media provide serious accessibility issues to colorblind users and users of screen readers. See the examples on this page for details on how to make charts more accessible.

The {rtweet} package includes the ability to post tweets with alt text programatically.

Need a reminder? There are extensions that force you to remember to add Alt Text to Tweets with media.

Formula 1 Races

The data this week comes from the Ergast API, which has a CC-BY license. H/t to Sara Stoudt for sharing the link to the data by way of Data is Plural!

FiveThirtyEight published a nice article on "Who's The Best Formula One Driver Of All Time?". While the ELO data is not present in this dataset, you could calculate your own rating or using the {elo} package to create ELO scores.

Per Wikipedia, Formula 1:

Formula One (also known as Formula 1 or F1) is the highest class of international auto racing for single-seater formula racing cars sanctioned by the Fédération Internationale de l'Automobile (FIA). The World Drivers' Championship, which became the FIA Formula One World Championship in 1981, has been one of the premier forms of racing around the world since its inaugural season in 1950. The word formula in the name refers to the set of rules to which all participants' cars must conform. A Formula One season consists of a series of races, known as Grands Prix, which take place worldwide on both purpose-built circuits and closed public roads.

The results of each race are evaluated using a points system to determine two annual World Championships: one for drivers, the other for constructors. Each driver must hold a valid Super Licence, the highest class of racing licence issued by the FIA. The races must run on tracks graded "1" (formerly "A"), the highest grade-rating issued by the FIA. Most events occur in rural locations on purpose-built tracks, but several events take place on city streets.

Each team can be called a "constructor" and they have two drivers. For example, Lewis Hamilton is the primary (driver) for the Mercedes team (constructor).

License

Complete images of the Ergast database are published shortly after each race under the Attribution-NonCommercial-ShareAlike 3.0 Unported Licence.

Data

There is an option for raw CSVs (which is what is included in this repo), a SQL database, or querying the raw API. This is a great dataset to practice with using the httr package to query an API, SQL against the database or dbplyr against the database! You can also work with the raw CSVs and practice your dplyr::left_join() and friends. Read more about dplyr joins in the dplyr "joins" documentation.

If you wish to query the raw API, you can check the docs for example, to get a table of all drivers who have ever finished #1 in the championship: http://ergast.com/mrd/methods/status/. There is also an option to download the SQL database itself.

download.file(
  "http://ergast.com/downloads/f1db_ansi.sql.gz", 
  destfile = "f1db-mysql.zip"
)

Working with this data will require you to do several left joins, for example to get the standings for each race/driver. Each of the tables listed in the data dictionary have their keys for joining. If you don't want to dig too deep into the data, then I would recommend starting here. This is a good dataset of results by race, driver, season!

driver_results_df <- driver_standings %>% 
  left_join(races, by = "raceId") %>% 
  rename(driver_url = url) %>% 
  left_join(drivers, by = "driverId")
  
glimpse(driver_results_df)

#> Rows: 33,206
#> Columns: 22
#> $ driverStandingsId <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, #> 14, 1…
#> $ raceId            <dbl> 18, 18, 18, 18, 18, 18, 18, 18, 19, 19, 19, #> 19, …
#> $ driverId          <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, #> 8, …
#> $ points            <dbl> 10, 8, 6, 5, 4, 3, 2, 1, 14, 11, 6, 6, 10, 3, #> 2,…
#> $ position          <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 1, 3, 6, 7, 4, 9, 10, #> 2,…
#> $ positionText      <chr> "1", "2", "3", "4", "5", "6", "7", "8", "1", #> "3"…
#> $ wins              <dbl> 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, #> 1, …
#> $ year              <dbl> 2008, 2008, 2008, 2008, 2008, 2008, 2008, #> 2008, …
#> $ round             <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, #> 2, …
#> $ circuitId         <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, #> 2, …
#> $ name              <chr> "Australian Grand Prix", "Australian Grand #> Prix"…
#> $ date              <date> 2008-03-16, 2008-03-16, 2008-03-16, #> 2008-03-16,…
#> $ time              <chr> "04:30:00", "04:30:00", "04:30:00", #> "04:30:00", …
#> $ driver_url        <chr> #> "http://en.wikipedia.org/wiki/2008_Australian_Gr…
#> $ driverRef         <chr> "hamilton", "heidfeld", "rosberg", "alonso", #> "ko…
#> $ number            <chr> "44", "\\N", "6", "14", "\\N", "\\N", "\\N", #> "7"…
#> $ code              <chr> "HAM", "HEI", "ROS", "ALO", "KOV", "NAK", #> "BOU",…
#> $ forename          <chr> "Lewis", "Nick", "Nico", "Fernando", #> "Heikki", "…
#> $ surname           <chr> "Hamilton", "Heidfeld", "Rosberg", "Alonso", #> "Ko…
#> $ dob               <date> 1985-01-07, 1977-05-10, 1985-06-27, #> 1981-07-29,…
#> $ nationality       <chr> "British", "German", "German", "Spanish", #> "Finni…
#> $ url               <chr> #> "http://en.wikipedia.org/wiki/Lewis_Hamilton", "…

To query the raw API, you can use httr, just make sure to end the call/url in .json to return JSON data.

library(tidyverse)
library(jsonlite)
library(httr)

standing <- 1
raw_json <- httr::GET(url = glue::glue(
  "http://ergast.com/api/f1/driverStandings/{standing}/drivers.json")) %>% 
  content(type = "text", encoding = "UTF-8") %>% 
  jsonlite::parse_json(simplifyVector = FALSE) 
  
raw_json %>% 
  View()
  
winner_table <- raw_json$MRData$DriverTable$Drivers %>%
  tibble(data = .) %>%
  unnest_wider(data)
  
winner_table %>% glimpse()
#> Rows: 30
#> Columns: 8
#> $ driverId        <chr> "alonso", "mario_andretti", "ascari", "j…
#> $ permanentNumber <chr> "14", NA, NA, NA, "22", NA, NA, NA, NA, …
#> $ code            <chr> "ALO", NA, NA, NA, "BUT", NA, NA, NA, NA…
#> $ url             <chr> "http://en.wikipedia.org/wiki/Fernando_A…
#> $ givenName       <chr> "Fernando", "Mario", "Alberto", "Jack", …
#> $ familyName      <chr> "Alonso", "Andretti", "Ascari", "Brabham…
#> $ dateOfBirth     <chr> "1981-07-29", "1940-02-28", "1918-07-13"…
#> $ nationality     <chr> "Spanish", "American", "Italian", "Austr…

Get the data here

# Get the Data

# Read in with tidytuesdayR package 
# Install from CRAN via: install.packages("tidytuesdayR")
# This loads the readme and all the datasets for the week of interest

# Either ISO-8601 date or year/week works!

tuesdata <- tidytuesdayR::tt_load('2021-09-07')
tuesdata <- tidytuesdayR::tt_load(2021, week = 37)

results <- tuesdata$results

# Or read in the data manually

circuits <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-07/circuits.csv')
constructor_results <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-07/constructor_results.csv')
constructor_standings <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-07/constructor_standings.csv')
constructors <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-07/constructors.csv')
driver_standings <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-07/driver_standings.csv')
drivers <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-07/drivers.csv')
lap_times <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-07/lap_times.csv')
pit_stops <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-07/pit_stops.csv')
qualifying <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-07/qualifying.csv')
races <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-07/races.csv')
results <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-07/results.csv')
seasons <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-07/seasons.csv')
status <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-07/status.csv')

Data Dictionary

List of Tables
circuits
constructorResults
constructorStandings
constructors
driverStandings
drivers
lapTimes
pitStops
qualifying
races
results
seasons
status

General Notes
Dates, times and durations are in ISO 8601 format
Dates and times are UTC
Strings use UTF-8 encoding
Primary keys are for internal use only
Fields ending with "Ref" are unique identifiers for external use
A grid position of '0' is used for starting from the pitlane
Labels used in the positionText fields:
"D" - disqualified
"E" - excluded
"F" - failed to qualify
"N" - not classified
"R" - retired
"W" - withdrew

`circuits.csv`

Field	Type	Null	Key	Default	Extra	Description
circuitId	int(11)	NO	PRI	NULL	auto_increment	Primary key
circuitRef	varchar(255)	NO				Unique circuit identifier
name	varchar(255)	NO				Circuit name
location	varchar(255)	YES		NULL		Location name
country	varchar(255)	YES		NULL		Country name
lat	float	YES		NULL		Latitude
lng	float	YES		NULL		Longitude
alt	int(11)	YES		NULL		Altitude (metres)
url	varchar(255)	NO	UNI			Circuit Wikipedia page

constructor_results table

Field	Type	Null	Key	Default	Extra	Description
constructorResultsId	int(11)	NO	PRI	NULL	auto_increment	Primary key
raceId	int(11)	NO		0		Foreign key link to races table
constructorId	int(11)	NO		0		Foreign key link to constructors table
points	float	YES		NULL		Constructor points for race
status	varchar(255)	YES		NULL		"D" for disqualified (or null)

constructor_standings table

Field	Type	Null	Key	Default	Extra	Description
constructorStandingsId	int(11)	NO	PRI	NULL	auto_increment	Primary key
raceId	int(11)	NO		0		Foreign key link to races table
constructorId	int(11)	NO		0		Foreign key link to constructors table
points	float	NO		0		Constructor points for season
position	int(11)	YES		NULL		Constructor standings position (integer)
positionText	varchar(255)	YES		NULL		Constructor standings position (string)
wins	int(11)	NO		0		Season win count

constructors table

Field	Type	Null	Key	Default	Extra	Description
constructorId	int(11)	NO	PRI	NULL	auto_increment	Primary key
constructorRef	varchar(255)	NO				Unique constructor identifier
name	varchar(255)	NO	UNI			Constructor name
nationality	varchar(255)	YES		NULL		Constructor nationality
url	varchar(255)	NO				Constructor Wikipedia page

driver_standings table

Field	Type	Null	Key	Default	Extra	Description
driverStandingsId	int(11)	NO	PRI	NULL	auto_increment	Primary key
raceId	int(11)	NO		0		Foreign key link to races table
driverId	int(11)	NO		0		Foreign key link to drivers table
points	float	NO		0		Driver points for season
position	int(11)	YES		NULL		Driver standings position (integer)
positionText	varchar(255)	YES		NULL		Driver standings position (string)
wins	int(11)	NO		0		Season win count

drivers table

Field	Type	Null	Key	Default	Extra	Description
driverId	int(11)	NO	PRI	NULL	auto_increment	Primary key
driverRef	varchar(255)	NO				Unique driver identifier
number	int(11)	YES		NULL		Permanent driver number
code	varchar(3)	YES		NULL		Driver code e.g. "ALO"
forename	varchar(255)	NO				Driver forename
surname	varchar(255)	NO				Driver surname
dob	date	YES		NULL		Driver date of birth
nationality	varchar(255)	YES		NULL		Driver nationality
url	varchar(255)	NO	UNI			Driver Wikipedia page

lap_times table

Field	Type	Null	Key	Default	Description
raceId	int(11)	NO	PRI	NULL	Foreign key link to races table
driverId	int(11)	NO	PRI	NULL	Foreign key link to drivers table
lap	int(11)	NO	PRI	NULL	Lap number
position	int(11)	YES		NULL	Driver race position
time	varchar(255)	YES		NULL	Lap time e.g. "1:43.762"
milliseconds	int(11)	YES		NULL	Lap time in milliseconds

pit_stops table

Field	Type	Null	Key	Default	Description
raceId	int(11)	NO	PRI	NULL	Foreign key link to races table
driverId	int(11)	NO	PRI	NULL	Foreign key link to drivers table
stop	int(11)	NO	PRI	NULL	Stop number
lap	int(11)	NO		NULL	Lap number
time	time	NO		NULL	Time of stop e.g. "13:52:25"
duration	varchar(255)	YES		NULL	Duration of stop e.g. "21.783"
milliseconds	int(11)	YES		NULL	Duration of stop in milliseconds

qualifying table

Field	Type	Null	Key	Default	Extra	Description
qualifyId	int(11)	NO	PRI	NULL	auto_increment	Primary key
raceId	int(11)	NO		0		Foreign key link to races table
driverId	int(11)	NO		0		Foreign key link to drivers table
constructorId	int(11)	NO		0		Foreign key link to constructors table
number	int(11)	NO		0		Driver number
position	int(11)	YES		NULL		Qualifying position
q1	varchar(255)	YES		NULL		Q1 lap time e.g. "1:21.374"
q2	varchar(255)	YES		NULL		Q2 lap time
q3	varchar(255)	YES		NULL		Q3 lap time

races table

Field	Type	Null	Key	Default	Extra	Description
raceId	int(11)	NO	PRI	NULL	auto_increment	Primary key
year	int(11)	NO		0		Foreign key link to seasons table
round	int(11)	NO		0		Round number
circuitId	int(11)	NO		0		Foreign key link to circuits table
name	varchar(255)	NO				Race name
date	date	NO		0000-00-00		Race date e.g. "1950-05-13"
time	time	YES		NULL		Race start time e.g."13:00:00"
url	varchar(255)	YES	UNI	NULL		Race Wikipedia page

results table

Field	Type	Null	Key	Default	Extra	Description
resultId	int(11)	NO	PRI	NULL	auto_increment	Primary key
raceId	int(11)	NO		0		Foreign key link to races table
driverId	int(11)	NO		0		Foreign key link to drivers table
constructorId	int(11)	NO		0		Foreign key link to constructors table
number	int(11)	YES		NULL		Driver number
grid	int(11)	NO		0		Starting grid position
position	int(11)	YES		NULL		Official classification, if applicable
positionText	varchar(255)	NO				Driver position string e.g. "1" or "R"
positionOrder	int(11)	NO		0		Driver position for ordering purposes
points	float	NO		0		Driver points for race
laps	int(11)	NO		0		Number of completed laps
time	varchar(255)	YES		NULL		Finishing time or gap
milliseconds	int(11)	YES		NULL		Finishing time in milliseconds
fastestLap	int(11)	YES		NULL		Lap number of fastest lap
rank	int(11)	YES		0		Fastest lap rank, compared to other drivers
fastestLapTime	varchar(255)	YES		NULL		Fastest lap time e.g. "1:27.453"
fastestLapSpeed	varchar(255)	YES		NULL		Fastest lap speed (km/h) e.g. "213.874"
statusId	int(11)	NO		0		Foreign key link to status table

seasons table

Field	Type	Null	Key	Default	Extra	Description
year	int(11)	NO	PRI	0		Primary key e.g. 1950
url	varchar(255)	NO	UNI			Season Wikipedia page

status table

Field	Type	Null	Key	Default	Extra	Description
statusId	int(11)	NO	PRI	NULL	auto_increment	Primary key
status	varchar(255)	NO				Finishing status e.g. "Retired"

Cleaning Script

Not a real cleaning script, just me exploring the data structures.

library(tidyverse)
library(fs)
library(httr)

# if you want SQL with tables
# download.file(
#   "http://ergast.com/downloads/f1db_ansi.sql.gz", 
#   destfile = "2021/2021-09-07/f1db-mysql.zip"
#   )

download.file(
  "http://ergast.com/downloads/f1db_csv.zip", 
  destfile = "2021/2021-09-07/f1db.zip"
)

unzip("2021/2021-09-07/f1db.zip", exdir = "2021/2021-09-07/")

raw_data <- map(
  fs::dir_ls("2021/2021-09-07/", glob = "*.csv"),
  read_csv
  ) %>% 
  set_names(nm = str_remove(names(.), "2021/2021-09-07/"))

raw_data %>% 
  str(max.level = 1)

# Example of JSON/HTTR
raw_json <- httr::GET(url = glue::glue(
  "http://ergast.com/api/f1/driverStandings/{standing}/drivers.json")) %>% 
  content(type = "text", encoding = "UTF-8")

raw_json%>% 
  View()

raw_json$MRData$DriverTable$Drivers %>%
  tibble(data = .) %>%
  unnest_wider(data)

driver_standings %>% 
  left_join(raw_data$races.csv, by = "raceId") %>% 
  rename(driver_url = url) %>% 
  left_join(raw_data$drivers.csv, by = "driverId")

file_names <- fs::dir_ls("2021/2021-09-07/", glob = "*.csv") %>% 
  str_remove("2021/2021-09-07/") %>% 
  str_remove(".csv")
  
file_names

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

readme.md

readme.md

Please add alt text to your posts

Chart type

Type of data

Reason for including the chart

Link to data or source

Formula 1 Races

License

Data

Get the data here

Data Dictionary

`circuits.csv`

constructor_results table

constructor_standings table

constructors table

driver_standings table

drivers table

lap_times table

pit_stops table

qualifying table

races table

results table

seasons table

status table

Cleaning Script

Files

readme.md

Latest commit

History

readme.md

File metadata and controls

Please add alt text to your posts

Chart type

Type of data

Reason for including the chart

Link to data or source

Formula 1 Races

License

Data

Get the data here

Data Dictionary

circuits.csv

constructor_results table

constructor_standings table

constructors table

driver_standings table

drivers table

lap_times table

pit_stops table

qualifying table

races table

results table

seasons table

status table

Cleaning Script

`circuits.csv`