-
Notifications
You must be signed in to change notification settings - Fork 47
/
README.Rmd
69 lines (50 loc) · 2.98 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/"
)
```
# datasauRus <img src="man/figures/logo.png" align="right" />
<!-- badges: start -->
[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)
[![CRAN status](https://www.r-pkg.org/badges/version/datasauRus)](https://CRAN.R-project.org/package=datasauRus)
[![R-CMD-check](https://github.com/jumpingrivers/datasauRus/workflows/R-CMD-check/badge.svg)](https://github.com/jumpingrivers/datasauRus/actions)
[![R-CMD-check](https://github.com/jumpingrivers/datasauRus/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/jumpingrivers/datasauRus/actions/workflows/R-CMD-check.yaml)
<!-- badges: end -->
This package wraps the awesome Datasaurus Dozen datasets. The Datasaurus Dozen show us why visualisation is important -- summary statistics can be the same but distributions can be very different. In short, this package gives a fun alternative to [Anscombe's Quartet](https://en.wikipedia.org/wiki/Anscombe%27s_quartet), available in R as `anscombe`.
The original Datasaurus was created by Alberto Cairo. The other Dozen were generated using simulated annealing and the process
is described in the paper "Same Stats, Different Graphs: Generating
Datasets with Varied Appearance and Identical Statistics through
Simulated Annealing" by Justin
Matejka and George Fitzmaurice ([open access materials including manuscript and code](https://www.research.autodesk.com/publications/same-stats-different-graphs/), [official paper](https://doi.org/10.1145/3025453.3025912)).
In the paper, Justin and George simulate a variety of datasets that the same summary statistics to the Datasaurus but have very different distributions.
```{r, out.width="600px", fig.alt="Sequential dinosaur gif", echo = FALSE}
knitr::include_graphics("https://damassets.autodesk.net/content/dam/autodesk/research/publications-assets/gifs/same-stats-different-graphs/DinoSequentialSmaller.gif")
```
## Install
The latest stable version is available on CRAN
```{r, eval = FALSE}
install.packages("datasauRus")
```
You can get the latest development version from GitHub, so use {devtools} to install the package
```{r, eval = FALSE}
devtools::install_github("jumpingrivers/datasauRus")
```
## Usage
You can use the package to produce Anscombe plots and more.
```{r datasets, fig.height=12, fig.width=9}
library("ggplot2")
library("datasauRus")
ggplot(datasaurus_dozen, aes(x = x, y = y, colour = dataset))+
geom_point() +
theme_void() +
theme(legend.position = "none")+
facet_wrap(~dataset, ncol = 3)
```
## Code of Conduct
Please note that the datasauRus project is released with a [Contributor Code of Conduct](https://jumpingrivers.github.io/datasauRus/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms