07-Model.Rmd

---
title: "Model"
output: html_notebook
---

```{r setup, message=FALSE}
library(tidyverse)
library(modelr)
library(broom)

wages <- heights %>% filter(income > 0)
```

## Your Turn 1

Fit the model on the slide and then examine the output. What does it look like?

```{r}
mod_e <- lm(log(income) ~ education, data = wages)
mod_e
```

## Your Turn 2

Use a pipe to model `log(income)` against `height`. Then use broom and dplyr functions to extract:

1. The **coefficient estimates** and their related statistics 
2. The **adj.r.squared** and **p.value** for the overall model

```{r, error = TRUE}
mod_h <- wages %>% lm(    )


```

## Your Turn 3

Model `log(income)` against `education` _and_ `height`. Do the coefficients change?

```{r, error = TRUE}
mod_eh <- wages %>% lm(    )

```

## Your Turn 4

Model `log(income)` against `education` and `height` and `sex`. Can you interpret the coefficients?

```{r, error = TRUE}
mod_ehs <- wages %>%  lm(   )
```

## Your Turn 5

Use a broom function and ggplot2 to make a line graph of `height` vs `.fitted` for our heights model, `mod_h`.

_Bonus: Overlay the plot on the original data points._

```{r}
mod_h <- wages %>% lm(log(income) ~ height, data = .)

```

## Your Turn 6

Repeat the process to make a line graph of `height` vs `.fitted` colored by `sex` for model `mod_ehs`. Are the results interpretable? Add `+ facet_wrap(~education)` to the end of your code. What happens?

```{r}
mod_ehs <- wages %>% lm(log(income) ~ education + height + sex, data = .)

```

## Your Turn 7

Use one of `spread_predictions()` or `gather_predictions()` to make a line graph of `height` vs `pred` colored by `model` for each of mod_h, mod_eh, and mod_ehs. Are the results interpretable? 

Add `+ facet_grid(sex ~ education)` to the end of your code. What happens?

```{r warning = FALSE, message = FALSE}
mod_h <- wages %>% lm(log(income) ~ height, data = .)
mod_eh <- wages %>% lm(log(income) ~ education + height, data = .)
mod_ehs <- wages %>% lm(log(income) ~ education + height + sex, data = .)


```

***

# Take Aways

* Use `glance()`, `tidy()`, and `augment()` from the **broom** package to return model values in a data frame.

* Use `add_predictions()` or `gather_predictions()` or `spread_predictions()` from the **modelr** package to visualize predictions.

* Use `add_residuals()` or `gather_residuals()` or `spread_residuals()` from the **modelr** package to visualize residuals.