README.Rmd

---
title: "US Hardship Index"
output:
  md_document:
    variant: markdown_github

  rmarkdown::html_vignette:
    self_contained: no
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r opts, echo = FALSE}
knitr::opts_chunk$set (
    collapse = TRUE,
    warning = TRUE,
    message = TRUE,
    width = 120,
    comment = "#>",
    fig.retina = 2,
    fig.path = "README-"
)
```

[![R build
status](https://github.com/UrbanAnalyst/us-hardship-index/workflows/R-CMD-check/badge.svg)](https://github.com/UrbanAnalyst/us-hardship-index/actions?query=workflow%3AR-CMD-check)
[![Project Status: Active](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)


# US Hardship Index

`us-hardship-index` is an R package for calculating a hardship index for
specified US states, using data from the US Census Bureau's *American Community
Survey*. It includes the single primary function `hs_hardship_index()`, which
accepts the two parameters of:

- "state" for the desired US state; and
- "year" for the desired year (with values available since 2010).

The index itself is calculated from the methodology of "*An Update on Urban
Hardship*," by Lisa M. Monteil, Richard P. Nathan, and David J. Wright (2004),
*The Nelson A. Rockefeller Institute of Goverment*. It is formed as a multiple
of the following six measures, each standardised to unit (or percentage)
scales:

- *occupancy*: Proportion of rooms with > 1 occupant per room;
- *poverty*: Proportion of households below poverty line;
- *unemployment*: Proportion of unemployed adults;
- *no_hs*: Proportion of population without highschool diploma
- *deps*:  Proportion of population who may be considered dependent; that is
  either under 18 or over 65
- *income*: Per-capita income

All variables are quantified such that lower values are better, except for
income. The hardship index is then simply the product of those six metrics,
again standardised to a unit (or percentage) scale. An example of the index in
action is provided by [the city of Chicago](
https://data.cityofchicago.org/Health-Human-Services/Census-Data-Selected-socioeconomic-indicators-in-C/kn9c-c2s2/data),
as a table of the six metrics plus their conversion to a composite hardship
index.

Note that those Chicago data are ultimately transformed into a single rank for
each measured area, and so manifest a perfectly uniform distribution. In
contrast, the values returned by the `hs_hardship_index()` function are not
transformed, and so generally manifest highly skewed distributions. The
logarithm of these values will nevertheless generally be approximately normally
distributed. Accordingly, any statistical analyses of hardship values should
generally be applied to log-transformed versions of the values derived here.

## Access to Census Bureau Data

This package requires an API key for census.gov, which can be obtained from
[the Census Bureau's website](https://api.census.gov/data/key_signup.html).
This key should be stored as an environment variable named `CENSUS_API_KEY`,
generally by specifying its value in the `~/.Renviron` file. Alternatively, the
key can be set with the [`tidycensus` function,
`census_api_key()`](https://walker-data.com/tidycensus/reference/census_api_key.html).