revert vignette formatting hacks
closes #803
jgabry committed Aug 1, 2023
1 parent 323c14d commit e53b503
Showing 2 changed files with 15 additions and 101 deletions.
41 changes: 3 additions & 38 deletions vignettes/cmdstanr.Rmd
@@ -187,7 +187,7 @@ first argument specifies the variables to summarize and any arguments
after that are passed on to `posterior::summarise_draws()` to specify
which summaries to compute, whether to use multiple cores, etc.

```{r summary, eval=FALSE}
```{r summary}
fit$summary()
fit$summary(variables = c("theta", "lp__"), "mean", "sd")
@@ -202,24 +202,6 @@ fit$summary(
)
```

```{r, echo=FALSE}
# NOTE: the hack of using print.data.frame in chunks with echo=FALSE
# is used because the pillar formatting of posterior draws_summary objects
# isn't playing nicely with pkgdown::build_articles().
options(digits = 2)
print.data.frame(fit$summary())
print.data.frame(fit$summary(variables = c("theta", "lp__"), "mean", "sd"))
print.data.frame(fit$summary("theta", pr_lt_half = ~ mean(. <= 0.5)))
print.data.frame(fit$summary(
variables = NULL,
posterior::default_summary_measures(),
extra_quantiles = ~posterior::quantile2(., probs = c(.0275, .975))
))
```

#### CmdStan's stansummary utility

CmdStan itself provides a `stansummary` utility that can be called using the
@@ -351,20 +333,11 @@ the `$sample()` method demonstrated above.

We can find the (penalized) maximum likelihood estimate (MLE) using [`$optimize()`](https://mc-stan.org/cmdstanr/reference/model-method-optimize.html).

```{r optimize, eval=FALSE}
```{r optimize}
fit_mle <- mod$optimize(data = data_list, seed = 123)
fit_mle$summary() # includes lp__ (log prob calculated by Stan program)
fit_mle$mle("theta")
```
```{r, echo=FALSE}
# NOTE: the hack of using print.data.frame in chunks with echo=FALSE
# is used because the pillar formatting of posterior draws_summary objects
# isn't playing nicely with pkgdown::build_articles().
options(digits = 2)
fit_mle <- mod$optimize(data = data_list, seed = 123)
print.data.frame(fit_mle$summary()) # includes lp__ (log prob calculated by Stan program)
fit_mle$mle("theta")
```

Here's a plot comparing the penalized MLE to the posterior distribution of
`theta`.
@@ -380,18 +353,10 @@ We can run Stan's experimental variational Bayes algorithm (ADVI) using the
[`$variational()`](https://mc-stan.org/cmdstanr/reference/model-method-variational.html)
method.

```{r variational, eval=FALSE}
```{r variational}
fit_vb <- mod$variational(data = data_list, seed = 123, output_samples = 4000)
fit_vb$summary("theta")
```
```{r, echo=FALSE}
# NOTE: the hack of using print.data.frame in chunks with echo=FALSE
# is used because the pillar formatting of posterior draws_summary objects
# isn't playing nicely with pkgdown::build_articles().
options(digits = 2)
fit_vb <- mod$variational(data = data_list, seed = 123, output_samples = 4000)
print.data.frame(fit_vb$summary("theta"))
```

The `$draws()` method can be used to access the approximate posterior draws.
Let's extract the draws, make the same plot we made after MCMC, and compare the
75 changes: 12 additions & 63 deletions vignettes/posterior.Rmd
@@ -15,62 +15,38 @@ vignette: >
```{r child="children/_settings-knitr.Rmd"}
```

<!--
NOTE: the hack below of using print.data.frame in chunks with echo=FALSE
is used because the pillar formatting of posterior draws_summary objects
isn't playing nicely with pkgdown::build_articles(). When that is fixed
using options(digits=2) also won't be necessary anymore.
-->
```{r, include=FALSE}
options(digits=2)
```
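
For context, here is a minimal sketch of what the removed workaround did, assuming the `tibble` package is installed. The toy values are hypothetical and are not taken from any fit. Calling `print.data.frame()` directly bypasses the pillar-based tibble print method and falls back to base R formatting, which is also why `options(digits = 2)` was set alongside it.

```r
library(tibble)

# Hypothetical summary-like tibble (values are made up for illustration)
toy <- tibble(variable = c("mu", "tau"), mean = c(4.38912, 3.61234))

# Default tibble printing goes through pillar
pillar_out <- capture.output(print(toy))

# The workaround: call the base data.frame print method directly,
# with the number of printed digits reduced via options()
old <- options(digits = 2)
base_out <- capture.output(print.data.frame(toy))
options(old)  # restore the previous setting

cat(pillar_out, sep = "\n")
cat(base_out, sep = "\n")
```

With the hack reverted, the vignette chunks rely on the default print method again, so each call no longer needs to be duplicated in a hidden chunk.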

## Summary statistics

We can easily customise the summary statistics reported by `$summary()` and `$print()`.
We can easily customize the summary statistics reported by `$summary()` and `$print()`.

```{r eval=FALSE}
```{r}
fit <- cmdstanr::cmdstanr_example("schools", method = "sample")
fit$summary()
```
```{r echo=FALSE}
fit <- cmdstanr::cmdstanr_example("schools", method = "sample")
print.data.frame(fit$summary())
```

By default all variables are summarized with the following functions:
```{r}
posterior::default_summary_measures()
```

To change the variables summarised, we use the variables argument
```{r eval=FALSE}
To change the variables summarized, we use the variables argument
```{r}
fit$summary(variables = c("mu", "tau"))
```
```{r echo=FALSE}
print.data.frame(fit$summary(variables = c("mu", "tau")))
```

We can additionally change which functions are used
```{r eval=FALSE}
```{r}
fit$summary(variables = c("mu", "tau"), mean, sd)
```
```{r echo=FALSE}
print.data.frame(fit$summary(variables = c("mu", "tau"), mean, sd))
```

To summarise all variables with non-default functions, it is necessary to explicitly set the variables argument, either to `NULL` or the full vector of variable names.
```{r eval=FALSE}
To summarize all variables with non-default functions, it is necessary to explicitly set the variables argument, either to `NULL` or the full vector of variable names.
```{r}
fit$metadata()$model_params
fit$summary(variables = NULL, "mean", "median")
```
```{r echo=FALSE}
fit$metadata()$model_params
print.data.frame(fit$summary(variables = NULL, "mean", "median"))
```

Summary functions can be specified by character string, function, or using a formula (or anything else supported by [rlang::as_function]). If these arguments are named, those names will be used in the tibble output. If the summary results are named they will take precedence.
```{r eval=FALSE}
```{r}
my_sd <- function(x) c(My_SD = sd(x))
fit$summary(
c("mu", "tau"),
@@ -81,58 +57,31 @@ fit$summary(
Minimum = function(x) min(x)
)
```
```{r echo=FALSE}
my_sd <- function(x) c(My_SD = sd(x))
print.data.frame(fit$summary(
c("mu", "tau"),
MEAN = mean,
"median",
my_sd,
~quantile(.x, probs = c(0.1, 0.9)),
Minimum = function(x) min(x)
))
```
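
As a small illustration of the three ways to specify a summary function described above, the sketch below calls `rlang::as_function()` directly. It assumes the `rlang` package is installed and uses made-up numbers rather than posterior draws. The names `pr_le_half`, `by_name`, and `by_fun` are hypothetical.

```r
library(rlang)

# A formula lambda: proportion of values at or below 0.5
# (the first argument is available as .x inside the formula)
pr_le_half <- as_function(~ mean(.x <= 0.5))

# A character string naming a function, and a plain function
by_name <- as_function("mean")
by_fun  <- as_function(mean)

x <- c(0.1, 0.4, 0.6, 0.9)  # hypothetical values, not real draws
pr_le_half(x)
by_name(x)
by_fun(x)
```

All three forms end up as ordinary functions, which is why they can be mixed freely in a single `$summary()` call.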


Arguments to all summary functions can also be specified with `.args`.
```{r eval=FALSE}
```{r}
fit$summary(c("mu", "tau"), quantile, .args = list(probs = c(0.025, .05, .95, .975)))
```
```{r echo=FALSE}
print.data.frame(fit$summary(c("mu", "tau"), quantile, .args = list(probs = c(0.025, .05, .95, .975))))
```

The summary functions are applied to the array of sample values, with dimension `iter_sampling`x`chains`.
```{r eval=FALSE}
```{r}
fit$summary(variables = NULL, dim, colMeans)
```
```{r echo=FALSE}
print.data.frame(fit$summary(variables = NULL, dim, colMeans))
```


For this reason users may have unexpected results if they use `stats::var()` directly, as it will return a covariance matrix. An alternative is the `distributional::variance()` function,
which can also be accessed via `posterior::variance()`.
```{r eval=FALSE}
```{r}
fit$summary(c("mu", "tau"), posterior::variance, ~var(as.vector(.x)))
```
```{r echo=FALSE}
print.data.frame(fit$summary(c("mu", "tau"), posterior::variance, ~var(as.vector(.x))))
```
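
The behaviour described above can be reproduced in base R without a fitted model. This is a minimal sketch that uses a simulated matrix as a stand-in for one variable's draws. The 1000 x 4 shape is an assumption chosen to mimic `iter_sampling` x `chains`.

```r
set.seed(1)

# Stand-in for one variable's draws: 1000 iterations by 4 chains (simulated)
draws <- matrix(rnorm(4000), nrow = 1000, ncol = 4)

dim(draws)        # iterations x chains
colMeans(draws)   # one mean per chain

# var() on a matrix treats columns as variables and returns a covariance matrix
cov_mat <- var(draws)
dim(cov_mat)

# Flattening first gives a single variance over all draws
scalar_var <- var(as.vector(draws))
length(scalar_var)
```

This is the reason the vignette example wraps the input in `as.vector()` before calling `var()`.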


Summary functions need not be numeric, but these won't work with `$print()`.

```{r eval=FALSE}
```{r}
strict_pos <- function(x) if (all(x > 0)) "yes" else "no"
fit$summary(variables = NULL, "Strictly Positive" = strict_pos)
# fit$print(variables = NULL, "Strictly Positive" = strict_pos)
```
```{r echo=FALSE}
strict_pos <- function(x) if (all(x > 0)) "yes" else "no"
print.data.frame(fit$summary(variables = NULL, "Strictly Positive" = strict_pos))
# fit$print(variables = NULL, "Strictly Positive" = strict_pos)
```

For more information, see `posterior::summarise_draws()`, which is called by `$summary()`.
