add option to keep original data
drmowinckels committed Feb 26, 2024
1 parent 0b26375 commit 1d00617
Showing 5 changed files with 192 additions and 18 deletions.
45 changes: 39 additions & 6 deletions R/qdec.R
@@ -3,14 +3,37 @@
#' @param data data.frame, list or environment
#' containing the data. Neither matrix nor an array
#' will be accepted.
-#' @param formula a model \code{\link{formula}} or \code{\link{terms}} object
#' @param formula a model \code{\link{formula}} or
#' \code{\link{terms}} object. For a simple qdec file,
#' with binary columns for all levels of a factor,
#' make sure to include \code{-1} in the formula to
#' remove the intercept.
#' @param path an optional file path to write the qdec to csv
#' @param keep_orig logical or vector of column names,
#' to keep the original data in the output.
#' Default is \code{FALSE}.
#'
-#' @return
#' @return data.frame with model matrix and
#' scaled continuous variables.
#' @export
#'
#' @examples
-make_qdec <- function(data, formula, path = NULL) {
#' cars <- mtcars
#' cars$cyl <- as.factor(cars$cyl)
#' cars$gear <- as.factor(cars$gear)
#'
#' make_qdec(cars, mpg ~ cyl + hp)
#'
#' # Remove the intercept, necessary to follow
#' # steps from Freesurfer docs
#' make_qdec(cars, mpg ~ -1 + cyl + hp)
#' make_qdec(cars, mpg ~ -1 + cyl + hp + gear)
#'
#' # Keep the original data also in the output
#' make_qdec(cars, mpg ~ -1 + cyl + hp, keep_orig = TRUE)
make_qdec <- function(data, formula,
path = NULL,
keep_orig = FALSE) {
# extract variable names from formula
vars <- all.vars(formula)

@@ -21,14 +44,23 @@ make_qdec <- function(data, formula, path = NULL) {
mm <- stats::model.matrix(formula, data)
mm <- as.data.frame(mm)

-cl <- apply(data, 2, class)
# scale continuous variables
cl <- sapply(data, class)
# drop = FALSE keeps a data.frame even when only one column is numeric
dataz <- data[, which(cl %in% "numeric"), drop = FALSE]
dataz <- apply(dataz, 2, scale_vec)
-dataz <- as.data.frame(dataz)
dataz <- as.data.frame(dataz, check.names = FALSE)
names(dataz) <- paste0(names(dataz), "z")

# combine model matrix and scaled data
qdec <- cbind(mm, dataz)
qdec <- qdec[, !names(qdec) %in% names(data)]

# keep_orig is either a character vector of column names to keep,
# or TRUE to keep every original column
if(is.character(keep_orig)){
data <- data[, keep_orig, drop = FALSE]
qdec <- cbind(qdec, data)
}else if(isTRUE(keep_orig)){
qdec <- cbind(qdec, data)
}

# write to path if requested
if(!is.null(path)){
@@ -46,12 +78,13 @@ make_qdec <- function(data, formula, path = NULL) {
#'
#' @return scaled vector
#' @export
#' @keywords internal
#' @examples
#' scale_vec(1:20)
scale_vec <- function(x, ...) {
# Error if x has more than one dimension
if(!is.null(dim(x))){
-stop("Input must be a vector", call. = FALSE)
stop("Input `x` must be a vector", call. = FALSE)
}
as.numeric(scale(x, ...))
}
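
The `keep_orig` handling added to `make_qdec()` can be sketched in isolation with base R. This is a minimal illustration, not the package's code: the helper name `append_orig` is hypothetical, and it assumes `keep_orig` is either a single logical or a character vector of column names, as the roxygen text describes.

```r
# Hypothetical helper mirroring the keep_orig branch of make_qdec();
# it is not part of freesurfer.lme.
append_orig <- function(qdec, data, keep_orig = FALSE) {
  if (is.character(keep_orig)) {
    # Character vector: append only the named original columns
    cbind(qdec, data[, keep_orig, drop = FALSE])
  } else if (isTRUE(keep_orig)) {
    # TRUE: append every original column
    cbind(qdec, data)
  } else {
    # FALSE: return the qdec columns unchanged
    qdec
  }
}

# Stand-in for the qdec columns: one scaled variable
qdec <- data.frame(hpz = as.numeric(scale(mtcars$hp)))

some_cols <- names(append_orig(qdec, mtcars, keep_orig = c("mpg", "cyl")))
no_cols   <- names(append_orig(qdec, mtcars, keep_orig = FALSE))
all_cols  <- names(append_orig(qdec, mtcars, keep_orig = TRUE))
```

Testing the value with `is.character()` rather than inspecting `class()` keeps the logical and character cases clearly separate.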
46 changes: 42 additions & 4 deletions README.Rmd
@@ -5,6 +5,7 @@ output: github_document
<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, include = FALSE}
options(max.print = 60)
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
@@ -17,24 +18,61 @@ knitr::opts_chunk$set(

<!-- badges: start -->
[![R-CMD-check](https://github.com/capro-uio/freesurfer.lme/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/capro-uio/freesurfer.lme/actions/workflows/R-CMD-check.yaml)
[![CRAN status](https://www.r-pkg.org/badges/version/freesurfer.lme)](https://CRAN.R-project.org/package=freesurfer.lme)
[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
<!-- badges: end -->

Running vertex-wise linear mixed effects models in [Freesurfer](https://surfer.nmr.mgh.harvard.edu/fswiki/LinearMixedEffectsModels) requires a specific file format.
The qdec format requires binary columns for the levels of each
categorical variable, and is more efficient when continuous
variables are scaled (or de-meaned).
This package provides functions to create these files from R.


## Installation

-You can install the development version of freesurfer.lme like so:
You can install the development version of `freesurfer.lme` like so:

``` r
-# FILL THIS IN! HOW CAN PEOPLE INSTALL YOUR DEV PACKAGE?
# Install from the Capro R-universe
install.packages('freesurfer.lme',
repos = 'https://capro-uio.r-universe.dev')
```

## Example

-This is a basic example which shows you how to solve a common problem:
Using R's model formula machinery, we can create a qdec file from a model formula and a data frame.
The function relies on the data being set up correctly (categorical variables as factors, etc.), which you would normally do in R.

```{r example}
library(freesurfer.lme)
-## basic example code
# prep data
cars <- mtcars
cars$type <- row.names(cars)
row.names(cars) <- NULL
cars$cyl <- factor(cars$cyl)
cars$gear <- factor(cars$gear)
cars
```

Make sure to include `-1` in the formula to remove the intercept.

```{r example1}
make_qdec(cars, mpg ~ -1 + cyl + hp)
```

Without it, the first level of the factor is absorbed into the
intercept, so not every factor level gets its own column.

```{r example2}
make_qdec(cars, mpg ~ cyl + hp)
```


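Including `-1` gives a qdec in the form expected from the
[Freesurfer documentation](https://surfer.nmr.mgh.harvard.edu/fswiki/LinearMixedEffectsModels).
Note that only the first factor in the formula gets a column for every
level; later factors (here `gear`) still drop their first level.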

```{r example3}
make_qdec(cars, mpg ~ -1 + cyl + hp + gear)
```
90 changes: 84 additions & 6 deletions README.md
@@ -4,26 +4,104 @@
# freesurfer.lme

<!-- badges: start -->

[![R-CMD-check](https://github.com/capro-uio/freesurfer.lme/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/capro-uio/freesurfer.lme/actions/workflows/R-CMD-check.yaml)
[![CRAN
status](https://www.r-pkg.org/badges/version/freesurfer.lme)](https://CRAN.R-project.org/package=freesurfer.lme)
[![Lifecycle:
experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
<!-- badges: end -->

Running vertex-wise linear mixed effects models in
[Freesurfer](https://surfer.nmr.mgh.harvard.edu/fswiki/LinearMixedEffectsModels)
-requires a specific file format. This package provides functions to
-create these files from R.
requires a specific file format. The qdec format requires binary
columns for the levels of each categorical variable, and is more
efficient when continuous variables are scaled (or de-meaned). This
package provides functions to create these files from R.
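
As a rough illustration of what "scaled (or de-meaned)" means here, the sketch below z-scores a numeric vector with base R's `scale()`, which is the function `scale_vec()` wraps in this commit. The variable names are illustrative only.

```r
x <- mtcars$hp

# z-score by hand: subtract the mean, divide by the standard deviation
z_manual <- (x - mean(x)) / sd(x)

# same result via scale(), flattened from a one-column matrix to a vector
z_scale <- as.numeric(scale(x))

# de-meaning only: centre the values without rescaling them
x_centred <- as.numeric(scale(x, scale = FALSE))

all.equal(z_manual, z_scale)
```

Because `scale_vec()` forwards `...` to `scale()`, arguments such as `scale = FALSE` should pass straight through.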

## Installation

-You can install the development version of freesurfer.lme like so:
You can install the development version of `freesurfer.lme` like so:

``` r
-# FILL THIS IN! HOW CAN PEOPLE INSTALL YOUR DEV PACKAGE?
# Install from the Capro R-universe
install.packages('freesurfer.lme',
repos = 'https://capro-uio.r-universe.dev')
```

## Example

-This is a basic example which shows you how to solve a common problem:
Using R’s model formula machinery, we can create a qdec file from a
model formula and a data frame. The function relies on the data being
set up correctly (categorical variables as factors, etc.), which you
would normally do in R.

``` r
library(freesurfer.lme)
-## basic example code

# prep data
cars <- mtcars
cars$type <- row.names(cars)
row.names(cars) <- NULL
cars$cyl <- factor(cars$cyl)
cars$gear <- factor(cars$gear)
cars
#> mpg cyl disp hp drat wt qsec vs am gear carb type
#> 1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 Mazda RX4
#> 2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 Mazda RX4 Wag
#> 3 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 Datsun 710
#> 4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 Hornet 4 Drive
#> 5 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 Hornet Sportabout
#> [ reached 'max' / getOption("max.print") -- omitted 27 rows ]
```

Make sure to include `-1` in the formula to remove the intercept.

``` r
make_qdec(cars, mpg ~ -1 + cyl + hp)
#> cyl4 cyl6 cyl8 hp mpgz hpz
#> 1 0 1 0 110 0.1508848 -0.5350928
#> 2 0 1 0 110 0.1508848 -0.5350928
#> 3 1 0 0 93 0.4495434 -0.7830405
#> 4 0 1 0 110 0.2172534 -0.5350928
#> 5 0 0 1 175 -0.2307345 0.4129422
#> 6 0 1 0 105 -0.3302874 -0.6080186
#> 7 0 0 1 245 -0.9607889 1.4339030
#> 8 1 0 0 62 0.7150178 -1.2351802
#> 9 1 0 0 95 0.4495434 -0.7538702
#> 10 0 1 0 123 -0.1477738 -0.3454858
#> [ reached 'max' / getOption("max.print") -- omitted 22 rows ]
```

Without it, the first level of the factor is absorbed into the
intercept, so not every factor level gets its own column.

``` r
make_qdec(cars, mpg ~ cyl + hp)
#> (Intercept) cyl6 cyl8 hp mpgz hpz
#> 1 1 1 0 110 0.1508848 -0.5350928
#> 2 1 1 0 110 0.1508848 -0.5350928
#> 3 1 0 0 93 0.4495434 -0.7830405
#> 4 1 1 0 110 0.2172534 -0.5350928
#> 5 1 0 1 175 -0.2307345 0.4129422
#> 6 1 1 0 105 -0.3302874 -0.6080186
#> 7 1 0 1 245 -0.9607889 1.4339030
#> 8 1 0 0 62 0.7150178 -1.2351802
#> 9 1 0 0 95 0.4495434 -0.7538702
#> 10 1 1 0 123 -0.1477738 -0.3454858
#> [ reached 'max' / getOption("max.print") -- omitted 22 rows ]
```
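
The difference between the two outputs above comes from how `stats::model.matrix()` codes factors, which can be checked without the package. A small sketch using only base R and `mtcars`:

```r
cars <- mtcars
cars$cyl <- factor(cars$cyl)

# With an intercept, the first level (cyl4) becomes the reference level
with_intercept <- colnames(stats::model.matrix(mpg ~ cyl + hp, cars))

# Without it, every level of cyl gets its own indicator column
no_intercept <- colnames(stats::model.matrix(mpg ~ -1 + cyl + hp, cars))

with_intercept
#> [1] "(Intercept)" "cyl6"        "cyl8"        "hp"
no_intercept
#> [1] "cyl4" "cyl6" "cyl8" "hp"
```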

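Including `-1` gives a qdec in the form expected from the [Freesurfer
documentation](https://surfer.nmr.mgh.harvard.edu/fswiki/LinearMixedEffectsModels).
Note that only the first factor in the formula gets a column for every
level; later factors (here `gear`) still drop their first level.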

``` r
make_qdec(cars, mpg ~ -1 + cyl + hp + gear)
#> cyl4 cyl6 cyl8 hp gear4 gear5 mpgz hpz
#> 1 0 1 0 110 1 0 0.1508848 -0.5350928
#> 2 0 1 0 110 1 0 0.1508848 -0.5350928
#> 3 1 0 0 93 1 0 0.4495434 -0.7830405
#> 4 0 1 0 110 0 0 0.2172534 -0.5350928
#> 5 0 0 1 175 0 0 -0.2307345 0.4129422
#> 6 0 1 0 105 0 0 -0.3302874 -0.6080186
#> 7 0 0 1 245 0 0 -0.9607889 1.4339030
#> [ reached 'max' / getOption("max.print") -- omitted 25 rows ]
```
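
In the output above, `gear` has columns only for levels 4 and 5. If an indicator column for every level of a later factor is wanted, one option in base R is to pass a full (identity) contrast matrix through the `contrasts.arg` argument of `stats::model.matrix()`. This is a sketch of that base R technique only; `make_qdec()` does not expose such an option in this commit, and whether the resulting layout suits a given Freesurfer analysis is an assumption to verify.

```r
cars <- mtcars
cars$cyl <- factor(cars$cyl)
cars$gear <- factor(cars$gear)

# contrasts(..., contrasts = FALSE) returns an identity matrix,
# i.e. one indicator column per level of gear
mm <- stats::model.matrix(
  mpg ~ -1 + cyl + hp + gear,
  data = cars,
  contrasts.arg = list(
    gear = stats::contrasts(cars$gear, contrasts = FALSE)
  )
)

colnames(mm)
```

The resulting matrix is rank deficient (the `cyl` and `gear` indicators each sum to one), which matters if it is used directly as a design matrix.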
28 changes: 26 additions & 2 deletions man/make_qdec.Rd


1 change: 1 addition & 0 deletions man/scale_vec.Rd
