5software.Rmd


```{r setup01ss, include=FALSE}
require(knitr)
require(glmnet)
require(kableExtra)
require(dplyr)
require(xgboost)
require(SuperLearner)
require(sl3)
require(Rsolnp)
require(ltmle)
require(AIPW)
require(tmle3)
require(sl3)
options(knitr.kable.NA = '')
cachex=TRUE
cachexy=FALSE
```

```{r dataload_01, cache=cachex, echo = TRUE}
# Read the data saved at the last chapter
ObsData <- readRDS(file = "data/rhcAnalytic.RDS")
dim(ObsData)
```

# Pre-packaged software

## tmle

- The _tmle_ package can handle 
  - both binary and 
  - continuous outcomes, and 
  - uses the _SuperLearner_ package to construct both models just like we did in the steps above.
- The default SuperLearner library for estimating the outcome includes [@tmlePkgDocs] 
  - `SL.glm`: generalized linear models (GLMs)
  - `SL.glmnet`: LASSO
  - `tmle.SL.dbarts2`: modeling and prediction using BART
- The default library for estimating the propensity scores includes
  - `SL.glm`: generalized linear models (GLMs)
  - `tmle.SL.dbarts.k.5`: SL wrappers for modeling and prediction using BART
  - `SL.gam`: generalized additive models: (GAMs)  
- It is certainly possible to use different set of learners
  - More methods can be added by 
    - specifying lists of models in the _Q.SL.library_ (for the outcome model) and 
    - _g.SL.library_ (for the propensity score model) arguments. 
- Note also that the outcome $Y$ is required to be within the range of $[0,1]$ for this method as well, 
  - so we need to pass in the transformed data, then transform back the estimate.

```{r tmlepkg, cache=cachex, message=FALSE, warning=FALSE}
set.seed(1444) 
# transform the outcome to fall within the range [0,1]
min.Y <- min(ObsData$Y)
max.Y <- max(ObsData$Y)
ObsData$Y_transf <- (ObsData$Y-min.Y)/(max.Y-min.Y)

# run tmle from the tmle package 
ObsData.noYA <- dplyr::select(ObsData, 
                              !c(Y_transf, Y, A))
SL.library = c("SL.glm", 
               "SL.glmnet", 
               "SL.xgboost")
```


```{r tmlepkg33, cache=cachex, message=FALSE, warning=FALSE}
tmle.fit <- tmle::tmle(Y = ObsData$Y_transf, 
                   A = ObsData$A, 
                   W = ObsData.noYA, 
                   family = "gaussian", 
                   V = 3,
                   Q.SL.library = SL.library, 
                   g.SL.library = SL.library)
tmle.fit
```


```{r tmlepkgtr2, cache=cachex, message=FALSE, warning=FALSE}
summary(tmle.fit)
```


```{r tmlepkgtr, cache=cachex, message=FALSE, warning=FALSE}
tmle_est_tr <- tmle.fit$estimates$ATE$psi
tmle_est_tr
# transform back the ATE estimate
tmle_est <- (max.Y-min.Y)*tmle_est_tr
tmle_est
```

```{r, cache=TRUE, echo = TRUE}
saveRDS(tmle_est, file = "data/tmle.RDS")
```

```{r tmlepkg2, cache=cachex, results='hide', message=FALSE, warning=FALSE}
tmle_ci <- paste("(", 
                 round((max.Y-min.Y)*tmle.fit$estimates$ATE$CI[1], 3), ", ", 
                 round((max.Y-min.Y)*tmle.fit$estimates$ATE$CI[2], 3), ")", sep = "")
```

```{r, cache=TRUE, echo = TRUE}
tmle.ci <- (max.Y-min.Y)*tmle.fit$estimates$ATE$CI
saveRDS(tmle.ci, file = "data/tmleci.RDS")
```

```{r, cache=cachex, echo=FALSE}
cat("ATE from tmle package: ", tmle_est, tmle_ci, sep = "")
```

Notes about the _tmle_ package: 

* does not scale the outcome for you
* can give some error messages when dealing with variable types it is not expecting
* practically all steps are nicely packed up in one function, very easy to use but need to dig a little to truly understand what it does

Most helpful resources: 

* [CRAN docs](https://cran.r-project.org/web/packages/tmle/tmle.pdf)
* [tmle package paper](https://www.jstatsoft.org/article/view/v051i13)

## tmle (reduced computation)

We can use the previously calculated propensity score predictions from SL (calculated using `WeightIt` package) in the `tmle` to reduce some computing time.

```{r tmlepkg33b, cache=cachex, message=FALSE, warning=FALSE}
ps.obj <- readRDS(file = "data/ipwslps.RDS")
ps.SL <- ps.obj$weights
tmle.fit2 <- tmle::tmle(Y = ObsData$Y_transf, 
                   A = ObsData$A, 
                   W = ObsData.noYA, 
                   family = "gaussian",
                   V = 3,
                   Q.SL.library = SL.library, 
                   g1W = ps.SL)
tmle.fit2
```

```{r tmlepkgtrb, cache=cachex, message=FALSE, warning=FALSE}
# transform back ATE estimate
(max.Y-min.Y)*tmle.fit2$estimates$ATE$psi
```

## sl3 (optional)

```{r}
# install sl3 if not done so
# remotes::install_github("tlverse/sl3")
```
The _sl3_ package is a newer package, that implements two types of Super Learning: 

- **discrete Super Learning**, 
  - in which the best prediction algorithm (based on cross-validation) from a specified library is returned, and 
- **ensemble Super Learning**, 
  - in which the best linear combination of the specified algorithms is returned (@coyle2021sl3).

The first step is to create a sl3 task which keeps track of the roles of the variables in our problem (@coyle2021tlverse). 

```{r sl301, cache=cachexy}
require(sl3)
# create sl3 task, specifying outcome and covariates 
rhc_task <- make_sl3_Task(
  data = ObsData, 
  covariates = colnames(ObsData)[-which(names(ObsData) == "Y")],
  outcome = "Y"
)
```


```{r sl30156, cache=cachexy}
rhc_task
```

Next, we create our SuperLearner. To do this, 

- we need to specify a **selection of machine learning algorithms** we want to include as candidates, as well as 
- a **metalearner** that the SuperLearner will use to combine or choose from the machine learning algorithms provided (@coyle2021tlverse). 

```{r sl302, cache=cachexy}
# see what algorithms are available for a continuous outcome 
# (similar can be done for a binary outcome)
sl3_list_learners("continuous")
```

The chosen candidate algorithms can be created and collected in a Stack.

```{r sl303, cache=cachexy, results='hide', message=FALSE, warning=FALSE}
# initialize candidate learners
lrn_glm <- make_learner(Lrnr_glm)
lrn_lasso <- make_learner(Lrnr_glmnet) # alpha default is 1
xgb_5 <- Lrnr_xgboost$new(nrounds = 5)

# collect learners in stack
stack <- make_learner(
  Stack, lrn_glm, lrn_lasso, xgb_5
)
```

The stack is then given to the SuperLearner.
```{r sl304, cache=cachexy, results='hide', message=FALSE, warning=FALSE}
# to make an ensemble SuperLearner
sl_meta <- Lrnr_nnls$new()
sl <- Lrnr_sl$new(
  learners = stack,
  metalearner = sl_meta)

# or a discrete SuperLearner
sl_disc_meta <- Lrnr_cv_selector$new()
sl_disc <- Lrnr_sl$new(
  learners = stack, 
  metalearner = sl_disc_meta
)
```

The SuperLearner is then trained on the sl3 task we created at the start and then it can be used to make predictions.

```{r sl305, cache=cachexy, message=FALSE, warning=FALSE}
set.seed(1444)

# train SL
sl_fit <- sl$train(rhc_task)
# or for discrete SL
# sl_fit <- sl_disc$train(rhc_task)

# make predictions
sl3_data <- ObsData
sl3_data$sl_preds <- sl_fit$predict()

sl3_est <- mean(sl3_data$sl_preds[sl3_data$A == 1]) - 
  mean(sl3_data$sl_preds[sl3_data$A == 0])

sl3_est
```

```{r, cache=TRUE, echo = TRUE}
saveRDS(sl3_est, file = "data/sl3.RDS")
```

Notes about the _sl3_ package: 

* fairly easy to implement & understand structure
* large selection of candidate algorithms provided
* unsure why result is so different
* very different structure from _SuperLearner_ library, but very customizable
* could use more explanations of when to use what metalearner and what exactly the structure of the metalearner construction means 

Most helpful resources: 

* [tlverse sl3 page](https://tlverse.org/sl3/)
* [sl3 GitHub repository](https://github.com/tlverse/sl3/)
* [tlverse handbook chapter 6](https://tlverse.org/tlverse-handbook/tmle3.html)
* Vignettes in R

<!---
## ltmle

Similarly to the _tmle_ package, the _ltmle_ package gives the direct TMLE result with the call of one function. 

```{r ltmlepkg, cache=cachex, message=FALSE, warning=FALSE}
# exclude Y_transf since ltmle scales automatically
ltmle_data <- dplyr::select(ObsData, !Y_transf)
```


```{r ltmlepkgxd, cache=cachex, message=FALSE, warning=FALSE}
# run ltmle
ltmle_est <- ltmle(ltmle_data, 
                   Anodes = "A", 
                   Ynodes = "Y", 
                   abar = list(1,0), 
                   SL.cvControl=list(V=3),
                   SL.library = SL.library,
                   estimate.time = FALSE)
```


```{r ltmlepkgpr, cache=cachex}
summary(ltmle_est)
```

```{r, cache=TRUE, echo = TRUE}
saveRDS(ltmle_est, file = "data/ltmle.RDS")
```


```{r ltmlepkg2, cache=cachex, include = FALSE}
# # print result & confidence intervals
# ltmle_ci <- paste("(",
#                   round(summary(ltmle_est)[["treatment"]][["CI"]][,"2.5%"], 3),
#                   ", ", round(summary(ltmle_est)[["treatment"]][["CI"]][,"97.5%"], 3),
#                   ")", sep = "")
# cat("ATE from ltmle package: ",
#     ltmle_est$estimates[["tmle"]], ltmle_ci, sep = "")
```

- The main difference between the _tmle_ and the _ltmle_ package is that the _ltmle_ package is designed to handle longitudinal data, with measurements data recorded for each subject at multiple timepoints. 
- More information on the use of the _ltmle_ package in these settings can be found [here](https://cran.r-project.org/web/packages/ltmle/vignettes/ltmle-intro.html).

## AIPW

- The _aipw_ package implements augmented inverse probability weighting (another type of DR method). 
- It has similar parameters as the _tmle_ package. 
- It is typically used with the _SuperLearner_ library. 

```{r aipwpkg, cache=cachex, results='hide', message=FALSE, warning=FALSE, include = FALSE}
set.seed(1444) 
# construct AIPW estimator
aipw <- AIPW$new(Y=ObsData$Y, 
                 A=ObsData$A, 
                 W=ObsData[colnames(ObsData)[-which(names(ObsData) == "Y")]], 
                 Q.SL.library = c("SL.glm", 
                                  "SL.glmnet", 
                                  "SL.xgboost"), 
                 g.SL.library = c(c("SL.glm", 
                                    "SL.glmnet", 
                                    "SL.xgboost")), 
                 k_split=3, 
                 verbose=FALSE)
```


```{r aipwpkg2, cache=cachex, results='hide', message=FALSE, warning=FALSE, include = FALSE}
# fit AIPW object
aipw$fit()
```


```{r aipwpkg3, cache=cachex, results='hide', message=FALSE, warning=FALSE, include = FALSE}
# calculate ATE
aipw$summary()
print(aipw$estimates)
aipw_est <- aipw$estimates$RD[["Estimate"]]

# 95% CI
aipw_ci <- paste(" (", 
                 round(aipw$estimates$RD["95% LCL"], 3), ", ", 
                 round(aipw$estimates$RD["95% UCL"], 3), ")", sep = "")
```

```{r, cache=TRUE, echo = TRUE, include = FALSE}
saveRDS(aipw, file = "data/aipw.RDS")
```

```{r asg, echo=FALSE, include = FALSE}
cat("ATE from aipw package: ", aipw_est, aipw_ci, sep = "") 
```
--->

## RHC results

Gathering previously saved results:
```{r summarytable0, cache=cachex, echo=TRUE, results='hold', warning=FALSE, message=FALSE}
fit.reg <- readRDS(file = "data/adjreg.RDS")
TEr <- fit.reg$coefficients[2]
CIr <- as.numeric(confint(fit.reg, 'A'))
fit.matched <- readRDS(file = "data/match.RDS")
TEm <- fit.matched$coefficients[2]
CIm <- as.numeric(confint(fit.matched, 'A'))
TEg <- readRDS(file = "data/gcomp.RDS")
CIg <- readRDS(file = "data/gcompci.RDS")
CIgc <- CIg$percent[4:5]
TE1g <- readRDS(file = "data/gcompxg.RDS")
TE2g <- readRDS(file = "data/gcompls.RDS")
TE3g <- readRDS(file = "data/gcompsl.RDS")
ipw <- readRDS(file = "data/ipw.RDS")
TEi <- ipw$coefficients[2]
CIi <- as.numeric(confint(ipw, 'A'))
ipwsl <- readRDS(file = "data/ipwsl.RDS")
TEsli <- ipwsl$coefficients[2]
CIsli <- as.numeric(confint(ipwsl, 'A'))
tmleh <- readRDS(file = "data/tmlepointh.RDS")
tmlecih <- readRDS(file = "data/tmlecih.RDS")
tmlesl <- readRDS(file = "data/tmle.RDS")
tmlecisl <- readRDS(file = "data/tmleci.RDS")
slp <- readRDS(file = "data/sl3.RDS")
ci.b <- rep(NA,2)
ks <- 2.01
ci.ks <- c(0.6,3.41)
point <- as.numeric(c(TEr, TEm, TEg, TE1g, TE2g,  
                      TE3g, TEi, TEsli, tmleh, 
                      tmlesl, slp, ks))
CIs <- cbind(CIr, CIm, CIgc, ci.b, ci.b, ci.b, 
             CIi, CIsli, tmlecih, tmlecisl, 
             ci.b, ci.ks)    
```


```{r summarytable, cache=cachex, echo=TRUE}
method.list <- c("Adj. Reg","PS match", 
                 "G-comp (linear reg.)","G-comp (xgboost)", 
                 "G-comp (lasso)", "G-comp (SL)", 
                 "IPW (logistic)", "IPW (SL)", 
                 "TMLE (9 steps)", "TMLE (package)", 
                 "sl3 (package)", "Keele and Small (2021) paper") 
results <- data.frame(method.list) 
results$Estimate <- round(point,2)
results$`2.5 %` <- CIs[1,] 
results$`97.5 %` <- CIs[2,] 
kable(results,digits = 2)%>%
  row_spec(10, bold = TRUE, color = "white", background = "#D7261E")  
```

```{block, type='rmdcomment'}
@keele2021comparing used TMLE-SL based on an ensemble of 3 different learners: (1) GLM, (2) random forests, and (3) LASSO.
```

## Other packages

Other packages that may be useful: 

| Package | Resources | Notes |
|---|---|---|
| ltmle | [CRAN vignette](https://cran.r-project.org/web/packages/ltmle/vignettes/ltmle-intro.html) | Longitudinal |
| tmle3 | [GitHub](https://github.com/tlverse/tmle3), [framework overview](https://tlverse.org/tmle3/articles/framework.html), [tlverse handbook](https://tlverse.org/tlverse-handbook/tmle3.html) | tmle3 is still under development | 
| aipw | [GitHub](https://github.com/yqzhong7/AIPW), [CRAN vignette](https://cran.r-project.org/web/packages/AIPW/vignettes/AIPW.html) | Newer package for AIPW (another DR method) |
| Others | [van der Laan research group](https://www.stat.berkeley.edu/users/laan/Software/) |  |

You can find many other related packages on [CRAN](https://cran.r-project.org/search.html) or GitHub.