Add effect modifier #177
-
Hi PhilippBach, I have reviewed the documentation of the DoubleML package and found it very useful, congrats on the work! However, I would like to know whether it is possible to add a variable as an effect modifier to the causal model. I am referring to direct effect modifiers in the taxonomy of VanderWeele and Robins: "Four types of effect modification: A classification based on directed acyclic graphs." Epidemiology, 2007. Thanks a lot for your answer.
Replies: 4 comments 6 replies
-
Hi @juandavidgutier, thanks for your question. Yes, it is possible to add effect modifiers by including interaction terms. I hope the following example, which is based on Section 12.5 of Hernán and Robins (2020), illustrates how to do it with DoubleML.

Effect modification example based on Section 12.5 / Program 12.6 from Hernán and Robins (2020)

Load data from What-if book

Find the code for downloading the data from the What-if book at the end of this post.

import numpy as np
import pandas as pd
import doubleml as dml
from sklearn.base import clone
from sklearn.linear_model import LassoCV
nhefs_all = pd.read_excel('data/NHEFS.xls')
# consider only subset of variables
select_cols = ['wt82_71', 'sex', 'qsmk']
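# drop rows with missing values; .copy() avoids a SettingWithCopyWarning
# when the interaction column is added below
nhefs = nhefs_all[select_cols].dropna().copy()

Regression model with effect modification

We want to estimate a regression model with effect modification, i.e., a regression of wt82_71 on qsmk, sex, and their interaction, and obtain the coefficients on these three terms. We set up a data backend for this regression model.

# create new column for interaction term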
nhefs['qsmk_and_female'] = nhefs.qsmk * nhefs.sex
# create a data backend
dml_data_eff_mod = dml.DoubleMLData(nhefs,
                                    y_col='wt82_71',
                                    d_cols=['qsmk', 'sex', 'qsmk_and_female'])

Next, we specify the learners and initialize a partially linear regression model.

# learners
learner = LassoCV()
ml_l = clone(learner)
ml_m = clone(learner)
np.random.seed(1)
dml_eff_mod = dml.DoubleMLPLR(dml_data_eff_mod,
                              ml_l, ml_m)
dml_eff_mod.fit()
dml_eff_mod.summary
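
To make the reading of the three coefficients concrete, here is a small sketch on simulated data. It does not use DoubleML or the NHEFS data; it only uses plain group means, and all numbers and variable names (a for the treatment, v for the binary modifier) are made up for illustration. In a linear model with a binary treatment, a binary modifier, and their interaction, the coefficient on the treatment is the effect in the reference group (v = 0), and the coefficient on the interaction term is how much the effect changes when v = 1. In the example above that would be sex = 1, assuming sex is coded 1 for women, as the column name qsmk_and_female suggests.

```python
import random
from statistics import mean

random.seed(0)

# Simulated data; all numbers are hypothetical.
TRUE_EFFECT_V0 = 2.0    # effect of the treatment a when the modifier v = 0
TRUE_INTERACTION = 1.5  # additional effect of a when v = 1

rows = []
for _ in range(20000):
    a = random.randint(0, 1)  # binary treatment (plays the role of qsmk)
    v = random.randint(0, 1)  # binary effect modifier (plays the role of sex)
    y = (1.0 + TRUE_EFFECT_V0 * a + 0.5 * v
         + TRUE_INTERACTION * a * v + random.gauss(0, 1))
    rows.append((a, v, y))


def cell_mean(a, v):
    """Mean outcome among units with treatment a and modifier v."""
    return mean(y for (ai, vi, y) in rows if ai == a and vi == v)


# Treatment effect within each level of the modifier
effect_v0 = cell_mean(1, 0) - cell_mean(0, 0)
effect_v1 = cell_mean(1, 1) - cell_mean(0, 1)

# Difference between the two effects: this is what the interaction
# coefficient captures in the linear model with an interaction term
interaction = effect_v1 - effect_v0

print(f"effect when v=0: {effect_v0:.2f}")
print(f"effect when v=1: {effect_v1:.2f}")
print(f"interaction:     {interaction:.2f}")
```

So the effect in the v = 1 group is the sum of the treatment coefficient and the interaction coefficient, not the interaction coefficient alone.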

In case you consider multiple interaction terms, you can correct for multiple testing:

dml_eff_mod.bootstrap()
dml_eff_mod.p_adjust()
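
To illustrate what a multiple-testing correction does to a set of p-values, here is a minimal sketch of the Holm step-down adjustment in plain Python. This is a different, simpler procedure than the bootstrap-based adjustment that follows the bootstrap() call above; it is shown only to convey the idea. The function name holm_adjust and the p-values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted p-values never decrease along the order
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


raw_p = [0.01, 0.04, 0.03]  # hypothetical p-values for three coefficients
adj_p = holm_adjust(raw_p)
print(adj_p)
```

With three hypotheses, a raw p-value of 0.03 is no longer below 0.05 after adjustment, which is the kind of change to expect when several interaction terms are tested at once.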

Comment

A caveat of the current implementation is that DoubleML only supports one learner for all treatment variables and hence does not allow specifying separate classification and regression learners for the different treatment variables. However, it is possible to provide specific parameters for the learners of the corresponding treatment variables, see the learners chapter in the user guide.

References

Hernán MA, Robins JM (2020). Causal Inference: What If. Boca Raton: Chapman & Hall/CRC.

Code based on:
-
If you feel the answer addresses your question, I'd encourage you to mark it as an answer by pressing the "Mark as answer" button under the reply.
-
Hi PhilippBach, I am trying to run an interactive regression model (IRM) with an epidemiological dataset, where the effect modifier is a continuous variable (NBI), the treatment (NeutralNina) and the outcome (excess_cases1) are binary variables, and the confounders are continuous (qbo, wpac, zwnd). Here is the dataset: top50.csv

However, I get the following error: "Error in private$check_data(self$data) : Incompatible data."

Here is my code in R: `library(DoubleML) top50 <- read.csv("D:/top50.csv") #NeutralNiña dataset #interaction term #data #ML methods #DML specifications #estimation print(dml_irm_obj)`

I would greatly appreciate your help.
-
Hi PhilippBach, I have an event study using a difference-in-differences model (with time and firm fixed effects), comparing the outcome of treated and control samples before and after the event. The formula looks like this:

Without fixed effects: $$

And with fixed effects: $$

I want to implement these two models with double machine learning using the DoubleML package. I have three questions:

(1) If the event is mostly exogenous, imposed by political issues, is it still sensible to use DML? In that case, E[D|X] is already expected to be zero.

(2) For the model without fixed effects, should ['treated', 'post', 'treated_post'] be passed to DoubleMLData as "d_cols" (the treatment variables) or as "x_cols" (the control covariates)? The effect modifier example you show above puts it in "x_cols", but I wonder whether it would be the same for a DiD model with treated and post interaction terms. Specifically, slightly changing that example, which of the following would be correct? dml_data_eff_mod = dml.DoubleMLData(nhefs, Or dml_data_eff_mod = dml.DoubleMLData(nhefs,

(3) For the fixed effects model, is the implementation below correct? dml_data_eff_mod = dml.DoubleMLData(nhefs,

Thank you very much in advance. The APIs, documentation, and DoubleML tutorial are excellent.
Hi @juandavidgutier
thanks! I'm not 100% sure if I understand your questions here...
Yes, I think that's possible. If you have categorical variables, you should be able to include the interactions of the one-hot levels (= the corresponding dummies generated from the levels of the variable). Continuous variables should be possible too.
Yes, that's possible. You can provide them simply via
x_cols
when you cr…
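
For the categorical case mentioned in this reply, a minimal sketch of building the interactions of a treatment with the one-hot levels of a categorical variable could look as follows. The toy data and the column names (d, region, y, and the d_x_ prefix) are hypothetical and only serve the illustration.

```python
import pandas as pd

# Toy data; all column names and values are hypothetical
df = pd.DataFrame({
    "d": [1, 0, 1, 0, 1, 0],  # binary treatment
    "region": ["north", "south", "west", "north", "south", "west"],
    "y": [2.0, 1.0, 3.5, 0.5, 2.5, 1.5],
})

# One-hot encode the categorical variable; the first level ("north")
# is dropped and serves as the reference category
dummies = pd.get_dummies(df["region"], prefix="region",
                         drop_first=True, dtype=int)

# Interaction of the treatment with each dummy, row by row
interactions = dummies.mul(df["d"], axis=0).add_prefix("d_x_")

# Combine everything into one data frame
design = pd.concat([df.drop(columns="region"), dummies, interactions], axis=1)
interaction_cols = list(interactions.columns)
print(design)
```

The treatment, the dummies, and the columns in interaction_cols could then be handed to the data backend in the same way as qsmk, sex, and qsmk_and_female in the example earlier in the thread.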